Random WU's & Hosts Hanging After Upgrade to 5.4.9

Message boards : BOINC Manager : Random WU's & Hosts Hanging After Upgrade to 5.4.9
Profile Trog Dog
Joined: 6 May 06
Posts: 287
Australia
Message 4610 - Posted: 1 Jun 2006, 12:09:27 UTC

I'm noticing a definite random trend (excuse the oxymoron) since upgrading to 5.4.9.

I use BOINCView to manage my little farmlet, and since upgrading to 5.4.9 I've been noticing, with some regularity, a number of hosts highlighted with a CPU efficiency of 0, i.e. a hung WU.

So far the WUs and hosts affected are random, but it's happening often enough for me to recognise a trend.

Has anyone else noticed this?
Profile Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15480
Netherlands
Message 4612 - Posted: 1 Jun 2006, 13:37:24 UTC

If those hung results are Seti Enhanced results, then yes, that is known. But you're better off discussing that on the Seti NC forum.

The hanging of results isn't so much a BOINC issue. BOINC doesn't crunch, it's a managing program. The science applications under BOINC do the crunching. So if anything hangs, it's not BOINC's fault, but more likely the science application.
Profile Trog Dog
Joined: 6 May 06
Posts: 287
Australia
Message 4628 - Posted: 2 Jun 2006, 19:54:54 UTC

Just checked BV and it has highlighted two WUs that are "running" with 0% CPU efficiency. Both WUs are from LHC and both hosts are running Linux. One is hung at 100% and the other at 79.84%.
Profile Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15480
Netherlands
Message 4631 - Posted: 2 Jun 2006, 21:59:47 UTC - in response to Message 4628.  

Just exit BOINC completely and restart it.

If that doesn't help, reboot your computer.
Profile Trog Dog
Joined: 6 May 06
Posts: 287
Australia
Message 4639 - Posted: 3 Jun 2006, 6:23:48 UTC

That's what I do.

Just had another WU on a different host "hang", this time a FAAH WU from WCG, again on a Linux host. I had started to suspect Pirates as the cause: they are trying to work out why there are errors on Linux boxes when the code is compiled in FC4, while it works fine when compiled in FC3. The two hosts that hung this morning had both just completed "bad" Pirates WUs, but this latest incident doesn't bear that out.

I think it is occurring at the work unit handover, not in the midst of processing a WU, i.e. it's a BOINC Manager problem, not an individual app problem. I'm also starting to suspect that it's a Linux problem only, but I can't be sure yet.
Profile Trog Dog
Joined: 6 May 06
Posts: 287
Australia
Message 4649 - Posted: 5 Jun 2006, 10:22:45 UTC - in response to Message 4639.  

"I'm also starting to suspect that it's a Linux problem only, but I can't be sure yet."

It's not just a Linux problem. Just noticed a Malaria Control WU "hang" on a Win2k host.
Profile Trog Dog
Joined: 6 May 06
Posts: 287
Australia
Message 4694 - Posted: 12 Jun 2006, 11:15:15 UTC

The problem seems to be at handover. The Tasks pane in BOINC Manager shows the workunit as running, but it doesn't progress. The Messages pane shows the previous workunit being stopped, but there is no corresponding message for the new workunit starting or resuming.

In many cases, pausing and then resuming the workunit fixes the problem.

Also, this is not the "exit with 0 result status" situation; this is different.
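
For anyone who wants to spot these without watching BOINCView, here is a rough Python sketch of the check described above: take two snapshots of each task some minutes apart and flag any task that is still marked as running but has gained almost no CPU time and no progress in between. This is only an illustration under assumptions. The `TaskSample` record, its field names and the `find_stalled` helper are made up for this example and are not part of BOINC or BOINCView, and CPU efficiency is assumed here to mean CPU seconds gained divided by wall-clock seconds elapsed. How the snapshots are collected is left out.

```python
from dataclasses import dataclass


@dataclass
class TaskSample:
    """One snapshot of a task (hypothetical record, not a BOINC structure)."""
    name: str             # workunit/result name
    state: str            # e.g. "running" or "paused"
    cpu_time: float       # CPU seconds accumulated so far
    fraction_done: float  # progress, 0.0 to 1.0
    wall_time: float      # when the snapshot was taken, in seconds


def cpu_efficiency(earlier: TaskSample, later: TaskSample) -> float:
    """CPU seconds gained per wall-clock second between two snapshots."""
    elapsed = later.wall_time - earlier.wall_time
    if elapsed <= 0:
        raise ValueError("the later snapshot must be newer than the earlier one")
    return (later.cpu_time - earlier.cpu_time) / elapsed


def find_stalled(earlier_samples, later_samples, min_efficiency=0.01):
    """Return names of tasks that stayed 'running' but made no headway."""
    earlier_by_name = {s.name: s for s in earlier_samples}
    stalled = []
    for later in later_samples:
        earlier = earlier_by_name.get(later.name)
        if earlier is None:
            continue  # task only appears in the newer snapshot
        if earlier.state != "running" or later.state != "running":
            continue  # only tasks shown as running in both snapshots count
        no_progress = later.fraction_done <= earlier.fraction_done
        if no_progress and cpu_efficiency(earlier, later) < min_efficiency:
            stalled.append(later.name)
    return stalled
```

A task that sits at 100% or at 79.84% across both snapshots with no extra CPU time would be flagged, while a paused task would not. The 0.01 threshold is an arbitrary starting point, not a measured value.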

Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.