WU misbehaving while starting BOINC client (Linux x86_64)

Andris Pavenis

Joined: 30 Aug 08
Posts: 3
Finland
Message 19848 - Posted: 30 Aug 2008, 13:18:22 UTC

For quite some time I have noticed the following behavior when starting the BOINC client:

1) One WU sometimes gets the CPU time of another one (for example, the CPU time of a Climate prediction WU is set to that of a WU from some other project). See for example:
http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=8003980

2) The other problem is that a WU sometimes crashes at startup. With Climate prediction I get a message that the WU has exited without a result file, and it then restarts. The first problem seems to happen at the same time as this one. With SETI@home I get a kernel message about SIGSEGV.
In several cases I have seen that when a SETI@home WU crashes this way, it also gets the CPU time of another WU.

These things do not happen while the client is up and running, only when starting the BOINC client and sometimes when resuming a project.

I tried suspending all projects before shutting down the system. It did not help. For example, I first resumed the Climate prediction project and, a short time later, Einstein@Home as well. The Climate prediction WU restarted (problem 2) when I resumed the Einstein@Home project.

None of this seems to depend on the BOINC version.

Some system-related information:
Intel Core 2 Quad 2.4GHz, Fedora 9, x86_64.

Earlier I used the BOINC package provided by Fedora 9 (5.10.45). Later I replaced it with the 64-bit version of BOINC 6.2.14, and after that with 6.2.15. Nothing changed; I still have these problems.
ID: 19848
Les Bayliss
Help desk expert

Joined: 25 Nov 05
Posts: 1654
Australia
Message 19849 - Posted: 30 Aug 2008, 14:00:41 UTC

It sounds like a problem that occurred to a person on cpdn a few years ago, where the data in the slots folders got mixed up; the data for one wu is in the folder allocated to a different wu.
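
For anyone wanting to check whether slots really are mixed up, here is a rough sketch, not an official BOINC tool. It assumes that client_state.xml in the BOINC data directory lists each running task as an <active_task> element containing <slot> and <result_name> children, and that the file parses as ordinary XML; both should be verified against your own installation. The function names and the sample task names are made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def results_per_slot(client_state_xml):
    """Map slot number -> list of result names that claim that slot.

    Assumes <active_task> elements with <slot> and <result_name> children.
    """
    root = ET.fromstring(client_state_xml)
    slots = defaultdict(list)
    for task in root.iter("active_task"):
        slot = task.findtext("slot")
        result = task.findtext("result_name")
        if slot is None or result is None:
            continue  # skip entries that lack either field
        slots[int(slot)].append(result.strip())
    return dict(slots)


def shared_slots(slots):
    """Return only the slots claimed by more than one result."""
    return {slot: names for slot, names in slots.items() if len(names) > 1}


# Hypothetical sample data; the task names are invented.
SAMPLE_STATE = """
<client_state>
  <active_task_set>
    <active_task>
      <result_name>cpdn_example_0</result_name>
      <slot>0</slot>
    </active_task>
    <active_task>
      <result_name>seti_example_0</result_name>
      <slot>0</slot>
    </active_task>
    <active_task>
      <result_name>einstein_example_0</result_name>
      <slot>1</slot>
    </active_task>
  </active_task_set>
</client_state>
"""

if __name__ == "__main__":
    for slot, names in shared_slots(results_per_slot(SAMPLE_STATE)).items():
        print(f"slot {slot} is claimed by: {', '.join(names)}")
```

To try it on a real installation, stop the client first, read the contents of client_state.xml into a string and pass that in place of SAMPLE_STATE. An empty result only means that no two tasks claim the same slot in the state file; it does not prove that the files inside the slot directories are intact.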

ID: 19849
Andris Pavenis

Joined: 30 Aug 08
Posts: 3
Finland
Message 19850 - Posted: 30 Aug 2008, 19:08:48 UTC - in response to Message 19849.  

It sounds like a problem that occurred to a person on cpdn a few years ago, where the data in the slots folders got mixed up; the data for one wu is in the folder allocated to a different wu.



If so, what would be the best way to fix it?

The simplest might be to finish all WUs and then start from scratch after cleaning the BOINC directory. The only problem is that finishing the CPDN WUs would still take a rather long time. Would it be enough to detach from all projects except CPDN and then reattach?
ID: 19850
Les Bayliss
Help desk expert

Joined: 25 Nov 05
Posts: 1654
Australia
Message 19852 - Posted: 30 Aug 2008, 22:00:25 UTC

Sorry, late getting up this morning.

I don't remember the cure, but I agree with Dagorath about a possibility.
If you have to, just abort the climate model as well. There have been thousands of them lost over the years, so one more won't matter.

ID: 19852

Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.