Linux: problem after unexpected reboot

scarecrow

Joined: 28 Nov 05
Posts: 15
United States
Message 1723 - Posted: 9 Dec 2005, 18:08:34 UTC

This may not be fully accurate since unexpected reboots are usually rare, but this sequence of events has occurred in both unexpected reboots I've experienced.

In a nutshell, in both occurrences client_state.xml was found to be in error and moved to the lost+found directory. Once the repairs are completed and the final reboot back to normal goes through, boinc creates a new client_state.xml, runs benchmarks, downloads all new work, and continues on, 'losing' any previously existing work 'being kept track of' in the previous client_state.xml.

This may be a fluke or an intermittent occurrence, but it's happened 2 out of 2 times for me. It's not a critical show stopper; it just loses the work it was doing before the reboot/shutdown.

I discovered a work-around the 2nd time this occurred, but I don't know if it's something that will work every time. This isn't intended as a gripe or complaint, just as a heads up: should an ungraceful shutdown or reboot occur, boinc could be affected indirectly, and with a little manual intervention the boinc related problems can probably be avoided with no loss of the work that was in the hopper when the reboot happened.
Paul D. Buck

Joined: 29 Aug 05
Posts: 225
Message 1750 - Posted: 10 Dec 2005, 10:23:49 UTC

This is a problem that they have been trying to track down and eliminate. Perhaps the work around you have found may provide a clue as to what is happening.

This is one of those intermittent problems that usually only affects a small number of people, so that makes it very hard to track down.
Metod, S56RKO

Joined: 9 Sep 05
Posts: 128
Slovenia
Message 1751 - Posted: 10 Dec 2005, 12:35:57 UTC

There's one quick and dirty thing to try until they find the solution to the problem. In your case (you've lost the client_state.xml entirely) you can add a check to your scripts that start BOINC CC. If the client_state.xml file doesn't exist, don't start the BOINC CC. This way you can try to find it (either in lost+found or use client_state_prev.xml) and only then start BOINC CC.
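The check described above could be sketched roughly as follows. This is only an illustration in Python, not part of BOINC: the function names and the idea of calling it from a start script are assumptions, and the BOINC directory has to be supplied by whoever uses it.

```python
from pathlib import Path


def state_file_present(boinc_dir):
    """Return True if client_state.xml exists in the given BOINC directory."""
    return (Path(boinc_dir) / "client_state.xml").is_file()


def startup_check(boinc_dir):
    """Hypothetical guard: return 0 if it looks safe to start the client, 1 otherwise."""
    if state_file_present(boinc_dir):
        return 0
    # Nothing is started here; the caller decides what to do with the result.
    print("client_state.xml is missing in", boinc_dir)
    print("Look in lost+found or at client_state_prev.xml before starting BOINC.")
    return 1
```

A start script could call startup_check() with the BOINC directory and launch the client only when it returns 0.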
Metod ...
scarecrow

Joined: 28 Nov 05
Posts: 15
United States
Message 1763 - Posted: 10 Dec 2005, 19:18:49 UTC - in response to Message 1751.  
Last modified: 10 Dec 2005, 19:29:49 UTC

There's one quick and dirty thing to try until they find the solution to the problem. In your case (you've lost the client_state.xml entirely) you can add a check to your scripts that start BOINC CC. If the client_state.xml file doesn't exist, don't start the BOINC CC. This way you can try to find it (either in lost+found or use client_state_prev.xml) and only then start BOINC CC.

This was in fact the aforementioned work-around. The initial fsck run went into the "unexpected problems found, rerun fsck manually" mode. The manual run found problems with client_state.xml and moved it to lost+found. Pretty easy to find in this case, since it was the only file there. Its contents were not munged up at all, so copying it from lost+found back to the BOINC directory before letting boinc start back up was the 'fix' for boinc being able to pick up where it left off. The larger problem, of course, is the need for a manual run of fsck keeping the entire system from rebooting because of the problems with the boinc related file. With over 190 days of uptime on the machine running boinc before mother nature saw fit to coat us with ice and steal our electricity for 48 hours, I don't anticipate it being a major problem for me at least.
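For anyone who wants to script that copy-back step, here is a rough sketch in Python. It is an assumption-laden illustration, not something I actually ran at the time: the function names are made up, both directory paths must be supplied by the caller, and it assumes an intact state file parses as XML with a client_state root element. Files that fsck drops into lost+found lose their original names, so the sketch identifies the file by content, and reading lost+found normally requires root.

```python
import shutil
import xml.etree.ElementTree as ET
from pathlib import Path


def looks_like_client_state(path):
    """Return True if the file parses as XML with a <client_state> root (assumed layout)."""
    try:
        root = ET.parse(str(path)).getroot()
    except (ET.ParseError, OSError):
        return False
    return root.tag == "client_state"


def restore_client_state(lost_found, boinc_dir):
    """Copy the first matching file from lost_found to boinc_dir/client_state.xml.

    Returns the recovered source path, or None if nothing was copied.
    An existing client_state.xml is never overwritten.
    """
    target = Path(boinc_dir) / "client_state.xml"
    if target.exists():
        return None
    for candidate in sorted(Path(lost_found).iterdir()):
        if candidate.is_file() and looks_like_client_state(candidate):
            shutil.copy2(candidate, target)
            return candidate
    return None
```

This would have to run before boinc starts, since the client creates a fresh client_state.xml as soon as it comes up without one.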

@Paul
This happened on 2 Dec so the error logs have already rotated into oblivion, and my memory is far too shot to recall the exact cause for fsck to flag client_state.xml as a problem child. If I can find anything pertinent and specific I'll pass it along.


Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.