Bad Tasks Corrupting Projects?

SuperSluether

Joined: 6 Jul 14
Posts: 94
United States
Message 57130 - Posted: 28 Oct 2014, 16:57:07 UTC

I just had a really weird problem. As far as I know, my computer was just about ready to finish up a GPU task for MilkyWay, and after that it was going to switch to a GPU task from GPUGrid. I can't say for certain, but just before (or as) the MilkyWay task completed, my computer locked up. After a hard reboot, the same thing happened as soon as the BOINC manager loaded. After another reboot, I moved the project files before starting the manager so I could suspend GPU activity. When I moved the files back and restarted the client, it didn't recognize the existing tasks for those projects. Even more confusing, the Collatz Conjecture project was also missing after I moved the files back. (Sorry if this explanation is too detailed)

Is it possible that one of the tasks went haywire and corrupted Collatz Conjecture? I looked in the data directory, and the project's folder was completely empty (not even a placeholder). Or do you think this sounds more like a hard drive failure? (I'm using an older drive to store BOINC data and a few other large, not-so-important files)
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15477
Netherlands
Message 57131 - Posted: 28 Oct 2014, 17:08:57 UTC - in response to Message 57130.  

Why do you think a hard drive failure and not a GPU failure?
What operating system do you run?
What kind of GPU?
Drivers?
Which BOINC version?
Have you read the revamped "When requesting help on these forums" thread?
Any error messages of note?
What does it say in stderrdae.txt, stderrout.txt, stderrgui.txt?
If Windows, what does Windows Event Viewer say about the hangups?
If another operating system, does it have error logging?
SuperSluether

Joined: 6 Jul 14
Posts: 94
United States
Message 57203 - Posted: 29 Oct 2014, 18:10:31 UTC - in response to Message 57131.  

Oops, sorry. I guess I should have read through first. Anyway, I didn't think it was a GPU failure because the card isn't even a month old, and all the GPU tasks before and after this issue worked fine.

I run Windows 8.1 with an EVGA Nvidia GeForce GTX 760 GPU, driver version 344.48. The BOINC version is 7.2.42 (x64).

stderrdae.txt says

10-Oct-2014 12:29:09 Another instance of BOINC is running.
GLE: 10-Oct-2014 12:29:09 Another instance of BOINC is running.
GLE: 10-
10-Oct-2014 12:29:10 gstate.init() failed
Error Code: 183
10-Oct-2014 12:29:58 gstate.init() failed
Error Code: 183
10-Oct-2014 12:30:20 gstate.init() failed
Error Code: 183
10-Oct-2014 12:31:49 gstate.init() failed
Error Code: 183
12-Oct-2014 13:47:10 Another instance of BOINC is running.
GLE: 12-Oct-2014 13:47:10 Another instance of BOINC is running.
GLE: 12-
14-Oct-2014 21:44:27 Another instance of BOINC is running.
GLE: 14-Oct-2014 21:44:27 Another instance of BOINC is running.
GLE: 14-
14-Oct-2014 23:38:28 Another instance of BOINC is running.
GLE: 14-Oct-2014 23:38:28 Another instance of BOINC is running.
GLE: 14-
15-Oct-2014 07:57:28 Another instance of BOINC is running.
GLE: 15-Oct-2014 07:57:28 Another instance of BOINC is running.
GLE: 15-
20-Oct-2014 14:10:29 Another instance of BOINC is running.
GLE: 20-Oct-2014 14:10:29 Another instance of BOINC is running.
GLE: 20-
22-Oct-2014 14:25:00 Another instance of BOINC is running.
GLE: 22-Oct-2014 14:25:00 Another instance of BOINC is running.
GLE: 22-
25-Oct-2014 18:28:26 Another instance of BOINC is running.
GLE: 25-Oct-2014 18:28:26 Another instance of BOINC is running.
GLE: 25-
25-Oct-2014 18:34:37 Another instance of BOINC is running.
GLE: 25-Oct-2014 18:34:37 Another instance of BOINC is running.
GLE: 25-
25-Oct-2014 19:35:21 Another instance of BOINC is running.
GLE: 25-Oct-2014 19:35:21 Another instance of BOINC is running.
GLE: 25-
25-Oct-2014 21:07:21 Another instance of BOINC is running.
GLE: 25-Oct-2014 21:07:21 Another instance of BOINC is running.
GLE: 25-
26-Oct-2014 08:18:19 Another instance of BOINC is running.
GLE: 26-Oct-2014 08:18:19 Another instance of BOINC is running.
GLE: 26-

But this problem happened on the 28th, and the most recent entry is from the 26th.
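One way to double-check which entry is the newest is to pull the timestamps out of the log text. This is only a rough Python sketch: it assumes every entry carries a `DD-Mon-YYYY HH:MM:SS` stamp like the lines above and that month names are English abbreviations. The function name `latest_entry` is made up for illustration.

```python
import re
from datetime import datetime

# Matches stamps such as "26-Oct-2014 08:18:19"; truncated fragments
# like "GLE: 26-" do not match and are simply skipped.
STAMP = re.compile(r"\d{2}-[A-Za-z]{3}-\d{4} \d{2}:\d{2}:\d{2}")

def latest_entry(log_text):
    """Return the newest timestamp found in log_text, or None if there is none."""
    # %b assumes English month abbreviations (the default C/English locale).
    stamps = [datetime.strptime(s, "%d-%b-%Y %H:%M:%S") for s in STAMP.findall(log_text)]
    return max(stamps) if stamps else None
```

Calling `latest_entry` on the contents of stderrdae.txt would show whether anything was logged on the day of the lockup.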

I couldn't find an stderrout.txt, and stderrgui.txt is empty.

The only entries I can find in the Windows Event Viewer are from when I had to manually reboot. Aside from normal bootup information, all I can find is "last shutdown was unexpected" and "the system turned off without properly shutting down first."
SuperSluether

Joined: 6 Jul 14
Posts: 94
United States
Message 57204 - Posted: 29 Oct 2014, 18:16:45 UTC - in response to Message 57203.  

I found a file called stdoutdae.txt that looks like a saved copy of the BOINC Event Log. On the 28th, right after a GPUGrid task finished downloading, there's a break in the log. The next part has the normal startup info, followed by various errors saying it can't parse files for Collatz Conjecture. This is right around when I moved the project files for MilkyWay and GPUGrid somewhere else so BOINC would not try to start a GPU task. There are a lot of errors saying that files, applications, and workunits are "outside project in state file." There is also "no project URL in task state file" for GPUGrid, Rosetta@home, and climateprediction.net at this point in the file. After the messages about the files, a lot of tasks from Rosetta@home and climateprediction.net are reported as having exited with zero status but no "finished" file. After that, the log goes back to normal. (This is where I reset the projects that had GPU tasks.)

I didn't know Rosetta and CPDN were affected by this issue. I sure hope it doesn't happen again.
Juha
Volunteer developer
Volunteer tester
Help desk expert

Joined: 20 Nov 12
Posts: 801
Finland
Message 57208 - Posted: 29 Oct 2014, 21:50:55 UTC - in response to Message 57204.  

BOINC may have been writing client_state.xml or account*.xml files when the computer locked up. That may have corrupted the files. Or, without knowing what files you moved, you may have caused the problems yourself. BOINC doesn't like it when its files disappear.

The next time you need to control the client without starting it, edit the client_state.xml file. The elements that correspond to the Activity menu in the Manager are:

<user_run_request>
<user_gpu_request>
<user_network_request>

And the values to use:

1 = always
2 = based on preferences
3 = suspend
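For anyone who would rather script the edit than do it by hand, here is a minimal Python sketch. It assumes the element appears in client_state.xml as `<name>N</name>` with a plain integer value, and that the client is stopped and the file backed up before it is touched. The helper name `set_activity_mode` and the mode labels are made up for illustration; only the element names and values come from the lists above.

```python
import re

# Element names and numeric values as listed above; everything else is illustrative.
ACTIVITY_ELEMENTS = ("user_run_request", "user_gpu_request", "user_network_request")
MODES = {"always": 1, "preferences": 2, "suspend": 3}

def set_activity_mode(state_text, element, mode):
    """Return state_text with <element>N</element> rewritten to the chosen mode."""
    if element not in ACTIVITY_ELEMENTS:
        raise ValueError(f"unknown element: {element}")
    pattern = re.compile(rf"(<{element}>)\s*\d+\s*(</{element}>)")
    new_text, count = pattern.subn(rf"\g<1>{MODES[mode]}\g<2>", state_text)
    if count == 0:
        raise ValueError(f"<{element}> not found in state file text")
    return new_text

# Example use (the path is an assumption; adjust it to your own data directory,
# and only run this while the BOINC client is stopped):
# from pathlib import Path
# path = Path(r"C:\ProgramData\BOINC\client_state.xml")
# path.write_text(set_activity_mode(path.read_text(), "user_gpu_request", "suspend"))
```

The sketch edits the text with a regular expression rather than an XML library so that the rest of the file is left byte-for-byte as the client wrote it.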
SuperSluether

Joined: 6 Jul 14
Posts: 94
United States
Message 57209 - Posted: 29 Oct 2014, 22:19:41 UTC - in response to Message 57208.  

Thanks, I didn't know there was a way to do that without opening the manager.

Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.