(Waiting on GPU) Returns!...Repeatable on demand!

Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15483
Netherlands
Message 42241 - Posted: 25 Jan 2012, 18:55:54 UTC - in response to Message 42240.  

Not without sending along your complete BOINC Data directory. It isn't just copying the tasks along to another computer, it requires an attachment to that project, entries made in the client_state.xml file etc.
ID: 42241
Jimmy G (BA)

Joined: 26 Sep 11
Posts: 41
Message 42242 - Posted: 25 Jan 2012, 20:07:31 UTC - in response to Message 42241.  
Last modified: 25 Jan 2012, 20:23:52 UTC

Not without sending along your complete BOINC Data directory. It isn't just copying the tasks along to another computer, it requires an attachment to that project, entries made in the client_state.xml file etc.


Tell me exactly what you need.../library/application support/boinc data? That weighs in at 10.71GB...too big for an email attachment. :) Perhaps I can send along the contents in a series of attachments?

:)

Edit: Looking at the contents of that folder the bulk of it is the CPDN folder @ 10.45GB...that's all been suspended by me for the past week waiting to hear back from you folks on tripping First Issue... above. That leaves a remainder of ~0.26GB...much more reasonable... :)
ID: 42242
Charlie Fenton
Project developer

Joined: 17 Jul 06
Posts: 287
United States
Message 42251 - Posted: 26 Jan 2012, 9:54:39 UTC

Hi Jimmy G,

The application that was restarting appears to be the BOINC Client, not the BOINC Manager (which is just the user interface to the Client). But since there was no crash report in stderrdae.txt, I don't have any idea why it restarted itself.

I have forwarded the information you sent me to David Anderson. He is the person who has been working on the BOINC scheduler. I really know very little about that aspect of BOINC. I am hoping he can help you figure this out.

Cheers,
--Charlie

Charlie Fenton
BOINC / SETI@home Macintosh & Windows Programmer
ID: 42251
Jimmy G (BA)

Joined: 26 Sep 11
Posts: 41
Message 42259 - Posted: 26 Jan 2012, 13:49:50 UTC - in response to Message 42251.  
Last modified: 26 Jan 2012, 13:57:13 UTC

Hi Jimmy G,

The application that was restarting appears to be the BOINC Client, not the BOINC Manager (which is just the user interface to the Client). But since there was no crash report in stderrdae.txt, I don't have any idea why it restarted itself.

I have forwarded the information you sent me to David Anderson. He is the person who has been working on the BOINC scheduler. I really know very little about that aspect of BOINC. I am hoping he can help you figure this out.

Cheers,
--Charlie


Hi Charlie,

Thanks for getting back on this... :)

My hope here has been to lend you folks a hand on (what have turned out to be) some of the not-so-common (?) glitches with running BOINC projects...though my particular setup seems to encounter them with a fair bit of regularity...

The Waiting to run (waiting for GPU memory) status occurs several times a month on my end and, as mentioned in the OP, presents itself in a couple of scenarios. It took me a while of paying attention (read: babysitting) to figure out the series of events leading up to the error.

My other hope here has been to be able to facilitate whatever I can on my end to help you folks log, track down or otherwise gather some important clues as to the cause of these glitches. Fortunately (for you) I have been temporarily retired and have had some free time on my hands to play with this...though that situation will change on my end. Sooo, I figured, let's take a crack-at-it, while I still can, and maybe you folks can walk away with a better understanding of some of the anomalies that are occurring with your code.

As I've mentioned several times...if there is any troubleshooting code, instructions or scripts that you folks need me to run on my end to better monitor the wayward processes, kindly let me know where I can help! FWIW, I'm a retired 32-year telecom tech...so I'm not completely without skills! Ha!

Best to you,
:)

P.S. I still have those 7 WUs sitting in both the queue and in memory and they will be due back by their February 5th deadline...so the clock is ticking on these if there's anything you'd like me to do with them to help you out!

...tick, tick, tick, tick, tick...
ID: 42259
Charlie Fenton
Project developer

Joined: 17 Jul 06
Posts: 287
United States
Message 42273 - Posted: 27 Jan 2012, 1:11:26 UTC - in response to Message 42259.  
Last modified: 27 Jan 2012, 1:11:43 UTC

Thanks for your offers of help. We don't need any more information regarding the "Waiting for GPU memory" issue; we are well aware of that. It means just what it says: there is not enough free memory in your GPU to run the task. If you want to get more information, set the coproc_debug flag in your cc_config.xml file as explained here.
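
For readers who want to try the flag mentioned above, here is a minimal sketch of generating a cc_config.xml that turns it on. It assumes the usual layout in which log flags sit inside a `<log_flags>` element under `<cc_config>`; the helper name `build_cc_config` is made up for this illustration and is not part of BOINC.

```python
import xml.etree.ElementTree as ET


def build_cc_config(flags):
    """Return cc_config.xml text that enables the given log flags.

    Hypothetical helper for illustration only; BOINC does not ship it.
    """
    root = ET.Element("cc_config")
    log_flags = ET.SubElement(root, "log_flags")
    for name in flags:
        # Each flag is its own element; a value of 1 switches it on.
        ET.SubElement(log_flags, name).text = "1"
    # ET.indent only exists on Python 3.9+; older versions emit one line.
    if hasattr(ET, "indent"):
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


config_text = build_cc_config(["coproc_debug"])
print(config_text)
```

The printed text would then be saved as cc_config.xml in the BOINC Data directory (on the Mac discussed in this thread, /library/application support/boinc data), and the client would need to re-read its configuration or be restarted before the extra log lines appear.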

Unfortunately, due to a bug in some drivers, BOINC currently cannot check available GPU RAM periodically, so the check is made only when you launch BOINC. That means that if more GPU RAM becomes available after BOINC starts, BOINC does not know about it. We are working on this problem.

My concerns are the several other issues you have raised, such as the md5 errors, spontaneous restarts, etc.

Have you considered becoming a volunteer BOINC tester? We always need more testers, especially on the Mac. You can learn about this here.
Charlie Fenton
BOINC / SETI@home Macintosh & Windows Programmer
ID: 42273
Jimmy G (BA)

Joined: 26 Sep 11
Posts: 41
Message 42275 - Posted: 27 Jan 2012, 3:18:38 UTC - in response to Message 42273.  

Hi Charlie,

Well, the (waiting for GPU memory) status is odd for showing up on my machine as I don't have a GPU for which any of my BOINC projects could call into action! The AMD Radeon HD 6970M should never be receiving any calls from any of the software involved. Am I missing something here?

Also, it would appear that I cannot "set the coproc_debug flag in your cc_config.xml file" as I'm still running 6.12.35. From your link...

<coproc_debug>
Show details of coprocessor (GPU) scheduling. New in 6.3


As for...

My concerns are the several other issues you have raised, such as the md5 errors, spontaneous restarts, etc.


...I just suffered another CPU lockup issue so I decided to look into my system logs to see if any clues existed there. I'll be starting a new thread for that issue once I have my information gathered. For now, it looks like the BOINC Manager is going into an endless loop without an exit door built in...but more on that in the upcoming post.

As for testing, I have been giving it my consideration...I went so far as to subscribe to the BOINC Alpha mailing list <boinc_alpha at ssl.berkeley dot edu> to get a better idea of the work involved...it looks interesting. I've got a busy Spring ahead of me, but things might be more forgiving time-wise some time over the summer... :)

Best,
JG
ID: 42275
Charlie Fenton
Project developer

Joined: 17 Jul 06
Posts: 287
United States
Message 42278 - Posted: 27 Jan 2012, 7:27:15 UTC - in response to Message 42275.  

Well, the (waiting for GPU memory) status is odd for showing up on my machine as I don't have a GPU for which any of my BOINC projects could call into action!

Ah, I missed that. That is another bug that was fixed in the BOINC 7.0 series:

David 12 Sept 2011
in GUI RPC, change RESULT.gpu_mem_wait to scheduler_wait.
It means that the app did a boinc_temporary_exit(),
and is waiting to be rescheduled.
GPU mem wait is one source of this, not the only one
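
To illustrate the change log entry above, here is a rough sketch of why a generic "waiting to be rescheduled" flag with a separate reason avoids mislabelling a CPU-only task as waiting for GPU memory. The class and function names (`TaskState`, `status_text`) and the "Running" label are invented for this example; they are not the actual BOINC structures or GUI RPC fields.

```python
from dataclasses import dataclass


@dataclass
class TaskState:
    """Hypothetical stand-in for the per-task state a GUI might receive."""
    name: str
    scheduler_wait: bool = False  # app exited temporarily, awaits rescheduling
    wait_reason: str = ""         # optional explanation; may be empty


def status_text(task):
    """Build a status string without assuming the wait is GPU related."""
    if not task.scheduler_wait:
        return "Running"
    if task.wait_reason:
        return f"Waiting to run ({task.wait_reason})"
    return "Waiting to run"


gpu_task = TaskState("wu_gpu", True, "waiting for GPU memory")
cpu_task = TaskState("wu_cpu", True)
print(status_text(gpu_task))
print(status_text(cpu_task))
```

With a single GPU-specific flag, both tasks would have been shown with the GPU memory message; with the generic flag, only the task that actually reports that reason is.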

We're no longer updating the 6.12 series because we're concentrating on getting 7.0 out the door. With so many changes in 7.0.x, we really aren't following up on BOINC 6.12.x bugs. I suggest you try the latest 7.0.x alpha version and see if you still have these problems.

As you determined, the coproc_debug flag makes no sense if you don't have GPU tasks.
Charlie Fenton
BOINC / SETI@home Macintosh & Windows Programmer
ID: 42278
Jimmy G (BA)

Joined: 26 Sep 11
Posts: 41
Message 42779 - Posted: 28 Feb 2012, 13:53:10 UTC

Resolution Update: It looks like the problem was identified as a narrow issue with BOINC 6.12.34/35 on Mac OS X 10.6.8, which was cleared with BOINC 6.12.41, as noted here...

"WU Freezes BOINC Manager" Redux...:
http://milkyway.cs.rpi.edu/milkyway/forum_thread.php?id=2789&nowrap=true#53464

...thanks Kashi and scasady for figuring this out! Also thanks to everyone at MWAH & BOINC who spent time with me trying to solve this!

Oh, happy day,
:)
ID: 42779
Jimmy G (BA)

Joined: 26 Sep 11
Posts: 41
Message 43602 - Posted: 19 Apr 2012, 12:51:02 UTC

6-Week Final Update:...looks like BOINC Manager 6.12.43 has solved all of my issues with spontaneous restarts, "waiting for GPU" statuses and mDNSResponder system freezes...yay!...

...seeing how 6.12.35 was not playing well with OS X 10.6.8, perhaps the kind folks at BOINC would consider elevating 6.12.43 as their preferred v.6 OS X install on this page?...

Download BOINC client software:
http://boinc.berkeley.edu/download_all.php

...and again, my sincere thanks to everyone for all your time, help, insights and suggestions!... :)
ID: 43602

Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.