BOINC 'suspended' but progress bar keeps moving

Siroaf
Joined: 7 Nov 12
Posts: 7
United States
Message 46220 - Posted: 7 Nov 2012, 10:58:25 UTC

BOINC Ver: 7.0.28 (x64) (Windows)

Anyway.. I've noticed recently that when I go AFK from my computer for a bit and then come back, BOINC is still 'crunching' away at tasks, even though it says it's suspended.

So.. it says "Suspended - Computer in use" but the progress bar keeps moving, and it takes between 1 and 10 minutes to stop.. kinda like it's trying to "finish" a task, so to speak..

I usually see it with my video cards.. Both cards are running at 100% on the GPU, but the CPU is at maybe ~5%.. Could there be an issue with GPU/CUDA tasks and BOINC's suspend commands?

Anyone else have this issue??

Yes.. Preferences are set properly. :P
Intel 2600k @ 4.8ghz
GTX 560Ti/GTS450 both OC'd
Many MODs! :)
ID: 46220
Siroaf
Joined: 7 Nov 12
Posts: 7
United States
Message 46245 - Posted: 12 Nov 2012, 12:18:30 UTC - in response to Message 46220.  

eh? no responses? :P
ID: 46245
Claggy
Joined: 23 Apr 07
Posts: 1112
United Kingdom
Message 46246 - Posted: 12 Nov 2012, 14:07:09 UTC - in response to Message 46245.  

At which projects? With which of their apps? Best to ask at those projects.

Claggy
ID: 46246
Siroaf
Joined: 7 Nov 12
Posts: 7
United States
Message 46270 - Posted: 14 Nov 2012, 10:30:26 UTC - in response to Message 46246.  

Seti@home :P
ID: 46270
Claggy
Joined: 23 Apr 07
Posts: 1112
United Kingdom
Message 46271 - Posted: 14 Nov 2012, 14:06:40 UTC - in response to Message 46270.  
Last modified: 14 Nov 2012, 14:12:27 UTC

With which one of their apps? The Cuda_fermi 6.10 MB app, the OpenCL 6.04 AP app, or are you running anonymous platform with one of a dozen or so other CUDA apps?

Can we have a link to your host at SETI? There are no users called Siroaf there.

Claggy
ID: 46271
Siroaf
Joined: 7 Nov 12
Posts: 7
United States
Message 46283 - Posted: 15 Nov 2012, 12:30:52 UTC - in response to Message 46271.  
Last modified: 15 Nov 2012, 12:36:45 UTC

I'm retarded, sorry, lol.. User: Astroman305
Sir Oaf is just for the forums here..

I'm not running anything else besides BOINC with Seti.. I also haven't sat down and studied every aspect of this program, nor whatever else goes into it..
I'm kinda 'set and forget' to a CERTAIN point.. depends on WHAT piques my interest :P

Note: It hasn't done what I described in my opening post since that day.. I was kinda thinking that maybe it was a conflict between my screen saver and falling asleep to Netflix, paused in fullscreen.. Who knows.. :P
ID: 46283
Claggy
Joined: 23 Apr 07
Posts: 1112
United Kingdom
Message 46288 - Posted: 15 Nov 2012, 14:06:38 UTC - in response to Message 46283.  

Simple, you're running buggy NVIDIA graphics drivers:

http://setiathome.berkeley.edu/result.php?resultid=2705433362

setiathome_CUDA: No CUDA devices found
setiathome_CUDA: Found 0 CUDA device(s):
setiathome_CUDA: CUDA Device 1 specified, checking...
Device cannot be used
SETI@home NOT using CUDA, falling back on host CPU processing
setiathome_enhanced 6.09 Visual Studio/Microsoft C++


The 295.xx and 296.xx drivers have a sleeping-monitor bug: once the monitor goes to sleep, the CUDA device disappears.
When the next WU starts, the CUDA device isn't available, so the CUDA app goes into CPU fallback mode and takes forever to complete the WU, quite often ending in a Maximum Time Exceeded error.
Either run 290.xx or earlier drivers, or 301.xx or later drivers. I did have a sticky post on the SETI Number Crunching forum, but it has been replaced by one by Richard:

NVidia driver problems which cause computation errors
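
As a rough illustration of the driver ranges given above, here is a minimal Python sketch that flags a driver version string from the 295.xx or 296.xx series. The function name `has_sleeping_monitor_bug` is made up for this example, and it only encodes the two series named in this post; it says nothing about any other driver series.

```python
def has_sleeping_monitor_bug(driver_version: str) -> bool:
    """Return True if the driver's major series is 295 or 296.

    Hypothetical helper: only the two series named in the post are flagged.
    """
    # "306.97" -> "306"; the thread's "295.xx" notation also parses fine.
    major = int(driver_version.strip().split(".")[0])
    return major in (295, 296)


if __name__ == "__main__":
    # Version strings taken from the thread's own wording.
    for version in ("290.xx", "295.xx", "296.xx", "301.xx", "306.97"):
        status = "affected" if has_sleeping_monitor_bug(version) else "not flagged"
        print(version, status)
```

The driver string itself is shown on the host's details page, in the "Coprocessors" line.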

Claggy
ID: 46288
Siroaf
Joined: 7 Nov 12
Posts: 7
United States
Message 46308 - Posted: 16 Nov 2012, 12:00:56 UTC - in response to Message 46288.  

Thanks for the info!

That's weird.. because it's using both of my video cards..
When the program starts after my idle time, both video cards are rackin' away at 100%.. so weird.. :P
ID: 46308
Claggy
Joined: 23 Apr 07
Posts: 1112
United Kingdom
Message 46309 - Posted: 16 Nov 2012, 13:33:50 UTC - in response to Message 46308.  
Last modified: 16 Nov 2012, 13:48:21 UTC

Both your cards? Only one is being reported:

Computer 6821378

Coprocessors NVIDIA GeForce GTX 560 Ti (1023MB) driver: 306.97


BOINC will only use the most capable card by default; the other one will be listed as 'Not Used' in the startup messages.
To utilise both, you'll need to make a cc_config.xml with the following in it, drop it in your BOINC Data directory (the location is in your BOINC startup messages, and the folder is likely hidden), and restart BOINC:

<cc_config>
  <options>
    <use_all_gpus>1</use_all_gpus>
  </options>
</cc_config>


Client configuration
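
If you would rather script it than edit the file by hand, the config above could be produced and checked with something like the following Python sketch. The helper names `write_cc_config` and `uses_all_gpus` are made up for this example, and the demo writes into a temporary folder rather than a real BOINC Data directory, since that location differs per machine.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def write_cc_config(data_dir: Path) -> Path:
    """Write a cc_config.xml that sets use_all_gpus to 1 (hypothetical helper)."""
    root = ET.Element("cc_config")
    options = ET.SubElement(root, "options")
    ET.SubElement(options, "use_all_gpus").text = "1"
    target = data_dir / "cc_config.xml"
    # Note: this overwrites any cc_config.xml already in data_dir.
    ET.ElementTree(root).write(str(target))
    return target


def uses_all_gpus(config_path: Path) -> bool:
    """Return True if the given cc_config.xml sets use_all_gpus to 1."""
    node = ET.parse(str(config_path)).getroot().find("options/use_all_gpus")
    return node is not None and (node.text or "").strip() == "1"


if __name__ == "__main__":
    # Demo in a throwaway folder; point data_dir at your own BOINC Data
    # directory (see the startup messages) only once you are happy with it.
    with tempfile.TemporaryDirectory() as tmp:
        path = write_cc_config(Path(tmp))
        print(path.name, uses_all_gpus(path))
```

Because the sketch overwrites the file, merge by hand if you already have a cc_config.xml with other options in it. BOINC still needs a restart afterwards, as described above.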

Claggy
ID: 46309
Siroaf
Joined: 7 Nov 12
Posts: 7
United States
Message 46317 - Posted: 16 Nov 2012, 21:05:01 UTC - in response to Message 46309.  
Last modified: 16 Nov 2012, 21:06:04 UTC

Wow.. that is NOT the screen I saw last night! lol
What it said last night was something along the lines of: GTS 450 GPU [2] something.. I'll copy and paste it when I can, as the computer specs seem to be changing.. That's funny...

According to my monitoring software (EVGA Precision), when Seti starts running, both cards go up to 99% and heat up.. so I assume they are doing something with Seti.. :P

But I'll look into that config file.
ID: 46317
mps
Joined: 11 Dec 12
Posts: 3
Canada
Message 46711 - Posted: 11 Dec 2012, 4:59:22 UTC - in response to Message 46317.  

Hello
I've been having a similar problem with my ATI GPU. Except that with mine it just keeps running until I shut down BOINC.
It does this with workunits from any of the projects that use my GPU (PrimeGrid, SETI@Home Beta, Einstein@home).
I don't remember exactly when it started, but it was within the last few months (I thought it might have just been a bad work unit the first few times I noticed it).

I have my preferences set so that BOINC starts running after I'm away for a couple of minutes. When I come back the CPU tasks stop properly, but the GPU one often (not always) keeps running. When it does this, it doesn't respond to me trying to manually suspend it either. It says that it's suspended, but the time keeps progressing, and the task manager says that the process is still running. I have to shut down the BOINC Manager in order to stop the process.

Any help or advice would be appreciated.
Thanks
ID: 46711
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15483
Netherlands
Message 46712 - Posted: 11 Dec 2012, 8:40:40 UTC - in response to Message 46711.  

And that's with which BOINC version and on which operating system?
If it's Linux with any version before 7.0.29, there's a bug in the idle detection for (wireless) (USB) keyboards/mice.

Otherwise, make sure that you use the correct preferences, online or local.
ID: 46712
Richard Haselgrove
Volunteer tester
Help desk expert
Joined: 5 Oct 06
Posts: 5082
United Kingdom
Message 46713 - Posted: 11 Dec 2012, 10:18:38 UTC - in response to Message 46712.  

Provided both activity settings are set correctly:

* Run based on preferences
* Use GPU based on preferences

there's no way that idle detection could work for the CPU but not work for the GPU.

I've submitted two documented examples of the Einstein Windows GPU application (BRP4 v1.32) failing to respond to a 'suspend' instruction from BOINC (BRP4 1.31/1.32 GPU app release: feedback thread).

In one case, two tasks were running on the same card at the same time: one noticed that it was supposed to suspend, the other carried on regardless. I'm beginning to think this is primarily an application problem, rather than a BOINC problem - it would be helpful if future posters to this thread could indicate which project/application is active when they observe the problem.
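
To make the "application problem" idea concrete, here is a toy Python model of two tasks receiving the same suspend request, where only one of them checks for it. This is a sketch of the general cooperative-suspend pattern, not BOINC's actual API; `ToyTask` and `run` are invented names.

```python
from dataclasses import dataclass


@dataclass
class ToyTask:
    """Hypothetical stand-in for a science app; not BOINC code."""
    honours_suspend: bool
    progress: int = 0

    def step(self, suspend_requested: bool) -> None:
        # A well-behaved app polls the flag before each chunk of work.
        if self.honours_suspend and suspend_requested:
            return
        # An app that never looks at the flag just keeps crunching.
        self.progress += 1


def run(task: ToyTask, suspend_at: int, total_ticks: int) -> int:
    """Drive the task; the 'client' raises the suspend flag at suspend_at."""
    for tick in range(total_ticks):
        task.step(suspend_requested=tick >= suspend_at)
    return task.progress


if __name__ == "__main__":
    print(run(ToyTask(honours_suspend=True), suspend_at=3, total_ticks=10))
    print(run(ToyTask(honours_suspend=False), suspend_at=3, total_ticks=10))
```

Both tasks are sent the identical request on every tick; the only difference is whether the task looks at it, which is the kind of behaviour the two-tasks-on-one-card case suggests.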
ID: 46713
ITgreybeard
Joined: 22 Dec 10
Posts: 14
United States
Message 46754 - Posted: 13 Dec 2012, 5:52:38 UTC - in response to Message 46220.  

Me too, on Win 7/64 Pro w/ATI FirePro 4800 board. BOINC 7.0.28 (x64)

The GPU continuing to work while suspension is indicated does not seem to occur immediately after rebooting, but rather after some indeterminate length of time. It has led me to shut down the connected client for the remainder of my uptime session.

I have seen this only in the past 2 weeks, but had been away for the 2 weeks prior.
Win 10-64 Pro on: Dual Xeon Quad E5472s 3.0GHz w/128GB DDR2 Main Memory + ATI FirePro W5000 GPU; Quad+HT i7-860 2.8GHz 8GB + ATI FirePro V4800 GPU; Quad Q6600-775 2.4GHz 4GB + ATI FirePro V4800 GPU; + 3 laptops
ID: 46754
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15483
Netherlands
Message 46760 - Posted: 13 Dec 2012, 7:14:39 UTC - in response to Message 46754.  

And have you told the affected project about this? As Richard indicates, it's very well possible that the project's science application is not 'listening' to what BOINC is telling it to do and is ignoring the suspend request. Then it's up to the project to fix that, as what else is BOINC to do about it when the science application ignores its commands?
ID: 46760
ITgreybeard
Joined: 22 Dec 10
Posts: 14
United States
Message 46778 - Posted: 13 Dec 2012, 16:13:19 UTC - in response to Message 46760.  
Last modified: 13 Dec 2012, 16:14:34 UTC

Thank you for the suggestion.

As a longtime designer and coder, I would have expected the BOINC executive application to be engineered to be in a "command and control" position in relation to the running project code. I would have thought there would be a BOINC library module built into every project that would allocate and grant continued access to resources, and that would report project task requests back to the executive application. If the exec app did not receive X such requests from a task per unit time, it would report the problem to a log file, to the GUI and perhaps even back to BOINC web central. I recall that method as being a simple way to complete the feedback loop on the health of task execution.
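
The "X requests per unit time" check described here could be sketched roughly as follows in Python. This illustrates the poster's proposed design only, not how BOINC is actually built; `RequestRateMonitor` and its parameters are invented for the example.

```python
from collections import deque


class RequestRateMonitor:
    """Hypothetical per-task monitor: expects min_requests per window."""

    def __init__(self, min_requests: int, window_seconds: float) -> None:
        self.min_requests = min_requests
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def record(self, now: float) -> None:
        """Note that the task made a request at time `now` (in seconds)."""
        self._timestamps.append(now)

    def is_healthy(self, now: float) -> bool:
        """True if enough requests arrived within the last window."""
        cutoff = now - self.window_seconds
        # Drop requests that have fallen out of the sliding window.
        while self._timestamps and self._timestamps[0] < cutoff:
            self._timestamps.popleft()
        return len(self._timestamps) >= self.min_requests


if __name__ == "__main__":
    monitor = RequestRateMonitor(min_requests=3, window_seconds=10.0)
    for t in (1.0, 2.0, 3.0):
        monitor.record(t)
    print(monitor.is_healthy(now=5.0))   # task is still reporting in
    print(monitor.is_healthy(now=12.5))  # task has gone quiet
```

An unhealthy result is the point at which such an exec app would write to the log or flag the task in the GUI.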

I should probably have mentioned in my posting that my machine was running, and not suspending, an IBM World Community Grid 'Help Conquer Cancer' task, which is different from the Einstein and SETI project tasks reported earlier in the thread. Once a symptom appears across a variety of tasks, it is usually, but not always, the case that the problem is systemic. And the variety of GPU models involved would lead one to rank the GPU driver or architecture lower as a possible cause.

Would you still recommend that this be reported to the project owners?
ID: 46778
SekeRob2
Joined: 6 Jul 10
Posts: 585
Italy
Message 46779 - Posted: 13 Dec 2012, 17:10:28 UTC - in response to Message 46778.  
Last modified: 13 Dec 2012, 17:10:39 UTC

I can't remember having seen zombie processes reported on the WCG forums in recent weeks or months. [Yes, report it at WCG with as much detail as you can collect.]

Internally, science apps have a kill-self type of switch. If the run time is, e.g., 10x greater than the original estimate, the task is supposed to bail out. It is given these parameters at start, much as it is told at start how frequently, at most, it's allowed to write a progress backup to disk, per user setting. Also, if the core client does not hear from the science app for longer than 30 seconds, it's supposed to restart the process. I think it does that on the basis of the PID. If it does that 100x for a science task, the job is killed. The symptom logged is a "zero status... restart client... if this happens often..." series of messages.
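
Put as a decision rule, the safeguards described in this post look roughly like the Python sketch below. The three numbers are the ones quoted in the post and have not been checked against the BOINC source; `watchdog_action` and its return values are invented for illustration.

```python
# Figures as quoted in the post above; treat them as the poster's
# recollection, not as verified BOINC constants.
RUNTIME_FACTOR = 10    # run time more than 10x the original estimate
SILENCE_LIMIT_S = 30   # client has not heard from the app for 30 seconds
MAX_RESTARTS = 100     # restarts allowed before the job is killed


def watchdog_action(elapsed_s: float, estimate_s: float,
                    silent_s: float, restarts: int) -> str:
    """Return 'abort', 'kill', 'restart' or 'continue' (hypothetical labels)."""
    if elapsed_s > RUNTIME_FACTOR * estimate_s:
        return "abort"      # the app's own kill-self switch
    if silent_s > SILENCE_LIMIT_S:
        # The client restarts a silent process, up to a limit.
        return "kill" if restarts >= MAX_RESTARTS else "restart"
    return "continue"


if __name__ == "__main__":
    print(watchdog_action(elapsed_s=100, estimate_s=50, silent_s=0, restarts=0))
    print(watchdog_action(elapsed_s=100, estimate_s=50, silent_s=31, restarts=0))
```

Note that none of these checks would stop a task that keeps reporting in normally while ignoring a suspend request, which is the situation described in this thread.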
Coelum Non Animum Mutant, Qui Trans Mare Currunt
ID: 46779
mps
Joined: 11 Dec 12
Posts: 3
Canada
Message 46853 - Posted: 18 Dec 2012, 17:00:18 UTC

Sorry for taking so long to respond. I thought I had set it up to email me when there was activity on this thread...

I'm on Windows 7, using BOINC version 7.0.28 x64. I'm using a hyperthreaded quad core, if that's helpful (yay 8 vCPU's!).
I double checked the preferences and they are not set to use the GPU while the computer is in use (it does sometimes stop the GPU properly).

The last task to do this (just a few minutes ago), was SETI@home Beta, the application is SETI@Home v7 6.99 (opencl_ati_sah).

I've seen it happen with PrimeGrid and Einstein@home as well. I could record and post the specific application names next time it happens if that would be helpful. As others have suggested, it seems like (from the outside) this is a BOINC client issue since it's occurring with multiple projects and applications. Either that or there are just a number of projects that are missing something when they're writing their code in regards to properly suspending the GPU tasks.



ID: 46853
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15483
Netherlands
Message 46857 - Posted: 18 Dec 2012, 17:18:30 UTC - in response to Message 46853.  

Do you run more than one task at the same time on the GPU? Some people run 2 to 6 of them, and then it's possible that the 2nd to 6th application does not get the 'suspend' message from BOINC. In that case it would be an application bug, where the app is ignoring what the BOINC client tells it to do, although no application builder builds apps specifically to be run several at a time on a GPU.
ID: 46857
mps
Joined: 11 Dec 12
Posts: 3
Canada
Message 46858 - Posted: 18 Dec 2012, 18:10:58 UTC - in response to Message 46857.  

I have one of those laptops where there's a lower powered Intel GPU as well as a better (but more power hungry) ATI GPU.

I'm not sure how I could tell it to run more than one application on the GPU, or how to check if it's allowed to (I don't see any options in the preferences about that). I've never seen more than one application running with the GPU at once. And the client log only says that it recognizes the ATI GPU.
ID: 46858

Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.