7.16.3 has idle GPU: which parameter is causing the delay?

Message boards : Questions and problems : 7.16.3 has idle GPU: which parameter is causing the delay?
Joseph Stateson
Volunteer tester
Joined: 27 Jun 08
Posts: 641
United States
Message 95266 - Posted: 18 Jan 2020, 16:07:57 UTC
Last modified: 18 Jan 2020, 16:27:00 UTC

I am trying to reduce my count of Einstein tasks that I accidentally downloaded, and had set SETI to NNT (no new tasks) to concentrate on clearing the Einstein backlog. Because of limited CPU threads, the Einstein project is set to a maximum of 6 concurrent tasks, one on each of the first 6 GPUs. This leaves 3 GPUs idle with two threads available.

After about 18 hours I decided to let SETI start downloading, but I set the resource share to 0 by selecting the "work=0" venue and requesting an update. Unlike my attempt at Einstein, I verified the resource share was 0 before allowing more tasks. I am not getting any tasks and three GPUs are idle, so I requested another update and nothing happened. I looked at the SETI project properties and am posting some of what I see, as I suspect something there is causing the lack of work.
Duration correction factor	1.0000000000
Scheduling priority	-1,012.61
CPU backoff time	--
Backoff Interval	-
NVIDIA backoff time	1/18/2020 9:57:21 AM
Backoff Interval	00:10:00


In the time it took me to write this post (10 minutes?) the above changed as follows:
Duration correction factor	1.0000000000
Scheduling priority	-1,000.97
CPU backoff time	--
Backoff Interval	-
NVIDIA backoff time	1/18/2020 10:12:21 AM
Backoff Interval	00:20:00


I noticed the scheduling priority is slowly getting back to a positive number. When it becomes positive, will I start getting tasks? What can be done to speed this up, assuming my guess is correct?

Assuming it increased 10 points in 10 minutes (about 1 point per minute), it looks like 1000 / 1 = 1000 minutes, or nearly 17 hours, to wait. Can the priority be set to a value that is not so far below zero?
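As a rough check on that estimate, here is a minimal Python sketch of the extrapolation. It assumes the scheduling priority climbs linearly and that the two property snapshots above were taken about 10 minutes apart; both are guesses, not documented BOINC behaviour. The function name `minutes_until_zero` is made up for illustration.

```python
def minutes_until_zero(priority_then, priority_now, minutes_elapsed):
    """Linearly extrapolate how long a negative priority needs to reach zero."""
    # Observed rate of change in priority points per minute.
    rate = (priority_now - priority_then) / minutes_elapsed
    if rate <= 0:
        raise ValueError("priority is not increasing, cannot extrapolate")
    # Distance still to cover (priority_now is negative) divided by the rate.
    return -priority_now / rate


# Values from the two snapshots above; the 10 minute gap is an assumption.
eta = minutes_until_zero(-1012.61, -1000.97, 10.0)
print(f"{eta:.0f} minutes, about {eta / 60:.1f} hours")

# With the rounded figures (10 points in 10 minutes, 1000 points to go):
rounded_eta = minutes_until_zero(-1010.0, -1000.0, 10.0)
print(f"{rounded_eta:.0f} minutes")
```

With the measured change of 11.64 points this comes out at roughly 860 minutes, and with the rounded figures at 1000 minutes, so either way the wait would be many hours if the linear assumption held.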
ID: 95266
Richard Haselgrove
Volunteer tester
Help desk expert
Joined: 5 Oct 06
Posts: 5077
United Kingdom
Message 95267 - Posted: 18 Jan 2020, 16:21:23 UTC - in response to Message 95266.  

Set <sched_op_debug> and see what you're actually asking for. You need to distinguish between "SETI doesn't have any work available" and "I didn't even ask for any work, available or not".
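For anyone unsure where that flag goes: log flags live in the `<log_flags>` section of cc_config.xml in the BOINC data directory. The Python sketch below only illustrates the resulting XML structure; the helper name `enable_log_flag` and the minimal starting file are assumptions for the example, and editing the file by hand works just as well.

```python
import xml.etree.ElementTree as ET


def enable_log_flag(config_xml, flag):
    """Return cc_config.xml text with <flag>1</flag> set under <log_flags>."""
    root = ET.fromstring(config_xml)
    # Create the <log_flags> section if the file does not have one yet.
    log_flags = root.find("log_flags")
    if log_flags is None:
        log_flags = ET.SubElement(root, "log_flags")
    # Create the flag element if missing, then switch it on.
    element = log_flags.find(flag)
    if element is None:
        element = ET.SubElement(log_flags, flag)
    element.text = "1"
    return ET.tostring(root, encoding="unicode")


# Hypothetical minimal config; a real file may contain more sections.
minimal = "<cc_config><log_flags></log_flags></cc_config>"
updated = enable_log_flag(minimal, "sched_op_debug")
print(updated)
```

After saving the change, the client has to re-read its config files before the extra `[sched_op]` lines show up in the Event Log.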
ID: 95267
Joseph Stateson
Volunteer tester
Joined: 27 Jun 08
Posts: 641
United States
Message 95268 - Posted: 18 Jan 2020, 16:44:46 UTC - in response to Message 95267.  
Last modified: 18 Jan 2020, 16:50:09 UTC

Set <sched_op_debug> and see what you're actually asking for. You need to distinguish between "SETI doesn't have any work available" and "I didn't even ask for any work, available or not".


Issuing "read config" seems to have messed with those parameters. I did not expect to see "0" for priority. I also verified this using BOINC Manager, not just BoincTasks.

108			1/18/2020 10:29:14 AM	Re-reading cc_config.xml	
---
140	Einstein@Home	1/18/2020 10:29:19 AM	Sending scheduler request: To report completed tasks.	
141	Einstein@Home	1/18/2020 10:29:19 AM	Reporting 1 completed tasks	
142	Einstein@Home	1/18/2020 10:29:19 AM	Not requesting tasks: "no new tasks" requested via Manager	
143	Einstein@Home	1/18/2020 10:29:21 AM	Scheduler request completed	
144	SETI@home	1/18/2020 10:29:47 AM	update requested by user	
145	SETI@home	1/18/2020 10:29:51 AM	Sending scheduler request: Requested by user.	
146	SETI@home	1/18/2020 10:29:51 AM	Requesting new tasks for NVIDIA GPU	
147	SETI@home	1/18/2020 10:30:36 AM	Scheduler request completed: got 0 new tasks

Duration correction factor	1.0000000000
Scheduling priority	0.00
CPU backoff time	--
Backoff Interval	-
NVIDIA backoff time	--
Backoff Interval	-

163	SETI@home	1/18/2020 10:37:52 AM	Sending scheduler request: To fetch work.	
164	SETI@home	1/18/2020 10:37:52 AM	Requesting new tasks for NVIDIA GPU	
165	SETI@home	1/18/2020 10:38:34 AM	Scheduler request completed: got 0 new tasks	


So my theory that getting the priority positive would fix this seems too simple a solution to a complex problem.

I am not getting any tasks, and the "project properties" are not a lot of help toward diagnosing the problem. This might even be a problem with the project servers: I just got a timeout trying to access my account at the site.
ID: 95268
Richard Haselgrove
Volunteer tester
Help desk expert
Joined: 5 Oct 06
Posts: 5077
United Kingdom
Message 95269 - Posted: 18 Jan 2020, 16:50:33 UTC - in response to Message 95268.  

I was hoping to see some actual numbers:

18/01/2020 16:44:37 | SETI@home | Sending scheduler request: To fetch work.
18/01/2020 16:44:37 | SETI@home | Reporting 17 completed tasks
18/01/2020 16:44:37 | SETI@home | Requesting new tasks for NVIDIA GPU
18/01/2020 16:44:37 | SETI@home | [sched_op] CPU work request: 0.00 seconds; 0.00 devices
18/01/2020 16:44:37 | SETI@home | [sched_op] NVIDIA GPU work request: 41239.70 seconds; 0.00 devices
18/01/2020 16:46:49 | SETI@home | Scheduler request completed: got 0 new tasks
18/01/2020 16:46:49 | SETI@home | Project has no tasks available
The last line is getting increasingly common.
ID: 95269
Richard Haselgrove
Volunteer tester
Help desk expert
Joined: 5 Oct 06
Posts: 5077
United Kingdom
Message 95270 - Posted: 18 Jan 2020, 16:54:15 UTC - in response to Message 95268.  

... not a lot of help from the "project properties" toward diagnosing the problem.
No, it's not a helpful diagnostic tool. Better to see whether you have, or have not, requested work - and if you have, what for and how much.
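To make that distinction concrete, here is a small Python sketch that reads Event Log lines and reports whether work was requested and, if so, how much. It assumes the `[sched_op]` lines look like the ones quoted earlier in this thread; the function name `summarize_request` and the verdict strings are invented for the example, and other client versions may word their messages differently.

```python
import re

# Matches lines such as "[sched_op] NVIDIA GPU work request: 41239.70 seconds; ..."
WORK_REQUEST = re.compile(r"\[sched_op\] (.+?) work request: ([\d.]+) seconds")


def summarize_request(log_lines):
    """Return (seconds requested per resource, verdict) for one scheduler exchange."""
    requested = {}
    no_tasks = False
    not_requesting = False
    for line in log_lines:
        match = WORK_REQUEST.search(line)
        if match:
            requested[match.group(1)] = float(match.group(2))
        if "Project has no tasks available" in line:
            no_tasks = True
        if "Not requesting tasks" in line:
            not_requesting = True
    asked = any(seconds > 0 for seconds in requested.values())
    if not_requesting or (requested and not asked):
        verdict = "client did not ask for work"
    elif asked and no_tasks:
        verdict = "client asked, project had no tasks"
    elif asked:
        verdict = "client asked for work"
    else:
        verdict = "unknown: no [sched_op] lines found"
    return requested, verdict


# Log lines copied from the post above.
log = """18/01/2020 16:44:37 | SETI@home | Sending scheduler request: To fetch work.
18/01/2020 16:44:37 | SETI@home | Requesting new tasks for NVIDIA GPU
18/01/2020 16:44:37 | SETI@home | [sched_op] CPU work request: 0.00 seconds; 0.00 devices
18/01/2020 16:44:37 | SETI@home | [sched_op] NVIDIA GPU work request: 41239.70 seconds; 0.00 devices
18/01/2020 16:46:49 | SETI@home | Scheduler request completed: got 0 new tasks
18/01/2020 16:46:49 | SETI@home | Project has no tasks available""".splitlines()

requested, verdict = summarize_request(log)
print(requested)
print(verdict)
```

Without the `[sched_op]` lines the sketch can only say "unknown", which is the same position the project properties page leaves you in.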
ID: 95270
Joseph Stateson
Volunteer tester
Joined: 27 Jun 08
Posts: 641
United States
Message 95271 - Posted: 18 Jan 2020, 16:58:47 UTC - in response to Message 95270.  
Last modified: 18 Jan 2020, 17:12:11 UTC

I did not notice the "debug" part, so I set just sched_op to 1, not the sched_op_debug flag.

I think the problem is that the project is out of work or the server is busy, and it is a coincidence that it happened at this time.

However, the "resetting" of the parameters after requesting a "read config" was not expected. Why would backoff times be reset to 0?
[edit]

Scheduling priority is back to -1,000.97 as shown by both BoincTasks and BOINC Manager.
BT shows a 20-minute backoff interval for NVIDIA. I assume this is all correct, as the server has problems. I should have checked their servers before posting. My other systems were crunching SETI just fine, but I didn't check to see whether they were getting new work.
ID: 95271


Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.