Thread 'Starting New Tasks with Many Tasks in "waiting to run" State'

Author	Message
d_j_liu Joined: 9 Jun 08 Posts: 4	Message 34865 - Posted: 23 Sep 2010, 18:10:49 UTC On computers running multiple projects, a project sometimes has several tasks in the "waiting to run" state, yet the scheduler starts new tasks of that project instead of resuming the old tasks waiting for their turn. Is this a bug?

Jord Volunteer tester Help desk expert Joined: 29 Aug 05 Posts: 15571	Message 34869 - Posted: 23 Sep 2010, 19:44:56 UTC - in response to Message 34865. BOINC will always try to run all work before its deadline. So it can happen that it starts newer tasks before continuing with ones that have already run for a bit.

d_j_liu Joined: 9 Jun 08 Posts: 4	Message 34876 - Posted: 23 Sep 2010, 23:07:47 UTC - in response to Message 34869. I just noticed that the BOINC manager did not have the "leave applications in memory while suspended" option enabled. Let's see what happens if I enable it.

Jord Volunteer tester Help desk expert Joined: 29 Aug 05 Posts: 15571	Message 34878 - Posted: 23 Sep 2010, 23:15:04 UTC - in response to Message 34876. Then it will leave all the tasks that are stopping and going to "waiting to run" in memory. Have enough tasks and you could fill up your memory. It won't 'force' tasks to completion.

d_j_liu Joined: 9 Jun 08 Posts: 4	Message 34892 - Posted: 24 Sep 2010, 15:58:18 UTC - in response to Message 34878. Yes, you are right -- the total number of running and waiting tasks decreased, but it still exceeds the number of CPUs in the system. Thanks.

manuel oliveira Joined: 6 Feb 10 Posts: 18	Message 35093 - Posted: 4 Oct 2010, 10:54:31 UTC - in response to Message 34892. Last modified: 4 Oct 2010, 11:12:52 UTC In my opinion this is a bug. As I have already mentioned elsewhere in this forum, it is not normal to start new work units one after another without finishing the earlier ones, so that they reach their deadlines: a complete mess. After downgrading to 6.10.17, the same work is performed in order (FIFO). This is easily seen when crunching very small work units. This happens to me while working for the EDGeS@home and Ibercivis projects using the 6.10.5x versions on both Linux and Microsoft operating systems. Regards.

manuel oliveira Joined: 6 Feb 10 Posts: 18	Message 35330 - Posted: 21 Oct 2010, 18:30:06 UTC - in response to Message 35093. Last modified: 21 Oct 2010, 18:30:59 UTC I am now using version 6.10.58 and all is OK! Regards.

manuel oliveira Joined: 6 Feb 10 Posts: 18	Message 35562 - Posted: 31 Oct 2010, 10:39:59 UTC - in response to Message 35330. Unfortunately, after working well for some time, this fault returned on both Windows and Linux machines. So I am now using 6.10.17 / 18 without issues, 100% OK. Regards.

Jord Volunteer tester Help desk expert Joined: 29 Aug 05 Posts: 15571	Message 35563 - Posted: 31 Oct 2010, 11:53:57 UTC - in response to Message 35562. The developers are working on a new tack: http://boinc.berkeley.edu/trac/wiki/ClientSchedOctTen

manuel oliveira Joined: 6 Feb 10 Posts: 18	Message 35565 - Posted: 31 Oct 2010, 18:06:04 UTC - in response to Message 35563. Last modified: 31 Oct 2010, 18:14:06 UTC Thank you for your reply. I would like to add that this is happening even on computers running work units of just one application of "just one" project, so the nature of this issue may be different from those described in http://boinc.berkeley.edu/trac/wiki/ClientSchedOctTen Regards
-------
Part of http://boinc.berkeley.edu/trac/wiki/ClientSched
This looks like a problem somewhere in...
CPU scheduling policy
The CPU scheduler uses an earliest-deadline-first (EDF) policy for results that are in danger of missing their deadline, and weighted round-robin among other projects if additional CPUs exist. This allows the client to meet deadlines that would otherwise be missed, while honoring resource shares over the long term. The scheduling policy is:
1. Set the 'anticipated debt' of each project to its short-term debt.
2. Let P be the project with the earliest-deadline runnable result among projects with deadlines_missed(P)>0. Let R be P's earliest-deadline runnable result not scheduled yet. Tiebreaker: least index in result array.
3. If such an R exists, schedule R, decrement P's anticipated debt, and decrement deadlines_missed(P).
4. If there are more CPUs, and projects with deadlines_missed(P)>0, go to 1.
5. If all CPUs are scheduled, stop.
6. If there is a result R that is currently running, and has been running for less than the CPU scheduling period, schedule R and go to 5.
7. Find the project P with the greatest anticipated debt, select one of P's runnable results (picking one that is already running, if possible, else the one received first from the project) and schedule that result.
8. Decrement P's anticipated debt by the 'expected payoff' (the scheduling period divided by NCPUS).
9. Go to 5.
The CPU scheduler runs when a result is completed, when the end of the user-specified scheduling period is reached, when new results become runnable, or when the user performs a UI interaction (e.g. suspending or resuming a project or result).
CPU schedule enforcement
The CPU scheduler decides what results should run, but it doesn't enforce this decision. This enforcement is done by a separate scheduler enforcement function, which is called by the CPU scheduler at its conclusion. Let X be the set of scheduled results that are not currently running, let Y be the set of running results that are not scheduled, and let T be the time the scheduler last ran. The enforcement policy is as follows:
1. If deadline_missed(R) for some R in X, then preempt a result in Y, and run R (preempt the result with the least CPU wall time since checkpoint). Repeat as needed.
2. If there is a result R in Y that checkpointed more recently than T, then preempt R and run a result in X.
(...something wrong in the scheduler enforcement function?)
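For readers who want to trace the quoted scheduling policy by hand, steps 1 to 9 can be read as the minimal Python sketch below. It is an illustration only, not BOINC's client code: the `Project` and `Result` classes, the `schedule` function and the sample data are hypothetical, every result is assumed to be runnable, step 3 is assumed to decrement the debt by the same 'expected payoff' as step 8, and the loop in step 4 is assumed not to reset the debts again.

```python
from dataclasses import dataclass


@dataclass
class Project:                      # hypothetical container, not a BOINC type
    short_term_debt: float
    deadlines_missed: int = 0
    anticipated_debt: float = 0.0


@dataclass
class Result:                       # hypothetical container, not a BOINC type
    name: str
    project: str
    deadline: float                 # smaller value = earlier deadline
    received: float                 # arrival time, for "received first"
    running: bool = False
    run_time: float = 0.0           # seconds run in the current time slice


def schedule(projects, results, ncpus, period):
    """Return the names of the results chosen for the next period."""
    payoff = period / ncpus                        # 'expected payoff' (step 8)
    for p in projects.values():                    # step 1
        p.anticipated_debt = p.short_term_debt
    missed = {n: p.deadlines_missed for n, p in projects.items()}
    chosen = []                                    # indices into results

    # Steps 2-4: earliest deadline first, only for projects with
    # deadlines_missed > 0; ties go to the least index in the array.
    while len(chosen) < ncpus:
        urgent = [i for i, r in enumerate(results)
                  if i not in chosen and missed[r.project] > 0]
        if not urgent:
            break
        i = min(urgent, key=lambda k: (results[k].deadline, k))
        chosen.append(i)
        projects[results[i].project].anticipated_debt -= payoff  # assumed amount
        missed[results[i].project] -= 1

    # Step 6: results still inside their scheduling period keep their CPU.
    for i, r in enumerate(results):
        if len(chosen) >= ncpus:
            break
        if i not in chosen and r.running and r.run_time < period:
            chosen.append(i)

    # Steps 7-9: fill the remaining CPUs by greatest anticipated debt.
    while len(chosen) < ncpus:
        runnable = [i for i in range(len(results)) if i not in chosen]
        if not runnable:
            break
        names = sorted({results[i].project for i in runnable})
        best = max(names, key=lambda n: projects[n].anticipated_debt)
        mine = [i for i in runnable if results[i].project == best]
        # Prefer a result that is already running, else the one received first.
        pick = min(mine, key=lambda i: (not results[i].running, results[i].received))
        chosen.append(pick)
        projects[best].anticipated_debt -= payoff
    return [results[i].name for i in chosen]


# Made-up sample data: project B has one result in deadline trouble.
projects = {"A": Project(short_term_debt=10.0),
            "B": Project(short_term_debt=0.0, deadlines_missed=1)}
results = [Result("a1", "A", deadline=100.0, received=1.0),
           Result("a2", "A", deadline=100.0, received=2.0, running=True, run_time=4000.0),
           Result("b1", "B", deadline=50.0, received=3.0)]
print(schedule(projects, results, ncpus=2, period=3600.0))
```

With this sample data the sketch first schedules b1 through the EDF branch, then gives the second CPU to a2 because step 7 prefers an already running result over a1, which was received earlier. Under the policy as quoted, a partly finished task should therefore be resumed before a fresh one from the same project.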

Chris Joined: 11 Nov 10 Posts: 1	Message 35692 - Posted: 11 Nov 2010, 13:17:20 UTC - in response to Message 35565. Last modified: 11 Nov 2010, 13:19:46 UTC I am getting really annoyed with this bug. I have hundreds of WUs that are 50-99.9% complete, but the scheduler ignores them and starts up another fresh one. After a lot of analysis, my opinion is that the scheduler just finds the next WU with the lowest amount of work done instead of the highest. You can test this by suspending everything except a few WUs with varying % complete... When the scheduler moves on to one of these "waiting to run" tasks, it invariably picks the WU with the lowest % complete. Grrr. I have this problem with 6.10.58 x32 and x64... In my opinion there is something wrong with the scheduler enforcement function...
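The test described in the post above can be made concrete with a small Python sketch. It compares the tie-break quoted earlier in the thread (prefer an already running result, else the one received first) with the behaviour the poster reports (lowest % complete first). Both functions and the `Task` class are hypothetical illustrations; neither is taken from the BOINC client, and the second one only encodes the poster's hypothesis.

```python
from dataclasses import dataclass


@dataclass
class Task:                         # hypothetical, for illustration only
    name: str
    received: int                   # arrival order, smaller = earlier
    fraction_done: float            # 0.0 to 1.0
    running: bool = False


def pick_documented(tasks):
    """Tie-break as quoted from the ClientSched page: running first, then oldest."""
    return min(tasks, key=lambda t: (not t.running, t.received))


def pick_least_done(tasks):
    """Behaviour reported in the post (an unconfirmed hypothesis)."""
    return min(tasks, key=lambda t: t.fraction_done)


# Three unsuspended "waiting to run" tasks with varying progress.
tasks = [Task("wu_a", received=1, fraction_done=0.90),
         Task("wu_b", received=2, fraction_done=0.50),
         Task("wu_c", received=3, fraction_done=0.05)]

print(pick_documented(tasks).name)
print(pick_least_done(tasks).name)
```

If a real client given such a set consistently resumes the task that `pick_least_done` selects rather than the oldest one, that would be consistent with the report, although it would not by itself show where in the scheduler the choice is made.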

Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.