Message boards : Questions and problems : Starting New Tasks with Many Tasks in "waiting to run" State
Author | Message |
---|---|
Send message Joined: 9 Jun 08 Posts: 4 |
On computers running multiple projects, a project sometimes has several tasks in the "waiting to run" state, yet the scheduler starts new tasks of that project instead of resuming the old tasks that are waiting for their turn. Is this a bug? |
Send message Joined: 29 Aug 05 Posts: 15571 |
BOINC will always try to run all work before its deadline. So it can happen that it starts newer tasks before continuing with ones that had run already for a bit. |
Send message Joined: 9 Jun 08 Posts: 4 |
I just noticed that the BOINC manager did not have the "leave applications in memory while suspended" option enabled. Let's see what happens if I enable it. |
Send message Joined: 29 Aug 05 Posts: 15571 |
Then it will leave all the tasks that are stopping and going to "waiting to run" in memory. Have enough tasks and you could fill up your memory. It won't 'force' tasks to completion. |
Send message Joined: 9 Jun 08 Posts: 4 |
Yes, you are right -- the total number of running and waiting tasks decreased, but it still exceeds the number of CPUs in the system. Thanks. |
Send message Joined: 6 Feb 10 Posts: 18 |
In my opinion this is a bug. As I have already mentioned elsewhere in this forum, it is not normal for the client to keep starting new work units while earlier ones sit unfinished until they reach their deadlines; the result is a complete mess. After downgrading to 6.10.17, the same work is performed in order (FIFO). This is easy to see when crunching very small work units. It happens to me on the EDGeS@home and Ibercivis projects with the 6.10.5x clients, on both Linux and Microsoft operating systems. Regards. |
Send message Joined: 6 Feb 10 Posts: 18 |
Using now version 6.10.58 and all is Ok! Regards. |
Send message Joined: 6 Feb 10 Posts: 18 |
Unfortunately, after working well for some time, the fault returned on both Windows and Linux machines. So I am now using 6.10.17 / 18 without issues, 100% OK. Regards. |
Send message Joined: 29 Aug 05 Posts: 15571 |
The developers are working on a new tack: http://boinc.berkeley.edu/trac/wiki/ClientSchedOctTen |
Send message Joined: 6 Feb 10 Posts: 18 |
Thank you for your reply. I would like to add that this is happening even in computers running just wu's of one application of "just one" project, so the nature of this issue may be other than those described in http://boinc.berkeley.edu/trac/wiki/ClientSchedOctTen Regards
-------
Part of http://boinc.berkeley.edu/trac/wiki/ClientSched
This looks like a problem somewhere in...

CPU scheduling policy

The CPU scheduler uses an earliest-deadline-first (EDF) policy for results that are in danger of missing their deadline, and weighted round-robin among other projects if additional CPUs exist. This allows the client to meet deadlines that would otherwise be missed, while honoring resource shares over the long term. The scheduling policy is:

1. Set the 'anticipated debt' of each project to its short-term debt
2. Let P be the project with the earliest-deadline runnable result among projects with deadlines_missed(P)>0. Let R be P's earliest-deadline runnable result not scheduled yet. Tiebreaker: least index in result array.
3. If such an R exists, schedule R, decrement P's anticipated debt, and decrement deadlines_missed(P).
4. If there are more CPUs, and projects with deadlines_missed(P)>0, go to 1.
5. If all CPUs are scheduled, stop.
6. If there is a result R that is currently running, and has been running for less than the CPU scheduling period, schedule R and go to 5.
7. Find the project P with the greatest anticipated debt, select one of P's runnable results (picking one that is already running, if possible, else the one received first from the project) and schedule that result.
8. Decrement P's anticipated debt by the 'expected payoff' (the scheduling period divided by NCPUS).
9. Go to 5.

The CPU scheduler runs when a result is completed, when the end of the user-specified scheduling period is reached, when new results become runnable, or when the user performs a UI interaction (e.g. suspending or resuming a project or result).
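To make the nine steps quoted above easier to follow, here is a minimal Python sketch of them. It is an illustration only, not the BOINC client code: the `Project` and `Result` classes, their field names, and the `schedule` function are all hypothetical. Two assumptions go beyond the quoted text: the decrement in step 3 is taken to be the same 'expected payoff' as in step 8, and step 4 loops back to step 2 rather than step 1 so that the decrement is not overwritten.

```python
from dataclasses import dataclass


@dataclass
class Project:            # hypothetical stand-in for a BOINC project
    name: str
    short_term_debt: float
    deadlines_missed: int = 0


@dataclass
class Result:             # hypothetical stand-in for a runnable result
    name: str
    project: str
    deadline: float
    received: int         # arrival order within its project (lower = earlier)
    running: bool = False
    ran_for: float = 0.0  # how long it has been running so far


def schedule(projects, results, ncpus, period):
    """Return the names of the results chosen to run, in the order chosen."""
    payoff = period / ncpus                                   # 'expected payoff'
    debt = {p.name: p.short_term_debt for p in projects}      # step 1
    missed = {p.name: p.deadlines_missed for p in projects}
    chosen = []                                               # indices into results

    def free():
        return [i for i in range(len(results)) if i not in chosen]

    # Steps 2-4: earliest-deadline-first for projects with missed deadlines.
    while len(chosen) < ncpus:
        urgent = [i for i in free() if missed[results[i].project] > 0]
        if not urgent:
            break
        i = min(urgent, key=lambda k: (results[k].deadline, k))  # tie: least index
        chosen.append(i)
        debt[results[i].project] -= payoff   # assumption: amount not given in step 3
        missed[results[i].project] -= 1

    # Steps 5-9: fill the remaining CPUs.
    while len(chosen) < ncpus:                                # step 5
        recent = [i for i in free()
                  if results[i].running and results[i].ran_for < period]
        if recent:                                            # step 6
            chosen.append(recent[0])
            continue
        pending = free()
        if not pending:
            break                                             # nothing left to run
        # Step 7: project with the greatest anticipated debt that still has work.
        names = sorted({results[i].project for i in pending})
        best = max(names, key=lambda n: debt[n])
        mine = [i for i in pending if results[i].project == best]
        # Prefer an already-running result, else the one received first.
        i = min(mine, key=lambda k: (not results[k].running, results[k].received, k))
        chosen.append(i)
        debt[best] -= payoff                                  # step 8
    return [results[i].name for i in chosen]
```

Note that in this reading a waiting result is only preferred over a fresh one in step 7 if it is still marked as running; a preempted result competes purely on arrival order.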
CPU schedule enforcement

The CPU scheduler decides what results should run, but it doesn't enforce this decision. This enforcement is done by a separate scheduler enforcement function, which is called by the CPU scheduler at its conclusion. Let X be the set of scheduled results that are not currently running, let Y be the set of running results that are not scheduled, and let T be the time the scheduler last ran. The enforcement policy is as follows:

1. If deadline_missed(R) for some R in X, then preempt a result in Y, and run R (preempt the result with the least CPU wall time since checkpoint). Repeat as needed.
2. If there is a result R in Y that checkpointed more recently than T, then preempt R and run a result in X.

(...something wrong in the scheduler enforcement function?) |
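The two enforcement rules quoted above can be sketched in the same spirit. Again this is a hedged illustration, not the actual client function: the `Task` class, its fields, and `enforce` are hypothetical names, and "least CPU wall time since checkpoint" is approximated by picking the result with the most recent checkpoint timestamp.

```python
from dataclasses import dataclass


@dataclass
class Task:                       # hypothetical stand-in for a result
    name: str
    deadline_missed: bool = False
    checkpoint_time: float = 0.0  # when it last checkpointed


def enforce(scheduled, running, last_sched_time):
    """Return (names to preempt, names to start) for one enforcement pass."""
    running_names = {t.name for t in running}
    scheduled_names = {t.name for t in scheduled}
    x = [t for t in scheduled if t.name not in running_names]   # set X
    y = [t for t in running if t.name not in scheduled_names]   # set Y
    preempt, start = [], []

    # Rule 1: a result in X that would miss its deadline displaces the
    # result in Y that loses the least work (most recent checkpoint).
    for r in [t for t in x if t.deadline_missed]:
        if not y:
            break
        victim = max(y, key=lambda t: t.checkpoint_time)
        y.remove(victim)
        x.remove(r)
        preempt.append(victim.name)
        start.append(r.name)

    # Rule 2: a result in Y that checkpointed since the scheduler last ran (T)
    # can be preempted cheaply, so swap it for a result in X.
    for r in [t for t in y if t.checkpoint_time > last_sched_time]:
        if not x:
            break
        y.remove(r)
        preempt.append(r.name)
        start.append(x.pop(0).name)

    return preempt, start
```

Under these rules a running result that has not checkpointed since T, and is not displaced by a deadline-missing result, keeps running even though it is no longer scheduled.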
Send message Joined: 11 Nov 10 Posts: 1 |
I am getting really annoyed with this bug. I have hundreds of WUs that are 50-99.9% complete, but the scheduler ignores them and starts up another fresh one. After lots of analysis, my opinion is that the scheduler picks the next WU with the lowest amount of work done instead of the highest. You can test this by suspending everything except a few WUs with varying % complete. When the scheduler moves on to one of these "waiting to run" tasks, it will invariably pick the WU with the lowest % complete every time *grrr* I have this problem with 6.10.58 x32 and x64. In my opinion there is something wrong with the scheduler enforcement function... |
Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License,
Version 1.2 or any later version published by the Free Software Foundation.