Why doesn't Boinc schedule earlier deadlines first?

Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5081
United Kingdom
Message 94701 - Posted: 5 Jan 2020, 11:32:12 UTC - in response to Message 94700.  

BOINC is not designed for single project users. BOINC is designed to handle, as safely as possible, the complexities introduced by running multiple projects with deadlines ranging from hours to years, and with runtimes taking anything from seconds to months.

If 'Earliest Deadline First' running became the standard, some forms of scientific research (in particular, climate research) would become impossible under BOINC - their work would keep getting deferred until it became too stale to be useful.

That's why BOINC preferentially runs work in FIFO order - in the order in which it's allocated. There are safeguards in place which endeavour to prioritise tasks which are in danger of missing their individual deadline.
ID: 94701 · Report as offensive
ProDigit

Joined: 8 Nov 19
Posts: 718
United States
Message 94725 - Posted: 7 Jan 2020, 2:41:49 UTC

What I don't get, is that it would leave certain tasks at 98%, just to start another project with an earlier deadline.
Why not finish the job, before moving on to a new job?
It barely takes a few minutes to an hour to finish the last 2%?
ID: 94725 · Report as offensive
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15480
Netherlands
Message 94730 - Posted: 7 Jan 2020, 7:46:21 UTC - in response to Message 94725.  

It barely takes a few minutes to an hour to finish the last 2%?
If it did that, the other task would be in danger of going over its deadline, which is why that one runs first. It runs up to the point where BOINC is reasonably sure it will be able to finish that task before its deadline. Then it returns to the task with 2% left, because it knows it can finish that one before its deadline.
ID: 94730 · Report as offensive
ProDigit

Joined: 8 Nov 19
Posts: 718
United States
Message 94766 - Posted: 8 Jan 2020, 3:48:32 UTC - in response to Message 94730.  
Last modified: 8 Jan 2020, 3:49:17 UTC

It barely takes a few minutes to an hour to finish the last 2%?
If it did that, the other task would be in danger of going over its deadline, which is why that one runs first. It runs up to the point where BOINC is reasonably sure it will be able to finish that task before its deadline. Then it returns to the task with 2% left, because it knows it can finish that one before its deadline.

The other tasks still had several days ahead.
I think its choice was because of project allocation...
Once one project has reached its '100%' quota, it jumps to another project that also has 100 allocated but a lower actual percentage done.
ID: 94766 · Report as offensive
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15480
Netherlands
Message 94769 - Posted: 8 Jan 2020, 11:44:53 UTC - in response to Message 94766.  

If you want to see how BOINC does its internal calculations on this, run for a couple of minutes with cpu_sched_debug on, then disable that and run for a couple of minutes with rr_simulation. Mind that both of these will output a lot of data about your task queue, and how much depends on the size of that queue. If you need help understanding what it all means, I'm sure someone here (Richard H.) can give an explanation of what you see.
ID: 94769 · Report as offensive
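For readers who want to try the two log flags Jord mentions: as far as I know they are switched on in the `<log_flags>` section of `cc_config.xml` in the BOINC data directory. Below is a minimal sketch that only builds the XML text. The helper name `build_cc_config` is made up for this example, and where the file lives on your machine (and how you get the client to re-read it) is left to you.

```python
import xml.etree.ElementTree as ET


def build_cc_config(flags):
    """Build cc_config.xml text with the given log flags (hypothetical helper).

    flags maps a log flag name to True (on) or False (off).
    """
    root = ET.Element("cc_config")
    log_flags = ET.SubElement(root, "log_flags")
    for name, enabled in flags.items():
        # BOINC log flags are written as 1 (on) or 0 (off).
        ET.SubElement(log_flags, name).text = "1" if enabled else "0"
    return ET.tostring(root, encoding="unicode")


# First run: CPU scheduler debugging only.
print(build_cc_config({"cpu_sched_debug": True, "rr_simulation": False}))
# Second run: switch over to the round-robin simulation output.
print(build_cc_config({"cpu_sched_debug": False, "rr_simulation": True}))
```

Back up any existing `cc_config.xml` before replacing it, since this sketch does not preserve other options you may already have set there.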
Gary Charpentier
Joined: 23 Feb 08
Posts: 2463
United States
Message 94779 - Posted: 8 Jan 2020, 22:02:19 UTC

If I may interject, computer scheduling has been extensively studied. An overview is available at https://en.wikipedia.org/wiki/Scheduling_(computing)

BOINC uses FIFO scheduling until it sees that a particular work unit will be close to its deadline. When it sees that a work unit may miss its deadline (or is already late), it switches that work unit to EDF scheduling. It isn't perfect; no scheduling algorithm can be. It works fairly well until users start messing with it. Where it fails miserably is when the project can't predict correctly how much CPU time will be needed. It also fails where it can't predict how much CPU time will be available, e.g. when the computer is powered down for, say, a vacation.

As to multiple projects, that is a function of a long-term scheduler. That is the work fetch scheduler. Different beast. Over the long term (weeks), it tries to keep the division of work as you have set it in the preferences. Obviously it goes ape when a project doesn't have work available long term.

I mentioned vacation. Some days before leaving, drop your work fetch queues to something such as not more than half a day's worth of work. Then a day or so ahead of your shutdown, set no new tasks on all your short-deadline projects. BOINC will empty itself except for the long-term work units, and won't gorge on them because you aren't asking for a huge queue. When you come back, enable work fetch and change your queues back. This might need to be tweaked if you have either a very fast or a very slow computer, but the idea is the same.
ID: 94779 · Report as offensive
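The "FIFO until a deadline is in danger, then EDF" behaviour described above can be sketched roughly as follows. This is a toy single-CPU model under my own assumptions, not the client's real code: the `Task` fields and the `run_order` function are invented for illustration, and the real client works from runtime estimates that may themselves be wrong.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    received: float   # arrival time; lower means it arrived earlier
    remaining: float  # estimated CPU hours still needed
    deadline: float   # hours from now until the deadline


def run_order(tasks):
    """Return task names in the order this toy scheduler would run them."""
    # Step 1: simulate plain FIFO on one CPU and note which tasks would be late.
    fifo = sorted(tasks, key=lambda t: t.received)
    clock = 0.0
    at_risk = set()
    for t in fifo:
        clock += t.remaining
        if clock > t.deadline:
            at_risk.add(t.name)
    # Step 2: endangered tasks jump the queue, earliest deadline first;
    # everything else keeps its FIFO position.
    urgent = sorted((t for t in fifo if t.name in at_risk), key=lambda t: t.deadline)
    relaxed = [t for t in fifo if t.name not in at_risk]
    return [t.name for t in urgent + relaxed]


tasks = [
    Task("A", received=0, remaining=10, deadline=100),
    Task("B", received=1, remaining=2, deadline=5),
    Task("C", received=2, remaining=1, deadline=50),
]
print(run_order(tasks))  # ['B', 'A', 'C']
```

In this model task A is set aside for B even though A arrived first, which is the same kind of preemption as a task being left at 98% while something else runs.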
Gary Charpentier
Joined: 23 Feb 08
Posts: 2463
United States
Message 94789 - Posted: 9 Jan 2020, 2:12:40 UTC
Last modified: 9 Jan 2020, 2:14:16 UTC

Yes ape. With only two projects you won't notice. With a dozen, if one doesn't have work long term, the work fetch will go ape. It will try and try to get work which isn't there; when it fails, it will get the smallest amount it can from another project just so the machine isn't idle. Remember it is trying to level the work - FLOPS - to the preferences you have set. It expects there to be work next time it fetches and it wants to keep room for it, so with a large deficit it will not fill the cache with work from other projects. You can look at the work fetch priority in the advanced manager by getting the project properties.
ID: 94789 · Report as offensive
robsmith
Volunteer tester
Help desk expert

Joined: 25 May 09
Posts: 1283
United Kingdom
Message 94797 - Posted: 9 Jan 2020, 18:52:53 UTC

People who try to "beat" BOINC are the ones who suffer the most....
Left to its own devices with "sensible" cache settings ("extra days" significantly LESS than "days stored"), BOINC will handle a dozen very disparate projects quite well. It will take time to actually sort out the required balance, and don't expect to see all the projects getting work all the time. That said, there are one or two projects that do not obey the rules as well as they should. These are the projects that, on getting a request for work, send out vastly more work than was requested, with very short deadlines, and so BOINC does panic.
One thing to understand is that BOINC takes a long view, and that work requests are based on a "work deficit" calculation, not an instant share of work. Crudely, this means that one project may run ahead of the rest for a time, but BOINC will stop requesting work from it so the rest have a chance to get their work deficit back towards zero. And this balancing can take days or weeks depending on the projects, their resource shares and the availability of work from those projects.
ID: 94797 · Report as offensive
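To make the "work deficit" idea concrete, here is a rough sketch. It is a simplified model under my own assumptions, not BOINC's actual bookkeeping: `update_deficits` and `next_fetch` are invented names, and work is counted in arbitrary units.

```python
def update_deficits(deficits, shares, work_done):
    """Return new per-project deficits after one accounting period.

    Each project is entitled to a slice of the total work done, in proportion
    to its resource share. Doing less than that slice raises its deficit,
    doing more lowers it.
    """
    total_share = sum(shares.values())
    total_work = sum(work_done.values())
    updated = {}
    for project, share in shares.items():
        entitled = total_work * share / total_share
        done = work_done.get(project, 0.0)
        updated[project] = deficits.get(project, 0.0) + entitled - done
    return updated


def next_fetch(deficits):
    """Ask the project that is owed the most work."""
    return max(deficits, key=deficits.get)


shares = {"A": 100, "B": 100}
deficits = {}
# Project B has no work for three periods, so A does all the crunching.
for _ in range(3):
    deficits = update_deficits(deficits, shares, {"A": 10.0, "B": 0.0})
print(deficits)              # {'A': -15.0, 'B': 15.0}
print(next_fetch(deficits))  # B
```

The longer B stays without work, the larger its deficit grows, which in this model is why the balance takes days or weeks to recover once B has work again.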
Gary Charpentier
Joined: 23 Feb 08
Posts: 2463
United States
Message 94798 - Posted: 9 Jan 2020, 18:57:43 UTC - in response to Message 94792.  
Last modified: 9 Jan 2020, 19:02:28 UTC

Yes ape. With only two projects you won't notice. With a dozen, if one doesn't have work long term, the work fetch will go ape. It will try and try to get work which isn't there; when it fails, it will get the smallest amount it can from another project just so the machine isn't idle. Remember it is trying to level the work - FLOPS - to the preferences you have set. It expects there to be work next time it fetches and it wants to keep room for it, so with a large deficit it will not fill the cache with work from other projects. You can look at the work fetch priority in the advanced manager by getting the project properties.


Not sure why it would be more likely to go ape with a dozen projects. With two, it tries to get it from the one that's been offline, but when it fails, it fills my cache from the other one. Wouldn't it do the same from one or more of your other 11? I have my Boinc set to 1+1 days of cache - maybe that might make a difference for yours? Common sense would dictate it would get 1 day of work from the low priority ones if it had to, but 2 days from the high priority one. But then common sense doesn't always prevail, Boinc does some weird things sometimes.

I did say long term. It takes a while for the deficit to build. Once it does it will not fill the cache from the other projects because it wants to leave space for the work from the project that has none. The scheduler doesn't know there is no work, it just knows it needs to run lots of work from the project to reduce the deficit. When that project has work again that may be the only work in your cache for some time - days - until the deficit is reduced.

<ed>Rob, correct a denial of crunch by a project re all others. PITA. You didn't mention sensible work shares. Give one project a share of 10^10 and another of 10^-10 and things go sideways too.
ID: 94798 · Report as offensive
robsmith
Volunteer tester
Help desk expert

Joined: 25 May 09
Posts: 1283
United Kingdom
Message 94804 - Posted: 9 Jan 2020, 22:22:21 UTC

The actual range is 0 to 1000.
BUT zero has a very special meaning.... It means something like "Try and get work from this project if there is none available from any projects with non-zero resource shares, and then only get a very limited number of tasks at a time (normally one)."
And another BUT - not all projects have their lower limit set to zero; some have one (and that is a non-zero value, in case you didn't guess).
ID: 94804 · Report as offensive
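The special meaning of a zero resource share, as described above, could be sketched like this. It is only an illustration of the rule as stated in the post: `fetch_candidates` and the tuple layout are made up, and the cap of one task is the "normally one" from the description, not a value taken from the client.

```python
def fetch_candidates(projects):
    """Pick which projects may be asked for work.

    projects is a list of (name, resource_share, has_work) tuples.
    Returns (names, task_cap); task_cap is None when no special cap applies.
    """
    normal = [name for name, share, has_work in projects if share > 0 and has_work]
    if normal:
        # Projects with a non-zero share always come first.
        return normal, None
    # Only when none of them has work do the zero-share projects get a turn,
    # and then only for a very limited number of tasks.
    backup = [name for name, share, has_work in projects if share == 0 and has_work]
    return backup, 1


print(fetch_candidates([("Main", 100, False), ("Backup", 0, True)]))
# (['Backup'], 1)
print(fetch_candidates([("Main", 100, True), ("Backup", 0, True)]))
# (['Main'], None)
```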
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15480
Netherlands
Message 94809 - Posted: 10 Jan 2020, 0:24:51 UTC - in response to Message 94804.  

The actual range is 0 to 1000.
According to the source code the max is 9999999, which can be set. Just tested that on Seti.
ID: 94809 · Report as offensive
ProDigit

Joined: 8 Nov 19
Posts: 718
United States
Message 94827 - Posted: 10 Jan 2020, 22:29:40 UTC
Last modified: 10 Jan 2020, 22:30:10 UTC

Not to interject, but none of that applies to me.
As I said, it starts tasks with a 2-week deadline that it estimates at only 2 hours to complete, and leaves tasks with a 2.5-week deadline at 98%.

It's not the first time this happened. And I believe it has to do with project resource allocation in boinc BAM, not deadline, and no cache messing going on here. Everything is pretty much set to default.
ID: 94827 · Report as offensive
ProDigit

Joined: 8 Nov 19
Posts: 718
United States
Message 94828 - Posted: 10 Jan 2020, 22:32:40 UTC - in response to Message 94809.  

The actual range is 0 to 1000.
According to the source code the max is 9999999, which can be set. Just tested that on Seti.

+1 I've tried the same.
But in my case, the projects that get a large number assigned are not the problem.
ID: 94828 · Report as offensive

Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.