Why doesn't Boinc schedule earlier deadlines first?

Richard Haselgrove Volunteer tester Help desk expert Joined: 5 Oct 06 Posts: 5082	Message 94701 - Posted: 5 Jan 2020, 11:32:12 UTC - in response to Message 94700.
BOINC is not designed for single-project users. BOINC is designed to handle, as safely as possible, the complexities introduced by running multiple projects with deadlines ranging from hours to years, and with runtimes taking anything from seconds to months. If 'Earliest Deadline First' running became the standard, some forms of scientific research (in particular, climate research) would become impossible under BOINC - their work would keep getting deferred until it became too stale to be useful. That's why BOINC preferentially runs work in FIFO order - in the order in which it's allocated. There are safeguards in place which endeavour to prioritise tasks which are in danger of missing their individual deadline.
ID: 94701

ProDigit Joined: 8 Nov 19 Posts: 718	Message 94725 - Posted: 7 Jan 2020, 2:41:49 UTC
What I don't get is that it would leave certain tasks at 98% just to start another project with an earlier deadline. Why not finish the job before moving on to a new one? It barely takes a few minutes to an hour to finish the last 2%?
ID: 94725

Jord Volunteer tester Help desk expert Joined: 29 Aug 05 Posts: 15486	Message 94730 - Posted: 7 Jan 2020, 7:46:21 UTC - in response to Message 94725.
ProDigit wrote: "It barely takes a few minutes to an hour to finish the last 2%?"
If it does that, the other task is in danger of going over its deadline, which is why that one runs earlier, up to the point where BOINC is reasonably sure it'll be able to finish it before its deadline. Then it will return to the task with 2% left, as it knows it can run that before the deadline.
ID: 94730

ProDigit Joined: 8 Nov 19 Posts: 718	Message 94766 - Posted: 8 Jan 2020, 3:48:32 UTC - in response to Message 94730. Last modified: 8 Jan 2020, 3:49:17 UTC
Jord wrote (quoting ProDigit: "It barely takes a few minutes to an hour to finish the last 2%?"): "If it does that, the other task is in danger of going over its deadline, which is why that one runs earlier, up to the point where BOINC is reasonably sure it'll be able to finish it before its deadline. Then it will return to the task with 2% left, as it knows it can run that before the deadline."
The other tasks still had several days ahead. I think its choice was because of project allocation... Once one project has reached its '100%' quota, it jumps to another project that also has 100 allocated but a lower actual percentage done.
ID: 94766

Jord Volunteer tester Help desk expert Joined: 29 Aug 05 Posts: 15486	Message 94769 - Posted: 8 Jan 2020, 11:44:53 UTC - in response to Message 94766.
If you want to see how BOINC does its internal calculations on this, run a couple of minutes with cpu_sched_debug on, then disable that and run a couple of minutes with rr_simulation. Mind that both of these will output a lot of data about your task queue, and how much depends on the size of that queue. If you need help understanding what it all means, I'm sure someone here (Richard H.) can give an explanation of what you see.
ID: 94769

Gary Charpentier Joined: 23 Feb 08 Posts: 2465	Message 94779 - Posted: 8 Jan 2020, 22:02:19 UTC
If I may interject, computer scheduling has been extensively studied. An overview is available at https://en.wikipedia.org/wiki/Scheduling_(computing)
BOINC uses FIFO scheduling until it sees that a particular work unit will be close to deadline. When it sees a work unit may miss its deadline (or is already late), it switches that work unit to EDF scheduling. It isn't perfect; no scheduling algorithm can be. It works fairly well until users start messing with it. Where it fails miserably is when the project can't predict correctly how much CPU time will be needed. It also fails where it can't predict how much CPU time will be available, e.g. when the computer is powered down, say for a vacation.
As to multiple projects, that is a function of a long term scheduler. That is the work fetch scheduler. Different beast. Over the long term (weeks) it tries to keep the division of work as you have set it in the preferences. Obviously it goes ape when a project doesn't have work available long term.
I mentioned vacation. Some days before leaving, drop your work fetch queues to something such as not more than half a day's worth of work. Then a day or so ahead of your shutdown, set no new tasks on all your short deadline projects. BOINC will empty itself except for the long term work units, and won't gorge itself on them because you aren't asking for a huge queue. When you come back, enable work fetch and change your queues back. This might need to be tweaked if you have a very fast or very slow computer, but the idea is the same.
ID: 94779
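The FIFO-then-EDF behaviour described in this post can be illustrated with a minimal Python sketch. It is not BOINC's actual scheduler code: the `Task` fields, the single-CPU simulation, and the function name `run_order` are all assumptions made for illustration. It simulates running the queue in arrival order and moves any task that would finish after its deadline to the front, earliest deadline first.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    arrival: int            # order in which the task was received (FIFO key)
    remaining_hours: float  # estimated run time still needed
    deadline_hours: float   # hours from now until the deadline


def run_order(tasks):
    """Return tasks in run order (hypothetical helper, not a BOINC API).

    Tasks run in FIFO order, except that any task a FIFO simulation
    predicts would miss its deadline is promoted and run in EDF order.
    """
    elapsed = 0.0
    at_risk, safe = [], []
    # Simulate one CPU working through the queue in arrival order.
    for task in sorted(tasks, key=lambda t: t.arrival):
        elapsed += task.remaining_hours
        if elapsed > task.deadline_hours:
            at_risk.append(task)   # would finish too late under FIFO
        else:
            safe.append(task)
    # Promoted tasks run first, earliest deadline first; the rest stay FIFO.
    at_risk.sort(key=lambda t: t.deadline_hours)
    return at_risk + safe
```

With a long task A received first and a tight-deadline task B received second, `run_order` puts B ahead of A and leaves the remaining tasks in arrival order. The sketch also shows why a bad run-time estimate breaks the scheme: the promotion decision depends entirely on `remaining_hours`.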

Gary Charpentier Joined: 23 Feb 08 Posts: 2465	Message 94789 - Posted: 9 Jan 2020, 2:12:40 UTC Last modified: 9 Jan 2020, 2:14:16 UTC
Yes ape. With only two projects you won't notice. With a dozen, if one doesn't have work long term, the work fetch will go ape. It will try and try to get work which isn't there; when it fails it will get the smallest amount it can from another project just so the machine isn't idle. Remember it is trying to level the work - FLOPS - to the preferences you have set. It expects there to be work next time it fetches and it wants to keep room for it, and with a large deficit it will not fill the cache with work from other projects. You can look at the work fetch priority in the advanced manager by getting the project properties.
ID: 94789

robsmith Volunteer tester Help desk expert Joined: 25 May 09 Posts: 1284	Message 94797 - Posted: 9 Jan 2020, 18:52:53 UTC
People who try to "beat" BOINC are the ones who suffer the most.... Left to its own devices with "sensible" cache settings ("extra days" significantly LESS than "days stored"), BOINC will handle a dozen very disparate projects quite well. It will take time to actually sort out the required balance, and don't expect to see all the projects getting work all the time.
That said, there are one or two projects that do not obey the rules as well as they should. These are the projects that, on getting a request for work, send out vastly more work than was requested, with very short deadlines, and so BOINC does panic.
One thing to understand is that BOINC takes a long view, and that work requests are based on a "work deficit" calculation, not an instant share of work. Crudely, this means that one project may run ahead of the rest for a time, but BOINC will then stop requesting work from it so the rest have a chance to get their work deficit back towards zero. And this balancing can take days or weeks depending on the projects, their resource shares and the availability of work from those projects.
ID: 94797
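The "work deficit" idea can be sketched roughly in Python. The formula is an assumption chosen for illustration, not BOINC's real accounting: each project's deficit is taken as the work it was entitled to by resource share minus the work it actually did, and the hypothetical `next_project_to_fetch` asks the project with the largest deficit.

```python
def work_deficits(resource_shares, work_done):
    """Return each project's deficit (illustrative formula, not BOINC's).

    deficit = (share fraction * total work done so far) - work done for
    that project. Positive means the project is owed work; negative means
    it has run ahead of its share.
    """
    total_share = sum(resource_shares.values())
    total_work = sum(work_done.values())
    return {
        name: total_work * share / total_share - work_done.get(name, 0.0)
        for name, share in resource_shares.items()
    }


def next_project_to_fetch(resource_shares, work_done):
    """Pick the project that is furthest behind its share."""
    deficits = work_deficits(resource_shares, work_done)
    return max(deficits, key=deficits.get)
```

With two equal-share projects where one has done 30 units and the other 10, the second is 10 units behind, so it is the one asked for work until the deficits drift back towards zero.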

Gary Charpentier Joined: 23 Feb 08 Posts: 2465	Message 94798 - Posted: 9 Jan 2020, 18:57:43 UTC - in response to Message 94792. Last modified: 9 Jan 2020, 19:02:28 UTC
Message 94792 quoted Gary Charpentier: "Yes ape. With only two projects you won't notice. With a dozen, if one doesn't have work long term, the work fetch will go ape. It will try and try to get work which isn't there; when it fails it will get the smallest amount it can from another project just so the machine isn't idle. Remember it is trying to level the work - FLOPS - to the preferences you have set. It expects there to be work next time it fetches and it wants to keep room for it, and with a large deficit it will not fill the cache with work from other projects. You can look at the work fetch priority in the advanced manager by getting the project properties."
Message 94792 then said: "Not sure why it would be more likely to go ape with a dozen projects. With two, it tries to get it from the one that's been offline, but when it fails, it fills my cache from the other one. Wouldn't it do the same from one or more of your other 11? I have my Boinc set to 1+1 days of cache - maybe that might make a difference for yours? Common sense would dictate it would get 1 day of work from the low priority ones if it had to, but 2 days from the high priority one. But then common sense doesn't always prevail, Boinc does some weird things sometimes."
I did say long term. It takes a while for the deficit to build. Once it does, it will not fill the cache from the other projects because it wants to leave space for the work from the project that has none. The scheduler doesn't know there is no work; it just knows it needs to run lots of work from that project to reduce the deficit. When that project has work again, that may be the only work in your cache for some time - days - until the deficit is reduced.
Edit: Rob, correct, that is a denial of crunch by one project against all the others. PITA. You didn't mention sensible work shares. Give one project a share of 10^10 and another of 10^-10 and things go sideways too.
ID: 94798

robsmith Volunteer tester Help desk expert Joined: 25 May 09 Posts: 1284	Message 94804 - Posted: 9 Jan 2020, 22:22:21 UTC
The actual range is 0 to 1000. BUT zero has a very special meaning.... It means something like "Try to get work from this project if there is none available from any projects with non-zero resource shares, and then only get a very limited number of tasks at a time (normally one)." And another BUT: not all projects have their lower limit set to zero. Some have one, and that is a non-zero value (as if you didn't guess).
ID: 94804
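The special meaning of a zero resource share can be sketched as a fallback rule in Python. This is only an illustration of the description above, not BOINC's work-fetch code: the dictionary keys, the function name `choose_fetch`, and the choice of the highest-share project among those with work are assumptions.

```python
def choose_fetch(projects):
    """Return (project name, task limit) for the next work request.

    `projects` is a list of dicts with hypothetical keys 'name', 'share'
    and 'has_work'. A limit of None means "no special limit".
    """
    # Projects with a non-zero share are always preferred.
    regular = [p for p in projects if p["share"] > 0 and p["has_work"]]
    if regular:
        best = max(regular, key=lambda p: p["share"])  # assumed tie-break
        return best["name"], None
    # Zero-share projects act as backups: used only when nothing else
    # has work, and then only for a very small request (one task here).
    backup = [p for p in projects if p["share"] == 0 and p["has_work"]]
    if backup:
        return backup[0]["name"], 1
    return None, 0
```

A project whose lowest allowed share is one rather than zero never reaches the backup branch, which is the distinction the post draws.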

Jord Volunteer tester Help desk expert Joined: 29 Aug 05 Posts: 15486	Message 94809 - Posted: 10 Jan 2020, 0:24:51 UTC - in response to Message 94804.
robsmith wrote: "The actual range is 0 to 1000."
According to the source code the max is 9999999, which can be set. Just tested that on Seti.
ID: 94809

ProDigit Joined: 8 Nov 19 Posts: 718	Message 94827 - Posted: 10 Jan 2020, 22:29:40 UTC Last modified: 10 Jan 2020, 22:30:10 UTC
Not to interject, but none of that applies to me. As I said, it starts tasks with a 2-week deadline that it estimates will take only 2 hours to complete, and leaves tasks with a 2.5-week deadline at 98%. It's not the first time this has happened. And I believe it has to do with project resource allocation in BOINC BAM, not the deadline, and there is no cache messing going on here. Everything is pretty much set to default.
ID: 94827

ProDigit Joined: 8 Nov 19 Posts: 718	Message 94828 - Posted: 10 Jan 2020, 22:32:40 UTC - in response to Message 94809.
Jord wrote (quoting robsmith: "The actual range is 0 to 1000."): "According to the source code the max is 9999999, which can be set. Just tested that on Seti."
+1, I've tried the same. But in my case, it's not the projects that get a large number assigned that are the problem.
ID: 94828

Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.