High Priority = Wrong Priority

cw64

Send message
Joined: 27 Aug 10
Posts: 16
United Kingdom
Message 36092 - Posted: 17 Dec 2010, 12:40:15 UTC

When running at high priority, priority should be given to the unit with the earliest deadline and nothing else (except possibly +/- the estimated runtime). Any other system of selection will only and can only increase the chance of units being returned late.
ID: 36092 · Report as offensive
Profile Jord
Volunteer tester
Help desk expert
Avatar

Send message
Joined: 29 Aug 05
Posts: 14825
Netherlands
Message 36093 - Posted: 17 Dec 2010, 13:08:16 UTC - in response to Message 36092.  
Last modified: 17 Dec 2010, 13:09:41 UTC

BOINC version?
Operating system?
General system specs?
Which projects are you attached to?
How much work is on the machine?
What is the connect to interval plus the additional amount of days worth of work that should be stored?
What's the on_frac and active_frac?
What's the DCF of the affected projects?
And where is your proof, given by the cc_config.xml file with the following flags enabled:

<cpu_sched_debug>: problems involving the choice of applications to run.
<rr_simulation>: problems involving jobs being run in high-priority mode.
<time_debug>: updates to on_frac, active_frac, connected_frac.
<debt_debug>: Show changes to project debt.
<dcf_debug>: When enabled it shows the calculation of the duration correction factor per project at the start and end of tasks.

Without all of that information your post is just text on a monitor. No developer will be able to do anything with it as there's nothing to compare it with, or no way to reproduce it.
ID: 36093 · Report as offensive
Pepo
Avatar

Send message
Joined: 3 Apr 06
Posts: 547
Slovakia
Message 36099 - Posted: 17 Dec 2010, 14:12:20 UTC - in response to Message 36092.  

When running at high priority, priority should be given to the unit with the earliest deadline and nothing else (except possibly +/- the estimated runtime). Any other system of selection will only and can only increase the chance of units being returned late.

Imagine you have two tasks on your single-core machine:

  • task A has already run for 16 days, has 5 hours to go, and its deadline is in 6 hours.
  • task B has not started yet, has 3 hours to go, and its deadline is in 4 hours.


Should the client crunch task B, report it soon enough, then continue with task A, report it 2 hours past deadline and risk more than 16 days of work being discarded?
Or should it crunch task A, report it soon enough, then possibly either abort the unstarted task B (because it is already past deadline) and ask for a replacement (to be crunched in-time), or (if not yet past deadline) crunch the task B, report it and risk 3 hours of work being discarded?

The decision is not always that simple, like your imperative "earliest deadline and nothing else"...
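
The two orderings above can be checked with a few lines of Python. This is only a sketch of the arithmetic in the example, assuming a single core that runs tasks back to back with no switching overhead; the function name `lateness` is made up for illustration and is not part of BOINC.

```python
from itertools import permutations

# The two tasks from the example: (name, hours of work remaining, hours until deadline)
tasks = [("A", 5.0, 6.0), ("B", 3.0, 4.0)]

def lateness(order):
    """Run tasks back to back on one core; return hours past deadline per task."""
    clock = 0.0
    late = {}
    for name, remaining, deadline in order:
        clock += remaining                      # task finishes at this time
        late[name] = max(0.0, clock - deadline)
    return late

for order in permutations(tasks):
    names = " then ".join(t[0] for t in order)
    print(names, lateness(order))
```

Either order leaves one task late, so the choice comes down to which loss is cheaper, and that is information a pure deadline ordering does not look at.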

Peter

ID: 36099 · Report as offensive
SekeRob

Send message
Joined: 25 Aug 06
Posts: 1596
Message 36101 - Posted: 17 Dec 2010, 15:29:18 UTC - in response to Message 36099.  

Easily reproducible if one or more of several active projects has a low resource share and long-running tasks that, unless run in HP at intervals, would otherwise not be able to meet their deadline.

Example: A 1 core device with a 24 hour task due in 10 days with only 5% resource share and a fair size cache. Normally the task gets 1.2 hours daily, but needs twice as much. Other projects have shorter run times and bigger resource shares.... et voila.
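
The numbers in that example work out as follows. A rough sketch only: it assumes resource share translates linearly into CPU hours per day, which is a simplification, and the helper names are hypothetical.

```python
def daily_hours(resource_share, cores=1):
    """CPU hours per day a project gets if share maps linearly to time (assumption)."""
    return 24.0 * cores * resource_share

def required_daily_hours(task_hours, days_to_deadline):
    """CPU hours per day the task needs in order to finish by its deadline."""
    return task_hours / days_to_deadline

got = daily_hours(0.05)                   # 5% share on one core
need = required_daily_hours(24.0, 10.0)   # 24 hour task due in 10 days

print(f"gets {got:.1f} h/day, needs {need:.1f} h/day, shortfall factor {need / got:.1f}")
```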
Coelum Non Animum Mutant, Qui Trans Mare Currunt
ID: 36101 · Report as offensive
cw64

Send message
Joined: 27 Aug 10
Posts: 16
United Kingdom
Message 36102 - Posted: 17 Dec 2010, 15:32:56 UTC - in response to Message 36099.  
Last modified: 17 Dec 2010, 16:24:01 UTC

When running at high priority, priority should be given to the unit with the earliest deadline and nothing else (except possibly +/- the estimated runtime). Any other system of selection will only and can only increase the chance of units being returned late.

Imagine you have two tasks on your single-core machine:

  • task A has already run for 16 days, has 5 hours to go, and its deadline is in 6 hours.
  • task B has not started yet, has 3 hours to go, and its deadline is in 4 hours.


Should the client crunch task B, report it soon enough, then continue with task A, report it 2 hours past deadline and risk more than 16 days of work being discarded?
Or should it crunch task A, report it soon enough, then possibly either abort the unstarted task B (because it is already past deadline) and ask for a replacement (to be crunched in-time), or (if not yet past deadline) crunch the task B, report it and risk 3 hours of work being discarded?

The decision is not always that simple, like your imperative "earliest deadline and nothing else"...

Peter



except possibly +/- the estimated runtime


I was thinking along those lines also.

A = Present time
B = Task x's due time
e = Task x's estimated time
t = The difference between A and B
n = t +/- e (doesn't matter if it is added or subtracted so long as the same rule is applied for each task)

When running under "High Priority" the task with the lowest value for n is processed first.
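
A minimal sketch of this rule, with made-up task values, not taken from any real work queue. It computes n both ways. With these particular numbers the two variants pick different tasks, which suggests the choice of sign may be worth settling before comparing against what the client actually does.

```python
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    hours_to_deadline: float   # t = B - A
    estimated_hours: float     # e

def n_minus(task):
    """n = t - e: the slack left if the task started right now."""
    return task.hours_to_deadline - task.estimated_hours

def n_plus(task):
    """n = t + e: the other variant of the proposed rule."""
    return task.hours_to_deadline + task.estimated_hours

# Hypothetical tasks, chosen only to illustrate the comparison.
tasks = [Task("X", 10.0, 8.0), Task("Y", 6.0, 1.0)]

first_by_minus = min(tasks, key=n_minus).name   # X has n = 2, Y has n = 5
first_by_plus = min(tasks, key=n_plus).name     # X has n = 18, Y has n = 7

print("t - e picks", first_by_minus)
print("t + e picks", first_by_plus)
```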
ID: 36102 · Report as offensive
cw64

Send message
Joined: 27 Aug 10
Posts: 16
United Kingdom
Message 36103 - Posted: 17 Dec 2010, 16:04:45 UTC - in response to Message 36093.  
Last modified: 17 Dec 2010, 16:52:04 UTC

...Without all of that information your post is just text on a monitor. No developer will be able to do anything with it as there's nothing to compare it with, or no way to reproduce it.


Actually it is a statement of fact first and foremost (i.e. it doesn't require proof. It is a logical argument which you can work out for yourself).
I did assume that it was a problem in the core code (i.e. it does not follow the rule I suggest) and did not pertain to any particular version, system, etc. So if you care to overfill your cache you can see for yourself how boinc doesn't pick the task with the earliest deadline (+/- estimated runtime) first.

If boinc already uses the system I suggest then the process of debugging can begin and I will throw as much info as you ask for your way. If it doesn't then it needs to be changed.
ID: 36103 · Report as offensive
Aurora Borealis
Avatar

Send message
Joined: 8 Jan 06
Posts: 448
Canada
Message 36105 - Posted: 17 Dec 2010, 18:53:56 UTC
Last modified: 17 Dec 2010, 18:58:00 UTC

Sorry, but a statement is only a hypothesis requiring a reproducible proof to become a theory which may eventually be considered as fact. For your future reference and enlightenment I point you to a simplified explanation: Hypothesis versus Theory versus Fact.

EDF and its latest incarnation HP have been studied, discussed and simulated under a multitude of scenarios by the developers and testers. Cache size, over/under estimated run times, long/short/slack/tight due dates, high/low resource share ... etc. The solution is nowhere near simple considering the diversity of projects. The current algorithm isn't perfect, and probably never will be.

My personal experience since joining Boinc when it went public is that it does the job very well under most circumstances. I've only had to intervene once to delete excess work, and that was when a project put out WUs that took >100 times the given estimate with a relatively short deadline. From my point of view the present system, although needing continued fine tuning, requires no major overhaul.

Boinc V 7.4.36
Win7 i5 3.33G 4GB NVidia 470
ID: 36105 · Report as offensive
Profile Jord
Volunteer tester
Help desk expert
Avatar

Send message
Joined: 29 Aug 05
Posts: 14825
Netherlands
Message 36106 - Posted: 17 Dec 2010, 19:24:04 UTC - in response to Message 36103.  

So if you care to overfill your cache you can see for yourself how boinc doesn't pick the task with the earliest deadline (+/- estimated runtime) first.

Again, run with debug flags to corroborate this claim. At the least <rr_simulation>

Personally I run BOINC with work from 7 different projects, all with their own different (by application) deadlines and it manages to get all work that's cached in by their respective deadlines. Without me needing to do much of anything, heck for the past 4 weeks I have even managed to play a lot of intensive 3D games without that interfering with the work being done and returned in time.

Of course, overfilling your cache on purpose... well, what do you expect then? Do you also fill up your Diesel with gasoline to forcibly point out to the car manufacturer that this won't work?

ID: 36106 · Report as offensive
cw64

Send message
Joined: 27 Aug 10
Posts: 16
United Kingdom
Message 36107 - Posted: 17 Dec 2010, 20:26:56 UTC - in response to Message 36105.  

Sorry, but a statement is only a hypothesis requiring a reproducible proof to become a theory which may eventually be considered as fact.


It is a fact, and I defy you to describe a scenario where it holds to not be true.
ID: 36107 · Report as offensive
cw64

Send message
Joined: 27 Aug 10
Posts: 16
United Kingdom
Message 36109 - Posted: 17 Dec 2010, 20:48:26 UTC - in response to Message 36106.  
Last modified: 17 Dec 2010, 20:52:29 UTC

So if you care to overfill your cache you can see for yourself how boinc doesn't pick the task with the earliest deadline (+/- estimated runtime) first.

Again, run with debug flags to corroborate this claim. At the least <rr_simulation>


Will this do?
http://img152.imageshack.us/img152/452/bottem.png
(4 cores) You can see units have stopped being worked on by the HP system, and units at most risk of going back late are ignored. Sticking to FIFO would be a better solution (in this case) than the HP system.

Of course, overfilling your cache on purpose... well, what do you expect then?

Be it by design or by accident (BTW my system will send all units back on time), I expect a system whose sole function is to ensure that work is done on time (or to do damage limitation) to actually work as intended, rather than effectively making the situation worse.
ID: 36109 · Report as offensive
Profile Jord
Volunteer tester
Help desk expert
Avatar

Send message
Joined: 29 Aug 05
Posts: 14825
Netherlands
Message 36110 - Posted: 17 Dec 2010, 21:10:46 UTC - in response to Message 36109.  
Last modified: 17 Dec 2010, 21:11:06 UTC

Yet you refuse to give definite proof, by debug logs.
You won't give any information other than that 'you seen it'.

Or you will, but only after the developers change the program so it does what you want it to do, despite what that does to all the other people out there? Bit of a weird demand.

If you want to change it so it runs as you think it should, get the source code and start hacking at it. It's open source. Should be easy for you to make it do what you want, seeing as how you can tell without a shred of evidence that the system is broken beyond (your) repair. :-)

Have fun. I'll not return here, as you make me laugh too much and that's still too painful. :-(
ID: 36110 · Report as offensive
cw64

Send message
Joined: 27 Aug 10
Posts: 16
United Kingdom
Message 36116 - Posted: 18 Dec 2010, 0:38:54 UTC - in response to Message 36110.  
Last modified: 18 Dec 2010, 0:55:53 UTC

Yet you refuse to give definite proof, by debug logs.
You won't give any information other than that 'you seen it'.

Or you will, but only after the developers change the program so it does what you want it to do, despite what that does to all the other people out there? Bit of a weird demand.

If you want to change it so it runs as you think it should, get the source code and start hacking at it. It's open source. Should be easy for you to make it do what you want, seeing as how you can tell without a shred of evidence that the system is broken beyond (your) repair. :-)

Have fun. I'll not return here, as you make me laugh too much and that's still too painful. :-(


Really? That's big of you. Excuse me for trying to improve boinc. Enjoy your power trip.
PS I haven't refused to provide anything, I was avoiding it because a) it shouldn't be necessary (also the pic I provided illustrates my point perfectly) and b) it is a PITA. However seeing as you walked out of the thread in a big hissy fit not to return it would be pointless to get any debugging data.
ID: 36116 · Report as offensive
Profile Jord
Volunteer tester
Help desk expert
Avatar

Send message
Joined: 29 Aug 05
Posts: 14825
Netherlands
Message 36121 - Posted: 18 Dec 2010, 7:54:10 UTC - in response to Message 36116.  
Last modified: 18 Dec 2010, 7:56:10 UTC

PS I haven't refused to provide anything, I was avoiding it because a) it shouldn't be necessary (also the pic I provided illustrates my point perfectly) and b) it is a PITA. However seeing as you walked out of the thread in a big hissy fit not to return it would be pointless to get any debugging data.

I have requested the debug data 3 times, and all 3 times you parried by saying it wasn't necessary in your opinion, or not until after BOINC had made the changes you found necessary.

A PITA or not, it is the only way in which you can prove beyond a reasonable doubt that your BOINC isn't doing things according to how it should be doing it, your opinion and screen shots don't do those. It's not so much for me that the logs are for but for the developers. You want them to change BOINC, to fix possible bugs, right? Then you'll have to come with proof they can read.

Screen shots aren't proof.
They don't show over the duration of time what your system was doing. They're a testimony of one point in time only, not showing what happened before, not showing what happened since, not showing what might be going to happen next.

Now, I'm giving you one more chance to get the data. And don't just run for 10 minutes with the debug flags, that's not going to help much. You will need a 24 hours log to show everything in complete order.

Your stdoutdae.txt won't log all of that? Then increase its maximum size.
That same cc_config.xml file with the option:

<max_stdout_file_size>size_in_bytes</max_stdout_file_size>, where size_in_bytes is something big enough. Say 80MB, or 83886080.

Save to cc_config.xml, then exit & restart BOINC.

So your total cc_config.xml should show like the following:

<cc_config>
<log_flags>
<cpu_sched_debug>1</cpu_sched_debug>
<work_fetch_debug>1</work_fetch_debug>
<rr_simulation>1</rr_simulation>
<time_debug>1</time_debug>
<debt_debug>1</debt_debug>
<dcf_debug>1</dcf_debug>
</log_flags>
<options>
<max_stdout_file_size>83886080</max_stdout_file_size>
</options>
</cc_config>


(I added work_fetch_debug as well)
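
Before restarting BOINC it can be worth confirming that the file is well-formed, since a single unclosed tag is easy to miss. The sketch below uses Python's standard XML parser, not BOINC's own reader, so it only shows that the XML parses and which log flags are set to 1. The function name `enabled_log_flags` is made up for illustration.

```python
import xml.etree.ElementTree as ET

# A well-formed copy of the configuration discussed above.
CONFIG = """<cc_config>
<log_flags>
<cpu_sched_debug>1</cpu_sched_debug>
<work_fetch_debug>1</work_fetch_debug>
<rr_simulation>1</rr_simulation>
<time_debug>1</time_debug>
<debt_debug>1</debt_debug>
<dcf_debug>1</dcf_debug>
</log_flags>
<options>
<max_stdout_file_size>83886080</max_stdout_file_size>
</options>
</cc_config>"""

def enabled_log_flags(xml_text):
    """Return the names of log flags set to 1; raises ET.ParseError on malformed XML."""
    root = ET.fromstring(xml_text)
    flags = root.find("log_flags")
    if flags is None:
        return []
    return [el.tag for el in flags if (el.text or "").strip() == "1"]

print(enabled_log_flags(CONFIG))
print(ET.fromstring(CONFIG).findtext("options/max_stdout_file_size"))
```

To check a file on disk, read its contents and pass the text to the same function.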
ID: 36121 · Report as offensive
cw64

Send message
Joined: 27 Aug 10
Posts: 16
United Kingdom
Message 36130 - Posted: 18 Dec 2010, 13:17:35 UTC - in response to Message 36121.  

A PITA or not, it is the only way in which you can prove beyond a reasonable doubt that your BOINC isn't doing things according to how it should be doing it, your opinion and screen shots don't do those. It's not so much for me that the logs are for but for the developers. You want them to change BOINC, to fix possible bugs, right? Then you'll have to come with proof they can read.


It was and still is my point that boinc is running correctly. That the HP system has been overcomplicated to the point of not working as intended. A developer who knows how HP is coded/implemented would know whether the HP system uses my system or something else. Without several miles of text.

It doesn't look like 80MB is going to be big enough to cover 24 hours
38 minutes - 7.62MB
http://rapidshare.com/files/438000304/stdoutdae.old
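
A rough extrapolation from those figures, as a sketch. It assumes the log keeps growing at the same rate and that "MB" means 1024 * 1024 bytes, as in the 80MB = 83886080 figure quoted earlier. The result is only an estimate for picking a max_stdout_file_size value.

```python
import math

MIB = 1024 * 1024

observed_mib = 7.62      # log size reported above
observed_minutes = 38    # time it took to reach that size

rate = observed_mib / observed_minutes          # MiB per minute, assumed constant
mib_24h = rate * 24 * 60                        # projected size after 24 hours
size_in_bytes = math.ceil(mib_24h * MIB)        # candidate value for the option

print(f"about {mib_24h:.0f} MiB in 24 hours, i.e. {size_in_bytes} bytes")
print("80 MiB is enough:", size_in_bytes <= 80 * MIB)
```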
ID: 36130 · Report as offensive
cw64

Send message
Joined: 27 Aug 10
Posts: 16
United Kingdom
Message 36152 - Posted: 19 Dec 2010, 9:52:56 UTC - in response to Message 36121.  

Proof

http://img152.imageshack.us/img152/452/bottem.png
http://img683.imageshack.us/img683/3161/bottem2.png

Look at the task near the bottom of the pics for Help Cure Muscular Dystrophy: 67% completed in one and 100% in the other. This task has ~2.5 days more until the deadline than the tasks above it. The HP system, by returning to complete this task, has increased the chance of the tasks above it being returned late. This should be reflected in the pending log.
ID: 36152 · Report as offensive
mo.v
Avatar

Send message
Joined: 13 Aug 06
Posts: 778
United Kingdom
Message 36155 - Posted: 19 Dec 2010, 12:29:58 UTC

Here are live links:

http://img152.imageshack.us/img152/452/bottem.png
http://img683.imageshack.us/img683/3161/bottem2.png

It does seem strange, doesn't it? Sometimes the Boinc scheduler seems more like the Sybil at Delphi than a piece of man-made code. Inscrutable and, for most of us, largely incomprehensible apart from its most general aspects.

It will be interesting to see whether all these tasks get completed on time or not.

Have you considered reducing the additional work buffer so the computer isn't faced with digesting quite such big dinners in future?

ID: 36155 · Report as offensive
cw64

Send message
Joined: 27 Aug 10
Posts: 16
United Kingdom
Message 36157 - Posted: 19 Dec 2010, 13:35:13 UTC - in response to Message 36155.  

Have you considered reducing the additional work buffer so the computer isn't faced with digesting quite such big dinners in future?


I usually run with half a day of cache. DDDT tasks come in spurts with long periods of none being available. So when they are available I grab as many as I can.


Here is the log file. For some reason it restarted so it's only 12 hours long. If you need a longer period please say so immediately and I will start logging again.

http://rapidshare.com/files/438185963/stdoutdae__2_.rar
ID: 36157 · Report as offensive
cw64

Send message
Joined: 27 Aug 10
Posts: 16
United Kingdom
Message 36159 - Posted: 19 Dec 2010, 13:58:29 UTC - in response to Message 36157.  
Last modified: 19 Dec 2010, 14:02:58 UTC

To sum up

When running at high priority, priority should be given to the unit with the earliest deadline and nothing else (except +/- the estimated runtime). Any other system of selection will only and can only increase the chance of units being returned late.

The HP system should* work as follows

A = Present time
B = Task x's due time
e = Task x's estimated time
t = The difference between A and B
n = t +/- e (doesn't matter if it is added or subtracted so long as the same rule is applied for each task)

When running under "High Priority" the task with the lowest value for n is processed first. Once a task selected by HP is running it will run to completion.
If two or more tasks have the same value for n then priority is given to the one with the most accumulated runtime. If two or more of those have the same accumulated runtime then priority is given to the one with the shortest remaining estimated time. If two or more of those have the same remaining time then the choice is arbitrary. Picking one over the other will not affect the chance of units being returned late. n needs only to be calculated once for each task.
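
The tie-breaking chain can be written as a single sort key. This is a sketch of the proposal as stated here, not of how the BOINC client is implemented. It assumes the subtract variant (n = t - e), and all task values and names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    hours_to_deadline: float     # t
    estimated_hours: float       # e, estimated total runtime
    elapsed_hours: float = 0.0   # accumulated runtime so far

    @property
    def remaining_hours(self):
        return max(0.0, self.estimated_hours - self.elapsed_hours)

def hp_key(task):
    """Lowest n first, then most accumulated runtime, then shortest remaining time."""
    n = task.hours_to_deadline - task.estimated_hours   # assumed variant: t - e
    return (n, -task.elapsed_hours, task.remaining_hours)

def hp_order(tasks):
    # sorted() is stable, so fully tied tasks keep their input order (arbitrary choice).
    return sorted(tasks, key=hp_key)

# Hypothetical queue: R has the least slack; P, S and Q all tie on n = 6.
queue = [
    Task("P", 10.0, 4.0, elapsed_hours=1.0),
    Task("Q", 8.0, 2.0),
    Task("R", 5.0, 3.0),
    Task("S", 12.0, 6.0, elapsed_hours=1.0),
]

order = [t.name for t in hp_order(queue)]
print(order)
```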

* IMO
ID: 36159 · Report as offensive
PMH_UK

Send message
Joined: 24 Dec 10
Posts: 5
United Kingdom
Message 36200 - Posted: 24 Dec 2010, 12:35:33 UTC - in response to Message 36157.  



I usually run with half a day of cache. DDDT tasks come in spurts with long periods of none being available. So when they are available I grab as many as I can.


I am also experiencing this with lots of DDDT in cache.
Suspending some of the later tasks allowed normal running, but resuming them set high priority again and the client switched to much later tasks.
These tasks have 7 to 9 days to go and total estimated to run is under 5 days (222 hours dual-core).

This behaviour does appear wrong, more so choosing much later tasks to run.
Will try with debug.

Win XP home, WCG BOINC 6.10.58, Intel dual-core, 2GB.

Paul.
ID: 36200 · Report as offensive
Professor Ray

Send message
Joined: 31 Mar 08
Posts: 59
United States
Message 36691 - Posted: 1 Feb 2011, 9:20:58 UTC

My issue with BOINC v6.10.58 is that it solicits WU's when ALL of the WU's it presently has are HP.
ID: 36691 · Report as offensive

Copyright © 2022 University of California. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.