Scheduler weakness

W-K ID 666

Joined: 30 Dec 05
Posts: 457
United Kingdom
Message 13163 - Posted: 20 Oct 2007, 13:29:29 UTC

I run several projects, with Seti having the largest resource share. The main project I rely on for work when Seti is down is Einstein.

Because of the problems Seti had a couple of months ago, Einstein got a much bigger share than normal and therefore went severely into long-term debt (LTD).

This Einstein debt, rather than shrinking, has grown larger, because every time Seti has the slightest glitch the scheduler decides it MUST get work from somewhere. As the other projects quite often have no work available, it downloads more from Einstein.

This wouldn't be so bad if it downloaded one unit per CPU, but no: 'IT' decides that because I run a one-day cache, 'IT' will download enough work for at least 24 hours/CPU. This usually means 30+ hours of work (six S5R3 units).
Because the resource share is set so that one Einstein unit/CPU can be finished within the deadline, and it has received three times more work than I want, BOINC then goes into EDF (earliest deadline first) mode and the LTD for Einstein keeps growing.

Therefore we need a mechanism that downloads units from a project in heavy LTD only if the work on our computers is going to run out in the next few hours, and then downloads only the minimum of work, i.e. one unit/CPU and no more. Further requests can be made if the difficulty continues.
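
The rule proposed above can be sketched roughly as follows. This is only an illustration of the suggestion, not how the BOINC client works: the function name, the `heavy_debt` threshold and the `panic_hours` window are all assumptions made up for the example.

```python
def units_to_request(long_term_debt, hours_until_idle, ncpus,
                     heavy_debt=-100_000.0, panic_hours=3.0):
    """Hypothetical fallback rule for a project deep in negative LTD.

    long_term_debt   -- the project's LTD in seconds (negative = overworked)
    hours_until_idle -- estimated hours before the host runs out of work
    ncpus            -- number of CPUs on the host
    heavy_debt       -- assumed threshold below which a project counts as "in heavy LTD"
    panic_hours      -- assumed meaning of "the next few hours"

    Returns the number of units to request, or None when the project is
    not in heavy debt and the ordinary cache-filling logic should decide.
    """
    if long_term_debt > heavy_debt:
        return None          # not heavily in debt: normal work fetch applies
    if hours_until_idle > panic_hours:
        return 0             # still plenty of work: leave the project alone
    return ncpus             # about to run dry: one unit per CPU, no more


# A dual-core host with one hour of work left asks for two units;
# the same host with 20 hours of work left asks for nothing.
print(units_to_request(-400_000, 1.0, 2))
print(units_to_request(-400_000, 20.0, 2))
```

If the shortage continues, the same check would simply fire again on the next request, which matches the "further requests can be made" part of the proposal.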

I am running BOINC 5.10.xx, connected 24/7.
ID: 13163
W-K ID 666

Joined: 30 Dec 05
Posts: 457
United Kingdom
Message 13195 - Posted: 22 Oct 2007, 9:37:03 UTC - in response to Message 13164.  

The issue is that on the one hand you want a 24-hour buffer (connect once per 24 hours), yet on the other hand BOINC never knows whether any project has work. It will simply follow your instruction and fill up that buffer. Now, if Seti is the main project and the others are just fillers for idle time, set their resource share / weight very low. With BOINC 5.10.x you can approach the situation differently, but I don't know how it will work out in your situation.

1. Set Connect xxxx days short, like 0.01 days
2. Set Additional buffer to 1 day (local preferences)

In principle this setup takes care of continuous backfilling; thus, if Seti were not able to supply work, Einstein would only have room to download a little at the back end. That would continue until Seti comes back.

Give it a try; the assumption here is that the internet connection is permanently open.

That is what I do: connection/cache is exactly as you suggest, and the resource share for Einstein is 7% on the Pent M and 4% on the C2D, which just allows enough time to do one unit/CPU within the deadline without going into EDF.

I have to assume it is a BOINC problem because the same pattern is happening on both computers.
I could micro-manage mine, the Pent M, but don't want to. The C2D is my son's, and he is not always here.

Andy

ID: 13195
Metod, S56RKO

Joined: 9 Sep 05
Posts: 128
Slovenia
Message 13199 - Posted: 22 Oct 2007, 14:29:36 UTC - in response to Message 13195.  
Last modified: 22 Oct 2007, 14:30:58 UTC

I have to assume it is a BOINC problem because the same pattern is happening on both computers.


This is BOINC by design as I see it. And IMHO it's flawed.

If I understand it right, the BOINC client accumulates LTD when a certain project is out of work/inaccessible/down for some time (among other reasons). IMHO this is wrong. I agree that the BOINC client should accumulate (and spend) LTD if the client itself runs into some kind of problem (e.g. no internet connectivity, EDF, ...).

However, I think that LTD shouldn't accumulate if there are problems on the project's side. If a project doesn't provide any work, then IMHO this project is actually voluntarily giving up its resource share. Other project troubles (such as server crashes etc.) are not voluntary as such; however, the time needed to bring the project up again is in the project staff's hands (well, more or less). Hence the same reasoning: if a project can't provide users with work, LTD should not build up.

The distinction between client-side and project-side troubles is not always trivial. There is a mechanism to check whether the BOINC client is unable to contact a project's server due to local or remote connection problems (everybody has noticed that the BOINC client sometimes connects to Google), so this kind of distinction could be made.

My humble opinion is that LTD should build up only if the client is in EDF or if the BOINC client cannot fetch new work from any of the attached projects.
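
Written out as a condition, the rule suggested above might look like the sketch below. It is purely illustrative: `ltd_should_accumulate` and its arguments are hypothetical names, and nothing here is taken from the actual client.

```python
def ltd_should_accumulate(client_in_edf, fetch_ok_by_project):
    """Hypothetical predicate for the proposed rule.

    client_in_edf       -- True if the client is currently in EDF mode
    fetch_ok_by_project -- dict mapping project name to True if the client
                           could fetch work from that project, else False
    """
    # "Cannot fetch new work from any attached project" means every
    # attached project failed to supply work.
    no_project_supplies_work = not any(fetch_ok_by_project.values())
    return client_in_edf or no_project_supplies_work


# Seti is down but Einstein supplies work and the client is not in EDF:
# under the proposal, no debt would build up.
print(ltd_should_accumulate(False, {"seti": False, "einstein": True}))
```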

Metod ...
ID: 13199
Nicolas

Joined: 19 Jan 07
Posts: 1179
Argentina
Message 13202 - Posted: 22 Oct 2007, 14:53:29 UTC - in response to Message 13199.  

I have to assume it is a BOINC problem because the same pattern is happening on both computers.


This is BOINC by design as I see it. And IMHO it's flawed.

See this post.
ID: 13202
Metod, S56RKO

Joined: 9 Sep 05
Posts: 128
Slovenia
Message 13205 - Posted: 22 Oct 2007, 15:22:59 UTC - in response to Message 13202.  

I have to assume it is a BOINC problem because the same pattern is happening on both computers.


This is BOINC by design as I see it. And IMHO it's flawed.

See this post.


I volunteer my resources to BOINC projects and, unlike some others, I accept BOINC as is. This doesn't stop me from being annoyed about the scheduler's behaviour though. :)
Metod ...
ID: 13205
Keck_Komputers
Joined: 29 Aug 05
Posts: 304
United States
Message 13207 - Posted: 22 Oct 2007, 20:56:42 UTC

@Metod
I may be reading your post wrong but I think the LTD works closer to what you are requesting than you seem to think.

When a project is in a deferral for any reason (including No New Work (NNW) and suspensions) the LTD basically does not change. It changes when another project or projects are hogging the CPU or the queue on the host. There is some drift when all LTDs are normalized, but it usually takes much longer to make a substantial difference.
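
A very simplified model of the behaviour described above is sketched here. It is an assumption-laden illustration, not the client's real accounting: the function and argument names are invented, the normalization step is left out, and deferred projects are simply skipped.

```python
def update_ltd(debts, shares, cpu_used, deferred):
    """One simplified accounting step (hypothetical model).

    debts    -- dict: project name -> current LTD in seconds
    shares   -- dict: project name -> resource share
    cpu_used -- dict: project name -> CPU seconds used in this period
    deferred -- set of project names currently in a deferral
    """
    active = [p for p in debts if p not in deferred]
    total_share = sum(shares[p] for p in active)
    if not active or total_share == 0:
        return dict(debts)            # nothing to account for

    total_cpu = sum(cpu_used[p] for p in active)
    new_debts = dict(debts)           # deferred projects keep their old LTD
    for p in active:
        entitled = total_cpu * shares[p] / total_share
        # Debt grows if the project got less CPU than its share, shrinks otherwise.
        new_debts[p] = debts[p] + entitled - cpu_used[p]
    return new_debts


debts = {"seti": 0.0, "einstein": 0.0}
shares = {"seti": 93, "einstein": 7}
used = {"seti": 0.0, "einstein": 3600.0}   # Einstein ran alone for an hour

print(update_ltd(debts, shares, used, deferred=set()))     # Einstein hogging: debts move
print(update_ltd(debts, shares, used, deferred={"seti"}))  # Seti deferred: nothing changes
```

In this model Einstein's debt only goes negative when it takes CPU time while Seti is still counted as able to run, which is the "hogging" case mentioned in the post.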
BOINC WIKI

BOINCing since 2002/12/8
ID: 13207
W-K ID 666

Joined: 30 Dec 05
Posts: 457
United Kingdom
Message 13210 - Posted: 23 Oct 2007, 4:02:02 UTC

It certainly takes much more time than one would originally think to sort itself out. This problem I have with Einstein, a large negative LTD while it is still downloading too many units, has been going on for at least three months, since August.
And because it downloads too many units, the LTD has increased from -200k to over -400k as it stands now. We have aborted all Einstein units on the C2D and hope the AP units from Seti Beta run OK and decrease the LTD that way.

Andy
ID: 13210
Metod, S56RKO

Joined: 9 Sep 05
Posts: 128
Slovenia
Message 13214 - Posted: 23 Oct 2007, 8:22:19 UTC - in response to Message 13207.  

When a project is in a deferral for any reason (including NNW and suspensions) the LTD basically does not change. It changes when another project or projects is hogging the CPU or the queue on the host. There is some drift when all LTDs are normalized but it usually takes much longer to make a substantial difference.


I agree with what you write. But the problematic behaviour is when the client cannot connect to the project scheduler (e.g. when the project scheduler itself is down); in this case LTD builds up. As far as I can see, this case (connection to the project scheduler unsuccessful) is handled in just the same way as the case when the client doesn't request any work for other reasons (e.g. EDF, too low LTD, ...).
Metod ...
ID: 13214
Keck_Komputers
Joined: 29 Aug 05
Posts: 304
United States
Message 13236 - Posted: 24 Oct 2007, 3:53:22 UTC - in response to Message 13214.  

When a project is in a deferral for any reason (including NNW and suspensions) the LTD basically does not change. It changes when another project or projects is hogging the CPU or the queue on the host. There is some drift when all LTDs are normalized but it usually takes much longer to make a substantial difference.


I agree with what you write. But: the problematic behaviour is when client can not connect project scheduler (eg. when project scheduler itself is down) ... in this case LTD builds up. As far as I can see this case (connection to project scheduler unsuccessful) is handled just the same way as the case when client doesn't request any work due to other reasons (eg. EDF, too low LTD, ...).

When the client cannot get work, it causes a deferral and the LTD does not change. If the client has a full queue when the deferral expires, then the LTD will start moving again.
BOINC WIKI

BOINCing since 2002/12/8
ID: 13236

Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.