Debit reset behaviour

Message boards : BOINC client : Debit reset behaviour
Profile Ananas

Joined: 27 Jun 06
Posts: 305
Germany
Message 8425 - Posted: 25 Feb 2007, 8:59:43 UTC

Not a bug report, just a thought:

Short term debts are reset to 0 when a project is suspended.

Wouldn't it be consistent to also reset the long term debts when downloads for the project are suspended?
ID: 8425
Profile Ananas

Joined: 27 Jun 06
Posts: 305
Germany
Message 8433 - Posted: 26 Feb 2007, 9:18:37 UTC
Last modified: 26 Feb 2007, 9:24:32 UTC

Project reset drops all workunits :-/

I have one dual P3s/1266 that gives 50% to CPDN, set to "no new work" so that it keeps just the one model it is working on. The second CPU is shared between Einstein and QMC@Home (25% each).

It handles the short term debts fine, CPDN always with a small advantage so one CPU sticks to CPDN, but the long term debts of CPDN pile up very high.

I have patched the BOINC client so that it resets all debts to 0 on a restart (it writes "long_term_debt" but tries to read "xong_term_debt", so the saved value is never found).

An "official" solution would be nicer though.
ID: 8433
Profile Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15486
Netherlands
Message 8438 - Posted: 26 Feb 2007, 18:26:44 UTC

Resetting Short Term Debt? Why? It's only a measurement that decides which project will crunch next. Long Term Debt decides which project downloads work next.
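
As a rough illustration of that split, here is a minimal Python sketch. It assumes the simplest possible rule: the project with the highest STD among those with work on the host runs next, and the project with the highest LTD among those allowed to fetch gets the next download. The class, field, and function names are hypothetical and are not taken from the BOINC source; the real client also considers deadlines and other factors.

```python
from dataclasses import dataclass


@dataclass
class Project:
    # All field names are hypothetical, chosen for this illustration only.
    name: str
    short_term_debt: float   # seconds
    long_term_debt: float    # seconds
    has_runnable_work: bool
    no_new_work: bool = False


def next_to_run(projects):
    """Highest STD wins, but only projects with work on the host compete."""
    candidates = [p for p in projects if p.has_runnable_work]
    return max(candidates, key=lambda p: p.short_term_debt, default=None)


def next_to_fetch(projects):
    """Highest LTD wins, skipping projects set to 'no new work'."""
    candidates = [p for p in projects if not p.no_new_work]
    return max(candidates, key=lambda p: p.long_term_debt, default=None)


projects = [
    Project("CPDN", 500.0, -90000.0, True, no_new_work=True),
    Project("Einstein", -200.0, 40000.0, True),
    Project("QMC", -300.0, 50000.0, False),
]

print(next_to_run(projects).name)    # project chosen to crunch next
print(next_to_fetch(projects).name)  # project chosen to download from next
```

With these made-up numbers, CPDN crunches next because of its STD, while QMC is asked for work because of its LTD.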
ID: 8438
Profile Ananas

Joined: 27 Jun 06
Posts: 305
Germany
Message 8440 - Posted: 26 Feb 2007, 21:20:45 UTC
Last modified: 26 Feb 2007, 21:27:45 UTC

Short term debts can already be reset (suspend the project for a moment).

The problem kicks in when a project should work now according to its high positive STD, but its high negative LTD inhibits downloads. The client will download only the amount necessary to keep the CPUs happy, but it will not fill the cache.

It makes sense to be able to influence both debt counters, at least as long as the absolute value of the debts doesn't decay over time for extremely high values. Such a decay, similar to that of the RAC but slower, would really make sense.
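
A decay of that kind could look roughly like the following Python sketch. This is only an illustration of the suggestion, not anything the client does: the function name, the four-week half-life, and the optional threshold below which debts are left alone are all assumptions made up for the example.

```python
WEEK = 7 * 24 * 3600  # seconds


def decay_debt(debt, elapsed_seconds, half_life_seconds, min_abs=0.0):
    """Exponentially shrink a debt toward 0.

    Debts whose absolute value is at or below min_abs are left untouched,
    so only extreme values decay. All parameters are hypothetical.
    """
    if abs(debt) <= min_abs:
        return debt
    return debt * 0.5 ** (elapsed_seconds / half_life_seconds)


# A hugely negative LTD, decayed over four weeks with a four-week half-life.
ltd = -1_000_000.0
print(decay_debt(ltd, 4 * WEEK, half_life_seconds=4 * WEEK))

# A small debt stays as it is when a threshold is set.
print(decay_debt(-5_000.0, 4 * WEEK, half_life_seconds=4 * WEEK, min_abs=100_000.0))
```

The sign of the debt is preserved, so both very negative and very positive values drift back toward 0 at the same rate.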

We often have reports in our forum from people who do not understand why the client does not download from this or that project even though it has a deficit. Everyone who reads the project forums will surely have seen reports that this or that project steals CPU time from the other projects by taking more time than the resource share allows, which is complete nonsense of course.

It usually turns out that one of their projects has piled up weeks or even months of debt, which is just too much. Usually the projects involved either have no constant WU flow, or it is CPDN with its long-running models that need no downloads for months.

A complete debt reset (both types) would sort this out very quickly.


In order to really understand the problem, one needs to "emulate" an inexperienced user who does not understand how the core client handles that stuff and why.
ID: 8440
Profile Trog Dog
Joined: 6 May 06
Posts: 287
Australia
Message 8448 - Posted: 27 Feb 2007, 10:23:14 UTC - in response to Message 8440.  

G'day Ananas

I'm still a bit confused by your explanation.

STD for a project is only applicable when there is a WU from that project present on the host. It also only affects the client's decision on which project gets crunched next, so I don't understand what you mean when you say the client should download more work when there is high +STD and high -LTD. The high STD means that, for the period that project's WUs have been on the host, they have sat patiently waiting to be crunched while other projects have been given preference. Once they are all crunched, the STD for the project will be zero.
ID: 8448
Metod, S56RKO

Joined: 9 Sep 05
Posts: 128
Slovenia
Message 8451 - Posted: 27 Feb 2007, 12:26:32 UTC
Last modified: 27 Feb 2007, 12:27:39 UTC

I think I can (partially at least) understand Ananas here ... I've sometimes found myself bitter seeing a large negative LTD while the BOINC CC seemingly did nothing to decay it in some not-so-distant future.
It seems to me that LTD is only taken into account when a decision about downloading new work is made. If there's no need for new work, a hugely negative LTD remains for a long time. A positive one as well. What surprised me was that even though one project had a huge negative LTD (such as negative one million), work still got downloaded, although the CPU wouldn't have been idle without that work.

My proposal would be this: if the LTD of a project drops below twice the "connect every" interval multiplied by the number of processors, then this should also affect the handling by the CPU scheduler ... this project would get a bit less CPU so that its STD would become positive ... and in the mid-term (such as a week), this would in turn allow the LTD to get closer to 0 again.
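
The threshold in this proposal can be sketched in Python as below. The sketch reads "drops below" as falling under the negative of that product, and it assumes LTD is counted in seconds while the "connect every" preference is given in days. Both function names are hypothetical, and nothing here reflects actual client behaviour.

```python
SECONDS_PER_DAY = 86400


def ltd_floor(connect_every_days, n_processors):
    """Most negative LTD tolerated before the scheduler would step in (assumed units: seconds)."""
    return -2.0 * connect_every_days * SECONDS_PER_DAY * n_processors


def should_throttle(long_term_debt, connect_every_days, n_processors):
    """True if the project's LTD is below the proposed floor."""
    return long_term_debt < ltd_floor(connect_every_days, n_processors)


# Dual-CPU host connecting every half day: the floor is two CPU-days of debt.
print(ltd_floor(0.5, 2))
print(should_throttle(-1_000_000.0, 0.5, 2))  # the "negative one million" case
print(should_throttle(-1_000.0, 0.5, 2))
```

How much CPU to take away once the floor is crossed is left open in the proposal, so the sketch stops at the check itself.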

As things are now, the BOINC CPU scheduler tries to keep STD close to 0, and LTD cannot decay, for months if the host runs those 4-month CPDN work units.

{edit:} typoes
Metod ...
ID: 8451
John McLeod VII
Joined: 29 Aug 05
Posts: 147
Message 8454 - Posted: 27 Feb 2007, 14:14:01 UTC

Let us try not to confuse Long Term and Short Term. Long term is exactly that, it tracks the usage of the project over the long term. It happens that it is used to determine which project to fetch work from next, and yes, it is true that projects that constantly have work on the host will tend to have an LTD that drops. However, I would like to note that if all projects share the CPU by their resource shares all of the time and all projects provide work when asked, the LTD values will stay near 0. It is only when one project has to borrow or some project does not provide work that the LTD values drift away from 0.

Short term debt is also exactly that -- short term. It tracks the usage of projects that have work on the host. It happens that it is used to determine which project to run next after the consideration of borrowing time to meet deadlines.
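
To make the drift described above concrete, here is a simplified bookkeeping sketch in Python. It assumes that in each accounting period every project is entitled to a fraction of the CPU time actually used equal to its resource share, and that its debt changes by the entitled time minus the time it received. The function name and data layout are made up for illustration; the real client's rules are more involved and differ between versions.

```python
def update_debts(debts, shares, cpu_used):
    """Return new debts after one accounting period.

    debts, shares, cpu_used are dicts keyed by project name.
    A project's debt grows when it gets less CPU than its share entitles it to.
    """
    total_used = sum(cpu_used.values())
    total_share = sum(shares.values())
    new_debts = {}
    for name, debt in debts.items():
        entitled = total_used * shares[name] / total_share
        new_debts[name] = debt + entitled - cpu_used.get(name, 0.0)
    return new_debts


shares = {"CPDN": 50, "Einstein": 25, "QMC": 25}
zero = {"CPDN": 0.0, "Einstein": 0.0, "QMC": 0.0}

# One hour on two CPUs, split exactly by resource share: debts stay at 0.
balanced = update_debts(zero, shares, {"CPDN": 3600.0, "Einstein": 1800.0, "QMC": 1800.0})
print(balanced)

# QMC has no work, so Einstein uses its time: the debts drift apart.
drifted = update_debts(zero, shares, {"CPDN": 3600.0, "Einstein": 3600.0, "QMC": 0.0})
print(drifted)
```

In this model the debts always sum to zero, so one project's surplus is exactly another project's deficit.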

Yes, it is true that you will not have work for all projects all of the time, but under certain load situations, that is not possible in any case.

Also note that work fetch and CPU scheduling were changed again for 5.8.x.

BOINC WIKI
ID: 8454
KAMasud

Joined: 13 Feb 07
Posts: 21
Pakistan
Message 8457 - Posted: 27 Feb 2007, 14:41:13 UTC


OK! What will happen to LTD after I finish crunching BBC, SAP and QMC, all of which have long WUs lasting anywhere from four weeks to a year? I will be owing a lot of time to other projects by the time I finish them. :-(
Regards
Masud.
ID: 8457
MikeMarsUK

Joined: 16 Apr 06
Posts: 386
United Kingdom
Message 8462 - Posted: 1 Mar 2007, 8:10:29 UTC


You can run many long-duration workunits on the same machine without long-term-debt problems, provided that those projects have been given sufficient resource share so they can complete within the allocated time (and that the PC is fast enough, of course).

Only if you ignore resource share, or get too enthusiastic with a slower machine, will you get LTD issues.

ID: 8462


Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.