Posts by squeak

1) Message boards : BOINC client : Work fetch policy (Message 44036)
Posted 7 May 2012 by squeak
Post:
Work is only fetched when the total queue [my emphasis] falls below the minimum work buffer value. Then work will be fetched from the project with the highest priority, based on work already done and resource share.


I continue to be unhappy about the fact that there seems to be no clean way of ensuring that each project has some work to do. My own multiproject environment includes CPDN which will always have lots of work because of massive WUs, and so BOINC seems to say to itself that it doesn't need to download WUs from other projects because there's always plenty to do. I would prefer the above algorithm to be implemented on a per project basis.

Also, the lag between WUs finishing and reporting of them puzzles me. I have tried to leave BOINC to its own devices and find that sometimes several days elapse between a bunch of WUs finishing and the completed units disappearing out of my BOINC. When BOINC does finally decide it needs to get more work it generally seems to get lots, because the average work done is so far below the resource usage targets for everyone except CPDN. BOINC runs in feast and famine mode for me.
2) Message boards : Questions and problems : 7.0.25 Doesn't fetch work when cc_config excludes GPU's (Message 43952)
Posted 3 May 2012 by squeak
Post:
Interesting!

Where did you find 7.0.27? I appreciate that it's still an alpha release, but I'm presuming that if I try it and decide it's a bit buggy, I can still revert to 7.0.25 (is this still true?)
3) Message boards : Questions and problems : 7.0.25 Doesn't fetch work when cc_config excludes GPU's (Message 43942)
Posted 3 May 2012 by squeak
Post:
BOINC 7.0 does not work this way. It will only fetch work when the total work cache is below that of the minimum work buffer value and then it will only fetch work from the project with the highest priority (priority == 0, or close to it, like -0.05), and only if that project does not have work, will it fetch work from the next, and the next, etc.


OK, just to clarify, is the work cache you're talking about here an aggregate, so across all projects, or does this algorithm apply to each project individually?

If it's over all projects, then one project like CPDN with massive WUs will always have work well beyond the minimum work buffer value, and so BOINC would not be motivated to go fetch anything new from other projects. This is pretty consistent with the behaviour I'm observing.

If it's per individual project, then I have a dilemma, because I can't see that behaviour happening.
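
To make the two readings concrete, here is a minimal sketch of each. It is an illustration only and is not taken from the BOINC source: the function names, the 30-day CPDN queue and the share numbers are all assumptions chosen for the example.

```python
# Hypothetical illustration only: this is NOT BOINC's actual work-fetch code.
DAY = 86400.0  # seconds


def needs_fetch_aggregate(queued_secs, min_buffer_secs):
    """Aggregate reading: fetch only when the TOTAL queue is below the minimum."""
    return sum(queued_secs.values()) < min_buffer_secs


def needs_fetch_per_project(queued_secs, shares, min_buffer_secs):
    """Per-project reading: compare each project's queue with its share of the minimum."""
    total_share = sum(shares.values())
    return {
        name: queued_secs.get(name, 0.0) < min_buffer_secs * shares[name] / total_share
        for name in shares
    }


# Assumed numbers: one project holding weeks of work, the others empty.
queued = {"CPDN": 30 * DAY, "rosetta": 0.0, "seti": 0.0, "WCG": 0.0}
shares = {"rosetta": 36, "seti": 30, "CPDN": 18, "WCG": 16}
min_buffer = 3 * DAY

print(needs_fetch_aggregate(queued, min_buffer))
print(needs_fetch_per_project(queued, shares, min_buffer))
```

Under the aggregate reading the single large queue keeps the total above the minimum, so nothing is fetched. Under the per-project reading every empty project would qualify for a fetch.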
4) Message boards : BOINC client : massive work fetch bug in 7.0.25 (Message 43923)
Posted 2 May 2012 by squeak
Post:
Hi. My 7.0.25 has been installed for a while now. I previously managed to kick it into having work for all my projects, which worked fine for a while, but now all of the projects except CPDN have run down to zero WUs, and this has been the case for a couple of days. I've just been leaving it alone, on the theory that it'll sort itself out. It doesn't seem to be doing that.

Here is my cc_config.xml
------snip----------------
<cc_config>
<options>
<client_version_check_url>http://www.worldcommunitygrid.org/download.php?xml=1</client_version_check_url>
<client_download_url>http://www.worldcommunitygrid.org/download.php</client_download_url>
<network_test_url>http://www.ibm.com/</network_test_url>
<start_delay>120</start_delay>
</options>
<log_flags>
<cpu_sched_debug>1</cpu_sched_debug>
<work_fetch_debug>1</work_fetch_debug>
</log_flags>
</cc_config>
-------unsnip-----------
I have no idea why the worldcommunitygrid lines are in there (maybe someone can explain), but it might be the reason why my BOINC always has a WCG personality.

My project settings are as follows:

project      avg_work_done   resource_share
rosetta      71.85           36%
seti@home    81.09           30%
CPDN         98.79           18%
WCG          49.26           16%

Here is a piece of my log.

------snip-------------
3/05/2012 00:21:32 | | [cpu_sched_debug] Request CPU reschedule: periodic CPU scheduling
3/05/2012 00:21:32 | | [cpu_sched_debug] schedule_cpus(): start
3/05/2012 00:21:32 | climateprediction.net | [cpu_sched_debug] scheduling hadam3p_pnw_8l0f_2000_1_007823766_0 (CPU job, priority order) (prio -1.000000)
3/05/2012 00:21:32 | climateprediction.net | [cpu_sched_debug] scheduling hadam3p_pnw_yzms_1963_1_006910844_1 (CPU job, priority order) (prio -1.019052)
3/05/2012 00:21:32 | | [cpu_sched_debug] enforce_schedule(): start
3/05/2012 00:21:32 | | [cpu_sched_debug] preliminary job list:
3/05/2012 00:21:32 | climateprediction.net | [cpu_sched_debug] 0: hadam3p_pnw_8l0f_2000_1_007823766_0 (MD: no; UTS: no)
3/05/2012 00:21:32 | climateprediction.net | [cpu_sched_debug] 1: hadam3p_pnw_yzms_1963_1_006910844_1 (MD: no; UTS: no)
3/05/2012 00:21:32 | | [cpu_sched_debug] final job list:
3/05/2012 00:21:32 | climateprediction.net | [cpu_sched_debug] 0: hadam3p_pnw_8l0f_2000_1_007823766_0 (MD: no; UTS: no)
3/05/2012 00:21:32 | climateprediction.net | [cpu_sched_debug] 1: hadam3p_pnw_yzms_1963_1_006910844_1 (MD: no; UTS: no)
3/05/2012 00:21:32 | climateprediction.net | [cpu_sched_debug] scheduling hadam3p_pnw_8l0f_2000_1_007823766_0
3/05/2012 00:21:32 | climateprediction.net | [cpu_sched_debug] scheduling hadam3p_pnw_yzms_1963_1_006910844_1
3/05/2012 00:21:32 | climateprediction.net | [cpu_sched_debug] hadam3p_pnw_8l0f_2000_1_007823766_0 sched state 2 next 2 task state 1
3/05/2012 00:21:32 | climateprediction.net | [cpu_sched_debug] hadam3p_pnw_yzms_1963_1_006910844_1 sched state 2 next 2 task state 1
3/05/2012 00:21:32 | | [cpu_sched_debug] enforce_schedule: end
3/05/2012 00:21:35 | | [work_fetch] work fetch start
3/05/2012 00:21:35 | | [work_fetch] ------- start work fetch state -------
3/05/2012 00:21:35 | | [work_fetch] target work buffer: 259200.00 + 345600.00 sec
3/05/2012 00:21:35 | rosetta@home | [work_fetch] REC 70.998 priority -0.666025
3/05/2012 00:21:35 | climateprediction.net | [work_fetch] REC 89.165 priority -2.270503
3/05/2012 00:21:35 | SETI@home | [work_fetch] REC 67.264 priority -0.757205
3/05/2012 00:21:35 | World Community Grid | [work_fetch] REC 68.682 priority -1.449677
3/05/2012 00:21:35 | | [work_fetch] CPU: shortfall 236574.72 nidle 0.00 saturated 368225.28 busy 0.00
3/05/2012 00:21:35 | rosetta@home | [work_fetch] CPU: fetch share 0.360 rsc backoff (dt 0.00, inc 0.00)
3/05/2012 00:21:35 | climateprediction.net | [work_fetch] CPU: fetch share 0.180 rsc backoff (dt 0.00, inc 0.00)
3/05/2012 00:21:35 | SETI@home | [work_fetch] CPU: fetch share 0.300 rsc backoff (dt 0.00, inc 0.00)
3/05/2012 00:21:35 | World Community Grid | [work_fetch] CPU: fetch share 0.160 rsc backoff (dt 0.00, inc 0.00)
3/05/2012 00:21:35 | | [work_fetch] ------- end work fetch state -------
3/05/2012 00:21:35 | | [work_fetch] No project chosen for work fetch
-------unsnip---------

BOINC is showing no inclination to fetch any WUs, and has maintained this stance for some days now. Is this expected behaviour?
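
One hedged way to read the numbers in that log excerpt is to pull them out and convert them to days. The sketch below copies only the two relevant log lines from above. The interpretation of "saturated" as the time for which the CPUs already have work is an assumption, not something the log states.

```python
import re

# Two lines copied verbatim from the log excerpt above.
LOG = """\
3/05/2012 00:21:35 | | [work_fetch] target work buffer: 259200.00 + 345600.00 sec
3/05/2012 00:21:35 | | [work_fetch] CPU: shortfall 236574.72 nidle 0.00 saturated 368225.28 busy 0.00
"""

DAY = 86400.0  # seconds

buf = re.search(r"target work buffer: ([\d.]+) \+ ([\d.]+) sec", LOG)
cpu = re.search(r"shortfall ([\d.]+) nidle [\d.]+ saturated ([\d.]+)", LOG)

min_buf, extra_buf = (float(x) for x in buf.groups())
shortfall, saturated = (float(x) for x in cpu.groups())

print(f"minimum buffer   : {min_buf / DAY:.2f} days")
print(f"additional buffer: {extra_buf / DAY:.2f} days")
print(f"saturated        : {saturated / DAY:.2f} days")
print(f"shortfall        : {shortfall / DAY:.2f} days")

# Assumed reading: a fetch is only triggered when the saturated time
# drops below the minimum buffer.
print("saturated exceeds minimum buffer:", saturated > min_buf)
```

In this excerpt the shortfall and saturated figures add up to the full 7-day target (259200 + 345600 sec), and the saturated figure is above the 3-day minimum. If the trigger really is "saturated below minimum", that would be consistent with "No project chosen for work fetch", with the CPDN tasks alone holding the figure up.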
5) Message boards : BOINC client : massive work fetch bug in 7.0.25 (Message 43665)
Posted 22 Apr 2012 by squeak
Post:
Thank you, ageless, for the response. It goes some way towards clarifying things for me. Nonetheless, I wasn't quibbling about the use of debug flags; I totally agree that they should be nowhere near a user interface. I was only concerned about the suggestions to put stuff in cc_config.xml which were relevant at the user interface. Now maybe, bearing in mind the nature of this particular message board, the people receiving this advice were developers and testers, and so were accustomed to dealing at that level. If so, mea culpa.

Anyway, since you mentioned debug flags, and since you had suggested to someone that they put the flags
<log_flags>
<cpu_sched_debug>1</cpu_sched_debug>
<work_fetch_debug>1</work_fetch_debug>
</log_flags>
into cc_config.xml, I thought I'd try it and gather up the output.

Well, after 4 or 5 days of BOINC steadfastly refusing to fetch any work, putting in those debug flags caused it immediately to request work from all my other projects EXCEPT the one with the highest resource share setting, which in my case is rosetta. It generated a lot of output which I'm happy to send in if anyone is interested. I'm not sure whether an inline post is appropriate; what is the best way of sending a file?
6) Message boards : BOINC client : massive work fetch bug in 7.0.25 (Message 43617)
Posted 20 Apr 2012 by squeak
Post:
I've heard the suggestion that I was attacking ageless personally in my last post. So, firstly, let me apologise for causing offence. It was not intended as a personal attack on ageless or anyone else, but I was certainly concerned by some of the implications of the advice being provided.

The intent of the post was to point out what I believe were significant philosophical issues about user interface design and about the level of advice given to users as against developers or testers. Having managed teams of developers in the past, I understand well the temptation to resolve questions by tweaking internal settings or adjusting things which are not typically exposed to general users. However while these approaches are OK for developers and testers trying to identify the reasons for a tool's behaviour, they should not be mechanisms for users to control the behaviour of the tool. Their toolbox should be restricted to the user interface provided, which perhaps may need extension, but should not be bypassed.

Anyway, apologies again. I would like to see some response to the technical issues I've raised.

I'd also like to see my BOINC 7.0.25 ask my various projects for some work other than CPDN, which it is steadfastly refusing to do.
7) Message boards : Questions and problems : Why does World Community Grid take over BOINC? (Message 43616)
Posted 20 Apr 2012 by squeak
Post:
After adding the World Community Grid to my list of projects, I discovered that WCG is a pretty arrogant project. It retitled my BOINC, and despite doing a full uninstall and reinstall, and several upgrades, BOINC still always comes up as "World Community Grid - BOINC". What I don't know is whether there are other little trojans left behind by WCG which may be stuffing up my BOINC environment. Anyone know how I go back to "vanilla" BOINC?
8) Message boards : Questions and problems : BOINC 7 not getting tasks (Message 43615)
Posted 20 Apr 2012 by squeak
Post:
Huh? Or should I say bitte? Are you serious?

I guess I could spend the next week trying every possible combination of parameters, but generally someone testing something likes to have a particular hypothesis to prove or disprove, and testing computer systems generally involves boundary values or specific values based on an understanding of the behaviour of the subject. Also, generally you have something which appears to work but you are trying to break it.

In this case, BOINC is clearly not responding to what I've told it, so it is already broken, and there are no clues as to what changes should be applied to make it work. Typically in problem diagnosis, you look for the root cause, rather than randomly picking alternative inputs in the hope that one might work.
9) Message boards : Questions and problems : BOINC 7 not getting tasks (Message 43604)
Posted 19 Apr 2012 by squeak
Post:
BOINC 7 will have to learn all over again about the project's applications, their different run times, how different the estimates are from reality etc. This will take a week or more. Depends on how much your machine is on and when BOINC is allowed to do work on CPU


Yes, well that's all well and good. But if BOINC 7 is not loading WUs for any more than 1 project, how is it going to learn how the others behave? I've set BOINC to keep work for 3 days and an extra 4, but it has its own opinion of the need.
10) Message boards : BOINC client : massive work fetch bug in 7.0.25 (Message 43594)
Posted 19 Apr 2012 by squeak
Post:
I too have been frustrated by 7.0.25. Once I finally managed to get it installed, after having to locate the BOINC.msi file myself because the BOINC installer lost the reference to the folder (very sloppy), I found that things were different. It immediately decided that rosetta had to run in high priority mode. For some time I have been running with the settings "Maintain enough tasks to keep busy for at least 3 days" and "... and up to an additional 4 days", mainly to be able to ride through the times, which all projects seem to have, when there are no WUs available or the project is shut down for maintenance. At the time I put in 7.0.25, I had rosetta WUs amounting to about 8 hours, with deadlines about a week away, and my resource shares meant that Rosetta had a target resource share of 36%, so 8 hours of work in those conditions didn't seem to justify BOINC going into high priority mode. Anyway, I watched with interest. BOINC then went through my other projects one at a time clearing out all WUs and not asking for any more, until only CPDN was left. My CPDN WUs won't finish for weeks, and have deadlines well into 2013. I have tried updating the projects and resetting the projects, but to no avail.
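
The back-of-envelope arithmetic behind "didn't seem to justify high priority mode" can be sketched as follows. This is a deliberately simple model, not how BOINC itself decides on high priority: it assumes one CPU running around the clock and that rosetta receives exactly its 36% share until the deadline. The function name is made up for the example.

```python
HOURS_PER_DAY = 24.0


def cpu_hours_available(days_to_deadline, resource_share, n_cpus=1):
    """CPU hours a project could expect before the deadline under the simple model."""
    return days_to_deadline * HOURS_PER_DAY * resource_share * n_cpus


queued_hours = 8.0  # rosetta work on hand, from the post
available = cpu_hours_available(days_to_deadline=7, resource_share=0.36)

print(f"{available:.2f} CPU hours available vs {queued_hours:.1f} queued")
print("deadline pressure" if queued_hours > available else "no deadline pressure")
```

Under those assumptions rosetta could expect roughly 60 CPU hours before the deadline against 8 hours queued, so this simple model shows no deadline pressure.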

Now I see the comment from "ageless" that BOINC 7 "will not go and fetch work, or schedule which projects to run, as previous versions did". I am staggered, as I thought that this was the whole raison d'etre for BOINC. I then see comments about "In 6.12 and before, you'd set connect to interval to x.xx and additional work to x.xx" but "In 7.0 you set minimum work buffer to 1.0 and max additional work buffer to 0.01". Now the "connect to..." and "additional work" settings are part of the user interface, and indeed I have been setting them as described above. However, there are no parts of the user interface relating to work buffers.
I see "ageless" also referring to the cc_config.xml file. I did a search across my hard drive and it eventually found such a file buried in part of my /Documents and Settings tree. The file hadn't been touched for 2 years, and had no entries related to work buffer settings.

I am not a novice, having been a developer/designer/architect for 40 years. Mind you I have always figured that developers put settings into a user interface because the intention was that these were the things that users should play with. Internal config files in obscure directories were put there precisely because they contained stuff that users should NOT be playing with. Or is the dominant logic here that in order to make BOINC work reliably you needed to know as much as the BOINC developers?

The release notes for 7.0.25 say ...
"The new scheduler observes the resource share setting better than the old scheduler.

Another change is the client will no longer attempt to get work right after completing a job. Instead it will wait until it drops below a threshold and then start asking around for work. You can change both the lower threshold and upper threshold by changing these preference settings:

'Maintain enough tasks to keep busy for at least' (lower threshold) and
'... and up to an additional' (upper threshold)"

Now this contradicts ageless, as it suggests that the user interface settings are the way to control things. I have those thresholds set but BOINC is not honouring them. Mind you the release notes do not clarify if the amount of work relates to an overall figure adding up all projects, or is on a per project basis. If it is an overall thing, then one project like CPDN will always have more than enough work, so that nothing else gets a look in. On the other hand, if it is supposed to apply separately to each project (as clearly most respondents expect), then it's certainly not actually working that way.

Also, the idea that BOINC won't necessarily report completed WUs until something else happens seems a little strange. Some projects like to know when WUs are done, so they can cross them off. If there are a bunch of BOINCs out there hiding their finished WUs until a convenient moment, then life in the projects can slow down considerably. Have the various projects signed off on this slowdown of results?

Lastly, a (possibly) unrelated issue. After adding the World Community Grid to my list of projects, I discovered that WCG is a pretty arrogant project. It retitled my BOINC, and despite doing a full uninstall and reinstall, BOINC still always comes up as "World Community Grid - BOINC". What I don't know is whether there are other little trojans left behind by WCG which may be stuffing up my BOINC environment. Anyone know how I go back to "vanilla" BOINC?

I'll climb off my soapbox now. :)
11) Message boards : Questions and problems : Hoops and loops of installing 7.0.25 (Message 43593)
Posted 19 Apr 2012 by squeak
Post:
I also experienced this problem, had to do a search of my hard drive to find the right one. The problem is not that Windows forgets, as the installer gets told the right place, and is expected to remember it for later. All previous versions of BOINC (and half a billion other installers around the world) manage to do that. Sorry for being blunt. 7.0.25 has a few other glitches as can be found in other posts.

Graham
12) Message boards : Questions and problems : Regularly have trouble getting BOINC to request new WUs in projects (Message 39381)
Posted 30 Jul 2011 by squeak
Post:
OK, that's good. But my problem at the moment is not that the projects are unable to supply WUs, but rather that BOINC does not feel the need to ask for them. It hasn't asked for them for Rosetta or CPDN for at least a week.
13) Message boards : Questions and problems : Regularly have trouble getting BOINC to request new WUs in projects (Message 39373)
Posted 30 Jul 2011 by squeak
Post:
I run 6.12.33m on SETI@home, Rosetta, and ClimatePrediction. On the current and previous releases of BOINC, I have experienced an issue where one or two of the projects complete all downloaded WUs, but the third project still has plenty of WUs left, so BOINC will not request more for the projects without any downloaded. For example, at the moment I have run out of ClimatePrediction and Rosetta as of yesterday, but at that time I had a couple of S@H WUs left. BOINC did not request more WUs for Rosetta when it ran out, and likewise for CPDN. However, when BOINC only had S@H left, it requested more S@H WUs. I now have 300 hours of processing in the queue for S@H only. My settings specify connect every 3 days, keep extra 5 days. I do this because at regular intervals one or other project runs out of available WUs (sometimes more than one). My other settings are that my % shares are 50% rosetta, 25% S@H and 25% CPDN. My current average work done for the three (normalised) is 35% rosetta, 40% S@H and 25% CPDN. My Rosetta is well under the target, and S@H is well over. So why does BOINC feel the need for more S@H WUs but not rosetta? Deadlines are not cutting in, and nothing runs at high priority.
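
The mismatch between target share and work done can be put into numbers with a small sketch. It simply subtracts the normalised work done from the target share for each project. This is an illustration of the expectation described above and is not BOINC's scheduler logic. The function name is made up for the example.

```python
# Target resource shares and normalised average work done, in percent (from the post).
target = {"rosetta": 50, "S@H": 25, "CPDN": 25}
done = {"rosetta": 35, "S@H": 40, "CPDN": 25}


def share_deficit(target, done):
    """Positive value: project is below its target share; negative: above it."""
    return {name: target[name] - done[name] for name in target}


deficit = share_deficit(target, done)
most_owed = max(deficit, key=deficit.get)

print(deficit)
print("project furthest below target:", most_owed)
```

On these figures rosetta is 15 points under target and S@H is 15 points over, which is why a fetch for S@H rather than rosetta looks backwards.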



