Problem with the work buffer

Message boards : BOINC client : Problem with the work buffer

KAMasud

Joined: 13 Feb 07
Posts: 21
Pakistan
Message 8455 - Posted: 27 Feb 2007, 14:18:27 UTC


:-) I think I will agree with Luca that BOINC is in control and forcing our hands, which is no fun. :-( I used to get 60 WUs of Malaria before the upgrade; now I only get 6 WUs. The same is true for Tanpaku. Now, I don't know much about DCF etc., but I do know that I owe a lot of time to LHC, and that's also not in my hands. I will follow this thread to see what the outcome is, or downgrade to my previous version of BOINC. :-) These are my machines after all, so I would like to be in control. ;-)
Regards
Masud.
ID: 8455
MikeMarsUK

Joined: 16 Apr 06
Posts: 386
United Kingdom
Message 8463 - Posted: 1 Mar 2007, 8:19:24 UTC
Last modified: 1 Mar 2007, 8:34:16 UTC

I did notice a bug with work issue - it uses resource share to see how much CPU time is available for a given project, but fails to ignore projects which are suspended.

My example (dual-core PC) was:

CPDN beta - 20%
SAP - 20% (14 CPU weeks of work in queue)
CPDN - 20%, no more work, suspended (since the beta started, will remain suspended until the beta finishes, 3 CPU months of suspended work with 12 months deadline)
BBC - 20%, no more work, suspended (since the beta started, will remain suspended until the beta finishes, 3 CPU months of suspended work with 9 months deadline)
Rosetta - 20%, no more work, suspended (since December '06, will remain suspended for the foreseeable future, no suspended work)

The CPDN beta server used 20%, not 50%, as its guideline when issuing work.
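
The 20% versus 50% difference comes down to whether suspended projects are counted when resource shares are normalised. A minimal sketch of that arithmetic, using the shares listed above; the function name `share_fractions` and its `skip_suspended` flag are hypothetical and are not taken from the BOINC source:

```python
# Hypothetical illustration only; this is not BOINC client or server code.

def share_fractions(shares, suspended=(), skip_suspended=True):
    """Return each counted project's fraction of CPU time from its resource share."""
    counted = {
        name: share
        for name, share in shares.items()
        if not (skip_suspended and name in suspended)
    }
    total = sum(counted.values())
    return {name: share / total for name, share in counted.items()}


shares = {"CPDN beta": 20, "SAP": 20, "CPDN": 20, "BBC": 20, "Rosetta": 20}
suspended = {"CPDN", "BBC", "Rosetta"}

# Suspended projects still counted: the behaviour reported in this post.
observed = share_fractions(shares, suspended, skip_suspended=False)
# Suspended projects ignored: the behaviour the post argues for.
expected = share_fractions(shares, suspended, skip_suspended=True)

print(observed["CPDN beta"])  # 0.2
print(expected["CPDN beta"])  # 0.5
```

With only two of the five projects active, ignoring the suspended ones leaves the beta competing with SAP alone, hence 50%.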

I seem to recall it was 5.9 on the backend servers, but I don't know what the point-release was.

2007-02-27 15:07:55 [---] [scrsave_debug] ACTIVE_TASK::check_graphics_mode_ack(): got graphics ack <mode_hide_graphics/> for hadcm3transsw_ccd5_1920_40_10000788_0, previous mode <mode_unsupported/>
2007-02-27 15:08:04 [---] [sched_op_debug] SCHEDULER_OP::init_op_project(): starting op for http://climateapps1.oucs.ox.ac.uk/beta/
2007-02-27 15:08:04 [Climateprediction.net Beta] Sending scheduler request: To fetch work
2007-02-27 15:08:04 [Climateprediction.net Beta] Requesting 4874 seconds of new work, and reporting 1 completed tasks
2007-02-27 15:08:09 [Climateprediction.net Beta] Scheduler RPC succeeded [server version 509]
2007-02-27 15:08:09 [Climateprediction.net Beta] Message from server: No work sent
2007-02-27 15:08:09 [Climateprediction.net Beta] Message from server: (won't finish in time) Computer on 99.9% of time, BOINC on 91.1% of that, this project gets 20.0% of that
2007-02-27 15:08:09 [Climateprediction.net Beta] Project requested delay of 7.000000 seconds
2007-02-27 15:08:09 [---] [sched_op_debug] handle_scheduler_reply(): got ack for result hadcm3transstd_ccn6_10000609_0
2007-02-27 15:08:09 [Climateprediction.net Beta] Deferring communication for 7 sec
2007-02-27 15:08:09 [Climateprediction.net Beta] Reason: requested by project
2007-02-27 15:08:09 [Climateprediction.net Beta] Deferring communication for 1 min 0 sec
2007-02-27 15:08:09 [Climateprediction.net Beta] Reason: no work from project
2007-02-27 15:33:31 [---] [sched_op_debug] SCHEDULER_OP::init_op_project(): starting op for http://attribution.cpdn.org/


As it happened, I didn't actually want a second beta downloaded at that time, so it wasn't a problem for me.

There are two problems I feel - firstly the use of 20% rather than 50%, and secondly the message should really also mention the amount of work already on the computer, since that's also used in the calculation.
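
To show why both points matter, here is a rough sketch of a "won't finish in time" style check. The three percentages come from the server message in the log above; everything else (the function names, the formula, and the 30-day task with a 100-day deadline) is an assumption made up for illustration, not the actual scheduler logic:

```python
# Hypothetical sketch; the real BOINC scheduler check may differ.

def effective_cpu_fraction(on_frac, active_frac, share_frac):
    """Fraction of wall-clock time that one project actually gets."""
    return on_frac * active_frac * share_frac


def finishes_in_time(new_cpu_days, queued_cpu_days, deadline_days, cpu_fraction):
    """True if queued plus new work would complete before the deadline."""
    wall_days = (queued_cpu_days + new_cpu_days) / cpu_fraction
    return wall_days <= deadline_days


# 99.9% on, BOINC running 91.1% of that, then the project's share.
at_20 = effective_cpu_fraction(0.999, 0.911, 0.20)
at_50 = effective_cpu_fraction(0.999, 0.911, 0.50)
print(f"{at_20:.3f} vs {at_50:.3f}")  # 0.182 vs 0.455

# Made-up task: 30 CPU days of work, 100-day deadline.
print(finishes_in_time(30, 0, 100, at_20))   # False
print(finishes_in_time(30, 0, 100, at_50))   # True
print(finishes_in_time(30, 20, 100, at_50))  # False: queued work tips it over
```

The last line is the reason the message should mention work already on the computer: the same task can pass or fail depending on the queue.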
ID: 8463
MikeMarsUK

Joined: 16 Apr 06
Posts: 386
United Kingdom
Message 8468 - Posted: 1 Mar 2007, 12:01:35 UTC


Which bug are you referring to, the one from the original post in the thread or my post? Work issue is server-side not client-side ... (although presumably requires information supplied from the client)?
ID: 8468
River~~

Joined: 12 Mar 07
Posts: 59
Message 8871 - Posted: 19 Mar 2007, 8:11:16 UTC - in response to Message 8107.  

OK, the primary goal is to not report work late. ...


This is good, and I like the way the new scheduler is working.

However, if the *primary* goal is not to report work late, can we *please* reinstate the option to report work as soon as complete?

All the logic you describe is based on the assumption that we know when the connections are going to be. In my case I have a very unreliable connection, and no control over the timing of the unreliability. It is irritating to see work sitting around for ages once complete, and it is infuriating when we have a net outage to see work going past the deadline that could have been reported before the outage.

It should only be an option - for most users don't have this problem. But in my (biased) opinion it should be an available option.

Secondly, when a project is in deadline trouble, as determined in the way you describe, does the client still wait for the next connection slot to report a finished WU? Or does it then report immediately (assuming the network is available)?

This is an issue because in my situation I'd like to be able to set a 3 day connect interval to mean that the box can survive a three day net outage at any time; and that means reporting work as it completes and topping up every time some work completes.

To bang an old drum here for a moment - the issue is really that the connect every interval is the difference between hi-tide and lo-tide levels of work queued locally. The level of lo-tide is a different number in principle, and in my continued opinion really needs to be given a separate identity in the prefs.
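
A minimal sketch of what separate hi-tide and lo-tide settings could look like. This is a hypothetical preference model, not an existing BOINC option; the names `low_tide_days` and `connect_interval_days` are invented for illustration:

```python
# Hypothetical two-level work buffer; not an existing BOINC preference.

def work_request_days(buffered_days, low_tide_days, connect_interval_days):
    """Days of work to request; 0.0 means no fetch is needed yet."""
    if buffered_days > low_tide_days:
        return 0.0
    # Hi-tide is lo-tide plus the connect interval.
    high_tide_days = low_tide_days + connect_interval_days
    return high_tide_days - buffered_days


# Survive a 3-day outage at any time, topping up in 1-day steps.
print(work_request_days(3.5, low_tide_days=3, connect_interval_days=1))  # 0.0
print(work_request_days(2.5, low_tide_days=3, connect_interval_days=1))  # 1.5
```

With a single "connect every" setting, lo-tide is effectively fixed at zero, which is the limitation being described.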

However, as you have explained to me before, that is unlikely to be politically possible.... in lieu of that, a good second best would be the return of the option to report (& top up) on every completed upload.

Reporting work as soon as it completes is part of making sure it does not miss a deadline if the network is unpredictable.

River~~
ID: 8871
Metod, S56RKO

Joined: 9 Sep 05
Posts: 128
Slovenia
Message 8876 - Posted: 19 Mar 2007, 10:01:40 UTC - in response to Message 8871.  

OK, the primary goal is to not report work late. ...


This is good, and I like the way the new scheduler is working.

However, if the *primary* goal is not to report work late, can we *please* reinstate the option to report work as soon as complete?


My experience is that BOINC CC will connect to the project server for work fetch as soon as the work cache drops below the connect every setting. If there's work to be reported, it will get reported.

The problematic part is: when does BOINC CC connect to the project? Some projects assign more work than requested. OK, to a certain extent this is understandable, as available work is quantised and there will always be a little too much work assigned. The problem is that some projects assign much more work than requested. The discrepancy is proportional to the amount of work requested.

In your particular case: if you drain your cache and request 3 days' worth of work, ideally you would get 3 days and a couple of hours of work. In this case BOINC CC would next try to connect in a couple of hours, when the surplus work would be done. However, you might end up with 5 days of work. This would mean that BOINC CC wouldn't try to connect to the project server for 2 days.
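
The arithmetic in that example, as a small sketch. It assumes the cache drains at one hour of work per hour of wall time and that a fetch happens as soon as the cache falls to the connect every setting; the function name is made up for illustration:

```python
# Hypothetical helper; assumes work drains at one hour per wall-clock hour.

def hours_until_next_fetch(buffered_hours, connect_every_hours):
    """Wall-clock hours before the cache drops to the connect every level."""
    return max(0, buffered_hours - connect_every_hours)


connect_every = 3 * 24  # 3-day setting

print(hours_until_next_fetch(3 * 24 + 2, connect_every))  # 2  (ideal case)
print(hours_until_next_fetch(5 * 24, connect_every))      # 48 (over-assigned)
```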

When BOINC CC tries to connect to the project server and fails, it backs off for a random time period, which increases with every failed connection. It does not wait for another connect every time period ...
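
A generic sketch of that kind of randomised, growing back-off. The base delay, the cap, and the doubling rule are assumptions chosen for illustration; the actual values and formula used by BOINC CC are not given in this thread:

```python
import random

# Hypothetical constants and formula; not taken from the BOINC client.

def backoff_delay(failures, rng, base_secs=60.0, cap_secs=4 * 3600.0):
    """Random delay whose upper bound doubles with each failed attempt."""
    upper = min(cap_secs, base_secs * 2 ** failures)
    return rng.uniform(base_secs, upper)


rng = random.Random(0)  # seeded only so the example is repeatable
delays = [backoff_delay(n, rng) for n in range(1, 6)]
for n, delay in enumerate(delays, start=1):
    print(f"failure {n}: retry in {delay:.0f} s")
```

The point is only that the retry time depends on the failure count, not on the connect every setting.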
Metod ...
ID: 8876

Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.