Message boards : BOINC client : Scheduling Needs Project Out-of-work Flag Kept in Client
Author | Message |
---|---|
Joined: 10 Jan 06 Posts: 6 |
Scheduling Needs Project "Out of Work" Flag Kept in Client. Machine environment: - Manual dial-up, three projects, two of them out of work - Network Activity flag toggled manually to control when the network is available. One of the stated objectives of the scheduler is to manage multiple projects so that if one project is out of work, the machine does not sit idle but processes work for the other projects. In my experience that is not working. I observed that scheduling would not load the queue with enough work for my allocation duration of three days. Instead it would load less than one third of that amount (the projects did not have equal shares) from the one project that had work, even though the other two projects had tried to get work and had received a negative response from their servers during my connection. The result was that my machine sat idle for long stretches unless I dialed in several times a day. I finally got so ticked off that I detached the two idle projects from my machine so that I could get a full load of work. (I thought of tripling my work duration, but that would overload the queue on single-project machines.) Now someone is going to say that the two projects that were out of work had a negative work debt (my words) and that is why it wouldn't fill the work queue from the only project that did have work... And that's fine for a machine with a dedicated internet connection that can get work anytime. But for a machine that connects for a short time at long intervals, that strategy doesn't fulfill your continuous-work objective (i.e. the implementation needs improvement). Scheduling needs to divide the work queue allotment among the projects that have work available for the allocation period.
In my example above, it needed to fill my work queue for the full three days with work from the project that had work, not for just one third of that; it failed to take into account that the other two projects had no work to give at the time I asked. When I detached from the empty projects, scheduling loaded my queue with the correct amount. But I shouldn't have to do that if the design objectives were met. It also means that I have to watch those two projects' web sites to see when work is available and then attach to the projects again (not good for statistics either). Ned |
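The allocation rule requested in the post above can be sketched roughly as follows. This is only an illustration of the proposal, not the BOINC client's work-fetch code: the function name, the project names, the `has_work` flags, and the use of plain resource-share weights are all assumptions made for the example.

```python
SECONDS_PER_DAY = 86400


def split_work_queue(queue_seconds, resource_shares, has_work):
    """Divide the queue among projects that currently have work.

    queue_seconds   -- total work to fetch, in seconds (hypothetical unit)
    resource_shares -- dict: project name -> resource share weight
    has_work        -- dict: project name -> True if its server offered work
    Projects without work get nothing, and their share is redistributed
    to the remaining projects in proportion to their weights.
    """
    eligible = {
        name: share
        for name, share in resource_shares.items()
        if has_work.get(name, False) and share > 0
    }
    total_share = sum(eligible.values())
    if total_share == 0:
        # No project has work, so nothing can be fetched.
        return {name: 0.0 for name in resource_shares}
    return {
        name: queue_seconds * eligible[name] / total_share if name in eligible else 0.0
        for name in resource_shares
    }


# Three equally weighted hypothetical projects, a three-day queue,
# and only project "A" has work to give.
shares = {"A": 100, "B": 100, "C": 100}
allocation = split_work_queue(
    3 * SECONDS_PER_DAY, shares, {"A": True, "B": False, "C": False}
)
print(allocation["A"] / SECONDS_PER_DAY)  # 3.0 days, not 1.0
```

With all three projects reporting work, the same call would give each project one day, which is the behaviour the post describes seeing even when two projects were empty.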
Joined: 30 Aug 05 Posts: 297 |
Instead of detaching, the normal approach is to suspend any projects that don't have work. That accomplishes the same thing, but it's easier to hit "resume" than it is to reattach, and you can resume every third or fourth time you connect, just to see if there is work. The problem with a project being "out of work" is that BOINC doesn't know when it's suddenly going to _have_ work - that could be five minutes from now. If there were a way to tell when someone is on dialup, it would be easier to deal with this, but all attempts to get a "dialup yes/no" setting, or separate settings for connect-every and cache, have been refused so far. Which version of BOINC are you running? There are complaints about current versions (5.2.13) _not_ following resource shares during work fetch, and getting too _much_ work. |
Joined: 10 Jan 06 Posts: 6 |
Bill, I am running 5.2.13. As for dialup, check the general preferences for the maximum upload/download rate, and if it's less than 4.8 KB/s, assume the connection is dialup. Don't worry about a project suddenly getting work. If that should happen, have the distribution recalculated. The worst that could happen is that you have an overload during an allocation period, which is short compared to when the WU needs to be processed. It has been my experience that if a project doesn't have work, that lasts for a long time, not 5 minutes or so... I'll use the "Suspend" command next time, but it's still a manual process and you need to do the monitoring. Ned |
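The dial-up heuristic suggested in the post above could look something like this. It is a sketch only: the function and constant names are hypothetical, representing an unset limit as `None` is an assumption, and the check infers nothing at all when the user has not set a limit.

```python
# Threshold proposed in the post above, in KB/s.
DIALUP_THRESHOLD_KBPS = 4.8


def looks_like_dialup(max_upload_kbps, max_download_kbps):
    """Guess whether a host is on dial-up from its user-set rate limits.

    Each argument is the configured limit in KB/s, or None when no limit
    has been set (an assumption of this sketch). The guess is True only
    when at least one configured limit is below the threshold.
    """
    limits = [
        rate for rate in (max_upload_kbps, max_download_kbps) if rate is not None
    ]
    if not limits:
        # No limits configured, so there is nothing to infer from.
        return False
    return min(limits) < DIALUP_THRESHOLD_KBPS


print(looks_like_dialup(4.0, 4.0))      # True
print(looks_like_dialup(None, None))    # False
print(looks_like_dialup(100.0, 500.0))  # False
```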
Joined: 30 Aug 05 Posts: 297 |
"It has been my experience that if a project doesn't have work, that lasts for a long time, not 5 minutes or so..." My experience is exactly the opposite - Predictor would have no work for my platform, then on the next connection it would. Rosetta gets into some overload condition where it will say "no work", but retrying gets work. SETI has been running "on the edge" between having work and not, with people grabbing it as fast as they can generate it. I've never attached to LHC; they have no Mac client. But to your specific problem: if you have three projects, and there is no work ON YOUR HOST for two of them, then regardless of any "out of work" signals from those two projects, the Client should download your full cache setting's worth of work for the third project. If it is not doing that, then it may be a bug, rather than a need for a new feature. Work fetch is controlled by long term debt (LTD). LTD is not supposed to be affected while a project is out of work, but it does slowly drift. This is why the standard recommendation is to suspend such projects, as that completely removes them from the LTD calculation, or even to occasionally "reset" them (which resets their debt to zero). If you let a project be active for a long period while it has no work, the debt for that project will climb, which (because the sum of all LTDs is zero) will affect the LTD of your _active_ projects, and therefore affect work fetch. Did you note the LTDs for the projects before detaching? If they were the cause, that's a _known_ bug with known workarounds, as mentioned. If not, we could have tried to identify the problem and see if it was something new. There is no currently available _reliable_ way to detect a dialup user. The upload/download limits are not _required_ to be set; I would bet 99% of dialup users say "no limit". |
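To make the debt argument above concrete, here is a deliberately simplified model of long term debt. It is not the client's actual accounting: the function name, the per-interval update rule, and the project names are assumptions chosen only to show two properties described in the post, namely that the debts of non-suspended projects sum to zero and that suspending a project takes it out of the calculation.

```python
def update_long_term_debt(debts, shares, cpu_time_used, suspended, interval):
    """Return new debts after `interval` seconds (simplified, hypothetical model).

    Each non-suspended project is owed interval * share / total_share seconds
    and is charged for the CPU time it actually received. The debts of the
    non-suspended projects are then shifted so that they sum to zero.
    Suspended projects keep their old debt and take no part in the update.
    """
    active = [name for name in debts if name not in suspended]
    new_debts = dict(debts)
    if not active:
        return new_debts
    total_share = sum(shares[name] for name in active)
    for name in active:
        owed = interval * shares[name] / total_share
        new_debts[name] = debts[name] + owed - cpu_time_used.get(name, 0.0)
    # Normalize so the active debts sum to zero.
    mean = sum(new_debts[name] for name in active) / len(active)
    for name in active:
        new_debts[name] -= mean
    return new_debts


debts = {"A": 0.0, "B": 0.0, "C": 0.0}
shares = {"A": 1.0, "B": 1.0, "C": 1.0}
used = {"A": 3600.0}  # only project A had work and ran for the whole hour

# B and C left active while empty: their debt climbs and A's goes negative.
print(update_long_term_debt(debts, shares, used, suspended=set(), interval=3600.0))
# {'A': -2400.0, 'B': 1200.0, 'C': 1200.0}

# B and C suspended: A's debt stays at zero.
print(update_long_term_debt(debts, shares, used, suspended={"B", "C"}, interval=3600.0))
# {'A': 0.0, 'B': 0.0, 'C': 0.0}
```

In this model a negative debt for A is what would hold back work fetch for the only project that has work, which matches the symptom reported at the top of the thread.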
Joined: 29 Aug 05 Posts: 225 |
Basing dial-up connection on upload/download speed is not accurate either. I could simply have a heavily loaded "always on" connection. |
Joined: 10 Jan 06 Posts: 6 |
Bill: "There is no currently available _reliable_ way to detect a dialup user. The upload/download limits are not _required_ to be set; I would bet 99% of dialup users say "no limit"." -- I use Einstein as my "home" project... I've set the upload/download limits a couple of times, and they stay put, so their settings should be an accurate indicator of dialup, for me at least. There must be something that indicates that they have been set manually, so that the application doesn't overwrite them with measured rate data. ------ Paul: "Basing dial-up connection on upload/download speed is not accurate either. I could simply have a heavily loaded "always on" connection." -- Once set manually, it doesn't seem to move... Just set it higher for a DSL connection. ------ Bill: "But to your specific problem; if you have three projects, and there is no work ON YOUR HOST for two of them, then regardless of any "out of work" signals from those two projects, the Client should download your full cache setting worth of work for the third project. If it is not doing that, then it may be a bug, rather than a need for a new feature." -- That indeed was the case. So I guess I'm reporting a bug. Too bad I've blown away the data required to debug the problem. ------ Bill: "This is why the standard recommendation is to suspend such projects, as that completely removes them from the LTD calculation, or even to occasionally "reset" them (which resets their debt to zero)." -- If I had known this, it would have saved me a lot of trouble... I suppose there is a source for all these "recommendations" that I haven't discovered yet. ------ Thanks for your responses... Ned |
Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License,
Version 1.2 or any later version published by the Free Software Foundation.