Estimated remaining time vs. percentage of processor use

Strassenschild

Joined: 25 Aug 14
Posts: 4
Germany
Message 55576 - Posted: 25 Aug 2014, 9:04:32 UTC

Hullo,

I've been running Boinc in one capacity or another for about a decade now, but over the last two years the way the software behaves has apparently changed. Admittedly, the way I use Boinc changed as well:

At the time I switched to a Mac laptop (not up for discussion). Depending on when and where I use it, I would let Boinc run on all processors at 100% of processor time, but… when I do that the heat generation is so high that the fan speed maxes out (6200rpm), even with a laptop cooler shunting more heat away.

Since I don't want to degrade my processor, battery and general hardware unduly (and cook my nether regions when I use it atop my lap), I changed the use of processor time to 20–30%, a range in which the fans run at idle (2000rpm) to manageable (3000rpm).

My issue with this is: the time remaining is always wrong. As in: it tells me it needs another hour for the remaining 30%, when it needed 7 hours for the first 70% of a WU. And since Boinc consistently miscalculates the time needed to complete a WU, it regularly starts WUs late. The one I described? Initial estimate: 2 hours. Real time: 9 hours. Started: 5 hours before deadline. And I'm talking about computing time, not real time, which is 4–5 times higher since I run the processors at a reduced percentage.
I think you can see the problem? If I don't check every other day or so whether there are WUs due in the next few days, it is entirely likely Boinc will work the unit but send the result days after the deadline, since its priority calculation (probably partly based on estimated time) doesn't flag the WU due tomorrow as high priority, since, hey, estimated time is an hour!
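
To put rough numbers on that gap, here is a minimal sketch. It assumes the "use at most X% of CPU time" setting simply stretches wall-clock time by a factor of 1/X; the function name is made up for illustration, and this is not BOINC's own estimator.

```python
def wall_clock_hours(cpu_hours, cpu_fraction):
    """Rough wall-clock time for a task needing `cpu_hours` of computing
    time when the client may only use `cpu_fraction` of the CPU time.

    Assumption: throttling scales run time linearly (hypothetical model).
    """
    if not 0 < cpu_fraction <= 1:
        raise ValueError("cpu_fraction must be in (0, 1]")
    return cpu_hours / cpu_fraction


# The WU from the post: estimated 2 hours, actually 9 hours of computing time.
estimated = wall_clock_hours(2, 0.25)
actual = wall_clock_hours(9, 0.25)
print(f"estimate: {estimated} h on the wall clock, actual: {actual} h")
```

Under that assumption, 9 hours of computing time at 25% comes to 36 hours on the wall clock, far more than the 5 hours that were left before the deadline.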

So I'd like to know if there's some workaround available or something. Like, how do I reset the algorithm calculating estimated time remaining? That might help. Maybe.

Yours,
Schild
ID: 55576
noderaser

Joined: 2 Jan 14
Posts: 276
United States
Message 55584 - Posted: 26 Aug 2014, 4:01:03 UTC
Last modified: 26 Aug 2014, 4:01:28 UTC

As a note about the processor usage versus heat, I have found that lowering the "Use at most XX CPU time" setting to even just 85 or 90% can make a big difference in the heat and fan speed on my 2010 MacBook Pro, but does not have a huge impact on the computing.

Regarding your questions about the estimated CPU time, that really depends on the project and its apps. The best bet would be to ask at the appropriate project's forum, unless this is consistent behavior across all projects.
ID: 55584
SuperSluether

Joined: 6 Jul 14
Posts: 94
United States
Message 55637 - Posted: 27 Aug 2014, 0:46:09 UTC - in response to Message 55576.  

Try limiting the CPU time (or even CPU cores if multiprocessor) to 80%. It may not seem like much, but it does make a difference in heat. The estimated time is just that: estimated. Most likely it is estimated based on specific hardware, then re-estimated for your specific hardware. This is not always accurate, and changing the CPU time to 20-30% can also affect it. Somewhere on these forums, somebody said that BOINC needs a few days (up to a week) to manage tasks effectively. Maybe it just needs time to consider your setup.
ID: 55637
Strassenschild

Joined: 25 Aug 14
Posts: 4
Germany
Message 55664 - Posted: 28 Aug 2014, 11:02:48 UTC
Last modified: 28 Aug 2014, 11:43:06 UTC

I get all these arguments.

I will also wait a few weeks before raising a fuss again, especially since I uninstalled and reinstalled Boinc, so theoretically the calculating algorithm should be reset.
Pointing at my progress report, another issue is the proximity between deadline and actual calculation time. It seems to me the WUs will expire in or close to the time frame needed for calculation, making the delivery of WUs pretty much just-in-time. There's no buffer for delay: what if I don't use my computer for a day or two? This will almost certainly make WUs pass their deadline.

As a progress report:

Remaining (Estimated) at the beginning: 3:50 hours.
Running time after 50%: 3:30 hours.
Actual running time: 14 hours.
WU in queue: 13 + 4 currently in computation.
Guess at time frame for queue completion: 120 continuous hours, or 12-14 realistic days.
Buffer settings: Min .1 days, Max .5 days.

At a guesstimate the actual time per WU is at 30 hours, or three to four days, with four WUs in parallel. The closest deadline is in 5 days. The latest is 10.

So you see my problem? The automated WU queue does not take into consideration the percentage of processor time the program may use. If I used 100%, everything would be within parameters and the observed deviations would be acceptable, but at lesser capacity both algorithms seem to be detrimental, increasing the number of late WUs.

Maybe I ought to abandon World Community Grid and settle for a project that has longer deadlines.

PS:
At 80% processor time the CPU temperature is at ~85° C compared to >90° C at 100%. The fans will run at ~6000 rpm instead of 6200. Since I consider >3000 rpm as too loud (it rises from background to consciously ignored) the fan speed is more of a benchmark for me than the temperature.
Incidentally, at 25% the die temp is ~70° C with <3000 rpm.
ID: 55664
Dave
Help desk expert

Joined: 28 Jun 10
Posts: 2540
United Kingdom
Message 55665 - Posted: 28 Aug 2014, 11:14:33 UTC - in response to Message 55576.  

Interesting. I run only CPDN most of the time, which still accepts work after the deadlines; they are only there for that project because BOINC insists on them. I currently have a task 72% completed after 41 hours elapsed time, with an estimated 80 hours remaining! I wonder, assuming your machine has more than one processor on the chip, whether it is worth restricting BOINC to just one processor? This will probably make it run cool enough but still allow things to complete in a reasonable length of time.

I find on my machine that total throughput of work when I only run one processor out of two is not all that much less than running both. I had wondered if it was because of CPU throttling reducing the speed, but having run a programme that tells me the speeds of both cores, that seems not to be the case. It may be that with the project I run, even 2GB/core isn't enough when shared with video?

Probably not the whole answer but maybe some ideas to think about.
ID: 55665
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5082
United Kingdom
Message 55666 - Posted: 28 Aug 2014, 13:24:12 UTC - in response to Message 55664.  

I get all these arguments.

I will also wait a few weeks before raising a fuss again, especially since I uninstalled and reinstalled Boinc, so theoretically the calculating algorithm should be reset.

Just two quick observations that you may wish to feed into that theoretical assumption.

1) Uninstalling the BOINC programs doesn't remove the user data that is used to drive the calculating algorithm.

2) Even if you take the additional (manual) step of removing the user data from the machine, much of the runtime estimation (depending on project) nowadays relies on data held on the project server. You won't have deleted that.
ID: 55666
Strassenschild

Joined: 25 Aug 14
Posts: 4
Germany
Message 55770 - Posted: 2 Sep 2014, 9:46:07 UTC - in response to Message 55665.  

Dave wrote:
Interesting. I run only CPDN most of the time, which still accepts work after the deadlines; they are only there for that project because BOINC insists on them. I currently have a task 72% completed after 41 hours elapsed time, with an estimated 80 hours remaining! I wonder, assuming your machine has more than one processor on the chip, whether it is worth restricting BOINC to just one processor? This will probably make it run cool enough but still allow things to complete in a reasonable length of time.


I assume CPDN is ClimatePredictionDotNet? WCG, as I understand it, deals with the deadlines depending on the project. Some are more lenient than others, with one project taking late results into consideration but giving no credit for them, while others don't even do that. On the other side of the spectrum would be CPDN, which doesn't seem to care about deadlines much.

Dave wrote:
I find on my machine that total throughput of work when I only run one processor out of two is not all that much less than running both. I had wondered if it was because of CPU throttling reducing the speed, but having run a programme that tells me the speeds of both cores, that seems not to be the case. It may be that with the project I run, even 2GB/core isn't enough when shared with video?


I have tried running Boinc on fewer cores at higher intensity. The heat generation is reduced, but not to an appreciable degree: running only 1 of 4 cores, but at 100%, will result in fan speeds exceeding 5000 rpm and CPU temperatures around 90°C.

Fiddling with the preferences a bit let me increase the intensity to 30% while still adhering to the conditions I set earlier (fan <3000 rpm) without a laptop cooler. With an LC at minimum speed I can raise the intensity to 40%, and the LC at maximum speed allows for 50% intensity. Sadly, the sound level of the LC at max is in the annoying range, so a no-go.

Your point about the memory per core is quite interesting. For the foreseeable future I have only 1GB/core and no dedicated graphics card. Once the current queue is done I'll try out variations in number of cores assigned to Boinc.

Another possible solution might be a staggered approach to multiple cores. At the moment a check on usage shows that all cores are in use at the same time (the four processes appear and disappear as main CPU hoggers at the same time). Consequently they use the very same memory and cache. I wonder if a staggered approach would be more beneficial.

To give an example (4 Cores at 30% intensity):
T_0: Core 1 starts
T_0.25: Core 2 starts
T_0.3: Core 1 stops
T_0.5: Core 3 starts
T_0.55: Core 2 stops
T_0.75: Core 4 starts
T_0.8: Core 3 stops
T_1: Core 1 starts
T_1.05: Core 4 stops

That way every core would have .2 seconds of the memory to itself, and would only need to share .05 seconds at each end of its slice with one other core, as compared to the current .3 seconds it has to share with three other cores.
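
The overlap figures for that timetable can be checked with a small sketch. It assumes a repeating 1-second period divided into hundredths, with each core running for 30 of them; the constants and function names are invented for the illustration and do not correspond to any BOINC setting.

```python
PERIOD = 100                # one scheduling period, in hundredths of a second
DUTY = 30                   # each core runs for 30% of the period
STAGGERED = [0, 25, 50, 75] # start offsets of cores 1..4 from the timetable
SIMULTANEOUS = [0, 0, 0, 0] # all cores start together (current behaviour)


def active_cores(tick, offsets):
    """Indices of the cores that are running during the given tick."""
    return [i for i, off in enumerate(offsets) if (tick - off) % PERIOD < DUTY]


def alone_and_shared(offsets):
    """Per core: ticks spent running alone, and ticks spent running
    at the same time as at least one other core."""
    alone = [0] * len(offsets)
    shared = [0] * len(offsets)
    for tick in range(PERIOD):
        running = active_cores(tick, offsets)
        for core in running:
            if len(running) == 1:
                alone[core] += 1
            else:
                shared[core] += 1
    return alone, shared


print("staggered:   ", alone_and_shared(STAGGERED))
print("simultaneous:", alone_and_shared(SIMULTANEOUS))
```

In the staggered case each core gets 20 hundredths alone and 10 shared (5 with the core before it, 5 with the core after it); in the simultaneous case all 30 hundredths are shared.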

And thank you, Richard, for your input; I didn't know that.
ID: 55770
Strassenschild

Joined: 25 Aug 14
Posts: 4
Germany
Message 55893 - Posted: 8 Sep 2014, 13:21:17 UTC - in response to Message 55770.  

Strassenschild wrote:
Once the current queue is done I'll try out variations in number of cores assigned to Boinc.


I tried some variations, but none increased the net use I get out of my processor in the current configuration (all 4 cores at 30% = 120% of a maximum 400%). One core at 50% increased the temp / fan enough to near unwanted ranges, so adding more cores proved to be detrimental to my preferences.

Nevertheless I fiddled with the settings a bit more and reduced the buffer settings to .1 day each. So far this seems to do the trick, reining in the over-eager WU queue. At a guess even the deadlines in the queue will not come to pass before they can be reasonably reached.

This is, however, a workaround for the issue; ideally the WU queue would be able to take into consideration both the computing time and the real time spent on a WU and compare that to the deadlines issued, with a subsequent reduction in the number of WUs downloaded for the buffer. As it is now, it only seems to care about the estimated CPU time remaining, the very issue that led to a significant amount of computing time being lost due to WUs starting way too late, without any reasonable chance of completing them in the time given, based on recent workstation behaviour.
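
The kind of check wished for here could look roughly like the sketch below. It is only an illustration of the idea, not how BOINC's scheduler works: it assumes progress is roughly linear in wall-clock time, and both function names are hypothetical.

```python
def projected_wall_hours_remaining(fraction_done, wall_hours_elapsed):
    """Extrapolate the remaining wall-clock time from observed progress.

    Assumption: the task keeps progressing at the average rate seen so far.
    """
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in (0, 1]")
    if wall_hours_elapsed <= 0:
        raise ValueError("wall_hours_elapsed must be positive")
    rate = fraction_done / wall_hours_elapsed  # progress per wall-clock hour
    return (1 - fraction_done) / rate


def will_miss_deadline(fraction_done, wall_hours_elapsed, hours_until_deadline):
    """True if the extrapolated finish lies beyond the deadline."""
    remaining = projected_wall_hours_remaining(fraction_done, wall_hours_elapsed)
    return remaining > hours_until_deadline


# A WU that is half done after 3.5 hours, with 2 hours left until its deadline.
print(projected_wall_hours_remaining(0.5, 3.5))
print(will_miss_deadline(0.5, 3.5, 2))
```

A queue that ran this kind of comparison on real elapsed time rather than on the estimated CPU time alone could hold back new downloads whenever a task already in the buffer looks likely to miss its deadline. Real tasks do not always progress linearly, so the extrapolation is only a first guess.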

I don't expect this issue to be addressed, though.

TL;DR: Be as you were; Workaround found.
ID: 55893

Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.