Message boards : BOINC client : How should I assign maximum disk space usage?
Author | Message |
---|---|
Send message Joined: 19 Dec 05 Posts: 93 |
I have two machines, and both of them run the BOINC client in its own partition (/boinc on Linux). In the past I had the /boinc partition set to approximately 8 GBytes on both machines, but my big machine was running over because it sometimes ran 4 climateprediction.net applications at a time. So today, I stopped the BOINC client, did a backup of the /boinc partition, unmounted the partition, deleted the partition, recreated it with about 16 GBytes, remade the file system, mounted it, and restored everything there from the backup. So now what should I put into the general preferences? There are three, and their current values are: (1) Use no more than 7.7 GB disk space; (2) Leave at least 0.1 GB disk space free; (3) Use no more than 99% of total disk space. I think I would prefer the last two to remain the same, but if I raise the first one from 7.7 GB disk space to 15.8 GBytes, what will happen on the system with only about 8 GBytes? I.e., does BOINC use the minimum of all these? Or what? |
Send message Joined: 25 Nov 05 Posts: 1654 |
In Climateprediction, a trick that we used to prevent the download of more work, in the days before there was a BOINC version that had a "No more work" option, was to set the first parameter to a value greater than the maximum disk size. Then the server said that there wasn't enough disk space, and didn't download any new work. So this is what will happen on your computer that doesn't have the large disk space that you set for the other two. |
Send message Joined: 29 Aug 05 Posts: 15581 |
Do you use more than 7.7GB? Even 4 CPDN units don't use more than 2.4GB of space, so why not leave it for now and see how it goes? |
Send message Joined: 19 Dec 05 Posts: 93 |
"Do you use more than 7.7GB? Even 4 CPDN units don't use more than 2.4GB of space, so why not leave it for now and see how it goes?" They don't? I get this with just three, and 48vf_200298395 is all done. The numbers are in 1024-byte blocks. So if I calculate right, for climateprediction applications the data would be about 7.27 GBytes. And if I am running other projects, such as setiathome, rosetta, and predictor@home, it squeezes 8 GBytes too tightly. Output of `du . | sort -nr`, run as boinc in ~/projects/climateprediction.net (blocks, then path): 5454180 .; 2176932 ./460n_100294695; 2150364 ./46st_c00295709; 2027584 ./460n_100294695/dataout; 2015352 ./46st_c00295709/dataout; 1012760 ./48vf_200298395; 147648 ./460n_100294695/datain; 133316 ./46st_c00295709/datain; 32488 ./46st_c00295709/datain/ancil; 32488 ./460n_100294695/datain/ancil; 11708 ./46st_c00295709/datain/dumps; 11708 ./460n_100294695/datain/dumps; 1236 ./46st_c00295709/tmp; 1236 ./460n_100294695/tmp; 620 ./46st_c00295709/datain/ancil/ctldata; 620 ./460n_100294695/datain/ancil/ctldata; 532 ./46st_c00295709/datain/ancil/ctldata/STASHmaster; 532 ./460n_100294695/datain/ancil/ctldata/STASHmaster; 444 ./46st_c00295709/jobs; 444 ./460n_100294695/jobs; 84 ./46st_c00295709/datain/ancil/ctldata/stasets; 84 ./460n_100294695/datain/ancil/ctldata/stasets |
Send message Joined: 29 Aug 05 Posts: 15581 |
Well... the CPDN FAQ on Disk Space says: "Each CPDN model you run on your system will require around 600MB of disk space while it's running." I'm not sure if the sulphur model needs more, but I doubt it. So what is your Connect to setting set to for all these projects? How many CPUs (real and/or virtual) do you have? Then we can calculate how much you use. |
Send message Joined: 25 Nov 05 Posts: 1654 |
Sorry, the cpdn FAQ hasn't been updated for a long while. 600M was for early models, in the days before BOINC. The latest slab models need about 700M, and sulphur, which is all that is available at present, needs about 2.5G before the finish, when some compression takes place. If you have completed models, some space will be used by the bulk of the results, which isn't sent back to Oxford, but remains on your computer for study by the user. Even ones that have failed will take up a bit of space. By deleting these, or moving the model's folder outside of BOINC, more space will be available inside the BOINC area. |
Send message Joined: 19 Dec 05 Posts: 93 |
"Well... the CPDN FAQ on Disk Space says:" It is pretty clear that the sulfur models take _a lot more_. Two active ones and one finished one take 5.454180 GBytes. One active one takes 2.176932 GBytes and it is not done yet. So 4 nearly-done ones would take around 8 GBytes. This would not include data and programs for other projects, such as setiathome, rosetta, and predictor@home. I have 2 real 3.06 GHz Intel Xeon hyperthreaded processors in the main machine, the ones with a 1 Megabyte L3 cache in addition to the L1 and L2 caches. There are 4 GBytes of 266MHz ECC DDR SDRAM in there now, and the processor bus runs at 533 MHz. Right now the connect setting is set to 0.25 days, which is not enough to keep all the processors running, but if I make it any bigger, it downloads so much short-term stuff that the climate prediction stuff will never get cycles until a week or two before the deadline, and then it will be too late. |
Send message Joined: 29 Aug 05 Posts: 15581 |
So why not use a different venue then for CPDN? I have tested that in the past and it worked. Leaving my normal projects on default venue, while putting a hard-hitting one at another, just so I could up my other projects a little, while the hard-hitting one would stay the same. You can then up it for the CPDN project to use 16GB, while the other projects still use around 8GB. |
Send message Joined: 19 Dec 05 Posts: 93 |
"So why not use a different venue then for CPDN? I have tested that in the past and it worked. Leaving my normal projects on default venue, while putting a hard-hitting one at another, just so I could up my other projects a little, while the hard-hitting one would stay the same." I think I do not understand what you are talking about. What is a venue? If you mean that one of my machines is one venue and another machine is another venue, why not rely on the scheduler in the BOINC client to follow the rules? To avoid running out of disk space, I just doubled the amount in my system allocated to boinc to 16 GBytes, and that should work. At least it will not get too crowded as often as it does now. Normally, I do not expect to have 4 climateprediction work units in the machine at once anyway. I believe that the only reason I got four is that none of the other projects were delivering work units for a while and I ran out of work for them, so the BOINC client just got 4 from climateprediction. Now the only problem is the scheduler in the BOINC client, but that complaint is for another (already existing) thread. |
Send message Joined: 29 Aug 05 Posts: 15581 |
"I think I do not understand what you are talking about. What is a venue?" Look in Your CPDN account, Project Preferences. Scroll down. See where it says: "Add separate preferences for home", "Add separate preferences for school", "Add separate preferences for work". Those are venues. Click one. Set it up with the same resource share & settings as the default venue. Press the Add preference button. Then go to the project's general preferences and scroll down. See how it shows the same options at the bottom? Press the one for the venue you set up in the Project's preferences. Now set it up so it shows a max disk usage of at least 16GB. Press the Add preferences button to save the changes. Open BOINC Manager, go to the Projects tab, select the CPDN project, press the Update button. If you then go to the Messages tab, you will see that it contacts CPDN and gives it a different venue than the rest have. You can then change the other projects' default preferences to your heart's content without changing the CPDN ones. Yet since you already changed things... well, try it next time then. "Now the only problem is the scheduler in the BOINC client, but that complaint is for another (already existing) thread." The scheduler is no problem if you let BOINC do its job. If you try to micromanage BOINC the scheduler will be a problem, but if you just let it run for 3 days to 2 weeks without touching it, you'll see that the scheduler works as it is supposed to. |
Send message Joined: 19 Dec 05 Posts: 93 |
"I think I do not understand what you are talking about. What is a venue?" OK, now I know what you are talking about, but it seems to me that that is micromanaging something that the BOINC client should handle. Furthermore, on your second point, I have let the BOINC client run without fooling with it for much more than a few days and it would not run climate prediction work units that were partly completed and were at great risk of missing their deadlines. Instead, it insisted on downloading a batch of predictor or setiathome or rosetta work units and then preempting the processors from running the processes at risk. And as those short-deadline processes completed, it downloaded some more, over and over, and never scheduled the climateprediction ones. Only when I cut the check-in interval to 0.25 days did I even get it to run one. But then it happens that a processor will go idle for several hours because it runs out of work. What the scheduler should do, when it detects that the machine is overcommitted and enters the mode where the processes with the nearest deadlines are run first, is to subtract the time to completion estimate from the deadline in making that calculation, so a process with 7679 hours to go with a 10 week deadline would get scheduled in preference to a process with 24 hours to go and a 2 week deadline. But this is argued in another thread, and who knows when they will fix it. Meantime, there are lots of ways of fiddling with parameters here and there whose side-effects may give short-term solutions. But this still violates the rule that the more you try to outsmart an operating system, the more it will outsmart you. The objective should be to cure the problem, not to find band-aid work-arounds. |
Send message Joined: 30 Aug 05 Posts: 297 |
"Furthermore, on your second point, I have let the BOINC client run without fooling with it for much more than a few days and it would not run climate prediction work units that were partly completed and were at great risk of missing their deadlines. Instead, it insisted in downloading a batch of predictor or setiathome or rosetta work units and then preempting the processors from running the processes at risk. And as those short deadline processes completed, it downloaded some more, over and over, and never scheduled the climateprediction ones. Only when I cut the check in interval to 0.25 days did I even get it to run one. But then it happens that a processor will go idle for several hours because it runs out of work." It is known that the scheduler doesn't work 'correctly' on multi-CPU machines where CPDN is one of the projects. No idea when it'll be fixed, but hopefully soon. The 'workaround' is to set the cache size very small; you still should not have idle CPUs though, and I don't know why that's happening in your case. |
Send message Joined: 29 Aug 05 Posts: 15581 |
"What the scheduler should do, when it detects that the machine is overcommitted and enters the mode where the processes with the nearest deadlines are run first, is to subtract the time to completion estimate from the deadline in making that calculation, so a process with 7679 hours to go with a 10 week deadline would get scheduled in preference to a process with 24 hours to go and a 2 week deadline." Since you've been giving BOINC so many changes in so many days, how can the scheduler work out what the exact end time of a work unit is going to be? Work units aren't exact in time. Not CPDN units, not Seti units, not Predictor units. They vary in run time. If you check your results per project you will see this shown. One unit takes 8,000 seconds, while another for the same project takes 11,000 seconds (these are examples given). Built into the work scheduler is a calculation that reduces the estimated time for "same" work units. If you just let it run without changing preferences every day, then BOINC will eventually (after a couple of days or weeks) get to a stable setting where it can guess new work units correctly. I've been running this client with this estimate calculation for a long time now. I can now change clients, I can add/remove projects, I can stop running BOINC to play games!, I can increase/decrease my connect to time, and each time the projects I have been running all this time will start out with an estimated time to completion that is close to the actual time in which the unit is completed! Maybe you are taking on too many projects. Maybe it's what Bill says, that CPDN is going haywire. All I am saying is that you should stop checking BOINC every minute. Just let it run in the background. Check it again in a week's time. See what it does then. |
Send message Joined: 19 Dec 05 Posts: 93 |
"It is known that the scheduler doesn't work 'correctly' on multi-CPU machines, where CPDN is one of the projects. No idea when it'll be fixed however, but hopefully soon. The 'workaround' is to set the cache size very small; you still should not have idle CPUs though, I don't know why that's happening in your case." It is quite clear what is happening. The scheduler sees that there is an idle processor, correctly decides that the machine is not overcommitted, and requests more work. Unfortunately, it asks for too much work, so it downloads a lot of applications with relatively short deadlines. If it downloads more than one, it puts the scheduler back into overcommitted run-most-recent-deadline-programs-first mode. These programs preempt the climateprediction ones, and as they finish, the scheduler goes out of overcommitted mode, gets more short-term work, etc. Now if the short-term stuff all has 1-week deadlines, like predictor for example, it will keep the climateprediction stuff from running, except for very brief intervals when downloading more short-deadline stuff, until about a week from the climateprediction deadlines. By then, it is way too late. I have already suggested a fix, and no longer care if they ever fix it. But if they don't, we must either resort to tampering with the scheduler in various ways, including the ones you have suggested, or waste processor power computing stuff that will not be finished until long after the deadline, or just turn off climateprediction (which, IMAO, would be a pity). |
Send message Joined: 19 Dec 05 Posts: 93 |
"What the scheduler should do, when it detects that the machine is overcommitted and enters the mode where the processes with the nearest deadlines are run first, is to subtract the time to completion estimate from the deadline in making that calculation, so a process with 7679 hours to go with a 10 week deadline would get scheduled in preference to a process with 24 hours to go and a 2 week deadline." Of course. "If you check your results per project you will see this shown. One unit takes 8,000 seconds, while another for the same project takes 11,000 seconds (these are examples given)." Of course. "Built into the work scheduler is a calculation that reduces the estimated time for 'same' work units. If you just let it run without changing preferences every day, then BOINC will eventually (after a couple of days or weeks) get to a stable setting that it can guess new work units correctly." I think I would be satisfied with its current estimated time calculations, although IMAO they are quite pessimistic. But as far as I can tell, the scheduler ignores the estimated time in its efforts to schedule processes. I think the right thing for it to do is use these estimates to reduce the deadline. I do not change the preferences every day. I used to have the major one set to 1.5 days and it worked pretty well until the scheduler downloaded climateprediction work so that 4 climate prediction work units were running at once. That was not an apparent problem until one of them finished. That is when I noticed (running top) that it was never running climateprediction work units anymore. I looked and saw that it was running a bunch of short-deadline work units instead. I let it do that for about a week, thinking that it was just trying to get my distribution of work done right in the longer term. Anyhow, in a thread in the BOINC area, someone said I should reduce my interval, so I shortened it to 0.5 day or something and it did not help at all; it still downloaded too much work. 
I finally set it to 0.125 day, hoping it would then coast over the 3-hour setiathome shutdown every week, and perhaps it will. Meanwhile a sulfur model climateprediction work unit completed, so I now have only two processes at risk, and one is likely to complete on time. But the other may not make it. "I've been running this client with this estimate calculation for a long time now. I can now change clients, I can add/remove projects, I can stop running BOINC to play games!, I can increase/decrease my connect to time and each time the projects I have been running all this time will start out with an estimated time to completion that is close to the actual time that the unit is completed!" I never stop it except when I reboot the machine, which is usually less than once a month (although today I installed an additional NIC, so I had to power it down for that). "Maybe you are taking on too many projects." Not really. CPDN is one of the best behaved of the applications. It is one of the few that reliably shuts down when you shut down the BOINC client, for example. The others usually do, but once in a while one of them keeps on running even after all other BOINC processes have terminated. "All I am saying is that you should stop checking BOINC every minute. Just let it run in the background. Check it again in a week's time. See what it does then." I have already done that. I do not check it once a minute, and I do not monkey with it other than the recent doubling of the allowable disk space. |
Send message Joined: 29 Aug 05 Posts: 15581 |
"I used to have the major one set to 1.5 days and it worked pretty well until the scheduler download climateprediction work so that 4 climate prediction work units were running at once. That was not an apparent problem until one of them finished. That is when I noticed (running top) that it was never running climateprediction work units anymore. I looked and saw that it was running a bunch of short deadline work units instead. I let it do that for about a week, thinking that it was just trying to get my distribution of work done right in the longer term." Work Scheduler Debt. "The others usually do, but once in a while one of them keeps on running even after all other BOINC processes have terminated." Which you have reported to their forums, I hope? |
Send message Joined: 29 Aug 05 Posts: 225 |
"Sorry, the cpdn FAQ hasn't been updated for a long while. 600M was for early models, in the days before BOINC." Well, the FAQ in the Wiki is pretty up to date, though I just "tweaked" the disk space numbers to these higher values. |
Send message Joined: 19 Dec 05 Posts: 93 |
"Sorry, the cpdn FAQ hasn't been updated for a long while. 600M was for early models, in the days before BOINC." Thank you; that will be helpful to others, I imagine. But one slight typo, "1GMB", in: "Around 330MB are left on your computer for a full slab model, 1GMB for a completed Sulphur Cycle model." (Talking about the disk requirements for ClimatePrediction -- the leftovers when a work unit completes.) |
Send message Joined: 30 Aug 05 Posts: 297 |
"But one slight typo: '1GMB'" Fixed. Paul - should the "Running BOINC CC on Mac OSX" section in there just be deleted? It refers to the CLI... |
Send message Joined: 29 Aug 05 Posts: 225 |
There has been a need to update the CLI part with changes, but, as with the rest I have not gotten there. This really should refer to that ... |
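
For anyone who wants to redo the disk-space arithmetic from the `du` listing earlier in the thread, here is a minimal Python sketch. The block counts are copied from that post. The helper names (`blocks_to_gb`, `blocks_to_gib`) are made up for illustration, and the four-thirds scaling used to arrive at the 7.27 figure is an assumption about how that number was reached; the post does not spell it out.

```python
# Sizes in 1024-byte blocks, copied from the `du . | sort -nr` listing in the thread.
TOTAL_BLOCKS = 5454180          # the "." line: three models plus other project files
MODEL_BLOCKS = {
    "460n_100294695": 2176932,  # still running
    "46st_c00295709": 2150364,  # still running
    "48vf_200298395": 1012760,  # finished
}

BLOCK_BYTES = 1024


def blocks_to_gb(blocks: float) -> float:
    """Decimal gigabytes (10**9 bytes). Hypothetical helper."""
    return blocks * BLOCK_BYTES / 1e9


def blocks_to_gib(blocks: float) -> float:
    """Binary gibibytes (2**30 bytes). Hypothetical helper."""
    return blocks * BLOCK_BYTES / 2**30


# Assumption: the 7.27 figure is the three-model total scaled up to four models,
# with one million blocks read as one GByte.
projected_blocks = TOTAL_BLOCKS * 4 / 3
thread_style_gbytes = projected_blocks / 1e6

if __name__ == "__main__":
    print(f"three models now: {blocks_to_gb(TOTAL_BLOCKS):.2f} GB, "
          f"{blocks_to_gib(TOTAL_BLOCKS):.2f} GiB")
    print(f"four models, thread-style figure: {thread_style_gbytes:.2f} GBytes")
    print(f"four models, exact blocks: {blocks_to_gb(projected_blocks):.2f} GB")
```

Read as true 1024-byte blocks, the four-model projection comes out a little over 7.4 GB, so the conclusion in the thread holds either way: four sulphur models plus other projects do not fit comfortably in 8 GBytes.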
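
The ordering proposed in the thread (subtract the estimated time to completion from the deadline, then run whatever has the least margin first) can be sketched as below. This is not the BOINC client's scheduler code. `Task`, `slack_hours`, and the two sorting functions are hypothetical names, and the two tasks simply reuse the numbers from the post.

```python
from dataclasses import dataclass
from typing import List

HOURS_PER_WEEK = 7 * 24


@dataclass
class Task:
    """Hypothetical stand-in for a work unit; not a BOINC data structure."""
    name: str
    hours_remaining: float    # estimated time to completion
    hours_to_deadline: float  # time left until the report deadline

    @property
    def slack_hours(self) -> float:
        # Deadline minus remaining work: how long the task can still wait.
        # A negative value means it cannot finish on time even if started now.
        return self.hours_to_deadline - self.hours_remaining


def nearest_deadline_first(tasks: List[Task]) -> List[Task]:
    """Order by deadline alone, as the overcommitted mode is described in the thread."""
    return sorted(tasks, key=lambda t: t.hours_to_deadline)


def least_slack_first(tasks: List[Task]) -> List[Task]:
    """Order by deadline minus estimated remaining time, as the poster proposes."""
    return sorted(tasks, key=lambda t: t.slack_hours)


# The two cases used as the example in the thread.
tasks = [
    Task("long model", hours_remaining=7679, hours_to_deadline=10 * HOURS_PER_WEEK),
    Task("short unit", hours_remaining=24, hours_to_deadline=2 * HOURS_PER_WEEK),
]

if __name__ == "__main__":
    print([t.name for t in nearest_deadline_first(tasks)])
    print([t.name for t in least_slack_first(tasks)])
```

With deadlines alone the short unit always goes first. With slack, the long model goes first because its margin is already negative (1680 hours to the deadline against 7679 hours of work), while the short unit still has 312 hours to spare.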
Copyright © 2025 University of California.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License,
Version 1.2 or any later version published by the Free Software Foundation.