Message boards : BOINC client : massive work fetch bug in 7.0.25
Author | Message |
---|---|
Send message Joined: 5 Aug 06 Posts: 59 |
Just switched to the new recommended version... and it's definitely nothing I would recommend. ;) My system: i7 980X + ATI HD 5850 + NVIDIA GTX 470 Active projects: - PrimeGrid (only on NVIDIA) - Collatz Conjecture (only on ATI, so ATI is currently idle because of downtime) - WUProp@Home - EDGeS@Home (CPU; Resource Share 0, because I used it as backup project during last PrimeGrid Challenge and did not change it back afterwards) Work buffer settings: <work_buf_min_days>0.0000000</work_buf_min_days> <work_buf_additional_days>0.0100000</work_buf_additional_days> While PG on NVIDIA and WUProp (nci) run nicely, the client keeps requesting EDGeS work for CPU and ATI... of course, it can't get work for ATI there, but it gets one more CPU WU on each request. 320 WUs and counting, about 12 hours of work according to BOINC's own estimation... that's 0.5 days, way more than 0.01 days. At least I did a backup before the update, so I can switch back to 6.12.34 for now. |
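The arithmetic behind the complaint above can be checked with a short sketch. It assumes, for illustration only, that the buffer target is simply the sum of the two settings quoted in the post; the function names are hypothetical and are not part of BOINC.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def buffer_target_days(min_days: float, additional_days: float) -> float:
    """Assumed target: minimum buffer plus additional buffer, in days."""
    return min_days + additional_days


def overshoot_factor(queued_hours: float, target_days: float) -> float:
    """How many times larger the queued work is than the target."""
    return (queued_hours / 24.0) / target_days


# Values taken from the post: 0.0 and 0.01 days configured, ~12 hours queued.
target = buffer_target_days(0.0, 0.01)
print(f"buffer target: {target * SECONDS_PER_DAY:.0f} seconds")
print(f"queued work is roughly {overshoot_factor(12.0, target):.0f}x the target")
```

Under that assumption, 0.01 days is a little over 14 minutes of work, so a 12-hour queue is far beyond what the settings ask for.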
Send message Joined: 29 Aug 05 Posts: 15552 |
Running away from something like this isn't going to help anyone. So, if you would please be so kind as to: Make a cc_config.xml file (see the Client configuration page on the BOINC wiki) and add into it: <cc_config> <log_flags> <cpu_sched_debug>1</cpu_sched_debug> <work_fetch_debug>1</work_fetch_debug> </log_flags> </cc_config> Save it in your BOINC data directory. Let BOINC know about it, either by exiting BOINC and restarting it, or via BOINC Manager->Advanced view->Advanced->Read config file. The output will be in the Event Log. Let it output for at least 5 minutes, then copy the part of the log that contains these debug messages, and stop BOINC. Remove the old lines from cc_config.xml, and in their place put the following: <cc_config> <log_flags> <rr_simulation>1</rr_simulation> </log_flags> </cc_config> Run this in the same manner as described above. After 5 minutes, copy that part of the log, add it to the previous part (in Notepad, for example), and send this file to David Anderson with a description of what you think is wrong, how it should work, and where in the logs it doesn't. You can find his address here. As for your question whether or not we tested this version, of course we did. The developers do not chuck untested applications in your lap; however, we can't test every possible combination of hardware + projects that exists, as then no BOINC version would ever be released. Also of note, it's very possible that this is a project scheduler problem. It wouldn't be the first. We're always on the lookout for alpha testers, so if you think you can do better than the rest of us, go to http://boinc.berkeley.edu/trac/wiki/AlphaInstructions and follow the instructions. It is that simple. |
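For anyone who would rather generate the two files than type them, here is a minimal sketch that builds the same XML with Python's standard library. The flag names come from the post above; the helper name `build_cc_config` is hypothetical, and where the file has to be saved (your BOINC data directory) depends on your installation, so the sketch only prints the text.

```python
import xml.etree.ElementTree as ET


def build_cc_config(flags):
    """Return cc_config XML text with each given log flag set to 1."""
    root = ET.Element("cc_config")
    log_flags = ET.SubElement(root, "log_flags")
    for name in flags:
        ET.SubElement(log_flags, name).text = "1"
    return ET.tostring(root, encoding="unicode")


# First logging pass: CPU scheduler and work fetch debugging.
print(build_cc_config(["cpu_sched_debug", "work_fetch_debug"]))

# Second logging pass: round-robin simulation only.
print(build_cc_config(["rr_simulation"]))
```

The output is a single unindented line per file, which is still well-formed XML. Save each one as cc_config.xml in the data directory and then restart BOINC or use Read config file, as described in the post.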
Send message Joined: 6 May 06 Posts: 287 |
"320 WUs and counting, about 12 hours of work according to BOINC's own estimation... that's 0.5 days, way more than 0.01 days." I've been running the 7.0.x series on 4 boxes and I did notice that kind of behaviour on a couple of projects (predominantly newly added ones), but it sorted itself out. I'm not guaranteeing that this will happen in your case, but let it run its course before abandoning it. I can't remember which projects it was, but when it happened I was a bit concerned, particularly as BOINC was going into "high priority" mode and it looked as though the WUs from other projects were going to suffer. As it was, I think it's just the way the 7 series estimates runtime. CIC1=CC=C(C2=N[C@@H](CC(OC(C)(C)C)=O)C3=NN=C(C)N3C4=C2C(C)=C(C)S4)C=C1 |
Send message Joined: 29 Aug 05 Posts: 15552 |
Yes, in general, for others reading here: do know that the scheduling code has been rewritten from the ground up. It will not fetch work, or schedule which projects to run, the way previous versions did. So rather than just run it for an hour, run it for 2 weeks and try to learn what it does differently. See how the new work fetch methods are now based on a low water mark and a high water mark. Find how BOINC won't ask for new work immediately after uploading & reporting work, but will only do so when it's past the low water mark. E.g. if you wanted at least a day's worth of work: In 6.12 and before, you'd set the connect interval to 0.01 and additional work to 1.0. In 7.0 you set minimum work buffer to 1.0 and max additional work buffer to 0.01. |
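The low/high water mark idea can be pictured with a toy model. This is only a sketch of the description in the post, not the client's actual code: it assumes the low mark is the minimum work buffer, the high mark is minimum plus additional, and that a request tops the buffer up to the high mark. The function name is hypothetical, and the model ignores everything else the client considers (idle devices, resource share, per-project limits).

```python
def work_request_days(buffered_days, min_days, additional_days):
    """Toy model: request nothing until the buffer is below the low mark."""
    low_mark = min_days
    high_mark = min_days + additional_days
    if buffered_days >= low_mark:
        return 0.0  # still above the low water mark, no request
    return high_mark - buffered_days  # assumed: top up to the high mark


# 7.0-style settings from the post: minimum 1.0 day, additional 0.01 day.
for buffered in (1.2, 1.0, 0.9, 0.0):
    request = work_request_days(buffered, 1.0, 0.01)
    print(f"buffered {buffered:.2f} d -> request {request:.2f} d")
```

In this model, finishing and reporting a task does not by itself trigger a request; only dropping below the low mark does, which matches the behaviour described above.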
Send message Joined: 11 Jul 09 Posts: 18 |
Well, I'm sorry, but I hate the new method. I'm attached to ten projects, eight are active and two are not, and I'll miss WUs when they are available using this method. After running 7.0.25 for a few days, it only ever runs three of my eight active projects?! And when it does fetch work, it fetches a bunch of WUs from only three projects. So if I set it to "In 7.0 you set minimum work buffer to 1.0 and max additional work buffer to 0.01", will it run like it should, like it used to? Thanks. |
Send message Joined: 20 Dec 07 Posts: 1069 |
"so if I set it to -" What keeps you from trying it? ;-) Regards, Gundolf |
Send message Joined: 11 Jul 09 Posts: 18 |
"so if I set it to -" Because I just run it; I don't make it, let alone like to fool with it. That's why I thought I'd ask first and provide some feedback while I was at it, as a longtime user. Is that OK? |
Send message Joined: 11 Jul 09 Posts: 18 |
Well, I set it to those numbers and now it's even worse! Now it's running only two of my eight active projects, with 13 WUs between them! LOL! That's just great, thanks a lot... Going to re-install 6.12.34, I guess. |
Send message Joined: 20 Dec 07 Posts: 1069 |
"Because I just run it; I don't make it, let alone like to fool with it. That's why I thought I'd ask first and provide some feedback while I was at it, as a longtime user. Is that OK?" Sorry, I didn't want to criticise you, I just think that playing with some preferences isn't like fooling with the application. It's never wrong to ask, but sometimes you get a faster response by just trying it. And giving feedback is okay in any case. :-) Regards, Gundolf |
Send message Joined: 20 Dec 07 Posts: 1069 |
"Going to re-install 6.12.34 I guess." And what about trying some other numbers and giving feedback again? ;-) Perhaps that way you help to discover and remove some bugs. Regards, Gundolf |
Send message Joined: 15 Apr 12 Posts: 5 |
Back in the early 6.x release the Min and Additional work buffer fields were set to ZERO so that projects would only get a single task - not multiple. Then you would have many tasks but only from individual projects. How can we do this in the new 7.0.25 version? Thanks |
Send message Joined: 15 Mar 10 Posts: 10 |
"Find how BOINC won't immediately after uploading & reporting work ask for new work..." How do the scheduler changes affect the reporting of completed WUs? 7.0.25 seems to keep completed WUs around a lot longer than 6.12.34 did. In the particular case of Milkyway@Home, 6.x always uploaded the results almost immediately (~1min after completion). 7.x lets the results sit there by the multiple dozens for an indeterminate period of time. |
Send message Joined: 29 Aug 05 Posts: 15552 |
"6.x always uploaded the results almost immediately (~1min after completion). 7.x lets the results sit there by the multiple dozens for an indeterminate period of time." Yup, plus it will ignore the <report_results_immediately/> switch in cc_config.xml. The change is that it will try to do a work request at the same time as it's reporting work. Only when a time-of-day change (CPU and Network) is happening in the next 30 minutes will BOINC 7 report work immediately. |
Send message Joined: 29 Aug 05 Posts: 15552 |
"Back in the early 6.x release the Min and Additional work buffer fields were set to ZERO so that projects would only get a single task - not multiple. Then you would have many tasks but only from individual projects. How can we do this in the new 7.0.25 version?" I'd almost say guess... ;-) But uhm, how about setting both to ZERO again? The minimum work buffer setting sets the minimum amount of work you're going to request. The maximum additional work buffer sets the additional days' worth of work you want to have. So if you only want 1 task for each CPU + GPU core, you set 0 + 0, which will fetch 1 second per hardware core. |
Send message Joined: 11 Jul 09 Posts: 18 |
All I know is that if I have X number of active projects, then I want at least one WU running or waiting to run for each of those projects, just like 6.x did so well for years. I install 7.x and it goes from that to only 4 of my 8 active projects having 4 WUs each!? Just because people use BOINC doesn't mean they are computer experts who understand all the jargon, or that they are comfortable fooling around with settings. If you know the damn answer, then please just give it. Re-installed 6.12.34 after getting worthless babble for answers and nothing from Ageless, from whom I'd hoped for a simple straight answer. It's working perfectly again now anyway, no thanks to this forum... |
Send message Joined: 29 Aug 05 Posts: 15552 |
I have given you a straight and simple answer. Want to have the difficult answer? As I posted here: "I have the same problem with version 7. Each task would complete, but BOINC Manager would not request new ones. I do not consider myself stupid, but the explanation of the parameters for minimum and maximum work buffers in the network preferences is VERY difficult to understand." I didn't program BOINC, yet when I tried to put an explanation of how this new BOINC works in the User Manual Wiki, I was called back by the developers, who found this information too strenuous for the poor souls of the simple BOINC user; it would scare them away. And as such, hey presto, there's no explanation anywhere of why BOINC 7.0 works the way that it does, other than in quite technical language. Still want to lay blame in all the wrong places? |
Send message Joined: 15 Apr 12 Posts: 5 |
"Back in the early 6.x release the Min and Additional work buffer fields were set to ZERO so that projects would only get a single task - not multiple. Then you would have many tasks but only from individual projects. How can we do this in the new 7.0.25 version?" But having ZERO in both settings in 6.x allowed all my projects that had a task to send to download only a single task per project, so I ultimately had many tasks waiting but only one per project. Now the ZERO settings just get one task at a time, and if I up them, I get several tasks for the same project. How can I get only one task per project in the 7.x release? This used to work... |
Send message Joined: 29 Aug 05 Posts: 15552 |
By allowing BOINC to run for several days with the zero and zero days setting. BOINC 7 has a new work fetch module and new separated CPU and GPU schedulers. It will have to learn anew about how you run things. It can't do that without starting with one task per hardware core per project. It won't do it immediately, there is no way to force it. But your BOINC 6.10 or 6.12 didn't do this immediately either, it had to learn this as well. So if you're just willing to let go of things, not look in BOINC Manager every moment of the day what it is doing, you will eventually see that it will do things the old way. Just one task per project per core, depending on resource share and REC (the new debt). |
Send message Joined: 15 Apr 12 Posts: 5 |
"By allowing BOINC to run for several days with the zero and zero days setting. BOINC 7 has a new work fetch module and new separated CPU and GPU schedulers. It will have to learn anew about how you run things. It can't do that without starting with one task per hardware core per project." Great!!! I guess as in most things: "Patience is the Key". Thanks for the quick response. I will just leave it alone... Keep up the good work. |
Send message Joined: 15 Mar 10 Posts: 10 |
"The change is that it will try to do a work request at the same time as it's reporting work." But when does it report results when it is NOT doing a work request? Does it wait until the work available is less than "Minimum work buffer"? If so, setting "Max. Additional work buffer" to a number of days could presumably delay transmitting results for a very long time. |
Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License,
Version 1.2 or any later version published by the Free Software Foundation.