Thread 'CUDA Project makes 1 CPU idle most of the time'

Author	Message
GEPO Send message Joined: 21 Dec 08 Posts: 4	Message 21974 - Posted: 21 Dec 2008, 17:51:51 UTC I recently installed BOINC 6.4.5 and attached to GPUGRID to start using my GTX280 with BOINC. I have a dual Xeon E5462 (8 cores total) and run BOINC all the time on all 8 cores. In Windows Task Manager I noticed that when the GPUGRID project is running the following occurs:
1) One core is dedicated to the GPU task, i.e. there are 7 normal tasks plus 1 GPU task running in BOINC at any time.
2) The System Idle Process CPU share is 7% to 8% most of the time (with 8 cores that is one core sitting idle more than 1/2 the time).
3) Unlike normal tasks, which run at Low Base Priority, the GPU task runs at Normal Base Priority.
So it seems that BOINC is wasting 7-8% of the available CPU computing power when running a CUDA task. I would think this could be avoided if the BOINC scheduler did not reserve a CPU for a CUDA task, but instead ran a normal task on that CPU at low priority as well as the CUDA task at normal priority. In other words, if the scheduler disregarded the CUDA tasks and ran as many CPU tasks as CPUs at low priority and as many CUDA tasks as GPUs at normal priority, then the computer would never be idle. This of course could be a problem on computers with only 1 or 2 cores, as the CUDA tasks would run at normal priority and might interfere with system usability. So maybe, instead of running at Normal Base Priority, the CUDA tasks should be set to Below Normal, which is still higher than the Low priority of CPU BOINC tasks, allowing the GPU to work at full speed. Anyway, sorry for the long post, but I hate to see my system not being utilized to its fullest :) ID: 21974 ·

Jord Volunteer tester Help desk expert Send message Joined: 29 Aug 05 Posts: 15581	Message 21975 - Posted: 21 Dec 2008, 17:55:58 UTC - in response to Message 21974. There is a bug in the client that prevents it from asking for enough work. The developers knew about it but released the client anyway. It won't go so far that you run out of work entirely, although it is possible that on multi-processor systems one or more CPUs go idle. ID: 21975 ·

GEPO Send message Joined: 21 Dec 08 Posts: 4	Message 21977 - Posted: 21 Dec 2008, 18:02:26 UTC - in response to Message 21975. Thanks, but it's not a matter of not fetching enough work. In other words, there is almost always work in the queue. The issue I have is that the scheduler schedules 7 normal (CPU) tasks and 1 CUDA+CPU task at the same time on an 8 core machine. But the CUDA task only uses about 1/3 of a core, so the other 2/3 are unused. If it scheduled 8 normal (CPU) tasks and 1 CUDA+CPU task at the same time, with the CUDA task at slightly higher priority, the core would be fully utilized. Also, running the CUDA task at normal priority is not a problem with 8 cores, but I would imagine it might make a 1 or 2 core machine less responsive. ID: 21977 ·
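
The figures in this post can be checked with a little arithmetic. The sketch below is only an illustration, assuming the numbers GEPO reports (8 cores, a CUDA task that needs about 1/3 of a core); the helper `idle_share` is a made-up name for this example and is not part of BOINC.

```python
from fractions import Fraction


def idle_share(cores, cpu_tasks, gpu_task_cpu_fraction):
    """Fraction of total CPU capacity left idle (hypothetical helper).

    Assumes each CPU task keeps one core fully busy and the GPU task
    needs only `gpu_task_cpu_fraction` of a core.
    """
    busy = min(Fraction(cores), cpu_tasks + gpu_task_cpu_fraction)
    return (cores - busy) / cores


cuda_cpu = Fraction(1, 3)                # "about 1/3 of a core", per the post

reserved = idle_share(8, 7, cuda_cpu)    # 7 CPU tasks + 1 CUDA task
full = idle_share(8, 8, cuda_cpu)        # 8 CPU tasks + 1 CUDA task

print(f"{float(reserved):.1%}")          # 8.3%
print(f"{float(full):.1%}")              # 0.0%
```

Under these assumptions the reserved-core schedule leaves about 8.3% of the machine idle, which is close to the 7% to 8% System Idle Process share reported in the first post.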

Jord Volunteer tester Help desk expert Send message Joined: 29 Aug 05 Posts: 15581	Message 21978 - Posted: 21 Dec 2008, 18:19:44 UTC - in response to Message 21977. That's a Seti problem. See this thread in which I explained what's the problem and solution. ID: 21978 ·

GEPO Send message Joined: 21 Dec 08 Posts: 4	Message 21979 - Posted: 21 Dec 2008, 19:00:34 UTC - in response to Message 21978. Thank you very much! I created a cc_config.xml file in the BOINC data folder with the following content:
<cc_config>
  <options>
    <ncpus>9</ncpus>
  </options>
</cc_config>
Then I told BOINC to "Read config file" and "Read local prefs file", and I now have 8 regular tasks + 1 CUDA task running with no idle CPU time! I still think something like this should be done automatically by the BOINC Manager, maybe in a not-so-distant future release :-) Thanks again and Happy Holidays! ID: 21979 ·
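
For anyone who would rather generate that file than type it, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The function name `build_cc_config` is hypothetical, the element names are the ones from the post above, and the sketch only builds the XML text; where the BOINC data folder lives on a given machine is left to the reader. Note that later replies in this thread argue against using this workaround at all.

```python
import xml.etree.ElementTree as ET


def build_cc_config(ncpus):
    """Return cc_config.xml content with the given <ncpus> value.

    Hypothetical helper: it mirrors the three elements shown in the
    post (cc_config > options > ncpus) and nothing else.
    """
    root = ET.Element("cc_config")
    options = ET.SubElement(root, "options")
    ET.SubElement(options, "ncpus").text = str(ncpus)
    return ET.tostring(root, encoding="unicode")


# 8 physical cores + 1, as in the post
print(build_cc_config(9))
```

The output is the same document as above on a single line; the client still has to be told to re-read its config file afterwards, as GEPO describes.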

Nicolas Send message Joined: 19 Jan 07 Posts: 1179	Message 21997 - Posted: 22 Dec 2008, 2:49:06 UTC - in response to Message 21979. aaaaaaaaaaagh STOP MESSING WITH CC_CONFIG NCPUS ID: 21997 ·

Richard Haselgrove Volunteer tester Help desk expert Send message Joined: 5 Oct 06 Posts: 5137	Message 22002 - Posted: 22 Dec 2008, 10:09:23 UTC - in response to Message 21997. "aaaaaaaaaaagh STOP MESSING WITH CC_CONFIG NCPUS"
To which I can only reply: AAAAAAAAAAAAAAAAAAGH Stop releasing half-baked BOINC clients with new features that haven't been fully thought through, declaring them stable with next-to-no testing, introducing new bugs (DCF), failing to fix old bugs ....
.... and which require the use of NCPUs (plus resetting the thread priorities at application level) to do that very thing which BOINC was designed to do right at the very beginning, namely to put idle CPU cycles to productive use.
Not your fault personally, Nicolas, of course: but if you dare look at the SETI boards to see what a confusing mess has been generated by this BOINC release, plus the equally rushed, untested, un-thought-through, unsupported (there are too few SETI project staff to handle a release of this magnitude, especially so soon after their Astropulse release, which still hasn't had the post-release fine-tuning attention it deserves) SETI CUDA release, then you'd see why volunteers resort to experiments like this. ID: 22002 ·

Jord Volunteer tester Help desk expert Send message Joined: 29 Aug 05 Posts: 15581	Message 22005 - Posted: 22 Dec 2008, 10:24:44 UTC - in response to Message 22002. "declaring them stable with next-to-no testing"
Ahem... 6.4 has been in testing for quite a while, with CUDA tests being done on GPUGrid. That their app and Seti's app can't be compared isn't the fault of the BOINC testers who did try out the new BOINC clients. And yes, there are still bugs in the client. If you think you can do better, get the source code and start ripping. I'll look forward to RHOINC 1.0 :-) ID: 22005 ·

MarkJ Volunteer tester Help desk expert Send message Joined: 5 Mar 08 Posts: 272	Message 22006 - Posted: 22 Dec 2008, 12:20:41 UTC - in response to Message 21997. "aaaaaaaaaaagh STOP MESSING WITH CC_CONFIG NCPUS"
Is there something in the works to update the scheduling to allow 4 CPU + 1 GPU tasks on a quad core (without having to resort to the above cc_config change)? I have raised Trac ticket #802 in regard to this. MarkJ ID: 22006 ·

Jord Volunteer tester Help desk expert Send message Joined: 29 Aug 05 Posts: 15581	Message 22007 - Posted: 22 Dec 2008, 12:26:16 UTC - in response to Message 22006. Yes there is, but I wouldn't expect it before BOINC 6.8. Now on the other hand, it does work as long as you crunch Seti on the GPU and any other project on the CPUs... it also works with Seti on the GPU and Seti AP on the CPUs. Just not all at Seti Enhanced (Multibeam). ID: 22007 ·

MarkJ Volunteer tester Help desk expert Send message Joined: 5 Mar 08 Posts: 272	Message 22009 - Posted: 22 Dec 2008, 12:41:38 UTC - in response to Message 22007. "Yes there is, but I wouldn't expect it before BOINC 6.8. Now on the other hand, it does work as long as you crunch Seti on the GPU and any other project on the CPUs... it also works with Seti on the GPU and Seti AP on the CPUs. Just not all at Seti Enhanced (Multibeam)."
Ahh, I was hoping for something in 6.5, but I guess we will have to wait. I was also hoping they'd find the time to look at Superhost, but that doesn't look very likely given all the work around BOINC for the time being. MarkJ ID: 22009 ·

Nicolas Send message Joined: 19 Jan 07 Posts: 1179	Message 22030 - Posted: 23 Dec 2008, 0:17:49 UTC - in response to Message 22002. "To which I can only reply: AAAAAAAAAAAAAAAAAAGH Stop releasing half-baked BOINC clients with new features that haven't been fully thought through, declaring them stable with next-to-no testing, introducing new bugs (DCF), failing to fix old bugs ...."
And to which I can only agree. Every major release has been more "half-baked" than the previous ones. Or should I say, this one was quarter-baked :)
Still, if you edit ncpus, you're just working around the problem and potentially causing new ones. For example, if you have no GPU work, you'll be running more CPU tasks than your total cores, making your computer slower. And your "CPU efficiency" (measured by BOINC) will go down, causing more work fetch problems than the ones 6.4.5 already has out-of-the-box.
And if you work around the problem, instead of helping fix it, you get it sort-of-working for YOU. BOINC will keep being broken for everybody else. Note I'd count boycotting the current version as "helping fix it"...
"...and which require the use of NCPUs (plus resetting the thread priorities at application level) to do that very thing which BOINC was designed to do right at the very beginning, namely to put idle CPU cycles to productive use."
BOINC not doing what it was designed to do? Go file a Trac ticket. And please watch it closely too, they seem to have a tendency to get closed without "proof of fix".
"Not your fault personally, Nicolas, of course: but if you dare look at the SETI boards to see what a confusing mess has been generated by this BOINC release, plus the equally rushed, untested, un-thought-through, unsupported (there are too few SETI project staff to handle a release of this magnitude, especially so soon after their Astropulse release, which still hasn't had the post-release fine-tuning attention it deserves) SETI CUDA release"
No way in hell I'm getting anywhere close to the SETI forum, it's a very dangerous place to be, especially now! If it was in the real world rather than a virtual forum, I'm sure someone would have already set something on fire...
", then you'd see why volunteers resort to experiments like this."
If testers set that for some period of time to see if it solves the problem, or to get that problem out of the way to see if there is something else broken, then that's fine. But it's not an experiment when I see "change ncpus" being given out as a suggestion to newbies with scheduling problems. ID: 22030 ·
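
Nicolas's objection about having no GPU work can be made concrete with a toy model. The sketch below is not BOINC's scheduler; it only encodes the behaviour described in this thread for 6.4.5 (one CPU slot is given up while a GPU task runs), and both function names are hypothetical.

```python
def runnable_cpu_tasks(ncpus_setting, gpu_work_available):
    """CPU tasks started under the simplified model from this thread.

    Assumption: while a GPU task is running, the client reserves one
    of the `ncpus_setting` slots for it; otherwise every slot gets a
    CPU task.
    """
    return ncpus_setting - 1 if gpu_work_available else ncpus_setting


def oversubscribed_by(physical_cores, ncpus_setting, gpu_work_available):
    """Number of CPU tasks beyond the number of physical cores."""
    tasks = runnable_cpu_tasks(ncpus_setting, gpu_work_available)
    return max(0, tasks - physical_cores)


# GEPO's machine: 8 cores with <ncpus>9</ncpus>
print(oversubscribed_by(8, 9, gpu_work_available=True))   # 0
print(oversubscribed_by(8, 9, gpu_work_available=False))  # 1
```

With GPU work in the queue the override gives 8 CPU tasks on 8 cores; once the GPU work runs out, the same setting puts 9 CPU tasks on 8 cores, which is the slowdown described above.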

Nicolas Send message Joined: 19 Jan 07 Posts: 1179	Message 22031 - Posted: 23 Dec 2008, 0:18:20 UTC - in response to Message 22005. "declaring them stable with next-to-no testing"
"Ahem... 6.4 has been in testing for quite a while, with CUDA tests being done on GPUGrid. That their app and Seti's app can't be compared isn't the fault of the BOINC testers who did try out the new BOINC clients."
That's not the real problem. There WAS testing. Problems WERE found. The client was released anyway, without testing for long enough, and ignoring the problems already found. Oh, they weren't completely ignored actually... Rom did a mass-edit to all Trac tickets, pushing those with milestone 6.4 (and even some with 6.2 and still unfixed!) to 6.6. ID: 22031 ·

Jord Volunteer tester Help desk expert Send message Joined: 29 Aug 05 Posts: 15581	Message 22033 - Posted: 23 Dec 2008, 0:27:54 UTC - in response to Message 22030. "No way in hell I'm getting anywhere close to the SETI forum, it's a very dangerous place to be, especially now! If it was in the real world rather than a virtual forum, I'm sure someone would have already set something on fire..."
Or made a movie script out of it and sold it for big bucks to John Carpenter. The Thing II: Cuda! See if Kurt Russell can wriggle his way out of this one. :-) ID: 22033 ·

Copyright © 2025 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.