multiple NVIDIA GPUs

Message boards : BOINC client : multiple NVIDIA GPUs
Metod, S56RKO

Joined: 9 Sep 05
Posts: 128
Slovenia
Message 42823 - Posted: 2 Mar 2012, 13:43:05 UTC

I've recently installed two video cards, both NVIDIA GT 430, in my Linux box. BOINC CC sees them just fine:

02-Mar-2012 13:30:51 [---] Starting BOINC client version 7.0.2 for x86_64-pc-linux-gnu
02-Mar-2012 13:30:51 [---] This a development version of BOINC and may not function properly
02-Mar-2012 13:30:51 [---] log flags: file_xfer, sched_ops, task, coproc_debug, cpu_sched, cpu_sched_debug
02-Mar-2012 13:30:51 [---] Libraries: libcurl/7.18.2 OpenSSL/0.9.8g zlib/1.2.3.3 libidn/1.8 libssh2/0.18
02-Mar-2012 13:30:51 [---] Data directory: censored
02-Mar-2012 13:30:51 [---] Processor: 8 AuthenticAMD Dual-Core AMD Opteron(tm) Processor 8218 [Family 15 Model 65 Stepping 2]
02-Mar-2012 13:30:51 [---] Processor: 1.00 MB cache
02-Mar-2012 13:30:51 [---] Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt rdtscp lm 3dnowext 3dnow rep_good extd_apicid pni cx16 lahf_lm cmp_legacy svm extapic cr8_legacy
02-Mar-2012 13:30:51 [---] OS: Linux: 2.6.36.1
02-Mar-2012 13:30:51 [---] Memory: 63.05 GB physical, 56.83 GB virtual
02-Mar-2012 13:30:51 [---] Disk: 66.40 GB total, 20.64 GB free
02-Mar-2012 13:30:51 [---] Local time is UTC +0 hours
02-Mar-2012 13:30:51 [---] NVIDIA GPU 0: GeForce GT 430 (driver version unknown, CUDA version 4.20, compute capability 2.1, 1024MB, 1001MB available, 280 GFLOPS peak)
02-Mar-2012 13:30:51 [---] NVIDIA GPU 1: GeForce GT 430 (driver version unknown, CUDA version 4.20, compute capability 2.1, 1024MB, 1001MB available, 280 GFLOPS peak)
02-Mar-2012 13:30:51 [---] OpenCL: NVIDIA GPU 0: GeForce GT 430 (driver version 295.20, device version OpenCL 1.1 CUDA, 1024MB)
02-Mar-2012 13:30:51 [---] OpenCL: NVIDIA GPU 1: GeForce GT 430 (driver version 295.20, device version OpenCL 1.1 CUDA, 1024MB)
02-Mar-2012 13:30:51 [---] NVIDIA library reports 2 GPUs


Tasks requiring an NVIDIA coprocessor run fine as long as their requirement is <coproc><count>1.0</count></coproc>. If the count is larger than 1.0, they don't run at all. One example is the project Moo! Wrapper, which is a wrapper for DNETC projects. Its science application grabs all available GPUs. The project scheduler correctly indicates this by setting the coproc count to 2.0. However, BOINC CC claims that my system doesn't have enough NVIDIA GPUs available:

02-Mar-2012 13:30:53 [Moo! Wrapper] [cpu_sched_debug] insufficient NVIDIA for dnetc_r72_1330613363_72_192_0


Any idea about this?
Metod ...
ID: 42823
Claggy

Joined: 23 Apr 07
Posts: 1112
United Kingdom
Message 42824 - Posted: 2 Mar 2012, 13:47:54 UTC - in response to Message 42823.  
Last modified: 2 Mar 2012, 13:50:31 UTC

Any idea about this?

Try BOINC 7.0.18; BOINC 7.0.2 is a very early alpha.

Claggy
ID: 42824
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5082
United Kingdom
Message 42825 - Posted: 2 Mar 2012, 13:56:49 UTC - in response to Message 42823.  

Are you attached to other projects? Was anything else active at the time?

I think that's the message you would see if, for example:

Two count=1.0 applications are running, separately, one on each GPU.

One of the two tasks finishes, leaving one GPU free and one GPU occupied.

The client scheduler considers scheduling a Moo! app next - it finds one available GPU, needs two, and backs out with the 'insufficient' message.

In general, BOINC doesn't like to pre-empt CUDA applications - it can be inefficient. So, Moo! probably won't run until both GPUs are freed by single tasks finishing at exactly the same time (unlikely), the other projects run out of work, or Moo! is forced to run in High Priority by deadline pressure.

One possible workaround would be to set a really low Task Switch Interval, so that the other tasks become pre-emptible quickly.
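
The scenario described above can be sketched as a toy model in Python. This is only an illustration of the reasoning in this post, not BOINC's actual scheduler code; the function name `pick_next` and the task names are hypothetical.

```python
def pick_next(queue, free_gpus):
    """Return the first queued (name, gpus_needed) task that fits in free_gpus.

    Toy model only: a task that needs more GPUs than are free is skipped,
    which corresponds to the 'insufficient NVIDIA' message in the log.
    """
    for name, gpus_needed in queue:
        if gpus_needed <= free_gpus:
            return (name, gpus_needed)
    return None


# Two GPUs in the host, one still busy with a count=1.0 task.
TOTAL_GPUS = 2
busy_gpus = 1
free_gpus = TOTAL_GPUS - busy_gpus

# Hypothetical queue: a two-GPU Moo! task ahead of a single-GPU task.
queue = [("moo_task", 2), ("single_gpu_task", 1)]

# The single-GPU task is chosen because the two-GPU task does not fit.
print(pick_next(queue, free_gpus))

# With only the two-GPU task queued and one GPU free, nothing can start.
print(pick_next([("moo_task", 2)], free_gpus))

# The two-GPU task fits only when both GPUs are free at the same time.
print(pick_next([("moo_task", 2)], TOTAL_GPUS))
```

In this model the two-GPU task starts only when both GPUs happen to be free at the same moment. It says nothing about why a two-GPU task might be refused when both GPUs are idle.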
ID: 42825
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5082
United Kingdom
Message 42826 - Posted: 2 Mar 2012, 14:03:51 UTC

Afterthought - has anybody discussed this potential cross-project "play nice" issue at Moo! ?

It's similar to issues we saw with BOINC in the early days of multi-threaded CPU applications. I think the solution there was to unceremoniously dump all single-threaded tasks when a MT app was scheduled to run (whether or not they should have continued running under 'Task Switch' rules). That wasn't very nice, either.
ID: 42826
Metod, S56RKO

Joined: 9 Sep 05
Posts: 128
Slovenia
Message 42827 - Posted: 2 Mar 2012, 14:36:25 UTC - in response to Message 42826.  
Last modified: 2 Mar 2012, 14:38:03 UTC

On this host I have multiple projects running: CPDN, Seti, Einstein and now Moo!. I don't have a proper CUDA application for Seti (yet; see the problem description at the end of this post), while Einstein provides a nice CUDA application. However, during these tests I disabled Einstein, so only Moo! wants to use NVIDIA. As BOINC CC decided the NVIDIA resources were insufficient, both GPUs stayed idle.
If I edit client_state.xml and decrease the CUDA count for Moo! to 1.0, BOINC CC starts a couple of Moo! tasks. This is far from ideal, as both tasks want to run on both GPUs.
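
Before editing anything by hand, it can help to list which app versions ask for more than one GPU. The following is a minimal read-only sketch in Python. It assumes a layout in which each `<app_version>` holds an `<app_name>` and a `<coproc>` element with `<type>` and `<count>` children; only the `<coproc><count>` part is quoted in this thread, so the other element names and the sample values are assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment. Only <coproc><count> is taken from the thread;
# the surrounding element names and values are assumed for illustration.
SAMPLE = """
<client_state>
  <app_version>
    <app_name>dnetc</app_name>
    <coproc>
      <type>CUDA</type>
      <count>2.0</count>
    </coproc>
  </app_version>
</client_state>
"""


def coproc_counts(xml_text):
    """Return (app_name, coproc_type, count) for every coproc requirement found."""
    root = ET.fromstring(xml_text)
    found = []
    for app_version in root.iter("app_version"):
        name = app_version.findtext("app_name", default="?")
        for coproc in app_version.findall("coproc"):
            ctype = coproc.findtext("type", default="?")
            count = float(coproc.findtext("count", default="0"))
            found.append((name, ctype, count))
    return found


def multi_gpu_apps(xml_text):
    """Keep only the entries that require more than one coprocessor."""
    return [entry for entry in coproc_counts(xml_text) if entry[2] > 1.0]


print(multi_gpu_apps(SAMPLE))
```

The sketch only reads the XML and works on an inline string, so it cannot damage a live client_state.xml.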

There's some discussion on the Moo! boards about making the dnetc application nicer, so that it would occupy only its designated GPU and not all of them. However, as this project is only a wrapper around a different type of distributed computing project, it's not entirely in the hands of project management to make the needed changes to the crunching (I just can't call it science) application.

I already tried to run a newer BOINC CC, but I'm unable to use the Berkeley-provided executable: my OS distribution is somewhat elderly, so its system libraries are too old. I should probably compile BOINC CC on my own. Before I jump on it, I would like some indication of whether this problem (a single task requiring a multi-GPU resource) is known to the developers and whether any work has already been done on it.
Metod ...
ID: 42827
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15483
Netherlands
Message 42828 - Posted: 2 Mar 2012, 15:04:09 UTC - in response to Message 42827.  

I forwarded it to Rom and asked him to drop by. Can't promise anything. :-)
ID: 42828
Metod, S56RKO

Joined: 9 Sep 05
Posts: 128
Slovenia
Message 42829 - Posted: 2 Mar 2012, 15:33:37 UTC - in response to Message 42828.  

Just for the record: I'm now running BOINC CC 7.0.18, and it still doesn't want to run tasks that require more than 1.0 NVIDIA resource (thus leaving both GPUs idle).
Metod ...
ID: 42829
Tex1954

Joined: 3 Mar 12
Posts: 27
United States
Message 42840 - Posted: 3 Mar 2012, 12:15:19 UTC

I can report the same problem running Moo! Wrapper on the ATI HD6990 card.

If I use an app_info.xml file and specify one GPU per task, it will load and run two tasks...

But the Dnet application appears to use ALL GPUs in real time, so if you suspend one task, the other starts using both GPUs even though app_info.xml doesn't prescribe that. That causes more problems, as eventually it's trying to run two tasks on two GPUs at the same time.

This behavior is also seen when running a Collatz task on one GPU and Moo! on another. If the Collatz task is suspended, the Moo! (distributed.net) task starts using both GPUs!

Without an app_info.xml file, Moo! will never start.

:)
ID: 42840


Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.