nVidia GTX-295 x 2 = GPU Hell
Author | Message |
---|---|
Joined: 12 Apr 09 Posts: 10 |
Hi guys. This has been driving me nuts for a month or so. I have 2 GTX 295s, not in SLI, with the desktop extended over the 4 GPUs: 2 monitors connected and 2 dummy VGA plugs. All this is on Vista 32 SP1 (because the nVidia drivers on 64-bit were showing only 2 CUDA devices). I have tried nVidia drivers 182.08, 181.xx and 185.xx. What I'd like to know is why the BOINC client reports only 3 devices found, even though the nVidia control panel shows 4 and a FurMark test reports 4 GPUs in use. The problem occurs on both GPUGrid and SETI, so it seems to be a client issue. I only get the line: 04/12/09 01:59:43 CUDA devices: GeForce GTX 295 (driver version 18568, CUDA version 1.3, 896MB, est. 106GFLOPS), GeForce GTX 295 (driver version 18568, CUDA version 1.3, 896MB, est. 106GFLOPS), GeForce GTX 295 (driver version 18568, CUDA version 1.3, 896MB, est. 106GFLOPS) which is obviously 1 short. I have 5 working dummy VGA plugs and have tried all of them. I know they work because I use them in another machine. For 4 hours today all 4 GPUs showed up, but as soon as I (foolishly) rebooted we were back to the same scenario. This has happened on clients 6.4.5, 6.6.15 and 6.6.20. Many thanks |
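To check a saved startup log for this symptom, something like the following could count the device entries on the client's "CUDA devices:" line. This is only a rough Python sketch: the function name count_cuda_devices is made up for illustration, and it assumes every device entry contains the text "(driver version" as in the line quoted above. It counts what the client printed, which is not necessarily what the driver exposes.

```python
import re


def count_cuda_devices(log_line: str) -> int:
    """Count device entries on a BOINC 'CUDA devices:' startup line.

    Assumption: each entry contains one '(driver version' block,
    as in the log line quoted in the post above.
    """
    _, marker, devices = log_line.partition("CUDA devices:")
    if not marker:
        return 0  # not a CUDA devices line
    return len(re.findall(r"\(driver version", devices))


entry = "GeForce GTX 295 (driver version 18568, CUDA version 1.3, 896MB, est. 106GFLOPS)"
line = "04/12/09 01:59:43 CUDA devices: " + ", ".join([entry] * 3)
print(count_cuda_devices(line))
```

For the line quoted in the post this prints 3, matching the "1 short" observation.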
Joined: 29 Aug 05 Posts: 15552 |
As far as I understand this, it is a driver problem. BOINC will only detect all GPUs correctly if the drivers for all of them have loaded. Does this occur when you start up and BOINC Manager starts automatically, before everything else has loaded? What happens if you now exit BOINC and restart it? Does it then detect all GPUs? If it does, you may want to stop BOINC Manager from auto-loading on Windows startup and start it only after Windows has completely loaded. Starting BOINC Manager starts the client, and if not everything has loaded completely yet, that may give these problems. (That's my guess, anyway.) |
Joined: 23 Apr 07 Posts: 1112 |
A link to the host would help, please. Note that other people (Al) have posted BOINC output showing 3 GPUs with the fourth getting cut off partway through. What does the website show? Do 4 CUDA apps run together? Claggy |
Joined: 12 Apr 09 Posts: 10 |
Hi. BOINC Manager was set to start at login; I have disabled this. I booted to the desktop, checked that the number of GPUs was 4 in the nVidia control panel, and checked that the displays were all extended. I ran FurMark, which finished and said there were 4 active GPUs. I then started BOINC Manager and found that BOINC thought there were 2 GPUs, although it is showing 3 on the host page. The 2 monitors and 2 dummy VGA plugs should ensure all the GPUs are active. Indeed, as I mentioned, FurMark is telling me all 4 are active. The machine is at: http://setiathome.berkeley.edu/show_host_detail.php?hostid=4837048 There also seem to have been some errors: Cuda error 'cudaAcc_GetPowerSpectrum_kernel' in file 'c:/sw/gpgpu/seti/seti_boinc/client/cuda/cudaAcc_PowerSpectrum.cu' in line 56 : unknown error. Cuda error 'cudaAcc_summax32_kernel' in file 'c:/sw/gpgpu/seti/seti_boinc/client/cuda/cudaAcc_summax.cu' in line 147 : unknown error. Cuda error 'cudaAcc_summax32_kernel' in file 'c:/sw/gpgpu/seti/seti_boinc/client/cuda/cudaAcc_summax.cu' in line 147 : unknown error. Cuda error 'cudaMemcpy(PowerSpectrumSumMax, dev_PowerSpectrumSumMax, cudaAcc_NumDataPoints / fftlen * sizeof(*dev_PowerSpectrumSumMax), cudaMemcpyDeviceToHost)' in file 'c:/sw/gpgpu/seti/seti_boinc/client/cuda/cudaAcc_summax.cu' in line 160 : unknown error. |
Joined: 23 Apr 07 Posts: 1112 |
Hi. Have you tried each GTX 295 on its own to see if you can get both GPU cores working, then swapped the cards over to see if both GPU cores work on the other one? How many power connections are there on each card, and have you checked them? It should work: some of the top hosts have 2 or 3 GTX 295s on XP, XP x64 and Vista x64, using BOINC 6.6.20, but I couldn't find anyone else with Vista x86 and 2 GTX 295s. I wouldn't worry about the compute errors; we all get a few. Made your host link live; click reply to see how. Claggy |
Joined: 12 Apr 09 Posts: 10 |
Yes indeed. I have swapped all the cards around and all is fine with them. As mentioned, FurMark itself sees and uses all 4 GPUs. If I suspend all BOINC tasks with 3 GPUs running, close down BOINC Manager, stop the clients and run FurMark, 4 GPUs work. When I restart the manager and watch the processes start in Task Manager, I have 3 again. I'm not particularly set on running 32-bit Vista; I am only on it because there was a known 64-bit problem at one time. I am still trying to get it to work on both and getting the same issues on both. 64-bit would be better, actually, but I get the same results. |
Joined: 12 Apr 09 Posts: 10 |
Ahh no, now I get the cutting off thing :) I read the thread twice before I saw what you meant :) Not sure if it is cutting off the 4th card in the string, but I'm sure it still isn't allocating a task to it. |
Joined: 12 Apr 09 Posts: 10 |
So I did some screenshots. The compound image shows: 2x 295 with 4 displays enabled and the image stretched across the 4; SLI off and PhysX on; 4 GPUs shown in RivaTuner; and a PrintScreen of the desktop showing all 4 desktops are present in the screenshot. Yet I still have 3 GPU tasks running and 3 showing when BOINC Manager starts up. Now quite why the screens are numbered 1, 2, 4 and 5, and where 3 went, is a mystery, and maybe one which indicates where the issue is. I.e. is BOINC looking for 1, 2, 3 and 4, and 3 is not present? BOINC does not look for 5, therefore I get 3 GPUs (1, 2 and 4) and not 4 GPU tasks (1, 2, 4 and 5)? I doubt this assumption, as according to RivaTuner they are GPU0 through GPU3, which makes more programmatic sense. I am sooo confuddled. |
Joined: 29 Aug 05 Posts: 15552 |
"Now quite why the screens are numbered 1, 2, 4 and 5, and where 3 went, is a mystery, and maybe one which indicates where the issue is. I.e. is BOINC looking for 1, 2, 3 and 4, and 3 is not present? BOINC does not look for 5, therefore I get 3 GPUs (1, 2 and 4) and not 4 GPU tasks (1, 2, 4 and 5)?" The driver that BOINC queries should take care of this. BOINC doesn't itself probe your video card, how many GPUs it has, or how many screens you have connected. That said, I can understand that this may be the cause of your problem. I have already forwarded this thread to the BOINC developer who is going over this code, but haven't heard back from him yet. In the meantime, can't you force screen 3 back in the nVidia control panel? |
Joined: 12 Apr 09 Posts: 10 |
Alas, not that I can see, no :( The British nation is unique in this respect. They are the only people who like to be told how bad things are, who like to be told the worst. Winston Churchill |
Joined: 12 Apr 09 Posts: 10 |
Okay, the 4th GPU has entered the building. After I taught myself CUDA I wrote my own program to enumerate the CUDA devices present, and it was seeing 3 even though the nVidia control panel and the MMC device panel were showing 4. So it's not BOINC Manager. I uninstalled all the nVidia drivers, rebooted and checked in the registry at HKEY_LOCAL_MACHINE\Hardware\DeviceMap\Video to see what I had. There were 3 devices present, Video0, Video1 and Video2, which were all standard VGA drivers. I then installed Forceware 185.68 and rebooted. I now had 10 devices in the registry. I navigated down to HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Video\ to see what was present. Here there is a list of registry keys of the form HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Video\(GPU ID)\0000, with one key for each GPU. I went through the list and, for each GPU, checked the Settings field, e.g. HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Video\{D82C4D9A-396A-4292-A538-A8F8F22FFEB5}\0000\Settings (where {D82C4D9A-396A-4292-A538-A8F8F22FFEB5} is a GPU ID), looking for the words NVIDIA GeForce GTX 295. I presumed (luckily correctly) that the 1st GPU ID with those words in the Settings field had the monitor on it. To the following 3 GPUs I added these registry values: DisplayLessPolicy DWORD 1 and LimitVideoPresentSources DWORD 1, at the HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Video\(GPU ID)\0000 level in the registry tree, and rebooted. Now only 2 monitors show in the nVidia control panel as having my desktop on them, but my own CUDA program shows 4 GPUs. I fired up BOINC Manager, which still displays that it found only 3 (due to the string length being exceeded, I expect), but I have 4 GPUs running. Maybe the string output in the Manager could be changed to be more like the CPU one, i.e. 
Processor: 8 GenuineIntel Intel(R) Core(TM) i7 CPU 920 @ 2.67GHz [x86 Family 6 Model 26 Stepping 4] so we could have CUDA devices: (4) GeForce GTX 295 (driver version 18568, CUDA version 1.3, 896MB, est. 106GFLOPS), GeForce GTX 295 (driver version 18568, CUDA version 1.3, 896MB, est. 106GFLOPS), GeForce GTX 295 (driver version 18568, CUDA version 1.3, 896MB, est. 106GFLOPS) which would at least semi-mitigate the fact that it shows 3, or (4) GeForce GTX 295 (driver version 18568, CUDA version 1.3, 896MB, est. 106GFLOPS), and in the case of mixed cards (2) GeForce GTX 295 (driver version 18568, CUDA version 1.3, 896MB, est. 106GFLOPS) (1) GeForce 9800GT (driver version 18568, CUDA version 1.3, 896MB, est. 106GFLOPS) etc. Granted, if they have 4 totally different models it would still not be right with the second option. Or allow it to span more than 1 line in the output :) |
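For anyone repeating the registry edit on several GPU keys, the values described above could be written out as a .reg file for review before importing. The following is a minimal Python sketch that only builds the text and touches nothing in the registry. The helper name build_reg_file is hypothetical, and the GPU ID shown is just the example ID from the post; the real IDs have to be looked up on each machine as described.

```python
# Registry path taken from the post above; {GPU ID}\0000 is appended per GPU.
BASE = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Video"


def build_reg_file(gpu_ids, names=("DisplayLessPolicy", "LimitVideoPresentSources")):
    """Return .reg file text setting each named value to DWORD 1 per GPU key."""
    lines = ["Windows Registry Editor Version 5.00", ""]
    for gpu_id in gpu_ids:
        lines.append(f"[{BASE}\\{gpu_id}\\0000]")
        for name in names:
            lines.append(f'"{name}"=dword:00000001')
        lines.append("")  # blank line between keys
    return "\n".join(lines)


# Placeholder ID copied from the post; substitute the IDs of the display-less GPUs.
example_ids = ["{D82C4D9A-396A-4292-A538-A8F8F22FFEB5}"]
reg_text = build_reg_file(example_ids)
print(reg_text)
```

Generating the text first keeps the change reviewable, and the same list of IDs can be reused after a driver reinstall. Back up the registry before importing anything.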
Joined: 5 Mar 08 Posts: 272 |
"Okay the 4th GPU has entered the building." They could spread it across as many lines as there are cards in the machine (i.e. 1 line for each card). The CUDA device number for each one could also be useful. MarkJ |
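The two output formats suggested in this thread (a count in front of each distinct model, or one line per card with its device number) could look roughly like this. It is a Python sketch of the idea only, not BOINC's actual code: the function names are invented, and it assumes the position in the list matches the CUDA device number.

```python
from collections import Counter


def grouped_summary(devices):
    """One line, with identical descriptions collapsed to '(count) description'."""
    counts = Counter(devices)  # keeps first-seen order on Python 3.7+
    return "CUDA devices: " + ", ".join(f"({n}) {desc}" for desc, n in counts.items())


def per_device_lines(devices):
    """One line per card; list index is assumed to be the CUDA device number."""
    return [f"CUDA device {i}: {desc}" for i, desc in enumerate(devices)]


gtx295 = "GeForce GTX 295 (driver version 18568, CUDA version 1.3, 896MB, est. 106GFLOPS)"
devices = [gtx295] * 4

print(grouped_summary(devices))
for text in per_device_lines(devices):
    print(text)
```

Either form stays readable however many cards are installed, because no single line has to hold every description.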
Joined: 19 Nov 05 Posts: 7 |
I added DisplayLessPolicy and LimitVideoPresentSources as binary "01 00 00 00" under all four "GPU ID" keys and was able to get four CUDA devices to show up in GPU-Z. Now the blue LED is lit on the second card. I suspect only LimitVideoPresentSources is required. I have only one monitor, and in Display Settings I have four under-intensified monitor icons. |
Joined: 19 Nov 05 Posts: 7 |
LimitVideoPresentSources is all that is required. Search for it, and add it to the other GPU entries, which look just like the one you found, using the same binary string. |
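The binary string "01 00 00 00" used here and the DWORD 1 used earlier in the thread are the same four bytes, because a 32-bit DWORD is laid out in little-endian order on Windows PCs. A short Python check of that byte layout (the helper name dword_bytes is made up for illustration; whether the driver accepts either value type is only what the posters report, not something this verifies):

```python
import struct


def dword_bytes(value: int) -> str:
    """Format a 32-bit unsigned value as little-endian hex bytes."""
    return " ".join(f"{b:02x}" for b in struct.pack("<I", value))


print(dword_bytes(1))
```

This prints 01 00 00 00.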
Joined: 8 Jan 09 Posts: 24 |
Make sure you do NOT have SLI enabled on any of them. That would cause two GPUs to function as one, thus reducing the visible CUDA devices by one. CUDA is not yet SLI compatible. |
Joined: 12 Apr 09 Posts: 10 |
Correct, but since I was seeing 3 GPUs, not 2, SLI could not have been on. |
Joined: 7 Jun 09 Posts: 1 |
Hey, I have the exact same issue! ASUS P6T Deluxe with SLI GTX 295 cards. Sometimes I see 3 GPUs in Device Manager, sometimes 4! I even sent screenshots to FalconNW to help. Did you find an easy solution? Bruce |
Joined: 18 Jun 09 Posts: 9 |
I want to make the LimitVideoPresentSources registry mod so I can run 2 x GTX 295s. Will this mod cause issues if I later want to turn SLI on to play games? Thanks, Far |
Joined: 18 Jun 09 Posts: 9 |
Anyone know? I won't be running BOINC when SLI is on, but will changing the registry cause any catastrophic stuff-ups if it is switched back to SLI for a while? (I would just go ahead and try it, but I'm hard pressed to find time for any more rebuilds of this thing.) |
Joined: 29 Aug 05 Posts: 15552 |
All that SLI does is make two GPUs act as if there is only one in your system. It combines all the memory available on both cards and, for all intents and purposes, shows up as if you have only one video card in the system. So BOINC will then also treat it as one video card, one GPU, with one task running at a time. |
Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License,
Version 1.2 or any later version published by the Free Software Foundation.