Dual GPUs but only one with load

Barker
Joined: 5 Nov 17
Posts: 6
South Africa
Message 82650 - Posted: 5 Nov 2017, 14:36:13 UTC

Hi, I am running Windows 7 64-bit with BOINC 7.8.2 (x64).

I have 2 identical legacy ATI Radeon HD 3800 cards, which I am using on the Moo! Wrapper project. I have been running them on BOINC since 2012 on various projects, but 2 weeks ago I used GPU-Z for the first time to investigate them because I had heating issues. That is when I found that only one card had any load while the other had none. What is a little confusing is that BOINC shows tasks assigned to device 0 and device 1, and I have 2 GPU tasks running in BOINC.

I did a bit of reading and found a few things, which I have already tried:

1. Installed the latest drivers for the cards.
2. Enabled all GPUs in the BOINC cc_config.xml file.
3. Confirmed that the motherboard supports PCIe 2.0 x16, to allow 2 cards to run at the same time.
4. Added dummy monitor plugs to all GPUs, as I read that they may not be used otherwise. I have no monitors attached.
5. Confirmed that the 2nd card is not blown. I tested the cards one at a time and both were good.
6. Made sure that Overdrive and CrossFire are disabled.
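
For anyone wanting to double-check step 2, here is a minimal Python sketch. It assumes the setting in question is the use_all_gpus option inside the options section of cc_config.xml; the sample file content and the function name use_all_gpus_enabled are illustrative, not taken from the actual config on this machine.

```python
import xml.etree.ElementTree as ET

# Illustrative cc_config.xml content (an assumption, not the poster's real file).
SAMPLE_CC_CONFIG = """<cc_config>
  <options>
    <use_all_gpus>1</use_all_gpus>
  </options>
</cc_config>"""


def use_all_gpus_enabled(xml_text: str) -> bool:
    """Return True if <options><use_all_gpus> is set to 1."""
    root = ET.fromstring(xml_text)
    # findtext returns the default when the element is missing.
    value = root.findtext("options/use_all_gpus", default="0")
    return value.strip() == "1"


print(use_all_gpus_enabled(SAMPLE_CC_CONFIG))
```

To check a real file, read its text from the BOINC data directory and pass it to the function. The client only picks up cc_config.xml changes after the config files are re-read or the client is restarted.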

Has anyone else seen this before? How do I fix this, as I am effectively running at only 50% capacity?

Any help will be appreciated.

Thanks

Wayne
ID: 82650
Barker
Joined: 5 Nov 17
Posts: 6
South Africa
Message 82702 - Posted: 6 Nov 2017, 12:24:12 UTC - in response to Message 82650.  

I have attached more images that show the problem, also images for eventlog and config...

Thanks for the assistance.

Images (captions only):

BOINC tasks showing device 0 and device 1 - Load on device 0
BOINC tasks showing device 0 and device 1 - Load on device 1
BOINC eventlog
Machine Summary
Device 0 Driver
Device 1 Driver
cc config file
App config file
ID: 82702
Juha
Volunteer developer
Volunteer tester
Help desk expert
Joined: 20 Nov 12
Posts: 753
Finland
Message 82719 - Posted: 6 Nov 2017, 20:20:12 UTC

Could you check the command line BOINC used to start the GPU tasks? This Super User Q&A shows more than enough ways to do that.
ID: 82719
robsmith
Help desk expert
Joined: 25 May 09
Posts: 343
United Kingdom
Message 82720 - Posted: 6 Nov 2017, 21:20:14 UTC

Two GPUs running, each running one task - that's the default situation.
Temperature differences between otherwise identical cards running "identical" applications are not unknown, as they can be influenced by all sorts of things, including the exact loading that the two data sets present and the way the cooling air moves around the case; the list goes on. You really need to look at a fair number of tasks to see if there is a significant difference in run times between the two, not forgetting that driving a monitor may have an impact on the performance of one of the cards.
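
As a rough illustration of that comparison, here is a small Python sketch that compares average run times per device. The numbers are placeholders only, not measurements from this machine, and the helper name relative_gap is made up for the example.

```python
from statistics import mean, stdev


def summarize(run_times):
    """Return (mean, sample standard deviation) for a list of run times."""
    return mean(run_times), stdev(run_times)


def relative_gap(times_a, times_b):
    """Difference between the two means as a fraction of the larger mean."""
    mean_a, mean_b = mean(times_a), mean(times_b)
    return abs(mean_a - mean_b) / max(mean_a, mean_b)


# Placeholder values only, NOT real results: run times in seconds
# for completed tasks, grouped by the device BOINC reported.
device0 = [100.0, 102.0, 98.0, 101.0]
device1 = [200.0, 204.0, 196.0, 202.0]

for name, times in (("device 0", device0), ("device 1", device1)):
    avg, spread = summarize(times)
    print(f"{name}: mean={avg:.1f}s stdev={spread:.1f}s over {len(times)} tasks")
print(f"relative gap: {relative_gap(device0, device1):.2f}")
```

A gap that is large compared with the spread within each device, over a fair number of tasks, would suggest the difference is real rather than noise between individual work units.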
ID: 82720
Barker
Joined: 5 Nov 17
Posts: 6
South Africa
Message 82736 - Posted: 7 Nov 2017, 12:33:49 UTC - in response to Message 82720.  

Two GPUs running, each running one task - that's the default situation.
Temperature differences between otherwise identical cards running "identical" applications are not unknown, as they can be influenced by all sorts of things, including the exact loading that the two data sets present and the way the cooling air moves around the case; the list goes on. You really need to look at a fair number of tasks to see if there is a significant difference in run times between the two, not forgetting that driving a monitor may have an impact on the performance of one of the cards.


The thing is, GPU-Z shows that only one has load. Who is right, BOINC or GPU-Z?
ID: 82736
Richard Haselgrove
Volunteer moderator
Volunteer tester
Help desk expert
Joined: 5 Oct 06
Posts: 2832
United Kingdom
Message 82737 - Posted: 7 Nov 2017, 12:44:42 UTC - in response to Message 82719.  

Could you check the command line BOINC used to start the GPU tasks? This Super User Q&A shows more than enough ways to do that.
BOINC v7.8.2 will normally pass "which device to use" instructions via init_data.xml, rather than the command line - the command line option is present and used for legacy compatibility purposes only. I don't know whether the Moo Wrapper server has been kept up-to-date to use the newer mechanism.
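
To illustrate the two mechanisms, here is a rough Python sketch of how an application might decide which device to use. It is not the BOINC API and not the Moo! Wrapper code: the function name resolve_gpu_device is made up, and the legacy command line flag is assumed to take the form --device N.

```python
import re


def resolve_gpu_device(init_data_text, argv):
    """Pick a GPU device index (hypothetical helper, for illustration only).

    Order of preference:
    1. <gpu_device_num> from init_data.xml (the newer mechanism)
    2. '--device N' on the command line (assumed legacy form)
    3. fall back to device 0
    """
    match = re.search(
        r"<gpu_device_num>\s*(-?\d+)\s*</gpu_device_num>", init_data_text or ""
    )
    if match and int(match.group(1)) >= 0:
        return int(match.group(1))

    # Look for the flag followed by its value.
    for i, arg in enumerate(argv[:-1]):
        if arg == "--device":
            return int(argv[i + 1])

    return 0


print(resolve_gpu_device("<gpu_device_num>1</gpu_device_num>", []))
print(resolve_gpu_device("", ["app.exe", "--device", "1"]))
print(resolve_gpu_device("", ["app.exe"]))
```

The last call shows the failure mode worth ruling out: an application that reads neither source would put every task on device 0, even though the Manager lists them on different devices.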

@ Wayne,

Although I don't think it's implicated in this case, BOINC v7.8.2 was buggy - I'd recommend that you update to v7.8.3 when convenient and check again.
ID: 82737
Barker
Joined: 5 Nov 17
Posts: 6
South Africa
Message 82742 - Posted: 7 Nov 2017, 14:35:28 UTC - in response to Message 82737.  

Could you check the command line BOINC used to start the GPU tasks? This Super User Q&A shows more than enough ways to do that.
BOINC v7.8.2 will normally pass "which device to use" instructions via init_data.xml, rather than the command line - the command line option is present and used for legacy compatibility purposes only. I don't know whether the Moo Wrapper server has been kept up-to-date to use the newer mechanism.

@ Wayne,

Although I don't think it's implicated in this case, BOINC v7.8.2 was buggy - I'd recommend that you update to v7.8.3 when convenient and check again.


I upgraded to BOINC v7.8.3. The problem remains.

Below is an image of the commandline of the processes. No parameters are visible.

ID: 82742
Juha
Volunteer developer
Volunteer tester
Help desk expert
Joined: 20 Nov 12
Posts: 753
Finland
Message 82744 - Posted: 7 Nov 2017, 15:10:47 UTC - in response to Message 82742.  

Below is an image of the commandline of the processes. No parameters are visible.


It's odd that there is nothing at all. It should have shown the executable path and name, the same way a few other processes in the screenshot do.

Anyway, the next step is to look in init_data.xml. Each task has its own init_data.xml, and the files are located in the slot directories. Check the properties of the running GPU tasks and look at the directory entries there; they tell you which slots the tasks use.

Then open Explorer and go to C:\ProgramData\BOINC\slots\n, where n is the slot number. That directory may be hidden, so just copy and paste the path into the Explorer address field. Next, open the init_data.xml files in Notepad and find entries like these:

<gpu_type>NVIDIA</gpu_type>
<gpu_device_num>0</gpu_device_num>


I'm not sure if the GPU type is ATI or CAL in your case. If <gpu_device_num> matches what Manager tells you then this looks like a bug in dnetc_wrapper which you need to report to the project.
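
If you would rather not open each file by hand, the same check can be scripted. This is only a sketch: it assumes the default data directory given above and that each slot directory contains an init_data.xml with the two tags shown. The function names are made up for the example.

```python
import re
from pathlib import Path

# Default BOINC data directory on Windows, as given in the post above.
SLOTS_DIR = Path(r"C:\ProgramData\BOINC\slots")


def read_gpu_assignment(xml_text):
    """Return (gpu_type, gpu_device_num) as strings, or None where a tag is missing."""
    def tag(name):
        m = re.search(rf"<{name}>\s*([^<]*?)\s*</{name}>", xml_text)
        return m.group(1) if m else None
    return tag("gpu_type"), tag("gpu_device_num")


def scan_slots(slots_dir=SLOTS_DIR):
    """Map each slot directory name to the GPU assignment found in its init_data.xml."""
    results = {}
    if not slots_dir.is_dir():
        return results
    for init_file in sorted(slots_dir.glob("*/init_data.xml")):
        text = init_file.read_text(errors="replace")
        results[init_file.parent.name] = read_gpu_assignment(text)
    return results


if __name__ == "__main__":
    for slot, (gpu_type, device) in scan_slots().items():
        print(f"slot {slot}: gpu_type={gpu_type} gpu_device_num={device}")
```

Compare the printed device numbers with what the Manager shows for the tasks in those slots.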
ID: 82744
Barker
Joined: 5 Nov 17
Posts: 6
South Africa
Message 82748 - Posted: 7 Nov 2017, 17:27:50 UTC - in response to Message 82744.  

I'm not sure if the GPU type is ATI or CAL in your case. If <gpu_device_num> matches what Manager tells you then this looks like a bug in dnetc_wrapper which you need to report to the project.


Thanks, here are the images of the 2 files in the slots. I assume this shows that it's a bug in the dnetc wrapper?



ID: 82748
Juha
Volunteer developer
Volunteer tester
Help desk expert
Joined: 20 Nov 12
Posts: 753
Finland
Message 82750 - Posted: 7 Nov 2017, 18:12:22 UTC - in response to Message 82748.  

Thanks, here are the images of the 2 files in the slots. I assume this shows that it's a bug in the dnetc wrapper?


Yep.
ID: 82750
Richard Haselgrove
Volunteer moderator
Volunteer tester
Help desk expert
Joined: 5 Oct 06
Posts: 2832
United Kingdom
Message 82753 - Posted: 7 Nov 2017, 18:46:34 UTC

Moo! Wrapper Windows app 1.04 (ati14) is quite old, dating from 6 Sep 2014.

I was reminded of commit befb90f, from around the same time. I presume those two have worked together for the last three years: which does raise the possibility that we broke something in 7.8?
ID: 82753
Juha
Volunteer developer
Volunteer tester
Help desk expert
Joined: 20 Nov 12
Posts: 753
Finland
Message 82756 - Posted: 7 Nov 2017, 19:24:14 UTC - in response to Message 82753.  

Or Moo missed the API change, like Jason did with Seti's CUDA apps, and Barker is the first one to run dual CAL GPUs since the API change.
ID: 82756
Barker
Joined: 5 Nov 17
Posts: 6
South Africa
Message 82757 - Posted: 7 Nov 2017, 19:31:35 UTC - in response to Message 82756.  

I had a look through the computers participating on Moo, and there are several with legacy ATI cards. So I guess it's quite possible that no one has picked up on it, especially with BOINC indicating that the tasks are distributed correctly. I only noticed because one card was running so much hotter than the other, and on inspection one had load and the other didn't.
ID: 82757

Copyright © 2018 University of California. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.