Losing use of GPU

Vig
Joined: 10 Sep 17
Posts: 2
United States
Message 81016 - Posted: 10 Sep 2017, 13:48:41 UTC
Last modified: 10 Sep 2017, 13:53:22 UTC

I was hoping someone would be able to help me track down why one instance of BOINC eventually loses the use of my GPUs.

I have two instances of BOINC currently running on almost identical computers (Windows 7). The one in question is connected to a display via HDMI. When it starts up, the GPUs are being used as expected. After hours of running I find that the GPUs are not being used; however, GPU tasks are still being crunched. This is not project specific, as it has happened with multiple projects. It is also not BOINC version dependent, as it has happened with both 7.6.33 and 7.8.2. Rebooting the computer restores use of the GPUs until time passes again.

I am confident that the computer in question is not overheating. Drivers are updated to latest versions. Does anyone have any advice on other items I could check?

I am currently running BOINC 7.8.2 64-bit.
ID: 81016
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15477
Netherlands
Message 81018 - Posted: 10 Sep 2017, 14:01:23 UTC - in response to Message 81016.  
Last modified: 10 Sep 2017, 14:03:41 UTC

"After hours of running I find that the GPUs are not being used; however, GPU tasks are still being crunched."
What do you mean by this?
How are you measuring whether the GPUs are being used?

"I am confident that the computer in question is not overheating."
How are you sure of that? What are you using to check that?

"Drivers are updated to latest versions."
Sorry, but this isn't a useful statement, because you don't say which drivers you use, or even what brand and model of GPU(s) you use. Did you at least get the drivers from the GPU manufacturer's website?
Otherwise, please check this thread for the minimum information we need.
ID: 81018
Vig
Joined: 10 Sep 17
Posts: 2
United States
Message 81019 - Posted: 10 Sep 2017, 14:10:21 UTC - in response to Message 81018.  
Last modified: 10 Sep 2017, 14:12:43 UTC

"What do you mean by this? How are you measuring whether the GPUs are being used?"

Task runtime and power draw.
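
For anyone wanting to check the runtime side of this more systematically, here is a minimal Python sketch. It assumes you have collected task runtimes by hand (for example from a project's task list); the function name, the sample numbers, and the 3x slowdown threshold are all hypothetical choices for illustration, not anything BOINC provides.

```python
from statistics import median


def flag_gpu_fallback(runtimes_s, baseline_count=5, slowdown_factor=3.0):
    """Return indices of tasks whose runtime suggests the GPU was not used.

    runtimes_s: task runtimes in seconds, oldest first (hand-collected data).
    baseline_count: how many of the earliest tasks define "normal" speed.
    slowdown_factor: how many times slower than normal counts as suspicious.
    All three parameters are assumptions made for this sketch.
    """
    if len(runtimes_s) <= baseline_count:
        return []  # not enough data to compare against a baseline
    baseline = median(runtimes_s[:baseline_count])
    threshold = baseline * slowdown_factor
    return [
        i
        for i, t in enumerate(runtimes_s)
        if i >= baseline_count and t > threshold
    ]


# Made-up runtimes: normal tasks around 600 s, then a jump after the GPU drops out.
runtimes = [610, 598, 605, 620, 601, 607, 2950, 3100, 3020]
print(flag_gpu_fallback(runtimes))
```

A sudden, sustained jump in runtime like this, together with a drop in power draw, is what points to the GPU no longer doing the work.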

"I am confident that the computer in question is not overheating."
"How are you sure of that? What are you using to check that?"

This was the first item I checked. You'll have to trust me that what I'm saying is accurate or we won't get much of anywhere.

"Drivers are updated to latest versions."
"Sorry, but this isn't a useful statement, because you don't say which drivers you use, or even what brand and model GPU(s) you use. Did you at least get the drivers from the GPU manufacturer's website?"

AMD RX 480 GPUs with 17.9.1 ReLive drivers directly from AMD. If the drivers were an issue, both systems should be having problems, as they are set up identically. It has to be something specific to the one instance and not the other.
ID: 81019
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15477
Netherlands
Message 81020 - Posted: 10 Sep 2017, 14:22:44 UTC - in response to Message 81019.  

"AMD RX 480 GPUs with 17.9.1 ReLive drivers directly from AMD. If the drivers were an issue, both systems should be having problems, as they are set up identically. It has to be something specific to the one instance and not the other."
Machines that are set up identically aren't necessarily identical. Even if all the hardware is identical in brand and model, different outcomes are possible because different production materials were used. Even the position of the computers can matter in how much heat they expel.

As for taking someone at their word, I'd rather have some software confirm it.
I use a Sapphire RX 470 - 8GB with ReLive 17.5.2 and keep an eye on it and the rest of my system using Speedfan and Sapphire Trixx.
It's been running Seti tasks for the past several hours without problems, at a sustained 72 degrees centigrade, with a max of 37% fan speed. None of my tasks error out, and none have their application continue while the GPU is no longer under load.

So the rest of the questions:
- which project(s) is (are) affected?
- which applications lose contact with the GPU(s)?
- just one GPU? or both?
- what brand and model power supply unit is installed in that system?
- anti-virus or other anti-malware software? does it have the BOINC data directory excluded from being (actively) scanned?
- any more information that you can give?
ID: 81020
rainer1973at
Joined: 11 Sep 17
Posts: 2
Austria
Message 81045 - Posted: 11 Sep 2017, 8:45:08 UTC

I also have the same problem with my graphics card:
AMD RX 480 GPUs with 17.9.1 ReLive drivers directly from AMD.

BOINC is not using my GPU.
ID: 81045
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15477
Netherlands
Message 81047 - Posted: 11 Sep 2017, 10:25:06 UTC - in response to Message 81045.  
Last modified: 11 Sep 2017, 10:25:35 UTC

Same drivers as the other guy, so I'd first go and try older ones: https://support.amd.com/en-us/download/desktop/previous?os=Windows%207%20-%2064
Make sure to use the AMD Clean Uninstall Utility (https://support.amd.com/en-us/kb-articles/Pages/AMD-Clean-Uninstall-Utility.aspx) to uninstall the present drivers, and do the necessary reboot(s), then install the older ones. For a good test, and because I don't have any problems with them, try the 17.5.2s.
ID: 81047
rainer1973at
Joined: 11 Sep 17
Posts: 2
Austria
Message 81213 - Posted: 14 Sep 2017, 8:23:56 UTC - in response to Message 81047.  

Hi,

If I understand correctly, I must go back to an older driver version to use the GPU with BOINC?

I don't think that's the right way. BOINC should release a version that can use the GPU with the newest graphics card drivers.

So I'll wait before installing an old driver version.

Thanks, rainer
ID: 81213
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15477
Netherlands
Message 81219 - Posted: 14 Sep 2017, 9:55:31 UTC - in response to Message 81213.  
Last modified: 14 Sep 2017, 12:27:08 UTC

Yes, you should try to go back to a previous driver version to see if it does the same thing. Especially if there are multiple people with the same problem with that driver version, but not with previous driver versions.

What I think happens is that at some point the drivers crash in Windows, possibly due to Windows itself or due to the drivers being unstable, causing BOINC to lose the connection with the GPU while the applications continue without it. That's not something a newer BOINC is going to fix, because BOINC uses neither the GPU nor the drivers.

BOINC doesn't use your GPU, it merely detects it so that projects can use it.
It evidently can detect your GPU with your current drivers, or else the project(s) you run could not use your GPU at all. So the detection of the GPU isn't the problem.
ID: 81219
Yavanius
Joined: 19 May 15
Posts: 123
Antarctica
Message 81512 - Posted: 21 Sep 2017, 0:09:00 UTC - in response to Message 81213.  


"If I understand correctly, I must go back to an older driver version to use the GPU with BOINC?

I don't think that's the right way. BOINC should release a version that can use the GPU with the newest graphics card drivers.

So I'll wait before installing an old driver version."



The latest driver doesn't necessarily equate to the best performance. Sometimes a new driver isn't properly vetted before being released or certain apps have issues interfacing the GPU with the new drivers. (On the flip side of that, if you have rather old drivers, new drivers can help.) Also, sometimes new drivers can change how your GPUs work and even reset custom settings.

If you're having problems now across multiple projects where you were just fine before, it's best to go to AMD's forums and check whether anybody else is having the same issues. If not, you should report it so they are aware of it. Sometimes it's an "Oops, we goofed in some routine or we didn't realize it was going to react like that." This is nothing new to BOINC, as old posts across projects show, and it is not exclusive to AMD.

In the meantime, you can go back to your previous drivers to keep crunching until the issue is resolved. BOINC doesn't have anything to do with using your GPU. It checks for a GPU and then passes that info along to the project site, which determines whether A) it has a client for your GPU and B) that client will work with your GPU. Each project programs its own GPU client(s).
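
To make that split of responsibilities a bit more concrete, here is a rough Python sketch. It is not BOINC's or any project's actual code: the class names, fields, application names, and the driver-version comparison are all hypothetical, and real projects may decide compatibility quite differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DetectedGpu:
    """What the client reports after detection (hypothetical fields)."""
    vendor: str
    model: str
    driver_version: tuple


@dataclass
class AppVersion:
    """One GPU application offered by a project (hypothetical fields)."""
    name: str
    vendor: str
    min_driver: tuple


def pick_app(gpu: Optional[DetectedGpu], apps: list) -> Optional[AppVersion]:
    """Project-side choice: A) is there a client for this GPU, B) will it work?"""
    if gpu is None:
        return None  # nothing detected, so nothing to offer
    for app in apps:
        has_client = app.vendor == gpu.vendor              # A) client exists
        will_work = gpu.driver_version >= app.min_driver   # B) assumed check
        if has_client and will_work:
            return app
    return None


# Made-up example data.
gpu = DetectedGpu(vendor="AMD", model="RX 480", driver_version=(17, 9, 1))
apps = [
    AppVersion(name="example_opencl_nvidia", vendor="NVIDIA", min_driver=(0,)),
    AppVersion(name="example_opencl_amd", vendor="AMD", min_driver=(17, 1, 1)),
]
chosen = pick_app(gpu, apps)
print(chosen.name if chosen else "no GPU app available")
```

Note that nothing in this sketch runs on the GPU from the client side; the actual GPU work happens inside the project's application, which is why a driver crash affects the project apps rather than something a new BOINC release could fix.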

You can also try posting to the project websites. I know Einstein is pretty responsive to issues and may be able to assist you more.
ID: 81512

Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.