AMD driver discovery problem revisited

Message boards : Questions and problems : AMD driver discovery problem revisited
Profile Joseph Stateson
Volunteer tester
Joined: 27 Jun 08
Posts: 641
United States
Message 91479 - Posted: 14 May 2019, 13:56:49 UTC
Last modified: 14 May 2019, 14:03:15 UTC

On a working system with this coproc_info.xml file, I did a disk cleanup, removed all Windows 10 restore points, and created a new restore point, since I wanted to try a different AMD driver for my S9x00 cards.

I rebooted and brought up BOINC to make sure all was OK before installing AMD's 19Q2 package.
Unaccountably, BOINC did not see any video boards. The coproc_info.xml file now contained the following:
<coprocs>
    <warning>No NVIDIA library found</warning>
    <warning>calInit() returned 1</warning>
    <warning>clGetPlatformIDs() failed to return any OpenCL platforms</warning>
</coprocs>
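
For anyone who wants to check this file without opening it by hand, here is a minimal sketch that pulls the warnings out of a coproc_info.xml. It relies only on the `<coprocs>` and `<warning>` elements shown above; the function name `coproc_warnings` is made up for illustration, and the sample text is simply the fragment from this post.

```python
import xml.etree.ElementTree as ET


def coproc_warnings(xml_text):
    """Return the text of every <warning> element found in the document."""
    root = ET.fromstring(xml_text)
    return [w.text.strip() for w in root.iter("warning") if w.text]


# Sample input: the fragment quoted in the post above.
sample = """<coprocs>
    <warning>No NVIDIA library found</warning>
    <warning>calInit() returned 1</warning>
    <warning>clGetPlatformIDs() failed to return any OpenCL platforms</warning>
</coprocs>"""

warnings = coproc_warnings(sample)
for text in warnings:
    print("warning:", text)
```

To run it against a real file, read the file's contents into a string and pass that to `coproc_warnings` instead of `sample`.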


I had made a copy of the coproc_info.xml file before the cleanup, so I put it back and marked it read-only so it would not be overwritten. I restarted BOINC and the GPUs were recognized.
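
The restore-and-lock workaround can be scripted. This is a rough sketch only: the helper name `restore_read_only` and the file names are hypothetical, and the demo runs in a temporary folder rather than the real BOINC data directory.

```python
import os
import shutil
import stat
import tempfile


def restore_read_only(backup_path, target_path):
    """Copy a known-good backup over the target, then clear its write bits."""
    if os.path.exists(target_path):
        # The target may already be read-only from an earlier run.
        os.chmod(target_path, stat.S_IREAD | stat.S_IWRITE)
    shutil.copyfile(backup_path, target_path)
    os.chmod(target_path, stat.S_IREAD)  # sets the read-only attribute on Windows


# Demo in a scratch folder (hypothetical file names and contents).
with tempfile.TemporaryDirectory() as folder:
    backup = os.path.join(folder, "coproc_info.backup.xml")
    target = os.path.join(folder, "coproc_info.xml")
    with open(backup, "w") as f:
        f.write("<coprocs>known good</coprocs>")
    with open(target, "w") as f:
        f.write("<coprocs></coprocs>")

    restore_read_only(backup, target)

    with open(target) as f:
        restored_text = f.read()
    mode_after = stat.S_IMODE(os.stat(target).st_mode)

    # Make the file writable again so the scratch folder can be cleaned up.
    os.chmod(target, stat.S_IREAD | stat.S_IWRITE)
```

Note that this only hides the detection problem: the client keeps using whatever the backup file says, even if the driver underneath has changed.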

This system does not have any NVIDIA boards, so obviously (I would think) the call to find the library is going to fail. Or maybe, since I did a cleanup, some old version of the library got removed and BOINC is confused and does not look for the AMD stuff.

This really needs to be fixed.

[EDIT] If I could build the client under VS2017, or Microsoft's latest and greatest, I could possibly debug this problem. It might help if there were a "lightweight" Windows-only version available that does not include the proverbial kitchen sink.
ID: 91479
Profile Joseph Stateson
Volunteer tester
Joined: 27 Jun 08
Posts: 641
United States
Message 91515 - Posted: 16 May 2019, 0:08:40 UTC
Last modified: 16 May 2019, 0:13:20 UTC

Follow-up info that might be useful for debugging.

1. The file I listed, the working "coproc_info.xml", actually had some wrong information, but it seems there was enough correct info for OpenCL to work. All boards were identified as S9100 with a clock speed of 900, but only the third from the top was an S9100; the rest were S9000. That coproc_info.xml was created using the 10-10-2018 AMD driver, not the new one I am testing, the "19Q2".

2. The following Microsoft updates came in at 4am or so this morning



About 10am I had to shut down that system and it did not survive the reboot. All those updates were "uninstalled" as Windows recovered to the last valid state. Unfortunately, that state was the one that had the problem of the BOINC client not seeing the GPUs.
I looked in windows\system32 and there was an opencl.dll dated about 10-10-2018, so that seemed to match the 2018 driver.

I re-installed the AMD "19Q2" driver and noticed that opencl.dll was now dated 5-6-2019, which is the date of the 19Q2 AMD release.
The BOINC client started up and ran fine, and I noticed that the coproc_info.xml file had all the correct info. All boards were correctly identified and the clock speeds were now correct. It seems that the 2019 AMD drivers do a better job of identifying their own boards. Everything seems to be working fine, BUT the "Windows 10 update center" has 3 installs (those same 3 above) pending a reboot.
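
The date check on opencl.dll is easy to repeat after each driver install or Windows update. A small sketch, assuming the usual System32 and SysWOW64 locations under `%SystemRoot%`; the helper name `file_date` is made up, and the file's modified date is only a hint about which driver package it came from.

```python
import datetime
import os


def file_date(path):
    """Return the last-modified date of a file, or None if it is missing."""
    try:
        return datetime.date.fromtimestamp(os.path.getmtime(path))
    except OSError:
        return None


# Assumed default locations; SystemRoot is normally C:\Windows.
system_root = os.environ.get("SystemRoot", r"C:\Windows")
for folder in ("System32", "SysWOW64"):
    dll_path = os.path.join(system_root, folder, "OpenCL.dll")
    print(dll_path, "->", file_date(dll_path))
```

A result of `None` means the file was not found at that path, which is the situation described above after the rollback.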

I am going to try the following, but it is just a guess that it will work.
1. Prevent BOINC from starting automatically. I am guessing that BOINC and/or the drivers crash and Windows thinks there was a problem with the install, which caused a fallback.
2. If I can, install each update individually to see if I can figure out where the problem is. I am not sure how to do this, as they are already downloaded and "pending". There is probably a script somewhere that I can edit to prevent the install.
3. Once the update is OK, verify that opencl.dll is still in system32, and if the drivers look OK, start BOINC.

QUESTION FOR DEVELOPERS (or anyone): What happens if the system has GPUs from 3 different manufacturers, i.e. an Intel GPU on the mobo, an NVIDIA board and a Radeon board? Since opencl.dll is in system32 and seems to be supplied by the vendor, which one is the "better one"? Putting in an older NVIDIA driver might toss the newer and better AMD one, or vice versa.
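
One way to see what the installed opencl.dll actually exposes is to ask it for its platform list, which is the same `clGetPlatformIDs()` call that appears in the BOINC warning. The sketch below uses Python's ctypes with the standard OpenCL entry points `clGetPlatformIDs` and `clGetPlatformInfo`. It assumes the library can be loaded under one of the names tried; `load_opencl` and `list_platforms` are made-up helper names. As far as I know, the copy in system32 is normally a vendor-neutral loader that passes calls on to each vendor's driver, so on a mixed system each vendor should show up as its own platform, but treat that as something to verify on your own machine.

```python
import ctypes
import ctypes.util

# Constants from the OpenCL headers.
CL_SUCCESS = 0
CL_PLATFORM_VERSION = 0x0901
CL_PLATFORM_NAME = 0x0902
CL_PLATFORM_VENDOR = 0x0903


def load_opencl():
    """Try a few common library names; return None if none can be loaded."""
    loader = getattr(ctypes, "WinDLL", ctypes.CDLL)  # WinDLL exists only on Windows
    for name in ("OpenCL", "libOpenCL.so.1", ctypes.util.find_library("OpenCL")):
        if not name:
            continue
        try:
            return loader(name)
        except OSError:
            continue
    return None


def list_platforms():
    """Return a list of (name, vendor, version) tuples, or None if no library."""
    lib = load_opencl()
    if lib is None:
        return None
    lib.clGetPlatformIDs.argtypes = [
        ctypes.c_uint, ctypes.POINTER(ctypes.c_void_p), ctypes.POINTER(ctypes.c_uint)]
    lib.clGetPlatformIDs.restype = ctypes.c_int
    lib.clGetPlatformInfo.argtypes = [
        ctypes.c_void_p, ctypes.c_uint, ctypes.c_size_t,
        ctypes.c_void_p, ctypes.POINTER(ctypes.c_size_t)]
    lib.clGetPlatformInfo.restype = ctypes.c_int

    count = ctypes.c_uint(0)
    if lib.clGetPlatformIDs(0, None, ctypes.byref(count)) != CL_SUCCESS:
        return []  # the library loaded but reported no platforms
    if count.value == 0:
        return []
    ids = (ctypes.c_void_p * count.value)()
    if lib.clGetPlatformIDs(count.value, ids, None) != CL_SUCCESS:
        return []

    def info(platform_id, param):
        buf = ctypes.create_string_buffer(256)
        status = lib.clGetPlatformInfo(platform_id, param, ctypes.sizeof(buf), buf, None)
        return buf.value.decode(errors="replace") if status == CL_SUCCESS else ""

    return [(info(p, CL_PLATFORM_NAME),
             info(p, CL_PLATFORM_VENDOR),
             info(p, CL_PLATFORM_VERSION)) for p in ids]


platforms = list_platforms()
if platforms is None:
    print("No OpenCL library could be loaded")
elif not platforms:
    print("OpenCL library loaded, but it returned no platforms")
else:
    for name, vendor, version in platforms:
        print(f"{name} | {vendor} | {version}")
```

If this prints "no platforms" while the GPUs are physically present, the library is there but no vendor driver is registered with it, which would match the third warning in the coproc_info.xml from the first post.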
ID: 91515
Profile Keith Myers
Volunteer tester
Help desk expert
Joined: 17 Nov 16
Posts: 867
United States
Message 91517 - Posted: 16 May 2019, 2:38:37 UTC - in response to Message 91479.  

The warning about no NVIDIA library being found is just part of DA's code. He tests for everything and the warning is benign. I asked a similar question because of a reported warning and found the answer in the code once Richard pointed me at the relevant part, which I then walked through. Without knowing that this is the normal response when a library is not found, it can throw someone off on a tangent.
ID: 91517
Profile Joseph Stateson
Volunteer tester
Joined: 27 Jun 08
Posts: 641
United States
Message 91529 - Posted: 17 May 2019, 17:20:34 UTC

Follow-up on my previous post. It would be nice if one could add comments to an existing post instead of posting a new message.

I have my S9x00 boards working fine with AMD's 2015 driver and have been able to set the clock speed, whereas I could not do that with AMD's latest "Pro Series". However, the OpenCL is old (2015), as shown by clinfo.exe. The 2015 driver is the one you get if you ask Microsoft to get the latest.

I then downloaded and extracted AMD_OpenCL64.dll from both AMD's 2018Q4 and 2019Q2 releases and put those in \windows\system32 and also in \windows\SysWOW64, replacing the 2015 opencl.dll.

clinfo.exe showed I had the latest OpenCL, but all Milkyway tasks errored out with either of those two. I had suspended all but a few MW tasks, as I did not want 800+ tasks to error out in a couple of seconds like they did a week ago. It looks like I am stuck with the correct driver but a 4-year-old OpenCL library.

Maybe a developer could shed some light on this. How can I upgrade the OpenCL library but still keep video drivers that were designed for the board?
ID: 91529
Profile Keith Myers
Volunteer tester
Help desk expert
Joined: 17 Nov 16
Posts: 867
United States
Message 91530 - Posted: 17 May 2019, 18:05:18 UTC - in response to Message 91529.  

What project or app is complaining of using the four year old opencl.dll? I believe almost all the projects are written to use OpenCL 1.2 and I don't know of any that require OpenCL 2.0.
ID: 91530
Profile Joseph Stateson
Volunteer tester
Joined: 27 Jun 08
Posts: 641
United States
Message 91531 - Posted: 17 May 2019, 18:21:42 UTC - in response to Message 91530.  
Last modified: 17 May 2019, 18:34:28 UTC

What project or app is complaining of using the four year old opencl.dll? I believe almost all the projects are written to use OpenCL 1.2 and I don't know of any that require OpenCL 2.0.


I wanted to use a newer OpenCL as the old one possibly has a bug. About 1 out of 5 work units are invalid on this driver (the 2015 one), but I have gotten newer drivers to run without any invalids. This is very difficult to repeat, as after a Microsoft update the OpenCL seems to have been removed, and I have trouble re-installing the one that gave no invalid results.

Virtually every time a major release reboots, I lose the working OpenCL driver. I sometimes see a notification that AMD has restored something (I can't find out what, as it appears only momentarily in the bottom right corner of Windows 10), and I end up doing a clean install of, usually, the 2018Q4 driver to fix the problem. This just happened with the Tuesday release this week, but I cannot seem to get that driver to work again. It used to do a work unit in an average of 10 seconds (4 boards) with no invalids, but now I have 200+ invalid results against 8500 valid ones. I am guessing the 2018 driver (OpenCL) has something in it that works better with Milkyway, but Windows keeps rebooting and AMD then changes things.

I did go to Task Scheduler and stopped the AMD updater from doing its thing. Do not know how to get windows to stop updating.
I would use Ubuntu, but I had a problem with the latest AMD Ubuntu install.

[EDIT] Going to try to get the system to work with no invalids. This particular system has unusual boards, three S9000 and one S9100, which are not common. However, AMD does support them in 2019server, and I tried that install (the 2019svr AMD update), which installed with no error, but BOINC did not see any GPU, so I put the 2015 driver back in. I may give Ubuntu a try, but I will avoid the most recent release and try to find one that matches the latest AMD driver. I think you run Linux; can you recommend a release that is known to work with the latest AMD drivers? This assumes the latest drivers still work with S9x00 cards.
ID: 91531
Profile Keith Myers
Volunteer tester
Help desk expert
Joined: 17 Nov 16
Posts: 867
United States
Message 91533 - Posted: 17 May 2019, 20:22:38 UTC - in response to Message 91531.  

I'm not really familiar with AMD graphics having only run Nvidia. I see weekly updates to the Radeon drivers for Linux on news articles at Phoronix.com. When I searched for AMD drivers for Linux, I landed on
https://drivers.amd.com/drivers/linux/amdgpu-pro-19.10-782345-ubuntu-18.04.tar.xz
which covers the S9000 and S9100. But then I see a note down in Driver Details which mentions
Note: Customers who have upgraded to the latest 4.15 Kernel for Ubuntu will need to use an 18.20 based driver such as Radeon™ Pro Software Adrenalin Edition 18.Q3 for Linux.

which doesn't make sense because Ubuntu 18.04.2 has the 4.15 Kernel. That Radeon™ Pro Software Adrenalin Edition 18.Q3 for Linux doesn't cover the S9000 or S9100 cards. Huh?

I would stick to the long-term releases like 16.04 or 18.04. I think Ubuntu 18.04 is a very stable release. I had a test partition with 18.10 for a while, which I updated to the 19.04 release. That release is very fast compared to 18.04, but it ships with GLIBC 2.29 versus the GLIBC 2.27 of 18.04. That threw a monkey wrench into my normal reason for having that test partition, testing and compiling, because any executable built there is not compatible with the earlier Linux versions. Based on the speed of 19.04, I think that when the 20.04 LTS release debuts next year, I will be doing a distribution upgrade.
ID: 91533
Profile Dave
Help desk expert

Joined: 28 Jun 10
Posts: 2533
United Kingdom
Message 91536 - Posted: 18 May 2019, 5:58:58 UTC - in response to Message 91531.  

Do not know how to get windows to stop updating.


I have suggested this on another forum but no one who uses windows has commented on my suggestion. How about blocking the M$ domain(s) in your router?
ID: 91536
Profile Joseph Stateson
Volunteer tester
Joined: 27 Jun 08
Posts: 641
United States
Message 91537 - Posted: 18 May 2019, 7:24:37 UTC - in response to Message 91536.  

Do not know how to get windows to stop updating.


I have suggested this on another forum but no one who uses windows has commented on my suggestion. How about blocking the M$ domain(s) in your router?


Yes, I could do it, but only for specific systems. I have a Ubiquiti edge router that has IP cameras on it. I could not figure out how to stop them from phoning home, so I put them on a subnet by themselves. It didn't stop them from phoning home, but they can't access my network, which was a security concern. I could move the problem systems to the subnet, but I do not know how to block a domain. I may ask at the Ubiquiti forum. I am not a net expert.
ID: 91537
Profile Joseph Stateson
Volunteer tester
Joined: 27 Jun 08
Posts: 641
United States
Message 91543 - Posted: 18 May 2019, 16:21:08 UTC

This system has the gpedit app, unlike Windows "Home", so I will have another try at blocking updates.
Item 3 here says to enable update configuration (I had it disabled, which had no effect). I suspect it won't work as the suggestion is dated 2015, but maybe?

My modem, an Arris BCW210, does not allow blocking, and my edge router would require a cable run to the problem system, which is a PITA.
Putting "127.0.0.1 microsoft.com" into the hosts file will block browsers, but I suspect Windows bypasses that.
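
One limitation of the hosts approach, separate from whether Windows honors it: a hosts entry matches one exact hostname, with no wildcards, so a line for microsoft.com does not cover its subdomains. A minimal sketch that shows this on sample text; the helper names are made up, and `update.microsoft.com` is used purely as an example of a subdomain.

```python
LOOPBACK_ADDRESSES = ("127.0.0.1", "0.0.0.0", "::1")


def blocked_hosts(hosts_text):
    """Return the set of hostnames that a hosts file maps to a loopback address."""
    blocked = set()
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        address, *names = line.split()
        if address in LOOPBACK_ADDRESSES:
            blocked.update(name.lower() for name in names)
    return blocked


def is_blocked(hostname, hosts_text):
    """Exact-match lookup, which is how hosts entries are applied."""
    return hostname.lower() in blocked_hosts(hosts_text)


# Sample hosts content with the entry from the post above.
sample_hosts = """
127.0.0.1 localhost
127.0.0.1 microsoft.com   # attempted block
"""

print(is_blocked("microsoft.com", sample_hosts))
print(is_blocked("update.microsoft.com", sample_hosts))
```

So even where the hosts file is respected, each hostname to be blocked would need its own line.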
ID: 91543

Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.