Radeon R9 290X not detected by BOINC

CX99-OCE
Joined: 27 Nov 15
Posts: 6
United States
Message 65647 - Posted: 27 Nov 2015, 1:31:10 UTC

Hi, I seem to be having some trouble getting BOINC to detect and use my 290X. I am running Xubuntu 14.04 with the AMD fglrx driver installed. I have read here: http://boinc.berkeley.edu/dev/forum_thread.php?id=7681#4467 that BOINC must have a symbolic link to the OpenCL libraries in order to function properly. BOINC is installed under /var/lib/ on my computer, whereas the OpenCL libraries are under /usr/lib/x86_64-linux-gnu. Where would I have to place the symbolic link for the OpenCL library so that BOINC in /var/lib can use the GPU?
ID: 65647
ChristianB
Volunteer developer
Volunteer tester
Joined: 4 Jul 12
Posts: 321
Germany
Message 65651 - Posted: 27 Nov 2015, 19:37:38 UTC

That is a fairly old thread using a 7.0.x Client version. You should have something newer. Can you post the startup log messages? Those are the first 10 to 15 lines in Tools -> Event Log.

In order to use the card you also need an OpenCL library. You can try installing the boinc-amd-opencl package; that should do the trick.
ID: 65651
Juha
Volunteer developer
Volunteer tester
Help desk expert
Joined: 20 Nov 12
Posts: 801
Finland
Message 65652 - Posted: 27 Nov 2015, 20:43:19 UTC

CX, the symlink goes in the same directory as libOpenCL.so.1.

Christian, this is a bug in the client, but so far no one has been able to convince Rom and David. The client is trying to open libOpenCL.so, but the symlink libOpenCL.so -> libOpenCL.so.1 was removed ages ago from the packages that AMD, NVIDIA and Debian/Ubuntu make.

And even if the symlink existed, it would still be wrong to try to open it. If you look at, for example, Einstein's OpenCL apps, they want libOpenCL.so.1. For the check in the client to be useful, it should be checking for whatever the apps want.
ID: 65652
ChristianB
Volunteer developer
Volunteer tester
Joined: 4 Jul 12
Posts: 321
Germany
Message 65654 - Posted: 28 Nov 2015, 8:38:11 UTC

I didn't know that. Do you know why the libOpenCL.so -> libOpenCL.so.1 symlink was removed? Normally it's some kind of version incompatibility, I think. Would it be safe to look for libOpenCL.so first and, if it is not found, then for libOpenCL.so.1?

I agree that the client should report what the projects need, but it also has to make sure that it is usable by projects.
ID: 65654
CX99-OCE
Joined: 27 Nov 15
Posts: 6
United States
Message 65657 - Posted: 28 Nov 2015, 16:04:03 UTC

Sat 28 Nov 2015 09:27:04 AM CST | | Starting BOINC client version 7.2.42 for x86_64-pc-linux-gnu
Sat 28 Nov 2015 09:27:04 AM CST | | log flags: file_xfer, sched_ops, task
Sat 28 Nov 2015 09:27:04 AM CST | | Libraries: libcurl/7.35.0 OpenSSL/1.0.1f zlib/1.2.8 libidn/1.28 librtmp/2.3
Sat 28 Nov 2015 09:27:04 AM CST | | Data directory: /var/lib/boinc-client
Sat 28 Nov 2015 09:27:04 AM CST | | OpenCL CPU: Intel(R) Core(TM) i7-5820K CPU @ 3.30GHz (OpenCL driver vendor: Advanced Micro Devices, Inc., driver version 1729.3 (sse2,avx), device version OpenCL 1.2 AMD-APP (1729.3))
Sat 28 Nov 2015 09:27:04 AM CST | | No usable GPUs found
Sat 28 Nov 2015 09:27:04 AM CST | | Host name: C7X99-OCE
Sat 28 Nov 2015 09:27:04 AM CST | | Processor: 12 GenuineIntel Intel(R) Core(TM) i7-5820K CPU @ 3.30GHz [Family 6 Model 63 Stepping 2]
Sat 28 Nov 2015 09:27:04 AM CST | | Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid
Sat 28 Nov 2015 09:27:04 AM CST | | OS: Linux: 3.16.0-53-generic
Sat 28 Nov 2015 09:27:04 AM CST | | Memory: 15.51 GB physical, 15.83 GB virtual
Sat 28 Nov 2015 09:27:04 AM CST | | Disk: 424.13 GB total, 372.47 GB free
Sat 28 Nov 2015 09:27:04 AM CST | | Local time is UTC -6 hours
Sat 28 Nov 2015 09:27:04 AM CST | | Config: GUI RPCs allowed from:

This seems to be the relevant data from the event log ChristianB requested.
Thanks
ID: 65657
CX99-OCE
Joined: 27 Nov 15
Posts: 6
United States
Message 65658 - Posted: 28 Nov 2015, 16:09:46 UTC

I found it interesting that the AMD OpenCL drivers were detected, but not the physical hardware itself.
ID: 65658
floyd
Help desk expert
Joined: 23 Apr 12
Posts: 77
Message 65660 - Posted: 28 Nov 2015, 17:46:25 UTC - in response to Message 65658.  

I found it interesting that the AMD OpenCL drivers were detected, but not the physical hardware itself.

That happens if the fglrx module is not loaded, or perhaps if the user running BOINC lacks permission. "xhost +si:localuser:boinc" would be the thing to try in the latter case.
ID: 65660
Juha
Volunteer developer
Volunteer tester
Help desk expert
Joined: 20 Nov 12
Posts: 801
Finland
Message 65661 - Posted: 28 Nov 2015, 20:30:22 UTC - in response to Message 65654.  

I didn't know that. Do you know why the libOpenCL.so -> libOpenCL.so.1 symlink was removed? Normally it's some kind of version incompatibility, I think. Would it be safe to look for libOpenCL.so first and, if it is not found, then for libOpenCL.so.1?


It is my understanding that symlinks are supposed to be installed by dev packages. This way when you link a program you'll be linking against the library version that matches the headers.

Let's say at some point a new OpenCL ICD loader API is made and the library for that is called libOpenCL.so.2. If you now opened libOpenCL.so, you wouldn't know which version you'd get and wouldn't know which API to use. You'd want to open both versioned library names and report back to the servers which API the host supports, or whether it supports both.

That said, I'd accept opening both files as a compromise. Right now the biggest problem for Linux users is getting GPU computing working. Fixing the OpenCL detection one way or the other would help.
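
A rough sketch of the "open both versioned names" idea described above, for illustration only. The function name probe_opencl_loaders and the candidate list are assumptions made up for this example and are not taken from the BOINC client; libOpenCL.so.2 is the hypothetical future name from the post and does not exist today.

```cpp
// Hypothetical sketch: try each versioned loader name and record which ones
// the dynamic linker can actually load. Not taken from the BOINC source.
#include <dlfcn.h>
#include <string>
#include <vector>

// Returns the subset of `names` that dlopen() can load on this host,
// in the same order as they were given.
std::vector<std::string> probe_opencl_loaders(const std::vector<std::string>& names) {
    std::vector<std::string> found;
    for (const std::string& name : names) {
        // RTLD_LAZY is enough for a presence check; no symbol is called here.
        void* handle = dlopen(name.c_str(), RTLD_LAZY);
        if (handle) {
            found.push_back(name);
            dlclose(handle);
        }
    }
    return found;
}
```

Calling probe_opencl_loaders({"libOpenCL.so.1", "libOpenCL.so.2"}) would give the list of loader versions a client could report back to the servers. On older glibc versions the program may need to be linked with -ldl.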
ID: 65661
CX99-OCE
Joined: 27 Nov 15
Posts: 6
United States
Message 65662 - Posted: 29 Nov 2015, 4:14:15 UTC - in response to Message 65660.  

I tried loading the fglrx module and giving BOINC the proper permissions. Neither worked :-(
ID: 65662
CX99-OCE
Joined: 27 Nov 15
Posts: 6
United States
Message 65663 - Posted: 29 Nov 2015, 5:05:48 UTC

I found the solution:
sudo /etc/init.d/boinc-client restart

Thanks for the help!
ID: 65663
Agentb
Joined: 30 May 15
Posts: 265
United Kingdom
Message 65700 - Posted: 1 Dec 2015, 20:34:17 UTC - in response to Message 65663.  
Last modified: 1 Dec 2015, 20:34:44 UTC

I found the solution:
sudo /etc/init.d/boinc-client restart

Thanks for the help!


Are you running this command from within X (with the lightdm display manager)?

If yes, I had the same problem (not crunching with the GPU at start-up) and used the same restart workaround for a while.

The problem was related to xhost; I could not get it to function correctly when running from the init.d script.

I added the xhost command to the lightdm [SeatDefaults] stanza (see this post), and that allowed boinc to auto-start GPU crunching.

Well done on getting it running.
ID: 65700
ChristianB
Volunteer developer
Volunteer tester
Joined: 4 Jul 12
Posts: 321
Germany
Message 65721 - Posted: 2 Dec 2015, 17:33:17 UTC - in response to Message 65661.  

I didn't know that. Do you know why the libOpenCL.so -> libOpenCL.so.1 symlink was removed? Normally it's some kind of version incompatibility, I think. Would it be safe to look for libOpenCL.so first and, if it is not found, then for libOpenCL.so.1?


It is my understanding that symlinks are supposed to be installed by dev packages. This way when you link a program you'll be linking against the library version that matches the headers.

Let's say at some point a new OpenCL ICD loader API is made and the library for that is called libOpenCL.so.2. If you now opened libOpenCL.so, you wouldn't know which version you'd get and wouldn't know which API to use. You'd want to open both versioned library names and report back to the servers which API the host supports, or whether it supports both.

That said, I'd accept opening both files as a compromise. Right now the biggest problem for Linux users is getting GPU computing working. Fixing the OpenCL detection one way or the other would help.

It could be a simple fallback strategy: first check for libOpenCL.so; if it is not found, check for libOpenCL.so.1, and so on.

The issue with the fglrx driver is known to me from the Debian community, where they couldn't find a universal fix for the problem and advise users to restart the client after they have logged into the window manager.
ID: 65721
Agentb
Joined: 30 May 15
Posts: 265
United Kingdom
Message 65724 - Posted: 2 Dec 2015, 21:34:23 UTC - in response to Message 65721.  

Fixing the OpenCL detection one way or the other would help.

It could be a simple fallback strategy. First check for libOpenCL.so, if not found check for libOpenCL.so.1 and so on.

Yes, that will help (a lot), as would searching the system library path, which includes the LD_LIBRARY_PATH environment variable.

It is not totally clear (to me **1) that boinc (as started in the Debian style) is searching for libraries in the correct places - I do know some GPU detection issues are fixed by adjusting LD_LIBRARY_PATH when it is almost certain the library is installed in the right place. See the comment about LD_LIBRARY_PATH in Environment vars.

It would also be helpful to have a [<debug>] log of where boinc looked and which .so* file(s) it found. I can imagine different vendors using different OpenCL library install locations.


The issue with the fglrx driver is known to me from the Debian community, where they couldn't find a universal fix for the problem and advise users to restart the client after they have logged into the window manager.


I'm guessing this is the GPU-not-recognized-at-start issue - "X must currently start fglrx, then run the window manager. An authenticated user can then (re)start boinc."

I managed to reduce it to "X must start fglrx, then run the display manager (lightdm), which auto-starts boinc".

This is not headless, but it is OK (for me). I had been meaning to try some of the suggestions here:

Guide to run OpenCL headless, without X server and as normal user
and
Running OpenCL applications remotely on AMD GPUs


I seem to recall the final sticking point was that xhost needed X running. boinc could check whether it has permissions and whether X is running, and log that info.

**1 I have never found a well-written guide to shared libraries and "ld", so this is always something I struggle with.
ID: 65724
Juha
Volunteer developer
Volunteer tester
Help desk expert
Joined: 20 Nov 12
Posts: 801
Finland
Message 65763 - Posted: 3 Dec 2015, 22:19:39 UTC - in response to Message 65724.  

It is not totally clear (to me **1) that boinc (as started in the Debian style) is searching for libraries in the correct places - I do know some GPU detection issues are fixed by adjusting LD_LIBRARY_PATH when it is almost certain the library is installed in the right place.


BOINC doesn't search for libraries in any specific place. It asks the OS to open libOpenCL.so for it, and it's the OS's job to figure out where to look. If you need to use LD_LIBRARY_PATH, then something has gone wrong with the driver install.
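
To see which file the OS actually picked, something like the following could be used. This is only a sketch under stated assumptions: the helper name resolved_library_path is made up for this example, and dlinfo() with RTLD_DI_LINKMAP is a GNU/Linux extension rather than part of POSIX, so it is not a portable or official BOINC approach.

```cpp
// Hypothetical sketch: ask the dynamic linker where it resolved a library.
// dlinfo() with RTLD_DI_LINKMAP is a GNU/Linux extension (g++ enables it by default).
#include <dlfcn.h>
#include <link.h>
#include <string>

// Returns the path the loader picked for `name`, or an empty string if
// the library could not be loaded at all.
std::string resolved_library_path(const char* name) {
    void* handle = dlopen(name, RTLD_LAZY);
    if (!handle) {
        return "";
    }
    std::string path;
    struct link_map* map = nullptr;
    // On success, map->l_name holds the file name the loader used.
    if (dlinfo(handle, RTLD_DI_LINKMAP, &map) == 0 && map && map->l_name) {
        path = map->l_name;
    }
    dlclose(handle);
    return path;
}
```

Passing "libOpenCL.so.1" would return whatever path the loader settled on after consulting LD_LIBRARY_PATH, the ldconfig cache and the default directories, which is the kind of information a debug log could record.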


The issue with the fglrx driver is known to me from the Debian community, where they couldn't find a universal fix for the problem and advise users to restart the client after they have logged into the window manager.


I'm guessing this is the GPU-not-recognized-at-start issue - "X must currently start fglrx, then run the window manager. An authenticated user can then (re)start boinc."

I managed to reduce it to "X must start fglrx, then run the display manager (lightdm), which auto-starts boinc".

This is not headless, but it is OK (for me). I had been meaning to try some of the suggestions here:

Guide to run OpenCL headless, without X server and as normal user
and
Running OpenCL applications remotely on AMD GPUs


I seem to recall the final sticking point was xhost needed X running. boinc could check if it has permissions, and if X is running and log that info.


Somehow I was under the impression that they had managed to delay starting BOINC until after X was ready. Guess not.

But that AMD link was interesting. It says fglrx 14.12 doesn't need X any more. Have you tried whether that is really true?
ID: 65763
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15477
Netherlands
Message 65767 - Posted: 3 Dec 2015, 23:18:55 UTC - in response to Message 65652.  

Christian, this is a bug in the client but so far no one has been able to convince Rom and David.

Seeing how we have three threads on this now, let me try to convince them.
ID: 65767
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15477
Netherlands
Message 65815 - Posted: 4 Dec 2015, 23:17:53 UTC - in response to Message 65767.  
Last modified: 4 Dec 2015, 23:18:13 UTC

Progress.

client (Unix): if libOpenCL.so isn't there, try libOpenCL.so.1

 #else
      opencl_lib = dlopen("libOpenCL.so", RTLD_NOW);
+     if (!opencl_lib) {
+         opencl_lib = dlopen("libOpenCL.so.1", RTLD_NOW);
+     }

That what you wanted, Juha?
ID: 65815
Agentb
Joined: 30 May 15
Posts: 265
United Kingdom
Message 65817 - Posted: 5 Dec 2015, 0:35:49 UTC - in response to Message 65763.  

BOINC doesn't search for libraries in any specific place. It asks the OS to open libOpenCL.so for it, and it's the OS's job to figure out where to look. If you need to use LD_LIBRARY_PATH, then something has gone wrong with the driver install.


This started me thinking that I needed to understand this a bit better. I went away and re-read some of the sticky posts in the GPU forum, and tried to understand in detail how dlopen() finds things.

Then I looked at boinc and libOpenCL with readelf -d, and at the boinc client source. ./client/gpu_opencl.cpp does this:

#else
//TODO: Is this correct?
    opencl_lib = dlopen("libOpenCL.so", RTLD_NOW);
#endif
    if (!opencl_lib) {
        warnings.push_back("No OpenCL library found");
        return;
    }


dlopen will return a failure (NULL) if

    + it cannot open "libOpenCL.so" after searching, a file not found being one reason.
    + undefined symbols in the library cannot be resolved (RTLD_NOW forces this check at load time). readelf shows these shared library dependencies in libOpenCL.so.1 on my AMD system: [librt.so.1, libm.so.6, libdl.so.2, libpthread.so.0, libc.so.6]
    + other reasons



so dlerror() should be called and the error logged with the warnings.push_back message - that will help with debugging.
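
As a minimal sketch of that suggestion (not the actual client code): the helper name open_first_available and its signature are assumptions invented for this example, while the warnings vector simply mirrors the one in the snippet above.

```cpp
// Hypothetical sketch of "try several names, log dlerror() on failure".
// The helper and its parameters are assumptions, not the real BOINC code.
#include <dlfcn.h>
#include <string>
#include <vector>

// Tries each candidate in order and returns the first handle that loads.
// For every failed attempt, the dlerror() text is appended to `warnings`.
void* open_first_available(const std::vector<std::string>& candidates,
                           std::vector<std::string>& warnings) {
    for (const std::string& name : candidates) {
        void* handle = dlopen(name.c_str(), RTLD_NOW);
        if (handle) {
            return handle;
        }
        // dlerror() describes the most recent failure and then clears it,
        // so it has to be read right after the failed dlopen().
        const char* reason = dlerror();
        warnings.push_back(reason ? std::string(reason) : "cannot load " + name);
    }
    return nullptr;
}
```

The order of the candidate list is left to the caller. With glibc the dlerror() text normally includes the file name that failed, which is what makes the log line useful when debugging a driver install.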


... SONAMES
After some searching: it seems the .so symbolic link (to the .so.1 shared library) is created by some AMD development package installs. The "soname", which is stored in the .so file and visible using readelf, is the "right" one to use for dlopen(). The .so link is also created by nVidia package installs; I don't know what Beignet does.

readelf also shows SONAME: [libOpenCL.so.1] for both my AMD and nVidia systems.

On my nVidia we have this arrangement in /usr/lib

/usr/lib/libOpenCL.so -> nvidia-current/libOpenCL.so

/usr/lib/nvidia-current/ shows

libOpenCL.so -> libOpenCL.so.1
libOpenCL.so.1 -> libOpenCL.so.1.0
libOpenCL.so.1.0 -> libOpenCL.so.1.0.0

(wow 4 link chain!)

ldconfig (which I guess gets called during package install) seems to sort this nonsense out, and ldconfig -p shows which shared objects (by SONAME) are where; sure enough, libOpenCL.so.1 is there on both.

So I'm definitely +1 for "libOpenCL.so.1" (followed by a legacy "libOpenCL.so" attempt).


I seem to recall the final sticking point was xhost needed X running. boinc could check if it has permissions, and if X is running and log that info.


Somehow I was under the impression that they had managed to delay starting BOINC after X was ready. Guess not.


Yes, for nVidia that was a problem and is now fixed (I think).

It was not a delay problem, it is an X permission problem. I could not run /usr/bin/xhost +SI:localuser:boinc during init.d, and without it boinc cannot access AMD's fglrx. For some people, restarting boinc-client in a window is ideal.


But that AMD link was interesting. It says fglrx 14.12 doesn't need X any more. Have you tried if it is really true?


14.12 was not a pleasant experience - I posted about that in the GPU forum, gave up and reverted to 14.9 here. I have not tried again since; I needed to run X on this system anyway and just wanted boinc to auto-start.

Whilst I like the AMD engine and raw horsepower, getting another one running "in a new way" - no - I'm still recovering!
ID: 65817
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15477
Netherlands
Message 65822 - Posted: 5 Dec 2015, 11:05:41 UTC - in response to Message 65817.  

dlopen will return a failure (NULL) if

    + it cannot open "libOpenCL.so" after searching, a file not found being one reason.
    + undefined symbols in the library cannot be resolved (RTLD_NOW forces this check at load time). readelf shows these shared library dependencies in libOpenCL.so.1 on my AMD system: [librt.so.1, libm.so.6, libdl.so.2, libpthread.so.0, libc.so.6]
    + other reasons



so dlerror() should be called and the error logged with the warnings.push_back message - that will help with debugging.


Here you go: https://github.com/BOINC/boinc/commit/bc8b9a5d9eb5ba232ff202a191c03512c5d5b5b1

client (Unix): use dlerror() for GPU library failures; shows the filename.
ID: 65822
Agentb
Joined: 30 May 15
Posts: 265
United Kingdom
Message 65824 - Posted: 5 Dec 2015, 11:33:44 UTC - in response to Message 65822.  


Here you go: https://github.com/BOINC/boinc/commit/bc8b9a5d9eb5ba232ff202a191c03512c5d5b5b1

client (Unix): use dlerror() for GPU library failures; shows the filename.


Bonus points - the other dlopen() calls were checked as well! Thank you.
ID: 65824

Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.