Incorrect GPU detection

Mimi Koko

Joined: 15 Nov 12
Posts: 13
Message 46290 - Posted: 15 Nov 2012, 16:34:38 UTC

Hello, I have two identical HD 7970 cards,
but BOINC reports this:
Thu 15 Nov 2012 02:46:00 PM MSK | | OpenCL: ATI GPU 0: Tahiti (driver version 1084.2 (VM), device version OpenCL 1.2 AMD-APP (1084.2), 3072MB, 2730MB available)
Thu 15 Nov 2012 02:46:00 PM MSK | | OpenCL: ATI GPU 1 (not used): Tahiti (driver version 1084.2 (VM), device version OpenCL 1.2 AMD-APP (1084.2), 2048MB, 0MB available)
So it doesn't use the second card.
What should I do about it?
ID: 46290 · Report as offensive
Profile Jord
Volunteer tester
Help desk expert

Joined: 29 Aug 05
Posts: 15480
Netherlands
Message 46295 - Posted: 15 Nov 2012, 17:27:53 UTC - in response to Message 46290.  

It's something that BOINC detects when it queries the GPU directly after it has found the OpenCL library. Either there's something wrong with the driver (isn't 1084.2 (Catalyst 12.11) a beta driver? The regular download at the time of writing this answer is still 12.10), or there's something wrong with the card.

It isn't showing as GPU 1 either for the normal CAL detection, unless you culled that from all your other threads here and at the Einstein forums. If it isn't being detected there, there's already something wrong.
ID: 46295 · Report as offensive
Mimi Koko

Joined: 15 Nov 12
Posts: 13
Message 46298 - Posted: 15 Nov 2012, 19:16:19 UTC - in response to Message 46295.  
Last modified: 15 Nov 2012, 19:17:25 UTC

Yes, I'm using the 12.11 beta.
The cards themselves are fine: I tried swapping them, with the same result. The physically first card is detected correctly and the second is not. Everything is also fine when there is only a single card in the system; each one works correctly on its own.
Which method is used to detect GPUs? Both aticonfig and clinfo report two adapters in the system.
ID: 46298 · Report as offensive
Profile Jord
Volunteer tester
Help desk expert

Joined: 29 Aug 05
Posts: 15480
Netherlands
Message 46301 - Posted: 15 Nov 2012, 21:05:45 UTC - in response to Message 46298.  

Which method is used to detect GPUs? Both aticonfig and clinfo report two adapters in the system.

Yes, and? BOINC is also reporting two GPUs.

Can you please post the full start-up messages, all the way from Starting BOINC Client version such and so, till the first project line?
ID: 46301 · Report as offensive
Mimi Koko

Joined: 15 Nov 12
Posts: 13
Message 46302 - Posted: 15 Nov 2012, 23:11:32 UTC - in response to Message 46301.  
Last modified: 15 Nov 2012, 23:21:51 UTC

Now I'm trying to run BOINC with Catalyst 12.10, Ubuntu 12.04, and AMD APP SDK 2.7.

The second adapter is not detected by BOINC at all:

Fri 16 Nov 2012 03:04:36 AM MSK | | Starting BOINC client version 7.0.31 for x86_64-pc-linux-gnu
Fri 16 Nov 2012 03:04:36 AM MSK | | log flags: file_xfer, sched_ops, task
Fri 16 Nov 2012 03:04:36 AM MSK | | Libraries: libcurl/7.22.0 OpenSSL/1.0.1 zlib/1.2.3.4 libidn/1.23 librtmp/2.3
Fri 16 Nov 2012 03:04:36 AM MSK | | Data directory: /home/stitch/BOINC.beta/BOINC
Fri 16 Nov 2012 03:04:36 AM MSK | | Processor: 4 GenuineIntel Intel(R) Core(TM) i5-3570K CPU @ 3.40GHz [Family 6 Model 58 Stepping 9]
Fri 16 Nov 2012 03:04:36 AM MSK | | Processor: 6.00 MB cache
Fri 16 Nov 2012 03:04:36 AM MSK | | Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms
Fri 16 Nov 2012 03:04:36 AM MSK | | OS: Linux: 3.2.0-33-generic
Fri 16 Nov 2012 03:04:36 AM MSK | | Memory: 6.77 GB physical, 0 bytes virtual
Fri 16 Nov 2012 03:04:36 AM MSK | | Disk: 27.50 GB total, 21.68 GB free
Fri 16 Nov 2012 03:04:36 AM MSK | | Local time is UTC +4 hours
Fri 16 Nov 2012 03:04:36 AM MSK | | ATI GPU 0: AMD Radeon HD 7900 series (Tahiti) (CAL version 1.4.1741, 3072MB, 2711MB available, 11264 GFLOPS peak)
Fri 16 Nov 2012 03:04:36 AM MSK | | OpenCL: ATI GPU 0: AMD Radeon HD 7900 series (Tahiti) (driver version CAL 1.4.1741 (VM), device version OpenCL 1.2 AMD-APP (923.1), 3072MB, 2711MB available)

Fri 16 Nov 2012 03:04:36 AM MSK | | Config: use all coprocessors
Fri 16 Nov 2012 03:04:36 AM MSK | Einstein@Home | URL http://einstein.phys.uwm.edu/; Computer ID 6098949; resource share 100
Fri 16 Nov 2012 03:04:36 AM MSK | Einstein@Home | General prefs: from Einstein@Home (last modified 07-Feb-2011 05:27:19)
Fri 16 Nov 2012 03:04:36 AM MSK | Einstein@Home | Host location: none
Fri 16 Nov 2012 03:04:36 AM MSK | Einstein@Home | General prefs: using your defaults
Fri 16 Nov 2012 03:04:36 AM MSK | | Preferences:
Fri 16 Nov 2012 03:04:36 AM MSK | | max memory usage when active: 3467.08MB
Fri 16 Nov 2012 03:04:36 AM MSK | | max memory usage when idle: 6240.74MB
Fri 16 Nov 2012 03:04:36 AM MSK | | max disk usage: 13.75GB
Fri 16 Nov 2012 03:04:36 AM MSK | | don't use GPU while active
Fri 16 Nov 2012 03:04:36 AM MSK | | (to change preferences, visit the web site of an attached project, or select Preferences in the Manager)
Fri 16 Nov 2012 03:04:36 AM MSK | | Not using a proxy
Fri 16 Nov 2012 03:04:36 AM MSK | Einstein@Home | Restarting task p2030.20110210.G193.77-04.27.S.b2s0g0.00000_416_1 using einsteinbinary_BRP4 version 131 (opencl-ati) in slot 0


Here is the output from aticonfig:

root@hawaii:~# aticonfig --lsa
* 0. 03:00.0 AMD Radeon HD 7900 Series
1. 04:00.0 AMD Radeon HD 7900 Series

* - Default adapter
ID: 46302 · Report as offensive
Mimi Koko

Joined: 15 Nov 12
Posts: 13
Message 46303 - Posted: 15 Nov 2012, 23:20:36 UTC - in response to Message 46301.  
Last modified: 15 Nov 2012, 23:21:09 UTC

And here is the BOINC log with the Catalyst 12.11 beta, Ubuntu 12.10, and AMD APP SDK 2.7:

Fri 16 Nov 2012 03:18:15 AM MSK | | Starting BOINC client version 7.0.28 for x86_64-pc-linux-gnu
Fri 16 Nov 2012 03:18:15 AM MSK | | log flags: file_xfer, sched_ops, task
Fri 16 Nov 2012 03:18:15 AM MSK | | Libraries: libcurl/7.27.0 OpenSSL/1.0.1c zlib/1.2.7 libidn/1.25 librtmp/2.3
Fri 16 Nov 2012 03:18:15 AM MSK | | Data directory: /home/stitch/BOINC
Fri 16 Nov 2012 03:18:15 AM MSK | | Processor: 4 GenuineIntel Intel(R) Core(TM) i5-3570K CPU @ 3.40GHz [Family 6 Model 58 Stepping 9]
Fri 16 Nov 2012 03:18:15 AM MSK | | Processor: 6.00 MB cache
Fri 16 Nov 2012 03:18:15 AM MSK | | Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms
Fri 16 Nov 2012 03:18:15 AM MSK | | OS: Linux: 3.5.0-18-generic
Fri 16 Nov 2012 03:18:15 AM MSK | | Memory: 6.77 GB physical, 0 bytes virtual
Fri 16 Nov 2012 03:18:15 AM MSK | | Disk: 29.35 GB total, 18.23 GB free
Fri 16 Nov 2012 03:18:15 AM MSK | | Local time is UTC +4 hours
Fri 16 Nov 2012 03:18:15 AM MSK | | ATI GPU 0: Tahiti (CAL version 1.4.1741, 3072MB, 2662MB available, 11264 GFLOPS peak)
Fri 16 Nov 2012 03:18:15 AM MSK | | OpenCL: ATI GPU 0: Tahiti (driver version 1084.2 (VM), device version OpenCL 1.2 AMD-APP (1084.2), 3072MB, 2662MB available)
Fri 16 Nov 2012 03:18:15 AM MSK | | OpenCL: ATI GPU 1 (not used): Tahiti (driver version 1084.2 (VM), device version OpenCL 1.2 AMD-APP (1084.2), 2048MB, 0MB available)

Fri 16 Nov 2012 03:18:15 AM MSK | | Config: use all coprocessors
Fri 16 Nov 2012 03:18:15 AM MSK | Einstein@Home | URL http://einstein.phys.uwm.edu/; Computer ID 6095464; resource share 100
Fri 16 Nov 2012 03:18:15 AM MSK | Einstein@Home | General prefs: from Einstein@Home (last modified 07-Feb-2011 05:27:19)
Fri 16 Nov 2012 03:18:15 AM MSK | Einstein@Home | Host location: none
Fri 16 Nov 2012 03:18:15 AM MSK | Einstein@Home | General prefs: using your defaults
Fri 16 Nov 2012 03:18:15 AM MSK | | Reading preferences override file
Fri 16 Nov 2012 03:18:15 AM MSK | | Preferences:
Fri 16 Nov 2012 03:18:15 AM MSK | | max memory usage when active: 3466.96MB
Fri 16 Nov 2012 03:18:15 AM MSK | | max memory usage when idle: 6240.53MB
Fri 16 Nov 2012 03:18:15 AM MSK | | max disk usage: 14.67GB
Fri 16 Nov 2012 03:18:15 AM MSK | | don't use GPU while active
Fri 16 Nov 2012 03:18:15 AM MSK | | (to change preferences, visit the web site of an attached project, or select Preferences in the Manager)
Fri 16 Nov 2012 03:18:15 AM MSK | | Not using a proxy
ID: 46303 · Report as offensive
Mimi Koko

Joined: 15 Nov 12
Posts: 13
Message 46304 - Posted: 16 Nov 2012, 0:18:41 UTC

If it is an AMD driver problem, I will post on the AMD developer forum.
ID: 46304 · Report as offensive
Mimi Koko

Joined: 15 Nov 12
Posts: 13
Message 46310 - Posted: 16 Nov 2012, 17:16:20 UTC

Could you please tell me which method is used to detect the GPUs and how many there are?
ID: 46310 · Report as offensive
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5082
United Kingdom
Message 46311 - Posted: 16 Nov 2012, 17:21:16 UTC - in response to Message 46310.  

ID: 46311 · Report as offensive
Profile Jord
Volunteer tester
Help desk expert

Joined: 29 Aug 05
Posts: 15480
Netherlands
Message 46312 - Posted: 16 Nov 2012, 17:52:16 UTC

And if not a videocard fault, it may be:
a) a videocard driver fault.
b) a motherboard driver fault.
c) a motherboard fault. Problems with the second PCIe slot when first one is filled.
d) a problem with the PSU, or PSU not heavy enough to run 2 of these videocards. One HD7970 requires a minimum 500W PSU, two of them would then presumably require 750W or more.
ID: 46312 · Report as offensive
Mimi Koko

Joined: 15 Nov 12
Posts: 13
Message 46314 - Posted: 16 Nov 2012, 18:20:50 UTC - in response to Message 46312.  

And if not a videocard fault, it may be:
a) a videocard driver fault.

The driver sees both GPUs.

b) a motherboard driver fault.

I don't know what to say. Which motherboard driver do you mean? Everything works fine under Ubuntu.

c) a motherboard fault. Problems with the second PCIe slot when first one is filled.

No. The second GPU works fine, including graphics output and BOINC crunching.
If I use the <ignore_ati_dev>0</ignore_ati_dev> option in the cc_config file, BOINC successfully starts crunching on the second GPU, ignoring the first one. If I don't use the <ignore_ati_dev> option, BOINC starts crunching on the first GPU, ignoring the second one. So the second GPU works fine.

d) a problem with the PSU, or PSU not heavy enough to run 2 of these videocards.

Definitely not, the PSU is 1500 W.

As I see it, the problem is the following:
1. BOINC does not detect the GPUs correctly.
2. BOINC decides to run tasks on the first GPU and not on the second.
And I can't force BOINC to use the second GPU even when its detection is not quite correct.
Is there an option like <use_ati_dev>?
ID: 46314 · Report as offensive
Profile Jord
Volunteer tester
Help desk expert

Joined: 29 Aug 05
Posts: 15480
Netherlands
Message 46318 - Posted: 16 Nov 2012, 21:32:58 UTC - in response to Message 46314.  
Last modified: 16 Nov 2012, 21:34:49 UTC

c) a motherboard fault. Problems with the second PCIe slot when first one is filled.

No. The second GPU works fine, including graphics output and BOINC crunching.
If I use the <ignore_ati_dev>0</ignore_ati_dev> option in the cc_config file, BOINC successfully starts crunching on the second GPU, ignoring the first one. If I don't use the <ignore_ati_dev> option, BOINC starts crunching on the first GPU, ignoring the second one. So the second GPU works fine.

Could still be a videocard fault, a videocard driver fault, a motherboard driver fault, or a motherboard fault. And thinking about it, it could also be a Linux (distro) problem.

Look, there is something fishy about the detection of your second GPU, this despite you being able to sing, dance and crunch with it. It already shows in the detection:

Fri 16 Nov 2012 03:18:15 AM MSK | | ATI GPU 0: Tahiti (CAL version 1.4.1741, 3072MB, 2662MB available, 11264 GFLOPS peak)
Fri 16 Nov 2012 03:18:15 AM MSK | | OpenCL: ATI GPU 0: Tahiti (driver version 1084.2 (VM), device version OpenCL 1.2 AMD-APP (1084.2), 3072MB, 2662MB available)
Fri 16 Nov 2012 03:18:15 AM MSK | | OpenCL: ATI GPU 1 (not used): Tahiti (driver version 1084.2 (VM), device version OpenCL 1.2 AMD-APP (1084.2), 2048MB, 0MB available)

The second GPU isn't even being detected as a CAL-capable GPU. Normally after "ATI GPU 0: Tahiti (CAL version 1.4.1741, 3072MB, 2662MB available, 11264 GFLOPS peak)" you have a second line showing something like "ATI GPU 1: Tahiti (CAL version 1.4.1741, 3072MB, 2662MB available, 11264 GFLOPS peak)". In your case, you don't.
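
A rough way to check a saved copy of the start-up messages for this is to count the CAL detection lines and the devices flagged "(not used)". The shell sketch below assumes the log text arrives on standard input; the function names count_cal_gpus and count_unused_gpus are made up for illustration and are not part of BOINC.

```shell
#!/bin/sh
# Hypothetical helpers (not part of BOINC): both read start-up messages on stdin.

# Count CAL detection lines such as "... | | ATI GPU 0: Tahiti (CAL version ...".
# OpenCL lines read "| | OpenCL: ATI GPU 0: ..." and so do not match "| ATI GPU".
count_cal_gpus() {
    grep -c '| ATI GPU [0-9][0-9]*:'
}

# Count devices that BOINC flagged as "(not used)".
count_unused_gpus() {
    grep -c 'GPU [0-9][0-9]* (not used)'
}

# Sample lines copied from the 12.11 beta log earlier in this thread.
sample='Fri 16 Nov 2012 03:18:15 AM MSK | | ATI GPU 0: Tahiti (CAL version 1.4.1741, 3072MB, 2662MB available, 11264 GFLOPS peak)
Fri 16 Nov 2012 03:18:15 AM MSK | | OpenCL: ATI GPU 0: Tahiti (driver version 1084.2 (VM), device version OpenCL 1.2 AMD-APP (1084.2), 3072MB, 2662MB available)
Fri 16 Nov 2012 03:18:15 AM MSK | | OpenCL: ATI GPU 1 (not used): Tahiti (driver version 1084.2 (VM), device version OpenCL 1.2 AMD-APP (1084.2), 2048MB, 0MB available)'

echo "CAL GPUs: $(printf '%s\n' "$sample" | count_cal_gpus)"
echo "Not used: $(printf '%s\n' "$sample" | count_unused_gpus)"
```

With two healthy cards one would expect two CAL lines and no "(not used)" entries.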

That the 'normal' 12.10 driver can't even detect the whole card, be it for CAL or OpenCL, makes it doubly fishy. Hence my suggestion that it could be the videocard, the motherboard, the motherboard chipset driver, etc.

BOINC can only start running work on a GPU when it knows there is one and that it's capable of doing work. The science application will do the actual work and also do a pre-check whether it can use the hardware.

That work can be done on the second GPU when you ignore the first one shows it isn't a BOINC fault but something external. As soon as you ignore the first one, BOINC will automatically treat the second GPU as the best in the system. When you ignore that one too, it'll use the third, and so on, until there are none left to use.
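
For reference, the ignore test described above could be set up roughly as follows. This is a minimal shell sketch, not an official procedure: write_ignore_config is a made-up helper, the target directory is an assumption (point it at your own BOINC data directory once you are happy with the output), and it refuses to overwrite an existing cc_config.xml. The <ignore_ati_dev> option goes inside <options> in cc_config.xml.

```shell
#!/bin/sh
# Hypothetical helper (not part of BOINC): write a cc_config.xml that tells the
# client to ignore one ATI device. $1 = target directory, $2 = device number.
write_ignore_config() {
    dir=$1
    dev=$2
    file="$dir/cc_config.xml"
    # Never clobber a configuration that is already there.
    if [ -e "$file" ]; then
        echo "refusing to overwrite $file" >&2
        return 1
    fi
    cat > "$file" <<EOF
<cc_config>
  <options>
    <ignore_ati_dev>$dev</ignore_ati_dev>
  </options>
</cc_config>
EOF
}

# Example: write into a scratch directory first and inspect the result.
scratch=$(mktemp -d)
write_ignore_config "$scratch" 0 && cat "$scratch/cc_config.xml"
```

Restart the client after changing the file so that the start-up messages show which GPU it picked.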

So go figure out what causes the second GPU not to show up correctly. Try different (older) drivers. Make sure you use the drivers from AMD, not those from the distribution's repositories, as these may be missing parts.
ID: 46318 · Report as offensive
Mimi Koko

Joined: 15 Nov 12
Posts: 13
Message 46321 - Posted: 16 Nov 2012, 22:29:27 UTC - in response to Message 46318.  

Obviously it is not a GPU problem, because if I physically swap the cards on the motherboard, start the PC, load the OS and run BOINC, it again detects ATI GPU 0, uses it, and ignores the second card, which was the first one before.
Again, if I leave only one GPU in the system and remove the other, everything works fine regardless of which physical card I put in, A or B. Everything works fine with GPU A alone, and equally with GPU B alone.
So both cards are OK.
OK, tomorrow I will try putting them into another motherboard with a different chipset and a different CPU.
ID: 46321 · Report as offensive
Mimi Koko

Joined: 15 Nov 12
Posts: 13
Message 46349 - Posted: 18 Nov 2012, 21:49:09 UTC - in response to Message 46318.  
Last modified: 18 Nov 2012, 21:50:43 UTC

So,
if we (me and another participant from Einstein@Home with a similar system with two 7970s) first set the display like this:
export DISPLAY=:0
and then start the BOINC client:
./boinc --allow_remote_gui_rpc
we get correct GPU detection:

Sun 18 Nov 2012 11:59:21 PM MSK | | ATI GPU 0: Tahiti (CAL version 1.4.1741, 3072MB, 2765MB available, 11264 GFLOPS peak)
Sun 18 Nov 2012 11:59:21 PM MSK | | ATI GPU 1: Tahiti (CAL version 1.4.1741, 3072MB, 2962MB available, 11264 GFLOPS peak)
Sun 18 Nov 2012 11:59:21 PM MSK | | OpenCL: ATI GPU 0: Tahiti (driver version 1084.2 (VM), device version OpenCL 1.2 AMD-APP (1084.2), 3072MB, 2765MB available)
Sun 18 Nov 2012 11:59:21 PM MSK | | OpenCL: ATI GPU 1: Tahiti (driver version 1084.2 (VM), device version OpenCL 1.2 AMD-APP (1084.2), 3072MB, 2962MB available)


Otherwise we get 2048 MB total and 0 MB free for the second GPU.
Something is wrong with the detection itself, not the hardware.
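
The workaround could be wrapped in a small start-up script along these lines. This is only a sketch under assumptions: ensure_display is a made-up helper name, :0 is assumed to be the local X server (as in the commands above), and the actual launch line is left commented out so the script can be tried safely first.

```shell
#!/bin/sh
# Hypothetical wrapper (not shipped with BOINC): make sure DISPLAY is set
# before the client starts, since detection was only correct with it set.
ensure_display() {
    # Keep an existing value; otherwise fall back to the local X server :0.
    if [ -z "${DISPLAY:-}" ]; then
        DISPLAY=:0
    fi
    export DISPLAY
}

ensure_display
echo "Starting BOINC with DISPLAY=$DISPLAY"
# Launch the client from the BOINC directory, as in the post above:
# ./boinc --allow_remote_gui_rpc
```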
ID: 46349 · Report as offensive
Claggy

Joined: 23 Apr 07
Posts: 1112
United Kingdom
Message 46350 - Posted: 18 Nov 2012, 22:07:46 UTC - in response to Message 46349.  

I found that when running a host (on Windows) with an Nvidia GPU and an ATI/AMD GPU, I have to extend the desktop onto the ATI/AMD GPU to get it detected; after that everything works fine.

Claggy
ID: 46350 · Report as offensive
Profile Jord
Volunteer tester
Help desk expert

Joined: 29 Aug 05
Posts: 15480
Netherlands
Message 46351 - Posted: 18 Nov 2012, 22:18:45 UTC - in response to Message 46349.  

From How do I set the DISPLAY variable on Linux:

The magic word is DISPLAY. In the X window system, a display consists (simplified) of a keyboard, a mouse and a screen. A display is managed by a server program, known as an X server. The server serves displaying capabilities to other programs that connect to it.

By setting DISPLAY=:0, you tell programs to connect to display 0 of the X server on the local host.

The display consists of a hostname (such as light.uni.verse and localhost), a colon (:), and a sequence number (such as 0 and 4). The hostname of the display is the name of the computer where the X server runs. An omitted hostname means the local host. The sequence number is usually 0 -- it can be varied if there are multiple displays connected to one computer.
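
To make the format concrete, here is a small shell sketch that splits a DISPLAY value into the parts described above. parse_display is a made-up function name, and the sketch only handles the simple hostname:number form with an optional ".screen" suffix.

```shell
#!/bin/sh
# Hypothetical helper: split a DISPLAY value into hostname and sequence number.
parse_display() {
    value=$1
    # A display name must contain a colon.
    case "$value" in
        *:*) ;;
        *) echo "not a display name: $value" >&2; return 1 ;;
    esac
    host=${value%%:*}     # text before the first colon (may be empty)
    rest=${value#*:}      # text after the first colon
    number=${rest%%.*}    # drop an optional ".screen" suffix
    # An omitted hostname means the local host.
    [ -n "$host" ] || host="(local host)"
    printf 'host=%s display=%s\n' "$host" "$number"
}

parse_display "light.uni.verse:4"
parse_display ":0"
```

The first call prints host=light.uni.verse display=4 and the second prints host=(local host) display=0.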

Now then, things you can try:
1. Figure out what DISPLAY was set to before you specified :0. I don't know if it can be reset, but next time try export -p first. That shows all the variable declarations as set up in your configuration. I'll bet that DISPLAY was set to :0.0, which is essentially the same as :0, but more specific: display 0, screen 0.
2. Will this also work without the --allow_remote_gui_rpc switch?
3. What version of X do you have? X -version in a terminal window will give that info.
4. You can try to run BOINC 7.0.36, without the extra commands. It has some OpenCL detection fixes. See BOINC 7.0.36 available for testing for Macintosh, Linux and Windows in the BOINC 7 Change Log thread for links.
ID: 46351 · Report as offensive
Mimi Koko

Joined: 15 Nov 12
Posts: 13
Message 46362 - Posted: 19 Nov 2012, 16:33:57 UTC - in response to Message 46351.  

2. Will this also work without the --allow_remote_gui_rpc switch?

Yes, it also works.

3. What version of X do you have? X -version in a terminal window will give that info.

X.Org X Server 1.13.0
ID: 46362 · Report as offensive


Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.