BOINC 6.10.17 + ATI GPU + SIGSEGV

ice00

Joined: 9 Nov 09
Posts: 6
Italy
Message 28597 - Posted: 9 Nov 2009, 17:33:11 UTC

Since I installed the latest official ATI driver, 9.10, on x86_64, BOINC no longer runs (because of a segfault while testing the GPU):

[ice@localhost BOINC]$ ./boinc
09-Nov-2009 18:15:25 [---] Starting BOINC client version 6.10.17 for x86_64-pc-linux-gnu
09-Nov-2009 18:15:25 [---] log flags: file_xfer, sched_ops, task
09-Nov-2009 18:15:25 [---] Libraries: libcurl/7.18.0 OpenSSL/0.9.8g zlib/1.2.3 c-ares/1.5.1
09-Nov-2009 18:15:25 [---] Data directory: /mnt/old/home/ice/BOINC
09-Nov-2009 18:15:25 [---] Processor: 4 GenuineIntel Intel(R) Core(TM)2 Quad CPU    Q6600  @ 2.40GHz [Family 6 Model 15 Stepping 11]
09-Nov-2009 18:15:25 [---] Processor: 4.00 MB cache
09-Nov-2009 18:15:25 [---] Processor features: fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm lahf_lm tpr_s
09-Nov-2009 18:15:25 [---] OS: Linux: 2.6.29.4-167.fc11.x86_64
09-Nov-2009 18:15:25 [---] Memory: 1.94 GB physical, 1.86 GB virtual
09-Nov-2009 18:15:25 [---] Disk: 74.46 GB total, 8.59 GB free
09-Nov-2009 18:15:25 [---] Local time is UTC +1 hours
SIGSEGV: segmentation violation
Stack trace (15 frames):
./boinc(boinc_catch_signal+0x43)[0x45d4c3]
/lib64/libpthread.so.0[0x326b60eee0]
/usr/lib64/libaticaldd.so[0x7f85e55ea84a]
/usr/lib64/libaticaldd.so[0x7f85e55ea6c5]
/usr/lib64/libaticaldd.so[0x7f85e55e7197]
/usr/lib64/libaticaldd.so[0x7f85e55deb35]
/usr/lib64/libaticaldd.so[0x7f85e5513183]
/usr/lib64/libaticaldd.so[0x7f85e56272ea]
/usr/lib64/libaticaldd.so[0x7f85e5620e9f]
/usr/lib64/libaticaldd.so[0x7f85e5630115]
./boinc[0x458e7b]
./boinc[0x415435]
./boinc[0x441a8a]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x326aa1ea2d]
./boinc(__gxx_personality_v0+0x1e1)[0x4062f9]


Also, if I follow the instructions at http://boinc.thesonntags.com/collatz/power_apps.php
and test Collatz without BOINC, I get:

shmget in attach_shmem: Invalid argument
18:17:59 (3004): Can't set up shared mem: -1. Will run in standalone mode.

Running Collatz Conjecture (3x+1) ATI GPU application version 2.01 by Gipsel (Linux64, CAL 1.4 R1.1)
Reading input file ... done.
Checking 4294967296 numbers starting with 2361185725354183731560
CAL Runtime: 1.4.427
SIGSEGV: segmentation violation
Stack trace (14 frames):
./collatz_2.01_x86_64-pc-linux-gnu__ati14(boinc_catch_signal+0x49)[0x40e9e9]
/lib64/libpthread.so.0[0x326b60eee0]
/usr/lib64/libaticaldd.so[0x7febbe8f684a]
/usr/lib64/libaticaldd.so[0x7febbe8f66c5]
/usr/lib64/libaticaldd.so[0x7febbe8f3197]
/usr/lib64/libaticaldd.so[0x7febbe8eab35]
/usr/lib64/libaticaldd.so[0x7febbe81f183]
/usr/lib64/libaticaldd.so[0x7febbe9332ea]
/usr/lib64/libaticaldd.so[0x7febbe92ce9f]
/usr/lib64/libaticaldd.so[0x7febbe93c115]
./collatz_2.01_x86_64-pc-linux-gnu__ati14[0x4071e3]
./collatz_2.01_x86_64-pc-linux-gnu__ati14[0x40905e]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x326aa1ea2d]
./collatz_2.01_x86_64-pc-linux-gnu__ati14(__gxx_personality_v0+0x1a9)[0x404ba9]

Exiting...


Unfortunately I cannot do more, as both BOINC and the ATI driver are already at the latest available versions.
Any ideas that could help with this?

I use Fedora Core 11 with the 2.6.29.4-167.fc11.x86_64 kernel (with kernel 2.6.30 I lose 3D applications with the ATI driver).
ID: 28597 · Report as offensive
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15480
Netherlands
Message 28598 - Posted: 9 Nov 2009, 17:47:54 UTC - in response to Message 28597.  

Have you tried powering down the system, waiting 30 seconds to a minute, then powering it back up? That clears all memory.

If you then still get the error immediately upon detecting the ATI card, I'd take a much closer look at the ATI card. Its memory could be failing (check for burns on the PCB, or bulging capacitors). No hardware is built to last forever, and these early GPUs are not built to run calculations 24/7 without letup.

Have you also posted this at Collatz? It could be a problem with their application, although I don't think it is.
ID: 28598 · Report as offensive
ice00

Joined: 9 Nov 09
Posts: 6
Italy
Message 28602 - Posted: 9 Nov 2009, 18:28:09 UTC - in response to Message 28598.  

No hardware is built to last forever, and these early GPUs are not built to run calculations 24/7 without letup.

I have not used the GPU for computation before (it is an HD2600xt), and I tested the BOINC client just after the computer had been turned off for more than 12 hours.
So far the ATI control center has not found any errors on the card, so I hope it is fine.

I use Fedora Core 8 for everyday work (BOINC 6.10.17 runs there because the ATI driver is old, 8.6 or so, and so the GPU is not detected), but I want to migrate to Fedora 11 as soon as I have BOINC running with the GPU.

Have you also posted this at Collatz? It could be a problem with their application, although I don't think it is.

I also think it does not depend on Collatz, but on the GPU detection algorithm.

However, it would be better if a future BOINC client had a command line switch that disables GPU detection, so the program can run using only the CPU.
I have not actually found a way to disable GPU detection.
ID: 28602 · Report as offensive
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15480
Netherlands
Message 28604 - Posted: 9 Nov 2009, 18:35:30 UTC - in response to Message 28602.  
Last modified: 9 Nov 2009, 18:41:01 UTC

I also think it does not depend on Collatz, but on the GPU detection algorithm.

There is no algorithm; BOINC checks which driver files are available in the system path of your system. The minimum required for 6.10 is Catalyst 8.12.

However, it would be better if a future BOINC client had a command line switch that disables GPU detection, so the program can run using only the CPU. I have not actually found a way to disable GPU detection.

We use the core client configuration file for that, also known as the cc_config.xml file. It wants to live in your BOINC Data directory (where client_state.xml lives as well). It doesn't come with BOINC, you will have to make one yourself if it isn't in your Data directory yet.

Add these lines to it:
<cc_config>
<options>
<no_gpus>1</no_gpus>
</options>
</cc_config>


Save the file, make sure the only extension on it is .xml, then restart BOINC.

Edit: I like the idea though of having it as command line option, so forwarded that to the developers.
ID: 28604 · Report as offensive
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15480
Netherlands
Message 28607 - Posted: 9 Nov 2009, 21:11:05 UTC - in response to Message 28604.  

Edit: I like the idea though of having it as command line option, so forwarded that to the developers.

The developers liked the idea as well and added it. It is available in the (trunk) source code right now (if you build your own BOINC), or in BOINC 6.10.19.

[trac]changeset:19523[/trac]
ID: 28607 · Report as offensive
[AF>Le_Pommier>MacGeneration.c...

Joined: 10 Nov 09
Posts: 5
Belgium
Message 28617 - Posted: 10 Nov 2009, 1:13:18 UTC - in response to Message 28607.  

I've got exactly the same problem.
When I launch BOINC I get:
10-Nov-2009 02:04:08 [---] Starting BOINC client version 6.10.17 for x86_64-pc-linux-gnu
10-Nov-2009 02:04:08 [---] log flags: file_xfer, sched_ops, task
10-Nov-2009 02:04:08 [---] Libraries: libcurl/7.18.0 OpenSSL/0.9.8g zlib/1.2.3.3 c-ares/1.5.1
10-Nov-2009 02:04:08 [---] Data directory: /home/arnaudschoofs/Bureau/BOINC
10-Nov-2009 02:04:08 [---] Processor: 2 GenuineIntel Intel(R) Core(TM)2 Duo CPU     T7700  @ 2.40GHz [Family 6 Model 15 Stepping 10]
10-Nov-2009 02:04:08 [---] Processor: 4.00 MB cache
10-Nov-2009 02:04:08 [---] Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm lahf_lm t
10-Nov-2009 02:04:08 [---] OS: Linux: 2.6.31-14-generic
10-Nov-2009 02:04:08 [---] Memory: 3.85 GB physical, 999.99 MB virtual
10-Nov-2009 02:04:08 [---] Disk: 13.87 GB total, 9.38 GB free
10-Nov-2009 02:04:08 [---] Local time is UTC +1 hours
SIGSEGV: segmentation violation
Stack trace (15 frames):
./boinc(boinc_catch_signal+0x43)[0x45d4c3]
/lib/libpthread.so.0[0x7f53fe664190]
/usr/lib/libaticaldd.so[0x7f53fc60584a]
/usr/lib/libaticaldd.so[0x7f53fc6056c5]
/usr/lib/libaticaldd.so[0x7f53fc602197]
/usr/lib/libaticaldd.so[0x7f53fc5f9b35]
/usr/lib/libaticaldd.so[0x7f53fc52e183]
/usr/lib/libaticaldd.so[0x7f53fc6422ea]
/usr/lib/libaticaldd.so[0x7f53fc63be9f]
/usr/lib/libaticaldd.so[0x7f53fc64b115]
./boinc[0x458e7b]
./boinc[0x415435]
./boinc[0x441a8a]
/lib/libc.so.6(__libc_start_main+0xfd)[0x7f53fdb59abd]
./boinc(__gxx_personality_v0+0x1e1)[0x4062f9]

Exiting...

When I remove the file "libaticaldd.so", everything works fine, but of course no ATI GPU is found.
Please help me! I want to discover the joys of GPU computing :)

Here's my config:
Linux Ubuntu 9.10, kernel 2.6.31-14-generic
iMac, Intel Core2Duo @ 2.4 GHz with 4 GB RAM
BOINC 6.10.17_x86_64
ATI Radeon HD2600 Pro
Drivers: Catalyst 9.10 (normally running well)

Thanks for your help :)
ID: 28617 · Report as offensive
ice00

Joined: 9 Nov 09
Posts: 6
Italy
Message 28640 - Posted: 10 Nov 2009, 11:57:30 UTC - in response to Message 28604.  


There is no algorithm; BOINC checks which driver files are available in the system path of your system. The minimum required for 6.10 is Catalyst 8.12.


Maybe the problem is in the call into the ATI driver made by the get method in coproc.cpp, because none of the messages that method prints appear.

I will try to compile BOINC from source tonight and then debug it to find the point where the segmentation fault occurs.
ID: 28640 · Report as offensive
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15480
Netherlands
Message 28656 - Posted: 10 Nov 2009, 18:05:48 UTC - in response to Message 28640.  

OK, the developers are trying to reproduce this problem as well. The client should never crash under these circumstances.
ID: 28656 · Report as offensive
ice00

Joined: 9 Nov 09
Posts: 6
Italy
Message 28659 - Posted: 10 Nov 2009, 21:14:09 UTC - in response to Message 28656.  

On my system, the segmentation fault occurs at this point in coproc.cpp:

retval = (*__calInit)();


calInit comes from the library loaded by:

callib = dlopen("libaticalrt.so", RTLD_NOW);


These are the dependencies of the library:

[root@localhost ice]# ldd /usr/lib64/libaticalrt.so
ldd: warning: you do not have execution permission for `/usr/lib64/libaticalrt.so'
        linux-vdso.so.1 =>  (0x00007fff11fff000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f1a09bfb000)
        librt.so.1 => /lib64/librt.so.1 (0x00007f1a099f3000)
        libm.so.6 => /lib64/libm.so.6 (0x00007f1a0976e000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f1a09554000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00007f1a09350000)
        libXext.so.6 => /usr/lib64/libXext.so.6 (0x00007f1a0913d000)
        libX11.so.6 => /usr/lib64/libX11.so.6 (0x00007f1a08e04000)
        libc.so.6 => /lib64/libc.so.6 (0x00007f1a08a96000)
        /lib64/ld-linux-x86-64.so.2 (0x000000326a600000)
        libXau.so.6 => /usr/lib64/libXau.so.6 (0x00007f1a08893000)
        libxcb.so.1 => /usr/lib64/libxcb.so.1 (0x00007f1a08678000)


However, the error is raised inside libaticaldd.so, which, judging from these dependencies, does not seem directly related to libaticalrt.so.

I tested this with the latest trunk version (6.11.0) and with Electric Fence active to catch the point of the segmentation fault.
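
For readers following along, here is a minimal C sketch of the load-and-call pattern described above. It is not the BOINC source: the helper name load_init_function and the init_fn signature are assumptions made for illustration, and the symbol name passed in is whatever the caller believes the library exports. Only dlopen, dlsym, dlerror and dlclose are real calls. The dlsym step is the part that the two lines quoted from coproc.cpp do not show.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stdio.h>

/* Assumed shape of the init routine: no arguments, integer status code. */
typedef int (*init_fn)(void);

/* Loads `library`, then looks up `symbol` in it.
   Returns NULL if either step fails; on success *handle_out holds the
   dlopen handle so the caller can dlclose() it later. */
init_fn load_init_function(const char *library, const char *symbol,
                           void **handle_out) {
    assert(library != NULL && symbol != NULL && handle_out != NULL);
    *handle_out = NULL;

    void *handle = dlopen(library, RTLD_NOW);
    if (handle == NULL) {
        const char *why = dlerror();
        fprintf(stderr, "dlopen failed: %s\n", why ? why : "unknown error");
        return NULL;
    }

    dlerror(); /* clear any stale error before the lookup */
    void *address = dlsym(handle, symbol);
    if (address == NULL) {
        const char *why = dlerror();
        fprintf(stderr, "dlsym failed: %s\n", why ? why : "symbol is NULL");
        dlclose(handle);
        return NULL;
    }

    /* Convert the object pointer to a function pointer the POSIX way. */
    init_fn fn;
    *(void **)(&fn) = address;
    *handle_out = handle;
    return fn;
}
```

Calling load_init_function("libaticalrt.so", "calInit", &handle) and then invoking the returned pointer would correspond to the call that crashes here. On older glibc versions the program has to be linked with -ldl.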
ID: 28659 · Report as offensive
Rom Walton
Project developer
Joined: 26 Aug 05
Posts: 164
Message 28660 - Posted: 10 Nov 2009, 21:42:48 UTC - in response to Message 28659.  

Well, I have mixed news: we'll be able to prevent BOINC from crashing in the future. We are working on a fix for that right now.

However, when the ATI Runtime Library causes that situation we won't be able to detect the GPU or anything like that. The client will just go about doing what it does without using it.

You'll need to contact ATI or use some of their diagnostics tools to figure out why their code is crashing.

I'm sorry we don't have better news on this issue.

----- Rom

BOINC Development Team, U.C. Berkeley
ID: 28660 · Report as offensive
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15480
Netherlands
Message 28666 - Posted: 10 Nov 2009, 23:38:19 UTC - in response to Message 28659.  

David put a (temporary?) fix in coproc.cpp which will catch SIGSEGV errors and let the client continue. See [trac]changeset:19533[/trac]. You can update your source code and try again.

The client should no longer crash; instead it will give up on trying to detect the GPU and continue.
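
One common way to survive a crash inside a third-party call is to install a temporary SIGSEGV handler that jumps back to a saved point. The C sketch below shows that general technique; it is an assumption about the kind of fix described, not a copy of changeset 19533. The names guarded_call, healthy_probe and crashing_probe are hypothetical, and crashing_probe only simulates a fault with raise().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>

/* Saved point the handler jumps back to when the guarded call faults. */
static sigjmp_buf probe_resume;

static void probe_segv_handler(int signum) {
    (void)signum;
    siglongjmp(probe_resume, 1);
}

/* Hypothetical stand-ins for a driver init routine. */
int healthy_probe(void) { return 0; }
int crashing_probe(void) { raise(SIGSEGV); return 0; }

/* Runs probe() with a temporary SIGSEGV handler installed.
   Returns 0 and stores the probe's result in *retval if it returns
   normally; returns -1 if SIGSEGV was raised during the call. */
int guarded_call(int (*probe)(void), int *retval) {
    struct sigaction guard, previous;
    int status;

    assert(probe != NULL && retval != NULL);
    guard.sa_handler = probe_segv_handler;
    sigemptyset(&guard.sa_mask);
    guard.sa_flags = 0;
    if (sigaction(SIGSEGV, &guard, &previous) != 0) return -1;

    /* savemask = 1 so the signal mask is restored after the jump. */
    if (sigsetjmp(probe_resume, 1) == 0) {
        *retval = probe();
        status = 0;
    } else {
        status = -1; /* arrived here via siglongjmp from the handler */
    }

    sigaction(SIGSEGV, &previous, NULL); /* put the old handler back */
    return status;
}
```

With this in place, guarded_call(crashing_probe, &rv) returns -1 instead of killing the process, while guarded_call(healthy_probe, &rv) returns 0. Jumping out of a real segfault leaves the faulting library in an unknown state, so the only safe follow-up is to stop using it, which fits the behaviour of giving up on GPU detection and continuing.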
ID: 28666 · Report as offensive
ice00

Joined: 9 Nov 09
Posts: 6
Italy
Message 28909 - Posted: 21 Nov 2009, 11:56:08 UTC - in response to Message 28666.  

This is what I get now (using the given fix in the source):

./boinc
21-Nov-2009 12:33:51 [---] Starting BOINC client version 6.11.0 for x86_64-pc-linux-gnu
21-Nov-2009 12:33:51 [---] This a development version of BOINC and may not function properly
21-Nov-2009 12:33:51 [---] log flags: file_xfer, sched_ops, task
21-Nov-2009 12:33:51 [---] Libraries: libcurl/7.19.6 NSS/3.12.4.1 Beta zlib/1.2.3 libidn/1.9 libssh2/1.0
21-Nov-2009 12:33:51 [---] Data directory: /mnt/old/home/ice/SRC/cvsroot/boinc/client
21-Nov-2009 12:33:51 [---] Processor: 4 GenuineIntel Intel(R) Core(TM)2 Quad CPU    Q6600  @ 2.40GHz [Family 6 Model 15 Stepping 11]
21-Nov-2009 12:33:51 [---] Processor: 4.00 MB cache
21-Nov-2009 12:33:51 [---] Processor features: fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm lahf_lm tpr_s
21-Nov-2009 12:33:51 [---] OS: Linux: 2.6.30.9-96.fc11.x86_64
21-Nov-2009 12:33:51 [---] Memory: 1.94 GB physical, 1.86 GB virtual
21-Nov-2009 12:33:51 [---] Disk: 74.46 GB total, 5.44 GB free
21-Nov-2009 12:33:51 [---] Local time is UTC +1 hours
X Error of failed request:  BadRequest (invalid request code or no such operation)
  Major opcode of failed request:  135 ()
  Minor opcode of failed request:  19
  Serial number of failed request:  8
  Current serial number in output stream:  8
[ice@localhost client]$


The code did not crash, but computation did not proceed on the CPU either (though this could be because I ran boinc from the compiled source tree without a proper installation).
By the way, I'm migrating to Fedora 12 and installing the newly released ATI driver; I hope to have GPU computation working with those updates.
ID: 28909 · Report as offensive
ice00

Joined: 9 Nov 09
Posts: 6
Italy
Message 28910 - Posted: 21 Nov 2009, 12:39:55 UTC - in response to Message 28909.  

A small update: with the new ATI driver installed, boinc runs with CPU computation after failing to recognize the GPU (without crashing; the crash is still present with boinc 6.10.17), so the proposed fix is working.
ID: 28910 · Report as offensive
elect

Joined: 13 Feb 10
Posts: 9
Germany
Message 30996 - Posted: 13 Feb 2010, 3:53:32 UTC - in response to Message 28910.  

A small update: with the new ATI driver installed, boinc runs with CPU computation after failing to recognize the GPU (without crashing; the crash is still present with boinc 6.10.17), so the proposed fix is working.


Any news?
ID: 30996 · Report as offensive
elect

Joined: 13 Feb 10
Posts: 9
Germany
Message 31001 - Posted: 13 Feb 2010, 11:36:52 UTC - in response to Message 30996.  

With BOINC 6.10.32 it doesn't crash any more, but it doesn't see the GPU.

Latest driver.

How can I check if I have CAL?
ID: 31001 · Report as offensive
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15480
Netherlands
Message 31002 - Posted: 13 Feb 2010, 11:45:40 UTC - in response to Message 31001.  

How can I check if I have CAL?

A list of Stream compatible cards can be found in the System Requirements list on http://developer.amd.com/gpu/ATIStreamSDK/Pages/default.aspx. If your GPU isn't in that list, it won't do a thing.

You will also need at least Catalyst 9.3 for BOINC 6.10.
See this thread for other possibilities why BOINC won't detect the GPU.
ID: 31002 · Report as offensive
elect

Joined: 13 Feb 10
Posts: 9
Germany
Message 31003 - Posted: 13 Feb 2010, 11:56:13 UTC - in response to Message 31002.  

How can I check if I have CAL?

A list of Stream compatible cards can be found in the System Requirements list on http://developer.amd.com/gpu/ATIStreamSDK/Pages/default.aspx. If your GPU isn't in that list, it won't do a thing.

You will also need at least Catalyst 9.3 for BOINC 6.10.
See this thread for other possibilities why BOINC won't detect the GPU.


Yes, 4870 x2.

I'm running 64-bit Ubuntu with Catalyst 9.10, so should it be CAL 1.2?
ID: 31003 · Report as offensive
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15480
Netherlands
Message 31004 - Posted: 13 Feb 2010, 12:06:12 UTC - in response to Message 31003.  

No, CAL 1.2 is Cat 8.12 - 9.2
CAL 1.4 is Cat 9.3 and above.

But even then, on any system that hasn't got Windows on it, the ATI detection is somewhat hit and miss. Most probably due to BOINC just not being able to detect what drivers you're using.
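
Catalyst release numbers are year.month pairs, so 9.10 (October 2009) is later than 9.3 (March 2009) even though it looks smaller when read as a decimal. Here is a small C sketch of the mapping given above, assuming only the two ranges stated in this post. The enum and function names are made up for illustration, and releases older than 8.12 are reported as unknown because the post does not cover them.

```c
#include <assert.h>

enum cal_release { CAL_UNKNOWN = 0, CAL_1_2, CAL_1_4 };

/* Orders two Catalyst releases as (year, month) pairs, not as decimals.
   Returns -1, 0 or 1 like a typical comparison function. */
static int catalyst_cmp(int year_a, int month_a, int year_b, int month_b) {
    if (year_a != year_b) return year_a < year_b ? -1 : 1;
    if (month_a != month_b) return month_a < month_b ? -1 : 1;
    return 0;
}

/* Maps a Catalyst release to the CAL version named in the post:
   8.12 through 9.2 -> CAL 1.2, 9.3 and above -> CAL 1.4. */
enum cal_release cal_for_catalyst(int year, int month) {
    assert(month >= 1 && month <= 12);
    if (catalyst_cmp(year, month, 8, 12) < 0) return CAL_UNKNOWN;
    if (catalyst_cmp(year, month, 9, 2) <= 0) return CAL_1_2;
    return CAL_1_4;
}
```

With this mapping, cal_for_catalyst(9, 10) and cal_for_catalyst(9, 12) both give CAL_1_4.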
ID: 31004 · Report as offensive
elect

Joined: 13 Feb 10
Posts: 9
Germany
Message 31005 - Posted: 13 Feb 2010, 12:33:25 UTC - in response to Message 31004.  

No, CAL 1.2 is Cat 8.12 - 9.2
CAL 1.4 is Cat 9.3 and above.


So, with 9.12, which CAL do I have?
ID: 31005 · Report as offensive
Orhan

Joined: 13 Feb 10
Posts: 1
Turkey
Message 31008 - Posted: 13 Feb 2010, 14:00:59 UTC - in response to Message 31005.  

I have a problem with BOINC too. My machine is an Intel i5 750 2.66 GHz with 4 GB of 1333 MHz RAM and Windows 7, but I cannot leave it running all day because I get a blue screen when BOINC is open. I have not found the cause. Sometimes I get a memory management error. What can I do to solve this?
ID: 31008 · Report as offensive

Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.