6.6.40 SIGSEGV on Gentoo

Trog Dog
Joined: 6 May 06
Posts: 287
Australia
Message 27659 - Posted: 29 Sep 2009, 13:53:19 UTC

The current Gentoo ebuild of the 6.6.40 client crashes with a SIGSEGV in certain circumstances.

In my case it seems to be related to RCN (Rectilinear Crossing Number) work units failing to upload:

29-Sep-2009 21:23:25 [---] Contacting account manager at http://bam.boincstats.com/
29-Sep-2009 21:23:25 [Rectilinear Crossing Number] Started upload of W8_0156_0041_1657_0961_18.lmd
29-Sep-2009 21:23:25 [Rectilinear Crossing Number] Started upload of W8_0156_0041_1657_0961_19.lmd
29-Sep-2009 21:23:25 [Milkyway@home] Restarting task de_s222_3s_random_2p_06r_21_473551_1253970184_0 using milkyway version 18
29-Sep-2009 21:23:25 [Docking@Home] Sending scheduler request: To report completed tasks.
29-Sep-2009 21:23:25 [Docking@Home] Reporting 1 completed tasks, not requesting new tasks
29-Sep-2009 21:23:30 [Rectilinear Crossing Number] [error] Error reported by file upload server: nbytes missing or negative
29-Sep-2009 21:23:30 [Rectilinear Crossing Number] [error] Error reported by file upload server: nbytes missing or negative
29-Sep-2009 21:23:30 [Rectilinear Crossing Number] Giving up on upload of W8_0156_0041_1657_0961_18.lmd: permanent upload error
SIGSEGV: segmentation violation
Stack trace (8 frames):
/usr/bin/boinc_client(boinc_catch_signal+0x4d)[0x474bdd]
/lib/libpthread.so.0[0x7fa95a814a00]
/usr/bin/boinc_client[0x4571b5]
/usr/bin/boinc_client[0x457469]
/usr/bin/boinc_client[0x4210e8]
/usr/bin/boinc_client[0x455a80]
/lib/libc.so.6(__libc_start_main+0xe6)[0x7fa95a24c5c6]
/usr/bin/boinc_client(__gxx_personality_v0+0x1d9)[0x405de9]

Exiting... 


The workarounds so far seem to be detaching from all projects before upgrading, and deleting the offending work units.

The thread discussing this on the Gentoo forums can be found here: http://forums.gentoo.org/viewtopic-t-794402.html

Can anyone shed some light on this?

Cheers
ID: 27659
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15487
Netherlands
Message 27664 - Posted: 29 Sep 2009, 21:39:48 UTC

Forwarded to the developers.
ID: 27664
Rom Walton
Project developer
Joined: 26 Aug 05
Posts: 164
Message 27666 - Posted: 29 Sep 2009, 21:42:07 UTC

Could you rebuild the client software with symbols and reproduce the crash?

----- Rom
BOINC Development Team, U.C. Berkeley
ID: 27666
Trog Dog
Joined: 6 May 06
Posts: 287
Australia
Message 27673 - Posted: 30 Sep 2009, 12:04:31 UTC - in response to Message 27666.  

Recompiled with "-g" in CFLAGS and CXXFLAGS. This box is about an hour away from completing a SETI work unit. Once that completes I'll suspend all other work units and let a partially completed RCN work unit finish and attempt to upload. At that point it should crash, and I'll post the results.
ID: 27673
Trog Dog
Joined: 6 May 06
Posts: 287
Australia
Message 27674 - Posted: 30 Sep 2009, 12:15:42 UTC

It seems the RCN upload problem has already been noted at http://dist.ist.tugraz.at/cape5/forum_thread.php?id=492, and the solution is for RCN to upgrade their software.
ID: 27674
Trog Dog
Joined: 6 May 06
Posts: 287
Australia
Message 27694 - Posted: 1 Oct 2009, 8:56:49 UTC - in response to Message 27673.  

The client crashed as suspected


01-Oct-2009 07:02:27 [lhcathome] Scheduler request completed: got 0 new tasks
01-Oct-2009 07:08:36 [BCL@Home] Sending scheduler request: To fetch work.
01-Oct-2009 07:08:36 [BCL@Home] Requesting new tasks
01-Oct-2009 07:08:41 [BCL@Home] Scheduler request completed: got 0 new tasks
01-Oct-2009 07:09:47 [Rectilinear Crossing Number] Computation for task W8_0168_1081_0927_0001_0 finished
01-Oct-2009 07:09:48 [yoyo@home] Restarting task ogr_090926020004_5_0 using crunch version 205
01-Oct-2009 07:09:50 [Rectilinear Crossing Number] Started upload of W8_0168_1081_0927_0001.12
01-Oct-2009 07:09:50 [Rectilinear Crossing Number] Started upload of W8_0168_1081_0927_0001.13
01-Oct-2009 07:09:54 [Rectilinear Crossing Number] [error] Error reported by file upload server: nbytes missing or negative
01-Oct-2009 07:09:54 [Rectilinear Crossing Number] [error] Error reported by file upload server: nbytes missing or negative
01-Oct-2009 07:09:54 [Rectilinear Crossing Number] Giving up on upload of W8_0168_1081_0927_0001.12: permanent upload error
SIGSEGV: segmentation violation
Stack trace (8 frames):
/usr/bin/boinc_client(boinc_catch_signal+0x45)[0x473895]
/lib/libpthread.so.0[0x7fa1a2caca00]
/usr/bin/boinc_client[0x456225]
/usr/bin/boinc_client[0x4564c1]
/usr/bin/boinc_client[0x420b28]
/usr/bin/boinc_client[0x454b10]
/lib/libc.so.6(__libc_start_main+0xe6)[0x7fa1a26e45c6]
/usr/bin/boinc_client(__gxx_personality_v0+0x1d9)[0x405de9]

Exiting...


If you can tell me where to find the debug info you need, I can post it.
ID: 27694
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15487
Netherlands
Message 27699 - Posted: 1 Oct 2009, 9:54:13 UTC - in response to Message 27694.  

If all is well, it should be in stderrdae.txt.
ID: 27699
Trog Dog
Joined: 6 May 06
Posts: 287
Australia
Message 27700 - Posted: 1 Oct 2009, 13:19:57 UTC - in response to Message 27699.  

I don't seem to have a stderrdae.txt file; I'll get back to you on this. Is stderrdae.txt produced by the init.d script, or is it a compile-time option?
ID: 27700
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15487
Netherlands
Message 27703 - Posted: 1 Oct 2009, 14:28:44 UTC - in response to Message 27700.  

By default, the stderrdae.txt and stdoutdae.txt files can be found in your BOINC Data directory. The location of that directory is shown in the second or third line of the messages printed when BOINC starts up.
ID: 27703
Thyme Lawn
Joined: 2 Sep 05
Posts: 103
United Kingdom
Message 27707 - Posted: 1 Oct 2009, 19:04:21 UTC - in response to Message 27703.  

That will only happen on Linux if BOINC is started with the command-line argument --redirectio.
"The ultimate test of a moral society is the kind of world that it leaves to its children." - Dietrich Bonhoeffer
ID: 27707
Nicolas
Joined: 19 Jan 07
Posts: 1179
Argentina
Message 27708 - Posted: 1 Oct 2009, 19:22:33 UTC - in response to Message 27694.  

Do that again, but this time run the client manually under gdb. Be careful to start it from the correct directory (the BOINC data directory)!

trog@localhost /wherever/boinc/data/is$ gdb boinc_client
(gdb) run
[now wait for it to crash]
SIGSEGV: segmentation violation
(gdb) backtrace full

This time you should get a stack trace with debugging information (function names, source files and line numbers) instead of bare memory addresses such as [0x4564c1].

ID: 27708
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15487
Netherlands
Message 27709 - Posted: 1 Oct 2009, 19:23:29 UTC - in response to Message 27707.  

Ah thanks, Ian. I thought it did, but couldn't find the info on that so quickly.
ID: 27709
Ser
Joined: 2 Oct 09
Posts: 2
United States
Message 27733 - Posted: 2 Oct 2009, 17:10:55 UTC
Last modified: 2 Oct 2009, 17:12:18 UTC

I'm having a similar issue.

(gdb) run
Starting program: /var/tmp/portage/sci-misc/boinc-6.6.40-r1/work/boinc-6.6.40/client/boinc_client 
[Thread debugging using libthread_db enabled]
[New Thread 0x7fca0d4ca710 (LWP 19386)]
02-Oct-2009 13:08:33 [---] Starting BOINC client version 6.6.40 for x86_64-pc-linux-gnu
02-Oct-2009 13:08:33 [---] log flags: task, file_xfer, sched_ops
02-Oct-2009 13:08:33 [---] Libraries: libcurl/7.19.6 OpenSSL/0.9.8k zlib/1.2.3
02-Oct-2009 13:08:33 [---] Data directory: /var/lib/boinc
02-Oct-2009 13:08:33 [---] Processor: 4 GenuineIntel Intel(R) Core(TM)2 Quad  CPU   Q9450  @ 2.66GHz [Family 6 Model 23 Stepping 7]
02-Oct-2009 13:08:33 [---] Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm sse4_
02-Oct-2009 13:08:33 [---] OS: Linux: 2.6.31-gentoo
02-Oct-2009 13:08:33 [---] Memory: 7.82 GB physical, 0 bytes virtual
02-Oct-2009 13:08:33 [---] Disk: 4.59 GB total, 1.87 GB free
02-Oct-2009 13:08:33 [---] Local time is UTC -4 hours
02-Oct-2009 13:08:33 [---] No CUDA-capable NVIDIA GPUs found
02-Oct-2009 13:08:33 [---] No coprocessors
02-Oct-2009 13:08:33 [---] Not using a proxy
02-Oct-2009 13:08:33 [Einstein@Home] URL: http://einstein.phys.uwm.edu/; Computer ID: 2080924; location: (none); project prefs: default
02-Oct-2009 13:08:33 [SETI@home] URL: http://setiathome.berkeley.edu/; Computer ID: 5098045; location: (none); project prefs: default
02-Oct-2009 13:08:33 [Cosmology@Home] URL: http://cosmologyathome.org/; Computer ID: 62042; location: (none); project prefs: default
02-Oct-2009 13:08:33 [Milkyway@home] URL: http://milkyway.cs.rpi.edu/milkyway/; Computer ID: 106173; location: (none); project prefs: default
02-Oct-2009 13:08:33 [World Community Grid] URL: http://worldcommunitygrid.org/; Computer ID: 1045140; location: (none); project prefs: default
02-Oct-2009 13:08:33 [SETI@home] General prefs: from SETI@home (last modified 27-Sep-2009 22:35:10)
02-Oct-2009 13:08:33 [SETI@home] Host location: none
02-Oct-2009 13:08:33 [SETI@home] General prefs: using your defaults
02-Oct-2009 13:08:33 [---] Preferences limit memory usage when active to 4005.25MB
02-Oct-2009 13:08:33 [---] Preferences limit memory usage when idle to 6408.41MB
02-Oct-2009 13:08:33 [---] Preferences limit disk usage to 0.92GB
02-Oct-2009 13:08:33 [Milkyway@home] Task de_s222_3s_random_2p_09r_24_201356_1254108312_0 is 1.57 days overdue.
02-Oct-2009 13:08:33 [Milkyway@home] You may not get credit for it.  Consider aborting it.
02-Oct-2009 13:08:33 [Cosmology@Home] [error] File params_091209_231612_2.ini has wrong size: expected 1911, got 0
02-Oct-2009 13:08:33 [Cosmology@Home] Started download of params_091209_231612_2.ini
02-Oct-2009 13:08:33 [Cosmology@Home] Restarting task wu_092009_214546_1_1_0 using camb version 216
02-Oct-2009 13:08:33 [World Community Grid] Restarting task R00357_cf629280a859b673a0af23ff2a737752_03_005_6 using rice version 617
02-Oct-2009 13:08:33 [World Community Grid] Restarting task CMD2_0072-SOS1A.clustersOccur-1GPL_A.clustersOccur_38_1 using hcmd2 version 614
02-Oct-2009 13:08:33 [Milkyway@home] Restarting task de_s222_3s_random_2p_09r_24_201356_1254108312_0 using milkyway version 18
02-Oct-2009 13:08:33 [World Community Grid] Sending scheduler request: Requested by project.
02-Oct-2009 13:08:33 [World Community Grid] Reporting 1 completed tasks, not requesting new tasks
02-Oct-2009 13:08:36 [Cosmology@Home] Giving up on download of params_091209_231612_2.ini: file not found

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fca0d4ca710 (LWP 19386)]
0x00000000004523a3 in PERS_FILE_XFER::poll (this=0xc54cf0) at pers_file_xfer.cpp:273
273	        if (fxp->bytes_xferred || (fip->urls.size() > 1)) {
(gdb) bt f
#0  0x00000000004523a3 in PERS_FILE_XFER::poll (this=0xc54cf0)
    at pers_file_xfer.cpp:273
	retval = <value optimized out>
	diff = <value optimized out>
#1  0x00000000004526db in PERS_FILE_XFER_SET::poll (this=0xbae180)
    at pers_file_xfer.cpp:465
	i = 1
	action = false
	last_time = 1254503316.0588651
#2  0x000000000041f13a in CLIENT_STATE::poll_slow_events (this=0x6a1d00)
    at client_state.cpp:634
	actions = 3
	retval = <value optimized out>
	tasks_restarted = true
	last_suspend_reason = 0
	first = false
	old_now = 1254502400.0000005
	old_user_active = <value optimized out>
#3  0x0000000000450d48 in boinc_main_loop () at main.cpp:500
	retval = 0
#4  0x00007fca0bb88946 in __libc_start_main () from /lib/libc.so.6
No symbol table info available.
#5  0x0000000000406029 in _start ()
No symbol table info available.
(gdb) print fxp
$1 = (FILE_XFER *) 0x0
ID: 27733
Rom Walton
Project developer
Joined: 26 Aug 05
Posts: 164
Message 27734 - Posted: 2 Oct 2009, 22:01:00 UTC

I checked in the fix to the 6.6a branch.
----- Rom
BOINC Development Team, U.C. Berkeley
ID: 27734
Ser
Joined: 2 Oct 09
Posts: 2
United States
Message 27735 - Posted: 3 Oct 2009, 1:08:05 UTC - in response to Message 27734.  
Last modified: 3 Oct 2009, 1:14:56 UTC

Many thanks.

Here's the patch, in the format that should be added to the current Gentoo ebuild.

isaac@galapagos /usr/local/portage/sci-misc/boinc/files $ cat 6.6.40-xfersigsegv.patch 
--- boinc-6.6.40/client/pers_file_xfer.cpp.orig	2009-10-02 20:55:03.419212277 -0400
+++ boinc-6.6.40/client/pers_file_xfer.cpp	2009-10-02 20:55:43.626712540 -0400
@@ -270,7 +270,7 @@
         // so that we'll query file size on next retry.
         // Otherwise leave it as is, avoiding unnecessary size query.
         //
-        if (fxp->bytes_xferred || (fip->urls.size() > 1)) {
+        if (last_bytes_xferred || (fip->urls.size() > 1)) {
             fip->upload_offset = -1;
         }


I'll go make a bug report for the Gentoo folks, too.

Edit: Done.
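
For anyone reading the patch later: the gdb output earlier in this thread shows fxp was 0x0 at pers_file_xfer.cpp:273, so the old condition read a field through a null pointer. Below is a rough sketch of that pattern only. It is not the BOINC source. The types FileXfer, FileInfo and PersFileXfer, the give_up() helper and the placeholder URL are all made up for illustration, and it assumes (the thread does not show this) that the give-up path releases the transfer object after saving its byte count in last_bytes_xferred.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration only; not the real BOINC classes.
struct FileXfer {
    double bytes_xferred = 0;
};

struct FileInfo {
    std::vector<std::string> urls;
    double upload_offset = 0;
};

struct PersFileXfer {
    FileXfer* fxp = nullptr;        // active transfer, null once released
    FileInfo* fip = nullptr;        // file being transferred (not owned)
    double last_bytes_xferred = 0;  // cached copy that outlives fxp

    // Assumed behaviour: giving up on a transfer releases the transfer object.
    void give_up() {
        if (fxp != nullptr) {
            last_bytes_xferred = fxp->bytes_xferred;
            delete fxp;
            fxp = nullptr;
        }
    }

    // Mirrors the patched condition: read the cached value, never fxp,
    // because fxp may already be null at this point.
    void reset_offset_if_needed() {
        assert(fip != nullptr);
        if (last_bytes_xferred > 0 || fip->urls.size() > 1) {
            fip->upload_offset = -1;  // so the file size is queried on the next retry
        }
    }
};

// Small driver: an upload that fails after some bytes were already sent.
double demo_offset_after_failure() {
    FileInfo info;
    info.urls.push_back("http://example.invalid/upload");  // placeholder URL
    info.upload_offset = 100;

    PersFileXfer pfx;
    pfx.fip = &info;
    pfx.fxp = new FileXfer();
    pfx.fxp->bytes_xferred = 512;

    pfx.give_up();                 // fxp is now null
    pfx.reset_offset_if_needed();  // safe: uses last_bytes_xferred
    return info.upload_offset;
}
```

In this sketch demo_offset_after_failure() returns -1. Writing the check as fxp->bytes_xferred instead would dereference the null pointer after give_up(), which is the kind of access the one-line patch removes.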
ID: 27735

Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.