segmentation violation

Message boards : BOINC client : segmentation violation

Profile Jord
Volunteer tester
Help desk expert
Avatar

Send message
Joined: 29 Aug 05
Posts: 15002
Netherlands
Message 16231 - Posted: 31 Mar 2008, 22:41:08 UTC - in response to Message 16225.  

iLuvatar wrote:
yup, worked for me, too.
so, what now? get rid of the wu?

You can only do that by editing the client_state.xml file.

I'm offering people in the other thread to show how you can get (temporarily) rid of LHC from the client_state.xml file. This will mean it wipes out any work from LHC that is trying to upload or report.
ID: 16231 · Report as offensive
Professor Ray

Send message
Joined: 31 Mar 08
Posts: 59
United States
Message 16234 - Posted: 31 Mar 2008, 22:44:15 UTC - in response to Message 16204.  

I'm in contact with one of the developers at this moment...


Well, that's good that you're talking to "developers". Perhaps you can tell 'em that there IS a general issue with the BOINC manager specific to the LHC client. It causes repeated DLL init errors on my machine. There is a thread on the LHC forum regarding that issue that I initiated some time ago, and no resolution yet (despite having migrated BOINC to the most current version). It appears that LHC WU are successfully completing in any case. It's just that using my machine while LHC is running is virtually impossible.

What's most annoying about this problem is that it causes my machine to hang for as long as two minutes. Even though the mouse cursor moves, it remains frozen with the "hand pointer" icon (nothing else is accessible on the desktop, although sometimes I can access the auto-hide taskbar).

The time period between "freezes" is variable, as is the time period of "freezing". The system invariably resumes whatever it was doing when it "wakes up". Sometimes, but not always, there'll be a "DLL init" error message in the BOINC messages pane. There is NEVER a stderr.txt file in the LHC slot folder (and stderrdae.txt is empty also).

No other BOINC client gives me this issue.

3/31/2008 6:29:09 PM|rosetta@home|URL: http://boinc.bakerlab.org/rosetta/; Computer ID: 609131; location: home; project prefs: home
3/31/2008 6:29:09 PM|boincsimap|URL: http://boinc.bio.wzw.tum.de/boincsimap/; Computer ID: 83721; location: home; project prefs: default
3/31/2008 6:29:09 PM|Einstein@Home|URL: http://einstein.phys.uwm.edu/; Computer ID: 1011976; location: home; project prefs: home
3/31/2008 6:29:09 PM|lhcathome|URL: http://lhcathome.cern.ch/lhcathome/; Computer ID: 9636853; location: home; project prefs: home
3/31/2008 6:29:09 PM|SETI@home|URL: http://setiathome.berkeley.edu/; Computer ID: 3829768; location: home; project prefs: home
3/31/2008 6:29:09 PM|Spinhenge@home|URL: http://spin.fh-bielefeld.de/; Computer ID: 100366; location: home; project prefs: default
3/31/2008 6:29:09 PM|uFluids|URL: http://www.ufluids.net/; Computer ID: 57580; location: home; project prefs: default
3/31/2008 6:29:09 PM|The Lattice Project|URL: http://boinc.umiacs.umd.edu/; Computer ID: 10196; location: home; project prefs: default
3/31/2008 6:29:09 PM|Leiden Classical|URL: http://boinc.gorlaeus.net/; Computer ID: 39374; location: home; project prefs: default

So dunno what to tell you, and it's a completely different issue than what started the thread, but now there's this new problem.


ID: 16234 · Report as offensive
Profile Jord
Volunteer tester
Help desk expert
Avatar

Send message
Joined: 29 Aug 05
Posts: 15002
Netherlands
Message 16235 - Posted: 31 Mar 2008, 22:48:28 UTC - in response to Message 16234.  
Last modified: 31 Mar 2008, 22:49:26 UTC

it's a completely different issue than what started the thread, but now there's this new problem.

That problem is something the developers are aware of and they can't fix it in the present 5.10 range. It'll have to wait for BOINC 6. Sorry.

edit: it's not confined to LHC only. Multiple projects have this problem with the application restarting.
ID: 16235 · Report as offensive
Professor Ray

Send message
Joined: 31 Mar 08
Posts: 59
United States
Message 16236 - Posted: 31 Mar 2008, 22:50:54 UTC - in response to Message 16225.  

yup, worked for me, too.
so, what now? get rid of the wu?


Why not wait until the server comes back up? The LHC web-site displays the "down for maintenance" page. I've got 23.5 hrs into the WU I'm trying to upload. No way do I want to lose credit for that.

Would suspending the LHC project prevent attempts by LHC to communicate to the server if network connection was reestablished in the activity drop-down menu?
ID: 16236 · Report as offensive
Profile Jord
Volunteer tester
Help desk expert
Avatar

Send message
Joined: 29 Aug 05
Posts: 15002
Netherlands
Message 16239 - Posted: 31 Mar 2008, 22:55:50 UTC - in response to Message 16236.  

Would suspending the LHC project prevent attempts by LHC to communicate to the server if network connection was reestablished in the activity drop-down menu?

No. As long as there's work to upload, suspending LHC or setting it to No new tasks won't work. The upload will try to continue, contact the scheduler, crash BOINC.

Suspending network activity prevents everything from uploading/downloading/reporting.

The problem is that we don't know for how long LHC is down. We don't even know why it's down. Could be they return in an hour, a day, a week. So the suspending of network activity is only of temporary use, especially if you're connected to multiple projects.
ID: 16239 · Report as offensive
Professor Ray

Send message
Joined: 31 Mar 08
Posts: 59
United States
Message 16240 - Posted: 31 Mar 2008, 22:56:08 UTC - in response to Message 16235.  

it's a completely different issue than what started the thread, but now there's this new problem.

That problem is something the developers are aware of and they can't fix it in the present 5.10 range. It'll have to wait for BOINC 6. Sorry.

edit: it's not confined to LHC only. Multiple projects have this problem with the application restarting.


Roger, copy that. But Duuuuuuuude, why am I ONLY having that problem with LHC? I concede that I've heard what you say about the DLL init error problem, but I DON'T see the problem with any other BOINC client (EVER). They run flawlessly.

Quite frankly, the problem strikes me as a priority issue. It seems to behave akin to setting Spybot's scan priority to high (I lose control of the system for inordinate amounts of time).
ID: 16240 · Report as offensive
Profile Jord
Volunteer tester
Help desk expert
Avatar

Send message
Joined: 29 Aug 05
Posts: 15002
Netherlands
Message 16242 - Posted: 31 Mar 2008, 22:59:07 UTC - in response to Message 16240.  

But Duuuuuuuude, why am I ONLY having that problem with LHC?

If we knew that... the problem would be easily fixed.

Why does it happen on some systems and not on others? Why only for some projects and not others? Why is rain wet and snow cold? um... ;-)
ID: 16242 · Report as offensive
Professor Ray

Send message
Joined: 31 Mar 08
Posts: 59
United States
Message 16243 - Posted: 31 Mar 2008, 23:00:34 UTC - in response to Message 16239.  

Would suspending the LHC project prevent attempts by LHC to communicate to the server if network connection was reestablished in the activity drop-down menu?

No. As long as there's work to upload, suspending LHC or setting it to No new tasks won't work. The upload will try to continue, contact the scheduler, crash BOINC.

Suspending network activity prevents everything from uploading/downloading/reporting.

The problem is that we don't know for how long LHC is down. We don't even know why it's down. Could be they return in an hour, a day, a week. So the suspending of network activity is only of temporary use, especially if you're connected to multiple projects.


BUMMMERS.

Well, my earliest deadline is 08 Apr 3 0539. The worst part is I've a Lattice Project WU download transfer pending. I'll sacrifice obtaining WU's as opposed to sacrificing credit for WU completed.

Thanks for your feedback.
ID: 16243 · Report as offensive
Profile Jord
Volunteer tester
Help desk expert
Avatar

Send message
Joined: 29 Aug 05
Posts: 15002
Netherlands
Message 16246 - Posted: 31 Mar 2008, 23:19:47 UTC

I have a how-to in this thread.
Read it all first, please.
If you have questions, do ask.
If you don't want to try it, then don't try it.
ID: 16246 · Report as offensive
iLuvatar

Send message
Joined: 31 Mar 08
Posts: 7
Switzerland
Message 16248 - Posted: 31 Mar 2008, 23:21:00 UTC

well, if lhc doesn't come back online soon, i'll have to remove the wu 'cause i'm running out of work ;-)
ok, i'll see you tomorrow
ageless, thx a lot for your help
ID: 16248 · Report as offensive
Nicolas

Send message
Joined: 19 Jan 07
Posts: 1179
Argentina
Message 16253 - Posted: 31 Mar 2008, 23:50:22 UTC

Anybody who can reproduce this crash on Linux, could you get me access to an SSH account on the machine having the problem? I can't reproduce the crash myself, but I progressed a bit in trying to find the cause. I think curl_multi_remove_handle is being passed an invalid handle.
ID: 16253 · Report as offensive
peschue

Send message
Joined: 1 Apr 08
Posts: 2
Austria
Message 16294 - Posted: 1 Apr 2008, 14:15:01 UTC
Last modified: 1 Apr 2008, 14:49:28 UTC

I have this problem and found out something strange:

sched_reply_lhcathome.cern.ch_lhcathome.xml contains a web page instead of a scheduler reply xml document.

I used wireshark to find out which request causes this:

My computer does a "POST /lhcathome_cgi/cgi HTTP/1.1" request on "lhcathome.cern.ch" and receives a "HTTP/1.1 301 Moved Permanently"

which contains (among a lot of other HTML) the message "The document has moved <a href="http://lhcathome.cern.ch/">here</a>"

The BOINC client then does a "GET /" on that URL and I think the result is stored in the sched_reply...xml file.

The reason is that LHC is currently down for maintenance, but this should not cause the reply XMLs to contain semi-garbage...
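
For anyone who wants to check their own file for the same symptom, here is a minimal sketch in Python. It assumes a genuine scheduler reply contains a `<scheduler_reply>` element near the top (that tag name is an assumption, not something confirmed in this thread), and the function name `classify_scheduler_reply` is made up for illustration.

```python
import re

def classify_scheduler_reply(text):
    """Return 'scheduler_reply', 'html', or 'unknown' for the file contents.

    Assumption: a real reply has a <scheduler_reply> element near the top.
    """
    # Only the beginning of the file is inspected, case-insensitively.
    head = text.lstrip()[:2048].lower()
    if "<scheduler_reply" in head:
        return "scheduler_reply"
    # Typical markers of a web page, such as a maintenance or redirect page.
    if re.search(r"<!doctype\s+html|<html[\s>]|<head[\s>]|<body[\s>]", head):
        return "html"
    return "unknown"
```

To use it, read sched_reply_lhcathome.cern.ch_lhcathome.xml from the BOINC data directory into a string and pass it in. A result of "html" would match what is described above.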
ID: 16294 · Report as offensive
Nicolas

Send message
Joined: 19 Jan 07
Posts: 1179
Argentina
Message 16298 - Posted: 1 Apr 2008, 15:02:26 UTC - in response to Message 16294.  

My computer does a "POST /lhcathome_cgi/cgi HTTP/1.1" request on "lhcathome.cern.ch" and receives a "HTTP/1.1 301 Moved Permanently"

which contains (among a lot of other HTML) the message "The document has moved <a href="http://lhcathome.cern.ch/">here</a>"

The BOINC client then does a "GET /" on that URL and I think the result is stored in the sched_reply...xml file.

Wow. HTTP specification says a 301 Moved Permanently is supposed to keep the same method. I can't believe curl is doing that wrong.
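
To see what a server sends back without any client-side redirect handling getting in the way, one option is to issue the POST with a client that never follows redirects. Python's `http.client` behaves that way. The sketch below is only an illustration: `RedirectingHandler` is a local stand-in that answers every POST with a 301, and the request body `<scheduler_request/>` is a placeholder, not a real scheduler request.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class RedirectingHandler(BaseHTTPRequestHandler):
    """Local stand-in server: answers every POST with a 301 pointing at /."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)  # drain the request body
        self.send_response(301)
        self.send_header("Location", "/")
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the example quiet

def post_without_following(host, port, path, body):
    """Send one POST and return (status, Location) without following redirects."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("POST", path, body=body,
                     headers={"Content-Type": "text/xml"})
        resp = conn.getresponse()
        resp.read()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()

def demo():
    """Start the stand-in server on a free local port and POST to it once."""
    server = HTTPServer(("127.0.0.1", 0), RedirectingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return post_without_following("127.0.0.1", server.server_port,
                                      "/lhcathome_cgi/cgi",
                                      b"<scheduler_request/>")
    finally:
        server.shutdown()
        server.server_close()
```

Calling `post_without_following` against a real host shows the raw status and Location header, so what the client does next with the redirect can be compared against a capture like the wireshark one quoted above.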
ID: 16298 · Report as offensive
Profile Jord
Volunteer tester
Help desk expert
Avatar

Send message
Joined: 29 Aug 05
Posts: 15002
Netherlands
Message 16309 - Posted: 1 Apr 2008, 17:06:25 UTC

LHC's scheduler is running again.
ID: 16309 · Report as offensive
Nicolas

Send message
Joined: 19 Jan 07
Posts: 1179
Argentina
Message 16311 - Posted: 1 Apr 2008, 18:43:13 UTC - in response to Message 16309.  

LHC's scheduler is running again.

Doesn't seem to be.
ID: 16311 · Report as offensive
iLuvatar

Send message
Joined: 31 Mar 08
Posts: 7
Switzerland
Message 16314 - Posted: 1 Apr 2008, 19:12:13 UTC

hi
i set user_network_request back to 2, started the client and everything went fine.
seems like they fixed the problem at lhc.
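
For reference, a rough sketch of that edit in Python. It assumes the setting lives in client_state.xml as a plain `<user_network_request>N</user_network_request>` element; the helper name `set_user_network_request` is made up for illustration, and the only value taken from this thread is 2. Do this with the client stopped, and keep a backup copy of the file.

```python
import re

def set_user_network_request(state_xml, mode):
    """Return state_xml with the <user_network_request> value replaced by mode.

    Assumption: the element appears exactly once and holds a plain integer.
    """
    pattern = re.compile(
        r"(<user_network_request>)\s*\d+\s*(</user_network_request>)")
    new_xml, count = pattern.subn(
        lambda m: f"{m.group(1)}{mode}{m.group(2)}", state_xml)
    if count != 1:
        # Refuse to guess if the element is missing or duplicated.
        raise ValueError(
            f"expected exactly one <user_network_request>, found {count}")
    return new_xml
```

The function works on the file contents as a string, so the caller reads client_state.xml, passes the text in with mode 2, and writes the result back.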
ID: 16314 · Report as offensive

Copyright © 2022 University of California. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.