Sudden Disconnect

Message boards : Questions and problems : Sudden Disconnect

Bauglir
Joined: 25 May 10
Posts: 9
United States
Message 33011 - Posted: 25 May 2010, 15:39:03 UTC

I've just downloaded BOINC 6.10.56 and am running it on Ubuntu 9.10. I'm trying to connect to the World Community Grid. When I initially run the manager, I get the window that has me attach to a project, and after going through that everything seems to be running fine (I'm connected to localhost). But within a second of the actual file downloads starting, everything in the Messages tab is greyed out, downloads stop, and the lower right of the GUI says "Disconnected". It will then try to reconnect, and fail. I can't attach to a new project, and World Community Grid vanishes from my Projects tab. Any ideas on what's up? Let me know if I need to supply additional info.
ID: 33011
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 14955
Netherlands
Message 33014 - Posted: 25 May 2010, 16:30:31 UTC - in response to Message 33011.  

No, but let me move your post to the Q&P forum. This isn't so much a problem with BOINC Manager, more like it loses the connection with the client.

Is the client still running? (check in top)
Did you allow both the boinc and boincmgr binaries through your firewall on TCP port 31416?
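
One quick way to test the port question is to try opening a TCP connection to it yourself. The Python sketch below is only an illustration: `can_connect` is a made-up helper name, and it assumes the client is listening on localhost port 31416 as mentioned above. A refused connection suggests the client is not running or is being blocked.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # 31416 is the port named in this thread; adjust if your setup differs.
    if can_connect("127.0.0.1", 31416):
        print("Something is accepting connections on port 31416")
    else:
        print("Nothing reachable on port 31416; is the client running?")
```

This only shows whether something accepts connections on that port; it does not confirm that the listener is actually the BOINC client.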
ID: 33014
Bauglir
Joined: 25 May 10
Posts: 9
United States
Message 33015 - Posted: 25 May 2010, 21:12:26 UTC

Ah, sorry about the misplacement. Thanks for moving it. I didn't have a firewall before, but you reminded me that I ought to install one. Having done so, and having enabled that exception, it now works for some inexplicable reason. I'd have thought the complete absence of a firewall would've made no difference, but things seem to be working, so thank you.
ID: 33015
Bauglir
Joined: 25 May 10
Posts: 9
United States
Message 33030 - Posted: 26 May 2010, 3:33:11 UTC - in response to Message 33015.  

Actually, strike that. I should have waited longer; it happened again. I think that the problem may have been that there was no work available earlier when I tried it, but this time it downloaded several files and the same thing happened. Watching the client in the system monitor, it spontaneously vanishes when the manager begins experiencing problems.
ID: 33030
Les Bayliss
Help desk expert
Joined: 25 Nov 05
Posts: 1636
Australia
Message 33032 - Posted: 26 May 2010, 6:10:47 UTC - in response to Message 33030.  

Is it possible that you're downloading LOTS of WUs?
If so, then you're probably overloading the manager, which tries to update its info once per second.

To cure this, a new option and button were created in the Tasks tab. Its label (and function) alternates between "Show active tasks" and "Show all tasks".
Set it so that it displays the latter, which means it is currently showing only active tasks.

ID: 33032
Gundolf Jahn
Joined: 20 Dec 07
Posts: 1069
Germany
Message 33033 - Posted: 26 May 2010, 6:59:00 UTC - in response to Message 33032.  

Is it possible that you're downloading LOTS of WUs?
If so, then you're probably overloading the manager, which tries to update its info once per second.

I don't think so, because in that case the boinc client wouldn't vanish, only the connection to the manager would be interrupted.

@Bauglir: Aren't there any entries in the message log before the exit of the client? I'm not sure about the filenames with unix, but they should start with "stdout" or "stderr" and you should look for them in the BOINC data directory, as denoted in the BOINC startup messages.
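
To make that search concrete, here is a rough Python sketch that lists the files whose names start with "stdout" or "stderr" in a given directory, along with their size and last-modified time. The helper name `find_client_logs` and the example path are placeholders of mine; use whatever data directory your BOINC startup messages report.

```python
from datetime import datetime
from pathlib import Path


def find_client_logs(data_dir: str) -> list[Path]:
    """Return files in data_dir whose names start with 'stdout' or 'stderr'."""
    return sorted(
        p for p in Path(data_dir).iterdir()
        if p.is_file() and p.name.startswith(("stdout", "stderr"))
    )


if __name__ == "__main__":
    # Placeholder path: replace with the data directory shown at BOINC startup.
    data_dir = "."
    for log in find_client_logs(data_dir):
        info = log.stat()
        modified = datetime.fromtimestamp(info.st_mtime)
        print(f"{log.name}: {info.st_size} bytes, "
              f"last modified {modified:%Y-%m-%d %H:%M:%S}")
```

The modification times make it easy to see which file was written to most recently.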

Gruß,
Gundolf
ID: 33033
Bauglir
Joined: 25 May 10
Posts: 9
United States
Message 33043 - Posted: 26 May 2010, 15:31:11 UTC
Last modified: 26 May 2010, 15:32:47 UTC

Well, I've found stdoutdae. There are no error messages listed in there, and, oddly, it continues adding entries even after the client has exited for a minute or two. It also seems to have a few extra copies of the preferences and benchmark message logs. I am, at this point, mystified, as it's doing things that don't seem like they should be possible, unless I misunderstand how the client works with downloads.

Just to clarify, there seems to be no indication in the log file that anything's gone wrong with the client when it exits.
ID: 33043
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 14955
Netherlands
Message 33046 - Posted: 26 May 2010, 16:47:40 UTC - in response to Message 33043.  
Last modified: 26 May 2010, 16:47:58 UTC

Well, I've found stdoutdae. There are no error messages listed in there

Error messages from the BOINC client are recorded in stderrdae, not stdoutdae.
ID: 33046
Gundolf Jahn
Joined: 20 Dec 07
Posts: 1069
Germany
Message 33049 - Posted: 26 May 2010, 17:21:31 UTC - in response to Message 33043.  

Just to clarify, there seems to be no indication in the log file that anything's gone wrong with the client when it exits.

Just to be sure: are you aware that the client consists of two programs, the core client and the manager? So, the client can continue running when the manager exits.

Gruß,
Gundolf
ID: 33049
Bauglir
Joined: 25 May 10
Posts: 9
United States
Message 33054 - Posted: 26 May 2010, 20:38:59 UTC - in response to Message 33049.  

Just to clarify, there seems to be no indication in the log file that anything's gone wrong with the client when it exits.

Just to be sure: are you aware that the client consists of two programs, the core client and the manager? So, the client can continue running when the manager exits.

Gruß,
Gundolf


Yeah; the manager doesn't exit, the client does. At least, according to the system monitor (the equivalent of the Task Manager in Windows), the process for the client dies, but the process for the manager keeps functioning (and the window stays open, it just loses all functionality). The weird thing is that the stdoutdae file seems to keep recording log info even after the client appears to exit. If it helps, this log information is not displayed in the manager, even though that remains open.

I'm away from my home computer right now, but I'll check to see if stderrdae is around when I get back, although I don't remember it. Perhaps that will prove enlightening.
ID: 33054
Gundolf Jahn
Joined: 20 Dec 07
Posts: 1069
Germany
Message 33055 - Posted: 26 May 2010, 21:27:40 UTC - in response to Message 33054.  

The weird thing is that the stdoutdae file seems to keep recording log info even after the client appears to exit...

That's why I asked in the first place ;-) Really weird!

Gruß,
Gundolf
ID: 33055
Les Bayliss
Help desk expert
Joined: 25 Nov 05
Posts: 1636
Australia
Message 33056 - Posted: 26 May 2010, 22:35:48 UTC - in response to Message 33054.  

the manager doesn't exit, the client does.

the process for the client dies, but the process for the manager keeps functioning

These two statements contradict each other.
There's no manager, but the manager keeps functioning?! Not possible.

Also:
... this log information is not displayed in the manager, even though that remains open.


It sounds as though you're confused about which is which, which is not a good thing when we're trying to read your mind as to what you're seeing.

ID: 33056
Bauglir
Joined: 25 May 10
Posts: 9
United States
Message 33057 - Posted: 26 May 2010, 22:38:41 UTC
Last modified: 26 May 2010, 22:46:43 UTC

OK, stderrdae has two very similar errors (and these are the only errors in the file):

    SIGSEGV: segmentation violation
    Stack trace (12 frames):
    /home/xerxes/Downloads/BOINC/boinc(boinc_catch_signal+0x64)[0x80af514]
    [0x2bc400]
    /lib/tls/i686/cmov/libc.so.6(memset+0x37)[0x3d7b57]
    /home/xerxes/Downloads/BOINC/boinc[0x80aa62f]
    /home/xerxes/Downloads/BOINC/boinc[0x8058101]
    /home/xerxes/Downloads/BOINC/boinc[0x805881c]
    /home/xerxes/Downloads/BOINC/boinc[0x8073cd2]
    /home/xerxes/Downloads/BOINC/boinc[0x805f289]
    /home/xerxes/Downloads/BOINC/boinc[0x809585c]
    /home/xerxes/Downloads/BOINC/boinc[0x8095c58]
    /lib/tls/i686/cmov/libc.so.6(__libc_start_main+0xe6)[0x379b56]
    /home/xerxes/Downloads/BOINC/boinc(__gxx_personality_v0+0x195)[0x804c121]

    Exiting...
    SIGSEGV: segmentation violation
    Stack trace (12 frames):
    /home/xerxes/Downloads/BOINC/boinc(boinc_catch_signal+0x64)[0x80af514]
    [0xd5f400]
    /lib/tls/i686/cmov/libc.so.6(memset+0x37)[0x1aab57]
    /home/xerxes/Downloads/BOINC/boinc[0x80aa62f]
    /home/xerxes/Downloads/BOINC/boinc[0x8058101]
    /home/xerxes/Downloads/BOINC/boinc[0x805881c]
    /home/xerxes/Downloads/BOINC/boinc[0x8073cd2]
    /home/xerxes/Downloads/BOINC/boinc[0x805f289]
    /home/xerxes/Downloads/BOINC/boinc[0x809585c]
    /home/xerxes/Downloads/BOINC/boinc[0x8095c58]
    /lib/tls/i686/cmov/libc.so.6(__libc_start_main+0xe6)[0x14cb56]
    /home/xerxes/Downloads/BOINC/boinc(__gxx_personality_v0+0x195)[0x804c121]

    Exiting...



Where xerxes is my user account. Am thinking there's no sufficiently identifying information in there to be risky. These are beyond what little knowledge I have to understand. Thanks for your help so far.

EDIT: No, they don't contradict each other, as far as I can tell. In the first, I say the manager doesn't die and the client does, and in the second, I say the process for the client dies but the one for the manager doesn't. In the third I say that the manager remains open, because that's the part that has a window; the client does not, right?

ID: 33057
Les Bayliss
Help desk expert
Joined: 25 Nov 05
Posts: 1636
Australia
Message 33058 - Posted: 26 May 2010, 23:06:38 UTC - in response to Message 33057.  
Last modified: 26 May 2010, 23:20:13 UTC

Sorry. When I read your previous post, the first part said: "the manager doesn't exist". When I re-read it just now, the word had changed to exit. :(

This problem has been getting worse lately. Sometimes I "read" a word that isn't even there.

***************

However:
If it helps, this log information is not displayed in the manager, even though that remains open

The manager gets its info from the client, not the error file.
If there's no client running, then the manager can't display anything about it. Even if the client is running, if the two parts can't communicate, then you'll also get nothing. In this case at least, the message on the bottom line of the manager, near the right-hand side, will be Disconnected, as you said earlier.
ID: 33058
Les Bayliss
Help desk expert
Joined: 25 Nov 05
Posts: 1636
Australia
Message 33059 - Posted: 26 May 2010, 23:33:45 UTC

There may be 2 problems, but only because I can't see how it's happening.

Splitting them up:
1) The client (boinc; boinc.exe on Windows) "disappears".
2) Messages are still being written after the disappearances.

For 2), perhaps the writing is due to a delayed write-to-disk.
Also, note the absence of dates. The messages may have been there for a while.

For 1): Is the program itself still on the disk?
If so, then perhaps "disappeared" should be described as "stops running".

So the question translates to: What causes the client to spontaneously stop running in a Linux system?
And is it WCG specific?

(The only thing that *I* know of, is an overly aggressive AV program quarantining parts of BOINC, or the science apps.)

ID: 33059
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 14955
Netherlands
Message 33060 - Posted: 27 May 2010, 0:26:46 UTC - in response to Message 33057.  

SIGSEGV: segmentation violation

/lib/tls/i686/cmov/libc.so.6(__libc_start_main+0xe6)[0x379b56]

OK, two things.

Signal 11, the SIGSEGV segmentation violation, is most commonly caused by:
1. Not having installed the 32-bit compatibility libraries on a 64-bit Linux.
2. A stack or page fault.

1 can be overcome by installing the 32-bit compatibility libraries for Ubuntu.
2 can be caused by bad RAM or a page file that is too small.

Second, check that libc.so.6 is on your system.
Better yet, open a console window in the directory where you installed the BOINC binaries, type ldd boinc (that's el dee dee) and hit Enter. Are any libraries missing? If so, install them. Do the same for boincmgr.
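
If the ldd output is long, the missing entries are the lines ending in "not found". A small Python sketch that picks them out follows. `missing_libraries` and `check_binary` are hypothetical helper names, and the sample text is made up for illustration (libexample.so.1 is not a real BOINC dependency).

```python
import subprocess


def missing_libraries(ldd_output: str) -> list[str]:
    """Return library names that ldd reported as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        # Unresolved entries look like: "libfoo.so.1 => not found"
        name, arrow, target = line.strip().partition(" => ")
        if arrow and target.strip() == "not found":
            missing.append(name)
    return missing


def check_binary(path: str) -> list[str]:
    """Run ldd on a binary and return its unresolved libraries."""
    result = subprocess.run(["ldd", path], capture_output=True, text=True)
    return missing_libraries(result.stdout)


if __name__ == "__main__":
    # Invented sample, not output from a real BOINC install.
    sample = (
        "\tlibexample.so.1 => not found\n"
        "\tlibc.so.6 => /lib/tls/i686/cmov/libc.so.6 (0x00110000)\n"
    )
    print(missing_libraries(sample))
```

On a real system you would call `check_binary("./boinc")` and `check_binary("./boincmgr")` from the BOINC directory; an empty list means ldd resolved every library.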

That stdoutdae is still written to after the client has exited can be caused by lag. Normally, when you run the client from a console window, it won't store the output to stdoutdae immediately, but only when it hits the maximum number of lines it can show (anywhere between 1,000 and 2,500) or when you exit the client. The output that's still in memory will then be written to this file.

If you run the boinc binary from a console with the --redirectio flag, the output is written to the stdoutdae file immediately.
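
The delay described here resembles ordinary output buffering, which is easy to see outside BOINC. The Python sketch below is a generic illustration and not BOINC's actual code: it writes one line to a file through a buffer and shows that the file stays empty on disk until the buffer is flushed. The helper name and the file name demo_log.txt are invented for the example.

```python
import os
import tempfile


def sizes_before_and_after_flush(path: str) -> tuple[int, int]:
    """Write one line through a buffer; return on-disk size before and after flush."""
    with open(path, "w", buffering=8192) as log:
        log.write("example log line\n")
        before = os.path.getsize(path)  # data is still held in memory
        log.flush()                     # push the buffered data to the file
        after = os.path.getsize(path)
    return before, after


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # demo_log.txt is an invented name, unrelated to BOINC's own files.
        path = os.path.join(tmp, "demo_log.txt")
        before, after = sizes_before_and_after_flush(path)
        print(f"size before flush: {before} bytes, after flush: {after} bytes")
```

A program that exits normally flushes what is left in its buffers, which would explain log lines that only show up in the file around exit time.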
ID: 33060
Bauglir
Joined: 25 May 10
Posts: 9
United States
Message 33061 - Posted: 27 May 2010, 3:53:55 UTC

Hooray, seems to be working. Between deleting my old virus scanner, reinstalling the libc6 package, increasing my swap file to a preposterous 14 GiB, and giving the whole thing another reboot afterward, it's been functioning for a fair bit with actual work being done. I appreciate your help. Thank you!
ID: 33061

Copyright © 2022 University of California. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.