Message boards : BOINC Manager : manager not responding
Author | Message |
---|---|
Send message Joined: 29 Aug 05 Posts: 15542 |
"I'm using 5.10.45, not the debug version, but the stderrdae.txt log DOES have a callstack in it if that helps." It does. Is it from today's crash? Can you email it to Rom? Anyone who has a stack trace of this happening, can you zip it up and email it to Rom Walton at romw at romwnet dot org? |
Send message Joined: 29 Aug 05 Posts: 15542 |
To get past the crashing, edit the client_state.xml file: scroll all the way down and change the network option so that it reads <user_network_request>3</user_network_request>. That means network activity is suspended. Be aware that while it is set, none of your projects will upload, download or report work. |
Send message Joined: 31 Mar 08 Posts: 8 |
Does anyone know if it'll be fixed when LHC comes back online? I have several work units in another project due tomorrow morning, it would be a shame to have the time spent crunching them wasted. |
Send message Joined: 29 Aug 05 Posts: 15542 |
It should be fixed when LHC comes back online. The problem is that we don't know when that will be, as no one knows why they are offline in the first place. So my edit there is only a temporary help. If you don't mind losing the work for LHC, I can tell you how to edit client_state.xml to take out LHC... Just let me know. |
Send message Joined: 9 Feb 08 Posts: 54 |
Jord, so will this edit wipe out completed tasks? Or just stop communication? Oh - I think this message answers that question. Oh no it doesn't! Wow. I'm not going anywhere near it. I'm just going to wait. Come on you LHC SysOps! Yaaaay! |
Send message Joined: 25 Nov 05 Posts: 1654 |
How about: in the Projects tab, select LHC and then click No new tasks. In the Tasks tab, select an LHC wu, then click Suspend. Repeat for all LHC wus. Would this stop the LHC problem, while allowing other projects to continue? Just as a temporary measure, to upload wus for other projects, then turn off Network access again. Also, doing this just before uploading the other wus will allow LHC work to continue up to that point. Rather fiddly for those with many projects and wus, but ... |
Send message Joined: 29 Aug 05 Posts: 15542 |
Les, the problem with this is that the BOINC core client has crashed already, so you can't set LHC to NNT. And it won't solve the crashing of the client, because as soon as LHC tries to contact the scheduler, BOINC crashes. Guy, it will wipe out LHC tasks that are trying to upload and those that are ready to report. It won't harm other projects. |
Send message Joined: 25 Nov 05 Posts: 1654 |
Sorry, I meant IN ADDITION to your temp fix. (After the client_state edit.) |
Send message Joined: 29 Aug 05 Posts: 15542 |
"Sorry, I meant IN ADDITION to your temp fix. (After the client_state edit.)" Ah.. yes. Although, the scheduler isn't there, so people should get an error that BOINC can't parse a scheduler reply. It won't crash BOINC. At least, it doesn't in my case. |
Send message Joined: 29 Aug 05 Posts: 15542 |
OK, I'll write a new thread for that. With a clear warning. |
Send message Joined: 29 Aug 05 Posts: 15542 |
I have a how to in this thread. Read it all first, please. If you have questions, do ask. If you don't want to try it, then don't try it. |
Send message Joined: 31 Mar 08 Posts: 59 |
Unhandled Exception Detected... - Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0x0035D9FC read attempt to address 0x65772075 Engaging BOINC Windows Runtime Debugger... ******************** BOINC Windows Runtime Debugger Version 5.10.45 Dump Timestamp : 03/31/08 18:21:19 Debugger Engine : 4.0.5.0 Symbol Search Path: C:\Program Files\BOINC;C:\Program Files\BOINC;srv*C:\DOCUME~1\Raygun\LOCALS~1\Temp\symbols*http://msdl.microsoft.com/download/symbols;srv*C:\DOCUME~1\Raygun\LOCALS~1\Temp\symbols*http://boinc.berkeley.edu/symstore [Then a bunch of mod-loads follow, none of which appear to be issues] *** Dump of the Process Statistics: *** - I/O Operations Counters - Read: 346, Write: 0, Other 755 - I/O Transfers Counters - Read: 0, Write: 14483, Other 0 - Paged Pool Usage - QuotaPagedPoolUsage: 21504, QuotaPeakPagedPoolUsage: 23616 QuotaNonPagedPoolUsage: 34880, QuotaPeakNonPagedPoolUsage: 38016 - Virtual Memory Usage - VirtualSize: 28655616, PeakVirtualSize: 30306304 - Pagefile Usage - PagefileUsage: 5455872, PeakPagefileUsage: 5861376 - Working Set Size - WorkingSetSize: 8048640, PeakWorkingSetSize: 8445952, PageFaultCount: 4303 *** Dump of the thread (454): *** - Information - Status: Ready, Kernel Time: 0.000000, User Time: 0.000000, Wait Time: 0.000000 - Registers - eax=004603d0 ebx=0012f8c4 ecx=00000061 edx=00000000 esi=7813e457 edi=7813ed1f eip=7c90eb94 esp=0012f1dc ebp=0012f1ec cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000212 - Callstack - ChildEBP RetAddr Args to Child 0012f1d8 7c90e57c 7c80a027 00000078 00000000 0012f89c ntdll!_KiFastSystemCallRet@0+0x0 FPO: [0,0,0] 0012f1dc 7c80a027 00000078 00000000 0012f89c 00445648 ntdll!_NtSetEvent@8+0x0 FPO: [2,0,0] 0012f1ec 00445648 00000078 0012f780 004455d0 7c884780 kernel32!_SetEvent@4+0x0 0012f200 7c863016 0012f8c4 00000000 00000000 00000000 boinc!boinc_catch_signal+0x0 (c:srcboincsvnbranchesboinc_core_release_5_10libdiagnostics_win.c:2142) 0012f89c 7c8436da 0012f8c4 7c839b09 0012f8cc 00000000 kernel32!_UnhandledExceptionFilter@4+0x0 (c:srcboincsvnbranchesboinc_core_release_5_10libdiagnostics_win.c:2142) 0012f8a4 7c839b09 0012f8cc 00000000 0012f8cc 00000000 kernel32!_BaseProcessStart@4+0x0 (c:srcboincsvnbranchesboinc_core_release_5_10libdiagnostics_win.c:2142) FPO: [0,0,0] 0012f8cc 7c9037bf 0012f9b8 0012ffe0 0012f9d4 0012f98c kernel32!__except_handler3+0x0 (c:srcboincsvnbranchesboinc_core_release_5_10libdiagnostics_win.c:2142) FPO: [3,0,7] 0012f8f0 7c90378b 0012f9b8 0012ffe0 0012f9d4 0012f98c ntdll!ExecuteHandler2@20+0x0 (c:srcboincsvnbranchesboinc_core_release_5_10libdiagnostics_win.c:2142) 0012f9a0 7c90eafa 00000000 0012f9d4 0012f9b8 0012f9d4 ntdll!ExecuteHandler@20+0x0 (c:srcboincsvnbranchesboinc_core_release_5_10libdiagnostics_win.c:2142) 0012f9a4 00000000 0012f9d4 0012f9b8 0012f9d4 c0000005 ntdll!_KiUserExceptionDispatcher@8+0x0 (c:srcboincsvnbranchesboinc_core_release_5_10libdiagnostics_win.c:2142) FPO: [2,0,0] *** Dump of the thread (7e4): *** - Information - Status: Waiting, Kernel Time: 0.000000, User Time: 0.000000, Wait Time: 0.000000 - Registers - eax=71a5d5af ebx=c0000000 ecx=7c913288 edx=ffffffff esi=00000000 edi=71a87558 eip=7c90eb94 esp=014aff7c ebp=014affb4 cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000202 - Callstack - ChildEBP RetAddr Args to Child 014aff78 7c90e31b 71a5d609 0000010c 014affbc 014affb0 ntdll!_KiFastSystemCallRet@0+0x0 FPO: [0,0,0] 014aff7c 71a5d609 0000010c 014affbc 014affb0 014affa4 ntdll!_ZwRemoveIoCompletion@20+0x0 FPO: [5,0,0] 014affb4 7c80b683 71a5d8ec 0012f700 7c90ee18 0015c378 mswsock!_SockAsyncThread@4+0x0 014affec 00000000 71a5d5af 0015c378 00000000 000000c8 kernel32!_BaseThreadStart@8+0x0 *** Debug Message Dump **** *** Foreground Window Data *** Window Name : Window Class : Window Process ID: 0 Window Thread ID : 0 Exiting... |
Send message Joined: 19 Jan 07 Posts: 1179 |
Professor Ray wrote: "*** Dump of the Process Statistics: ***" What BOINC version are you using? Unfortunately, that stackdump seems useless (I think it's a crash in the code making the crash report, so the stackdump points to the crash reporter code). |
Send message Joined: 31 Mar 08 Posts: 59 |
Plain vanilla 5.10.45 |
Send message Joined: 30 Sep 05 Posts: 50 |
Hi, I took a look, and find I have company. Yes, I'm running LHC as well. That would be when it broke. Thanks for all the help, guys. I haven't lost anything, and it reminded me how to install properly again. I learned that much. I wouldn't have suspected LHC. I've had tons of problems with Milkyway, and recently some with Cosmology, but this is the first I've heard with LHC. I apologize to any BOINC people that are within reading distance. I was wrong. BOINC has been, and is still, the most reliable piece of software I've run across, using three OSs over the years. I use it as a personal benchmark for my machines. I have added projects in the last couple years that are unfortunately not so well written. |
Send message Joined: 14 Dec 06 Posts: 16 |
"I'm using 5.10.45, not the debug version, but the stderrdae.txt log DOES have a callstack in it if that helps." Sent. I chopped off all the stuff from before today, and there were several crashes from 5.10.42 and several from 5.10.45, all due to today's crash. |
Send message Joined: 31 Mar 08 Posts: 59 |
Does this help? Unhandled Exception Detected... - Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0x0035D79C read attempt to address 0x65772075 Engaging BOINC Windows Runtime Debugger... ******************** BOINC Windows Runtime Debugger Version 6.1.12 Dump Timestamp : 03/31/08 20:58:27 Debugger Engine : 4.0.5.0 Symbol Search Path: C:\Program Files\BOINC;C:\Program Files\BOINC;srv*C:\DOCUME~1\Raygun\LOCALS~1\Temp\symbols*http://msdl.microsoft.com/download/symbols;srv*C:\DOCUME~1\Raygun\LOCALS~1\Temp\symbols*http://boinc.berkeley.edu/symstore [bla bla bla bla - mod loads - bla bla bla] *** Dump of the Process Statistics: *** - I/O Operations Counters - Read: 14579, Write: 0, Other 1102 - I/O Transfers Counters - Read: 0, Write: 12715, Other 0 - Paged Pool Usage - QuotaPagedPoolUsage: 23984, QuotaPeakPagedPoolUsage: 26052 QuotaNonPagedPoolUsage: 35544, QuotaPeakNonPagedPoolUsage: 38352 - Virtual Memory Usage - VirtualSize: 30932992, PeakVirtualSize: 33034240 - Pagefile Usage - PagefileUsage: 5541888, PeakPagefileUsage: 5943296 - Working Set Size - WorkingSetSize: 8396800, PeakWorkingSetSize: 8765440, PageFaultCount: 5448 *** Dump of thread ID 3132 (state: Waiting): *** - Information - Status: Wait Reason: UserRequest, , Kernel Time: 9413536.000000, User Time: 8512240.000000, Wait Time: 1862366.000000 - Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0x0035D79C read attempt to address 0x65772075 - Registers - eax=00f8a228 ebx=00c64580 ecx=00c64580 edx=003f0608 esi=65772065 edi=00000000 eip=0035d79c esp=0012fca0 ebp=0012fd6c cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00010206 - Callstack - ChildEBP RetAddr Args to Child 0012fd6c 0040c72f 00467c28 00000000 3fd322b0 7813eab9 libcurl!curl_multi_remove_handle+0x0 0012fdf0 00431501 00000000 3fd322b0 00000000 003f4ff8 boinc!+0x0 0012fe64 0043189d 00471aec 00000001 0012ffc0 00000000 boinc!+0x0 0012fe90 78136d6c f64df3d9 003f34b0 003f3498 7c90e027 boinc!+0x0 0012fedc 781323ff 781c3bc8 00000094 00000005 00000001 MSVCR80!__msize+0x0 0012ff28 0043c27e 0043c2ed f643d6b9 00471aec 0044e4d0 MSVCR80!__unlock+0x0 0012ff64 0043c2ed 0043c300 0044d670 0044d63a 0044d670 boinc!+0x0 0012ff68 0043c300 0044d670 0044d63a 0044d670 f643d7ad boinc!+0x0 0043c2ed 00000000 74ffc359 52e80424 f7ffffff f7c01bd8 boinc!+0x0 *** Dump of thread ID 3012 (state: Waiting): *** - Information - Status: Wait Reason: EventPairLow, , Kernel Time: 0.000000, User Time: 0.000000, Wait Time: 1862306.000000 - Registers - eax=71a5d5af ebx=c0000000 ecx=7c913288 edx=ffffffff esi=00000000 edi=71a87558 eip=7c90eb94 esp=014bff7c ebp=014bffb4 cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000202 - Callstack - ChildEBP RetAddr Args to Child 014bff78 7c90e31b 71a5d609 00000164 014bffbc 014bffb0 ntdll!_KiFastSystemCallRet@0+0x0 FPO: [0,0,0] 014bff7c 71a5d609 00000164 014bffbc 014bffb0 014bffa4 ntdll!_ZwRemoveIoCompletion@20+0x0 FPO: [5,0,0] 014bffb4 7c80b683 71a5d8ec 0012f714 7c90ee18 00161708 mswsock!_SockAsyncThread@4+0x0 014bffec 00000000 71a5d5af 00161708 00000000 000000c8 kernel32!_BaseThreadStart@8+0x0 *** Debug Message Dump **** *** Foreground Window Data *** Window Name : Window Class : Window Process ID: 0 Window Thread ID : 0 Exiting... |
Send message Joined: 19 Jan 07 Posts: 1179 |
"- Callstack -" Wow, a useful Windows stack dump for once :) It seems to show the same thing I found in the Linux stackdumps, so the problem is definitely pinpointed. Now to find the cause... :\ |
Send message Joined: 19 Jan 07 Posts: 1179 |
Anybody who can reproduce this crash on Linux, could you give me access to an ssh account on the machine having the problem? I can't reproduce the crash myself. |
Send message Joined: 19 Jan 07 Posts: 1179 |
"I'm going to look into this tomorrow, could be a problem that cropped up with this version of libcurl & perhaps also a new Microsoft patch level?" Look at the threads in the "core client" forum. It happens on Linux too; the problem trigger is definitely the LHC redirect. I tracked down which part of the code crashes, but I'm still not sure why. --sent from my iPod |
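
For anyone trying the client_state.xml workaround described earlier in the thread, here is a minimal Python sketch of that edit. It assumes BOINC is fully stopped first, that the file contains a <user_network_request> element holding a number, and that you point it at the client_state.xml in your own BOINC directory. The function names and the ".bak" backup suffix are made up for illustration and are not part of BOINC.

```python
import re
import shutil
from pathlib import Path

# Matches the element named in the thread, whatever number it currently holds.
_TAG = re.compile(r"(<user_network_request>)\s*\d+\s*(</user_network_request>)")


def suspend_network(state_text: str) -> str:
    """Return state_text with user_network_request set to 3.

    Per the thread, 3 means network activity is suspended.
    """
    new_text, count = _TAG.subn(r"\g<1>3\g<2>", state_text)
    if count == 0:
        # Assumption: refuse to guess where the element should go if it is missing.
        raise ValueError("no <user_network_request> element found")
    return new_text


def patch_state_file(path: Path) -> None:
    """Hypothetical helper: back up the file, then rewrite it in place."""
    shutil.copy2(path, path.with_name(path.name + ".bak"))
    path.write_text(suspend_network(path.read_text()))
```

With the client stopped, something like `patch_state_file(Path("client_state.xml"))` run from the BOINC directory would apply it. As the thread warns, while the value is 3 none of your projects will upload, download or report work, so remember to set network activity back once LHC is reachable again.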
Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License,
Version 1.2 or any later version published by the Free Software Foundation.