id	summary	reporter	owner	description	type	status	priority	milestone	component	version	resolution	keywords	cc
705	Broadband fault causes BOINC 6.2.14 crash	Thyme Lawn	davea	An exchange hardware fault caused my (always on) broadband connection to drop last night.  BOINC 6.2.14 (protected install on XP and Vista) had stopped running on both systems when I checked them this morning.\r\n\r\nLooking at stdoutdae.txt, it's clear that BOINC didn't detect the network failure, and everything was fine as long as it was only attempting scheduler requests.  As soon as an upload was added into the equation, it crashed.\r\n\r\nHere's a scheduler request from stdoutdae.txt after the connection had failed:\r\n{{{\r\n29-Jul-2008 00:50:08 [CPDN Beta] Sending scheduler request: To send trickle-up message.  Requesting 0 seconds of work, reporting 0 completed tasks\r\n29-Jul-2008 00:50:11 [---] Project communication failed: attempting access to reference site\r\n29-Jul-2008 00:50:12 [---] Internet access OK - project servers may be temporarily down.\r\n29-Jul-2008 00:50:13 [CPDN Beta] Scheduler request failed: Server returned nothing (no headers, no data)\r\n}}}\r\nNote that the reference site check is being made '''before''' the scheduler request has failed and is being marked as successful.\r\n\r\nThe trickle-up and reference site check were retried 9 times before the following sequence, at which point boinc.exe crashed ('normal' scheduler requests take priority over trickle-ups):\r\n{{{\r\n29-Jul-2008 02:51:10 [malariacontrol.net] Computation for task wu_133_524_149170_0_1217280246_0 finished\r\n29-Jul-2008 02:51:10 [malariacontrol.net] Sending scheduler request: To fetch work.  
Requesting 818 seconds of work, reporting 1 completed tasks\r\n29-Jul-2008 02:51:12 [---] Project communication failed: attempting access to reference site\r\n29-Jul-2008 02:51:12 [malariacontrol.net] Started upload of wu_133_524_149170_0_1217280246_0_0\r\n}}}\r\nBOINC Windows Runtime Debugger didn't generate any stack traces on the XP system but on the Vista system the trace in stderrdae.txt indicates that the crash was in the libcurl function curl_multi_remove_handle():\r\n{{{\r\nBOINC Windows Runtime Debugger Version 6.2.14\r\n\r\nDump Timestamp    : 07/29/08 02:51:13\r\nDebugger Engine   : 4.0.5.0\r\n}}}\r\n\r\n{{{\r\n*** Dump of thread ID 44492 (state: Waiting): ***\r\n\r\n- Information -\r\nStatus: Wait Reason: UserRequest, , Kernel Time: 87828560.000000, User Time: 71604456.000000, Wait Time: 19143612.000000\r\n\r\n- Unhandled Exception Record -\r\nReason: Access Violation (0xc0000005) at address 0x0016D9FC read attempt to address 0x27273D84\r\n\r\n- Registers -\r\neax=01e40278 ebx=00d3fe00 ecx=00d3fe00 edx=00001caa esi=27273d74 edi=00000000\r\neip=0016d9fc esp=0129fda0 ebp=0129fe6c\r\ncs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00010206\r\n\r\n- Callstack -\r\nChildEBP RetAddr  Args to Child\r\n0129fe6c 0040c86f 00468c18 00000000 3fc68730 7554eab9 libcurl!curl_multi_remove_handle+0x0 \r\n0129fef0 00431e51 00000000 3fc68730 76cae0c5 00000000 boinc!+0x0 \r\n0129ff68 0043b467 00000000 001d19a0 76cad1da 00000001 boinc!+0x0 \r\n0129ff88 75854911 001d19a0 0129ffd4 76fce4b6 001d19a0 boinc!+0x0 \r\n0129ff94 76fce4b6 001d19a0 7dc5be09 00000000 00000000 kernel32!BaseThreadInitThunk+0x0 \r\n0129ffd4 76fce489 76cad1b9 001d19a0 00000000 00000000 ntdll!RtlInitializeExceptionChain+0x0 \r\n0129ffec 00000000 76cad1b9 001d19a0 00000000 43534552 ntdll!RtlInitializeExceptionChain+0x0 \r\n}}}\r\nWhile I was waiting for the faulty line card to be replaced I tried to get BOINC running again on both systems.   
Shortly after network operations started, all tasks stopped running with heartbeat failures:\r\n{{{\r\nNo heartbeat from core client for 31 sec - exiting\r\nCPDN Monitor - No 'heartbeat' from BOINC...\r\n}}}\r\nThe tasks could be kept running only by suspending networking until the exchange problem was fixed.	Defect	closed	Critical	6.2	Client - Daemon	6.2.14	fixed	network	
