Message boards : Questions and problems : Projects "stalling" on my Macs
Author | Message |
---|---|
Joined: 19 May 08 Posts: 7 |
I've noticed on both of my Macs running BOINC that a project will occasionally "stall": it is listed as "Running" but its time does not update. A check in Activity Monitor shows it consuming CPU time, but its memory footprint is rather small. The other project (I run two projects at a time; the systems have two cores) appears normal. If I suspend work, the "stuck" project takes 200% CPU and doesn't stop. I need to shut down the Manager and restart it, and then all is well for a few days. This morning it was the SETI task, and the same happened last night on the other computer. Do you think this might be a SETI thing or a BOINC thing? Bill W |
Joined: 20 May 08 Posts: 1 |
I noticed the same thing a few days ago. It only happens with Seti on my Machine (MacBook Pro). Einstein@Home seems to be unaffected. It happens almost every day. Regards, Christian |
Joined: 21 May 08 Posts: 6 |
I'm seeing the same on my iMac. Very frustrating. My two projects are SETI and Rosetta. I can't tell which of the two is the cause of the problem. However ... it's getting worse. Now BOINC Manager won't run. I've uninstalled BOINC, checked that all the files are gone (using both Spotlight and a manual directory slash-and-burn), then re-installed 10.45. The installer completes normally, but the BOINC Manager hangs when run -- it just bounces in the Dock until I force quit it. I'm running Disk Utility to repair permissions, but I have limited time and patience to troubleshoot an application that is asking me to volunteer my CPU time. This despite more than eight years with SETI and friends. So ... for the moment BOINC is uninstalled and I'm haunting this board in the hopes of finding a solution. |
Joined: 29 Aug 05 Posts: 15533 |
I've asked the developer to come take a look at this thread. |
Joined: 29 Aug 05 Posts: 15533 |
I got word back from the developer. To be honest, he doesn't know. He suspects it's a problem with Seti, so it's best to report it there on the forums. If you're adventurous, you could try running BOINC 6.2.4, which you can download from http://boinc.berkeley.edu/download_all.php. Note that it needs at least Mac OS X 10.3.9. |
Joined: 29 Aug 05 Posts: 15533 |
A request from the developer. Can you make a cc_config.xml file, turn on the <app_msg_send> and <app_msg_receive> flags, and post the log here? (Not the whole log of course, only around 20 to 40 lines from when BOINC or BOINC Manager hangs.) Make the cc_config.xml file in your /Library/Application Support/BOINC Data directory and add to it:
<cc_config>
  <log_flags>
    <app_msg_send>1</app_msg_send>
    <app_msg_receive>1</app_msg_receive>
  </log_flags>
</cc_config>
If you already have a cc_config.xml file, just add the flags to it and make sure they're between the <log_flags> tags. With thanks. |
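For anyone who would rather not edit the XML by hand, here is a minimal Python sketch of the same step. It creates cc_config.xml if it is missing, or adds the two flags to an existing file while leaving other flags alone. The function name `enable_log_flags` is made up for illustration and is not part of BOINC; the flag names and the directory are the ones given in the post above.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# The two log flags requested in the post above.
FLAGS = ("app_msg_send", "app_msg_receive")


def enable_log_flags(config_path, flags=FLAGS):
    """Create cc_config.xml, or add the given flags to an existing one.

    Hypothetical helper: it only edits the file and does not talk to BOINC.
    """
    path = Path(config_path)
    if path.exists():
        tree = ET.parse(str(path))
        root = tree.getroot()
    else:
        root = ET.Element("cc_config")
        tree = ET.ElementTree(root)

    # Reuse the existing <log_flags> section so the flags end up between its tags.
    log_flags = root.find("log_flags")
    if log_flags is None:
        log_flags = ET.SubElement(root, "log_flags")

    for name in flags:
        flag = log_flags.find(name)
        if flag is None:
            flag = ET.SubElement(log_flags, name)
        flag.text = "1"

    tree.write(str(path), encoding="unicode")
    return path
```

It would be called as `enable_log_flags("/Library/Application Support/BOINC Data/cc_config.xml")`. Writing into that directory may need administrator rights, and the client presumably has to be restarted before it picks up the change; both points are assumptions rather than something confirmed in this thread.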
Joined: 21 May 08 Posts: 6 |
A request from the developer.
Not an option for me, since I can't even get BOINC Manager to run anymore, but I'm happy to try out the 6.2.4 version. Will report back ...
... well, that didn't take very long. 6.2.4 hangs, too. Here's the beginning of the (very, very long) trouble report after I "Force Quit" it. If anyone wants all of it, either show me a way to upload a text file or send me an e-mail address.
Problem Details
Date/Time: 2008-05-22 15:53:49 +0200
OS Version: 10.5.2 (Build 9C7010)
Architecture: i386
Report Version: 4
Command: BOINC
Path: /Applications/BOINCManager.app/Contents/MacOS/BOINCManager
Version: BOINC version 6.2.4 (6.2.4)
Parent: launchd [106]
PID: 15636
Event: hang
Time: 17.49s
Steps: 135
Process: BOINCManager [15636]
Path: /Applications/BOINCManager.app/Contents/MacOS/BOINCManager
ADDRESS BINARY
00001000 /Applications/BOINCManager.app/Contents/MacOS/BOINCManager
0070b000 /System/Library/TextEncodings/Unicode Encodings.bundle/Contents/MacOS/Unicode Encodings
00738000 /Applications/BOINCManager.app/Contents/Frameworks/SystemMenu.bundle/Contents/MacOS/SystemMenu
0da36000 /System/Library/CoreServices/RawCamera.bundle/Contents/MacOS/RawCamera
Thread id: 77b6000
User stack:
135 ??? [0x2099]
135 ??? [0x2172]
135 ??? [0xe76c]
135 ??? [0x1e80ea]
135 ??? [0xbb43e]
135 ??? [0x1fd8db]
135 ??? [0x1fda7a]
135 _ReceiveNextEvent + 58 (in HIToolbox) [0x9700ed1f]
135 _ReceiveNextEventCommon + 175 (in HIToolbox) [0x96ecb3f2]
135 _RunCurrentEventLoopInMode + 283 (in HIToolbox) [0x96ecb6a0]
135 _CFRunLoopRunInMode + 88 (in CoreFoundation) [0x902aad18]
135 _CFRunLoopRunSpecific + 4494 (in CoreFoundation) [0x902aab5e]
135 ??? [0x136705]
135 ??? [0xdee78]
135 ??? [0xdea4e]
135 ??? [0xde8ea]
135 ??? [0x2fd16]
135 ??? [0xb9caf]
135 ??? [0xb9aa8]
135 ??? [0x1f3d6]
135 ??? [0x1c6a4]
135 _recvfrom$NOCANCEL$UNIX2003 + 10 (in libSystem.B.dylib) [0x931f5f16]
Kernel stack:
135 _unix_syscall + 572 [0x3dcf13]
135 _recvfrom_nocancel + 301 [0x3aabcb]
135 _socketpair + 1219 [0x3aa875]
135 _soreceive + 1142 [0x3a61db]
135 _sbwait + 159 [0x3a7625]
135 _msleep + 157 [0x37fbc6]
135 _uiomove + 653 [0x37f815]
135 _lck_mtx_sleep + 87 [0x130a1e]
135 _thread_block + 33 [0x1369fb]
135 _thread_continue + 1181 [0x13678e]
Thread id: 4e5d8b8
User stack:
135 _thread_start + 34 (in libSystem.B.dylib) [0x931ddb12]
135 __pthread_start + 321 (in libSystem.B.dylib) [0x931ddc55]
135 __Z11CMMConvTaskPv + 54 (in ColorSync) [0x951ebd92]
135 __Z20pthreadSemaphoreWaitP18t_pthreadSemaphore + 42 (in ColorSync) [0x951d9460]
135 ___semwait_signal + 10 (in lib |
Joined: 21 May 08 Posts: 6 |
... and the new 6.2.6 hangs also, as shown below. Remember: this is after completely deleting BOINC and all files, so this is not project-dependent, this is a pure BOINC problem
Date/Time: 2008-05-29 14:21:03 +0200
OS Version: 10.5.2 (Build 9C7010)
Architecture: i386
Report Version: 4
Command: BOINC
Path: /Applications/BOINCManager.app/Contents/MacOS/BOINCManager
Version: BOINC version 6.2.6 (6.2.6)
Parent: launchd [106]
PID: 28832
Event: hang
Time: 8.39s
Steps: 48
Process: BOINCManager [28832]
Path: /Applications/BOINCManager.app/Contents/MacOS/BOINCManager
ADDRESS BINARY
00001000 /Applications/BOINCManager.app/Contents/MacOS/BOINCManager
0070c000 /System/Library/TextEncodings/Unicode Encodings.bundle/Contents/MacOS/Unicode Encodings
0073a000 /Applications/BOINCManager.app/Contents/Frameworks/SystemMenu.bundle/Contents/MacOS/SystemMenu
0da36000 /System/Library/CoreServices/RawCamera.bundle/Contents/MacOS/RawCamera |
Joined: 29 Aug 05 Posts: 15533 |
Can you please use the cc_config.xml option as I asked in this post and post the messages from that? That will help the developer who is monitoring this thread. |
Joined: 21 May 08 Posts: 6 |
Can you please use the cc_config.xml option as I asked in this post and post the messages from that? That will help the developer who is monitoring this thread. Jord, Well ... I can't give you any data for the simple reason that BOINC Manager hangs immediately on application launch, not after it's started to process any project data. (In fact, as I have already explained, there is no project data in the BOINC Data directory since I did a wipeout followed by a clean install.) Thus ... no log files to send because I never get that far. All I have is the BOINC app window with the basic interface, saying "Retrieving current status", and completely frozen. The only way to deal with it is to Force Quit. So ... what now? |
Joined: 29 Aug 05 Posts: 15533 |
So ... what now? Can you start the daemon, meaning BOINC itself? (Not the GUI aka BOINC manager). |
Joined: 21 May 08 Posts: 6 |
So ... what now? 1. The location of the daemon is far from obvious, and I've given up hunting for it. (It's not in StartupItems, etc.) 2. But ... overtaken by events. I upgraded to Mac OS X 10.5.3 and on reboot BOINC Manager started to run. I've reconnected to my projects (SETI, Rosetta) and am now processing work packages. It's too soon to tell if the other BOINC problem (stalling while processing a package) is going to return with BOINC 6.2.6, but that typically takes a few days to appear, if at all. Will report back a) if further problems occur, or b) in a week to report that there's nothing to report (which would be nice). |
Joined: 21 May 08 Posts: 6 |
Will report back a) if further problems occur, or b) in a week to report that there's nothing to report (which would be nice). <Sigh...> It ran OK for about 1.5 days, then locked my system at the point where the BOINC Screensaver was trying to connect to a project. My only option was to force a power-off, then reboot. Given that 6.2.6 is still in development, and anything earlier won't even start running on my 10.5.3 system, I've uninstalled everything and am taking a BOINC vacation until a new production release is out. Then, maybe, I'll try providing some log data. |
Joined: 19 May 08 Posts: 7 |
A request from the developer. Sorry I haven't been back to my own thread! I had another hang today and it was not Seti, it was a WCG project. I stopped and restarted the manager and it's now running again but I also put the cc_config.xml in the Boinc directory so we'll see what we find when it hangs again. I'll also do the same on the iMac ... I hope it hangs soon! The log file is going to get really big really fast! Bill W |
Joined: 19 May 08 Posts: 7 |
By the way, where are the log files kept? Once this hangs I'll have to search for the failure point ... |
Joined: 29 Aug 05 Posts: 15533 |
By the way, where are the log files kept? Once this hangs I'll have to search for the failure point ... In the stdoutdae.txt or stderrdae.txt file. |
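Since the earlier request was for only the last 20 to 40 lines, a small Python sketch like the one below could pull the end of each log file. It assumes the two files sit in the BOINC Data directory named earlier in this thread; the helper name `tail` and the default of 40 lines are chosen here for illustration.

```python
from collections import deque
from pathlib import Path

# Assumption: the log files live in the BOINC Data directory mentioned earlier in the thread.
DATA_DIR = Path("/Library/Application Support/BOINC Data")


def tail(path, count=40):
    """Return the last `count` lines of a text file, without trailing newlines."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        # A bounded deque keeps only the final `count` lines while reading.
        return [line.rstrip("\n") for line in deque(handle, maxlen=count)]


if __name__ == "__main__":
    for name in ("stdoutdae.txt", "stderrdae.txt"):
        log = DATA_DIR / name
        if log.exists():
            print(f"--- {name} ---")
            print("\n".join(tail(log)))
```

With the extra log flags switched on the files grow quickly, so reading only the tail avoids scrolling through the whole log after a hang.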
Joined: 19 May 08 Posts: 7 |
Well, I had another stall on the mini. The Seti app was running fine, the WCG was stalled. Here are a few of the messages from right before I quit the manager:
06-Jun-2008 10:06:22 [---] [app_msg_receive] got msg from slot 0: <current_cpu_time>6.752305052000000e+03</current_cpu_time> <checkpoint_cpu_time>6.719659688000001e+03</checkpoint_cpu_time> <fraction_done>0.49718582</fraction_done> <fpops_cumulative>32160826941935.445312</fpops_cumulative>
06-Jun-2008 10:06:23 [---] [app_msg_receive] got msg from slot 0: <current_cpu_time>6.753256623000000e+03</current_cpu_time> <checkpoint_cpu_time>6.719659688000001e+03</checkpoint_cpu_time> <fraction_done>0.49725296</fraction_done> <fpops_cumulative>32162704996354.652344</fpops_cumulative>
06-Jun-2008 10:06:24 [---] [app_msg_receive] got msg from slot 0: <current_cpu_time>6.754173933000000e+03</current_cpu_time> <checkpoint_cpu_time>6.719659688000001e+03</checkpoint_cpu_time> <fraction_done>0.49731774</fraction_done> <fpops_cumulative>32166281404858.515625</fpops_cumulative>
06-Jun-2008 10:06:25 [---] [app_msg_receive] got msg from slot 0: <current_cpu_time>6.755114933000000e+03</current_cpu_time> <checkpoint_cpu_time>6.719659688000001e+03</checkpoint_cpu_time> <fraction_done>0.49737761</fraction_done> <fpops_cumulative>32169789705230.421875</fpops_cumulative>
06-Jun-2008 10:06:26 [---] [app_msg_receive] got msg from slot 0: <current_cpu_time>6.756025538000000e+03</current_cpu_time> <checkpoint_cpu_time>6.719659688000001e+03</checkpoint_cpu_time> <fraction_done>0.49744287</fraction_done> <fpops_cumulative>32171562525318.519531</fpops_cumulative>
06-Jun-2008 10:06:27 [---] Exit requested by user
06-Jun-2008 10:06:27 [---] [app_msg_send] sent <quit/> to X0000046280299200502182056_1
06-Jun-2008 10:06:27 [---] [app_msg_send] sent <quit/> to 03mr08ab.27832.22158.13.8.28_4
Unfortunately the last "slot 1" message doesn't appear in the logs. When I restarted the manager both tasks were processing fine. |
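The excerpt above only contains messages from slot 0, which fits the report that the other task had gone quiet. To find the failure point in a large log, a rough Python sketch like this could record when each slot last sent a status message. The regular expression is modelled on the `[app_msg_receive]` lines quoted in this post; the function names and the 60-second threshold are assumptions made for illustration, not anything BOINC defines.

```python
import re
from datetime import datetime

# Matches the start of an [app_msg_receive] entry as it appears in the excerpt above.
ENTRY = re.compile(
    r"(\d{2}-[A-Za-z]{3}-\d{4} \d{2}:\d{2}:\d{2}) \[[^\]]*\] "
    r"\[app_msg_receive\] got msg from slot (\d+):"
)


def last_message_per_slot(log_text):
    """Map each slot number to the time of its most recent status message."""
    latest = {}
    for stamp, slot in ENTRY.findall(log_text):
        # %b expects English month abbreviations such as "Jun" (default C locale).
        when = datetime.strptime(stamp, "%d-%b-%Y %H:%M:%S")
        slot = int(slot)
        if slot not in latest or when > latest[slot]:
            latest[slot] = when
    return latest


def silent_slots(log_text, threshold_seconds=60):
    """List slots whose last message trails the newest message by more than the threshold."""
    latest = last_message_per_slot(log_text)
    if not latest:
        return []
    newest = max(latest.values())
    return sorted(
        slot for slot, when in latest.items()
        if (newest - when).total_seconds() > threshold_seconds
    )
```

A slot that never reported within the text passed in will not show up at all, so the log should be read from far enough back to include the stalled task's final message.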
Joined: 11 Oct 06 Posts: 83 |
Well, I had another stall on the mini. The Seti app was running fine, the WCG was stalled. Which WCG project/application are you crunching when this happens? I reported a hang like this about half a year ago on the WCG forums. There was an old bug that caused the application to be shown as running even though it did not get any CPU. This problem was fixed by Eric in an old 5.8 or 5.10 BOINC API. But it looks like the bug came back. I also have this problem with the Astropulse application for MacOS... |
Joined: 19 May 08 Posts: 7 |
There was an old bug, which has the cause that shows the application as running but did not get any CPU. This problem was fixed by Eric in a old 5.8 or 5.10 BOINC API. But it looks like the bug came back. I have also this problem at the Astropulse application for MacOS... I run all of the projects; I think the one that hung was Fighting Aids. But the problem has also happened on Seti, so I think it's a Boinc problem, though I can't be certain. Both tasks (I have two cores) show as running, but one isn't clicking off any time in Boinc Manager. And if I look at Activity Monitor, the "hung" task is in a really tight loop, using no system time, and usually has a very small memory footprint. |
Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License,
Version 1.2 or any later version published by the Free Software Foundation.