6.12.26 Mac BOINC as Service Broken

Penguirl
Joined: 15 Jan 11
Posts: 24
United States
Message 38009 - Posted: 26 May 2011, 16:04:52 UTC

I run BOINC as a service on my PPC & Intel Macs using the "Make_BOINC_Service.sh" script. After I installed 6.12.26 on my Mac Pro running SL 10.6.7, the service became flaky. Sometimes it would be running when I logged in, sometimes it would not be running but would start after I logged in, and sometimes it wouldn't run at all and BOINCManager couldn't connect to boinc.

Reverting to 6.10.58 resolved the problem.
ID: 38009 · Report as offensive
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15477
Netherlands
Message 38010 - Posted: 26 May 2011, 16:12:55 UTC - in response to Message 38009.  

Can you at least test whether 6.12.28 works better?
http://boinc.berkeley.edu/download_all.php

I will forward your complaint to the developer, see what he says.
ID: 38010 · Report as offensive
Penguirl
Joined: 15 Jan 11
Posts: 24
United States
Message 38020 - Posted: 26 May 2011, 23:24:40 UTC

I'll post my findings when I've had time to test it for a while.
ID: 38020 · Report as offensive
CraniuMod

Joined: 25 May 11
Posts: 1
United States
Message 38023 - Posted: 27 May 2011, 16:30:04 UTC - in response to Message 38009.  

I have encountered the same problem on my MacBook Pro running 10.6.7. To add to the above comments: when the daemon quits, boinc_master proceeds to run out of control, using all the CPU and massively overheating my MBP. I have only one project running, as this is my daily machine. I went through detaching, uninstalling, removing Make_BOINC_Service and rerunning it, reinstalling, reattaching, etc., and then switched to 6.10.58, which runs fine and does not do this. I will test 6.12.28 when I can.
ID: 38023 · Report as offensive
Penguirl
Joined: 15 Jan 11
Posts: 24
United States
Message 38056 - Posted: 30 May 2011, 3:20:05 UTC

6.12.28 hung for a minute after installation, but then it connected to localhost and work commenced. Today, upon unlocking the screen, I found my CPUs idle; launching BOINCManager showed it was disconnected from localhost.
ID: 38056 · Report as offensive
Charlie Fenton
Project developer

Joined: 17 Jul 06
Posts: 287
United States
Message 38119 - Posted: 1 Jun 2011, 23:53:47 UTC - in response to Message 38056.  

> 6.12.28 hung for a minute after installation, but then it connected to localhost and work commenced. Today, upon unlocking the screen, I found my CPUs idle; launching BOINCManager showed it was disconnected from localhost.

I have not been able to reproduce any problems with running BOINC 6.12.28 as a service on my Mac Pro running OS 10.6.7. It sounds as if your BOINC Client might be crashing.

Please check for a crash report for BOINC that coincides with your BOINC Manager's disconnection from localhost. It should be either in /Library/Logs/CrashReporter/ or /Users/USERNAME/Library/Logs/CrashReporter/.

If you find it, please post the part up to, but not including, the section titled Binary Images.
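
For anyone who would rather script this check than browse the folders by hand, here is a minimal Python sketch. It assumes crash report file names contain the process name ("boinc") and that the section heading starts a line with "Binary Images"; the function names are made up for illustration.

```python
import os
from pathlib import Path

# The two directories named above; "~" stands in for /Users/USERNAME.
CRASH_DIRS = [
    Path("/Library/Logs/CrashReporter"),
    Path(os.path.expanduser("~/Library/Logs/CrashReporter")),
]


def find_crash_reports(directories, needle="boinc"):
    """Return files whose names contain `needle` (case-insensitive), newest first."""
    matches = []
    for directory in directories:
        directory = Path(directory)
        if not directory.is_dir():
            continue  # a missing directory simply means no reports there
        for entry in directory.iterdir():
            if entry.is_file() and needle.lower() in entry.name.lower():
                matches.append(entry)
    return sorted(matches, key=lambda p: p.stat().st_mtime, reverse=True)


def report_header(path, stop_marker="Binary Images"):
    """Return the report text up to, but not including, the Binary Images section."""
    kept = []
    with open(path, encoding="utf-8", errors="replace") as handle:
        for line in handle:
            if line.lstrip().startswith(stop_marker):
                break
            kept.append(line)
    return "".join(kept)
```

Calling `find_crash_reports(CRASH_DIRS)` and then `report_header()` on the newest match should give the part of the report that is worth posting.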

Thank you.
Charlie Fenton
BOINC / SETI@home Macintosh & Windows Programmer
ID: 38119 · Report as offensive
Penguirl
Joined: 15 Jan 11
Posts: 24
United States
Message 38181 - Posted: 5 Jun 2011, 20:28:15 UTC

I have no crash logs related to boinc in either of those locations, or in Console. I haven't pruned my crash logs since these incidents so it would seem that no crash entries were written.
ID: 38181 · Report as offensive
Charlie Fenton
Project developer

Joined: 17 Jul 06
Posts: 287
United States
Message 38223 - Posted: 7 Jun 2011, 0:58:03 UTC - in response to Message 38181.  

> I have no crash logs related to boinc in either of those locations, or in Console. I haven't pruned my crash logs since these incidents so it would seem that no crash entries were written.


OK, I am puzzled. Loss of connection between the BOINC Manager and the BOINC Client is usually caused by the Client no longer running. If I might ask you to check a couple more things:

The next time this happens, please run the /Applications/Utilities/Activity Monitor. The Manager will appear as BOINC (upper case) with its icon and the Client as boinc (lower case) with no icon. If it is running as a service, the Client will have a small process ID (PID). If the Client was launched by the Manager, the Client will have a larger PID than the Manager and its parent will be the Manager (to see the parent, select the Client in the list and click the Inspect icon at the top of the Activity Monitor window.)
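
The same parent/child check can be sketched from the command line with `ps` instead of Activity Monitor. This is only an illustration: it assumes the Manager's executable is named `BOINCManager` and the Client's is named `boinc`, and that a Client started as a service has launchd (PID 1) as its parent. The function names are hypothetical.

```python
import subprocess


def parse_ps(text):
    """Parse `ps -axo pid,ppid,comm` output into (pid, ppid, command) tuples."""
    rows = []
    for line in text.splitlines()[1:]:  # skip the header row
        parts = line.split(None, 2)
        if len(parts) == 3 and parts[0].isdigit() and parts[1].isdigit():
            rows.append((int(parts[0]), int(parts[1]), parts[2].strip()))
    return rows


def describe_client(rows, client_name="boinc", manager_name="BOINCManager"):
    """Say how the Client appears to have been started, based on its parent."""
    def basename(command):
        return command.rsplit("/", 1)[-1]

    managers = {pid for pid, _, cmd in rows if basename(cmd) == manager_name}
    clients = [(pid, ppid) for pid, ppid, cmd in rows if basename(cmd) == client_name]
    if not clients:
        return "client not running"
    pid, ppid = clients[0]
    if ppid in managers:
        return f"client {pid} was launched by the Manager ({ppid})"
    if ppid == 1:
        return f"client {pid} was launched by launchd (service)"
    return f"client {pid} has parent {ppid}"


def snapshot():
    """Capture the live process table (needs a BSD-style ps, as on the Mac)."""
    return subprocess.run(
        ["ps", "-axo", "pid,ppid,comm"], capture_output=True, text=True, check=True
    ).stdout
```

Running `describe_client(parse_ps(snapshot()))` at the moment the Manager loses its connection would show whether a Client process still exists and who started it.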

There may also be useful information at the end of the stdoutdae.txt and stderrdae.txt files in the /Library/Application Support/BOINC Data/ directory.
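
A short Python sketch for pulling the last lines of both files, assuming the default data directory given above (the helper names are made up for illustration):

```python
from collections import deque
from pathlib import Path

DATA_DIR = Path("/Library/Application Support/BOINC Data")  # directory named above


def tail(path, count=20):
    """Return the last `count` lines of a text file, or [] if it cannot be read."""
    try:
        with open(path, encoding="utf-8", errors="replace") as handle:
            # deque with maxlen keeps only the most recent `count` lines.
            return [line.rstrip("\n") for line in deque(handle, maxlen=count)]
    except OSError:
        return []


def tail_client_logs(data_dir=DATA_DIR, count=20):
    """Map each client log name to its last `count` lines."""
    return {
        name: tail(Path(data_dir) / name, count)
        for name in ("stdoutdae.txt", "stderrdae.txt")
    }
```

Reading these files may require administrator rights, depending on how the data directory's permissions were set up.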

Finally, please confirm that you are using the current version of the Make_BOINC_Service.sh script. The text of the script should include the line "revised 1/6/08 to use launchd".

Thanks for helping solve this puzzle.

Cheers,
--Charlie

Charlie Fenton
BOINC / SETI@home Macintosh & Windows Programmer
ID: 38223 · Report as offensive
NullCoding*
Joined: 10 Jan 11
Posts: 58
United States
Message 38236 - Posted: 7 Jun 2011, 14:53:52 UTC

I have the same problem as Penguirl but I do not run BOINC as a service. I simply launch the manager from the Dock. It runs under my account, which is fine since it is the only account on this computer and it runs 24/7.

This problem never seems to arise if I am actively using the machine, which comprises a good chunk of the day. However, if I leave it overnight, I find in the morning that attempting to click anything results in an infinite "Communicating with BOINC Manager..." message and I need to restart BOINC entirely.

Oddly it would appear my tasks continue to run even when I cannot click anything in the Manager.

Sorry if this is throwing a wrench in the mix. Next time this problem crops up I will go into Activity Monitor and also the Console, as the current apps I am running use Java (Constellation's TrackJack). Beyond that...bit lost.
ID: 38236 · Report as offensive
Penguirl
Joined: 15 Jan 11
Posts: 24
United States
Message 38378 - Posted: 13 Jun 2011, 3:49:37 UTC
Last modified: 13 Jun 2011, 3:54:30 UTC

The only BOINC-related process running at the latest event (all CPUs are idle) is BOINCMenubar 2; no other BOINC processes are running, with Activity Monitor.app set to show all processes. BOINCMenubar 2 says "This Computer (localhost) Host not connected."

stderrdae.txt has a LOT of "shmat: Too many open files" at the beginning of the document, followed by a bunch of "md5_file: can't open projects/szdg.lpds.sztaki.hu_szdg/caa03215-1b38-489c-abd4-252c594ad6ff_f1075c82-cfb2-4ba5-bb7a-0b69cbe700c4_565740_1_1" in the middle, then "GetMACAddress returned 0x00000005" appears twice, and then a LOT of "shmat:Too many open files" again.
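
When a log repeats the same few messages hundreds of times, a tally is easier to read than the raw file. Below is a minimal sketch that groups lines like the ones quoted above; the normalisation rules are assumptions based only on those samples. For context, "Too many open files" is the standard system text for the EMFILE error, which `shmat` reports when a process would exceed its limit of attached shared memory segments.

```python
import re
from collections import Counter


def summarize_messages(lines):
    """Count log messages after collapsing details that vary from line to line."""
    counts = Counter()
    for line in lines:
        text = line.strip()
        if not text:
            continue
        # Normalise "shmat:Too many" and "shmat: Too many" to one spelling.
        text = re.sub(r"^(\w+):\s*", r"\1: ", text)
        # Collapse the per-file path so all md5_file failures group together.
        text = re.sub(r"(can't open) \S+", r"\1 <path>", text)
        counts[text] += 1
    return counts.most_common()
```

Feeding it the lines of stderrdae.txt gives a list of (message, count) pairs, most frequent first.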

stdoutdae.txt shows normal looking start, resuming, uploading, etc… of workunits but ends with "12-Jun-2011 19:49:16 [yoyo@home] Computation for task ogr_110610103054_79_0 finished
12-Jun-2011 19:49:16 [yoyo@home] Resuming task ecm_es_1307868011_2_1232P.C271_1455_0 using ecm version 1
12-Jun-2011 19:49:18 [yoyo@home] Started upload of ogr_110610103054_79_0_0
12-Jun-2011 19:49:18 [yoyo@home] Started upload of ogr_110610103054_79_0_1
12-Jun-2011 19:49:18 [---] Can't open client_state_next.xml: fopen() failed
12-Jun-2011 19:49:18 [---] Couldn't write state file: fopen() failed; giving up"

I am using Make_BOINC_Service.sh dated 01/06/08 to use launchd

BOINCManager says at launch "BOINC Manager - Daemon Start Failed
BOINCManager is not able to start a BOINC client. Please start the daemon and try again."

Then the CPUs jumped to 100%, and relaunching BOINCManager worked normally.
ID: 38378 · Report as offensive
Charlie Fenton
Project developer

Joined: 17 Jul 06
Posts: 287
United States
Message 38383 - Posted: 13 Jun 2011, 7:57:08 UTC - in response to Message 38378.  
Last modified: 13 Jun 2011, 8:27:17 UTC

> stderrdae.txt has a LOT of "shmat: Too many open files" at the beginning of the document, followed by a bunch of "md5_file: can't open projects/szdg.lpds.sztaki.hu_szdg/caa03215-1b38-489c-abd4-252c594ad6ff_f1075c82-cfb2-4ba5-bb7a-0b69cbe700c4_565740_1_1" in the middle, then "GetMACAddress returned 0x00000005" appears twice, and then a LOT of "shmat:Too many open files" again.


It sounds like this has nothing to do with running BOINC as a service. Have you tried removing the file /Library/LaunchDaemons/edu.berkeley.boinc.plist and restarting the computer to run for a while not as a service?
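
Before removing the file, it may be worth recording what the service definition contains so it can be compared or recreated later. This sketch reads the plist with Python's `plistlib`; `Label` and `ProgramArguments` are standard launchd keys, but whether this particular plist uses them is an assumption.

```python
import plistlib
from pathlib import Path

PLIST = Path("/Library/LaunchDaemons/edu.berkeley.boinc.plist")  # path from the post


def describe_launch_daemon(path=PLIST):
    """Return the job's label and command line, or None if the plist is absent."""
    path = Path(path)
    if not path.is_file():
        return None  # no such service definition is installed
    with open(path, "rb") as handle:
        job = plistlib.load(handle)
    return {
        "Label": job.get("Label"),
        "ProgramArguments": job.get("ProgramArguments", []),
    }
```

A result of `None` means the file is not there, so the Client is not being started by this launchd job.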

The messages you report indicate that you may be running out of shared memory segments. Please see this post for an explanation and a possible workaround.

Please let me know if this helps.

By the way, because of the small number of shared memory segments available on the Mac and some other UNIX / Linux systems, BOINC moved away from using shmget and shmat almost 4 years ago in favor of memory-mapped files, and only supports the older method for backward compatibility with legacy project applications. All BOINC projects should have upgraded years ago. Perhaps some of the projects you are running may be using very old code.
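
To illustrate the general idea of a memory-mapped file as a shared region, here is a small Python sketch. It is not BOINC's actual code: the record layout (a single double) and the function names are invented for the example. The point is that the region is an ordinary file, so it does not consume one of the scarce System V segments.

```python
import mmap
import struct

RECORD = struct.Struct("<d")  # one little-endian double, e.g. a progress fraction


def create_region(path, size=mmap.PAGESIZE):
    """Create a zero-filled backing file of `size` bytes."""
    with open(path, "wb") as handle:
        handle.truncate(size)


def write_value(path, value):
    """Map the file read-write and store one value at offset 0."""
    with open(path, "r+b") as handle:
        with mmap.mmap(handle.fileno(), 0) as region:
            RECORD.pack_into(region, 0, value)
            region.flush()


def read_value(path):
    """Map the same file read-only, as a second process would, and load the value."""
    with open(path, "rb") as handle:
        with mmap.mmap(handle.fileno(), 0, access=mmap.ACCESS_READ) as region:
            return RECORD.unpack_from(region, 0)[0]
```

One process would call `write_value()` and another `read_value()` on the same path; each mapping is limited by file descriptors and address space rather than by the `kern.sysv.*` settings.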

Cheers,
--Charlie
Charlie Fenton
BOINC / SETI@home Macintosh & Windows Programmer
ID: 38383 · Report as offensive
Charlie Fenton
Project developer

Joined: 17 Jul 06
Posts: 287
United States
Message 38405 - Posted: 14 Jun 2011, 0:22:02 UTC - in response to Message 38383.  

> By the way, because of the small number of shared memory segments available on the Mac and some other UNIX / Linux systems, BOINC moved away from using shmget and shmat almost 4 years ago in favor of memory-mapped files, and only supports the older method for backward compatibility with legacy project applications. All BOINC projects should have upgraded years ago. Perhaps some of the projects you are running may be using very old code.

You can tell whether a given project application uses the old shared memory logic by checking your /Library/Application Support/BOINC Data/client_state.xml file. Each application version entry has an <api_version> element giving the API version used. If the API version is less than 6.0.0, then it is using the old, obsolete logic. Please let me know which project applications, if any, are doing this. We will then contact those projects and let them know they should upgrade.
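
This check can be scripted. The sketch below assumes that client_state.xml parses as XML and that each `<app_version>` block carries `<app_name>` and `<api_version>` children; that layout is an assumption for the example, and entries with no `<api_version>` are skipped rather than judged either way.

```python
import xml.etree.ElementTree as ET


def parse_version(text):
    """Turn '6.3.22' into (6, 3, 22); missing or non-numeric parts count as 0."""
    parts = []
    for piece in text.strip().split("."):
        parts.append(int(piece) if piece.isdigit() else 0)
    while len(parts) < 3:
        parts.append(0)  # so that "6" compares equal to "6.0.0"
    return tuple(parts)


def legacy_apps(state_path, minimum=(6, 0, 0)):
    """List (app_name, api_version) pairs whose API version is below `minimum`."""
    found = []
    root = ET.parse(state_path).getroot()
    for app_version in root.iter("app_version"):
        api = app_version.findtext("api_version")
        if api is None:
            continue  # no API version recorded for this entry
        if parse_version(api) < minimum:
            name = app_version.findtext("app_name", default="(unknown)")
            found.append((name, api.strip()))
    return found
```

An empty list from `legacy_apps("/Library/Application Support/BOINC Data/client_state.xml")` would suggest that none of the attached applications report a pre-6.0.0 API.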

If you prefer, send me a private message and I'll respond with an email address where you can send the file and we will examine it.

Thanks.

Cheers,
--Charlie

Charlie Fenton
BOINC / SETI@home Macintosh & Windows Programmer
ID: 38405 · Report as offensive
Penguirl
Joined: 15 Jan 11
Posts: 24
United States
Message 38410 - Posted: 14 Jun 2011, 5:01:00 UTC
Last modified: 14 Jun 2011, 5:01:31 UTC

I have configured shared memory in the past on my PPC machines; does this still apply to a SL Xeon? It seems odd that this wasn't an issue with 6.10.x; it only started with the update to 6.12.x.

All of the <api_version> values are at or above 6.

I have not yet tested without the launch daemon but I will let you know as soon as I do.
ID: 38410 · Report as offensive
Charlie Fenton
Project developer

Joined: 17 Jul 06
Posts: 287
United States
Message 38414 - Posted: 14 Jun 2011, 8:02:57 UTC - in response to Message 38410.  

> I have configured shared memory in the past on my PPC machines; does this still apply to a SL Xeon?

My experience is that it applies even more because, to the best of my knowledge, Apple has not increased the shared memory segment limit even though the Mac Pro has more cores and hence can run more processes.

> It seems odd that this wasn't an issue with 6.10.x; it only started with the update to 6.12.x.

I agree, but that seems to be what the error messages indicate. I'm afraid we have little choice but to find the problem by trial and error.

> All of the <api_version> values are at or above 6.

Well, that does seem to reduce the likelihood that I'm on the right track, but you never know ....

> I have not yet tested without the launch daemon but I will let you know as soon as I do.

Thanks.

Cheers,
--Charlie

Charlie Fenton
BOINC / SETI@home Macintosh & Windows Programmer
ID: 38414 · Report as offensive
JeromeC

Joined: 13 Oct 10
Posts: 115
France
Message 38427 - Posted: 14 Jun 2011, 13:10:24 UTC

Hi,

For your information, I was forced to downgrade to 6.10.58 after running 6.12.26 on my iMac (latest version of OS X) for some days. In the morning I would regularly find the CPU completely idle, not running any project, or possibly with lots of file errors (I can't remember which error) on ALL the projects at the same time (I run many different projects). It was then quite difficult to restart boinc: it behaved erratically, with boinc no longer being run by boinc_master but by my user, BM not being able to connect to boinc again, and/or two different boinc instances running at the same time...

Putting back 6.10.58 did fix everything, so it cannot be related to some specific project using the shared memory stuff as mentioned above.

I'm running boinc as a service, but I don't know whether the issue was related to this. I had difficulty reinstalling 6.10.58 (only one project at a time was running on my i7, even with the multi-CPU option properly set to 100%). I had to use the uninstall script, then find and delete everything boinc-related, before reinstalling 6.10.58; only then would it work again with 8 projects in parallel.

I also had to set it up as a service again.
ID: 38427 · Report as offensive
Charlie Fenton
Project developer

Joined: 17 Jul 06
Posts: 287
United States
Message 38446 - Posted: 14 Jun 2011, 22:09:36 UTC

Hi all,

Thank you all for your input. We have found a number of serious bugs in 6.12.26. If you are feeling a bit adventurous, you might want to try BOINC version 6.12.33, which is currently in testing.

You can get BOINC 6.12.33 here.

Cheers,
--Charlie
Charlie Fenton
BOINC / SETI@home Macintosh & Windows Programmer
ID: 38446 · Report as offensive
NullCoding*
Joined: 10 Jan 11
Posts: 58
United States
Message 38452 - Posted: 15 Jun 2011, 1:02:31 UTC
Last modified: 15 Jun 2011, 1:03:03 UTC

Thanks Charlie (and others who posted here) for letting me know it's not something I did wrong whilst installing BOINC!

I have downloaded the 6.12.33 for Mac (I run the latest 64-bit Snow Leopard) and will see how it fares. The 6.12.26 never lasted through the night - would always disconnect from the client, similar problems to those stated above.

Might I recommend you remove 6.12.26 entirely from the download list, if it is in fact full of serious bugs. Or at least make it something other than the "recommended version" ;)

Gonna test that now.


ID: 38452 · Report as offensive
Penguirl
Joined: 15 Jan 11
Posts: 24
United States
Message 38455 - Posted: 15 Jun 2011, 3:30:21 UTC

Turns out I already have /etc/sysctl.conf with the settings:

kern.sysv.shmmax=16777216
kern.sysv.shmmin=1
kern.sysv.shmmni=128
kern.sysv.shmseg=32
kern.sysv.shmall=4096

It must have carried over as I migrated machines.
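
As a rough sanity check on settings like these, the sketch below parses the file and compares the limits against each other. It assumes `kern.sysv.shmall` is counted in 4096-byte pages and `kern.sysv.shmmax` in bytes, which makes the values above consistent (4096 * 4096 = 16777216). The function names are made up for illustration.

```python
PAGE_SIZE = 4096  # bytes; assumed page size used to interpret kern.sysv.shmall


def parse_sysctl(text):
    """Parse `key=value` lines into a dict of ints, ignoring blanks and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if value.isdigit():
            settings[key] = int(value)
    return settings


def check_shm(settings, page_size=PAGE_SIZE):
    """Return human-readable warnings about inconsistent shared memory limits."""
    warnings = []
    shmmax = settings.get("kern.sysv.shmmax")
    shmall = settings.get("kern.sysv.shmall")
    if shmmax is not None and shmall is not None and shmmax > shmall * page_size:
        warnings.append("shmmax exceeds shmall * page size")
    shmseg = settings.get("kern.sysv.shmseg")
    shmmni = settings.get("kern.sysv.shmmni")
    if shmseg is not None and shmmni is not None and shmseg > shmmni:
        warnings.append("shmseg (per process) exceeds shmmni (system-wide)")
    return warnings
```

With the five values listed above, `check_shm()` returns no warnings. Note that `kern.sysv.shmseg=32` caps the number of segments a single process may attach, which is the limit behind the `shmat` errors discussed earlier in the thread.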
ID: 38455 · Report as offensive
NullCoding*
Joined: 10 Jan 11
Posts: 58
United States
Message 38462 - Posted: 15 Jun 2011, 12:17:13 UTC

I don't have that file...a bunch of other kernel configurations but not that one in particular. Interesting.

At any rate the 6.12.33 client made it through the night alright and shows no signs of serious bugs. Good!
ID: 38462 · Report as offensive
Penguirl
Joined: 15 Jan 11
Posts: 24
United States
Message 38553 - Posted: 18 Jun 2011, 21:34:48 UTC

Currently I am running BOINC 6.12.33 as an app. 5 of the CPUs are running at a low percentage, just slightly above the system requirements, and 3 are at 0%. BOINC is connected to localhost, but the WUs that are "running" are not accumulating any elapsed time. Quitting BOINCManager brought the 5 active CPUs down to nearly 0%.
ID: 38553 · Report as offensive

Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.