OS X Yosemite - Communicating with BOINC client. Please wait…

wrooney

Joined: 10 Dec 14
Posts: 6
United States
Message 58590 - Posted: 10 Dec 2014, 3:46:58 UTC
Last modified: 10 Dec 2014, 3:49:22 UTC

After upgrading to OS X Yosemite I started getting the error message "Communicating with BOINC client. Please wait…".

I tried an uninstall of BOINC using "Uninstall BOINC" in the extras folder and then upgrading to BOINC Manager v7.4.26. It seems to successfully download new tasks, but then this error message comes up, so I assume it is an error trying to upload the results. The message comes up and I can't get out of it. If I hit "Cancel" it comes back in 1-2 seconds. If I Quit, then it does of course stop, but so does BOINC. If I leave the message up there overnight, it's still there in the morning.

I've reinstalled a couple of times. Each time it seems to successfully download new tasks, then go to work on them, but then the next day I get the message again.

On my most recent reinstall I thought the problem was resolved: when I checked my statistics, they showed results returned over the last few days. However, the message has started appearing again.

I'm currently running:
OS X Yosemite
Version 10.10.1
Intel Core i3 - 64 bit
BOINC Manager v7.4.26

Graphics:
Chipset Model: ATI Radeon HD 5670
Type: GPU
Bus: PCIe

Thanks
ID: 58590
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15483
Netherlands
Message 58598 - Posted: 10 Dec 2014, 18:05:31 UTC - in response to Message 58590.  

The BOINC program consists of two main parts, called the BOINC client and BOINC Manager. The BOINC client does all the heavy lifting, such as scheduling what work to run, caching work, and keeping track of which projects you added. The BOINC Manager is a graphical user interface that allows you to easily command and control the BOINC client.

The BOINC client and the BOINC Manager need to talk to each other in order for you to be able to see tasks progressing and to control the client. Technically, this means that the two programs talk to each other through remote procedure calls over the loopback address (127.0.0.1) via TCP port 31416. Or, for the not-so-technically minded, this means that you need to allow both the boinc and boincmgr (or boinc-manager) binaries through your firewall.

The BOINC client can perfectly happily run without the BOINC Manager, thus it will schedule project programs to run project tasks, do the uploading and reporting, and request new work, without requiring BOINC Manager to run. This is why you see at the project that tasks ran, uploaded and reported.

The "Communicating with BOINC client" message is not an error message. It is informative, telling you that something is blocking the communication between the client and the manager. That may be a firewall or an anti-virus product, or just another program that is using TCP port 31416.
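
As a quick way to test the point about port 31416, here is a minimal Python sketch that tries to open a TCP connection to 127.0.0.1:31416. The function name `port_accepts_connections` and the timeout value are my own choices for illustration, not part of BOINC. A successful connection only shows that something is listening on that port; it does not prove that the listener is the BOINC client or that it is answering requests.

```python
import socket

# Loopback address and port mentioned in the post above.
BOINC_GUI_RPC_HOST = "127.0.0.1"
BOINC_GUI_RPC_PORT = 31416


def port_accepts_connections(host=BOINC_GUI_RPC_HOST,
                             port=BOINC_GUI_RPC_PORT,
                             timeout=2.0):
    """Return True if a TCP connection to host:port can be opened.

    This only tells you that some process is listening; it says
    nothing about whether that process reads what is sent to it.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False
```

Calling `port_accepts_connections()` with no arguments checks the default port. If it returns False while the client is supposed to be running, nothing is listening there at all, which is a different situation from a client that is listening but not responding.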
ID: 58598
wrooney
Joined: 10 Dec 14
Posts: 6
United States
Message 58680 - Posted: 14 Dec 2014, 22:00:14 UTC - in response to Message 58598.  

You say "something is blocking the communication between the client and the manager". So this is Mac OS X and I'm not aware of any antivirus software in Mac OS X. If there is, please let me know how to access it. But I don't have an add-on program such as Norton or McAfee.

Now as for a firewall, the way you describe the communication between the client and the manager makes me believe you are referring to a client and manager that are both resident on my computer, talking over IP address 127.0.0.1, as opposed to the client/server relationship that exists between my computer and the one at Berkeley.

If that is correct, then I assume that any firewall I have in my cable/DSL modem or in my router has no impact on this communication. If that is not the case, let me know and I'll check those settings.

On my Mac, there is a built-in firewall under "System Preferences", "Security & Privacy", "Firewall"; however, when I posted this question the firewall setting was off. So I assume that is not interfering with the communication.

I did turn the firewall on, and then selected "Allow incoming connections" for "BOINCManager", thinking maybe even with the firewall off there is some default policy that is getting in the way. This appeared to work for a day or two, but I'm now back to getting the original message.

Finally, if there is another program using TCP port 31416, then I don't know how to go about problem determination for this on a Mac. Is there a way I can display ports in use or trace IP connections?

Thanks,
Bill
ID: 58680
Gary Charpentier
Joined: 23 Feb 08
Posts: 2465
United States
Message 58683 - Posted: 15 Dec 2014, 5:30:00 UTC

Interjecting: on my Mac, the port presently in use for the Boinc Master communication is 49161. You can find this easily with Activity Monitor. Select the Boinc Master process from the list, click Inspect, and select the tab for open files and ports. In my case it is the last item in the list:
    localhost:49161->localhost:xqosd

To find what has it open, open Terminal and type "netstat", then search the output for the port number.
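
The search step can be sketched in Python, assuming you have saved the netstat output as text. macOS netstat prints addresses as host.port and substitutes service names from /etc/services, which is why port 31416 appears as "xqosd". The function name `find_port_lines` and the sample rows below are made up for illustration; only the column layout follows what netstat prints.

```python
def find_port_lines(netstat_text, port_or_service):
    """Return the TCP rows whose local or foreign address ends in
    the given port number or service name (e.g. 31416 or "xqosd")."""
    suffix = "." + str(port_or_service)
    matches = []
    for line in netstat_text.splitlines():
        fields = line.split()
        # Expected columns: proto, recv-q, send-q, local, foreign, state
        if len(fields) >= 5 and fields[0].startswith("tcp"):
            if fields[3].endswith(suffix) or fields[4].endswith(suffix):
                matches.append(line.strip())
    return matches


# Hypothetical sample in the macOS netstat layout.
SAMPLE = """\
Proto Recv-Q Send-Q  Local Address          Foreign Address        (state)
tcp4       0      0  localhost.49161        localhost.xqosd        ESTABLISHED
tcp4       0      0  localhost.xqosd        localhost.49161        ESTABLISHED
tcp4       0      0  localhost.631          *.*                    LISTEN
"""

for row in find_port_lines(SAMPLE, "xqosd"):
    print(row)
```

Running `netstat -an` instead prints numeric ports, in which case you would search for 31416 rather than "xqosd".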

Unfortunately, or perhaps fortunately, due to security and sandbox you can not inspect open files and ports on programs of other users. boinc and the science programs do not run under your user name. You will need to use Unix skills in terminal to do this.

I recently did see an instance on my Mac with that error. Some probing found that the projects using the Virtual Machine had started several virtual machines, all of which were rather furiously sending messages back and forth. Obviously boinc has to talk to each VM, but then the VM has to relay the message to the science app and get a reply, which is passed back to boinc and finally on to Boinc Master. The VMs also pass messages within themselves. I suspected that there was so much message traffic that perhaps the memory allocated by Mac OS for messages was being filled past capacity. My solution was to limit each of the processes using a VM to a single job at a time. Since I did this I have not had any more issues. Obviously, if you aren't running VM projects, this isn't your issue. YMMV


ID: 58683
wrooney
Joined: 10 Dec 14
Posts: 6
United States
Message 58749 - Posted: 19 Dec 2014, 3:32:02 UTC - in response to Message 58683.  

Okay, so I did a netstat and here are my results after BOINC hung again:

Proto Recv-Q Send-Q  Local Address          Foreign Address        (state)    
tcp4      59      0  localhost.xqosd        localhost.58358        CLOSE_WAIT 
tcp4       0      0  localhost.58358        localhost.xqosd        FIN_WAIT_2 


Does this mean that the BOINC client is not pulling messages off the receive queue??
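
For reference, Recv-Q in netstat is the number of bytes the kernel has received on that socket that the owning process has not yet read. A rough Python sketch of how the rows above can be checked for this follows; the names `TcpRow`, `parse_netstat`, and `rows_with_unread_data` are hypothetical helpers, and the sample is the output pasted above.

```python
from typing import List, NamedTuple


class TcpRow(NamedTuple):
    recv_q: int   # bytes received by the kernel, not yet read by the process
    send_q: int   # bytes sent by the process, not yet acknowledged by the peer
    local: str
    foreign: str
    state: str


def parse_netstat(text: str) -> List[TcpRow]:
    """Parse TCP rows of macOS-style netstat output, skipping the header."""
    rows = []
    for line in text.splitlines():
        f = line.split()
        if (len(f) == 6 and f[0].startswith("tcp")
                and f[1].isdigit() and f[2].isdigit()):
            rows.append(TcpRow(int(f[1]), int(f[2]), f[3], f[4], f[5]))
    return rows


def rows_with_unread_data(rows: List[TcpRow]) -> List[TcpRow]:
    """Rows where the local owner has left received bytes unread."""
    return [r for r in rows if r.recv_q > 0]


SAMPLE = """\
Proto Recv-Q Send-Q  Local Address          Foreign Address        (state)
tcp4      59      0  localhost.xqosd        localhost.58358        CLOSE_WAIT
tcp4       0      0  localhost.58358        localhost.xqosd        FIN_WAIT_2
"""

for row in rows_with_unread_data(parse_netstat(SAMPLE)):
    print(row.local, "has", row.recv_q, "unread bytes, state", row.state)
```

On the sample this flags the socket whose local end is localhost.xqosd, which is the port 31416 side. That is consistent with the process holding that port not reading its input, though the output alone does not say why.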

- Bill
ID: 58749
wrooney
Joined: 10 Dec 14
Posts: 6
United States
Message 58754 - Posted: 19 Dec 2014, 21:07:49 UTC - in response to Message 58749.  

So here is the output of netstat after running BOINC with World Community Grid for about 24 hours, and after I start getting the "Communicating with BOINC Client" message:

Proto Recv-Q Send-Q  Local Address          Foreign Address        (state)    
tcp4      59      0  localhost.xqosd        localhost.52648        ESTABLISHED
tcp4       0      0  localhost.52648        localhost.xqosd        ESTABLISHED


Then if I "Quit BOINC Manager", I get:

Proto Recv-Q Send-Q  Local Address          Foreign Address        (state)    
tcp4      59      0  localhost.xqosd        localhost.52648        CLOSE_WAIT 
tcp4       0      0  localhost.52648        localhost.xqosd        FIN_WAIT_2


Now if I restart BOINC Manager, I still get the "Communicating with BOINC Client" message and netstat now shows:

Proto Recv-Q Send-Q  Local Address          Foreign Address        (state)    
tcp4      59      0  localhost.xqosd        localhost.53495        ESTABLISHED
tcp4       0      0  localhost.53495        localhost.xqosd        ESTABLISHED
tcp4      59      0  localhost.xqosd        localhost.52648        CLOSE_WAIT 
tcp4       0      0  localhost.52648        localhost.xqosd        FIN_WAIT_2 


Now it is interesting that if I restart Mac OS X, these numbers will go back to zero and I will start uploading results again. Since hanging yesterday, rebooting and then hanging again today I returned 3 results.

So I assume after I reboot, I'll again start running normally - for a while.

I see no other users of ports 52648 and 53495.
ID: 58754
Gary Charpentier
Joined: 23 Feb 08
Posts: 2465
United States
Message 58775 - Posted: 20 Dec 2014, 18:20:59 UTC - in response to Message 58754.  

So here is the output of netstat after running BOINC with World Community Grid for about 24 hours, and after I start getting the "Communicating with BOINC Client" message:

Proto Recv-Q Send-Q  Local Address          Foreign Address        (state)    
tcp4      59      0  localhost.xqosd        localhost.52648        ESTABLISHED
tcp4       0      0  localhost.52648        localhost.xqosd        ESTABLISHED

Normal

Then if I "Quit BOINC Manager", I get:

Proto Recv-Q Send-Q  Local Address          Foreign Address        (state)    
tcp4      59      0  localhost.xqosd        localhost.52648        CLOSE_WAIT 
tcp4       0      0  localhost.52648        localhost.xqosd        FIN_WAIT_2

When you quit, did you ask it to quit the science? If so, it did not. (There would be no entries if it did)
Oh, forgot, on Mac it is impossible not to quit the science if you quit the manager .... these entries should not exist at this point.

Now if I restart BOINC Manager, I still get the "Communicating with BOINC Client" message and netstat now shows:

Proto Recv-Q Send-Q  Local Address          Foreign Address        (state)    
tcp4      59      0  localhost.xqosd        localhost.53495        ESTABLISHED
tcp4       0      0  localhost.53495        localhost.xqosd        ESTABLISHED
tcp4      59      0  localhost.xqosd        localhost.52648        CLOSE_WAIT 
tcp4       0      0  localhost.52648        localhost.xqosd        FIN_WAIT_2 

See the top pair: both are established, so they are communicating. In the bottom pair, one of the processes is not responding to a signal and doing the cleanup to close the pair of ports. Since the O/S closes all of a process's files and ports when it quits, it can't be the manager. So for some reason the daemon has hung. Possibly it is trying to get a science app to quit and it isn't responding?
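
The reasoning about the bottom pair can be sketched in Python. In TCP, CLOSE_WAIT means the peer has closed its end and the local application has not yet closed its own socket, while FIN_WAIT_2 means the local end has closed and is waiting for the peer to do the same. The helper name `unclosed_endpoints` is hypothetical, and the sample is the four-row output quoted above.

```python
def unclosed_endpoints(netstat_text):
    """Return the local addresses of sockets in CLOSE_WAIT.

    Each one belongs to a process that has been told the peer is
    gone but has not closed its own end of the connection yet.
    """
    result = []
    for line in netstat_text.splitlines():
        fields = line.split()
        if (len(fields) == 6 and fields[0].startswith("tcp")
                and fields[5] == "CLOSE_WAIT"):
            result.append(fields[3])  # column 4 is the local address
    return result


SAMPLE = """\
Proto Recv-Q Send-Q  Local Address          Foreign Address        (state)
tcp4      59      0  localhost.xqosd        localhost.53495        ESTABLISHED
tcp4       0      0  localhost.53495        localhost.xqosd        ESTABLISHED
tcp4      59      0  localhost.xqosd        localhost.52648        CLOSE_WAIT
tcp4       0      0  localhost.52648        localhost.xqosd        FIN_WAIT_2
"""

print(unclosed_endpoints(SAMPLE))
```

The only CLOSE_WAIT socket has localhost.xqosd as its local address, so the end that has not closed is the one on port 31416, not the ephemeral port used by the manager that quit.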

Now it is interesting that if I restart Mac OS X, these numbers will go back to zero and I will start uploading results again. Since hanging yesterday, rebooting and then hanging again today I returned 3 results.

So I assume after I reboot, I'll again start running normally - for a while.

I see no other users of ports 52648 and 53495.

From a happy system
 $ uptime
 9:48  up 3 days,  9:42, 2 users, load averages: 9.18 9.26 9.30

In process land
$ ps -Al
  UID   PID  PPID        F CPU PRI NI       SZ    RSS WCHAN     S             ADDR TTY           TIME CMD
    0     1     0 80004004   0  31  0  2508776   2220 -      Ss                  0 ??         5:55.25 /sbin/launchd
...
  501   364   264     4000   0  48  0   818996  32436 -      S                   0 ??         7:07.73 /Applications/BOINCManager.app/Contents/MacOS/BOINCManager -psn_0_98328
...
  502   408   364     4100   0  33  0   683632  20216 -      S                   0 ??         5:25.88 /Applications/BOINCManager.app/Contents/Resources/boinc --redirectio --launched_by_manager
...
  503 11069   408     4000   0  14 19   716416  53680 -      RN                  0 ??       189:44.20 setiathome_7.00_i686-apple-darwin
...

Your UIDs may be different, but they are you, boinc_master and boinc_project, in numeric form.
The PIDs will be different, but the PPID of the "--launched_by_manager" process should be the PID of the other one.
Your science apps will have the PPID of the "--launched_by_manager"
I included launchd with PID of 1, because if any of the BOINC items have it as the PPID, then they are "orphaned".
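
The parent/child checks above can be sketched in Python. This assumes the process list is captured as three columns (PID, PPID, command), for example with `ps -axo pid,ppid,command`, rather than the full `ps -Al` layout. The function names `parse_ps` and `describe_client` are hypothetical, and the sample rows are reduced from the listing above.

```python
def parse_ps(text):
    """Parse 'PID PPID COMMAND' rows into (pid, ppid, command) tuples.

    Header lines and anything else that does not start with two
    integers are skipped.
    """
    procs = []
    for line in text.splitlines():
        parts = line.split(None, 2)
        if len(parts) == 3 and parts[0].isdigit() and parts[1].isdigit():
            procs.append((int(parts[0]), int(parts[1]), parts[2]))
    return procs


def describe_client(procs):
    """Summarise the process started with --launched_by_manager:
    its parent, whether launchd (PID 1) has adopted it, and its children."""
    commands = {pid: cmd for pid, _ppid, cmd in procs}
    for pid, ppid, cmd in procs:
        if "--launched_by_manager" in cmd:
            return {
                "client_pid": pid,
                "orphaned": ppid == 1,
                "parent_command": commands.get(ppid, ""),
                "children": [c for c, cp, _cmd in procs if cp == pid],
            }
    return None  # no such process in the listing


SAMPLE = """\
  PID  PPID COMMAND
    1     0 /sbin/launchd
  364   264 /Applications/BOINCManager.app/Contents/MacOS/BOINCManager -psn_0_98328
  408   364 /Applications/BOINCManager.app/Contents/Resources/boinc --redirectio --launched_by_manager
11069   408 setiathome_7.00_i686-apple-darwin
"""

print(describe_client(parse_ps(SAMPLE)))
```

On a hung system, an "orphaned" value of True, or science apps that are missing from "children" but still present in the process list, would point to the daemon or its apps having outlived the manager.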

Since I didn't write the code and haven't looked at it, I'd be speculating, but my guess is that a signal or error message is being ignored when it really needs to be handled. It might even be a MacOS-specific error that isn't a Unix(R) error; those are not well documented, and with each new version there are more of them.

Now as to what may be causing it, if you are running other programs which may be higher priority and long running, then it is possible that the communication between the manager and the daemon is being timed out because they can't get enough/any run time before default timers expire. System virus scans are one known thing that has caused this in the past. I suspect that many game programs may also cause this. Hate to say it but malware is another possible cause. Know what is running on your system and why.
ID: 58775
wrooney
Joined: 10 Dec 14
Posts: 6
United States
Message 58785 - Posted: 21 Dec 2014, 4:54:54 UTC - in response to Message 58775.  

I'm just starting to realize there may be a pattern to when this problem occurs and when it does not.

My Mac is set up with multiple users, with mine being the only Administrator. I did this just to avoid the possibility of other family members installing malware.

Now typically I leave my system up and logged on all night when I'm home so backups can run and BOINC has some time to play. In the morning before I leave the house I log out of my userid for security reasons (wouldn't want that person breaking into my house to have access to all my files, and my hard drive is encrypted). I do not however shut down the system.

On the weekends I tend to leave the system up with me logged in the entire weekend and often if no one else needs to use the machine I am the only one logged in the entire time.

I haven't had BOINC hang yet this weekend, and it may be the first time I have gone more than a day without having to log out for one reason or another. So I'm starting to believe that, since the install of Yosemite, maybe it is not quitting the science when I log out??

Keep in mind that when I restart Mac OS, the problem temporarily goes away.

I'll see if I can keep track of when I log on and off and when BOINC hangs to see if there really is a correlation.

- Bill
ID: 58785
wrooney
Joined: 10 Dec 14
Posts: 6
United States
Message 58962 - Posted: 24 Dec 2014, 16:08:23 UTC - in response to Message 58785.  

Okay, I have been able to run for over three days without a hang in BOINC with World Community Grid. The entire time I kept my computer running with me logged into my user.

On Monday, I logged out to let another family member log in on their user ID and use the computer for a few hours. During this time BOINC did not show signs of a hang, but it may have been too short a period for the hang to occur. When that person logged out and I logged back into my user, within just a few hours BOINC hung again. By "hung" I mean I received the error message "Communicating with BOINC client. Please wait…", with no way to get out of it short of selecting "Quit BOINC Manager". A restart of just BOINC Manager resulted in the message coming up immediately.

I then restarted Mac OS late on Monday, logged into my account and the message went away and BOINC has not hung since.

So I'm really thinking there is a problem (or a change) that was introduced in Mac OS X Yosemite that is causing this problem.
ID: 58962

Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.