Memory Usage

Message boards : BOINC client : Memory Usage

AlphaLaser
Joined: 14 Sep 06
Posts: 17
United States
Message 10932 - Posted: 16 Jun 2007, 2:19:05 UTC
Last modified: 16 Jun 2007, 2:21:29 UTC

I noticed BOINC is using a very large amount of memory and also CPU time:

[screenshot not preserved]

I was running 5.9.x and upgraded to the latest 5.10.6, but nothing changed in this regard. This just started occurring, and the only thing I can suspect as the cause is SciLINC. The reason is that the WUs from this project are minuscule (only a second or two) and each WU seems to deal with a large number of files. In fact, the projects directory for SciLINC lists 740+ files, and client_state.xml is just over 10 MB, probably because of all the tags referencing those files. Maybe BOINC is having trouble dealing with either all those files or all the associated file transfers.

BOINC Manager and BOINCView are both failing to contact the daemon, so the only way I'm able to really control BOINC right now is via stdoutdae.txt and manually editing client_state.xml.
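
For anyone who wants to check the same two numbers on their own host (file count in the project directory and the size of client_state.xml), a minimal Python sketch could look like the following. The data directory path and the project folder name in the example call are placeholders, not values taken from this thread, so adjust them for your install.

```python
import os
from pathlib import Path


def summarize_boinc_dir(data_dir, project_subdir):
    """Return (file_count, state_size_bytes) for a BOINC data directory.

    Both arguments are supplied by the caller; nothing is read from BOINC itself.
    """
    data_dir = Path(data_dir)
    project_dir = data_dir / "projects" / project_subdir
    # Count files (not directories), including any in subdirectories.
    # os.walk yields nothing if the directory does not exist, giving 0.
    file_count = sum(len(files) for _, _, files in os.walk(project_dir))
    state_file = data_dir / "client_state.xml"
    state_size = state_file.stat().st_size if state_file.exists() else 0
    return file_count, state_size


if __name__ == "__main__":
    # Hypothetical paths: replace with your own data directory and project folder.
    count, size = summarize_boinc_dir(r"C:\Program Files\BOINC", "scilinc.example")
    print(f"{count} project files, client_state.xml is {size / 1_000_000:.1f} MB")
```

Running it periodically gives a rough picture of whether the backlog is growing or draining.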
ID: 10932
AlphaLaser
Joined: 14 Sep 06
Posts: 17
United States
Message 10936 - Posted: 16 Jun 2007, 3:31:24 UTC
Last modified: 16 Jun 2007, 3:33:41 UTC

Wanted to add that 1) there are now over 1,600 files in the project directory for SciLINC; 2) stdoutdae.txt indicates the project continues to download/upload files even though I set BOINC to no new work by editing client_state.xml at least an hour ago (to try to get files out of the pending queue, I've increased max file transfers per project to 8 through cc_config.xml and confirmed this change in stdoutdae.txt); and 3) I took the following screenshot of the BOINC daemon in Process Explorer, if it helps any:

[screenshot not preserved]

There doesn't appear to be a decent way to contact admins of SciLINC to maybe get the files grouped together somehow.
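
For reference, the cc_config.xml change mentioned above can be generated with a short Python sketch like this one. It assumes the `<max_file_xfers_per_project>` option inside `<options>`, plus the separate `<max_file_xfers>` option for the overall cap, as described in the BOINC client configuration documentation; the function name is made up for illustration. As far as I know the client reads this file at startup, so check stdoutdae.txt afterwards to confirm the value was picked up.

```python
import xml.etree.ElementTree as ET


def build_cc_config(per_project, total=None):
    """Build a cc_config.xml document as a string.

    per_project -> <max_file_xfers_per_project>
    total       -> <max_file_xfers> (optional overall cap, omitted if None)
    """
    root = ET.Element("cc_config")
    options = ET.SubElement(root, "options")
    ET.SubElement(options, "max_file_xfers_per_project").text = str(per_project)
    if total is not None:
        ET.SubElement(options, "max_file_xfers").text = str(total)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Print the document; save it as cc_config.xml in the BOINC data
    # directory by hand once you have checked the contents.
    print(build_cc_config(per_project=8))
```

If the per-project value is raised above the overall cap, the overall cap presumably still limits the number of simultaneous transfers, so it may be worth setting both.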
ID: 10936
KSMarksPsych
Joined: 30 Oct 05
Posts: 1239
United States
Message 10937 - Posted: 16 Jun 2007, 3:35:24 UTC

I wouldn't be all that surprised if it was choking on that. My client_state is sitting at 97K. boinc.exe is taking 5300 K of memory and boincmgr.exe is taking 2900 K. So with your client_state file that huge, BOINC constantly writing to it, and WUs taking only a few seconds with hundreds of files generated... eek.

Can you shut off work fetch for that project in client_state and once everything is gone, exit out of BOINC and see if it goes back to normal on a restart?

I see that project has no forums. I did send the admin a PM through BOINC Stats last night asking him to activate them. This is a great example of why projects need their forums running.
Kathryn :o)
ID: 10937
Nicolas
Joined: 19 Jan 07
Posts: 1179
Argentina
Message 10939 - Posted: 16 Jun 2007, 3:41:56 UTC

Editing the client state won't really work... BOINC reads it only at startup, loading data to internal structures. Then it writes it to disk when those internal structures change. If you edit the XML file, internal data on the client is still the same, since it never reloads the file from disk unless you restart BOINC.
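
The read-once, write-on-change behaviour described above can be illustrated with a toy Python model. This is not BOINC's code: it uses a small JSON file instead of client_state.xml, and the class and key names are invented for the example. It only shows why an edit made while the daemon is running gets lost.

```python
import json
import os
import tempfile


class StateStore:
    """Toy model of a daemon that reads its state file only at startup."""

    def __init__(self, path):
        self.path = path
        # The file is read exactly once, here.
        with open(path, encoding="utf-8") as fh:
            self.state = json.load(fh)

    def set(self, key, value):
        # In-memory structures are the source of truth; every change is
        # flushed to disk, overwriting whatever is currently in the file.
        self.state[key] = value
        with open(self.path, "w", encoding="utf-8") as fh:
            json.dump(self.state, fh)


def demo():
    path = os.path.join(tempfile.mkdtemp(), "state.json")
    with open(path, "w", encoding="utf-8") as fh:
        json.dump({"no_new_work": False}, fh)

    store = StateStore(path)  # the "daemon" starts and loads the file

    # Someone edits the file while the daemon is running...
    with open(path, "w", encoding="utf-8") as fh:
        json.dump({"no_new_work": True}, fh)

    # ...but the daemon never re-reads it, and its next write clobbers the edit.
    store.set("last_transfer", 42)
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


if __name__ == "__main__":
    print(demo())
```

After `demo()` the file again says `no_new_work` is false: the manual edit only sticks if it is made while the daemon is stopped.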

10MB client state? Now *that* is insane...
ID: 10939
AlphaLaser
Joined: 14 Sep 06
Posts: 17
United States
Message 10942 - Posted: 16 Jun 2007, 3:57:10 UTC
Last modified: 16 Jun 2007, 4:01:11 UTC


Editing the client state won't really work... BOINC reads it only at startup, loading data to internal structures. Then it writes it to disk when those internal structures change. If you edit the XML file, internal data on the client is still the same, since it never reloads the file from disk unless you restart BOINC.

10MB client state? Now *that* is insane...


I guess I didn't mention it, but I did in fact stop BOINC prior to making any edits to the file. I can't really check, but I think BOINC has developed a huge backlog of file transfers to the project. I think the periodic dips in the second screenshot I posted may actually be points where BOINC initiated another set of transfers; I think BOINC does have a short wait period in between those transfers.

Work fetch has been disabled for a while and I've suspended all other projects in an attempt to get rid of the backlog.
ID: 10942
KSMarksPsych
Joined: 30 Oct 05
Posts: 1239
United States
Message 10943 - Posted: 16 Jun 2007, 4:19:55 UTC

The admin had posted in a thread over at BOINC Stats. You might want to PM him. I've asked him to turn the forums on. But I haven't gotten a response yet.
Kathryn :o)
ID: 10943
AlphaLaser
Joined: 14 Sep 06
Posts: 17
United States
Message 10951 - Posted: 16 Jun 2007, 9:18:28 UTC

Hi, I did what you suggested and also sent a PM to the admin. In the thread at BoincStats, the other testers have reported the same issue. I noted a side effect of this: when I tried to launch BOINC Manager to observe, the manager locked up. I use a service install, so when I'm physically at the computer to manage it I start the BOINC Manager. BOINCView, however, doesn't lock up, it simply reports the host as being unavailable--maybe this is a separate problem with the manager?

It seems as if the number of files in the BOINC project directory has peaked at just under 10K, and BOINC itself is still uploading files every minute or so. I'm going to try and sit this one out and see if BOINC will settle down after enough files are out. I've got a gig on the affected host, so despite the huge resource hogging the computer itself is still responsive enough to use.
ID: 10951
KSMarksPsych
Joined: 30 Oct 05
Posts: 1239
United States
Message 10956 - Posted: 16 Jun 2007, 11:35:15 UTC

I remember similar things happening over at RCN when their WUs were generating multiple files and running for only a few seconds at a time. I do remember the manager becoming completely unresponsive.

Kathryn :o)
ID: 10956
Nicolas
Joined: 19 Jan 07
Posts: 1179
Argentina
Message 10967 - Posted: 16 Jun 2007, 16:43:54 UTC - in response to Message 10951.  
Last modified: 16 Jun 2007, 16:44:17 UTC

BOINCView, however, doesn't lock up, it simply reports the host as being unavailable--maybe this is a separate problem with the manager?

Yes, it is. The manager stops responding to absolutely anything (including GUI events) while it's waiting for the client to respond. That's also what causes the manager to hang if you have no Internet connection. Also, if you use the manager with a remote host that is physically far away (causing high ping times, 500 ms in my example), the manager cycles between half a second hung and half a second responsive.
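
To make the difference concrete, here is a minimal Python sketch of the non-blocking approach: the request runs in a worker thread while the main loop keeps ticking, and the host is reported as unavailable after a timeout. It is not the code of BOINC Manager or BOINCView; `slow_rpc` is a stand-in that just sleeps, and all names are invented for the example.

```python
import queue
import threading
import time


def slow_rpc(delay):
    """Stand-in for a request to the client; sleeps to simulate latency."""
    time.sleep(delay)
    return "reply"


def call_in_background(func, *args):
    """Run func in a worker thread and return a queue that will hold its result."""
    result = queue.Queue(maxsize=1)
    threading.Thread(target=lambda: result.put(func(*args)), daemon=True).start()
    return result


def poll_with_ui(delay, tick=0.01, timeout=1.0):
    """Keep 'handling UI events' while waiting; give up after timeout seconds."""
    pending = call_in_background(slow_rpc, delay)
    ui_ticks = 0
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            return pending.get(timeout=tick), ui_ticks
        except queue.Empty:
            ui_ticks += 1  # a real GUI would process its event loop here
    return None, ui_ticks  # report the host as unavailable instead of hanging
```

Calling `poll_with_ui(0.5)` returns the reply after roughly half a second while the tick counter keeps advancing; with a timeout shorter than the delay it returns `None` rather than freezing.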

Good to know the BOINCView author knows how network apps have to be written.
ID: 10967
AlphaLaser
Joined: 14 Sep 06
Posts: 17
United States
Message 10970 - Posted: 16 Jun 2007, 19:15:38 UTC

This issue has finally subsided. I boosted max file transfers up to 24 earlier and it finally went back to normal just a few minutes ago.
ID: 10970
zombie67
Joined: 14 Feb 06
Posts: 139
United States
Message 10989 - Posted: 17 Jun 2007, 7:18:37 UTC

What a nightmare. I had three machines completely messed up. The only way I could recover was to make a copy of the BOINC folder, uninstall BOINC, delete the remaining folder, reinstall BOINC, and start over from there.

Now I need to learn how to recover the SAP projects that were almost complete. Pointers?
Reno, NV
Team: SETI.USA
ID: 10989
Les Bayliss
Help desk expert
Joined: 25 Nov 05
Posts: 1654
Australia
Message 10990 - Posted: 17 Jun 2007, 7:24:10 UTC


Now I need to learn how to recover the SAP projects that were almost complete. Pointers?

From a backup. Otherwise they're gone.
README - Backup and Restore
ID: 10990
zombie67
Joined: 14 Feb 06
Posts: 139
United States
Message 10991 - Posted: 17 Jun 2007, 7:57:56 UTC - in response to Message 10990.  


Now I need to learn how to recover the SAP projects that were almost complete. Pointers?

From a backup. Otherwise they're gone.
README - Backup and Restore

Thanks! I'll give it a shot.
Reno, NV
Team: SETI.USA
ID: 10991
Ron Parker
Joined: 18 Jun 07
Posts: 2
Message 11026 - Posted: 18 Jun 2007, 16:41:41 UTC

(This is a crosspost from BOINCStats.)

/me dons his flameproof fire suit.

Hello all,

I am the one responsible for SciLINC and the problems you have been experiencing.

First of all I would like to apologize for the problems everyone is seeing. When I returned Friday I realized that workunits were not flowing, the forums were disabled and an unexpected number of people and teams had joined the project. I would also like to say thank you to everyone here that has responded with helpful input for diagnosing and correcting these issues.

My first priorities were getting the work flowing again then turning on the forums so that there would be a place for feedback, venting and frustration resolution (that almost sounds politically correct.)

Just to let you all know, we did perform internal testing on the project and did not see these issues. Of the machines we tested, one had problems and that was my Windows development machine. The OpenGL driver was causing problems and the machine lagged a little bit. Nothing like what has been reported here. But that machine is also heavily loaded in a number of ways.

Of course the latency of the Internet is much higher than our local network. Unfortunately, I did not foresee that contributing to these problems.

Updates will be posted to the main SciLINC page as we get our database back on-line and are able to address the work unit issue.

Part of the discussion here has hints on improving SciLINC performance so that your machine may recover better once the database is back up.

Sincerely and regretfully,

Ron Parker
ID: 11026
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15477
Netherlands
Message 11030 - Posted: 18 Jun 2007, 17:42:02 UTC - in response to Message 11026.  

Fixing the links. URL= doesn't use quotes.

(This is a crosspost from BOINCStats.)

Updates will be posted to the main SciLINC page as we get our database back on-line and are able to address the work unit issue.

Part of the discussion here has hints on improving SciLINC performance so that your machine may recover better once the database is back up.


ID: 11030
Ron Parker
Joined: 18 Jun 07
Posts: 2
Message 11036 - Posted: 18 Jun 2007, 20:35:46 UTC - in response to Message 11030.  

Thanks, Ageless.

That's what I get for copying and pasting from another board.
ID: 11036


Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.