Tasks with "Waiting for memory" status don't free up CPU cores

Robert Peetsalu

Joined: 2 Mar 16
Posts: 1
Estonia
Message 68091 - Posted: 2 Mar 2016, 22:29:51 UTC

Problem:
A task in the "Waiting for memory" state does not release its CPU core. On systems where free RAM is always below what the task demands (e.g. 400 MB for E@H), the core can sit idle indefinitely, and smaller tasks that would fit never get computed.

Cause:
The E@H project creates big tasks that reserve a lot of RAM (400 MB for more than 24 hours). It's highly likely that BOINC will be suspended at least once during that time, and after it resumes there will not be enough free/shared RAM for these tasks, so they go into "Waiting for memory". Instead of reassigning the CPUs to other tasks, BOINC leaves the cores unused.

Example:
This state went on for about a week on my laptop until I aborted the tasks. I usually run 10 tasks simultaneously (8 cores + 2 GPU, 8 GB RAM with a 25% limit for BOINC during use), but only 6 progressed at any point during that week. The other 4 slots were held by E@H tasks. Soon after I aborted them, 2 more E@H tasks went into "Waiting for memory". I'll suspend E@H until it gets fixed.
Agentb
Joined: 30 May 15
Posts: 265
United Kingdom
Message 68092 - Posted: 3 Mar 2016, 0:28:29 UTC - in response to Message 68091.  

Some discussion about this here
Juha
Volunteer developer
Volunteer tester
Help desk expert

Joined: 20 Nov 12
Posts: 801
Finland
Message 68109 - Posted: 3 Mar 2016, 22:24:01 UTC - in response to Message 68091.  

So 25% of 8 GB is 2 GB.
4*400 MB is 1.6 GB.
That leaves 400 MB free.
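
For anyone who wants to plug in their own numbers, the arithmetic above can be sketched in a few lines of Python. This is only a back-of-the-envelope model: the variable names are made up for illustration, the figures are the ones quoted in this thread, and decimal units (1 GB = 1000 MB) are assumed, as in the post.

```python
# Figures quoted in the thread; decimal units (1 GB = 1000 MB) are an assumption.
TOTAL_RAM_MB = 8 * 1000       # 8 GB laptop
BOINC_LIMIT_FRACTION = 0.25   # "25% limit for BOINC during use"
EINSTEIN_TASKS = 4            # E@H tasks holding a slot
TASK_RAM_MB = 400             # approximate demand per E@H task

budget_mb = TOTAL_RAM_MB * BOINC_LIMIT_FRACTION   # what BOINC may use in total
claimed_mb = EINSTEIN_TASKS * TASK_RAM_MB         # what the four big tasks want
free_mb = budget_mb - claimed_mb                  # what is left for anything else

print(f"budget:  {budget_mb:.0f} MB")
print(f"claimed: {claimed_mb:.0f} MB")
print(f"free:    {free_mb:.0f} MB")
```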

I can't tell if the question is 'why couldn't BOINC find a task to fit in that 400 MB' or 'why couldn't BOINC unload an already running 400 MB task and replace it with something smaller'.

Either way, I think we are going to need to see a list of the tasks that were running and the tasks that were available when all this happened. That list would need to include the exact memory usage of the running tasks and the expected memory usage of tasks not yet started. The <mem_usage_debug> log flag reports the memory usage of running tasks, but I can't seem to find a flag that tells how BOINC takes available memory into account when scheduling tasks.
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5080
United Kingdom
Message 68110 - Posted: 3 Mar 2016, 22:58:50 UTC - in response to Message 68109.  

... but I can't seem to find a flag that tells how BOINC takes available memory into account when scheduling tasks.

Isn't that specified by each project in the workunit definition?

<rsc_memory_bound>33554432.000000</rsc_memory_bound>

(32 MB) appears in the SETI multibeam spec, for example.

The only Einstein task I have immediately to hand is a BRP6 for intel_gpu, which specifies

<rsc_memory_bound>260000000.000000</rsc_memory_bound>

(<250 MB)
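
The values are in bytes, so a quick way to check the figures in brackets is to divide by 1024 * 1024. A minimal sketch, assuming each fragment is a single well-formed element as pasted above (the helper name `memory_bound_mib` is just for illustration):

```python
import xml.etree.ElementTree as ET

MIB = 1024 * 1024  # bytes per MiB


def memory_bound_mib(fragment: str) -> float:
    """Return the value of a single <rsc_memory_bound> element in MiB."""
    element = ET.fromstring(fragment)
    return float(element.text) / MIB


# The two fragments quoted in this post.
seti = "<rsc_memory_bound>33554432.000000</rsc_memory_bound>"
einstein = "<rsc_memory_bound>260000000.000000</rsc_memory_bound>"

print(f"SETI multibeam: {memory_bound_mib(seti):.1f} MiB")
print(f"Einstein BRP6:  {memory_bound_mib(einstein):.1f} MiB")
```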
Agentb
Joined: 30 May 15
Posts: 265
United Kingdom
Message 68111 - Posted: 3 Mar 2016, 23:24:31 UTC - in response to Message 68110.  
Last modified: 4 Mar 2016, 0:15:09 UTC

I have a couple more Einstein examples:

<app_name>einstein_O1AS20-100T</app_name>
<rsc_memory_bound>157286400.000000</rsc_memory_bound>

<app_name>einsteinbinary_BRP4G</app_name>
<rsc_memory_bound>260000000.000000</rsc_memory_bound>

Typically they all run at 80-128 MB (resident memory)

except for this app

<app_name>hsgamma_FGRPB1</app_name>
<rsc_memory_bound>450000000.000000</rsc_memory_bound>

which runs about 400MB.

I'm not quite sure I understand why you would run with 100% of the processors and 25% of the memory; perhaps just avoid large-memory apps in that configuration.

If any task is "Waiting for memory", does that stop all other tasks from starting?
Juha
Volunteer developer
Volunteer tester
Help desk expert

Joined: 20 Nov 12
Posts: 801
Finland
Message 68126 - Posted: 4 Mar 2016, 21:49:04 UTC - in response to Message 68110.  

... but I can't seem to find a flag that tells how BOINC takes available memory into account when scheduling tasks.

Isn't that specified by each project in the workunit definition?

<rsc_memory_bound>33554432.000000</rsc_memory_bound>

(32 MB) appears in the SETI multibeam spec, for example.


Well, yes. But then there's some code that deals with a task's working set size, which, btw, is 50-55 MB for SETI MB tasks on my machine.

It would just be easier if all that was listed in some debug log message, so there would be no need to combine the information from multiple places.

I stumbled upon this ticket: memory and priority idea. No reply from devs. I'm beginning to think that this isn't about the scheduler failing to select a task when it could, but that the scheduler should optimise resource usage.
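
To illustrate what "optimise resource usage" could mean here, below is a toy first-fit sketch in Python. It is not BOINC's scheduler and makes no claim about how the client actually works; `Task`, `pick_tasks` and the task names are hypothetical, the 450 MB figure is the FGRPB1 memory bound quoted earlier in the thread, and the 50 MB figure is a round number for a small task.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    memory_mb: int


def pick_tasks(queue, cores, budget_mb):
    """Greedy first-fit: walk the queue in order and start every task that
    still fits in the remaining memory, instead of holding a core for a
    task that does not fit."""
    running = []
    free_mb = budget_mb
    for task in queue:
        if len(running) == cores:
            break  # every core is busy
        if task.memory_mb <= free_mb:
            running.append(task)
            free_mb -= task.memory_mb
        # otherwise skip the task and leave the core for a smaller one
    return running


# Hypothetical queue: five large tasks followed by four small ones.
queue = [Task(f"fgrp-{i}", 450) for i in range(5)]
queue += [Task(f"small-{i}", 50) for i in range(4)]

running = pick_tasks(queue, cores=8, budget_mb=2000)
print([task.name for task in running])
```

With these numbers the fifth large task is skipped because only 200 MB remain, and the four small tasks take the remaining cores, so all eight cores stay busy.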


Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.