Multiple WU progress toward completion

Ross
Joined: 28 Apr 10
Posts: 33
United States
Message 32544 - Posted: 3 May 2010, 4:38:46 UTC

On one of the projects I am running, multiple Work Units progress, in turn, toward completion (Rosetta). I have one CPU and have only claimed one CPU in my profile (at SETI under parl). I have asked for 4 days work in advance, mostly to outlast the weekly outages at SETI.

Recently, for example, I had 4 Rosetta WU, and 3 of them were progressing toward completion. Two had a near report deadline (urgent). The third had a distant deadline but was still taking up processor time. My workaround was to suspend it until the 2 urgent WU completed, but I'd rather not have to babysit my BOINC projects.

I am using (optimized) BOINC 6.10.43 on an older Windows XP computer. SETI is behaving as I would expect, with all fresh WU remaining in queue until each previous WU is completed. I am asking here because I believe that scheduling of WU is a BOINC issue, not a Rosetta issue.
ID: 32544 · Report as offensive
Ross
Joined: 28 Apr 10
Posts: 33
United States
Message 32545 - Posted: 3 May 2010, 4:46:29 UTC

I just noticed that 6.10.18 is the recommended client. Should I back off from my current (optimized) 6.10.43?
ID: 32545 · Report as offensive
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15483
Netherlands
Message 32546 - Posted: 3 May 2010, 5:05:37 UTC - in response to Message 32545.  

Not unless you happen upon the problem where your GPU doesn't have enough memory on board and BOINC keeps downloading work for it anyway.

Ross wrote:
My workaround was...

Stop doing that and let BOINC work it out. It has a scheduler that can only learn by doing things itself, so just let it. I have a single CPU system that has 4 days worth of work from Seti and Einstein, that I hardly ever look at and that manages just fine, as long as I let it do things by itself.

BOINC uses the First In First Out principle with round robin among projects. It'll run in high priority (earliest deadline first) mode when a task is in danger of not reaching its deadline. In this mode it won't let anything else run.
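As a rough illustration of the policy described above, here is a minimal sketch for one project on one CPU. It is not BOINC's actual code: the `Task` fields, the `endangered` test (a plain first in, first out projection), and the function names are all assumptions made for the example, and round robin among projects is left out.

```python
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    received: int       # download order within the project (1 = first in)
    deadline_h: float   # hours from now until the report deadline
    remaining_h: float  # estimated CPU hours still needed

def endangered(tasks):
    """Tasks that would finish late if the queue simply ran first in, first out.

    Hypothetical criterion for this sketch, not the real client's test.
    """
    elapsed, late = 0.0, []
    for t in sorted(tasks, key=lambda t: t.received):
        elapsed += t.remaining_h
        if elapsed > t.deadline_h:
            late.append(t)
    return late

def pick_next(tasks):
    """Return (task name, mode) for the task to run next on a single CPU."""
    late = endangered(tasks)
    if late:
        # High priority mode: earliest deadline first among tasks in trouble.
        return min(late, key=lambda t: t.deadline_h).name, "high priority"
    # Normal mode: first in, first out.
    return min(tasks, key=lambda t: t.received).name, "normal"

queue = [Task("a", 1, 200.0, 30.0), Task("b", 2, 40.0, 25.0)]
print(pick_next(queue))
```

In this made-up queue, task "b" arrived second but would finish at hour 55 against a 40 hour deadline, so the sketch picks it in high priority mode ahead of "a".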
ID: 32546 · Report as offensive
Snagletooth
Joined: 22 Jun 07
Posts: 4
Message 32559 - Posted: 3 May 2010, 22:36:25 UTC

Ross, am I correct to assume that you are the same Ross posting this question on the Rosetta board? There's been no response to my post in the Rosetta thread. Perhaps you haven't seen it, or perhaps the flaw in my argument is obvious to you. If it's the latter, will you kindly point out where I have gone wrong? It's driving me a little batty, as it seems fairly clear to me that memory limitations are the likely culprit, but that suggestion, made by both me and mod.sense, has gotten no response from you. If you have ruled memory out as a factor, could you explain how you did so?

And could you clarify one other thing? Are you seeing the designation "High Priority" next to the running task? It would appear in the status column. This is how BOINC indicates the task may be in deadline trouble. I understood that your description of a task as "urgent" indicated your fear that it was in deadline trouble but that BOINC was not actually running any tasks in "High Priority" mode. Is this correct?

Snags
ID: 32559 · Report as offensive
Ross
Joined: 28 Apr 10
Posts: 33
United States
Message 32562 - Posted: 4 May 2010, 2:49:03 UTC

Snagletooth,

I'll go back to Rosetta to post, as more of the questions are there.

Ross
ID: 32562 · Report as offensive
Ross
Joined: 28 Apr 10
Posts: 33
United States
Message 32708 - Posted: 10 May 2010, 17:59:51 UTC

I let BOINC do its thing and I eventually got 5 WU for Rosetta. (I only have 1 for SETI but that's because the project keeps saying there are none available.)

The five WU are due 5/14, 5/16, 5/16, 5/19, & 5/19. All Rosetta work had accumulated on the 5/14 WU until I looked today, when I found the later 5/19 WU Running high priority. I checked the messages and found no mention of memory, insufficient or otherwise.

Checking the Task Manager (Win XP) I see that the running WU (5/19 34%) has 122,144 K RAM, while the Waiting to run WU (5/14 73%) has 3,096 K RAM. While I will continue to observe these tasks, I'll keep my grubby fingers off of them, so BOINC can do its thing. My concern is that BOINC will preferentially schedule the WU due later and starve the WU due earlier, to the point where it may miss its deadline. BTW, Rosetta WU tend to take 25 to 30 hours elapsed time. While running, the task tends to take 96 to 99% of the CPU, as my poor little fingers typing this note don't make much of a drain on it.
ID: 32708 · Report as offensive
Snagletooth
Joined: 22 Jun 07
Posts: 4
Message 32756 - Posted: 12 May 2010, 17:42:50 UTC

To Ageless and other BOINC gurus:

Will you confirm that BOINC receives information on the memory requirements of a particular workunit, so that it can skip over a workunit for which there is currently insufficient memory to run and go straight to another workunit without leaving a "waiting for memory" message?

On the Rosetta board Ross has now posted that all of the workunits have started and made some progress except for one of the two with May 16 deadlines. He has seen some running at High Priority but has also downloaded another Rosetta task. He has not spotted a "waiting for memory" message. FYI most Rosetta WUs (and all of Ross's) have 10 day deadlines, and Ross's should all have project-estimated CPU run times of 24 hours.

It's been pointed out to me that my understanding of memory use and management is incomplete at best, but I still think my theory, that memory limitations are the explanation for what Ross is seeing, is valid. If this is complete hogwash, could someone please say so? I have begun to obsess and it would be kind of someone to put me out of my misery.

Thanks,
Snags
ID: 32756 · Report as offensive
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15483
Netherlands
Message 32757 - Posted: 12 May 2010, 17:51:57 UTC - in response to Message 32756.  

Snagletooth wrote:
Will you confirm that BOINC receives information on the memory requirements of a particular workunit, so that it can skip over a workunit for which there is currently insufficient memory to run and go straight to another workunit without leaving a "waiting for memory" message?

As far as I know, no. The "waiting for memory" message will only show when the task (the application) in question has actually run out of memory to use.

It can skip tasks when it thinks it will have time to run them later on. At the same time it tries to finish all tasks by their deadlines, which can explain why some tasks with later deadlines run in high priority.

I think Ross should see, if he leaves everything alone and doesn't force anything, that the tasks with the May 16 deadline will start within the next 48 hours.
ID: 32757 · Report as offensive
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15483
Netherlands
Message 32758 - Posted: 12 May 2010, 18:24:22 UTC - in response to Message 32757.  

Jord wrote:
I think Ross should see, if he leaves everything alone and doesn't force anything, that the tasks with the May 16 deadline will start within the next 48 hours.

Having thought about this one: if he has multiple tasks with the same deadline, he may see that BOINC starts them for a bit, stops them, runs some others with the same deadline, stops those, runs the first ones again, stops them, etc., until it's reasonably sure it can reach the deadline on all of them.

But, depending on which BOINC version Ross uses at this time, this can go wrong. In pre-6.10.43 versions there is a minor bug in the rr_simulation calculations. This can account for missed deadlines.
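To make the idea of a round-robin projection concrete, here is a toy sketch. It is not BOINC's rr_simulation code and does not reproduce the bug mentioned above; the function names, the fixed time slice, and the hour-based inputs are assumptions for illustration only.

```python
def round_robin_finish_times(remaining_h, slice_h=1.0):
    """Share one CPU among tasks in fixed slices; return finish hour per task."""
    if slice_h <= 0:
        raise ValueError("slice_h must be positive")
    left = dict(remaining_h)   # copy so the caller's data is not modified
    finish = {}
    clock = 0.0
    while left:
        for name in list(left):
            run = min(slice_h, left[name])
            clock += run
            left[name] -= run
            if left[name] <= 1e-9:   # task done (tolerate float rounding)
                finish[name] = clock
                del left[name]
    return finish

def missed(remaining_h, deadline_h, slice_h=1.0):
    """Names of tasks projected to finish after their deadline."""
    finish = round_robin_finish_times(remaining_h, slice_h)
    return sorted(n for n, t in finish.items() if t > deadline_h[n])

# Two hypothetical tasks with the same deadline, alternating in 1 hour slices.
print(round_robin_finish_times({"a": 2.0, "b": 2.0}))
print(missed({"a": 2.0, "b": 2.0}, {"a": 3.5, "b": 3.5}))
```

A projection like this only tells the scheduler which tasks look late; what it then does about them (for example, switching to earliest deadline first) is a separate decision.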
ID: 32758 · Report as offensive
Snagletooth
Joined: 22 Jun 07
Posts: 4
Message 32762 - Posted: 12 May 2010, 20:46:59 UTC - in response to Message 32757.  


Jord wrote:
As far as I know, no. The "waiting for memory" message will only show when the task (the application) in question has actually run out of memory to use.



So BOINC would have to start up the task and immediately stop it if there isn't enough memory to run it? And then it would produce the "waiting for memory" message? I know Rosetta makes estimates of the amount of memory a task will require, so the server won't send tasks with high requirements to computers reporting the minimum available, but I didn't know if the client can see and make use of this estimate as well.

Thanks,
Snags
ID: 32762 · Report as offensive
Ross
Joined: 28 Apr 10
Posts: 33
United States
Message 32764 - Posted: 12 May 2010, 21:43:02 UTC

I have 1 Rosetta WU due 5/14, 92.990% complete, 1:50:07 estimated time to completion. It hasn't accumulated much time recently.

I have 1 R WU due 5/16 10:32:57 PM Ready to start.
I have 1 R WU due 5/16 10:32:57 PM also, 65.157% complete, 9:16:27 estimated. (Now running)
I have 1 R WU due 5/19 7:45:32 AM, 7.395% complete, 25:18:33 estimated.
I have 1 R WU due 5/19 10:42:38 PM, 72.406% complete, 7:17:04 estimated.
I have 1 new R WU due 5/21 Ready to start.

I have 1 SETI WU due 6/26, 88.719% complete, 00:35:39 estimated. (Only SETI, 50% share)

I'm seriously considering suspending the 5/21 WU so that it does not start for a while.

I have never seen a Waiting for memory message here, although I only have 2.5 Gig.
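The figures in the list above can be checked with some simple arithmetic. The sketch below runs the tasks in deadline order on one CPU and compares each projected finish with its deadline. Several inputs are assumptions, not facts from the post: the post's timestamp is treated as being on the same clock as the displayed deadlines, deadlines given without a time of day are taken as midnight at the start of that day, and the two "Ready to start" tasks are given the 24 hour estimate mentioned earlier in the thread.

```python
from datetime import datetime, timedelta

def hms(text):
    """Parse an 'H:MM:SS' estimate into a timedelta."""
    h, m, s = (int(p) for p in text.split(":"))
    return timedelta(hours=h, minutes=m, seconds=s)

NOW = datetime(2010, 5, 12, 21, 43, 2)   # assumption: post time, same clock as deadlines
FULL_RUN = timedelta(hours=24)           # assumption: estimate for unstarted Rosetta tasks

# (label, deadline, estimated time remaining)
TASKS = [
    ("R 5/14",    datetime(2010, 5, 14),             hms("1:50:07")),   # midnight assumed
    ("R 5/16 a",  datetime(2010, 5, 16, 22, 32, 57), FULL_RUN),         # ready to start
    ("R 5/16 b",  datetime(2010, 5, 16, 22, 32, 57), hms("9:16:27")),
    ("R 5/19 a",  datetime(2010, 5, 19, 7, 45, 32),  hms("25:18:33")),
    ("R 5/19 b",  datetime(2010, 5, 19, 22, 42, 38), hms("7:17:04")),
    ("R 5/21",    datetime(2010, 5, 21),             FULL_RUN),         # midnight assumed
    ("SETI 6/26", datetime(2010, 6, 26),             hms("00:35:39")),  # midnight assumed
]

def edf_projection(now, tasks, cpu_fraction=1.0):
    """Run tasks one after another in deadline order; report projected finishes."""
    clock = now
    rows = []
    for name, deadline, remaining in sorted(tasks, key=lambda t: t[1]):
        clock += remaining / cpu_fraction   # slower progress if the CPU is shared
        rows.append((name, clock, deadline, clock <= deadline))
    return rows

for name, finish, deadline, ok in edf_projection(NOW, TASKS):
    print(f"{name:10} finish {finish:%m/%d %H:%M}  due {deadline:%m/%d %H:%M}  {'ok' if ok else 'LATE'}")
```

Under these assumptions every task is projected to finish before its deadline, and the same holds with `cpu_fraction=0.5`, so the queue does not look overcommitted on paper.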
ID: 32764 · Report as offensive
Les Bayliss
Help desk expert
Joined: 25 Nov 05
Posts: 1654
Australia
Message 32765 - Posted: 12 May 2010, 22:00:51 UTC - in response to Message 32764.  

Don't fiddle !!!
Seriously !!!

If you can't stop yourself from "helping" BOINC, then take up long distance running, or something similar that will keep you away from the computer for a few weeks.

ID: 32765 · Report as offensive
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15483
Netherlands
Message 32766 - Posted: 12 May 2010, 22:32:07 UTC - in response to Message 32762.  

Snagletooth wrote:
So BOINC would have to start up the task and immediately stop it if there isn't enough memory to run it?

No, not immediately. If it does it immediately, there's either really not much memory in the system, or the initial memory needs of the app are ginormous.

It usually happens later in the run.
It also depends on what the user has set in his preferences, local or via the web, on how much memory BOINC should use when it's actively or idly running work.

Snagletooth wrote:
... but didn't know if the client can see and make use of this estimate as well.

No, the client uses the memory it detects, limited by the preferences set for:

Use at most X% of memory when computer is in use
Use at most Y% of memory when computer is not in use

The local (advanced) preferences override the web-preferences.
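A small sketch of how those two settings could translate into a memory limit. The function names and the preference keys (`max_pct_in_use`, `max_pct_idle`) are made up for this example, and treating the local preferences as a wholesale replacement of the web ones is an assumption about what "override" means here; the percentages are arbitrary.

```python
def effective_prefs(web, local=None):
    """Local (advanced) preferences, when present, replace the web ones."""
    return local if local is not None else web

def memory_limit_mb(detected_mb, prefs, computer_in_use):
    """Memory BOINC may use: a percentage of what the client detects."""
    key = "max_pct_in_use" if computer_in_use else "max_pct_idle"
    return detected_mb * prefs[key] / 100.0

web = {"max_pct_in_use": 50, "max_pct_idle": 90}     # hypothetical values
local = {"max_pct_in_use": 25, "max_pct_idle": 75}   # hypothetical values

# 2560 MB corresponds to the 2.5 GB mentioned earlier in the thread.
print(memory_limit_mb(2560, effective_prefs(web, local), computer_in_use=True))
print(memory_limit_mb(2560, effective_prefs(web), computer_in_use=False))
```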
ID: 32766 · Report as offensive
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15483
Netherlands
Message 32767 - Posted: 12 May 2010, 22:33:34 UTC - in response to Message 32764.  

Ross wrote:
I'm seriously considering suspending the 5/21 WU so that it does not start for a while.


I agree with Les on this one:

Les Bayliss wrote:
Don't fiddle !!!
Seriously !!!

If you can't stop yourself from "helping" BOINC, then take up long distance running, or something similar that will keep you away from the computer for a few weeks.



:-)
ID: 32767 · Report as offensive
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15483
Netherlands
Message 32769 - Posted: 12 May 2010, 23:11:13 UTC - in response to Message 32768.  

"..think the newer clients try to learn what science app requirements are from running the tasks"

Not that I have heard of. I'd even expect that to be a little problematic, since even on the most-used projects memory use differs per task, even when the same application is used.
ID: 32769 · Report as offensive
Ross
Joined: 28 Apr 10
Posts: 33
United States
Message 32778 - Posted: 13 May 2010, 20:02:28 UTC

I just realized that there's a discrepancy between BOINC scheduling SETI and Rosetta. I have NEVER seen BOINC schedule multiple SETI WU or even schedule a SETI WU out of Report deadline order. Yet that's consistently the behavior I see for Rosetta. I don't think we can blame this on Rosetta, since it's BOINC (6.10.43) doing the scheduling, not minirosetta 211.

Where can I find out why the BOINC releases after 6.10.18 have been withdrawn? This would help me decide if I should back off to the earlier release. I've read the thread about the release of 6.10.43 and 44, but it doesn't seem to address the problems I have, rather more on Mac.
ID: 32778 · Report as offensive
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15483
Netherlands
Message 32779 - Posted: 13 May 2010, 20:07:00 UTC - in response to Message 32778.  

Ross wrote:
Where can I find out why the BOINC releases after 6.10.18 have been withdrawn? This would help me decide if I should back off to the earlier release. I've read the thread about the release of 6.10.43 and 44, but it doesn't seem to address the problems I have, rather more on Mac.

Um, you read the thread on releasing .43/.44 but missed the reason why they were pulled? Even though I edited it into the first post and put it into the thread in its own post?

I must do better. ;-)

Rom Walton wrote:
Earlier today we pulled the last round of stable clients and rolled back to the stable clients that were available in early December.

A bug was introduced in 6.10.25 where the core client would continuously download new work from projects where the total GPU ram was enough to run the GPU app but not enough was available at run-time to actually run the application without crashing. This bug was fixed in 6.10.46.

As a result of having to pull the previous stable build we are moving forward with the 6.10.50 build as a potential release candidate build. I have adjusted the test grouping to enable all of them now.

We really need to get a new stable version of the Mac client out the door, CUDA support for the Mac is not in the current stable Mac client.

Please report your results, good or bad, as quickly as possible.

----- Rom

ID: 32779 · Report as offensive
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15483
Netherlands
Message 32781 - Posted: 13 May 2010, 21:05:21 UTC - in response to Message 32780.  

"and even if off, think a task is held in memory until first checkpoint is reached"

BOINC runs the applications until the task's first checkpoint. Which is fun on apps that don't checkpoint: their tasks run start to finish without giving up the CPU.
ID: 32781 · Report as offensive
Ross
Joined: 28 Apr 10
Posts: 33
United States
Message 32804 - Posted: 14 May 2010, 23:43:32 UTC - in response to Message 32779.  

Jord wrote:
Um, you read the thread on releasing .43/.44 but missed the reason why they were pulled? Even though I edited it into the first post and put it into the thread in its own post?

I must do better. ;-)

Rom Walton wrote:
Earlier today we pulled the last round of stable clients and rolled back to the stable clients that were available in early December.

A bug was introduced in 6.10.25 where the core client would continuously download new work from projects where the total GPU ram was enough to run the GPU app but not enough was available at run-time to actually run the application without crashing. This bug was fixed in 6.10.46.

As a result of having to pull the previous stable build we are moving forward with the 6.10.50 build as a potential release candidate build. I have adjusted the test grouping to enable all of them now.

We really need to get a new stable version of the Mac client out the door, CUDA support for the Mac is not in the current stable Mac client.

Please report your results, good or bad, as quickly as possible.

----- Rom


I did read that but failed to notice that 6.10.46 was later than the .43 I'm running. I recognized that .50 was later, but that was in the future so I browsed past it. I guess since I don't have an appropriate GPU, the bug doesn't apply to me.

I think I may have a clue as to the perceived misbehavior, but it would take a lot of effort to confirm it.

I read somewhere that BOINC schedules WU (within a project) in the order in which they were d/l, absent some pre-defined urgency in the Report deadline. I suppose it's possible that Rosetta WUs were d/l in an order other than deadline order. Still, that doesn't explain why two WU due on the 19th were accumulating time together (one due AM the other PM).

BTW, is Report deadline time UTC?
ID: 32804 · Report as offensive
Richard Haselgrove
Volunteer tester
Help desk expert
Joined: 5 Oct 06
Posts: 5082
United Kingdom
Message 32814 - Posted: 15 May 2010, 7:04:33 UTC - in response to Message 32804.  

Ross wrote:
BTW, is Report deadline time UTC?

A task deadline is a fixed, universal point in time, and it's displayed in UTC on project websites because somebody might be looking at it from anywhere in the world.

However, the BOINC Manager on your own machine will display the deadline in whatever local time setting you use. It's really quite clever about that: if you're looking at a task in winter, and the deadline isn't until the summer, BOINC and your computer's operating system should co-operate to make the daylight saving correction to the deadline.
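The point that one fixed instant shows up as different clock times can be illustrated in a few lines. The deadline value and the UTC offsets below are hypothetical, chosen only for the example, and fixed offsets are used to keep the sketch self-contained; real software would rely on the operating system's time zone rules, which is where the daylight saving correction comes from.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical deadline, stored as a single instant in UTC.
deadline_utc = datetime(2010, 5, 17, 2, 32, 57, tzinfo=timezone.utc)

def show(deadline, utc_offset_hours):
    """Render the same instant as wall-clock time at a fixed UTC offset."""
    local = deadline.astimezone(timezone(timedelta(hours=utc_offset_hours)))
    return local.strftime("%Y-%m-%d %H:%M:%S")

print("project website (UTC):", show(deadline_utc, 0))
print("viewer at UTC-4:      ", show(deadline_utc, -4))
print("viewer at UTC+2:      ", show(deadline_utc, 2))
```

The stored deadline never changes; only its display does, which is why a website and a local BOINC Manager can show different times for the same task.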
ID: 32814 · Report as offensive

Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.