BOINC Allows WUs to Go Past Deadline
Author | Message |
---|---|
Joined: 30 Aug 05 Posts: 65 |
I've seen this a couple of times but didn't worry too much, as the WUs in question were only worth 6 cs, but the principle is getting to me now. On my work machine (Dell E6850, dual core, 2GB RAM, BOINC 6.10.45, now .56), running Aqua, PrimeGrid, LHC, MalariaControl and FreeHAL, I've seen BOINC ignore a WU heading past its deadline. In fact, this morning there was one that was past its deadline. This appears to be due to the interaction between the multi-thread/core Aqua WUs, the single-core 'other' projects and the resource share allocation, resulting in a single-core project WU going past its deadline. Resource shares are: Aqua 16.55%; DNA 22.37% (suspended); FreeHAL 1.12%; LHC 42.73%; MalariaControl 15.55%; PrimeGrid 0.67%. Currently only Aqua and PrimeGrid are supplying WUs and, due to the resource share allocated to Aqua, only Aqua is being crunched unless PrimeGrid, which has 7 WUs cached, gets into deadline problems, which it will due to its low resource share. Effectively it's trying to operate as a backup project. I got into work this morning, installed 0.56 and noticed the red messages regarding a PrimeGrid WU past its deadline from yesterday, even though I had just been letting BOINC "do its thing". This PG WU was the only WU with a deadline of the 17th; the 6 others have deadlines of the 19th. Aqua WUs have deadlines of the 28th with 45min completion times, with 4 cached (limited by Aqua). It appears that with Aqua running, if BOINC doesn't see 2 or more WUs in deadline trouble, it will ignore the single WU and continue to crunch the multi-thread/core project. I have also seen on my home machine a WU get into deadline trouble and not be crunched when BOINC worked out that a core would be left idle if it worked on the WU in deadline trouble. [edit] I'll soon see if the same thing happens with the 6 cached PG WUs due tomorrow. |
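To put a number on how lopsided the split becomes when only Aqua and PrimeGrid have work, here is a rough sketch. It assumes the shares are simply renormalised over the projects currently supplying work, which is a simplification: the real client also tracks debts and deadlines. The function name `effective_shares` is made up for illustration.

```python
# Resource shares as listed in the post (percent of total).
shares = {
    "Aqua": 16.55,
    "DNA": 22.37,
    "FreeHAL": 1.12,
    "LHC": 42.73,
    "MalariaControl": 15.55,
    "PrimeGrid": 0.67,
}


def effective_shares(shares, supplying):
    """Renormalise shares over the projects currently supplying work.

    Assumption: the shares of idle projects are split proportionally
    among the active ones. Debt and deadline handling are ignored.
    """
    total = sum(shares[name] for name in supplying)
    return {name: 100.0 * shares[name] / total for name in supplying}


active = effective_shares(shares, ["Aqua", "PrimeGrid"])
for name, pct in active.items():
    print(f"{name}: {pct:.1f}%")
```

Under that assumption Aqua ends up with roughly 96% of the machine and PrimeGrid with roughly 4%, which fits the observation that PrimeGrid only gets CPU time when it is in deadline trouble.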
Joined: 29 Aug 05 Posts: 15575 |
Run with <rr_simulation>, <cpu_sched_debug>, <std_debug> and perhaps an option to make the stdoutdae.txt file a lot bigger than its normal 2MB. Run for the duration of your choice, then compress the output file and email it to me. You know where. I'll get it to the developers. In case you feel adventurous, you can run BOINC ClientSim to see if it does the same. Example of cc_config.xml with the above choices and a 20MB stdoutdae.txt file: <cc_config> <log_flags> <cpu_sched_debug>1</cpu_sched_debug> <rr_simulation>1</rr_simulation> <std_debug>1</std_debug> </log_flags> <options> <max_stdout_file_size>20971520</max_stdout_file_size> </options> </cc_config> |
Joined: 5 Oct 06 Posts: 5136 |
I think TGG is right to finger the handling of multi-threaded (i.e. AQUA) tasks for this one. We went through several stages of MT scheduling in development testing, including the currently recommended v6.10.18 which has a tendency to leave single-threaded tasks unstarted (or waiting to run) if there is an AQUA task in the mix but not active. I have a dual-core attached to AQUA and QuantumFIRE (you could call it my quantum computer...). At the moment, the AQUA admins are pressing harder for results, so I have NNT set for QF. Each time I do that, I have one orphaned QF task left over, which is never scheduled to run because there's nothing to pair it with - even though debt is on the limit, with +/- 86,400 seconds for the two projects. Up to now, I haven't let anything reach deadline (I've manually allowed the orphan a playmate when deadlines approach, and then they get scheduled together until one completes). But if we still need logs when the time comes (early hours of 24 May for the current one), I can supply them. On the other hand, I think David will recognise that this is a consequence of the current design. |
Joined: 29 Aug 05 Posts: 15575 |
OK, according to JM7 run with the following flags: <cc_config> <log_flags> <task>1</task> <cpu_sched_debug>1</cpu_sched_debug> <rr_simulation>1</rr_simulation> <cpu_sched>1</cpu_sched> </log_flags> <options> <max_stdout_file_size>20971520</max_stdout_file_size> </options> </cc_config> |
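Since a single mistyped tag makes the file malformed, it can be easier to generate the config than to type it. A minimal sketch using only the flags and the size quoted in the post above; the helper name `build_cc_config` is made up for illustration, and nothing here checks whether the client actually recognises a given flag.

```python
import xml.etree.ElementTree as ET

# Flags and log size exactly as quoted in the post above.
LOG_FLAGS = ["task", "cpu_sched_debug", "rr_simulation", "cpu_sched"]
MAX_STDOUT_BYTES = 20 * 1024 * 1024  # 20 MB = 20971520 bytes


def build_cc_config(flags, max_stdout_bytes):
    """Build a <cc_config> tree with every given log flag set to 1."""
    root = ET.Element("cc_config")
    log_flags = ET.SubElement(root, "log_flags")
    for flag in flags:
        ET.SubElement(log_flags, flag).text = "1"
    options = ET.SubElement(root, "options")
    size = ET.SubElement(options, "max_stdout_file_size")
    size.text = str(max_stdout_bytes)
    return root


root = build_cc_config(LOG_FLAGS, MAX_STDOUT_BYTES)
if hasattr(ET, "indent"):  # pretty-printing needs Python 3.9 or later
    ET.indent(root)
xml_text = ET.tostring(root, encoding="unicode")

# Parsing the text back raises ParseError if any tag is left unclosed.
ET.fromstring(xml_text)
print(xml_text)
```

Save the printed text as cc_config.xml in the BOINC data directory and have the client re-read its config file (or restart it) for the flags to take effect.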
Joined: 5 Oct 06 Posts: 5136 |
OK, I've updated my Quantum computer to v6.10.56, brought forward the QF deadline to 22:30 this evening (about 30 minutes longer than BOINC estimates it would need), and set the log flags. I have just three tasks on the machine at the moment - one QF and two AQUA. I suspended all three tasks while I did the fiddling around, then I first resumed the QF. BOINC reported it running High Priority. Then I resumed an AQUA: BOINC preempted the QF, and ran AQUA instead (task duration ~70 minutes, deadline 10 days away). I'll send John (and David?) edited highlights of the log as it approaches and passes the artificial deadline. Fiddling around with Unix time converters, I find that my phone number converts to next Saturday evening. I don't think that proves very much, but it was diverting.... |
Joined: 29 Aug 05 Posts: 15575 |
Never mind the log. David put in a fix at [trac]changeset:21563[/trac]. However, since we're at the end of 6.10, it's not going to be back-ported, but instead will show up in the next client range. |
Joined: 5 Oct 06 Posts: 5136 |
You mean all those megabytes of 18-May-2010 19:52:58 [QuantumFIRE alpha] [rr_sim] casino_p2-hno_04_parasweep.1000084_0 dur: 26513.53 = 0.335*26079.79 + 0.665*26732.13 are never going to be read? ;-) Just for the record, there's a "misses deadline by 21829.71" in there, and it has fetched new work for AQUA whilst in deadline trouble. I'm comfortable with leaving this in trunk and not holding up v6.10.56 - though if it gets called back yet again, and we have to go through another round of v6.10 testing, I'd suggest including this in the next re-release. |
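For anyone who does want to wade through those megabytes, here is a small sketch that pulls the numbers out of an [rr_sim] line like the one quoted above and re-checks the arithmetic. The regular expression is an assumption based on that single line, so other [rr_sim] messages may well be formatted differently; `parse_rr_sim` is a made-up helper name.

```python
import re

# The log line quoted in the post above.
LINE = (
    "18-May-2010 19:52:58 [QuantumFIRE alpha] [rr_sim] "
    "casino_p2-hno_04_parasweep.1000084_0 dur: 26513.53 = "
    "0.335*26079.79 + 0.665*26732.13"
)

# Assumed layout: task name, "dur:", then a two-term weighted sum.
PATTERN = re.compile(
    r"\[rr_sim\]\s+(?P<task>\S+)\s+dur:\s+(?P<dur>[\d.]+)\s+=\s+"
    r"(?P<w1>[\d.]+)\*(?P<e1>[\d.]+)\s+\+\s+"
    r"(?P<w2>[\d.]+)\*(?P<e2>[\d.]+)"
)


def parse_rr_sim(line):
    """Return the numeric fields of an rr_sim duration line, or None."""
    match = PATTERN.search(line)
    if match is None:
        return None
    fields = {
        key: float(value)
        for key, value in match.groupdict().items()
        if key != "task"
    }
    fields["task"] = match.group("task")
    # Recompute the weighted sum from the printed (rounded) terms.
    fields["recomputed"] = (
        fields["w1"] * fields["e1"] + fields["w2"] * fields["e2"]
    )
    return fields


parsed = parse_rr_sim(LINE)
print(parsed["task"], parsed["dur"], round(parsed["recomputed"], 2))
```

The recomputed value differs from the logged duration by a few hundredths of a second, which is what you would expect if the two weights are printed rounded to three decimals.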
Joined: 30 Aug 05 Posts: 65 |
Glad to help out...shame it's not considered important enough to make it to a .57 release. WUs going past deadline due to a bug...pretty important issue. |
Joined: 8 Jan 06 Posts: 448 |
"Glad to help out...shame it's not considered important enough to make it to a .57 release." I think that the dev currently considers it more important to get out a new stable release than to have to worry about the new bugs that may come about by putting in a fix for what is primarily a problem due to the needs of one project. |
Joined: 29 Aug 05 Posts: 15575 |
"I think that the dev currently considers it more important to get out a new stable release than to have to worry about the new bugs that may come about by putting in a fix for what is primarily a problem due to the needs of one project." Exactly. It's better to test this fix in a new client than to add it to what's now, finally, after 56 revisions, a reasonably stable client that adds ATI functionality. Remember, the developers didn't find it necessary to use 6.9 as the development number, as all that was needed was to add ATI functionality. They figured we'd go on to 6.11 a good 6 months ago! We're way behind on development. Let's hope they learned you can't just 'add something' and expect it to work from the get-go. :-) |