6.6.5 still ignoring no-new-work

Joseph Stateson
Volunteer tester
Joined: 27 Jun 08
Posts: 641
United States
Message 23033 - Posted: 11 Feb 2009, 4:19:00 UTC

As shown below, I got a total of 8 tasks even though none were asked for (according to the messages).

Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15477
Netherlands
Message 23037 - Posted: 11 Feb 2009, 6:32:42 UTC - in response to Message 23033.  

Sorry to say, but that's a useless log for the developers. Turn on the <work_fetch_debug>, <cpu_sched_debug> and <debt_debug> flags and try again. Then send the output thereof to the BOINC Alpha email list. Please follow these instructions if you want to Alpha test development versions. It's quite simple, really.
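
A minimal sketch of how those three flags might be switched on, assuming the client reads its logging options from a `cc_config.xml` file in the BOINC data directory, with each flag set to 1 inside a `<log_flags>` section. The helper name `build_cc_config` is made up for illustration; only the flag names come from the post above.

```python
# Flags named in the post above; each one becomes <flag>1</flag>.
DEBUG_FLAGS = ["work_fetch_debug", "cpu_sched_debug", "debt_debug"]


def build_cc_config(flags):
    """Return the text of a cc_config.xml that turns on the given log flags.

    Assumption: the client expects <cc_config><log_flags>...</log_flags></cc_config>.
    """
    lines = ["<cc_config>", "  <log_flags>"]
    for name in flags:
        lines.append(f"    <{name}>1</{name}>")
    lines += ["  </log_flags>", "</cc_config>"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the file contents; saving them into the data directory is left
    # to the user, because that path differs between installations.
    print(build_cc_config(DEBUG_FLAGS))
```

The client probably has to re-read its config file, or be restarted, before the extra debug lines show up in the messages.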

Alinator
Joined: 8 Jan 06
Posts: 36
United States
Message 23038 - Posted: 11 Feb 2009, 6:36:07 UTC
Last modified: 11 Feb 2009, 6:37:36 UTC

This isn't a CC problem, it's a backend problem.

It looks to me like SAH Beta and GPUGrid haven't gotten around to updating their software package yet.

<edit> @ Ageless: Note the proper behaviour for Cosmo.

Alinator

Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15477
Netherlands
Message 23039 - Posted: 11 Feb 2009, 6:58:44 UTC - in response to Message 23038.  

I know that and have said something like that before to BeemerBiker, but he still wants to Alpha test and report it here.

Richard Haselgrove
Volunteer tester
Help desk expert
Joined: 5 Oct 06
Posts: 5077
United Kingdom
Message 23041 - Posted: 11 Feb 2009, 9:40:53 UTC

This email exchange took place on the BOINC Development mailing list on 30 January.

From: "David Anderson"
To: "Richard Haselgrove"
Cc: "BOINC Developers Mailing List" <boinc_dev@ssl.berkeley.edu>
Sent: Friday, January 30, 2009 12:13 AM
Subject: Re: [boinc_dev] Scheduler Request Calculation

This may be fixed now (beta had an old scheduler)
-- David

Richard Haselgrove wrote:
> There is an added complication at SETI Beta in that the SERVER is sending
> out work even when the client doesn't request it.
>
> http://setiweb.ssl.berkeley.edu/beta/forum_thread.php?id=1524#36670

David often does the SETI server updates himself (he has a dual role as Director of SETI, as well as lead developer for BOINC - see About SETI).

BB's screen-shot suggests that the problem, identified prior to 30 January, has not been cured by whatever upgrade David applied. Which client-side logs (reportable by project participants) would help the BOINC developers track down this server bug?

Leopoldo
Joined: 11 Feb 09
Posts: 8
Russia
Message 23042 - Posted: 11 Feb 2009, 13:00:53 UTC - in response to Message 23033.  

Joseph Stateson wrote:
> As shown below, got total of 8 tasks even though none were asked for (according to the messages)


I disagree. NNT is working now, at least on Win2003 with BM 6.6.5 in a non-service installation.

Both projects (S@H main and beta) were set to NNT, and then the Update button was pressed.
Here is my log.

11-Feb-2009 15:41:59 [SETI@home Beta Test] [sched_op_debug] Starting scheduler request
11-Feb-2009 15:41:59 [SETI@home Beta Test] Sending scheduler request: Requested by user.
11-Feb-2009 15:41:59 [SETI@home Beta Test] Reporting 1 completed tasks, not requesting new tasks
11-Feb-2009 15:41:59 [SETI@home Beta Test] CPU work request: 0.00 seconds; 0 idle CPUs
11-Feb-2009 15:41:59 [SETI@home Beta Test] CUDA work request: 0.00 seconds; 0 idle GPUs
11-Feb-2009 15:42:09 [SETI@home Beta Test] Scheduler request completed: got 0 new tasks
11-Feb-2009 15:42:09 [SETI@home Beta Test] [sched_op_debug] Server version 607
11-Feb-2009 15:42:09 [SETI@home Beta Test] Project requested delay of 7.000000 seconds
11-Feb-2009 15:42:09 [---] [sched_op_debug] handle_scheduler_reply(): got ack for result 03no08aa.3988.4162.15.11.58_1
11-Feb-2009 15:42:09 [SETI@home Beta Test] [sched_op_debug] Deferring communication for 7 sec
11-Feb-2009 15:42:09 [SETI@home Beta Test] [sched_op_debug] Reason: requested by project
11-Feb-2009 15:42:14 [SETI@home] [sched_op_debug] Starting scheduler request
11-Feb-2009 15:42:14 [SETI@home] Sending scheduler request: Requested by user.
11-Feb-2009 15:42:14 [SETI@home] Not reporting or requesting tasks
11-Feb-2009 15:42:14 [SETI@home] CPU work request: 0.00 seconds; 0 idle CPUs
11-Feb-2009 15:42:14 [SETI@home] CUDA work request: 0.00 seconds; 0 idle GPUs
11-Feb-2009 15:42:20 [SETI@home] Scheduler request completed: got 0 new tasks
11-Feb-2009 15:42:20 [SETI@home] [sched_op_debug] Server version 607
11-Feb-2009 15:42:20 [SETI@home] Project requested delay of 11.000000 seconds
11-Feb-2009 15:42:20 [SETI@home] [sched_op_debug] Deferring communication for 11 sec
11-Feb-2009 15:42:20 [SETI@home] [sched_op_debug] Reason: requested by project


As you can see, no new workunits were downloaded.
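
The same check can be done mechanically. Below is a rough sketch that scans message-log lines in the bracketed format shown above and reports any scheduler reply that delivered tasks although the request said it was not asking for any. The function name and the regular expressions are my own, and the message wording is taken only from this log, so other client versions may need different patterns.

```python
import re

# "<date> <time> [<project>] <message>", as in the log above.
LINE_RE = re.compile(r"^\S+ \S+ \[(?P<project>[^\]]+)\] (?P<msg>.*)$")
GOT_RE = re.compile(r"Scheduler request completed: got (\d+) new tasks")
# Wordings seen in this thread for a request that asks for no work.
NO_REQUEST_MARKERS = (
    "not requesting new tasks",
    "Not reporting or requesting tasks",
)

SAMPLE_LOG = [
    "11-Feb-2009 15:41:59 [SETI@home Beta Test] Sending scheduler request: Requested by user.",
    "11-Feb-2009 15:41:59 [SETI@home Beta Test] Reporting 1 completed tasks, not requesting new tasks",
    "11-Feb-2009 15:41:59 [SETI@home Beta Test] CPU work request: 0.00 seconds; 0 idle CPUs",
    "11-Feb-2009 15:42:09 [SETI@home Beta Test] Scheduler request completed: got 0 new tasks",
]


def find_unrequested_work(log_lines):
    """Return (project, task_count) for replies that sent unrequested tasks."""
    no_request = {}  # project -> True if its current request asked for no work
    hits = []
    for line in log_lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        project, msg = match.group("project"), match.group("msg")
        if msg.startswith("Sending scheduler request"):
            no_request[project] = False  # a new request starts; reset state
        elif any(marker in msg for marker in NO_REQUEST_MARKERS):
            no_request[project] = True
        else:
            got = GOT_RE.search(msg)
            if got and no_request.get(project) and int(got.group(1)) > 0:
                hits.append((project, int(got.group(1))))
    return hits


if __name__ == "__main__":
    print(find_unrequested_work(SAMPLE_LOG))
```

On the excerpt above it finds nothing to report, which matches what the log shows by eye.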

Richard Haselgrove
Volunteer tester
Help desk expert
Joined: 5 Oct 06
Posts: 5077
United Kingdom
Message 23047 - Posted: 11 Feb 2009, 13:56:25 UTC - in response to Message 23042.  

Unfortunately, it doesn't seem to be as simple as that. Both you and BB were "not requesting new tasks". You didn't get any, as is correct: BB got 2 new tasks anyway.

That isn't a NNT problem, it's a server problem. It happens very rarely, which makes it very difficult to isolate and fix (if everyone was getting work they hadn't asked for, all day every day, I think it would have been solved by now!)

All we can say to David Anderson is (a) the problem isn't solved yet: it still happens, though rarely, and (b) we haven't come up with any ideas yet about what we, as client operators, can supply by way of evidence to help him track it down.

All I can suggest is that anyone who actually spots it happening should disable network activity (to prevent files being over-written) and take a copy of the sched_request_ and sched_reply_ XML files for the project.
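
For anyone who wants to script that copy, here is a small sketch. It assumes the request and reply files sit directly in the BOINC data directory and that their names start with `sched_request_` and `sched_reply_`; the function name and the snapshot folder naming are hypothetical. Network activity still has to be disabled by hand first.

```python
import shutil
import time
from pathlib import Path


def snapshot_scheduler_files(data_dir, dest_root):
    """Copy sched_request_* and sched_reply_* files into a timestamped folder.

    data_dir  -- the BOINC data directory (its location varies by installation)
    dest_root -- where the snapshot folder should be created
    Returns the snapshot folder and the list of copied file names.
    """
    data_dir = Path(data_dir)
    dest = Path(dest_root) / time.strftime("sched_snapshot_%Y%m%d_%H%M%S")
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for pattern in ("sched_request_*", "sched_reply_*"):
        for src in sorted(data_dir.glob(pattern)):
            if src.is_file():
                shutil.copy2(src, dest / src.name)  # copy2 keeps timestamps
                copied.append(src.name)
    return dest, copied
```

Calling `snapshot_scheduler_files(<data directory>, <backup folder>)` right after spotting the unwanted tasks should preserve the exchange before the next scheduler contact overwrites it.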

Joseph Stateson
Volunteer tester
Joined: 27 Jun 08
Posts: 641
United States
Message 23048 - Posted: 11 Feb 2009, 14:48:53 UTC - in response to Message 23042.  
Last modified: 11 Feb 2009, 14:55:06 UTC

Chicken Little says 6.6.5 may be working after all

I ran another update with NO-NEW-WORK set and this time got no data. The only thing I can think of is that the WU results that were sent out when I ran the "user update" yesterday had all been prepared by 6.6.4.

This time, all the results going out to the servers had been prepared by 6.6.5, and this is what the log showed (I ran the update twice and got the same results both times):



I will try this on my other system that still has a mix of 664 and 665 results.

Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15477
Netherlands
Message 23050 - Posted: 11 Feb 2009, 15:59:27 UTC - in response to Message 23047.  

Richard Haselgrove wrote:
> and (b) we haven't come up with any ideas yet about what we, as client operators, can supply by way of evidence to help him track it down.

See my earlier post, which lists the flags to turn on.

Richard Haselgrove
Volunteer tester
Help desk expert
Joined: 5 Oct 06
Posts: 5077
United Kingdom
Message 23051 - Posted: 11 Feb 2009, 16:41:56 UTC - in response to Message 23050.  

Jord wrote:
> Richard Haselgrove wrote:
> > and (b) we haven't come up with any ideas yet about what we, as client operators, can supply by way of evidence to help him track it down.
>
> See my earlier post which requests which flags to turn on.

Yes, I saw that one, but surely <work_fetch_debug>, <cpu_sched_debug> and <debt_debug> collectively can only reveal the decision-making process by which the local client came to its conclusion: "not requesting new tasks".

The server shouldn't, AFAIK, be responding to that reasoning: it should only be responding to the actual sched_request...xml. Or are you suggesting, in light of Dj Ninja's "bug or feature?" thread on the mailing list, that the server is in effect saying "hehe - I've checked your calculations on the list of all the work you've got on hand, and you got it wrong: you didn't ask for any work, but you should have done, so I'm sending some to you anyway - so there".

Still, it never hinders the investigation process to have additional information on hand, and possibly having those flags turned on will reveal something relevant. Of course, the one logging tool that would really help our understanding here would be the server log: if the problem was being reported at Einstein, we could click through from the host record to the relevant server scheduler log for the host: but that isn't available at SETI Beta, and I'm not going to email Eric and ask for it while he's in the middle of an application launch.

Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15477
Netherlands
Message 23053 - Posted: 11 Feb 2009, 18:24:46 UTC - in response to Message 23051.  

Set <sched_op_debug> on, it'll tell you what the scheduler deemed was needed.

[B@H] Wassertropfen
Joined: 11 Feb 09
Posts: 3
Germany
Message 23150 - Posted: 17 Feb 2009, 0:42:31 UTC
Last modified: 17 Feb 2009, 0:43:21 UTC

In 5 hours I will run out of work again. :( Then the scheduler asks for CUDA work from every project except GPUGrid.

Please tell me what the 6.6.5 scheduler needs:

17.02.2009 01:40:18 GPUGRID [sched_op_debug] Starting scheduler request
17.02.2009 01:40:18 GPUGRID Sending scheduler request: Requested by user.
17.02.2009 01:40:18 GPUGRID Not reporting or requesting tasks
17.02.2009 01:40:18 GPUGRID CPU work request: 0.00 seconds; 0 idle CPUs
17.02.2009 01:40:18 GPUGRID CUDA work request: 0.00 seconds; 0 idle GPUs
17.02.2009 01:40:23 GPUGRID Scheduler request completed: got 0 new tasks
17.02.2009 01:40:23 GPUGRID [sched_op_debug] Server version 607
17.02.2009 01:40:23 GPUGRID Project requested delay of 31.000000 seconds
17.02.2009 01:40:23 GPUGRID [sched_op_debug] Deferring communication for 31 sec
17.02.2009 01:40:23 GPUGRID [sched_op_debug] Reason: requested by project
Steter Tropfen höhlt den Stein. :)
Constant dripping wears away the stone. :)

Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.