Is it possible to increase delay between result upload and reporting?

Message boards : Questions and problems : Is it possible to increase delay between result upload and reporting?

AuthorMessage
Raistmer

Send message
Joined: 9 Apr 06
Posts: 302
Message 19861 - Posted: 31 Aug 2008, 17:35:29 UTC

See this thread for the reason behind the question (in short, a small delay sometimes leads to a "Validate error" when the validator can't find the just-uploaded result):
http://setiathome.berkeley.edu/forum_thread.php?id=49100

ID: 19861 · Report as offensive
Profile Jord
Volunteer tester
Help desk expert
Avatar

Send message
Joined: 29 Aug 05
Posts: 15480
Netherlands
Message 19862 - Posted: 31 Aug 2008, 19:29:35 UTC

Do you run a cc_config.xml file that has this entry in it?
<report_results_immediately>1</report_results_immediately>

If you do, change it to this:
<report_results_immediately>0</report_results_immediately>

Save the file.
Exit BOINC and restart it.

Then BOINC will report after 24 hours (as normal).
ID: 19862 · Report as offensive
Raistmer

Send message
Joined: 9 Apr 06
Posts: 302
Message 19863 - Posted: 31 Aug 2008, 19:35:15 UTC

cc_config:

<cc_config>
<log_flags>
<task>1</task>
<file_xfer>1</file_xfer>
<sched_ops>1</sched_ops>
<cpu_sched>0</cpu_sched>
<cpu_sched_debug>0</cpu_sched_debug>
<rr_simulation>0</rr_simulation>
<debt_debug>0</debt_debug>
<task_debug>0</task_debug>
<work_fetch_debug>0</work_fetch_debug>
<unparsed_xml>0</unparsed_xml>
<state_debug>0</state_debug>
<file_xfer_debug>0</file_xfer_debug>
<sched_op_debug>0</sched_op_debug>
<http_debug>0</http_debug>
<proxy_debug>0</proxy_debug>
<time_debug>0</time_debug>
<http_xfer_debug>0</http_xfer_debug>
<benchmark_debug>0</benchmark_debug>
<measurement_debug>0</measurement_debug>
<poll_debug>0</poll_debug>
<guirpc_debug>0</guirpc_debug>
<scrsave_debug>0</scrsave_debug>
<app_msg_send>0</app_msg_send>
<app_msg_receive>0</app_msg_receive>
<mem_usage_debug>0</mem_usage_debug>
<network_status_debug>0</network_status_debug>
<checkpoint_debug>0</checkpoint_debug>
</log_flags>
<options>
<save_stats_days>30</save_stats_days>
<dont_check_file_sizes>0</dont_check_file_sizes>
<http_1_0>0</http_1_0>
<ncpus>0</ncpus>
<max_file_xfers>8</max_file_xfers>
<max_file_xfers_per_project>2</max_file_xfers_per_project>
<work_request_factor>1,000000</work_request_factor>
<suppress_net_info>0</suppress_net_info>
<disallow_attach>0</disallow_attach>
<os_random_only>0</os_random_only>
</options>
</cc_config>


There is no <report_results_immediately> entry.
It reports when it asks for new work (sometimes right after uploading).
ID: 19863 · Report as offensive
Profile Jord
Volunteer tester
Help desk expert
Avatar

Send message
Joined: 29 Aug 05
Posts: 15480
Netherlands
Message 19864 - Posted: 31 Aug 2008, 19:40:17 UTC

What's your connect to server and additional work request set to?
ID: 19864 · Report as offensive
Raistmer

Send message
Joined: 9 Apr 06
Posts: 302
Message 19865 - Posted: 31 Aug 2008, 19:43:17 UTC - in response to Message 19864.  

What's your connect to server and additional work request set to?

At the moment of the error they were 4+4.

Just an example showing that BOINC wants to report tasks when it asks for new work:
2008.08.31 22:41:20|SETI@home|Sending scheduler request: To fetch work. Requesting 10177 seconds of work, reporting 0 completed tasks

ID: 19865 · Report as offensive
Profile Jord
Volunteer tester
Help desk expert
Avatar

Send message
Joined: 29 Aug 05
Posts: 15480
Netherlands
Message 19866 - Posted: 31 Aug 2008, 19:48:01 UTC - in response to Message 19865.  
Last modified: 31 Aug 2008, 19:48:30 UTC

Well if it's 4+4, the 24 hour delay should be in effect. Unless you didn't have any work in queue anymore and BOINC tried to download all new. Can you remember if there was any work in queue? Or was that task in deadline trouble?

For BOINC 5.8 and above: Completed work is reported at the first of:
1) 24 hours after completion (was Connect every X after completion).
2) 24 hours before report is due.
3) Connect every X before report is due.
4) On a trickle up message (CPDN only so far).
5) On a request for more work.
6) On a manual update.
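
As a rough illustration of the "first of" rule in the list above, here is a minimal Python sketch. The function name `next_report_time`, its parameters and the way the events are passed in are assumptions made for illustration only; this is not the BOINC client's actual code.

```python
from datetime import datetime, timedelta


def next_report_time(completed_at, deadline, connect_every, event_times=()):
    """Return the earliest moment a finished task would be reported.

    event_times holds the times of any trickle-up, work request or
    manual update after completion (rules 4 to 6 in the list).
    """
    candidates = [
        completed_at + timedelta(hours=24),  # rule 1
        deadline - timedelta(hours=24),      # rule 2
        deadline - connect_every,            # rule 3
    ]
    # Rules 4 to 6: any scheduler contact after completion carries the report.
    candidates += [t for t in event_times if t >= completed_at]
    # A task cannot be reported before it has completed.
    return max(completed_at, min(candidates))


done = datetime(2008, 8, 31, 12, 0, 0)
due = done + timedelta(days=10)
x = timedelta(days=4)  # "connect every X" of 4 days

# No other scheduler contact: the 24-hour rule decides.
print(next_report_time(done, due, x))
# A work request one minute after completion decides instead.
print(next_report_time(done, due, x, [done + timedelta(minutes=1)]))
```

With a work request in the event list, rule 5 pre-empts every other rule, which is the case the question is about.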
ID: 19866 · Report as offensive
Raistmer

Send message
Joined: 9 Apr 06
Posts: 302
Message 19867 - Posted: 31 Aug 2008, 19:57:48 UTC - in response to Message 19866.  
Last modified: 31 Aug 2008, 19:58:53 UTC

Well if it's 4+4, the 24 hour delay should be in effect. Unless you didn't have any work in queue anymore and BOINC tried to download all new. Can you remember if there was any work in queue? Or was that task in deadline trouble?

For BOINC 5.8 and above: Completed work is reported at the first of:
1) 24 hours after completion (was Connect every X after completion).
2) 24 hours before report is due.
3) Connect every X before report is due.
4) On a trickle up message (CPDN only so far).
5) On a request for more work.
6) On a manual update.


Thanks for the list. In my case X would be 4 days?
Unfortunately BOINC was running unattended at the time; I noticed the error only later, while browsing completed results on the project web page. So I can't tell whether there was more work in the queue at that point or not.
But it seems 5) can lead to this issue. BOINC tries to request more work after it completes a task (or even in parallel), so the time between result upload and result reporting can be arbitrarily small.
If I understand correctly, 1) and 2) apply only if BOINC doesn't need new work, i.e. when 5) doesn't apply.
So it would be nice to have a separate setting that governs exactly the interval between uploading a task and reporting that same task.
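
To make the request concrete, here is a small Python sketch of what such a setting might do. The name `min_report_delay` and both functions are hypothetical; no such option exists in the cc_config.xml shown above, and this is only one way the idea could work.

```python
def may_report(upload_finished_at, now, min_report_delay):
    """True once the hypothetical minimum upload-to-report delay has passed."""
    return now - upload_finished_at >= min_report_delay


def results_to_report(upload_times, now, min_report_delay):
    """Names of results old enough to go into this scheduler request.

    upload_times maps a result name to the time (in seconds) at which
    its upload finished. Younger results simply wait for a later request.
    """
    return sorted(
        name
        for name, uploaded in upload_times.items()
        if may_report(uploaded, now, min_report_delay)
    )


# wu_a finished uploading 300 s ago, wu_b only 10 s ago.
uploads = {"wu_a": 0.0, "wu_b": 290.0}
print(results_to_report(uploads, now=300.0, min_report_delay=60.0))
```

The work request itself still goes out; only the too-fresh result is held back until a later contact.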
ID: 19867 · Report as offensive
Raistmer

Send message
Joined: 9 Apr 06
Posts: 302
Message 19868 - Posted: 31 Aug 2008, 20:01:37 UTC

For example, part of the current log:
2008.08.31 23:44:20|SETI@home|Sending scheduler request: To fetch work. Requesting 4 seconds of work, reporting 0 completed tasks
2008.08.31 23:44:30|SETI@home|Scheduler request succeeded: got 0 new tasks
2008.08.31 23:45:32|SETI@home|Sending scheduler request: To fetch work. Requesting 3 seconds of work, reporting 0 completed tasks
2008.08.31 23:45:37|SETI@home|Scheduler request succeeded: got 0 new tasks
2008.08.31 23:46:38|SETI@home|Sending scheduler request: To fetch work. Requesting 1 seconds of work, reporting 0 completed tasks
2008.08.31 23:46:43|SETI@home|Scheduler request succeeded: got 1 new tasks
2008.08.31 23:46:45|SETI@home|Started download of 05jl08ab.8413.7839.12.8.25
2008.08.31 23:46:58|SETI@home|Sending scheduler request: To fetch work. Requesting 3 seconds of work, reporting 0 completed tasks
2008.08.31 23:46:59|SETI@home|Finished download of 05jl08ab.8413.7839.12.8.25
2008.08.31 23:47:03|SETI@home|Scheduler request succeeded: got 0 new tasks
2008.08.31 23:48:05|SETI@home|Sending scheduler request: To fetch work. Requesting 10 seconds of work, reporting 0 completed tasks
2008.08.31 23:48:20|SETI@home|Scheduler request succeeded: got 0 new tasks
2008.08.31 23:49:21|SETI@home|Sending scheduler request: To fetch work. Requesting 5 seconds of work, reporting 0 completed tasks
2008.08.31 23:49:26|SETI@home|Scheduler request succeeded: got 0 new tasks
2008.08.31 23:50:27|SETI@home|Sending scheduler request: To fetch work. Requesting 12 seconds of work, reporting 0 completed tasks
2008.08.31 23:50:32|SETI@home|Scheduler request succeeded: got 0 new tasks
2008.08.31 23:51:33|SETI@home|Sending scheduler request: To fetch work. Requesting 9 seconds of work, reporting 0 completed tasks
2008.08.31 23:51:43|SETI@home|Scheduler request succeeded: got 0 new tasks
2008.08.31 23:52:45|SETI@home|Sending scheduler request: To fetch work. Requesting 20 seconds of work, reporting 0 completed tasks

It repeats the request every minute. If a task were completed in that interval, it would be reported almost immediately after uploading.
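
The spacing between requests can be measured straight from the log. The Python sketch below assumes the `date time|project|message` layout of the lines above; the helper name `request_gaps` is made up for this example.

```python
from datetime import datetime

# First few lines of the log excerpt above.
LOG = """\
2008.08.31 23:44:20|SETI@home|Sending scheduler request: To fetch work. Requesting 4 seconds of work, reporting 0 completed tasks
2008.08.31 23:44:30|SETI@home|Scheduler request succeeded: got 0 new tasks
2008.08.31 23:45:32|SETI@home|Sending scheduler request: To fetch work. Requesting 3 seconds of work, reporting 0 completed tasks
2008.08.31 23:45:37|SETI@home|Scheduler request succeeded: got 0 new tasks
2008.08.31 23:46:38|SETI@home|Sending scheduler request: To fetch work. Requesting 1 seconds of work, reporting 0 completed tasks
"""


def request_gaps(log_text):
    """Seconds between consecutive 'Sending scheduler request' lines."""
    stamps = []
    for line in log_text.splitlines():
        parts = line.split("|", 2)
        if len(parts) == 3 and parts[2].startswith("Sending scheduler request"):
            stamps.append(datetime.strptime(parts[0], "%Y.%m.%d %H:%M:%S"))
    return [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]


print(request_gaps(LOG))  # [72.0, 66.0]
```

So the gap is a little over a minute each time, and any task that finishes inside one of those gaps is reported with the very next request.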

ID: 19868 · Report as offensive
Richard Haselgrove
Volunteer tester
Help desk expert

Send message
Joined: 5 Oct 06
Posts: 5082
United Kingdom
Message 19872 - Posted: 31 Aug 2008, 21:37:47 UTC

I'd like to support Raistmer's suggestion for an enforced delay following task completion and result upload - in fact more than that, a 'protected interval' following task completion when scheduler requests are actively inhibited.

This story goes back almost two years, to October 2006. I was questioning whether 'Return Results Immediately' was really such a costly event in BOINC database terms. The discussion is in the SETI@home thread Optimized Clients??, and you can see from Rom Walton's input that it was this discussion that led to his blog entry The evils of 'Returning Results Immediately'.

Can I repeat my observation from that research: that for 20 out of the 81 results finished during the test, BOINC initiated a "work fetch" scheduler contact as soon as a task finished, in parallel with the process of uploading the result file.

I'm convinced that this is no coincidence. I think that it relates to the re-calculation of RDCF when a task exits: if RDCF is decreased, then the work queue seems to shrink. This very afternoon I've been watching a host where the length of the work buffer (as expressed by BoincView) has been decreasing by an hour each time a task finishes.

This has become more significant recently, because within the last month SETI@home has introduced a highly-effective saw-tooth wave generator for RDCF. It's called Astropulse.

Astropulse WUs are big, slow, and rare. They have been deliberately introduced with a duration estimate which will tend to an RDCF of around 0.4 on Core2 architecture CPUs.

They are typically run interleaved with Multibeam WUs, which are short, fast and common. If the same CPU is running them with the stock SETI science app, they tend towards an RDCF of ~0.25: if, as is often the case, the stock SETI app has been replaced with an optimised one, RDCF can in extreme cases get as low as 0.1

So a SETI host doing mixed work will see a sudden jump in RDCF as an AP task finishes, followed by an extended gentle decline as the commoner MB tasks pass through. This is exactly the scenario where work fetch is triggered at the instant each (MB) task finishes.
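
A deliberately simplified Python model of that effect follows. It assumes the client estimates its buffered work as the sum of the raw task estimates multiplied by RDCF and asks for work whenever that estimate drops below a target; the function names and all the numbers are invented for illustration and are not taken from the BOINC source.

```python
def buffered_seconds(raw_estimates, rdcf):
    """Estimated run time of the queue after applying RDCF."""
    return sum(raw_estimates) * rdcf


def work_shortfall(raw_estimates, rdcf, target_seconds):
    """Seconds of work the client would request (0.0 if the buffer is full)."""
    return max(0.0, target_seconds - buffered_seconds(raw_estimates, rdcf))


# Ten queued MB tasks, each with a raw estimate of one hour.
queue = [3600.0] * 10
target = 14000.0

# Just after an AP task finished: RDCF is high, the buffer looks full.
print(work_shortfall(queue, 0.40, target))
# A few MB tasks later RDCF has drifted down: same queue, but now a shortfall.
print(work_shortfall(queue, 0.38, target))
```

Nothing left the queue between the two calls; the shortfall appears purely because RDCF was recalculated, and that recalculation happens at task exit.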

I was arguing two years ago that an enforced moratorium on scheduler requests would result in more efficient server operation, because many results would be reported "this time" (along with the delayed work request), rather than "next time". To that increased efficiency, we can add the (potential) elimination of these 'validate errors' and the redundant work re-issue that they entail.

I have a slight feeling that, following the previous discussion, I started to see the sort of back-off I'm proposing, in the form of a one-minute comms delay, but only in the case of tasks that exit with a science app error. Unfortunately, I don't generate many errors, so I can't look up a log entry at short notice! But there's no comms inhibition for normal status 0/finished file exits: I think there should be.
ID: 19872 · Report as offensive
Profile Jord
Volunteer tester
Help desk expert
Avatar

Send message
Joined: 29 Aug 05
Posts: 15480
Netherlands
Message 19877 - Posted: 31 Aug 2008, 22:13:14 UTC - in response to Message 19868.  

It repeats the request every minute. If a task were completed in that interval, it would be reported almost immediately after uploading.

The deferral interval is set by the Seti scheduler, so you'll have to go back to the Seti forums and ask if someone at Seti changes the event interval. If I am not mistaken, the maximum deferral at Seti is 10 minutes.
ID: 19877 · Report as offensive
Les Bayliss
Help desk expert

Send message
Joined: 25 Nov 05
Posts: 1654
Australia
Message 19880 - Posted: 31 Aug 2008, 22:19:37 UTC

It's also a problem with cpdn.
Last night one of my quad computers finished 2 hi-res models within an hour of each other. It takes about an hour and 10-15 minutes to upload the 5 zip files for each, and while those for the 2nd model were uploading, one of the other 2 still running uploaded a trickle, which also caused the 1st hi-res model to be reported.
If there are any problems with the servers, this can result in the 'report' arriving before the 'data'.

So far it's not caused a problem, but a minimum interval would be as useful as the several maximums currently used.

ID: 19880 · Report as offensive
Profile Jord
Volunteer tester
Help desk expert
Avatar

Send message
Joined: 29 Aug 05
Posts: 15480
Netherlands
Message 19881 - Posted: 31 Aug 2008, 22:24:32 UTC - in response to Message 19880.  

In your case I would rather ask for a check that the data has already uploaded before trying to report. Has anyone at CPDN ever reported this behaviour to the BOINC devs?
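
A client-side version of such a check could look roughly like the Python sketch below. The `Result` class, its `files_uploaded` field and the `reportable` helper are hypothetical names for this example, not BOINC data structures, and a check like this cannot see delays on the server side.

```python
from dataclasses import dataclass, field


@dataclass
class Result:
    name: str
    # Output file name -> has its upload finished?
    files_uploaded: dict = field(default_factory=dict)

    def ready_to_report(self):
        """Only report once every output file has finished uploading."""
        return all(self.files_uploaded.values())


def reportable(results):
    """Names of the results that may go into the next scheduler request."""
    return [r.name for r in results if r.ready_to_report()]


# Two models with 5 zip files each; the second is still uploading.
first = Result("model_1", {f"zip_{i}": True for i in range(5)})
second = Result("model_2", {f"zip_{i}": i < 2 for i in range(5)})
print(reportable([first, second]))  # ['model_1']
```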

ID: 19881 · Report as offensive
Richard Haselgrove
Volunteer tester
Help desk expert

Send message
Joined: 5 Oct 06
Posts: 5082
United Kingdom
Message 19882 - Posted: 31 Aug 2008, 22:35:42 UTC - in response to Message 19877.  

It repeats the request every minute. If a task were completed in that interval, it would be reported almost immediately after uploading.

The deferral interval is set by the Seti scheduler, so you'll have to go back to the Seti forums and ask if someone at Seti changes the event interval. If I am not mistaken, the maximum deferral at Seti is 10 minutes.

You're showing your age here. SETI used to be set to 10-minutes-and-a-bit deferral, but it's been 11 seconds for a long time now - see message 453335. (Flicked past that one while I was looking for the RRI thread!).
ID: 19882 · Report as offensive
Richard Haselgrove
Volunteer tester
Help desk expert

Send message
Joined: 5 Oct 06
Posts: 5082
United Kingdom
Message 19884 - Posted: 31 Aug 2008, 22:47:56 UTC - in response to Message 19883.  

In your case I would rather ask for a check that the data has already uploaded before trying to report. Has anyone at CPDN ever reported this behaviour to the BOINC devs?

Thought the enforce interval with <report_results_immediately> was 60 seconds. We had some discussion on this as it was seemingly immediately, but with 6.2 clients, probably even 5.10.45 that 60 seconds was adhered to. Time delay great, but maintain a minimum and confer with project managers as the whole function is not necessarily appreciated.

thanks

There was discussion about this relating to a certain third-party BOINC client which sported RRI as one of its attractions, long after the code had been deprecated in the official BOINC clients.

When first released, that third-party client did indeed return results immediately, with the unfortunate side-effect that many tasks at SETI were rejected with validate errors - as Les puts it, the 'report' arriving before the 'data'.

The last I heard, a 60-second delay had been added, but not properly debugged - on multicores, the counter wasn't reset if a second task finished within the minute. So if core 0 finished a task at time T, and core 1 finished at T+55 seconds, the second result could be reported 5 seconds after completion.
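
The arithmetic can be checked with a small Python model. This is only a model of the behaviour as described, not that client's code; the function names are invented, and it assumes all listed tasks finish before the timer fires.

```python
def report_time_single_timer(finish_times, delay=60.0):
    """Described bug: the timer starts at the first finish and is never reset."""
    return min(finish_times) + delay


def report_time_reset_timer(finish_times, delay=60.0):
    """Intended behaviour: every finished task restarts the timer."""
    return max(finish_times) + delay


def youngest_result_age(finish_times, report_time):
    """Age in seconds of the most recently finished result when reported."""
    return report_time - max(finish_times)


# Core 0 finishes at T = 0 s, core 1 at T + 55 s.
finishes = [0.0, 55.0]
print(youngest_result_age(finishes, report_time_single_timer(finishes)))  # 5.0
print(youngest_result_age(finishes, report_time_reset_timer(finishes)))   # 60.0
```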
ID: 19884 · Report as offensive
Profile Jord
Volunteer tester
Help desk expert
Avatar

Send message
Joined: 29 Aug 05
Posts: 15480
Netherlands
Message 19885 - Posted: 31 Aug 2008, 22:51:33 UTC - in response to Message 19883.  

Sekerob wrote:
confer with project managers as the whole function is not necessarily appreciated.

The option came back as it is by request of WCG...

Richard wrote:
You're showing your age here. SETI used to be set to 10-minutes-and-a-bit deferral, but it's been 11 seconds for a long time now

Then go ask Seti to increase the deferral set by the scheduler.

Although I wonder what that age comment has to do with things. It won't get you any further towards a solution to imply I am blind and incompetent since you can find things doing searches on Seti, whereas I was solely stating that the deferral is set by Seti and that you have to bark at their tree to get things changed.
ID: 19885 · Report as offensive
Richard Haselgrove
Volunteer tester
Help desk expert

Send message
Joined: 5 Oct 06
Posts: 5082
United Kingdom
Message 19886 - Posted: 31 Aug 2008, 23:02:02 UTC - in response to Message 19885.  
Last modified: 31 Aug 2008, 23:02:48 UTC

Richard wrote:
You're showing your age here. SETI used to be set to 10-minutes-and-a-bit deferral, but it's been 11 seconds for a long time now

Then go ask Seti to increase the deferral set by the scheduler.

Although I wonder what that age comment has to do with things. It won't get you any further towards a solution to imply I am blind and incompetent since you can find things doing searches on Seti, whereas I was solely stating that the deferral is set by Seti and that you have to bark at their tree to get things changed.

Sorry, sorry, ..... just trying to insert a bit of light-heartedness. My only point was that your recollection of a 10-minute delay was correct, but that it had been changed some time ago and was no longer current.

Anyway, that delay is irrelevant to the point under discussion. The delay you're referring to specifies the minimum interval between scheduler requests. It determines the time of the second (and subsequent) scheduler request, given that the first one has already taken place.

I'm concerned about the timing of the first scheduler request in a sequence, which is entirely at the determination of the local BOINC client. As I hope I've demonstrated, there's a significant (and explicable) correlation between 'task exit' events and 'work fetch' events. If the BOINC client (free-standing, not under server direction) could insert a delay here, the overall work-flow would be more efficient.
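
The difference between the two delays can be sketched in Python as follows. `server_min_interval` stands for the deferral sent by the project scheduler, while `client_hold` is the hypothetical client-side protected interval being proposed here; neither name comes from the BOINC source.

```python
def earliest_request(now, last_request_at, server_min_interval,
                     last_task_exit_at, client_hold):
    """Earliest time the client may contact the scheduler.

    last_request_at or last_task_exit_at may be None when there has been
    no recent request or task exit.
    """
    earliest = now
    if last_request_at is not None:
        # Server-set deferral: only constrains requests after the first.
        earliest = max(earliest, last_request_at + server_min_interval)
    if last_task_exit_at is not None:
        # Proposed client-side hold following a task exit.
        earliest = max(earliest, last_task_exit_at + client_hold)
    return earliest


# A task exits at t = 100 s and there has been no recent scheduler request.
# With an 11-second server deferral and no client hold, the request goes out at once.
print(earliest_request(100.0, None, 11.0, 100.0, 0.0))   # 100.0
# With a 60-second client hold it waits until t = 160 s.
print(earliest_request(100.0, None, 11.0, 100.0, 60.0))  # 160.0
```

However long the server deferral is, it does nothing for the first request in a sequence, which is why the hold would have to live in the client.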
ID: 19886 · Report as offensive
Profile Jord
Volunteer tester
Help desk expert
Avatar

Send message
Joined: 29 Aug 05
Posts: 15480
Netherlands
Message 19887 - Posted: 31 Aug 2008, 23:06:10 UTC - in response to Message 19886.  

If the BOINC client (free-standing, not under server direction) could insert a delay here, the overall work-flow would be more efficient.

Go to Trac and request it as an enhancement. While you're at it, ask for a check that work has actually completed the upload before the client tries to report it.

Or seduce JM7 to come take a look and comment on it. ;-)
ID: 19887 · Report as offensive
Les Bayliss
Help desk expert

Send message
Joined: 25 Nov 05
Posts: 1654
Australia
Message 19888 - Posted: 31 Aug 2008, 23:09:10 UTC - in response to Message 19881.  

In your case I would rather ask for a check that the data has already uploaded before trying to report. Has anyone at CPDN ever reported this behaviour to the BOINC devs?


I don't recall any talk about it being reported, but it may have been.
It's a fairly modern problem, I guess, since the arrival of quad cores.
And it could mostly affect those running shorter models.

I sometimes suspend models that look like finishing too close together, to spread them out a bit, and also suspend others not uploading to stop them trickling at the wrong time, but it looked OK last night, so I didn't bother.
But it still snuck in a trickle while I wasn't watching.

ID: 19888 · Report as offensive
Profile Jord
Volunteer tester
Help desk expert
Avatar

Send message
Joined: 29 Aug 05
Posts: 15480
Netherlands
Message 19890 - Posted: 31 Aug 2008, 23:20:49 UTC - in response to Message 19889.  

Assuming you know the history why it's been reinstated, no further words need wasting on that angle.

I know this: There is a good chance it'll be unavailable again in a future version of BOINC, or that WCG goes back to releasing their own versions of BOINC with fewer of the options that the present version has, since they apparently don't like some of the changes made and some of the options added (specifically for them).

But I'll leave that good fight over to the devs of BOINC and WCG. ;-)
ID: 19890 · Report as offensive
Profile Jord
Volunteer tester
Help desk expert
Avatar

Send message
Joined: 29 Aug 05
Posts: 15480
Netherlands
Message 19893 - Posted: 1 Sep 2008, 7:29:44 UTC

Seeing no one jumped to make a ticket, I made one. ;-)
Trac ticket #728
ID: 19893 · Report as offensive

Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.