Reporting timer?

Dave
Help desk expert
Joined: 28 Jun 10
Posts: 2515
United Kingdom
Message 93556 - Posted: 6 Nov 2019, 6:48:56 UTC - in response to Message 93554.  

I've worked this out myself, but why does this forum not allow me to delete my own message?!?


I don't know whether this is an option that this board, and every project I have tried that uses the BOINC software to manage its fora, simply has turned off, or whether it does not exist to start with. I suspect the latter, as most discussion boards don't give users the option to delete their messages. What can be done is that moderators hide messages, which is what they do with duplicates (these occur when the post button is clicked twice because the user has a slow connection and thinks nothing is happening), abusive messages, spam, etc.
ID: 93556
Keith Myers
Volunteer tester
Help desk expert
Joined: 17 Nov 16
Posts: 863
United States
Message 93557 - Posted: 6 Nov 2019, 6:57:18 UTC - in response to Message 93554.  

Does anybody know how the BOINC client decides when to report completed tasks? For example, I have Einstein and Milkyway running. A MW task takes 40 seconds, and the client insists on reporting them every 2 or 3 completed tasks, in fact as often as the server allows it to. Yet Einstein will sit with a completed task for an hour without reporting it.

As far as I know, the client only follows the instructions from the project scheduler as to when and how often to connect to the project. That is controlled by the server software as set up by the project scientist. You can override when work is reported, however, with the <report_results_immediately>1</report_results_immediately> option in the cc_config.xml file, which makes the client report each task as soon as it finishes (the default value, 0, leaves reporting to the normal schedule). That is the opposite of what you want to do, however.
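
For reference, the option mentioned above belongs in the <options> section of cc_config.xml: a value of 1 asks the client to report each task as soon as it finishes, and 0 (the default) leaves reporting to the normal update schedule. Below is a minimal Python sketch that builds such a file. The helper name build_cc_config is hypothetical, and an existing cc_config.xml should be edited by hand rather than overwritten with this output.

```python
import xml.etree.ElementTree as ET


def build_cc_config(report_immediately: bool) -> str:
    """Return the text of a minimal cc_config.xml (hypothetical helper)."""
    root = ET.Element("cc_config")
    options = ET.SubElement(root, "options")
    flag = ET.SubElement(options, "report_results_immediately")
    # 1 = report each finished task right away, 0 = default behaviour.
    flag.text = "1" if report_immediately else "0"
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Print the XML so it can be merged into cc_config.xml in the
    # BOINC data directory; the client then has to re-read its config files.
    print(build_cc_config(report_immediately=True))
```

Setting the flag to 1 would make a fast Milkyway host contact the server even more often, which is why it does not help with the problem discussed here.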
ID: 93557
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5077
United Kingdom
Message 93558 - Posted: 6 Nov 2019, 9:15:50 UTC

Except for the specific case that Keith mentions, BOINC doesn't 'report' tasks: it saves up everything it needs to do, and 'updates' the server all in one go. What you're seeing at Milkyway will be requests for new work (oh, and we'll report the old stuff while we're at it).

John McLeod VII used to keep a boilerplate of six or seven reasons why an update might be triggered - I think some more have been added or changed since he stopped posting.
ID: 93558
Keith Myers
Volunteer tester
Help desk expert
Joined: 17 Nov 16
Posts: 863
United States
Message 93560 - Posted: 6 Nov 2019, 15:08:36 UTC

The problem we are having at MW is that if your update includes any reported work, your request for replacement work gets nothing. Only if the update includes no reported results does your request for work get honored. This is a problem on fast hosts that report work at every update. They will quickly exhaust their project cache limit of 900 tasks, and then have to sit through a 10-minute backoff before they can request work again and get their 900-task cache refilled. This does not keep the hosts very busy unless they have zero-resource-share backup projects.

What we need to figure out is what setting in the server software is causing this behavior that is only apparent at MW. They have something misconfigured in their server.
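
To put rough numbers on the symptom described above, here is a toy Python model. It is only a sketch of the behaviour as reported in this thread, not of the BOINC scheduler itself: the rule "work is granted only when the update reports nothing" and the figure of 9 tasks finished per update are assumptions chosen for illustration, while the 900-task cache, 91-second interval and 10-minute backoff come from the posts.

```python
def simulate_host(duration_s, tasks_per_rpc, cache_limit=900,
                  rpc_interval_s=91, backoff_s=600):
    """Toy model: return (idle_seconds, refills) over duration_s seconds."""
    cache = cache_limit
    now = 0
    idle_s = 0
    refills = 0
    while True:
        # An empty cache means the client is sitting out the long backoff.
        wait = backoff_s if cache == 0 else rpc_interval_s
        if now + wait > duration_s:
            break
        now += wait
        if cache == 0:
            idle_s += wait  # nothing from this project to crunch meanwhile
        finished = min(tasks_per_rpc, cache)
        cache -= finished
        # Assumed rule: work is only granted when nothing is being reported.
        if finished == 0 and cache < cache_limit:
            cache = cache_limit
            refills += 1
    return idle_s, refills


if __name__ == "__main__":
    print(simulate_host(duration_s=24 * 3600, tasks_per_rpc=9))
```

With these made-up inputs the cache drains in 100 updates (about 9100 seconds) and the host then waits 600 seconds for the refill, so the cycle repeats roughly every 9700 seconds.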
ID: 93560
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5077
United Kingdom
Message 93561 - Posted: 6 Nov 2019, 16:35:18 UTC - in response to Message 93560.  

Well, this would be the place to look: https://boinc.berkeley.edu/trac/wiki/ProjectOptions

But I must say I've never heard anyone discussing the need for a project setting like that, nor remember seeing one when I've been looking through for something else.
ID: 93561
Joseph Stateson
Volunteer tester
Joined: 27 Jun 08
Posts: 641
United States
Message 93562 - Posted: 6 Nov 2019, 17:09:00 UTC - in response to Message 93561.  
Last modified: 6 Nov 2019, 17:28:45 UTC

Well, this would be the place to look: https://boinc.berkeley.edu/trac/wiki/ProjectOptions

But I must say I've never heard anyone discussing the need for a project setting like that, nor remember seeing one when I've been looking through for something else.


There has been an ongoing problem at Milkyway with 10-15 minute delays before getting data. There has been a lot of discussion, but one of the key points came from the moderator, Tom, who posted that they knew about the problem and that it was "some obscure boinc setting somewhere".
https://milkyway.cs.rpi.edu/milkyway/forum_user_posts.php?userid=1351529

It looks like you just found those "obscure boinc settings".

The following looks interesting:

<min_sendwork_interval> N </min_sendwork_interval>
Minimum number of seconds between sending jobs to a given host. You can use this to limit the impact of faulty hosts. 


I think the problem is that this value (probably about 160 seconds?) is OK, but the project starts counting from the last time the user uploaded results. They need to start counting from the time the user last downloaded work. That is just a guess. I did not see anything else in that scheduler configuration that would cause the count to start at the last upload. If they start the count from the last time the user asked for data, then that is OK, but only if data was actually sent to the user. None is, and I think that is the problem.

Are these files available to examine? I assume they are on the server and hidden.

[EDIT]
Was looking at
<next_rpc_delay>x</next_rpc_delay>
In each scheduler reply, tell the clients to do another scheduler RPC after at most X seconds, regardless of whether they need work. This is useful, e.g., to ensure that in-progress jobs can be canceled in a bounded amount of time. 


I wonder if setting that value to be greater than the "min_sendwork_interval" would fix the problem? That should cause the client to wait a minimum of 160 seconds (or whatever) before uploading results and attaching the "piggyback" work fetch request.

I asked Tom to send me a copy of the file.
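
The guess above, about which event starts the min_sendwork_interval clock, can be sketched in Python. Everything here is hypothetical: the HostRecord fields, the two policy names and the 160-second value are illustrations of the guess, not the scheduler's actual data structures or logic.

```python
from dataclasses import dataclass


@dataclass
class HostRecord:
    last_contact: float = 0.0    # time of the host's most recent scheduler request
    last_work_sent: float = 0.0  # time the server last actually sent jobs


def may_send_work(now, host, min_sendwork_interval, count_from):
    """Apply the interval check from one of two hypothetical reference points."""
    if count_from == "last_contact":
        reference = host.last_contact
    else:
        reference = host.last_work_sent
    return now - reference >= min_sendwork_interval


def simulate(count_from, rpc_interval=91.0, min_sendwork_interval=160.0, n_rpcs=6):
    """Return the times at which a regularly updating host is granted work."""
    host = HostRecord()
    granted = []
    for i in range(1, n_rpcs + 1):
        now = i * rpc_interval
        if may_send_work(now, host, min_sendwork_interval, count_from):
            granted.append(now)
            host.last_work_sent = now
        host.last_contact = now  # every request resets the contact time
    return granted


if __name__ == "__main__":
    print("counting from last contact:", simulate("last_contact"))
    print("counting from last send:   ", simulate("last_work_sent"))
```

In this model a host that contacts the server every 91 seconds never passes a 160-second check measured from its last contact, but is served on every second request when the check is measured from the last time work was actually sent.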
ID: 93562
Keith Myers
Volunteer tester
Help desk expert
Joined: 17 Nov 16
Posts: 863
United States
Message 93563 - Posted: 6 Nov 2019, 17:30:04 UTC - in response to Message 93561.  

Well, this would be the place to look: https://boinc.berkeley.edu/trac/wiki/ProjectOptions

But I must say I've never heard anyone discussing the need for a project setting like that, nor remember seeing one when I've been looking through for something else.

Thanks for the link Richard. I'll give it a read and see if I can see anything that might apply.

I guess you don't do MW or frequent their fora. This problem has been ongoing, I think, since they updated to the server 1.04 release earlier in the summer. There are multiple threads discussing the issue and not much coming back from the administrators. I never saw the issue myself until I temporarily doubled my resource share and then watched my 900-task cache dwindle down to nothing. I had always maintained a 900-task cache, with each update replacing however many tasks were needed to bring it back to 900. But that was when I didn't report a completed result every 91 seconds, which is their server connection interval.

My resource share was low enough compared to Seti that the MW tasks didn't run all that often. And with my spoofed cache I never run out of Seti GPU work on maintenance days, so there was little chance of running MW work exclusively on my cards. With multiple fast cards running, you are bound to report a finished result at every update when the tasks themselves only run for 90-120 seconds on a middling Nvidia card. The AMD card folks are crunching them in 30 seconds on faster cards, even when running multiples on an individual card. For a host running MW exclusively, it is a big problem.

What is needed is a client configuration option to not report work during an update.
ID: 93563
Keith Myers
Volunteer tester
Help desk expert
Joined: 17 Nov 16
Posts: 863
United States
Message 93564 - Posted: 6 Nov 2019, 17:52:02 UTC - in response to Message 93562.  

Well, this would be the place to look: https://boinc.berkeley.edu/trac/wiki/ProjectOptions

But I must say I've never heard anyone discussing the need for a project setting like that, nor remember seeing one when I've been looking through for something else.


There has been an ongoing problem at Milkyway with 10-15 minute delays before getting data. There has been a lot of discussion, but one of the key points came from the moderator, Tom, who posted that they knew about the problem and that it was "some obscure boinc setting somewhere".
https://milkyway.cs.rpi.edu/milkyway/forum_user_posts.php?userid=1351529

It looks like you just found those "obscure boinc settings".

The following looks interesting:

<min_sendwork_interval> N </min_sendwork_interval>
Minimum number of seconds between sending jobs to a given host. You can use this to limit the impact of faulty hosts. 



I think this is the one that gives us trouble. On fast hosts that return work pretty much continuously, I wonder if the scheduler interprets the host as returning invalid work, as a faulty host would if it were rapidly erroring out its cache. The scheduler would normally detect such a host and put it into a timeout.



I think the problem is that this value (probably about 160 seconds?) is OK, but the project starts counting from the last time the user uploaded results. They need to start counting from the time the user last downloaded work. That is just a guess. I did not see anything else in that scheduler configuration that would cause the count to start at the last upload. If they start the count from the last time the user asked for data, then that is OK, but only if data was actually sent to the user. None is, and I think that is the problem.

Are these files available to examine? I assume they are on the server and hidden.

[EDIT]
Was looking at
<next_rpc_delay>x</next_rpc_delay>
In each scheduler reply, tell the clients to do another scheduler RPC after at most X seconds, regardless of whether they need work. This is useful, e.g., to ensure that in-progress jobs can be canceled in a bounded amount of time. 



This is the setting that determines our normal 91 second rpc delay at MilkyWay.



I wonder if setting that value to be greater than the "min_sendwork_interval" would fix the problem? That should cause the client to wait a minimum of 160 seconds (or whatever) before uploading results and attaching the "piggyback" work fetch request.

I asked Tom to send me a copy of the file.
ID: 93564
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5077
United Kingdom
Message 93565 - Posted: 6 Nov 2019, 18:12:48 UTC - in response to Message 93563.  

I guess you don't do MW or frequent their fora.
I have run their tasks, and jousted with their 'admins', in the past. But I don't judge it as a project worthy of spending any more time or electricity on - unless anybody can supply evidence of sentient life at project management level. You might like to read messages 63074 - possibly relevant to the current problems - and 58550.
ID: 93565
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5077
United Kingdom
Message 93566 - Posted: 6 Nov 2019, 18:41:21 UTC - in response to Message 93564.  
Last modified: 6 Nov 2019, 18:53:02 UTC

[EDIT]
Was looking at
<next_rpc_delay>x</next_rpc_delay>
In each scheduler reply, tell the clients to do another scheduler RPC after at most X seconds, regardless of whether they need work. This is useful, e.g., to ensure that in-progress jobs can be canceled in a bounded amount of time. 
This is the setting that determines our normal 91 second rpc delay at MilkyWay.
I'm not convinced by that.

If you look at a sched_reply file from SETI, it contains

<request_delay>303.000000</request_delay>
That's recognisable as the standard 'shut up and wait' between contacts.

The equivalent file from GPUGrid contains both

<request_delay>31.000000</request_delay>
<next_rpc_delay>3600.000000</next_rpc_delay>
The second one triggers a 'phone home every hour', which is useful in their case to check if any new work has been created recently. I'll keep looking. If you do get a copy of the file, I'd be interested in taking a look at it - in private if necessary.

Milkyway has

<request_delay>91.000000</request_delay>
I'm more interested in

<min_sendwork_interval> N </min_sendwork_interval>
Minimum number of seconds between sending jobs to a given host. You can use this to limit the impact of faulty hosts.
I'm not yet certain that this is the one which emerges as <request_delay>, but I think it's a more plausible candidate.

Edit - candidacy confirmed (I think) by https://github.com/BOINC/boinc/blob/master/sched/sched_types.cpp#L784
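
One way to check this candidacy against the numbers quoted above: assuming the linked code pads min_sendwork_interval by 1% for clock skew, and by at least one second, before writing it out as <request_delay> (that is a reading of the scheduler source that should be verified there, and the function below is a hypothetical Python model, not BOINC code), the three observed values would correspond to configured intervals of 300, 30 and 90 seconds.

```python
def request_delay_for(min_sendwork_interval, requested_delay=0.0):
    """Hypothetical model of how <request_delay> may be derived from config."""
    if not requested_delay and not min_sendwork_interval:
        return 0.0  # in this model no <request_delay> element is sent at all
    # Assumed padding: 1% on top of the configured interval, at least 1 second.
    needed = max(1.01 * min_sendwork_interval, min_sendwork_interval + 1)
    return max(requested_delay, needed)


if __name__ == "__main__":
    for project, interval in (("SETI", 300), ("GPUGrid", 30), ("Milkyway", 90)):
        # The interval values are guesses that reproduce the observed delays.
        print(project, interval, request_delay_for(interval))
```

If that reading holds, the 91-second figure at Milkyway would simply be a min_sendwork_interval of 90 seconds plus the one-second minimum margin.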
ID: 93566
Keith Myers
Volunteer tester
Help desk expert
Joined: 17 Nov 16
Posts: 863
United States
Message 93567 - Posted: 6 Nov 2019, 19:06:59 UTC - in response to Message 93566.  
Last modified: 6 Nov 2019, 19:18:56 UTC

I'm not yet certain that this is the one which emerges as <request_delay>, but I think it's a more plausible candidate.

Edit - candidacy confirmed (I think) by https://github.com/BOINC/boinc/blob/master/sched/sched_types.cpp#L784


So does it seem likely that the MW admins have defined
config.min_sendwork_interval

in the
#include "sched_types.h"
header file explicitly?

And that is what is in conflict with

<request_delay>91.000000</request_delay>
??

[Edit] Well it isn't in the header file. Looks more likely to be in the project config.xml file or the setup_project.py file.

py/Boinc/setup_project.py (Python)
        config.min_sendwork_interval = 0
        config.max_wus_to_send = 50
        config.daily_result_quota = 500
        config.log_dir       = self.project_dir+'log_'+config.host
        if production:
            config.min_sendwork_interval = 6

sched/sched_config.cpp (C++)
// Parse a project configuration file (config.xml)

#ifdef _USING_FCGI_
#include "boinc_fcgi.h"
#endif

#include <cstring>
#include <string>
#include "sched_config.h"

const char* CONFIG_FILE = "config.xml";
const char* CONFIG_FILE_AUX = "config_aux.xml";

checkin_notes_2003
    - Added mechanism for starting or restarting all back-end processes
        for a project.  A list of the programs are now in config.xml.
        All programs now use lock files to prevent duplicate execution.
            (we changed in "start"; need to change here too)
        - "MIN_SENDWORK_INTERVAL" is read from config.xml (default 0)

checkin_notes_2004
    - BOINC how has a project.xml file (by default in the same location as
      config.xml) that can contain database information:
        - projects
        - platforms
        - configxml.py contains code specific to config.xml and run_state.XML
        - external interface to configxml.py unchanged
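
If the setting does live in the project's config.xml, someone with access to the server could read it out along the lines of this Python sketch. The sample file content uses made-up values rather than Milkyway's real configuration, the helper name is hypothetical, and treating a missing element as 0 is an assumption based on the "default 0" check-in note above.

```python
import xml.etree.ElementTree as ET

# Made-up example of the relevant part of a project's config.xml.
SAMPLE_CONFIG = """\
<boinc>
  <config>
    <min_sendwork_interval>90</min_sendwork_interval>
    <next_rpc_delay>3600</next_rpc_delay>
  </config>
</boinc>
"""


def read_scheduler_delays(config_xml):
    """Return the two delay settings discussed in this thread as floats."""
    config = ET.fromstring(config_xml).find("config")
    delays = {}
    for name in ("min_sendwork_interval", "next_rpc_delay"):
        text = config.findtext(name) if config is not None else None
        # A missing element is treated as 0 here (assumed default).
        delays[name] = float(text) if text is not None else 0.0
    return delays


if __name__ == "__main__":
    print(read_scheduler_delays(SAMPLE_CONFIG))
```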
ID: 93567
Joseph Stateson
Volunteer tester
Joined: 27 Jun 08
Posts: 641
United States
Message 93573 - Posted: 7 Nov 2019, 0:47:07 UTC - in response to Message 93572.  


What do you mean by "most discussion boards"? If you mean outside of BOINC, then I disagree. Most forums I've used, if you change your mind or make a mistake, you can delete it. I see no advantage of forcing people to leave it there.


There are no advertisements on any of the BOINC or project websites, and they are not selling any products. I am happy with that. The Ford and Toyota communities are funded by advertisements, AFAICT.
Apple, Microsoft and the other big players have plenty of money for bells and whistles.

Not sure how stackoverflow gets funded. They have over 80 affiliated sites and have a lot of bells and whistles.

However, if the "right to be forgotten" gets extended to "the right to be erased", then you might get your wish to be able to delete your posts.
ID: 93573
Dave
Help desk expert
Joined: 28 Jun 10
Posts: 2515
United Kingdom
Message 93576 - Posted: 7 Nov 2019, 7:41:29 UTC - in response to Message 93572.  

What do you mean by "most discussion boards"? If you mean outside of BOINC, then I disagree. Most forums I've used, if you change your mind or make a mistake, you can delete it. I see no advantage of forcing people to leave it there.


By most, I mean all the ones I am on or have used in the past, which is about twelve, not including BOINC-related ones. Admittedly a small sample. On some of those, the editing option lasts forever, whereas here you can only edit within an hour unless you are a moderator. Even then the BOINC software (at least how CPDN have it set up) doesn't allow deletion, just hiding (unless I have missed something).
ID: 93576
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15477
Netherlands
Message 93577 - Posted: 7 Nov 2019, 8:36:50 UTC - in response to Message 93576.  
Last modified: 7 Nov 2019, 8:40:20 UTC

Delete option is only for administrators. I can delete posts, whole threads even.

On these forums if you want to have that done, click on the icon under the post you want gone and request it to be hidden or deleted. That will send an email to the moderators and me.

Hiding a post means it's still there, just not visible to anyone except moderators and administrators. These posts can be unhidden and made available to everyone again.
Deleting means removal from the database. The post will be gone. It cannot be restored afterwards.
ID: 93577
Les Bayliss
Help desk expert

Joined: 25 Nov 05
Posts: 1654
Australia
Message 93580 - Posted: 7 Nov 2019, 19:59:15 UTC - in response to Message 93579.  

Oh, adverts, do webpages still have those?

BOINCstats does
ID: 93580
Dave
Help desk expert
Joined: 28 Jun 10
Posts: 2515
United Kingdom
Message 93582 - Posted: 8 Nov 2019, 7:04:06 UTC - in response to Message 93581.  

All I see is them getting upset and asking for a donation because I'm refusing to watch their ads. You'd think an adblocker would have the intelligence to download the ad and just not display it. There's no reason the web server should know it's not actually displayed on your screen.


Depends on the reason for using an ad blocker. If on a slow connection like my bored band, it might be to get pages to load more quickly in which case downloading the ad and not displaying it won't help.
ID: 93582

Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.