Info | Message |
---|---|
21) Message boards : Web interfaces : Typo ... maintenence
Message 25069 Posted 26 May 2009 by Ananas |
Project down for maintenence |
22) Message boards : Server programs : Scheduler 12674 sends work on a 0 seconds request
Message 22610 Posted 23 Jan 2009 by Ananas |
Thanks :-) It does not happen on each contact, just once. SETI seems to have disabled the host connect log, so I couldn't check what the server "thought" about that request. |
23) Message boards : Server programs : Scheduler 12674 sends work on a 0 seconds request
Message 22605 Posted 23 Jan 2009 by Ananas |
http://setiathome.berkeley.edu/forum_thread.php?id=51564 In short: the project was set to NNW (no new work), and the client contacted the server only to report some finished results. The sched_request contained work_req_seconds=0, yet the scheduler assigned 20 results. |
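For reference, a minimal sketch of the guard this thread asks for, assuming the parsed request keeps the value in a field named work_req_seconds (the name comes from the sched_request quoted above); the struct layout and helper functions are illustrative, not the actual scheduler 12674 code:

```cpp
#include <cstdio>

// Illustrative sketch only, not the actual BOINC scheduler.
// The point: a contact that reports results but asks for 0 seconds of
// work should never enter the work-sending path.
struct SchedulerRequest {
    double work_req_seconds;  // seconds of work the client asked for
    int results_reported;     // finished results included in this contact
};

static void handle_reported_results(const SchedulerRequest& req) {
    std::printf("validating %d reported results\n", req.results_reported);
}

static void send_work(const SchedulerRequest& req) {
    std::printf("assigning work to cover %.0f seconds\n", req.work_req_seconds);
}

static void handle_request(const SchedulerRequest& req) {
    handle_reported_results(req);
    if (req.work_req_seconds <= 0) return;  // report-only contact: send nothing
    send_work(req);
}

int main() {
    handle_request({0.0, 3});  // the case from this thread: 0 seconds requested, results reported
    return 0;
}
```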
24) Message boards : Server programs : Won't finish in time ...
Message 21385 Posted 19 Nov 2008 by Ananas |
No need for lots of projects with a non-steady WU flow. I'll give a simple example where LTD gets messy: on a dual-task machine, load one long-running CPDN model plus any other project for the second task. While the CPDN model runs, it piles up hundreds of thousands of LTD, even though it crunches all the time, whereas the second project accumulates the same value as negative debt. After a month, add a third project - it will start with zero debt - but as the second project has piled up so much negative debt, the client will download only work from the third project for several weeks. The user will go to that project and blame the project developers that their project doesn't respect his share settings. The answer is always the same: edit your client_state.xml and remove all lines with LTD tags. I have read lots of threads like that. IMO the short term debts are good enough for the CPU time distribution; people understand that. It can easily happen that a project is supposed to crunch (positive STD) but isn't allowed to download work (negative LTD). As LTD affects the caching, this can even make the client ignore the cache settings completely and download a new WU only when the cache is totally empty. LTD works fine only in one situation: all projects get attached at the same time, no project gets a reset, and all projects deliver a constant WU flow. There are projects with long-running WUs out there, there are projects with a sparse flow of WUs, and there are projects that pop up and disappear after a few months, so BOINC needs to get along with those somehow. edit: I guess it would help to use the trickle_ups for the decay of LTD on projects that use trickles. It would not solve the problem completely, but at least it would fix the major LTD problem that all CPDN crunchers have. |
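As a concrete form of the "remove all lines with LTD tags" workaround mentioned above, here is a minimal sketch. It assumes the tag is spelled <long_term_debt>, as in client_state.xml files of that era, and that the client is stopped and the original file backed up before it is replaced with the cleaned copy:

```cpp
#include <fstream>
#include <iostream>
#include <string>

// Copy client_state.xml, dropping every line that carries a
// long-term-debt tag, so the client restarts with zero LTD everywhere.
int main(int argc, char** argv) {
    if (argc != 3) {
        std::cerr << "usage: strip_ltd <client_state.xml> <cleaned_output.xml>\n";
        return 1;
    }
    std::ifstream in(argv[1]);
    std::ofstream out(argv[2]);
    if (!in || !out) {
        std::cerr << "could not open input or output file\n";
        return 1;
    }
    std::string line;
    while (std::getline(in, line)) {
        if (line.find("<long_term_debt>") != std::string::npos) continue;  // skip LTD lines
        out << line << '\n';
    }
    return 0;
}
```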
25) Message boards : Server programs : Won't finish in time ...
Message 21313 Posted 18 Nov 2008 by Ananas |
Suspending or setting to "no new work" doesn't make a difference; both types of inactive seem to be treated the same. I have changed the 53.9% uptime fraction to 99% now (the low value was a result of vacations) and detached from 2 projects that do not exist anymore anyway. Those two changes together were sufficient to allow new work. p.s.: I'm always using this box to test new projects; that's why it is attached to nearly all projects where I have an account, including test projects. One project (running on canis.csc.ncsu.edu) seems to have been just a test setup for Anansi; I never received any work from it. The other one was SciLINK, which was stopped due to the heavy traffic it had caused. p.p.s.: The concept of Long Term Debt is a mess anyway (especially in combination with CPDN), though I have a workaround for it that resets all values to 0 on a client restart. |
26) Message boards : Server programs : Won't finish in time ...
Message 21298 Posted 17 Nov 2008 by Ananas |
The calculation of the project resource share is (still) wrong: Message from server: "(won't finish in time) Computer on 53.9% of time, BOINC on 100.0% of that, this project gets 2.7% of that". The 2.7% is simply wrong; the project in question currently has 50% on that machine, which equals 100% of one CPU. All other projects are inactive or set to "no new work". This bug hits only on those few projects with a very short deadline and only on computers that are attached to lots of projects, so it is probably quite a rare problem. It has existed for years already, though; it would be nice if someone could take a look at it sometime. p.s.: I am aware that the server-side scheduler does not know about inactive projects on the client side - but producing an error message without knowing the facts doesn't sound right. |
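The gap between the two views can be made explicit with the numbers from this message. A minimal worked example, following the structure of the scheduler message itself (on-fraction × BOINC-fraction × share fraction); that the 2.7% is derived from the resource shares of all attached projects is an assumption, consistent with the p.s. above:

```cpp
#include <cstdio>

// Worked version of the numbers quoted in the post above.
int main() {
    const double on_frac      = 0.539;  // "Computer on 53.9% of time"
    const double active_frac  = 1.0;    // "BOINC on 100.0% of that"
    const double share_server = 0.027;  // "this project gets 2.7% of that" (all attached projects counted)
    const double share_actual = 0.50;   // one of two CPUs, every other project inactive or NNW

    std::printf("server's estimate: %.1f%% of the host's total CPU time\n",
                100.0 * on_frac * active_frac * share_server);
    std::printf("actual share:      %.1f%% of the host's total CPU time\n",
                100.0 * on_frac * active_frac * share_actual);
    return 0;
}
```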
27) Message boards : Server programs : Server side DCF
Message 20764 Posted 12 Oct 2008 by Ananas |
... The average host correction factor could be evaluated daily during the stats export, ... I forgot to add that this should only include active hosts, as the inactive ones will not adjust their own DCF when the new fpops_est values get closer to the real requirements. |
28) Message boards : Questions and problems : Is it possible to use another computer to "run" work units hosted on a server?
Message 20757 Posted 11 Oct 2008 by Ananas |
... I joined Rosetta and it seems to have me working on a unit that should only take 6.5 hours to complete. ... Check out your Rosetta project settings; you can even adjust the target run time there. The application respects this setting quite well: it reduces or increases the number of simulations in order to finish a result in time. Especially interesting: even already-running results can be influenced through this setting (after a project contact that transfers the preferences to your host, of course). |
29) Message boards : Server programs : Server side DCF
Message 20754 Posted 11 Oct 2008 by Ananas |
Lately a lot of projects have really weird settings for fpops_est and fpops_bound, leading to host-side correction factors of 25 and more. For old hosts this doesn't hurt, but new hosts with a fairly large cache often receive way more WUs than they can handle before they have a chance to adjust their DCF. I think it would be a good idea to help the project admins with this by slowly(!) adjusting those values towards an average host DCF of 1. Introducing a DCF on the server side could do this job. The average host correction factor could be evaluated daily during the stats export; adjusting the server-side DCF by maybe 5% per day should do. When a new WU is sent out, both fpops_est and fpops_bound should be adjusted according to this server-side factor. p.s.: Adjusting the server-side factor smoothly is required in order to avoid confusing the old hosts that have already adjusted their DCF. |
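A minimal sketch of this proposal, with made-up names; only the idea is taken from the post: measure the average DCF of the active hosts once per day (e.g. during the stats export), move a server-side factor by at most about 5% towards the value that would bring that average back to 1, and scale fpops_est/fpops_bound of newly generated WUs by the current factor:

```cpp
#include <algorithm>
#include <cstdio>

struct Workunit {
    double rsc_fpops_est;
    double rsc_fpops_bound;
};

// Move the server-side factor at most max_step (5% per day here) towards
// the value that would bring the average active-host DCF back to 1.
double update_server_dcf(double server_dcf, double avg_active_host_dcf, double max_step = 0.05) {
    double target = server_dcf * avg_active_host_dcf;  // factor that would make the average DCF ~1
    double step = std::clamp(target - server_dcf, -max_step * server_dcf, max_step * server_dcf);
    return server_dcf + step;
}

// Apply the current server-side factor when a new WU is generated.
Workunit scale_workunit(Workunit wu, double server_dcf) {
    wu.rsc_fpops_est *= server_dcf;
    wu.rsc_fpops_bound *= server_dcf;
    return wu;
}

int main() {
    // Example from the post: hosts end up with correction factors of 25
    // because fpops_est is far too low; the server factor creeps up slowly.
    double server_dcf = 1.0;
    for (int day = 1; day <= 10; day++) {
        server_dcf = update_server_dcf(server_dcf, 25.0 / server_dcf);  // hosts' DCF shrinks as estimates grow
        std::printf("day %2d: server-side DCF = %.3f\n", day, server_dcf);
    }
    Workunit wu = scale_workunit({1e12, 1e13}, server_dcf);
    std::printf("scaled fpops_est = %.3g, fpops_bound = %.3g\n", wu.rsc_fpops_est, wu.rsc_fpops_bound);
    return 0;
}
```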
30) Message boards : Web interfaces : BOINC dev forum repeatedly requests login?
Message 20745 Posted 11 Oct 2008 by Ananas |
Need to steal this thread - sorry ... forum_post.php seems to be broken, no new threads possible (using SeaMonkey 1.1.11) |
31) Message boards : BOINC client : RPC_REASON_PROJECT_REQ ... whatfor ???
Message 20671 Posted 5 Oct 2008 by Ananas |
Thanks, that helps to understand the intention. This doesn't mean that I like this feature but I can live with it. If it becomes too annoying, I can still detach from the project. |
32) Message boards : BOINC client : RPC_REASON_PROJECT_REQ ... whatfor ???
Message 20666 Posted 4 Oct 2008 by Ananas |
... Wrong place to ask as well; you'd best ask this at the WCG forums. They're better equipped to answer that. They might be able to explain why the server sends this tag in a scheduler reply - but here would be the place to make the stock core client immune to such things. I assume that WCG does not do that in order to spy on my computers when I don't expect it, or to collect information that they shouldn't have - but it shows that it is possible. |
33) Message boards : BOINC client : RPC_REASON_PROJECT_REQ ... whatfor ???
Message 20663 Posted 4 Oct 2008 by Ananas |
Additional information ... the scheduler reply always contains this:
<scheduler_version>601</scheduler_version>
<master_url>http://www.worldcommunitygrid.org/</master_url>
<request_delay>61.000000</request_delay>
<project_name>World Community Grid</project_name>
<next_rpc_delay>345600.000000</next_rpc_delay>
345600 seconds are those 4 days. I hope this turns out to be a server side bug and not to be intentional :-/ |
34) Message boards : BOINC client : RPC_REASON_PROJECT_REQ ... whatfor ???
Message 20662 Posted 4 Oct 2008 by Ananas |
WCG makes the attached computers contact the project every now and then, even if the project is out of work and has been set to no new work for weeks already. Why can a project decide when my core client makes a contact, even if there is no server interaction required? I don't know yet whether the core client will respect a suspend; that is the next thing I will try. p.s.: I checked the logs, the WCG contact happened every 4 days:
13-Sep-2008 08:48:49 [World Community Grid] Sending scheduler request: Requested by project. Requesting 0 seconds of work, reporting 0 completed tasks |
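A sketch of the behaviour behind that log line, assuming the client simply stores the <next_rpc_delay> value from each scheduler reply and triggers a "Requested by project" RPC once that much time has passed since the last contact; the struct and function names are illustrative, not the real core client code:

```cpp
#include <cstdio>
#include <ctime>

struct Project {
    std::time_t last_rpc_time = 0;  // time of the last scheduler contact
    double next_rpc_delay = 0;      // seconds, taken from the scheduler reply
};

// True once the project-requested RPC is due, regardless of whether any
// work is needed or any results are waiting to be reported.
bool rpc_requested_by_project(const Project& p, std::time_t now) {
    if (p.next_rpc_delay <= 0) return false;
    return now >= p.last_rpc_time + static_cast<std::time_t>(p.next_rpc_delay);
}

int main() {
    Project wcg;
    wcg.last_rpc_time = std::time(nullptr) - 5 * 86400;  // last contact 5 days ago
    wcg.next_rpc_delay = 345600;                          // 4 days, as in the WCG reply
    std::printf("contact due: %s\n", rpc_requested_by_project(wcg, std::time(nullptr)) ? "yes" : "no");
    return 0;
}
```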
35) Message boards : Server programs : Database gone -> Scheduler confused
Message 20626 Posted 1 Oct 2008 by Ananas |
... In this situation, it does that when a host is attached using the original ("strong") authenticator and a user lookup by authenticator fails. There should be a log entry (severity=critical) about that incident, if logging is enabled. Maybe it would be a good idea to treat at least CR_SERVER_LOST and CR_SERVER_GONE_ERROR from mysql_query() differently than a normal "not found" error. |
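A sketch of the suggested distinction, using the MySQL C API error codes named in the post; the surrounding lookup function and the LookupResult type are illustrative, not the actual scheduler code, and the fragment is meant to sit in the user-lookup path:

```cpp
#include <mysql/mysql.h>
#include <mysql/errmsg.h>

// "Not found" means the authenticator really is unknown; a lost server
// connection means the database was down, and the client should just be
// told to retry later instead of being asked to detach and reattach.
enum class LookupResult { Found, NotFound, DatabaseDown };

LookupResult lookup_user_by_authenticator(MYSQL* db, const char* query) {
    if (mysql_query(db, query) != 0) {
        unsigned int err = mysql_errno(db);
        if (err == CR_SERVER_GONE_ERROR || err == CR_SERVER_LOST) {
            return LookupResult::DatabaseDown;  // transient: log as critical, ask the client to retry
        }
        return LookupResult::NotFound;          // other errors: behave as before
    }
    MYSQL_RES* res = mysql_store_result(db);
    if (!res) return LookupResult::NotFound;
    bool found = mysql_num_rows(res) > 0;
    mysql_free_result(res);
    return found ? LookupResult::Found : LookupResult::NotFound;
}
```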
36) Message boards : Server programs : Database gone -> Scheduler confused
Message 20625 Posted 1 Oct 2008 by Ananas |
...Or is there a special situation where detaching & reattaching wouldn't have their usual consequences? ... I highly doubt that - especially as a reattach would not have succeeded with a scheduler that is unable to query the database. |
37) Message boards : Server programs : Database gone -> Scheduler confused
Message 20623 Posted 1 Oct 2008 by Ananas |
... Try doing what it says; the S@h DB is fine. ... Now it is fine, yes, and my client has already reported the tasks. But when that scheduler message returned, the status page said "Disabled" for the database on jocelyn - so there was probably not even a crash but just a short planned maintenance outage. |
38) Message boards : Server programs : Database gone -> Scheduler confused
Message 20617 Posted 1 Oct 2008 by Ananas |
The SETI database seems to have crashed. This problem is not correctly handled by the scheduler; it returns funny messages to the clients:
<scheduler_reply>
<scheduler_version>603</scheduler_version>
<master_url>http://setiathome.berkeley.edu/</master_url>
<request_delay>3600.000000</request_delay>
<message priority="low">Can't find host record Invalid or missing account key. Detach and reattach to this project to fix this. </message>
<project_name>SETI@home</project_name>
</scheduler_reply>
This advice is plain nonsense of course; detach/reattach will trash work but not fix the database. |
39) Message boards : Web interfaces : Foundersnip transfer
Message 20231 Posted 13 Sep 2008 by Ananas |
Typo: ... to transfer foundersnip or decline user's request. ... Some seem to be looking for teams to take over; I received this at Spinhenge from a user with no credits. It has been sebastian.weigel@yahoo.se (Siljan) in this case. |
40) Message boards : Web interfaces : Bug : show_result_navigation() inside/outside table
Message 19727 Posted 24 Aug 2008 by Ananas |
Version 15929, module "result.php": the first occurrence (line 54) of echo show_result_navigation( $clause, $number_of_results, $offset, $results_per_page ); is inside a table, but outside of a table data ( <td> ) tag. |
Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License,
Version 1.2 or any later version published by the Free Software Foundation.