Wait for checkpoint before changing the application
Author | Message |
---|---|
Joined: 19 Dec 05 Posts: 28 |
Is it possible to change the BOINC client so that it waits until the next checkpoint is written before stopping an application? Right now, on slow computers or computers used for CPU-intensive tasks besides BOINC, it is nearly impossible to crunch for projects such as Rosetta. Even if you set a long interval for changing applications, there is always a download or benchmark coming along to reset it. Norbert |
Joined: 29 Aug 05 Posts: 225 |
Not at this time. :( This is, we hope, one of the upcoming changes, but it is not here yet. In any case, because of the slow checkpointing, I would not recommend running Rosetta@Home. You would be better served running a project with "faster" work units and more frequent checkpoints. |
Joined: 8 Sep 05 Posts: 168 |
Set your preferences to leave applications in memory! BOINC Wiki |
Joined: 29 Aug 05 Posts: 225 |
But that does not address the specific concern. The point is that you want to reach a checkpoint before halting. This is a crash protection measure, perhaps more critical on slower computers. So, if the task is swapped out after 4 hours, 1 second before the next checkpoint, and the computer then crashes, the restart goes back 4 hours to the last checkpoint. If it had waited one more second, no work would have been lost at all. So, this is a good enhancement. It is just not here yet. |
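The arithmetic in the post above can be made concrete with a minimal Python sketch. The function name `work_lost_on_crash` and the variable names are illustrative assumptions, not part of BOINC; the numbers are the ones from the post.

```python
def work_lost_on_crash(crash_time_s: int, last_checkpoint_s: int) -> int:
    """Seconds of computation redone when a task restarts from its last checkpoint."""
    if last_checkpoint_s > crash_time_s:
        raise ValueError("last checkpoint cannot be after the crash")
    return crash_time_s - last_checkpoint_s


FOUR_HOURS_S = 4 * 3600

# Case 1: the task is stopped at the 4 hour mark, one second before its
# checkpoint, so the only saved state is from t = 0.
lost_without_waiting = work_lost_on_crash(FOUR_HOURS_S, 0)

# Case 2: the client waits one more second, the checkpoint is written,
# and the crash happens right after it.
lost_after_waiting = work_lost_on_crash(FOUR_HOURS_S + 1, FOUR_HOURS_S + 1)

print(lost_without_waiting, lost_after_waiting)
```

Under these assumptions, stopping one second too early costs the full four hours, while waiting costs nothing.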
Joined: 29 Aug 05 Posts: 15575 |
Even if you set high times for changing there is always a download or benchmark coming to reset it. I concur with Jim. If you set BOINC to leave the applications in memory, a download by the project doesn't drop the application out of memory. Nor does a benchmark when you use 5.2.8 or higher. An example from my computer:
09/01/2006 13:58:28|SETI@home Beta Test|Pausing result 05jl01ab.26451.818.98570.65_1 (left in memory)
09/01/2006 13:58:31||Running CPU benchmarks
09/01/2006 13:59:30||Benchmark results:
09/01/2006 13:59:30|| Number of CPUs: 1
09/01/2006 13:59:30|| 1138 double precision MIPS (Whetstone) per CPU
09/01/2006 13:59:30|| 2331 integer MIPS (Dhrystone) per CPU
09/01/2006 13:59:30||Finished CPU benchmarks
09/01/2006 13:59:31||Resuming computation
09/01/2006 13:59:31||Rescheduling CPU: Resuming computation
09/01/2006 13:59:31|SETI@home Beta Test|Resuming result 05jl01ab.26451.818.98570.65_1 using setiathome_enhanced version 411
See? The memory being used to leave the process in isn't RAM. It's your page file. Although I can see it might be problematic on a Windows 9x system. Now Paul, how does BOINC know it is coming up on a benchmark? I must say, I haven't got a clue, but I know you have looked at the source code. ;) |
Joined: 29 Aug 05 Posts: 225 |
Now Paul, how does BOINC know it is coming up on a benchmark? I must say, I haven't got a clue, but know you looked at the source code. ;) Well, as I understand the system we still have a minor problem. The BOINC Client has a function that says, "Ok to checkpoint" that the science applications test. But, as far as I know, the science applications do not tell the BOINC Daemon that "Checkpoint completed". So, this would have to be added IF this design change were to be made... |
Joined: 29 Oct 05 Posts: 5 |
Now Paul, how does BOINC know it is coming up on a benchmark? I must say, I haven't got a clue, but know you looked at the source code. ;) See the boinc_checkpoint_completed() function in boinc_api.C, which applications call just after doing a checkpoint. SETI@home uses it, and I presume other apps that checkpoint do too. If the <checkpoint_cpu_time> value in the <active_task> sections of client_state.xml is correct, that's how the client knows. A partial improvement would be to have each update of <checkpoint_cpu_time> trigger a check of whether the next expected CPU reschedule would fall within the next checkpoint interval. If so, the reschedule could be triggered immediately. That doesn't catch rescheduling caused by other events, though. Joe |
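The partial improvement Joe describes can be sketched roughly as follows. This is a hypothetical model in Python, not BOINC client code: the `Task` class, `on_checkpoint_completed`, and the idea of estimating the checkpoint interval from the two most recent checkpoints are all assumptions made for illustration. It also assumes the task runs at full speed, so CPU time and wall-clock time advance together.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    checkpoint_cpu_time: float = 0.0  # time of the last reported checkpoint
    checkpoint_interval: float = 0.0  # estimated gap between checkpoints


def on_checkpoint_completed(task: Task, cpu_time: float,
                            now: float, next_reschedule: float) -> bool:
    """Record a checkpoint; return True if rescheduling should happen right now.

    Rescheduling is pulled forward when the planned reschedule would otherwise
    land before the task is expected to write its next checkpoint.
    """
    task.checkpoint_interval = cpu_time - task.checkpoint_cpu_time
    task.checkpoint_cpu_time = cpu_time
    time_until_reschedule = next_reschedule - now
    return 0 <= time_until_reschedule < task.checkpoint_interval


# A task that checkpoints every 1000 s, with a reschedule planned at t = 3600 s.
demo = Task("demo_result")
decisions = [
    on_checkpoint_completed(demo, cpu_time=t, now=t, next_reschedule=3600.0)
    for t in (1000.0, 2000.0, 3000.0)
]
print(decisions)
```

In this example only the checkpoint at 3000 s triggers an early reschedule, because the next checkpoint (expected around 4000 s) would come after the planned switch at 3600 s. As Joe notes, this does not cover rescheduling caused by other events.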
Joined: 19 Dec 05 Posts: 28 |
I concur with Jim. If you set to leave the applications in memory, a download by the project doesn't drop the application out of memory. Nor does a benchmark when you use 5.2.8 or higher. I know that it is possible to work around the weaknesses of the system. That's the Windows way :-). What I want is this: make it so that the BOINC client and the applications do what the user wants (whenever possible). The preferences are for the user. The applications and their interactions with the BOINC client should not limit the user. Norbert |
Joined: 19 Dec 05 Posts: 28 |
Now Paul, how does BOINC know it is coming up on a benchmark? It is the client that does the benchmark. Let it wait until the next task switch. Should not be rocket science. Norbert |
Joined: 29 Aug 05 Posts: 15575 |
Now Paul, how does BOINC know it is coming up on a benchmark? Client can be read two ways: Science application or BOINC version? - The science application? Nope. - The BOINC client? Yep. No need to believe me, we have the Wiki for that. That's the benchmark being done prior to you being attached to any project. |
Joined: 29 Aug 05 Posts: 15575 |
I know, that it is possible to sail around the weaknesses of the system. That's the Windows-way :-). What I want is: Make it so that boinc-client and -applications do what the user wants (whenever possible). The preferences are for the user. The applications and their interactions with the boinc client should not limit the user. What you're so easily forgetting when you put it like that is that the science application is processing data. Some data needs to be returned quickly; some may be returned after a long time. By your reasoning, you would want to decide when the data is going to be returned: only you, not the application, not BOINC, not a deadline. So what good are preferences then? |
Joined: 29 Aug 05 Posts: 225 |
Make it so that boinc-client and -applications do what the user wants (whenever possible). The preferences are for the user. The applications and their interactions with the boinc client should not limit the user. And as time has gone on, this has been happening. It does not happen in huge chunks (yet), mostly because, even though there are half a dozen people working on development, none of this is all that easy ... But in the time I have been using BOINC, it has become easier to use, does more, and in general has become more powerful. Oh, and my selection of BOINC-powered projects continues to increase too ... |
Joined: 19 Dec 05 Posts: 28 |
Client can be read two ways: Science application or BOINC version? I use it the "standard" way (IMHO). The client is boinc.exe. So my point stands: it's the client that initiates both the switching of tasks and the benchmarks. The two should be "synchronized" to avoid a stop/restart cycle of the application. Norbert |
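One way to picture the "synchronization" Norbert asks for is a client that remembers a benchmark request and runs it at the next task switch instead of pausing the running task. The following is a toy Python model under that assumption; the `Client` class, its method names, and the log messages are hypothetical and do not reflect how the real BOINC client is structured.

```python
class Client:
    """Toy model: benchmarks are deferred to the next task switch."""

    def __init__(self) -> None:
        self.benchmark_pending = False
        self.log: list[str] = []

    def request_benchmark(self) -> None:
        # Do not interrupt the running task; just remember the request.
        self.benchmark_pending = True
        self.log.append("benchmark deferred until next task switch")

    def switch_task(self, old: str, new: str) -> None:
        self.log.append(f"pausing {old}")
        if self.benchmark_pending:
            # The CPU is idle between tasks anyway, so benchmark here.
            self.log.append("running CPU benchmarks")
            self.benchmark_pending = False
        self.log.append(f"starting {new}")


client = Client()
client.request_benchmark()
client.switch_task("task_a", "task_b")
for line in client.log:
    print(line)
```

In this model the running application sees a single pause (the task switch) rather than one pause for the benchmark and another for the switch.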
Joined: 19 Dec 05 Posts: 28 |
What you're forgetting so easily with saying it like that is that the science application is doing data. Some data needs to be returned quickly, some may be returned after a long time. I didn't say that a user should be able to ignore the needs of the project. But I look at BOINC from the point of view of the ordinary user (which I am). User-friendly means something like this: if there is a "leave app in memory?" option in the preferences, then setting it to "no" should do no more than add some overhead for starting and stopping the application. It is not obvious that you will lose part of your calculations, or even have an application get stuck doing the same work again and again. On the other hand, the "connect to network every..." setting should be replaced by a more understandable "queue length" parameter. (And it would be OK if the client told the user something like "using only 4 days for the queue of project x because of the project's deadlines".) Or for the "change application every..." parameter: it's OK if after, e.g., 60 minutes the BOINC client logs "waiting for next checkpoint to switch applications". The user sees that his wish is not blindly followed, but it is respected. Norbert PS: I can live with BOINC the way it is, but perhaps there is some idea here where a developer says, "Hey, that's possible and not too difficult or time-consuming to implement. Let's put a note for later." |
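The behaviour Norbert suggests for the "change application every..." setting could look roughly like the decision function below. This is a sketch under stated assumptions, not BOINC's scheduler: `switch_decision` and its parameters are hypothetical, and the `max_extra_wait_s` cap (so that a task that never checkpoints cannot hold the CPU forever) is an addition that does not appear in the thread.

```python
def switch_decision(run_time_s: float, last_checkpoint_s: float,
                    switch_interval_s: float,
                    max_extra_wait_s: float = 600.0) -> str:
    """Decide what to do with the running task, given its current time slice.

    run_time_s        -- how long the task has run in this slice
    last_checkpoint_s -- run time at which it last checkpointed
    switch_interval_s -- the user's "change application every..." setting
    max_extra_wait_s  -- assumed cap on how long to wait for a checkpoint
    """
    if run_time_s < switch_interval_s:
        return "keep running"
    if last_checkpoint_s >= switch_interval_s:
        # A checkpoint was written after the slice expired: safe to switch.
        return "switch applications"
    if run_time_s - switch_interval_s >= max_extra_wait_s:
        return "switch applications (gave up waiting for checkpoint)"
    return "waiting for next checkpoint to switch applications"


# 60-minute slice, as in the post.
print(switch_decision(3000, 2500, 3600))
print(switch_decision(3700, 2500, 3600))
print(switch_decision(3900, 3850, 3600))
print(switch_decision(4300, 2500, 3600))
```

The second call corresponds to the log message proposed in the post: the slice has expired, but the client holds off until the task reports a checkpoint.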
Joined: 29 Aug 05 Posts: 225 |
We have lots of notes ... :) For many projects, checkpointing is easy and fast. SETI@Home is lucky this way. CPDN takes long enough to checkpoint that 15 minutes is a reasonable compromise. I don't know enough about Rosetta@Home to comment on the issues there. Fundamentally, though, restarting at the checkpoint is the only option, whether the stop happened because the participant turned off the computer, because it crashed, or for some other reason. And we have been trying to convince the "powers that be" that splitting the connect interval and the queue/buffer size is a good idea (TM) ... so far with no luck, as far as I know. |
Copyright © 2025 University of California.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License,
Version 1.2 or any later version published by the Free Software Foundation.