Thread 'Wait for checkpoint before changing the application'

Author	Message
Norbert Hoffmann Joined: 19 Dec 05 Posts: 28	Message 2433 - Posted: 7 Jan 2006, 13:32:33 UTC Last modified: 7 Jan 2006, 13:32:51 UTC Is it possible to change the BOINC client so that it waits until the next checkpoint is written before stopping an application? At present, on slow computers, or on computers that also run other CPU-intensive tasks besides BOINC, it is nearly impossible to crunch for projects such as Rosetta. Even if you set a long interval between application switches, there is always a download or a benchmark coming along to reset it. Norbert

Paul D. Buck Joined: 29 Aug 05 Posts: 225	Message 2434 - Posted: 7 Jan 2006, 14:40:43 UTC Not at this time. :( This is, we hope, one of the upcoming changes. But not yet. In any case, because of the slow checkpointing, I would not recommend running Rosetta@Home. You would be better served running a project with "faster" work units and more frequent checkpoints.

Jim K Joined: 8 Sep 05 Posts: 168	Message 2439 - Posted: 7 Jan 2006, 18:32:37 UTC Set preferences to leave in memory!!!!!!! BOINC Wiki

Paul D. Buck Joined: 29 Aug 05 Posts: 225	Message 2450 - Posted: 8 Jan 2006, 9:30:57 UTC But that does not address the specific concern. The point is that you want to reach a checkpoint before halting. This is a crash-protection measure, perhaps more critical on slower computers. So, if the task is swapped out after 4 hours, 1 second short of the next checkpoint, and the computer then crashes, the restart goes back 4 hours to the last checkpoint. If it had waited one more second, no work would have been lost at all. So this is a good enhancement. It is just not here yet.
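To put numbers on Paul's example, here is a minimal Python sketch. The function name `lost_work_seconds` and its parameters are hypothetical and used only for illustration; the sketch simply assumes that a crash discards everything computed since the last checkpoint.

```python
def lost_work_seconds(cpu_time_at_crash: float,
                      last_checkpoint_cpu_time: float) -> float:
    """Return the CPU seconds discarded when a task restarts from its last checkpoint."""
    if last_checkpoint_cpu_time > cpu_time_at_crash:
        raise ValueError("a checkpoint cannot be newer than the crash time")
    return cpu_time_at_crash - last_checkpoint_cpu_time


FOUR_HOURS = 4 * 3600

# Swapped out 4 hours after the last checkpoint, 1 second before the next one:
# a crash at this point throws away the full 4 hours.
loss_without_waiting = lost_work_seconds(FOUR_HOURS, 0)

# Had the client waited one more second, the checkpoint would already be on disk,
# so the same crash would cost nothing.
loss_after_waiting = lost_work_seconds(FOUR_HOURS + 1, FOUR_HOURS + 1)
```

The two values differ by the whole checkpoint interval, which is why slow checkpointing makes the timing of a task switch matter so much.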

Jord Volunteer tester Help desk expert Joined: 29 Aug 05 Posts: 15575	Message 2464 - Posted: 9 Jan 2006, 14:01:22 UTC "Even if you set high times for changing there is always a download or benchmark coming to reset it." I concur with Jim. If you set the preference to leave applications in memory, a download by the project doesn't drop the application out of memory. Nor does a benchmark when you use 5.2.8 or higher. Here is an example from my computer:
09/01/2006 13:58:28|SETI@home Beta Test|Pausing result 05jl01ab.26451.818.98570.65_1 (left in memory)
09/01/2006 13:58:31||Running CPU benchmarks
09/01/2006 13:59:30||Benchmark results:
09/01/2006 13:59:30|| Number of CPUs: 1
09/01/2006 13:59:30|| 1138 double precision MIPS (Whetstone) per CPU
09/01/2006 13:59:30|| 2331 integer MIPS (Dhrystone) per CPU
09/01/2006 13:59:30||Finished CPU benchmarks
09/01/2006 13:59:31||Resuming computation
09/01/2006 13:59:31||Rescheduling CPU: Resuming computation
09/01/2006 13:59:31|SETI@home Beta Test|Resuming result 05jl01ab.26451.818.98570.65_1 using setiathome_enhanced version 411
See? The memory being used to leave the process in isn't RAM. It's your page file. Although I can see it might be problematic on a Windows 9x system. Now Paul, how does BOINC know it is coming up on a benchmark? I must say, I haven't got a clue, but I know you looked at the source code. ;)

Paul D. Buck Joined: 29 Aug 05 Posts: 225	Message 2468 - Posted: 9 Jan 2006, 17:03:35 UTC - in response to Message 2464. "Now Paul, how does BOINC know it is coming up on a benchmark? I must say, I haven't got a clue, but know you looked at the source code. ;)" Well, as I understand the system, we still have a minor problem. The BOINC Client has a function that says "OK to checkpoint", which the science applications test. But, as far as I know, the science applications do not tell the BOINC Daemon "Checkpoint completed". So this would have to be added IF this design change were to be made...

Josef W. Segur Joined: 29 Oct 05 Posts: 5	Message 2469 - Posted: 9 Jan 2006, 20:04:43 UTC - in response to Message 2468. "Now Paul, how does BOINC know it is coming up on a benchmark? I must say, I haven't got a clue, but know you looked at the source code. ;)" "Well, as I understand the system we still have a minor problem. The BOINC Client has a function that says 'Ok to checkpoint' that the science applications test. But, as far as I know, the science applications do not tell the BOINC Daemon that 'Checkpoint completed'. So, this would have to be added IF this design change were to be made..." See the boinc_checkpoint_completed() function in boinc_api.C, which applications call just after doing a checkpoint. Setiathome uses it, and I presume other apps that do checkpoints use it as well. If the <checkpoint_cpu_time> value in the <active_task> sections of client_state.xml is correct, that's how. A partial improvement would be to have an update of <checkpoint_cpu_time> trigger a check of whether the next expected CPU reschedule would fall within the next checkpoint interval. If so, the reschedule could be triggered immediately. That doesn't catch rescheduling caused by other events, though. Joe
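Joe's partial improvement can be sketched roughly as follows. This is only an illustration in Python, not BOINC client code: the function `should_reschedule_now` and its parameters are hypothetical, all times are taken to be seconds on a single clock, and the length of the next checkpoint interval is assumed to equal the gap between the task's last two checkpoints.

```python
def should_reschedule_now(now: float,
                          next_reschedule_at: float,
                          last_checkpoint_gap: float) -> bool:
    """Decide, right after a task reports a checkpoint, whether to switch immediately.

    now                 -- time at which the checkpoint was reported
    next_reschedule_at  -- time at which the next CPU reschedule is expected
    last_checkpoint_gap -- assumed estimate of how long the next checkpoint will take
    """
    time_until_reschedule = next_reschedule_at - now
    # If the reschedule would arrive before the task can checkpoint again,
    # switching now (with a fresh checkpoint on disk) loses no work.
    return time_until_reschedule < last_checkpoint_gap


# A task that checkpoints every 15 minutes reports a checkpoint at t = 3000 s,
# and the next reschedule is due at t = 3600 s: switch now.
early_switch = should_reschedule_now(3000, 3600, 900)

# The same checkpoint reported at t = 1000 s leaves room for more checkpoints: keep running.
keep_running = should_reschedule_now(1000, 3600, 900)
```

As Joe notes, a check like this only covers reschedules that can be predicted; a switch triggered by some other event would still interrupt the task between checkpoints.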

Norbert Hoffmann Joined: 19 Dec 05 Posts: 28	Message 2471 - Posted: 9 Jan 2006, 22:33:04 UTC - in response to Message 2464. "I concur with Jim. If you set to leave the applications in memory, a download by the project doesn't drop the application out of memory. Nor does a benchmark when you use 5.2.8 or higher." I know that it is possible to sail around the weaknesses of the system. That's the Windows way :-). What I want is this: make the BOINC client and the applications do what the user wants (whenever possible). The preferences are for the user. The applications and their interactions with the BOINC client should not limit the user. Norbert

Norbert Hoffmann Joined: 19 Dec 05 Posts: 28	Message 2472 - Posted: 9 Jan 2006, 22:36:14 UTC - in response to Message 2464. "Now Paul, how does BOINC know it is coming up on a benchmark?" It is the client that does the benchmark. Let it wait until the next task switch. Should not be rocket science. Norbert

Jord Volunteer tester Help desk expert Joined: 29 Aug 05 Posts: 15575	Message 2473 - Posted: 10 Jan 2006, 0:09:11 UTC - in response to Message 2472. "Now Paul, how does BOINC know it is coming up on a benchmark?" "It is the client that does the benchmark. Let it wait until the next task switch. Should not be rocket science. Norbert" Client can be read two ways: Science application or BOINC version? - The science application? Nope. - The BOINC client? Yep. No need to believe me, we have the Wiki for that. That's the benchmark being done prior to you being attached to any project.

Jord Volunteer tester Help desk expert Joined: 29 Aug 05 Posts: 15575	Message 2474 - Posted: 10 Jan 2006, 0:13:59 UTC - in response to Message 2471. "I know, that it is possible to sail around the weaknesses of the system. That's the Windows-way :-). What I want is: Make it so that boinc-client and -applications do what the user wants (whenever possible). The preferences are for the user. The applications and their interactions with the boinc client should not limit the user. Norbert" What you're forgetting when you put it like that is that the science application is processing data. Some data needs to be returned quickly; some may be returned after a long time. By your reasoning, you may also want to decide when the data is going to be returned. And only you: not the application, not BOINC, not a deadline. So what good are preferences then?

Paul D. Buck Joined: 29 Aug 05 Posts: 225	Message 2484 - Posted: 10 Jan 2006, 9:16:25 UTC - in response to Message 2471. "Make it so that boinc-client and -applications do what the user wants (whenever possible). The preferences are for the user. The applications and their interactions with the boinc client should not limit the user." And as time has gone on, this *has* been happening. It does not happen in huge chunks (yet), mostly because, even though there are half a dozen people working on development, none of this is all that easy ... But since I have been using BOINC, it has become easier and easier to use, does more, and in general has become more powerful. Oh, and my selection of BOINC Powered Projects continues to increase too ...

Norbert Hoffmann Joined: 19 Dec 05 Posts: 28	Message 2485 - Posted: 10 Jan 2006, 9:20:03 UTC - in response to Message 2473. "Client can be read two ways: Science application or BOINC version?" I use it the "standard" way (IMHO). The client is boinc.exe. So my point stands: it is the client that initiates both the switching of tasks and the benchmarks. The two should be "synchronized" to avoid a stop/restart cycle of the application. Norbert

Norbert Hoffmann Joined: 19 Dec 05 Posts: 28	Message 2487 - Posted: 10 Jan 2006, 10:38:48 UTC - in response to Message 2474. "What you're forgetting so easily with saying it like that is that the science application is doing data. Some data needs to be returned quickly, some may be returned after a long time. Using your analogy, you may want to tell when the data is going to be returned. And only you, not the application, not BOINC, not a deadline set to things. So what good are preferences then?" I didn't say that a user should be able to ignore the needs of the project. But I look at BOINC from the point of view of the ordinary user (which I am). User-friendly means something like this: if there is a "leave app in memory?" option in the preferences, then setting it to "no" should do no more than add some overhead for starting and stopping the application. It is not obvious that you will lose part of your calculations, or even have an application stuck doing the same work again and again. On the other hand, the "connect to network every..." setting should be replaced by a more understandable "queue length" parameter. (And it would be OK if the client told the user something like "using only 4 days for the queue of project x because of the project's deadlines".) Or, for the "change application every..." parameter: it's OK if after, e.g., 60 minutes the BOINC client logs "waiting for next checkpoint to switch applications". The user sees that his wish is not blindly followed but respected. Norbert PS: I can live with BOINC the way it is, but perhaps there is some idea where a developer says "hey, that's possible and not too difficult or time-consuming to implement. Let's put a note for later."
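The behaviour Norbert asks for, where the "change application every..." interval is honoured only at the next checkpoint, might look something like the Python sketch below. It is a hypothetical illustration, not how the BOINC client works: the function `decide`, its parameters, and in particular the `max_wait` cap (added here as an assumption so that an application that never checkpoints cannot block switching forever) are all invented for the example. Only the log text is taken from the post.

```python
from typing import Optional

WAIT_MESSAGE = "waiting for next checkpoint to switch applications"


def decide(now: float,
           slice_start: float,
           switch_interval: float,
           last_checkpoint_at: Optional[float],
           max_wait: float) -> str:
    """Return "run", "wait", or "switch" for the currently running task.

    All times are seconds on one clock. `max_wait` is an assumed safety cap.
    """
    expiry = slice_start + switch_interval
    if now < expiry:
        return "run"        # the user's switch interval has not elapsed yet
    if last_checkpoint_at is not None and last_checkpoint_at >= expiry:
        return "switch"     # a checkpoint was written after the interval expired
    if now - expiry >= max_wait:
        return "switch"     # waited long enough; switch even without a checkpoint
    return "wait"           # interval expired, but hold on for the checkpoint


# 60-minute switch interval, last checkpoint at 50 minutes, and we are now at 61:40.
action = decide(now=3700, slice_start=0, switch_interval=3600,
                last_checkpoint_at=3000, max_wait=1800)
if action == "wait":
    print(WAIT_MESSAGE)
```

The cap is a design choice rather than part of the proposal; without something like it, a project with very slow checkpointing could hold the CPU well past the interval the user asked for.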

Paul D. Buck Joined: 29 Aug 05 Posts: 225	Message 2489 - Posted: 10 Jan 2006, 13:10:34 UTC We have lots of notes ... :) For many projects, checkpointing is easy and fast. SETI@Home is lucky this way. CPDN takes long enough to checkpoint that 15 minutes is a reasonable compromise. I don't know enough about Rosetta@Home to comment on the issues there. Fundamentally, though, restarting from the checkpoint is the only option, whether the stop happened because the participant turned off the computer, because it crashed, or for some other reason. And we have been trying to convince the "powers that be" that splitting the connect interval from the queue/buffer size is a good idea (TM) ... so far no luck, as far as I know.

Copyright © 2025 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.