Message boards : API : Bug in API or CC, time change issue
Author | Message |
---|---|
Joined: 27 Jun 06 Posts: 305 |
28/10/2007 03:07:35|QMC@HOME|Result two2_dd_hexadeca.7964_0 exited with zero status but no 'finished' file
28/10/2007 03:07:35|QMC@HOME|If this happens repeatedly you may need to reset the project.
28/10/2007 03:07:35||Rescheduling CPU: application exited
28/10/2007 03:07:35|Spinhenge@home|Result 2_Fe30_map_291_1006_0 exited with zero status but no 'finished' file
28/10/2007 03:07:35|Spinhenge@home|If this happens repeatedly you may need to reset the project.
28/10/2007 03:07:35||Rescheduling CPU: application exited

This happened on Win2k just after I set the PC clock back by 6 minutes, and it has probably damaged the QAH result after 3 days of work (QAH checkpoints do not always work properly).

I guess the programs need to handle the WM_TIMECHANGE message or similar and reset their timers, or just send one heartbeat immediately after such a time change. Any attempt to use a time delta for anything will fail in this situation, unless it uses the program tick timer, which is not affected by time changes!

Btw., both clients have this in stderr:
...
No heartbeat from core client for 31 sec - exiting

and stderr is also dated 3:07. (All time values are CET, which is why I can post before the bug appeared.)

P.S.: It is not a DST issue. The DST switch happened several minutes before the problem and affects only localtime(), which BOINC fortunately does not use for measuring time. |
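To illustrate the point about the tick timer, here is a minimal C++ sketch of a heartbeat timeout that is measured on a monotonic clock instead of the wall clock. It is not the BOINC API code: the class `HeartbeatWatch`, its methods, and the alias `SteadyHeartbeatWatch` are hypothetical names made up for this example, and `std::chrono::steady_clock` stands in for the Windows tick counter.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper, not part of the BOINC API: tracks the time since the
// last heartbeat on a clock chosen by the caller.
template <typename Clock>
class HeartbeatWatch {
public:
    using TimePoint = typename Clock::time_point;
    using Duration = typename Clock::duration;

    HeartbeatWatch(Duration timeout, TimePoint now)
        : timeout_(timeout), last_(now) {
        assert(timeout > Duration::zero());
    }

    // Call whenever a heartbeat arrives from the core client.
    void on_heartbeat(TimePoint now) { last_ = now; }

    // True if no heartbeat has been seen for longer than the timeout.
    bool expired(TimePoint now) const { return now - last_ > timeout_; }

private:
    Duration timeout_;
    TimePoint last_;
};

// steady_clock is monotonic, so adjusting the system time does not move it.
using SteadyHeartbeatWatch = HeartbeatWatch<std::chrono::steady_clock>;
```

In a real application the time points would come from `std::chrono::steady_clock::now()`. Passing them in as arguments is only a design choice that keeps the sketch easy to test.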
Joined: 19 Jan 07 Posts: 1179 |
Handling WM_TIMECHANGE sounds like a good idea. One small problem: which window would get that message? Neither the core client nor the science apps have any window open. I would dislike having a hidden window just to handle that. |
Joined: 27 Jun 06 Posts: 305 |
Good question, I didn't think about that when I posted :-/ Maybe GetConsoleWindow()? Or the default window procedure, i.e. give Win9xMonitorSystemWndProc() (main.C) a meaning on other Windows versions? An extra window for this rare issue, or even an OS hook, would be a bit too much; usually checkpoints should work and mitigate the problem. Btw., I think if the CC polled the project API immediately after a time change, no changes would have to be made to the API, so only one place would have to be changed. |
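One way to notice a time change without any window at all is to compare how far the wall clock and a monotonic clock have advanced between two polls. This is a different technique from handling the Windows message, and the sketch below is only an assumption about how it could look in C++: `wall_clock_jumped` and `ClockJumpDetector` are hypothetical names, and the tolerance is an arbitrary parameter, not a value taken from BOINC.

```cpp
#include <cassert>
#include <chrono>

using std::chrono::milliseconds;

// Hypothetical helper: true if the wall clock advanced by a noticeably
// different amount than the monotonic clock over the same interval.
inline bool wall_clock_jumped(milliseconds wall_elapsed,
                              milliseconds steady_elapsed,
                              milliseconds tolerance) {
    assert(tolerance >= milliseconds::zero());
    const milliseconds drift = wall_elapsed - steady_elapsed;
    return drift > tolerance || drift < -tolerance;
}

// Hypothetical detector: samples both clocks on every poll() and reports
// whether the system time was adjusted since the previous poll.
class ClockJumpDetector {
public:
    explicit ClockJumpDetector(milliseconds tolerance)
        : tolerance_(tolerance),
          wall_prev_(std::chrono::system_clock::now()),
          steady_prev_(std::chrono::steady_clock::now()) {}

    bool poll() {
        const auto wall_now = std::chrono::system_clock::now();
        const auto steady_now = std::chrono::steady_clock::now();
        const bool jumped = wall_clock_jumped(
            std::chrono::duration_cast<milliseconds>(wall_now - wall_prev_),
            std::chrono::duration_cast<milliseconds>(steady_now - steady_prev_),
            tolerance_);
        wall_prev_ = wall_now;
        steady_prev_ = steady_now;
        return jumped;
    }

private:
    milliseconds tolerance_;
    std::chrono::system_clock::time_point wall_prev_;
    std::chrono::steady_clock::time_point steady_prev_;
};
```

A core client could call `poll()` once per main-loop iteration and send a heartbeat right away whenever it returns true, which matches the idea of changing only one place.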
Joined: 27 Jun 06 Posts: 305 |
I think there might be a solution for this that fixes Ticket #336 with the same change. If the heartbeat were redefined to be expected within 30 CPU seconds instead of 30 wall-clock seconds, heartbeats would be expected less often when the host itself is unresponsive (7-zipping a huge file with maximum compression has that effect). As high load affects both the core client and the project application, using CPU time would probably be more appropriate: the project application would expect fewer heartbeats when it gets less CPU time itself. The process CPU time should (hopefully) not be influenced by adjusting the PC clock. There should not even be any compatibility issues, as elapsed CPU time and elapsed wall-clock time are not too different most of the time. So the (unmodified) core clients would still try to send a heartbeat within 30 wall-clock seconds, but the API would be more patient on overloaded systems. |
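The CPU-time idea could be sketched in C++ roughly as follows. This is an illustration under assumptions, not the BOINC API: `CpuHeartbeatWatch` and `process_cpu_seconds` are hypothetical names. Note that `std::clock()` is specified to return processor time, but the Microsoft C runtime is known to return elapsed wall-clock time instead, so on Windows a platform call such as `GetProcessTimes` would probably be needed.

```cpp
#include <cassert>
#include <ctime>

// CPU seconds consumed by this process so far, as reported by std::clock().
// Assumption: the C runtime really reports processor time here; some
// runtimes (notably Microsoft's) report wall-clock time instead.
inline double process_cpu_seconds() {
    return static_cast<double>(std::clock()) / CLOCKS_PER_SEC;
}

// Hypothetical helper: the heartbeat timeout is counted in CPU seconds used
// by the application, so a starved application waits longer in wall time.
class CpuHeartbeatWatch {
public:
    CpuHeartbeatWatch(double timeout_cpu_sec, double cpu_now)
        : timeout_(timeout_cpu_sec), last_(cpu_now) {
        assert(timeout_cpu_sec > 0.0);
    }

    // Call whenever a heartbeat arrives from the core client.
    void on_heartbeat(double cpu_now) { last_ = cpu_now; }

    // True if more than the timeout in CPU seconds has been consumed
    // since the last heartbeat.
    bool expired(double cpu_now) const { return cpu_now - last_ > timeout_; }

private:
    double timeout_;
    double last_;
};
```

An application would pass `process_cpu_seconds()` as `cpu_now`. Because the process CPU counter does not depend on the system time, setting the PC clock back or forward would not affect the check.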
Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License,
Version 1.2 or any later version published by the Free Software Foundation.