Thread 'Bug in API or CC, time change issue'

Message boards : API : Bug in API or CC, time change issue
Ananas

Joined: 27 Jun 06
Posts: 305
Germany
Message 13336 - Posted: 28 Oct 2007, 2:24:49 UTC
Last modified: 28 Oct 2007, 2:56:17 UTC

28/10/2007 03:07:35|QMC@HOME|Result two2_dd_hexadeca.7964_0 exited with zero status but no 'finished' file
28/10/2007 03:07:35|QMC@HOME|If this happens repeatedly you may need to reset the project.
28/10/2007 03:07:35||Rescheduling CPU: application exited
28/10/2007 03:07:35|Spinhenge@home|Result 2_Fe30_map_291_1006_0 exited with zero status but no 'finished' file
28/10/2007 03:07:35|Spinhenge@home|If this happens repeatedly you may need to reset the project.
28/10/2007 03:07:35||Rescheduling CPU: application exited


This happened on Win2k right after I set the PC clock back by 6 minutes, and it has probably ruined the QAH result 3 days into the run (QAH checkpoints do not always work properly).

I guess the programs need to handle the WM_TIMECHANGE message (or something similar) and either reset their timers or send one heartbeat immediately after such a time change. Any attempt to use a wall-clock delta for anything will fail in this situation. The exception is a delta taken from the tick timer, which is not affected by time changes!
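To illustrate the point about delta times, here is a minimal Python sketch (not BOINC code) of a heartbeat timeout whose clock can be swapped. `HeartbeatWatchdog` and `FakeClock` are hypothetical names made up for this example. It does not try to reproduce the exact failure in the log. It only shows that a delta taken from an adjustable clock breaks when the clock is stepped, while a delta from a monotonic clock does not.

```python
import time


class HeartbeatWatchdog:
    """Tracks how long it has been since the last heartbeat.

    `clock` is any zero-argument callable returning seconds as a float.
    The default, time.monotonic, is not affected by system clock updates.
    """

    def __init__(self, timeout_s=30.0, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock
        self.last_beat = clock()

    def beat(self):
        """Record that a heartbeat arrived just now."""
        self.last_beat = self.clock()

    def expired(self):
        """True if more than timeout_s seconds passed since the last beat."""
        return self.clock() - self.last_beat > self.timeout_s


class FakeClock:
    """Hypothetical test clock whose value is set by hand."""

    def __init__(self, start=0.0):
        self.now = start

    def __call__(self):
        return self.now


if __name__ == "__main__":
    wall, mono = FakeClock(1000.0), FakeClock(50.0)
    on_wall = HeartbeatWatchdog(30.0, clock=wall)
    on_mono = HeartbeatWatchdog(30.0, clock=mono)

    # One real second passes, and the wall clock is also stepped forward 6 minutes.
    wall.now += 1.0 + 360.0
    mono.now += 1.0
    print("wall-clock watchdog expired:", on_wall.expired())
    print("monotonic watchdog expired:", on_mono.expired())
```

With a forward step the wall-clock watchdog reports a timeout after one real second. With a backward step the same delta turns negative, so a genuinely missing heartbeat would go unnoticed until the clock catches up. The monotonic version sees only the one second that really elapsed.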

Btw., both clients have this in stderr:

...
No heartbeat from core client for 31 sec - exiting

and stderr is also timestamped 3:07.

(All time values in the log are CET, which is why this post appears to be dated before the bug occurred.)

P.S.: This is not a DST issue. The DST switch happened several minutes before the problem and affects only localtime(), which BOINC fortunately does not use for measuring time.
ID: 13336
Nicolas

Joined: 19 Jan 07
Posts: 1179
Argentina
Message 13337 - Posted: 28 Oct 2007, 3:04:34 UTC - in response to Message 13336.  

Handling WM_TIMECHANGE sounds like a good idea. One small problem: which window would get that message? Neither the core client nor the science apps have any window open, and I would dislike having a hidden window just to handle that.
ID: 13337
Ananas

Joined: 27 Jun 06
Posts: 305
Germany
Message 13338 - Posted: 28 Oct 2007, 3:21:51 UTC
Last modified: 28 Oct 2007, 3:24:56 UTC

Good question, I didn't think about that when I posted :-/

Maybe GetConsoleWindow()?

Or the default window procedure, i.e. give Win9xMonitorSystemWndProc() (main.C) a purpose on other Windows versions as well?

An extra window for this rare issue, or even an OS hook, would be a bit too much. Normally checkpoints should work and limit the damage from this issue.


Btw., I think that if the CC polled the project API immediately after a time change, no changes would be needed in the API itself, so only one place would have to be changed.
ID: 13338
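The thread leaves open how the CC would notice a time change without a window to receive the message. One alternative, which nobody in this thread proposed and which is offered only as a hedged Python sketch, is to compare how far the wall clock and a monotonic clock advanced between two polls. `ClockStepDetector` and its `tolerance_s` parameter are hypothetical names for this example.

```python
import time


class ClockStepDetector:
    """Detects wall-clock adjustments by comparing two clocks between polls.

    `wall` and `mono` are zero-argument callables returning seconds.
    """

    def __init__(self, tolerance_s=2.0, wall=time.time, mono=time.monotonic):
        self.tolerance_s = tolerance_s
        self.wall = wall
        self.mono = mono
        self.last_wall = wall()
        self.last_mono = mono()

    def poll(self):
        """Return the detected step in seconds, or 0.0 if within tolerance.

        A negative value means the wall clock was set backwards.
        """
        now_wall, now_mono = self.wall(), self.mono()
        # If nobody touched the clock, both deltas are (nearly) equal.
        step = (now_wall - self.last_wall) - (now_mono - self.last_mono)
        self.last_wall, self.last_mono = now_wall, now_mono
        return step if abs(step) > self.tolerance_s else 0.0


if __name__ == "__main__":
    detector = ClockStepDetector()
    time.sleep(0.1)
    # Prints 0.0 unless the system clock was adjusted during the sleep.
    print(detector.poll())
```

A caller that polls this once per second could send a heartbeat immediately whenever `poll()` returns a non-zero value. The tolerance has to be large enough to absorb ordinary scheduling jitter and small clock slewing.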
Ananas

Joined: 27 Jun 06
Posts: 305
Germany
Message 13729 - Posted: 10 Nov 2007, 13:18:19 UTC
Last modified: 10 Nov 2007, 13:37:37 UTC

I think there might be a solution for this that fixes Ticket #336 with the same change.

If the heartbeat were redefined to be expected within 30 CPU seconds instead of 30 wall-clock seconds, heartbeats would be expected less often while the host itself is unresponsive (running 7-Zip on a huge file with maximum compression has that effect). Since high load affects both the core client and the project application, CPU time would probably be the more appropriate measure: the project application would expect fewer heartbeats when it gets less CPU time itself.

The process CPU time should (hopefully) not be influenced by adjusting the PC clock.


There should not even be any compatibility issues, as elapsed CPU time and elapsed wall-clock time are not too different most of the time.

So the (unmodified) core clients would still try to send a heartbeat at least once every 30 seconds, but the API would be more patient on overloaded systems.
ID: 13729
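The effect of the proposed rule can be checked with a small simulation. This is a hedged Python sketch, not the API's actual logic: `first_timeout` and `cpu_share` are hypothetical names, and the model assumes an application that receives no heartbeat at all and gets a constant fraction of one CPU. In a real Python process the CPU figure could come from `time.process_time()`.

```python
def first_timeout(cpu_share, timeout_s=30.0, max_wall_s=600):
    """Simulate an app that gets `cpu_share` of one CPU and no heartbeats.

    Returns (wall_deadline, cpu_deadline): the wall-clock second at which a
    wall-clock rule and a CPU-time rule would each give up, or None where
    the rule has not fired within max_wall_s seconds.
    """
    wall_deadline = cpu_deadline = None
    for wall in range(1, max_wall_s + 1):
        cpu_used = wall * cpu_share  # CPU seconds consumed so far
        if wall_deadline is None and wall > timeout_s:
            wall_deadline = wall
        if cpu_deadline is None and cpu_used > timeout_s:
            cpu_deadline = wall
    return wall_deadline, cpu_deadline


if __name__ == "__main__":
    for share in (1.0, 0.5, 0.25):
        print(share, first_timeout(share))
```

With a full CPU share both rules give up at the same moment, which matches the compatibility argument above. With a quarter of a CPU the CPU-time rule waits about four times as long in wall-clock terms, which is the extra patience on overloaded hosts.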


Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.