Benchmarking bug - indefinite suspension of computing

Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5077
United Kingdom
Message 16133 - Posted: 30 Mar 2008, 13:55:57 UTC
Last modified: 30 Mar 2008, 14:09:42 UTC

Just come across this as a result of a problem-solving session at SETI - seems to be reproducible in current (v5.10.45) version for Windows.

Scenario: BOINC running as a service on Windows XP. Do that typical end-user thing of using the system clock as a holiday planner (checking a date next month). Inadvertently click 'OK' instead of 'Cancel' - sets the clock a month ahead. Some time later, you (or Internet Time) notice that the clock is wrong, and move it back to the correct month. BOINC computation stops with an endless benchmark shortly after the second time change. {Edit - opened trac ticket #588}.

You'll get a message log something like this:

2008-03-30 12:28:49 [Einstein@Home] Resuming task h1_0907.30_S5R3__166_S5R3b_1 using einstein_S5R3 version 436
2008-04-30 14:09:40 [---] Running CPU benchmarks
2008-04-30 14:09:40 [---] Suspending computation - running CPU benchmarks
2008-04-30 14:09:42 [---] [benchmark_debug] Starting floating-point benchmark
2008-04-30 14:09:52 [---] [benchmark_debug] Ended floating-point benchmark
2008-04-30 14:09:57 [---] [benchmark_debug] Starting integer benchmark
2008-04-30 14:10:07 [---] [benchmark_debug] Ended integer benchmark
2008-04-30 14:10:10 [---] [benchmark_debug] Ended benchmark
2008-04-30 14:10:11 [---] [benchmark_debug] CPU 0 has finished
2008-04-30 14:10:11 [---] [benchmark_debug] 1 out of 1 CPUs done
2008-04-30 14:10:11 [---] [benchmark_debug] CPU 0: fp 1038127090.301003 int 1675825412.162456 intloops 27696000.000000 inttime 9.406250
2008-04-30 14:10:11 [---] Benchmark results:
2008-04-30 14:10:11 [---] Number of CPUs: 1
2008-04-30 14:10:11 [---] 1038 floating point MIPS (Whetstone) per CPU
2008-04-30 14:10:11 [---] 1676 integer MIPS (Dhrystone) per CPU
2008-04-30 14:10:12 [---] Resuming computation
2008-03-30 14:13:41 [---] Running CPU benchmarks
2008-03-30 14:13:41 [---] Suspending computation - running CPU benchmarks
2008-03-30 14:17:24 [---] Exit requested by user

To pause/resume tasks hit CTRL-C, to exit hit CTRL-BREAK

StartServiceCtrlDispatcher being called.
This may take several seconds. Please wait.
2008-03-30 14:17:26 [---] Starting BOINC client version 5.10.13 for windows_intelx86
2008-03-30 14:17:26 [---] log flags: task, file_xfer, sched_ops, benchmark_debug
2008-03-30 14:17:26 [---] Libraries: libcurl/7.16.1 OpenSSL/0.9.8e zlib/1.2.3
2008-03-30 14:17:26 [---] Executing as a daemon
2008-03-30 14:17:26 [---] Data directory: C:\Program Files\BOINC
2008-03-30 14:17:26 [---] BOINC is running as a service and as a non-system user.
2008-03-30 14:17:26 [---] No application graphics will be available.
2008-03-30 14:17:27 [Einstein@Home] Found app_info.xml; using anonymous platform
2008-03-30 14:17:27 [SETI@home] Found app_info.xml; using anonymous platform
2008-03-30 14:17:27 [---] Processor: 1 GenuineIntel Intel(R) Pentium(R) 4 CPU 2.00GHz [x86 Family 15 Model 2 Stepping 4]
2008-03-30 14:17:27 [---] Processor features: fpu tsc sse sse2 mmx
2008-03-30 14:17:27 [---] Memory: 511.30 MB physical, 1.22 GB virtual
2008-03-30 14:17:27 [---] Disk: 37.24 GB total, 4.92 GB free
2008-03-30 14:17:27 [Einstein@Home] URL: http://einstein.phys.uwm.edu/; Computer ID: 1036916; location: home; project prefs: default
2008-03-30 14:17:27 [SETI@home] URL: http://setiathome.berkeley.edu/; Computer ID: 1791152; location: work; project prefs: work
2008-03-30 14:17:27 [---] General prefs: from Einstein@Home (last modified 2007-12-07 10:01:47)
2008-03-30 14:17:27 [---] Host location: home
2008-03-30 14:17:27 [---] General prefs: using separate prefs for home
2008-03-30 14:17:27 [---] Preferences limit memory usage when active to 511.30MB
2008-03-30 14:17:27 [---] Preferences limit memory usage when idle to 511.30MB
2008-03-30 14:17:27 [---] Preferences limit disk usage to 4.92GB
2008-03-30 14:17:27 [---] Running CPU benchmarks
2008-03-30 14:17:30 [---] [benchmark_debug] Starting floating-point benchmark
2008-03-30 14:17:40 [---] [benchmark_debug] Ended floating-point benchmark
2008-03-30 14:17:46 [---] [benchmark_debug] Starting integer benchmark
2008-03-30 14:17:55 [---] [benchmark_debug] Ended integer benchmark
2008-03-30 14:17:59 [---] [benchmark_debug] Ended benchmark
2008-03-30 14:18:01 [---] [benchmark_debug] CPU 0 has finished
2008-03-30 14:18:01 [---] [benchmark_debug] 1 out of 1 CPUs done
2008-03-30 14:18:01 [---] [benchmark_debug] CPU 0: fp 1044264943.457189 int 1983678791.328194 intloops 29952000.000000 inttime 8.593750
2008-03-30 14:18:01 [---] Benchmark results:
2008-03-30 14:18:01 [---] Number of CPUs: 1
2008-03-30 14:18:01 [---] 1044 floating point MIPS (Whetstone) per CPU
2008-03-30 14:18:01 [---] 1984 integer MIPS (Dhrystone) per CPU
2008-03-30 14:18:09 [Einstein@Home] Restarting task h1_0907.30_S5R3__166_S5R3b_1 using einstein_S5R3 version 436
2008-03-30 14:21:45 [---] Exit requested by user

StartServiceCtrlDispatcher being called.
This may take several seconds. Please wait.
30-Mar-2008 14:22:49 [---] Starting BOINC client version 5.10.45 for windows_intelx86
30-Mar-2008 14:22:49 [---] log flags: task, file_xfer, sched_ops, benchmark_debug
30-Mar-2008 14:22:49 [---] Libraries: libcurl/7.18.0 OpenSSL/0.9.8e zlib/1.2.3
30-Mar-2008 14:22:49 [---] Executing as a daemon
30-Mar-2008 14:22:49 [---] Data directory: C:\Program Files\BOINC
30-Mar-2008 14:22:49 [---] BOINC is running as a service and as a non-system user.
30-Mar-2008 14:22:49 [---] No application graphics will be available.
30-Mar-2008 14:22:49 [Einstein@Home] Found app_info.xml; using anonymous platform
30-Mar-2008 14:22:49 [SETI@home] Found app_info.xml; using anonymous platform
30-Mar-2008 14:22:49 [---] Processor: 1 GenuineIntel Intel(R) Pentium(R) 4 CPU 2.00GHz [x86 Family 15 Model 2 Stepping 4]
30-Mar-2008 14:22:49 [---] Processor features: fpu tsc sse sse2 mmx
30-Mar-2008 14:22:49 [---] OS: Microsoft Windows XP: Home Edition, Service Pack 2, (05.01.2600.00)
30-Mar-2008 14:22:49 [---] Memory: 511.30 MB physical, 1.22 GB virtual
30-Mar-2008 14:22:49 [---] Disk: 37.24 GB total, 4.83 GB free
30-Mar-2008 14:22:49 [---] Local time is UTC +1 hours
30-Mar-2008 14:22:49 [---] Version change (5.10.13 -> 5.10.45)
30-Mar-2008 14:22:49 [Einstein@Home] URL: http://einstein.phys.uwm.edu/; Computer ID: 1036916; location: home; project prefs: default
30-Mar-2008 14:22:49 [SETI@home] URL: http://setiathome.berkeley.edu/; Computer ID: 1791152; location: work; project prefs: work
30-Mar-2008 14:22:49 [---] General prefs: from Einstein@Home (last modified 07-Dec-2007 10:01:47)
30-Mar-2008 14:22:50 [---] Host location: home
30-Mar-2008 14:22:50 [---] General prefs: using separate prefs for home
30-Mar-2008 14:22:50 [---] Preferences limit memory usage when active to 511.30MB
30-Mar-2008 14:22:50 [---] Preferences limit memory usage when idle to 511.30MB
30-Mar-2008 14:22:50 [---] Preferences limit disk usage to 4.83GB
30-Mar-2008 14:22:50 [---] Running CPU benchmarks
30-Mar-2008 14:22:53 [---] [benchmark_debug] Starting floating-point benchmark
30-Mar-2008 14:23:02 [---] [benchmark_debug] Ended floating-point benchmark
30-Mar-2008 14:23:07 [---] [benchmark_debug] Starting integer benchmark
30-Mar-2008 14:23:17 [---] [benchmark_debug] Ended integer benchmark
30-Mar-2008 14:23:21 [---] [benchmark_debug] Ended benchmark
30-Mar-2008 14:23:23 [---] [benchmark_debug] CPU 0 has finished
30-Mar-2008 14:23:23 [---] [benchmark_debug] 1 out of 1 CPUs done
30-Mar-2008 14:23:23 [---] [benchmark_debug] CPU 0: fp 1030769230.769231 int 1932793606.913454 intloops 31200000.000000 inttime 9.187500
30-Mar-2008 14:23:23 [---] Benchmark results:
30-Mar-2008 14:23:23 [---] Number of CPUs: 1
30-Mar-2008 14:23:23 [---] 1031 floating point MIPS (Whetstone) per CPU
30-Mar-2008 14:23:23 [---] 1933 integer MIPS (Dhrystone) per CPU
30-Mar-2008 14:23:32 [Einstein@Home] Restarting task h1_0907.30_S5R3__166_S5R3b_1 using einstein_S5R3 version 436
30-Apr-2008 14:24:32 [---] Running CPU benchmarks
30-Apr-2008 14:24:32 [---] Suspending computation - running CPU benchmarks
30-Apr-2008 14:24:35 [---] [benchmark_debug] Starting floating-point benchmark
30-Apr-2008 14:25:08 [---] [benchmark_debug] Ended floating-point benchmark
30-Apr-2008 14:25:10 [---] [benchmark_debug] Starting integer benchmark
30-Apr-2008 14:25:12 [---] [benchmark_debug] Ended integer benchmark
30-Apr-2008 14:25:14 [---] [benchmark_debug] Ended benchmark
30-Apr-2008 14:25:16 [---] [benchmark_debug] CPU 0 has finished
30-Apr-2008 14:25:16 [---] [benchmark_debug] 1 out of 1 CPUs done
30-Apr-2008 14:25:16 [---] [benchmark_debug] CPU 0: fp 1032357177.148344 int 1912681740.570187 intloops 5776000.000000 inttime 1.718750
30-Apr-2008 14:25:16 [---] Benchmark results:
30-Apr-2008 14:25:16 [---] Number of CPUs: 1
30-Apr-2008 14:25:16 [---] 1032 floating point MIPS (Whetstone) per CPU
30-Apr-2008 14:25:16 [---] 1913 integer MIPS (Dhrystone) per CPU
30-Apr-2008 14:25:17 [---] Resuming computation
30-Mar-2008 14:27:00 [---] Running CPU benchmarks
30-Mar-2008 14:27:00 [---] Suspending computation - running CPU benchmarks
30-Mar-2008 14:30:04 [---] Exit requested by user

StartServiceCtrlDispatcher being called.
This may take several seconds. Please wait.
30-Mar-2008 14:30:06 [---] Starting BOINC client version 5.10.45 for windows_intelx86
30-Mar-2008 14:30:06 [---] log flags: task, file_xfer, sched_ops, benchmark_debug
30-Mar-2008 14:30:06 [---] Libraries: libcurl/7.18.0 OpenSSL/0.9.8e zlib/1.2.3
30-Mar-2008 14:30:06 [---] Executing as a daemon
30-Mar-2008 14:30:06 [---] Data directory: C:\Program Files\BOINC
30-Mar-2008 14:30:06 [---] BOINC is running as a service and as a non-system user.
30-Mar-2008 14:30:06 [---] No application graphics will be available.
30-Mar-2008 14:30:06 [Einstein@Home] Found app_info.xml; using anonymous platform
30-Mar-2008 14:30:06 [SETI@home] Found app_info.xml; using anonymous platform
30-Mar-2008 14:30:07 [---] Processor: 1 GenuineIntel Intel(R) Pentium(R) 4 CPU 2.00GHz [x86 Family 15 Model 2 Stepping 4]
30-Mar-2008 14:30:07 [---] Processor features: fpu tsc sse sse2 mmx
30-Mar-2008 14:30:07 [---] OS: Microsoft Windows XP: Home Edition, Service Pack 2, (05.01.2600.00)
30-Mar-2008 14:30:07 [---] Memory: 511.30 MB physical, 1.22 GB virtual
30-Mar-2008 14:30:07 [---] Disk: 37.24 GB total, 4.84 GB free
30-Mar-2008 14:30:07 [---] Local time is UTC +1 hours
30-Mar-2008 14:30:07 [Einstein@Home] URL: http://einstein.phys.uwm.edu/; Computer ID: 1036916; location: home; project prefs: default
30-Mar-2008 14:30:07 [SETI@home] URL: http://setiathome.berkeley.edu/; Computer ID: 1791152; location: work; project prefs: work
30-Mar-2008 14:30:07 [---] General prefs: from Einstein@Home (last modified 07-Dec-2007 10:01:47)
30-Mar-2008 14:30:07 [---] Host location: home
30-Mar-2008 14:30:07 [---] General prefs: using separate prefs for home
30-Mar-2008 14:30:07 [---] Preferences limit memory usage when active to 511.30MB
30-Mar-2008 14:30:07 [---] Preferences limit memory usage when idle to 511.30MB
30-Mar-2008 14:30:07 [---] Preferences limit disk usage to 4.84GB
30-Mar-2008 14:30:07 [---] Running CPU benchmarks
30-Mar-2008 14:30:10 [---] [benchmark_debug] Starting floating-point benchmark
30-Mar-2008 14:30:20 [---] [benchmark_debug] Ended floating-point benchmark
30-Mar-2008 14:30:26 [---] [benchmark_debug] Starting integer benchmark
30-Mar-2008 14:30:36 [---] [benchmark_debug] Ended integer benchmark
30-Mar-2008 14:30:38 [---] [benchmark_debug] Ended benchmark
30-Mar-2008 14:30:40 [---] [benchmark_debug] CPU 0 has finished
30-Mar-2008 14:30:40 [---] [benchmark_debug] 1 out of 1 CPUs done
30-Mar-2008 14:30:40 [---] [benchmark_debug] CPU 0: fp 1029145728.643216 int 1641931187.505236 intloops 26640000.000000 inttime 9.234375
30-Mar-2008 14:30:40 [---] Benchmark results:
30-Mar-2008 14:30:40 [---] Number of CPUs: 1
30-Mar-2008 14:30:40 [---] 1029 floating point MIPS (Whetstone) per CPU
30-Mar-2008 14:30:40 [---] 1642 integer MIPS (Dhrystone) per CPU
30-Mar-2008 14:30:43 [Einstein@Home] Restarting task h1_0907.30_S5R3__166_S5R3b_1 using einstein_S5R3 version 436
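
For anyone sifting through a long log for the same symptom, the forward and backward clock jumps can be found mechanically. The following is only a sketch in Python, not anything BOINC ships: the function names and the one-hour threshold are assumptions made for illustration, and the two timestamp layouts are simply the ones that appear in the excerpts above.

```python
from datetime import datetime, timedelta

# The two timestamp layouts seen in the log excerpts above.
# Note: %b (month abbreviation) assumes an English/C locale.
_FORMATS = ("%Y-%m-%d %H:%M:%S", "%d-%b-%Y %H:%M:%S")


def parse_timestamp(line):
    """Return the leading timestamp of a log line, or None if it has none."""
    parts = line.split(" ", 2)
    if len(parts) < 2:
        return None
    stamp = parts[0] + " " + parts[1]
    for fmt in _FORMATS:
        try:
            return datetime.strptime(stamp, fmt)
        except ValueError:
            continue
    return None


def find_clock_jumps(lines, threshold=timedelta(hours=1)):
    """Return (previous, current) timestamp pairs whose gap exceeds
    the threshold in either direction. The threshold is an arbitrary
    illustrative choice."""
    jumps = []
    previous = None
    for line in lines:
        current = parse_timestamp(line)
        if current is None:
            continue  # banner lines such as "StartServiceCtrlDispatcher..."
        if previous is not None and abs(current - previous) > threshold:
            jumps.append((previous, current))
        previous = current
    return jumps
```

A pair where the second timestamp is earlier than the first marks the moment the clock was moved back, which in the logs above is immediately followed by the benchmark that never finishes.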
ID: 16133 · Report as offensive
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5077
United Kingdom
Message 16137 - Posted: 30 Mar 2008, 14:43:33 UTC - in response to Message 16134.  

Use a utility like LClock (LonghornClock) and you have a clock in place of the regular one, with a calendar function and sans risk of changing the real clock inadvertently. It's an additional menu option to get through or 1 click, Calendar, double click, the time setting applet. Been using it now on XP for a few years.

The BOINC/sciences hate forward and backward clock settings especially. Given that the benchmark is scheduled to run every 5 days, that is one of the bonus reactions you get changing the date a month and more. Also, the sciences look at run times, and if they go negative, you're in for a treat and you found one that requires manual intervention.

Yes, yes, yes, but.......

Can you force every Windows XP user to install a read-only clock/calendar? (At least this problem will go away when Vista is universal on the desktop). And can you train every Windows XP user to cancel out of the existing clock, every time they check a date? (When I hit this problem in the real world, with a telesales database, the problem went away when we upgraded from Windows 98 to domain-controlled Windows 2000 for the sales floor, and gave the users restricted rights. But the boss's orders were still all over the place, because his logon had to have administrative rights and could change the clock. But I digress).

The BUG is that benchmarking starts, but doesn't complete. There should be no way that that can happen, period. It's called fault-tolerance, and it should apply even when the 'fault' is a naïve user.
ID: 16137 · Report as offensive
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5077
United Kingdom
Message 16139 - Posted: 30 Mar 2008, 15:54:46 UTC - in response to Message 16138.  

Just offered a temporary solution. You can do what you please with it.

That's fine. We solved the problem brought to us by the original SETI user. The logs I posted from my own machine were self-inflicted in the interests of research.

As an amateur programmer myself, I know all the mantras which programmers utter when a bug report comes in, and I've used most of them myself.

"Is it reproducible?"
"What did the message actually say?"
"Are you using the latest version?"
"What were you doing at the time?"
"Is your antivirus up-to-date?"
"Have you applied the latest service pack?"
"Oh no, our program could never do that." (often false)
"Is your video/printer driver up-to-date?"
"Why on earth do you want to do that?"
etc.
etc.

I just think that it's grown-up and responsible to try to work through as many as possible of them before I open a trac ticket.
ID: 16139 · Report as offensive
John McLeod VII
Joined: 29 Aug 05
Posts: 147
Message 16140 - Posted: 30 Mar 2008, 16:05:40 UTC

Hmm. This is not the first time that clock changes have caused problems.

I know that under windows there is a function "GetTickCount" that returns the number of ticks since the machine started rather than the absolute time.
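
To make the distinction concrete, here is a minimal sketch in Python rather than the Win32 API. Python's time.monotonic() plays the same role as a tick counter: it cannot be set by the user and never runs backwards. The ElapsedTimer class is a hypothetical name for illustration, not something from the BOINC source. (One caveat with GetTickCount itself: it returns a 32-bit millisecond count, so it wraps after about 49.7 days of uptime.)

```python
import time


class ElapsedTimer:
    """Measure an interval using a clock the user cannot reset."""

    def __init__(self, clock=time.monotonic):
        # time.monotonic() never goes backwards and is unaffected by
        # changes to the system date/time. Passing time.time instead
        # would reintroduce the wall-clock problem.
        self._clock = clock
        self._start = clock()

    def elapsed(self):
        """Seconds since the timer was created, according to the clock."""
        return self._clock() - self._start
```

With a wall clock in place of the tick counter, moving the date back a month makes elapsed() go hugely negative; with the monotonic clock it cannot.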

BOINC WIKI
ID: 16140 · Report as offensive
Alinator

Joined: 8 Jan 06
Posts: 36
United States
Message 16141 - Posted: 30 Mar 2008, 16:46:44 UTC - in response to Message 16137.  



<snip>

The BUG is that benchmarking starts, but doesn't complete. There should be no way that that can happen, period. It's called fault-tolerance, and it should apply even when the 'fault' is a naïve user.


It's not just the benchmark which has problems if you go farther than the next run time for it in the future and then back.

Everything stops for whatever the time interval was for the jump when you go back. The only exception found was that OS initiated DST changes are handled properly.

IOW, patching it so the benchmark completes once it starts, regardless of anything else that happens with the clock, won't fix the problem entirely.

In addition, my observation when I was working the problem with a different user over in SAH is that the 'damage' to the time metrics comes from the leap forward, which creates a big, seemingly idle period BOINC cannot account for. Fortunately, when the jump back occurs it doesn't interpret this as miraculously somehow having amplified its computational abilities and set the metrics according to that! That probably explains why it resorts to just suspending everything until the 'mystery' period goes away.

Alinator
ID: 16141 · Report as offensive
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5077
United Kingdom
Message 16142 - Posted: 30 Mar 2008, 17:22:45 UTC - in response to Message 16141.  



<snip>

The BUG is that benchmarking starts, but doesn't complete. There should be no way that that can happen, period. It's called fault-tolerance, and it should apply even when the 'fault' is a naïve user.


It's not just the benchmark which has problems if you go farther than the next run time for it in the future and then back.

Everything stops for whatever the time interval was for the jump when you go back. The only exception found was that OS initiated DST changes are handled properly.

IOW, patching it so the benchmark completes once it starts, regardless of anything else that happens with the clock, won't fix the problem entirely.

In addition, my observation when I was working the problem with a different user over in SAH is that the 'damage' to the time metrics comes from the leap forward, which creates a big, seemingly idle period BOINC cannot account for. Fortunately, when the jump back occurs it doesn't interpret this as miraculously somehow having amplified its computational abilities and set the metrics according to that! That probably explains why it resorts to just suspending everything until the 'mystery' period goes away.

Alinator

Also noted. Since performing self-sacrifice in the name of science, I'm getting

<active_frac>0.045279</active_frac>

This machine runs 24/7/365.
ID: 16142 · Report as offensive
W-K ID 666

Joined: 30 Dec 05
Posts: 456
United Kingdom
Message 16274 - Posted: 1 Apr 2008, 4:34:51 UTC

Does this http://support.microsoft.com/kb/307897 help? And could it be incorporated into BOINC so that the client is in time sync with the project servers?
ID: 16274 · Report as offensive
Nicolas

Joined: 19 Jan 07
Posts: 1179
Argentina
Message 16275 - Posted: 1 Apr 2008, 4:44:22 UTC - in response to Message 16274.  

Does this http://support.microsoft.com/kb/307897 help? And could it be incorporated into BOINC so that the client is in time sync with the project servers?

That article is about synchronizing the system clock. BOINC doesn't and shouldn't have permissions to change the system-wide clock.
ID: 16275 · Report as offensive
W-K ID 666

Joined: 30 Dec 05
Posts: 456
United Kingdom
Message 16276 - Posted: 1 Apr 2008, 5:38:50 UTC - in response to Message 16275.  
Last modified: 1 Apr 2008, 5:39:11 UTC

Does this http://support.microsoft.com/kb/307897 help? And could it be incorporated into BOINC so that the client is in time sync with the project servers?

That article is about synchronizing the system clock. BOINC doesn't and shouldn't have permissions to change the system-wide clock.

I can see why in some situations it should not be tolerated, but why not have an option to sync the clock to the project clock?
Win XP has an option: bring up the clock date/time properties and go to the Internet Time tab. But this option only syncs once a week or manually, and will not adjust if the clock is more than 15 hrs wrong.
ID: 16276 · Report as offensive
MikeMarsUK

Joined: 16 Apr 06
Posts: 386
United Kingdom
Message 16279 - Posted: 1 Apr 2008, 7:58:04 UTC


Basically your suggestion is:

* Store a time-offset for each project in client_state.xml (i.e., the offset needed from system time to match project time)

* Periodically, and also whenever the system clock changes significantly, update the offset from a project NTP server.
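
The two bullets above could be sketched roughly as follows. This is a hypothetical illustration in Python, not BOINC code: the ProjectClock class and its method names are invented, the offsets live in memory rather than in client_state.xml, and how the project's time is actually obtained (NTP or otherwise) is left out entirely.

```python
import time


class ProjectClock:
    """Keep a per-project correction to the system clock (hypothetical)."""

    def __init__(self, system_clock=time.time):
        self._system_clock = system_clock
        self._offsets = {}  # project URL -> seconds to add to system time

    def update_offset(self, project_url, project_time):
        """Store the offset implied by a time reported by the project."""
        self._offsets[project_url] = project_time - self._system_clock()

    def now(self, project_url):
        """System time corrected by the stored offset (0 if none stored)."""
        return self._system_clock() + self._offsets.get(project_url, 0.0)
```

The weak point of the idea is visible in the sketch: between a clock change and the next update_offset() call, now() is just as wrong as the system clock, so the client would still need some way to notice that the clock has changed significantly.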


ID: 16279 · Report as offensive
W-K ID 666

Joined: 30 Dec 05
Posts: 456
United Kingdom
Message 16280 - Posted: 1 Apr 2008, 8:51:42 UTC - in response to Message 16279.  


Basically your suggestion is:

* Store a time-offset for each project in client_state.xml (i.e., the offset needed from system time to match project time)

* Periodically, and also whenever the system clock changes significantly, update the offset from a project NTP server.


Yes, to be blunt.

There is no real need for times in BOINC manager to be anything other than UTC, except for the saving of files in local time. All the web pages run on UTC only.

And me being me, having had 25 years in worldwide military comms, running everything in Zulu time is second nature, so if BOINC went UTC everywhere, I would turn off the switch to BST, and run UTC all year.
ID: 16280 · Report as offensive
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5077
United Kingdom
Message 16282 - Posted: 1 Apr 2008, 9:38:03 UTC

Guys, guys, ....

There are going to be problems with time recording for as long as we have clocks and different timezones. We can argue till the cows come home (and what time is that? Do cows know about DST?) whether it is the duty of BOINC and other software to ignore, correct, notify or work round such errors as it finds.

But while - or preferably before - we discuss all these arcana, there's a bug to be fixed.

It seems that, under certain circumstances as documented in my trac ticket, BOINC just STOPS. Period. That isn't on my list of acceptable responses to a time glitch.

It appears that there's a code sequence along the lines of

IF <weird time detected>
SUSPEND computing for benchmarking
WAIT <for something that isn't going to happen>
START <floating point benchmark>

I'm wondering if this is something that was introduced round about v5.8.16:
- core client: if benchmark time is in the future (due to user tweak) always run benchmarks

- if it came in with a fix like that, it might explain why it was missed in pre-release testing.
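
For illustration only, a "benchmarks are due" check that follows the quoted changelog line might look like the Python sketch below. It is a guess at the shape of the logic, not the BOINC source: the function name is invented, and the five-day period is taken from the figure mentioned earlier in this thread.

```python
# "Every 5 days", as stated earlier in the thread (assumed, in seconds).
BENCHMARK_PERIOD = 5 * 86400


def should_run_benchmarks(now, last_run, period=BENCHMARK_PERIOD):
    """Decide whether CPU benchmarks are due.

    A last-run time in the future means the clock has been moved back
    since the benchmarks last ran, so they are treated as due.
    """
    if last_run > now:
        return True
    return now - last_run >= period
```

The point of writing it this way is that the decision depends only on the two timestamps passed in and never waits for the wall clock to reach some value. Once the benchmarks finish and last_run is overwritten with the current (moved-back) time, the check settles down again. Whatever the real client does, the hang suggests that something after this decision is still waiting on a time that the clock change has pushed a month away.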

Anyone got a better idea?
ID: 16282 · Report as offensive
MikeMarsUK

Joined: 16 Apr 06
Posts: 386
United Kingdom
Message 16326 - Posted: 1 Apr 2008, 21:15:07 UTC - in response to Message 16282.  
Last modified: 1 Apr 2008, 21:21:06 UTC

...
But while - or preferably before - we discuss all these arcana, there's a bug to be fixed.
...


But that's exactly what we were talking about. If the time checking in this benchmark processing used the offset + system clock, rather than just the system clock on its own, then the bug would be solved.


IF <weird time detected>
Update offset from project NTP server
SUSPEND computing for benchmarking
WAIT should now work since the system can work out the correct time from clock+offset
START <floating point benchmark>


It'd be a significant design change, and everywhere which referred to time would need to refer to the adjusted time instead.

If the benchmark bug was fixed in isolation, you'd then get the remaining problems which are just as bad - the DCF goes haywire, and processing gets suspended for a month or whatever, which we've seen several times before with older versions. The offset would solve these other bugs simultaneously.
ID: 16326 · Report as offensive
Nicolas

Joined: 19 Jan 07
Posts: 1179
Argentina
Message 16328 - Posted: 1 Apr 2008, 22:14:40 UTC - in response to Message 16276.  

Does this http://support.microsoft.com/kb/307897 help? And could it be incorporated into BOINC so that the client is in time sync with the project servers?

That article is about synchronizing the system clock. BOINC doesn't and shouldn't have permissions to change the system-wide clock.

I can see why in some situations it should not be tolerated, but why not have an option to sync the clock to the project clock?

Because it's none of BOINC's business to keep the system clock correct, and because NTP servers are way more accurate than project servers.

Since version 6, BOINC will get installed under its own account with very low privileges, definitely not enough to change the clock (which is an admin privilege).
ID: 16328 · Report as offensive
John McLeod VII
Joined: 29 Aug 05
Posts: 147
Message 16375 - Posted: 3 Apr 2008, 0:25:57 UTC - in response to Message 16328.  

Does this http://support.microsoft.com/kb/307897 help? And could it be incorporated into BOINC so that the client is in time sync with the project servers?

That article is about synchronizing the system clock. BOINC doesn't and shouldn't have permissions to change the system-wide clock.

I can see why in some situations it should not be tolerated, but why not have an option to sync the clock to the project clock?

Because it's none of BOINC's business to keep the system clock correct, and because NTP servers are way more accurate than project servers.

Since version 6, BOINC will get installed under its own account with very low privileges, definitely not enough to change the clock (which is an admin privilege).

Actually, under windows, BOINC probably actually DOES have the permissions to set the system clock. Not that it has any business doing so.

BOINC WIKI
ID: 16375 · Report as offensive
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5077
United Kingdom
Message 16378 - Posted: 3 Apr 2008, 8:39:31 UTC - in response to Message 16328.  

.... and because NTP servers are way more accurate than project servers.

Which raises another question: why would anyone run a BOINC server which isn't set up for automatic synchronisation with an NTP server?

Matt Lebofsky at SETI occasionally forgets when he's setting up a new web server made up of cannibalised/donated parts, and it soon becomes obvious - the latest post on a message board is some time in the future (SETI has separate servers for the database and the web front end). But he always corrects it as soon as he notices or I point it out ;-)
ID: 16378 · Report as offensive
W-K ID 666

Joined: 30 Dec 05
Posts: 456
United Kingdom
Message 16379 - Posted: 3 Apr 2008, 10:27:22 UTC

Considering the results of the BOINC survey:
Most computers are at home.
90%+ run Windows, and if enabled the clock is only checked against an NTP server once a week.
A lot are on 24/7.
Therefore by default, even if running BOINC projects was not the original objective, BOINC is the computer's primary function. Therefore why not let them use a BOINC project to check, and if desired reset, the clock?

My son's old P4 computer had a regular habit of corrupting its BIOS; luckily it was one with a backup copy in ROM. But when it did this it reverted to the default settings, date 01/01/1980. And if set to re-boot on resumption of power, after a power break, it could easily go unnoticed for days. The clock is not that important to a teenager playing games, but it is to the BOINC client.
ID: 16379 · Report as offensive
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5077
United Kingdom
Message 16381 - Posted: 3 Apr 2008, 12:01:13 UTC

Another selling point for BOINC! Keeps your computer running smoothly and safely.

(because, IIRC, Windows update fails if the local computer clock is sufficiently different from the Microsoft servers' estimation of time).
ID: 16381 · Report as offensive
W-K ID 666

Joined: 30 Dec 05
Posts: 456
United Kingdom
Message 16393 - Posted: 3 Apr 2008, 14:31:22 UTC - in response to Message 16390.  

Therefore by default, even if running BOINC projects was not the original objective, BOINC is the computer's primary function. Therefore why not let them use a BOINC project to check, and if desired reset, the clock?


You seem to be saying give them an option. Then some will use the feature while others do not. Those that don't will still have the problem. Better to implement a solution that solves the problem for everybody but does not mess with the clock.


I agree fixing this problem may be the best option for this problem, but having the computer clock wrong also affects the scheduler. Therefore having the ability to identify that the clock is wrong, and either posting a warning or having an option to allow the clock to be corrected automatically, would be beneficial.
ID: 16393 · Report as offensive
Pepo
Joined: 3 Apr 06
Posts: 547
Slovakia
Message 16409 - Posted: 3 Apr 2008, 21:16:18 UTC - in response to Message 16406.  

Although it would take longer to implement, the better (more reliable) and politically safer fix is the other fix that has been suggested in this thread wherein BOINC, if I understand correctly, would keep its own "clock" and leave the system clock alone.

I suppose that's what is being referred to as a monotonic clock. I suspect that only the OS is capable of delivering such clock values correctly (be it e.g. the GetTickCount() mentioned by JM7, translated to a date/time value, keeping in mind e.g. hibernation times etc.)
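
A rough sketch of "tick count translated to a date/time value", in Python and purely as an illustration: read the wall clock once, then derive every later time from a monotonic counter. The SteadyClock name is invented, and the sketch assumes the wall clock was right at the moment it was read.

```python
import time
from datetime import datetime, timedelta, timezone


class SteadyClock:
    """Date/time derived from one wall-clock reading plus monotonic ticks."""

    def __init__(self, wall=time.time, ticks=time.monotonic):
        self._ticks = ticks
        # The wall clock is read exactly once; later changes to the
        # system date/time are never seen.
        self._anchor_wall = wall()
        self._anchor_ticks = ticks()

    def now(self):
        """Anchor time plus elapsed monotonic seconds, as a UTC datetime."""
        elapsed = self._ticks() - self._anchor_ticks
        anchor = datetime.fromtimestamp(self._anchor_wall, tz=timezone.utc)
        return anchor + timedelta(seconds=elapsed)
```

The hibernation point still applies: whether a monotonic counter keeps advancing while the machine is suspended depends on the platform, so a clock built this way can fall behind real time after a sleep and would need re-anchoring.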

Peter
ID: 16409 · Report as offensive

Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.