Posts by gwg

1) Message boards : BOINC client : BOINC Client Timing Bug (Message 14618)
Posted 3 Jan 2008 by gwg
Post:
It is generally something external to the BOINC client causing this, for example a virus scanner or indexing program looking at the file.


Yes. That would explain it.

Thanks.

George
2) Message boards : BOINC client : BOINC Client Timing Bug (Message 14603)
Posted 2 Jan 2008 by gwg
Post:
I am certain that I have seen this discussed before, but a search on errors didn't come up with anything.

Occasionally, the BOINC Client is unable to write its state file because the rename of client_state_next.xml to client_state.xml fails. This happens no matter which of the applications I am running under the BOINC Client is active, and seems not to cause any permanent harm. However, if the problem is allowed to remain, it could cause real problems in some future release.

I believe it to be a timing error in that BOINC fails somehow to wait for a file to close before attempting a rename. Details are as follows:

Hardware: Dual 1GHz PowerPC G4, Model PowerMac3,6, with 2GB of RAM.

System: Mac OS X 10.4.11

BOINC Client Version: 5.10.30

Extracted results from Message tables:

Mon 31 Dec 17:24:53 2007|World Community Grid|Finished download of faah3105_NSC79594_2CF3s_MIN_xmd04790_08_NSC79594_2CF3s_MIN.pdbqt
Mon 31 Dec 17:30:29 2007||Can't rename client_state_next.xml to client_state.xml; check file and directory permissions
Mon 31 Dec 17:30:29 2007||rename client_state_next.xml to client_state.xml returned error 2: No such file or directory
Mon 31 Dec 17:30:29 2007||[error] Couldn't write state file: system rename
Mon 31 Dec 17:38:35 2007|Einstein@Home|Resuming task h1_0630.95_S5R2__16_S5R3a_1 using einstein_S5R3 version 403
...
Mon 31 Dec 22:26:13 2007|Einstein@Home|Resuming task h1_0630.95_S5R2__16_S5R3a_1 using einstein_S5R3 version 403
Mon 31 Dec 22:29:08 2007||Contacting account manager at http://bam.boincstats.com/
Mon 31 Dec 22:29:11 2007||Account manager: BAM Host-ID: 72506
Mon 31 Dec 22:29:11 2007||Account manager contact succeeded
Mon 31 Dec 23:19:16 2007||Can't rename client_state_next.xml to client_state.xml; check file and directory permissions
Mon 31 Dec 23:19:16 2007||rename client_state_next.xml to client_state.xml returned error 2: No such file or directory
Mon 31 Dec 23:19:16 2007||[error] Couldn't write state file: system rename
Mon 31 Dec 23:26:53 2007|SETI@home|Resuming task 17no06ad.1385.2246498.13.6.213_0 using setiathome_enhanced version 528
...
Tue  1 Jan 12:48:37 2008|Einstein@Home|Resuming task h1_0631.15_S5R2__125_S5R3a_2 using einstein_S5R3 version 403
Tue  1 Jan 12:49:09 2008|ABC@home|Resuming task abc_wu_2684523099000_2805000_0 using abc-finder version 103
Tue  1 Jan 13:46:04 2008||Can't rename client_state_next.xml to client_state.xml; check file and directory permissions
Tue  1 Jan 13:46:04 2008||rename client_state_next.xml to client_state.xml returned error 2: No such file or directory
Tue  1 Jan 13:46:04 2008||[error] Couldn't write state file: system rename
Tue  1 Jan 13:56:24 2008|SZTAKI Desktop Grid|Resuming task aec07ca4-7e23-4196-8945-57ec6eb8ea76_920fed37-af38-4b46-a838-ae1140443927_269530_2 using search version 206
...
Wed  2 Jan 00:34:47 2008|SETI@home|Resuming task 11dc06ad.20620.15024.8.6.9_2 using setiathome_enhanced version 528
Wed  2 Jan 00:44:54 2008||Can't rename client_state_next.xml to client_state.xml; check file and directory permissions
Wed  2 Jan 00:44:54 2008||rename client_state_next.xml to client_state.xml returned error 2: No such file or directory
Wed  2 Jan 00:44:54 2008||[error] Couldn't write state file: system rename
Wed  2 Jan 00:47:56 2008|SZTAKI Desktop Grid|Sending scheduler request: To fetch work.  Requesting 21 seconds of work, reporting 2 completed tasks
...
Wed  2 Jan 08:59:57 2008|SZTAKI Desktop Grid|Starting task aec07ca4-7e23-4196-8945-57ec6eb8ea76_ac760f0a-7598-4d38-ada7-d2e01e6bc73d_277568_1 using search version 206
Wed  2 Jan 09:05:54 2008||Can't rename client_state_next.xml to client_state.xml; check file and directory permissions
Wed  2 Jan 09:05:54 2008||rename client_state_next.xml to client_state.xml returned error 2: No such file or directory
Wed  2 Jan 09:05:54 2008||[error] Couldn't write state file: system rename
Wed  2 Jan 09:24:29 2008|SZTAKI Desktop Grid|Computation for task aec07ca4-7e23-4196-8945-57ec6eb8ea76_00197687-5b81-4992-9efb-d8c4bc1006d3_275395_3 finished
...
Wed  2 Jan 10:00:04 2008|World Community Grid|Resuming task faah3105_NSC79594_2CF3s_MIN_xmd04790_08_0 using faah version 542
Wed  2 Jan 10:13:04 2008||Can't rename client_state_next.xml to client_state.xml; check file and directory permissions
Wed  2 Jan 10:13:04 2008||rename client_state_next.xml to client_state.xml returned error 2: No such file or directory
Wed  2 Jan 10:13:04 2008||[error] Couldn't write state file: system rename
Wed  2 Jan 10:25:31 2008|SETI@home|Resuming task 09no06aj.11199.155408.7.6.8_0 using setiathome_enhanced version 528


As you can see, one can get such an error with any of the five applications I am running under BOINC Client (ABC@home, Einstein@Home, SETI@home, SZTAKI Desktop Grid, World Community Grid). Moreover, these errors have kept recurring across several OS updates and several BOINC Client updates over many months.
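For anyone who wants to check the same pattern in their own message log, here is a minimal Python sketch. It assumes the `timestamp|project|message` layout shown in the extract above and an English locale for the day and month names. The function names are my own and are not part of BOINC. It only records which project logged the last message before each failure, so it shows correlation and not cause.

```python
from collections import Counter
from datetime import datetime

# The final line of each three-line error group in the extract above.
RENAME_ERROR = "[error] Couldn't write state file: system rename"

def parse_line(line):
    """Split 'timestamp|project|message' into (datetime, project, message)."""
    stamp, project, message = line.rstrip("\n").split("|", 2)
    # split()/join collapses the double space used for single-digit days,
    # e.g. 'Tue  1 Jan 12:48:37 2008'.
    when = datetime.strptime(" ".join(stamp.split()), "%a %d %b %H:%M:%S %Y")
    return when, project, message

def rename_failures(lines):
    """Yield (time, project that logged the last message before the failure)."""
    last_project = None
    for line in lines:
        if line.count("|") < 2:
            continue  # skip '...' elisions and blank lines
        when, project, message = parse_line(line)
        if message == RENAME_ERROR:
            yield when, last_project
        elif project:
            last_project = project

def failures_by_project(lines):
    """Count rename failures per preceding project."""
    return Counter(project for _, project in rename_failures(lines))
```

Feeding the lines of the message log to `failures_by_project` gives a count per project, which makes it easy to see whether the failures cluster around one application or are spread across all of them.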

It is a BUG (Feature?) of the BOINC Client. Any possibility of someone finding the time to nut it out?
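To make the timing hypothesis concrete, here is a rough Python sketch of the ordering I would expect: write the new state to a temporary file, close it, and only then rename it over the current file, retrying briefly if the rename fails. This is not BOINC's actual code (the client is not written in Python). The function name, the `.next` suffix, and the retry parameters are all hypothetical. A retry would not help if the temporary file has genuinely vanished, which is what "error 2" suggests, so treat it purely as an illustration of the ordering.

```python
import os
import time

def write_state_file(path, data, retries=3, delay=0.5):
    """Write data to path via a temporary '<path>.next' file and a rename."""
    tmp_path = path + ".next"  # hypothetical name for the temporary file
    # Leaving the 'with' block closes the file, so the rename below is
    # never attempted while the temporary file is still open for writing.
    with open(tmp_path, "w", encoding="utf-8") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the data to disk
    for attempt in range(retries):
        try:
            os.replace(tmp_path, path)  # atomic on POSIX within one file system
            return True
        except OSError:
            if attempt == retries - 1:
                raise  # give up and let the caller see the error
            time.sleep(delay)
    return False  # only reached when retries is 0
```

The point of the design is that the old state file stays intact until the new one is complete, so a failed rename loses one update and does not corrupt anything.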

George
3) Message boards : BOINC client : Loss of a Project Due Corrupt Account File (Message 5454)
Posted 28 Aug 2006 by gwg
Post:
Fixed today (2006-08-28 03:30 UT), after I sent a detailed bug report to BAM 24 hours ago.

Thanks to the BAM maintainer.

George
------
4) Message boards : BOINC client : Loss of a Project Due Corrupt Account File (Message 5423)
Posted 24 Aug 2006 by gwg
Post:
Did you post this on the BAM forums as well? As to me it sounds more like a problem with BOINC Account Manager. BOINC doesn't overwrite much of the account_*.xml files, unless you re-attach to them.


Good idea. I'll do that.

George
------
5) Message boards : BOINC client : Loss of a Project Due Corrupt Account File (Message 5415)
Posted 23 Aug 2006 by gwg
Post:
I am running BOINC 5.4.9 on an Apple Dual 1 GHz PowerPC G4 processor using Mac OS X 10.4.7. Recently, I set up BOINCStats BAM to synchronise all four of my networked computers, so there may be some problem arising from the synchronisation, although I can't see anything wrong with the BAM data when I log in to BAM, and it has been running for several days now.

I am running four projects on this machine: Seti@Home, Einstein@Home, World Community Grid, and SZTAKI Desktop Grid. SZTAKI gets 50% of my resources and the others share the remaining 50%, but SZTAKI is usually timing out on CPU total time, or running late (as it is now). Thus, BOINC recently entered its procedure to try to reach the deadline for the SZTAKI Desktop Grid WU.

Last night sometime, I lost the Seti@Home project, so I did an orderly shutdown and rebooted my machine at around 17:00 my time.

I got the following relevant messages, indicating that account_setiathome.berkeley.edu.xml was corrupted:

------

Wed 23 Aug 17:04:35 2006||Starting BOINC client version 5.4.9 for powerpc-apple-darwin
Wed 23 Aug 17:04:35 2006||libcurl/7.15.3 OpenSSL/0.9.7i zlib/1.2.3
Wed 23 Aug 17:04:35 2006||Data directory: /Library/Application Support/BOINC Data
Wed 23 Aug 17:04:36 2006||Couldn't parse account file account_setiathome.berkeley.edu.xml
Wed 23 Aug 17:04:38 2006||Project for statistics file statistics_setiathome.berkeley.edu.xml not found - ignoring
Wed 23 Aug 17:04:40 2006|SETI@home|Project SETI@home is in state file but no account file found
Wed 23 Aug 17:04:40 2006||Application setiathome_enhanced outside project in state file
Wed 23 Aug 17:04:40 2006||File info outside project in state file
Wed 23 Aug 17:04:40 2006||File info outside project in state file
Wed 23 Aug 17:04:40 2006||File info outside project in state file
Wed 23 Aug 17:04:40 2006||File info outside project in state file
Wed 23 Aug 17:04:40 2006||File info outside project in state file
Wed 23 Aug 17:04:40 2006||File info outside project in state file
Wed 23 Aug 17:04:40 2006||File info outside project in state file
Wed 23 Aug 17:04:40 2006||File info outside project in state file
Wed 23 Aug 17:04:40 2006||Application version outside project in state file
Wed 23 Aug 17:04:40 2006||Workunit outside project in state file
Wed 23 Aug 17:04:40 2006||Task 19ap06aa.18683.9569.73576.3.134_2 outside project in state file
Wed 23 Aug 17:04:40 2006||State file error: project http://setiathome.berkeley.edu/ not found
Wed 23 Aug 17:04:40 2006||Processor: 2 Power Macintosh PowerMac3,6
Wed 23 Aug 17:04:40 2006||Memory: 1.00 GB physical, 0 bytes virtual
Wed 23 Aug 17:04:40 2006||Disk: 74.40 GB total, 12.32 GB free
Wed 23 Aug 17:04:41 2006|Einstein@Home|URL: http://einstein.phys.uwm.edu/; Computer ID: 414333; location: New:; project prefs: default
Wed 23 Aug 17:04:41 2006|SZTAKI Desktop Grid|URL: http://szdg.lpds.sztaki.hu/szdg/; Computer ID: 147117; location: home; project prefs: default
Wed 23 Aug 17:04:41 2006|World Community Grid|URL: http://www.worldcommunitygrid.org/; Computer ID: 55465; location: home; project prefs: home
Wed 23 Aug 17:04:41 2006||General prefs: from World Community Grid (last modified 2006-08-16 09:55:46)
Wed 23 Aug 17:04:41 2006||General prefs: using separate prefs for home
Wed 23 Aug 17:04:43 2006||Listening on port 31416
Wed 23 Aug 17:04:43 2006|World Community Grid|Resuming task faah0764_bdb190_mx1gnn_dry_01_1 using faah version 510
Wed 23 Aug 17:04:44 2006|SZTAKI Desktop Grid|Resuming task ab094285-a60b-418b-a548-bec0c0d6271d_05a8efd8-f727-4461-9d73-d691132f8248_1184_4 using search version 200
Wed 23 Aug 17:04:46 2006||Contacting account manager at http://bam.boincstats.com/
Wed 23 Aug 17:04:46 2006||Using earliest-deadline-first scheduling because computer is overcommitted.
Wed 23 Aug 17:04:46 2006||Suspending work fetch because computer is overcommitted.
Wed 23 Aug 17:06:47 2006||Project communication failed: attempting access to reference site
Wed 23 Aug 17:06:47 2006||Access to reference web site failed - check network connection or proxy configuration.
Wed 23 Aug 17:06:47 2006||Account manager error: http error

------

Indeed, the file appears to have been overwritten with the statistics_einstein.phys.uwm.edu.xml file. See the first few lines below:

account_setiathome.berkeley.edu.xml
-----------------------------------
<project_statistics>
<master_url>http://einstein.phys.uwm.edu/</master_url>
<daily_statistics>
<day>1153180800.000000</day>
<user_total_credit>68486.004429</user_total_credit>
<user_expavg_credit>360.044589</user_expavg_credit>
<host_total_credit>32747.211497</host_total_credit>
<host_expavg_credit>189.910476</host_expavg_credit>
</daily_statistics>
...
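A quick way to spot this kind of mix-up across the whole data directory is to look at the root element of every account file. The Python sketch below is only an illustration. It relies solely on what is visible in the excerpt above, namely that the damaged account file starts with `<project_statistics>`. The function names are my own, and I make no assumption about what a healthy account file should contain.

```python
import glob
import os
import xml.etree.ElementTree as ET

def root_tag(path):
    """Return the root element name of an XML file, or None if unparseable."""
    try:
        return ET.parse(path).getroot().tag
    except ET.ParseError:
        return None

def suspicious_account_files(data_dir):
    """List account_*.xml files that hold statistics data or do not parse."""
    pattern = os.path.join(glob.escape(data_dir), "account_*.xml")
    flagged = []
    for path in sorted(glob.glob(pattern)):
        tag = root_tag(path)
        if tag is None or tag == "project_statistics":
            flagged.append((os.path.basename(path), tag))
    return flagged
```

On this machine the directory to pass in would be the data directory reported in the startup messages, /Library/Application Support/BOINC Data.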

All other Seti@Home files appear to be in one piece. Now, my BOINC Manager cannot synchronise with BAM, and I can't figure out how to restore my Seti@Home Project.

Wed 23 Aug 17:19:19 2006||Contacting account manager at http://bam.boincstats.com/
Wed 23 Aug 17:21:19 2006||Account manager error: http error


Any help greatly appreciated.

George
------
6) Message boards : Server programs : BOINC Message Board Internationalisation (Message 5325)
Posted 15 Aug 2006 by gwg
Post:
I have been spending a lot of time recently going through the message boards for the SZTAKI Desktop Grid project, and while I know no Magyar, the Magyar text I see in Safari 2.0.4 on Mac OS X 10.4.7 is garbled, with some accented Latin characters being improperly displayed as up to three symbol characters. Changing the text encoding from its default (utf-8) to any Latin encoding makes little difference to the garbled characters.

The sequences of up to three symbol characters suggest that the BOINC Message interface is mistreating utf-8 octets, at least on the machine and version run for SZTAKI Desktop Grid. One message (somewhere on the SZTAKI Desktop Grid Message Board) also suggests this. Any possibility of a fix with the next upgrade?
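As an illustration of what I mean by mistreating utf-8 octets, the small Python sketch below encodes a character as UTF-8 and then decodes each octet as if it were a separate Latin-1 character. The helper name and the choice of Latin-1 as the wrong encoding are my own assumptions. I do not know which encoding the message board actually applies.

```python
def misdecode(text, wrong_encoding="latin-1"):
    """Encode text as UTF-8, then decode the octets with the wrong encoding."""
    return text.encode("utf-8").decode(wrong_encoding)

# 'é' and 'ő' take two octets in UTF-8, '€' takes three, so each one
# turns into two or three unrelated characters when mis-decoded.
for ch in ["é", "ő", "€"]:
    garbled = misdecode(ch)
    print(repr(ch), "->", len(garbled), "characters:", repr(garbled))
```

A single accented letter becoming two or three symbols matches what I see on the board, but this only shows the mechanism and does not prove where in the chain the damage happens.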
7) Message boards : BOINC client : exited with zero status but no 'finished' (Message 5324)
Posted 15 Aug 2006 by gwg
Post:
I have a Dual 1 GHz PowerPC G4 with 1GB SDRAM and am running Mac OS X 10.4.7. The BOINC client is v. 5.4.9, and projects are SETI@Home, Einstein@Home, World Community Grid, and SZTAKI Desktop Grid.

I have been having the same trouble, but with exits almost every hour.

Shortly before exiting with zero status but no 'finished', we get the following error messages:

Sun 13 Aug 11:43:27 2006 Can't rename state file: Error -1
Sun 13 Aug 11:43:27 2006 Couldn't write state file: system rename

so the cause is something else locking up the state file.

In checking the state files, I note that something is changing them to read only for the group (admin), and even if I change group status to 'read & write', it gets changed back 30 sec later. Interestingly, the BOINC Client is running in group 'wheel'.

The stderr message is "client exit because of no heartbeat". I haven't noticed it as a problem until upgrade to the latest version of OS X.

The anti-virus software is Norton Antivirus 10.1.1 (002), but it shouldn't be interfering with the file system unless I do a complete check on the disk, which hasn't happened for a while.

I have tried resetting projects, but it has made no difference.

I have now isolated Spotlight from the BOINC data folder, which should keep it from periodically sampling files, and hope that this does the trick. I'll report back later if the problem goes away. Otherwise, there is some sort of access or timing problem between the OS and the BOINC Client.


I have now reset all projects and reset all access rights on the BOINC Data folder to owner 'system', group 'admin', with read and write for owner and group, and read for others. I have given the BOINC application the same owner, group, and access rights, plus execute rights for all three classes of user. These seem to be the default rights for most applications, but BOINC originally had owner george (me) and group wheel.

I have also rechecked that neither Norton Antivirus nor Spotlight can access the BOINC Data folder. I restarted the BOINC Manager with the four projects mentioned above, and am still getting problems with heartbeat timeout after BOINC fails to rename the state file. This is obviously a BOINC synchronisation failure. It should be isolated, tracked down, and fixed for the next Mac OS X release.

At the moment, it doesn't seem to be doing too much harm, since it happens less than 20 times per day, although sometimes the state file access failure occurs several times in a few minutes until access and update is achieved. Thereafter, it seems to run for about 20, 30, or 60 minutes and sometimes much longer before failing again.

My concern arose because it seemed to be causing loss of computation, especially in SZTAKI Desktop Grid, but it now seems that the WU was making so little progress that it wasn't even benchmarking. The new WU is still taking about 3 days to complete, but there is progress in all four projects.

George
------
8) Message boards : BOINC client : exited with zero status but no 'finished' (Message 5304)
Posted 13 Aug 2006 by gwg
Post:
I have a Dual 1 GHz PowerPC G4 with 1GB SDRAM and am running Mac OS X 10.4.7. The BOINC client is v. 5.4.9, and projects are SETI@Home, Einstein@Home, World Community Grid, and SZTAKI Desktop Grid.

I have been having the same trouble, but with exits almost every hour.

Shortly before exiting with zero status but no 'finished', we get the following error messages:

Sun 13 Aug 11:43:27 2006 Can't rename state file: Error -1
Sun 13 Aug 11:43:27 2006 Couldn't write state file: system rename

so the cause is something else locking up the state file.

In checking the state files, I note that something is changing them to read only for the group (admin), and even if I change group status to 'read & write', it gets changed back 30 sec later. Interestingly, the BOINC Client is running in group 'wheel'.

The stderr message is "client exit because of no heartbeat". I haven't noticed it as a problem until upgrade to the latest version of OS X.

The anti-virus software is Norton Antivirus 10.1.1 (002), but it shouldn't be interfering with the file system unless I do a complete check on the disk, which hasn't happened for a while.

I have tried resetting projects, but it has made no difference.

I have now isolated Spotlight from the BOINC data folder, which should keep it from periodically sampling files, and hope that this does the trick. I'll report back later if the problem goes away. Otherwise, there is some sort of access or timing problem between the OS and the BOINC Client.



