Enormous numbers in cpu_time

kvn1961
Joined: 9 Jul 12
Posts: 2
Australia
Message 44796 - Posted: 9 Jul 2012, 5:13:09 UTC
Last modified: 9 Jul 2012, 5:16:23 UTC

I'm developing a BOINC project to calculate spectral energy distributions for astronomy images, and I need some advice on tracking down an annoying little bug. We are currently testing the application, which is hosted on AWS. The project uses the BOINC wrapper to run some Fortran code that processes a number of pixels one at a time. Credit is assigned based on the number of pixels processed, since the code uses a simple "brute force" model-fitting approach.

The project is up and running on 32- and 64-bit Windows, OS X, and 32- and 64-bit Linux. Occasionally I get absurd cpu_time values, as shown below:

mysql> select userid, name, cpu_time,claimed_credit, granted_credit from result;
+--------+----------------------------------+--------------+----------------+----------------+
| userid | name                             | cpu_time     | claimed_credit | granted_credit |
+--------+----------------------------------+--------------+----------------+----------------+
|      1 | GI3_050001_NGC628_0001__wu31_0   |     677466.6 |              0 |             27 |
|      1 | GI3_050001_NGC628_0001__wu31_1   |       681845 |              0 |             27 |
|      1 | GI3_050001_NGC628_0001__wu31_2   |     680222.4 |              0 |             27 |
|      1 | GI3_050001_NGC628_0001__wu9_0    |       698966 |              0 |              0 |
|      3 | GI3_050001_NGC628_0001__wu12_2   | 1.218908e254 |              0 |              1 |
|      1 | GI3_050001_NGC628_0001__wu377_0  | 1.553102e254 |              0 |             39 |
|      1 | GI3_050001_NGC628_0001__wu108_0  | 1.553102e254 |              0 |             49 |
|      4 | GI3_050001_NGC628_0001__wu13_1   |  2.71667e254 |              0 |             34 |
|      4 | GI3_050001_NGC628_0001__wu707_1  |  9.07193e216 |              0 |              2 |
|      4 | GI3_050001_NGC628_0001__wu96_0   |     337490.8 |              0 |             10 |
|      1 | GI3_050001_NGC628_0001__wu96_1   |     322856.6 |              0 |             10 |
|      1 | GI3_050001_NGC628_0001__wu96_2   |     326189.8 |              0 |             10 |
|      1 | GI3_050001_NGC628_0001__wu74_0   |     305999.3 |              0 |             10 |
|      1 | GI3_050001_NGC628_0001__wu74_1   |     306578.1 |              0 |             10 |
|      4 | GI3_050001_NGC628_0001__wu588_0  |     934082.2 |              0 |             49 |
|      4 | GI3_050001_NGC628_0001__wu588_1  |       929387 |              0 |             49 |
|      4 | GI3_050001_NGC628_0001__wu588_2  |       918652 |              0 |             49 |



Some of the cpu_time values are greater than 1e216, which cannot be a real CPU time. Could someone point me in the right direction to debug this?

The odd thing is that the same client can run the job twice, and one run is fine whilst the next gives the absurd values. I've seen it happen on OS X and on 32- and 64-bit Windows. I haven't seen it on Linux yet, but I only have one Linux test client at the moment.

Thanks in advance
Kevin
yoyo
Joined: 23 May 06
Posts: 41
Germany
Message 44854 - Posted: 12 Jul 2012, 21:18:36 UTC

This is solved in the latest wrapper.
yoyo
Germany's biggest distributed computing community: Rechenkraft.net


Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.