Message boards : BOINC Manager : message timeout
Author | Message |
---|---|
Send message Joined: 1 Apr 07 Posts: 13 |
Kathryn, I have not gotten those message timeout errors, nor those process errors (not since I fooled with network activity). But since I created that file, BOINC Manager still has not changed clients. The only messages I am getting are these. The first one: Mon Apr 2 08:25:41 2007|SETI@home|[task_debug] result 25ja04ab.7635.24577.554814.3.13_1 checkpointed. And the latest one: Mon Apr 2 11:00:58 2007|SETI@home|[task_debug] result 25ja04ab.7635.24577.554814.3.13_1 checkpointed. What is next? |
Send message Joined: 30 Oct 05 Posts: 1239 |
Log into your account on the Seti webpage (click on "Your Account"). Scroll down to the preferences section. Copy and paste all of them into a message here. Also, what projects are you running (Seti and Rosetta as far as I can tell right now), how many work units are on that computer, and what are their estimated times to completion and their deadlines? It's possible that the scheduler thinks you are in Earliest Deadline First mode and will work on that Seti unit until it's finished. Kathryn :o) |
Send message Joined: 1 Apr 07 Posts: 13 |
Kathryn, Here are my Seti preferences Do work while computer is running on batteries? (matters only for portable computers) no Do work while computer is in use? yes Do work only between the hours of (no restriction) Leave applications in memory while suspended? no Switch between applications every (recommended: 60 minutes) 60 minutes On multiprocessors, use at most 1 processors Use at most Enforced by version 5.6 and greater 100 percent of CPU time Disk and memory usage Use at most 8 GB disk space Leave at least (Values smaller than 0.001 are ignored) 40 GB disk space free Use at most 25% of total disk space Write to disk at most every 60 seconds Use at most 50% of page file (swap space) Use at most Enforced by version 5.8 and greater 50% of memory when computer is in use Use at most Enforced by version 5.8 and greater 90% of memory when computer is idle Network usage Connect to network about every (determines size of work cache; maximum 10 days) 0.1 days Confirm before connecting to Internet? (matters only if you have a modem, ISDN or VPN connection) no Disconnect when done? (matters only if you have a modem, ISDN or VPN connection) no Maximum download rate: 1000 KB/s Maximum upload rate: 600 KB/s Use network only between the hours of (no restriction) Skip image file verification? no Projects: Rosetta - to completion 21:46:10 Due April 10 Einstein to completion 24:18:18 - Due April 7 Seti - to completion 12:49:17 - Due April 5 I also have ClimatePrediction - but there are no workunits. |
Send message Joined: 30 Oct 05 Posts: 1239 |
I'll have to see if I can find some more details on EDF. But the deadline for Seti is the closest. So it would make sense that EDF might be in effect. Let me do a bit more checking. I might not be able to get back to you until late tonight. Kathryn :o) |
Send message Joined: 29 Aug 05 Posts: 15573 |
Please open client_state.xml (with a browser, or with TextEdit). Find and copy the following: * Everything between the <time_stats></time_stats> flags. (it'll be 4 flags here with numbers between them) * For each project the <short_term_debt></short_term_debt> and <long_term_debt></long_term_debt> flags with the numbers between them. Post all those in an answer window here. Close the client_state.xml file, don't save it. Just exit it. |
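The manual copy-and-paste described above can also be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `SAMPLE` document is a hypothetical, heavily trimmed stand-in for a real client_state.xml, and the `<client_state>`, `<project>` and `<project_name>` element layout is assumed rather than taken from this thread. Only the `<time_stats>`, `<short_term_debt>` and `<long_term_debt>` tags are named in the post. The function name `extract_stats` is made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed stand-in for client_state.xml (layout is an assumption).
SAMPLE = """<client_state>
  <time_stats>
    <on_frac>0.940208</on_frac>
    <connected_frac>-1.000000</connected_frac>
    <active_frac>0.816090</active_frac>
    <cpu_efficiency>0.647537</cpu_efficiency>
  </time_stats>
  <project>
    <project_name>Rosetta</project_name>
    <short_term_debt>20852.940989</short_term_debt>
    <long_term_debt>189799.004157</long_term_debt>
  </project>
  <project>
    <project_name>Seti</project_name>
    <short_term_debt>-41364.144986</short_term_debt>
    <long_term_debt>-366455.212463</long_term_debt>
  </project>
</client_state>"""


def extract_stats(xml_text):
    """Return (time_stats, debts) read from a client_state-like document."""
    root = ET.fromstring(xml_text)
    # Every child of <time_stats> is a tag holding a single number.
    time_stats = {child.tag: float(child.text) for child in root.find("time_stats")}
    debts = {}
    for project in root.iter("project"):
        name = project.findtext("project_name", "(unnamed)")
        debts[name] = {
            "short_term_debt": float(project.findtext("short_term_debt", "0")),
            "long_term_debt": float(project.findtext("long_term_debt", "0")),
        }
    return time_stats, debts


time_stats, debts = extract_stats(SAMPLE)
for name, values in debts.items():
    print(name, values["short_term_debt"], values["long_term_debt"])
```

Reading the file this way never writes to it, which matches the advice to close client_state.xml without saving.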
Send message Joined: 1 Apr 07 Posts: 13 |
Jord, Here they are: <time_stats> <on_frac>0.940208</on_frac> <connected_frac>-1.000000</connected_frac> <active_frac>0.816090</active_frac> <cpu_efficiency>0.647537</cpu_efficiency> <last_update>1175542468.906536</last_update> </time_stats> Rosetta: <short_term_debt>20852.940989</short_term_debt> <long_term_debt>189799.004157</long_term_debt> ClimatePrediction (No Work): <short_term_debt>0.000000</short_term_debt> <long_term_debt>9243.277100</long_term_debt> Einstein: <short_term_debt>20511.203997</short_term_debt> <long_term_debt>218639.863178</long_term_debt> lhcathome (No Work): <short_term_debt>0.000000</short_term_debt> <long_term_debt>-51226.931973</long_term_debt> Seti: <short_term_debt>-41364.144986</short_term_debt> <long_term_debt>-366455.212463</long_term_debt> I also should post the messages from BoincManager: Mon Apr 2 14:27:56 2007|SETI@home|[task_debug] result 25ja04ab.7635.24577.554814.3.13_1 checkpointed Mon Apr 2 14:28:27 2007|SETI@home|[cpu_sched] Preempting 25ja04ab.7635.24577.554814.3.13_1 (removed from memory) Mon Apr 2 14:28:27 2007|SETI@home|[task_debug] task_state=QUIT_PENDING for 25ja04ab.7635.24577.554814.3.13_1 from preempt Mon Apr 2 14:28:27 2007|Einstein@Home|[cpu_sched] Starting h1_0707.5_S5R1__8447_S5RIa_1(resume) Mon Apr 2 14:28:27 2007||[task_debug] ACTIVE_TASK::start(): forked process: pid 22591 Mon Apr 2 14:28:27 2007|Einstein@Home|[task_debug] task_state=EXECUTING for h1_0707.5_S5R1__8447_S5RIa_1 from start Mon Apr 2 14:28:27 2007|Einstein@Home|Restarting task h1_0707.5_S5R1__8447_S5RIa_1 using einstein_S5RI version 426 Mon Apr 2 14:28:28 2007|SETI@home|[task_debug] Process for 25ja04ab.7635.24577.554814.3.13_1 exited Mon Apr 2 14:28:28 2007|SETI@home|[task_debug] task_state=UNINITIALIZED for 25ja04ab.7635.24577.554814.3.13_1 from handle_exited_app Mon Apr 2 14:28:28 2007|SETI@home|[task_debug] exit status 0 Mon Apr 2 14:29:30 2007|Einstein@Home|[task_debug] result h1_0707.5_S5R1__8447_S5RIa_1 checkpointed Mon Apr 2 14:30:31 2007|Einstein@Home|[task_debug] result h1_0707.5_S5R1__8447_S5RIa_1 checkpointed So Seti ran for 6 hours before it switched to Einstein. |
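The "ran for 6 hours" estimate can be checked from the messages themselves. The sketch below assumes each message has the shape `timestamp|project|text`, with an optional `[flag]` at the start of the text, which is how the lines quoted in this thread look. The helper name `parse_line` is made up for the example, and the first line of `LOG` is the earliest checkpoint message quoted earlier in the thread.

```python
import re
from datetime import datetime

# Four messages copied from this thread, one per line.
LOG = """Mon Apr 2 08:25:41 2007|SETI@home|[task_debug] result 25ja04ab.7635.24577.554814.3.13_1 checkpointed
Mon Apr 2 14:28:27 2007|SETI@home|[cpu_sched] Preempting 25ja04ab.7635.24577.554814.3.13_1 (removed from memory)
Mon Apr 2 14:28:27 2007||[task_debug] ACTIVE_TASK::start(): forked process: pid 22591
Mon Apr 2 14:28:27 2007|Einstein@Home|Restarting task h1_0707.5_S5R1__8447_S5RIa_1 using einstein_S5RI version 426"""

# An optional leading "[flag]" marks output switched on by a debug log flag.
FLAG = re.compile(r"^\[(\w+)\]\s*(.*)$")


def parse_line(line):
    """Split one message into time, project, debug flag and message text."""
    stamp, project, text = line.split("|", 2)
    when = datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y")
    match = FLAG.match(text)
    flag, message = (match.group(1), match.group(2)) if match else (None, text)
    return {"time": when, "project": project or None, "flag": flag, "message": message}


entries = [parse_line(line) for line in LOG.splitlines()]
seti = [e for e in entries if e["project"] == "SETI@home"]
elapsed = seti[-1]["time"] - seti[0]["time"]
print(elapsed)  # 6:02:46
```

The gap between the first quoted checkpoint and the preemption is a little over six hours, which agrees with the estimate in the post. It is a lower bound only, since the task may have started before the first quoted message.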
Send message Joined: 29 Aug 05 Posts: 15573 |
OK, these numbers are in seconds, and the mean of all of them is always zero. Seti has large negative debts, so it won't crunch or download work for a while: it won't crunch for 41,364 seconds and it won't download new work for 366,455 seconds. The next project that will run, judging by the short-term debt figures, is either Einstein or Rosetta, depending on which has work. Those two projects have the highest positive debts. Short-term debt tells BOINC which project will get the CPU next. Long-term debt tells BOINC which project to download work from next. When the numbers are positive they are active projects. When the numbers are negative, they are inactive projects. I don't think anything ran in EDF (Earliest Deadline First) mode. It was probably just that Seti had a lot of time to catch up. Just let it run like this for now. As for the message timeouts, I'm still looking into it. |
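The reading of the debt figures given above can be reproduced with a short calculation. This is a sketch of the rule of thumb as described in the post (highest short-term debt among projects with work runs next), not the actual BOINC scheduler, which takes more into account. The numbers are the ones posted in this thread; the `has_work` values come from the poster's own notes, and the variable names are made up for the example.

```python
# (short_term_debt, long_term_debt, has_work), in seconds, as posted in the thread.
debts = {
    "Rosetta": (20852.940989, 189799.004157, True),
    "ClimatePrediction": (0.0, 9243.277100, False),
    "Einstein": (20511.203997, 218639.863178, True),
    "lhcathome": (0.0, -51226.931973, False),
    "Seti": (-41364.144986, -366455.212463, True),
}

# The debts are kept balanced around zero, so each column should sum to roughly 0.
short_total = sum(short for short, _, _ in debts.values())
long_total = sum(long_ for _, long_, _ in debts.values())
print(f"short-term sum: {short_total:.3f}, long-term sum: {long_total:.3f}")

# Rule of thumb from the post: among projects that have work,
# the one with the highest short-term debt is next in line for the CPU.
with_work = {name: short for name, (short, _, has_work) in debts.items() if has_work}
next_to_run = max(with_work, key=with_work.get)
print("next in line for the CPU:", next_to_run)
```

With these figures Rosetta and Einstein are within a few hundred seconds of each other, which is why either of them could be the next to run, while Seti sits far behind both.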
Send message Joined: 1 Apr 07 Posts: 13 |
Jord, So should I continue to allow the Task Debug to run? I have not been getting any message timeouts right now and Einstein is running now (for over 2 hours). Maybe it is playing catchup? |
Send message Joined: 29 Aug 05 Posts: 15573 |
Yes, let it run with the cc_config.xml flags on. Then in case your trouble returns, you have a log of it, either in the txt file or the old file (the *.old files are just backup files for when the original log gets too big). |
Send message Joined: 1 Apr 07 Posts: 13 |
So far, I have not gotten any errors and it seems like the work units are taking turns every one to two hours. I have Network Activity set to 'based on preferences'. I do hope (knock on wood) that those errors do not come back. |
Send message Joined: 16 Apr 06 Posts: 386 |
Another 'message timeout' on the following thread, although it started off as a 'missing uploads' problem (hinting at network issues?). The 'message timeout' occurred while networking was suspended. Also some error code 500s (is that serverside or clientside?). http://www.climateprediction.net/board/viewtopic.php?p=62485#62485 |
Send message Joined: 29 Aug 05 Posts: 15573 |
"Also some error code 500s (is that serverside or clientside?)." I haven't seen errors 500 in a long time and to be honest, I don't see one in that post either. But if they are around, it's a condition of the route between the client and the server. It can only be solved from the client side. All that specific post talks about is hadcm3 version 5.15 and scheduler version 5.09 (server version 509). Perhaps that confused you? |
Send message Joined: 16 Apr 06 Posts: 386 |
It was this bit (same user, a few posts earlier): "Scheduler request failed: HTTP internal server error". I searched on that, and the post I found said that was an error code 500 (but perhaps that was a jump too far). The CPDN servers are in a state of flux at the moment due to being moved from pillar to post, so I'm not sure which of his problems are due to problems on the client side, and which are due to the server. |
Send message Joined: 29 Aug 05 Posts: 15573 |
As far as I know, the ordinary HTTP error is the error 500. An internal server error is server side. The BOINC Wiki still details what to do with errors 500. |
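For the server-side versus client-side question, the HTTP status code classes give a quick first answer: 4xx codes are defined as client errors and 5xx codes as server errors. The sketch below uses Python's standard `http.HTTPStatus` table. The helper name `error_side` is made up for the example, and it says nothing about how the BOINC client itself words or handles these errors.

```python
from http import HTTPStatus


def error_side(code):
    """Classify an HTTP status code by its standard class."""
    if 400 <= code <= 499:
        return "client"
    if 500 <= code <= 599:
        return "server"
    return "not an error"


status = HTTPStatus(500)
print(status.value, status.phrase, error_side(status.value))
# 500 Internal Server Error server
```

A 5xx response still has to travel back through any proxies on the route, so a 500 reported by a client is a strong hint of a server-side fault rather than proof of it.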
Send message Joined: 29 Aug 05 Posts: 15573 |
BOINC Wiki article on Errors 500. |
Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License,
Version 1.2 or any later version published by the Free Software Foundation.