Message boards : BOINC client : Irritating
Author | Message |
---|---|
Joined: 2 Oct 05 Posts: 408 |
I was looking at one of my machines and noticed there seemed to be quite a lot of wu's (LHC wu's) on there with a relatively short deadline (4 days). This was due to a stupid mistake I made. Normally, that machine has its "connect time" set to 0.5 days. I had been fiddling to get some Rosetta wu's, and when I set it back I put 5 days instead of 0.5, and the thing had downloaded great wodges of work from everywhere before I could stop it, most of which cleared sensibly. Anyway, from experience, I know that a good number of LHC wu's go to 100% quite quickly, while others take hours. I figured a good thing to do was to run each of these wu's for 10 minutes. If they were all still there, I could maybe bump the CPU quota for LHC for a while to get this cleared. So I suspended the other projects and let a wu run for ~10 minutes, then I suspended it and let the next run for ~10, and so on. At least 2 of the wu's did finish in that time, so the problem was starting to look less serious, but I figured I may as well do them all. Upon suspending a wu, all the remaining unstarted units suddenly failed with a computation error. The problem was that CreateProcess() failed to create a new process because the system's paging file was too small. The thing is, this is not an error with the wu or the application; it is an indication of a busy machine, and at best, one in need of a reboot or a page file bump. I would have thought this "computation error" should not crash out the wu's; rather, it should simply log the reason it failed to start as expected in the message log so corrective action could be taken. Even if it was a new install, it is something that is possibly correctable, but on a machine that has been running for ages, BOINC knows that the machine is basically good, just busy.
BOINC core 5.2.6
25/01/06 16:27:47|LHC@home|Pausing result woct1_v6s4hvnom_mqx-oct1__16__64.209_59.219__10_12__6__80_1_sixvf_boinc42500_5 (left in memory)
25/01/06 16:27:47|LHC@home|CreateProcess() failed - The paging file is too small for this operation to complete. (0x5af)
25/01/06 16:27:48|LHC@home|CreateProcess() failed - The paging file is too small for this operation to complete. (0x5af)
25/01/06 16:27:48|LHC@home|CreateProcess() failed - The paging file is too small for this operation to complete. (0x5af)
25/01/06 16:27:48|LHC@home|CreateProcess() failed - The paging file is too small for this operation to complete. (0x5af)
25/01/06 16:27:49|LHC@home|CreateProcess() failed - The paging file is too small for this operation to complete. (0x5af)
25/01/06 16:27:49|LHC@home|Unrecoverable error for result woct1_v6s4hvnom_mqx-oct1__16__64.21_59.22__8_10__6__70_1_sixvf_boinc42549_2 (CreateProcess() failed - The paging file is too small for this operation to complete. (0x5af))
25/01/06 16:27:49||request_reschedule_cpus: start failed
25/01/06 16:27:49|LHC@home|Computation for result woct1_v6s4hvnom_mqx-oct1__16__64.21_59.22__8_10__6__70_1_sixvf_boinc42549_2 finished
Wave upon wave of demented avengers march cheerfully out of obscurity into the dream. |
Joined: 8 Sep 05 Posts: 168 |
Not Boinc's error..... BOINC Wiki |
Joined: 29 Aug 05 Posts: 225 |
|
Joined: 2 Oct 05 Posts: 408 |
>>> Not Boinc's error.....
BOINC did not cause the run on the paging file; I did, as I explained above. It did, however, trash my remaining wu's by failing them on the process creation failure.
>>> The wiki is still your friend.
That tells me nothing I did not already know, and indeed explained above. As I said, the problem was caused when I tried to create an excessive number of client processes. Since they use large amounts of virtual memory, they exceeded the page file size. My comment is that this situation should not be handled the way it is.
Wave upon wave of demented avengers march cheerfully out of obscurity into the dream. |
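To make the suggestion in this thread concrete, here is a minimal sketch of the handling being asked for: treat a start failure caused by memory or page file exhaustion as transient, log it, and leave the unit unstarted for a later retry. This is not BOINC's actual code. The `Task` class, `handle_start_failure()` and the state names are hypothetical and exist only for illustration. The numeric values are standard Windows system error codes; 0x5af (1455, `ERROR_COMMITMENT_LIMIT`) is the one shown in the log above.

```python
from dataclasses import dataclass, field

# Windows system error codes that usually point to a busy machine,
# not to a broken work unit or application.
ERROR_NOT_ENOUGH_MEMORY = 8
ERROR_OUTOFMEMORY = 14
ERROR_NO_SYSTEM_RESOURCES = 1450
ERROR_COMMITMENT_LIMIT = 0x5AF  # 1455: "The paging file is too small..."

TRANSIENT_START_ERRORS = {
    ERROR_NOT_ENOUGH_MEMORY,
    ERROR_OUTOFMEMORY,
    ERROR_NO_SYSTEM_RESOURCES,
    ERROR_COMMITMENT_LIMIT,
}


@dataclass
class Task:
    """Hypothetical stand-in for a work unit held by the client."""
    name: str
    state: str = "unstarted"
    log: list = field(default_factory=list)


def handle_start_failure(task: Task, error_code: int) -> str:
    """Decide what to do after process creation fails (illustrative policy).

    Transient resource errors keep the task queued so it can be retried
    once memory frees up; anything else is marked as a real error.
    """
    if error_code in TRANSIENT_START_ERRORS:
        task.state = "unstarted"
        task.log.append(
            f"start deferred (0x{error_code:x}): system low on memory "
            "or page file; will retry"
        )
    else:
        task.state = "error"
        task.log.append(f"unrecoverable start error (0x{error_code:x})")
    return task.state
```

With a policy like this, the five queued LHC units in the log would have stayed in the queue with a message explaining why they could not start, rather than being reported as computation errors.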
Copyright © 2025 University of California.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License,
Version 1.2 or any later version published by the Free Software Foundation.