6.6.9 Windows Notes & Bugs
Author | Message |
---|---|
Send message Joined: 25 Feb 09 Posts: 10 |
I am still experiencing stopping CUDA work units, but strangely only on one CUDA-capable machine (computer #3 below). This happens to GPUGRID and SETI@home work units on ALL (I’ve tested each one) versions of BOINC higher than 6.6.5. The work units show "waiting to run", but no CUDA work units are running. Exiting and restarting BOINC restarts a CUDA work unit, but within a few minutes no CUDA work units are running. This issue also occurred on computer #3 with video driver 6.14.11.8122. I had also experienced this issue on all 5 computers using BOINC 6.6.7, but only computer #3 continues to have a problem with BOINC 6.6.9. The issue above is different from the hanging SETI CUDA work unit problem well documented in other threads. I have 5 CUDA-capable computers (Windows XP Service Pack 3) and I'm running SETI@home (CUDA 6.08 on BOINC 6.6.9) and GPUGRID on all 5, as well as World Community Grid and Einstein@home. Below are the specs for my 5 CUDA-capable computers: #1 Intel Core 2 CPU 6600 @ 2.40GHz, 1.98GB RAM, GeForce 9600GT 512MB RAM, driver 6.14.11.8206; #2 Intel Core 2 Quad Q6600 @ 2.40GHz, 1.98GB RAM, GeForce 9800GTX+ 512MB RAM, driver 6.14.11.8206; #3 Intel Core 2 Quad Q6600 @ 2.40GHz, 1.98GB RAM, GeForce 9600GT 512MB RAM, driver 6.14.11.8206; #4 Intel Core 2 Quad Q9550 @ 2.83GHz, 1.98GB RAM, GeForce 9600GT 512MB RAM, driver 6.14.11.8206; #5 Intel Core 2 Quad Q9550 @ 2.83GHz, 3.25GB RAM, GeForce 9800GT 1.024GB RAM, driver 6.14.11.8206. I have rolled back computer #3 to BOINC 6.6.5, and the problem has gone away. Rick |
Send message Joined: 29 Aug 05 Posts: 15483 |
Hi Rick, can you test 6.6.11 and see if it is fixed in that version, please? According to the developers this version should fix that CUDA hang/waiting to run problem. |
Send message Joined: 25 Feb 09 Posts: 10 |
Jord, I'll get right on it. Rick |
Send message Joined: 5 Oct 06 Posts: 5082 |
It looks as if Rom is clearing the decks ready for an official release - he's just bumped the milestone on a couple of my recent tickets (a bugfix from 6.6 to 6.8, and an enhancement from 6.6 to Undetermined). I also saw that he'd fixed a couple of older ones today, but none of mine - boo-hoo. Get your version 6.6 test reports in now - hurry, while stocks last. |
Send message Joined: 25 Feb 09 Posts: 10 |
Jord, After 5 hours of running 6.6.11, I have not had a single instance of CUDA work units stopping. The next test will be when my GPUGRID work unit finishes in about 1.75 hours. Tested 6.6.11 on computer #3: Intel Core 2 Quad Q6600 @ 2.40GHz, 1.98GB RAM, GeForce 9600GT 512MB RAM, driver 6.14.11.8206. Tomorrow, I'll install 6.6.11 on 9 of my 10 other computers (1 laptop is in the shop) as a double check. It looks like the BOINC programmers did a great job! Rick |
Send message Joined: 29 Aug 05 Posts: 15483 |
I don't see it, so is it perhaps something with WCG? Also, make sure you don't have throttling anywhere, be it in the web preferences, or in the local preferences. |
Send message Joined: 29 Aug 05 Posts: 15483 |
"And that's of course starting version 6.6.10. The 6.6.10 installer kindly deleting all and everything on the disk without trace from Data and program dir before planting itself on and using the same original manually designated data_dir." Actually, what happened, and what I saw, was that it did the remigration to the BOINC directory and the migration to the data directory. With the migration, things can and will go wrong. Nothing went wrong for me. Knocks wood. I'll inform David and Rom about your troubles. |
Send message Joined: 5 Oct 06 Posts: 5082 |
There's been another report of (so far) unexplained preempting for 1 second with BOINC v6.6.10 - this time without the 'CPU - throttle' explanation. http://lunatics.kwsn.net/gpu-crunching/it-works.msg14850.html#msg14850 (SETI third-party site - CUDA application) |
Send message Joined: 25 Feb 09 Posts: 10 |
Jord, BOINC 6.6.11 completed 1 GPUGRID CUDA work unit and started another without a problem, and continues to supply my GPU with work. The next test is to see how BOINC 6.6.11 will finish all GPUGRID work and switch to SETI CUDA work units. This switch should happen in about 12 hours. I've started upgrading my other computers to 6.6.11, and will let you know if I encounter any problems. Rick |
Send message Joined: 26 Feb 09 Posts: 2 |
"I am still experiencing stopping CUDA work units, but strangely only on one (Computer #3 below) CUDA capable machine. This happens to GPUGRID, and SETI@Home work units on ALL (I’ve tested each one) versions of BOINC higher than 6.6.5. The work units show "waiting to run", but no CUDA work units are running. Exiting, and restarting BOINC, restarts a CUDA work unit, ..." I installed 6.6.11 about 24 hours ago to overcome the CUDA "waiting to run" problem that I had experienced with 6.6.9. I was away from my machine for most of the day, and when I returned I found it running 4 SETI Astropulse tasks on the 4 CPU cores and one SETI CUDA task on the GTX295, all "high priority", with the second GPU core displaying "waiting to run". Mindful of the earlier 6.6.x versions' propensity to not shut down the connected client, I elected to shut down the connected client from the BM Advanced menu and then exited BOINC Manager. As a double check I then opened Windows Task Manager and found an instance of boinc.exe still showing 25% CPU (i.e. a complete core of my Q9450). Having killed that, I re-launched BOINC Manager and everything is back as it should be, with 6 WUs being crunched - but it does not seem to me that the problem has been solved. For reference I am running Vista Home Premium 64 bit. F. |
Send message Joined: 5 Oct 06 Posts: 5082 |
There are reports from SETI that work requests for the CUDA plan class are not being filled, and instead are being met with an 86400 second (1 day) backoff. It is not clear from the log whether this is server-mandated or a client response: 3/1/2009 6:14:18 AM SETI@home [wfd] request: CPU (0.00 sec, 0) CUDA (371520.00 sec, 1) SETI multibeam (practicable for CUDA) work was available around that time - the request was not made during a project outage. References: SETI Will not fetch new work! Lunatics pre-release Problem downloading Seti Enhanced - Astropulse ok |
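For anyone comparing several of these work-fetch lines, here is a minimal Python sketch that pulls the per-resource numbers out of the `[wfd] request:` line quoted above. The function name `parse_wfd_request` is made up for illustration, and reading the second number in each pair as an instance count is an assumption, not something confirmed in this thread.

```python
import re

# Matches one per-resource chunk such as "CUDA (371520.00 sec, 1)".
RESOURCE_RE = re.compile(r"(\w+) \(([\d.]+) sec, (\d+)\)")


def parse_wfd_request(line):
    """Return {resource: (seconds_requested, count)} for one log line.

    Hypothetical helper: 'count' is assumed to be the number of
    instances requested; the thread does not define it.
    """
    # Only look at the text after the marker, so the timestamp and
    # project name in front of it are ignored.
    _, _, tail = line.partition("[wfd] request:")
    return {name: (float(sec), int(n)) for name, sec, n in RESOURCE_RE.findall(tail)}


line = ("3/1/2009 6:14:18 AM SETI@home [wfd] request: "
        "CPU (0.00 sec, 0) CUDA (371520.00 sec, 1)")
request = parse_wfd_request(line)
print(request)
# Express the CUDA request in days for comparison with the 1-day backoff.
print(request["CUDA"][0] / 86400)
```

Dividing by 86400 puts the requested seconds on the same scale as the one-day backoff mentioned in the post.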
Send message Joined: 8 Jan 06 Posts: 448 |
Meantime tried 6.6.14, which, as far as connectivity / responsiveness goes, is a step backward. It connects poorly to local and remote hosts, most of the time offering blank screens and not showing that it's even trying to connect. Some jingling in the left bottom status bar, after which there is no info whatsoever. Connecting from a 6.6.11 client to the 6.6.14 core client works fine. Also the 5.10.45 BM still on disk connects well and has no display issues. On XP Duo. The core client runs fine s'far as I can determine. V6.6.14 has a problem where BOINC starts up blank even though the projects are working properly. This could be related to that problem. V6.6.15 is now posted on the dl index page. There hasn't yet been an announcement on the mailing list. |
Send message Joined: 5 Oct 06 Posts: 5082 |
Indeed 6.6.15 fixed that issue of empty grid lists and status bar. Showing tasks in order of receipt is the 'natural', unsorted, display order for tasks. But once you have applied a sort (of any description), there is no 'unsort' option in BOINC: I have to resort to editing the Registry, which is hardly an elegant solution. And having done that, now I find that BOINC doesn't - for CUDA tasks - start them in order of receipt any more, but always starts them Earliest Deadline First. Welcome back EDF. |
Send message Joined: 29 Aug 05 Posts: 15483 |
"And having done that, now I find that BOINC doesn't - for CUDA tasks - start them in order of receipt any more, but always starts them Earliest Deadline First. Welcome back EDF." It is possible that, with the new GPU scheduler now finally working, BOINC is learning how long those wretched CUDA tasks take. |
Send message Joined: 5 Oct 06 Posts: 5082 |
"And having done that, now I find that BOINC doesn't - for CUDA tasks - start them in order of receipt any more, but always starts them Earliest Deadline First. Welcome back EDF." LOL |
Send message Joined: 5 Oct 06 Posts: 5082 |
The scheduler is indeed working well. On the two machines where I've stopped fiddling and changing things, and where I have got DCF nicely balanced across all four sub-projects, I'm seeing a perfect procession of 11/03/2009 10:55:42 SETI@home Sending scheduler request: To fetch work. 11/03/2009 10:55:42 SETI@home Reporting 1 completed tasks, requesting new tasks 11/03/2009 10:55:47 SETI@home Scheduler request completed: got 1 new tasks Trouble is, this is a CUDA machine, so those requests go through about once every 20 minutes: and even with long-running tasks on other projects, and a reasonable cache size, the sched_request file is 60KB. That's an awful lot of administrative network traffic. |
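To put a rough number on that traffic, here is a back-of-the-envelope sketch in Python using only the two figures from the post (a 60KB sched_request file, one request about every 20 minutes). It assumes the machine runs around the clock and counts only the request file, not the server's reply; the function name is made up for illustration.

```python
REQUEST_SIZE_KB = 60    # size of the sched_request file quoted in the post
INTERVAL_MINUTES = 20   # approximate gap between scheduler requests


def daily_request_traffic_kb(size_kb, interval_min):
    """Upload volume per day from scheduler requests alone.

    Assumes 24-hour operation and a constant request size.
    """
    requests_per_day = 24 * 60 / interval_min
    return requests_per_day * size_kb


print(daily_request_traffic_kb(REQUEST_SIZE_KB, INTERVAL_MINUTES))
```

Under those assumptions that is 72 requests and 4,320KB a day from a single host, before any reply traffic is counted.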
Send message Joined: 5 Oct 06 Posts: 5082 |
"And having done that, now I find that BOINC doesn't - for CUDA tasks - start them in order of receipt any more, but always starts them Earliest Deadline First. Welcome back EDF." 20 minutes. BOINC's so obsessive about EDF, it even pre-empts running tasks. You'd think that, even with a 7-day deadline for 'shorty' tasks (and 5 days slack in the cache), it could afford to wait 20 minutes for completion? |
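The difference between the two orderings discussed above can be sketched in a few lines of Python. This is only an illustration of order-of-receipt versus Earliest Deadline First as general ideas; it is not BOINC's scheduler code, and the task names, dates and deadlines are invented for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Task:
    name: str
    received: datetime
    deadline: datetime


def order_of_receipt(tasks):
    """Run tasks in the order they arrived."""
    return sorted(tasks, key=lambda t: t.received)


def earliest_deadline_first(tasks):
    """Run whichever task has the nearest deadline, regardless of arrival."""
    return sorted(tasks, key=lambda t: t.deadline)


# Hypothetical queue: an older task with a distant deadline and a newly
# arrived 'shorty' with a 7-day deadline.
now = datetime(2009, 3, 11, 10, 55)
tasks = [
    Task("long_task", now - timedelta(hours=2), now + timedelta(days=30)),
    Task("shorty", now, now + timedelta(days=7)),
]

print([t.name for t in order_of_receipt(tasks)])
print([t.name for t in earliest_deadline_first(tasks)])
```

With these made-up values the older task runs first under order of receipt, while EDF puts the newly arrived shorty ahead of it, which is the kind of reordering the post complains about.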
Send message Joined: 5 Oct 06 Posts: 5082 |
One thing I have noticed on various releases starting with 6.6.9, and currently on the 6.6.11 client (Vista 32), is that the timestamps randomly alternate between 24-hour and 12-hour am/pm notation. When hitting the message filter, all message lines fall in line. Note that the system is set to 24-hour notation, but the log shows them in 12-hour am/pm. And continuing through 6.6.14 with the interim, pre-release, v6.6.15 (ish) manager. |
Send message Joined: 5 Oct 06 Posts: 5082 |
Funnily enough, I was just musing on a similar theme. As an utterly, utterly trivial example: I claim bragging rights to changeset [trac]changeset:17606[/trac], following a report I made to BOINC_alpha on Monday. Now I don't expect my name in lights, or bonus cobblestones, or anything like that: fixing a typo is perfectly reasonably a silent process. But I still think it would be better if error reports (even this trivial) had some sort of reply. One word would do. I don't expect "Thanks": "Noted", "Fixed", or even "Ooops" are plenty. But there is a severe lack of 'ack' packets in the communication link between users/testers and development/administration. |
Send message Joined: 29 Aug 05 Posts: 15483 |
[trac]changeset:17608[/trac]: Manager: show elapsed time instead of CPU time in Task tab. CPU time is visible in task Properties. Manager: in task Properties, show final CPU and elapsed times if job is finished |
Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License,
Version 1.2 or any later version published by the Free Software Foundation.