6.6.9 Windows Notes & Bugs

Message boards : BOINC client : 6.6.9 Windows Notes & Bugs

Rick A. Sponholz
Joined: 25 Feb 09
Posts: 10
United States
Message 23297 - Posted: 25 Feb 2009, 14:54:13 UTC
Last modified: 25 Feb 2009, 14:56:27 UTC

I am still experiencing stopping CUDA work units, but strangely only on one CUDA-capable machine (computer #3 below). This happens to GPUGRID and SETI@home work units on ALL versions of BOINC higher than 6.6.5 (I’ve tested each one). The work units show "waiting to run", but no CUDA work units are running. Exiting and restarting BOINC restarts a CUDA work unit, but within a few minutes no CUDA work units are running again. This issue also occurred on computer #3 with video driver 6.14.11.8122. I had also experienced this issue on all 5 computers using BOINC 6.6.7, but only computer #3 continues to have a problem with BOINC 6.6.9. The issue above is different from the hanging SETI CUDA work unit problem that is well documented in other threads. I have 5 CUDA-capable computers (Windows XP Service Pack 3), and I'm running SETI@home (CUDA 6.08 on BOINC 6.6.9) and GPUGRID on all 5, as well as World Community Grid and Einstein@home. Below are the specs for my 5 CUDA-capable computers:

#1 Intel Core 2 CPU 6600@2.40GHz 1.98GB RAM
GeForce 9600GT 512MB RAM Driver 6.14.11.8206

#2 Intel Core 2 Quad Q6600@2.40GHz 1.98GB RAM
GeForce 9800GTX+ 512MB RAM Driver 6.14.11.8206

#3 Intel Core 2 Quad Q6600@2.40GHz 1.98GB RAM
GeForce 9600GT 512MB RAM Driver 6.14.11.8206

#4 Intel Core 2 Quad Q9550@2.83GHz 1.98GB RAM
GeForce 9600GT 512MB RAM Driver 6.14.11.8206

#5 Intel Core 2 Quad Q9550@2.83GHz 3.25GB RAM
GeForce 9800GT 1.024GB RAM Driver 6.14.11.8206

I have rolled back computer #3 to BOINC 6.6.5, and the problem has gone away.
Rick
ID: 23297
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15483
Netherlands
Message 23299 - Posted: 25 Feb 2009, 17:48:38 UTC - in response to Message 23297.  

Hi Rick, can you test 6.6.11 and see if it is fixed in that version, please?
According to the developers this version should fix that CUDA hang/waiting to run problem.
ID: 23299
Rick A. Sponholz
Joined: 25 Feb 09
Posts: 10
United States
Message 23302 - Posted: 25 Feb 2009, 23:35:43 UTC - in response to Message 23299.  

Jord,
I'll get right on it.
Rick
ID: 23302
Richard Haselgrove
Volunteer tester
Help desk expert
Joined: 5 Oct 06
Posts: 5082
United Kingdom
Message 23303 - Posted: 25 Feb 2009, 23:44:23 UTC

It looks as if Rom is clearing the decks ready for an official release - he's just bumped the milestone on a couple of my recent tickets (a bugfix from 6.6 to 6.8, and an enhancement from 6.6 to Undetermined). I also saw that he'd fixed a couple of older ones today, but none of mine - boo-hoo.

Get your version 6.6 test reports in now - hurry, while stocks last.
ID: 23303
Rick A. Sponholz
Joined: 25 Feb 09
Posts: 10
United States
Message 23306 - Posted: 26 Feb 2009, 5:04:19 UTC - in response to Message 23299.  

Jord,
After 5 hours of running 6.6.11, I have not had a single instance of CUDA work units stopping. Next test will be when my GPUGRID work unit finishes in about 1.75 hours. Tested 6.6.11 on computer:

#3 Intel Core 2 Quad Q6600@2.40GHz 1.98GB RAM
GeForce 9600GT 512MB RAM Driver 6.14.11.8206

Tomorrow, I'll install 6.6.11 on 9 of my 10 other computers (1 laptop is in the shop) as a double check. It looks like the BOINC programmers did a great job!
Rick
ID: 23306
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15483
Netherlands
Message 23311 - Posted: 26 Feb 2009, 13:57:27 UTC - in response to Message 23310.  

I don't see it, so is it perhaps something with WCG?
Also, make sure you don't have throttling anywhere, be it in the web preferences, or in the local preferences.
ID: 23311
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15483
Netherlands
Message 23313 - Posted: 26 Feb 2009, 15:04:37 UTC - in response to Message 23312.  

And that's of course starting with version 6.6.10: the 6.6.10 installer kindly deleted everything on the disk, without a trace, from the data and program directories before planting itself and using the same original, manually designated data_dir.

Actually, what happened and what I saw was that it did the remigration to the BOINC directory and migration to the data directory. With the migration things can and will go wrong. Nothing went wrong for me. Knocks wood.

I'll inform David and Rom about your troubles.
ID: 23313
Richard Haselgrove
Volunteer tester
Help desk expert
Joined: 5 Oct 06
Posts: 5082
United Kingdom
Message 23314 - Posted: 26 Feb 2009, 17:24:41 UTC

There's been another report of (so far) unexplained preempting for 1 second with BOINC v6.6.10 - this time without the 'CPU - throttle' explanation.

http://lunatics.kwsn.net/gpu-crunching/it-works.msg14850.html#msg14850
(SETI third-party site - CUDA application)
ID: 23314
Rick A. Sponholz
Joined: 25 Feb 09
Posts: 10
United States
Message 23315 - Posted: 26 Feb 2009, 17:34:34 UTC - in response to Message 23299.  

Jord,

BOINC 6.6.11 completed 1 GPUGRID CUDA work unit and started another without a problem, and it continues to supply my GPU with work. The next test is to see how BOINC 6.6.11 handles finishing all GPUGRID work and switching to SETI CUDA work units. This switch should happen in about 12 hours. I've started upgrading my other computers to 6.6.11, and will let you know if I encounter any problems.

Rick
ID: 23315
Fred W
Joined: 26 Feb 09
Posts: 2
United Kingdom
Message 23317 - Posted: 26 Feb 2009, 20:11:42 UTC - in response to Message 23297.  

I am still experiencing stopping CUDA work units, but strangely only on one CUDA-capable machine (computer #3 below). This happens to GPUGRID and SETI@home work units on ALL versions of BOINC higher than 6.6.5 (I’ve tested each one). The work units show "waiting to run", but no CUDA work units are running. Exiting and restarting BOINC restarts a CUDA work unit, ...

I installed 6.6.11 about 24 hours ago to overcome the CUDA "waiting to run" problem that I had experienced with 6.6.9.
I was away from my machine for most of the day, and when I returned I found it running 4 SETI Astropulse tasks on the 4 CPU cores and 1 SETI CUDA task on the GTX295, all "high priority", with the second GPU core displaying "waiting to run".
Mindful of the earlier 6.6.x versions' propensity to not shut down the connected client, I elected to shut down the connected client from the BM Advanced menu and then exited BOINC Manager. As a double check I then opened Windows Task Manager and found an instance of Boinc.exe still showing as 25% CPU (i.e. a complete core of my Q9450). Having killed that, I re-launched BOINC Manager and everything is back as it should be, with 6 WUs being crunched - but it does not seem to me that the problem has been solved.
For reference I am running Vista Home Premium 64 bit.

F.
ID: 23317
Richard Haselgrove
Volunteer tester
Help desk expert
Joined: 5 Oct 06
Posts: 5082
United Kingdom
Message 23383 - Posted: 1 Mar 2009, 10:29:58 UTC

There are reports from SETI that work requests for the CUDA plan class are not being filled, and instead are being met with an 86400-second (1 day) backoff. It is not clear from the log whether this is server-mandated or a client response:

3/1/2009 6:14:18 AM SETI@home [wfd] request: CPU (0.00 sec, 0) CUDA (371520.00 sec, 1)
3/1/2009 6:14:18 AM SETI@home Sending scheduler request: Requested by user.
3/1/2009 6:14:18 AM SETI@home Requesting new tasks
3/1/2009 6:14:23 AM SETI@home Scheduler request completed: got 0 new tasks
3/1/2009 6:14:23 AM SETI@home [wfd] backing off CUDA 86400 sec

SETI multibeam (practicable for CUDA) work was available around that time - request not made during a project outage.
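
For anyone wanting to pull these backoff lines out of a longer message log, here is a minimal Python sketch. The line layout (timestamp, project name, then a "[wfd] backing off <resource> <seconds> sec" message) is inferred only from the excerpt above, and find_backoffs is a hypothetical helper, not part of BOINC. It cannot tell whether the backoff was server-mandated or a client decision; it only finds the lines and converts the seconds to hours.

```python
import re

# Layout inferred from the log excerpt in this post, not from BOINC docs:
# "<date> <time> AM|PM <project> [wfd] backing off <resource> <seconds> sec"
BACKOFF_RE = re.compile(
    r"^(?P<stamp>\S+ \S+ [AP]M)\s+(?P<project>\S+)\s+"
    r"\[wfd\] backing off (?P<resource>\w+) (?P<seconds>\d+) sec$"
)


def find_backoffs(lines):
    """Return (timestamp, project, resource, hours) for each backoff line."""
    found = []
    for line in lines:
        m = BACKOFF_RE.match(line.strip())
        if m:
            hours = int(m["seconds"]) / 3600
            found.append((m["stamp"], m["project"], m["resource"], hours))
    return found


log = [
    "3/1/2009 6:14:18 AM SETI@home [wfd] request: CPU (0.00 sec, 0) CUDA (371520.00 sec, 1)",
    "3/1/2009 6:14:18 AM SETI@home Sending scheduler request: Requested by user.",
    "3/1/2009 6:14:18 AM SETI@home Requesting new tasks",
    "3/1/2009 6:14:23 AM SETI@home Scheduler request completed: got 0 new tasks",
    "3/1/2009 6:14:23 AM SETI@home [wfd] backing off CUDA 86400 sec",
]

print(find_backoffs(log))
# [('3/1/2009 6:14:23 AM', 'SETI@home', 'CUDA', 24.0)]
```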

References:
SETI Will not fetch new work!
Lunatics pre-release Problem downloading Seti Enhanced - Astropulse ok
ID: 23383
Aurora Borealis
Joined: 8 Jan 06
Posts: 448
Canada
Message 23603 - Posted: 10 Mar 2009, 18:17:11 UTC - in response to Message 23602.  
Last modified: 10 Mar 2009, 18:17:55 UTC

Meanwhile I tried 6.6.14, which is a step backward as far as connectivity and responsiveness go. It connects poorly to local and remote hosts, most of the time offering blank screens and not indicating that it's even trying to connect. There is some jingling in the bottom-left status bar, after which there is no info whatsoever. Connecting from a 6.6.11 client to the 6.6.14 core client works fine. The 5.10.45 BM still on disk also connects well and has no display issues. This is on XP Duo. The core client runs fine as far as I can determine.

V6.6.14 has a problem where BOINC starts up blank even though the projects are working properly. This could be related to that problem. V6.6.15 is now posted on the download index page. There hasn't yet been an announcement on the mailing list.
ID: 23603
Richard Haselgrove
Volunteer tester
Help desk expert
Joined: 5 Oct 06
Posts: 5082
United Kingdom
Message 23611 - Posted: 11 Mar 2009, 9:17:02 UTC - in response to Message 23610.  

Indeed 6.6.15 fixed that issue of empty grid lists and status bar.

At a quick look, a few bugs reported in this thread are still there, and many would still like a sort on the deadline column that lists results in order of receipt, not due date. That would let you see which results will start next and micro-manage a little, since certain combinations of projects on multi-core machines continue to be very inefficient, with losses of many tens of percentage points and substantial temperature elevation.

Showing tasks in order of receipt is the 'natural', unsorted, display order for tasks. But once you have applied a sort (of any description), there is no 'unsort' option in BOINC: I have to resort to editing the Registry, which is hardly an elegant solution.

And having done that, now I find that BOINC doesn't - for CUDA tasks - start them in order of receipt any more, but always starts them Earliest Deadline First. Welcome back EDF.
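
To make the difference between the two orderings concrete, here is a small Python sketch. It is an illustration only, not BOINC's scheduler code; the Task class, the task names, and the deadlines are all made up.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str             # made-up task name
    received: int         # order of receipt (1 = downloaded first)
    deadline_days: float  # days until the report deadline


def receipt_order(tasks):
    """Order of receipt: the 'natural', unsorted display order."""
    return [t.name for t in sorted(tasks, key=lambda t: t.received)]


def edf_order(tasks):
    """Earliest Deadline First: the nearest deadline runs first."""
    return [t.name for t in sorted(tasks, key=lambda t: t.deadline_days)]


tasks = [
    Task("wu_a", received=1, deadline_days=21.0),
    Task("wu_b", received=2, deadline_days=7.0),
    Task("wu_c", received=3, deadline_days=14.0),
]

print(receipt_order(tasks))  # ['wu_a', 'wu_b', 'wu_c']
print(edf_order(tasks))      # ['wu_b', 'wu_c', 'wu_a']
```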
ID: 23611
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15483
Netherlands
Message 23612 - Posted: 11 Mar 2009, 10:01:26 UTC - in response to Message 23611.  

And having done that, now I find that BOINC doesn't - for CUDA tasks - start them in order of receipt any more, but always starts them Earliest Deadline First. Welcome back EDF.

It is possible that with the new GPU scheduler now finally working, BOINC is learning how long those wretched CUDA tasks take.
ID: 23612
Richard Haselgrove
Volunteer tester
Help desk expert
Joined: 5 Oct 06
Posts: 5082
United Kingdom
Message 23613 - Posted: 11 Mar 2009, 10:08:54 UTC - in response to Message 23612.  

And having done that, now I find that BOINC doesn't - for CUDA tasks - start them in order of receipt any more, but always starts them Earliest Deadline First. Welcome back EDF.

It is possible that with the new GPU scheduler now finally working, BOINC is learning how long those wretched CUDA tasks take.

LOL
ID: 23613
Richard Haselgrove
Volunteer tester
Help desk expert
Joined: 5 Oct 06
Posts: 5082
United Kingdom
Message 23614 - Posted: 11 Mar 2009, 11:06:18 UTC

The scheduler is indeed working well. On the two machines where I've stopped fiddling and changing things, and where I have got DCF nicely balanced across all four sub-projects, I'm seeing a perfect procession of

11/03/2009 10:55:42	SETI@home	Sending scheduler request: To fetch work.
11/03/2009 10:55:42	SETI@home	Reporting 1 completed tasks, requesting new tasks
11/03/2009 10:55:47	SETI@home	Scheduler request completed: got 1 new tasks

Trouble is, this is a CUDA machine, so those requests go through about once every 20 minutes: and even with long-running tasks on other projects, and a reasonable cache size, the sched_request file is 60KB. That's an awful lot of administrative network traffic.
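
As a rough back-of-envelope check, assuming the request interval stays at about 20 minutes and every sched_request file is about 60KB (both figures taken from this one machine), the upload side alone works out as below. The function name is just for illustration, and scheduler replies are not counted.

```python
def daily_request_traffic_kb(interval_min, request_kb):
    """Return (requests per day, KB uploaded per day) for one host."""
    requests_per_day = (24 * 60) // interval_min
    return requests_per_day, requests_per_day * request_kb


requests, kb = daily_request_traffic_kb(interval_min=20, request_kb=60)
print(requests)              # 72
print(kb)                    # 4320
print(round(kb / 1024, 1))   # 4.2 (MB per day, per host)
```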
ID: 23614
Richard Haselgrove
Volunteer tester
Help desk expert
Joined: 5 Oct 06
Posts: 5082
United Kingdom
Message 23617 - Posted: 11 Mar 2009, 17:10:02 UTC - in response to Message 23612.  

And having done that, now I find that BOINC doesn't - for CUDA tasks - start them in order of receipt any more, but always starts them Earliest Deadline First. Welcome back EDF.

It is possible that with the new GPU scheduler now finally working, BOINC is learning how long those wretched CUDA tasks take.

20 minutes.

BOINC's so obsessive about EDF, it even pre-empts running tasks. You'd think that, even with a 7-day deadline for 'shorty' tasks (and 5 days slack in the cache), it could afford to wait 20 minutes for completion?
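
Putting numbers on that: the sketch below takes "5 days slack" literally as five days of spare time before the deadline. That is one reading of the figures above, not a value read out of BOINC, and the variable names are only illustrative.

```python
from datetime import timedelta

slack = timedelta(days=5)            # assumed spare time before the deadline
remaining_run = timedelta(minutes=20)  # time for the running task to finish

# Share of the slack used up by letting the running task complete.
fraction_used = remaining_run / slack
slack_left = slack - remaining_run

print(round(fraction_used * 100, 2))  # 0.28 (percent)
print(slack_left)                     # 4 days, 23:40:00
```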
ID: 23617
Richard Haselgrove
Volunteer tester
Help desk expert
Joined: 5 Oct 06
Posts: 5082
United Kingdom
Message 23673 - Posted: 13 Mar 2009, 22:45:23 UTC - in response to Message 23672.  

One thing I've noticed on various releases starting with 6.6.9, and currently on the 6.6.11 client (Vista 32), is that the timestamps randomly alternate between 24-hour and 12-hour am/pm notation. When hitting the message filter, all message lines fall in line. Note that the system is set to 24-hour notation, but the log then shows them in 12-hour am/pm.

And continuing through 6.6.14 with the interim, pre-release, v6.6.15 (ish) manager.
ID: 23673
Richard Haselgrove
Volunteer tester
Help desk expert
Joined: 5 Oct 06
Posts: 5082
United Kingdom
Message 23763 - Posted: 18 Mar 2009, 12:11:24 UTC

Funnily enough, I was just musing on a similar theme.

As an utterly, utterly trivial example:

I claim bragging rights to changeset 17606, following a report I made to BOINC_alpha on Monday.

Now I don't expect my name in lights, or bonus cobblestones, or anything like that: fixing a typo is perfectly reasonably a silent process.

But I still think it would be better if error reports (even this trivial) had some sort of reply.

One word would do. I don't expect "Thanks": "Noted", "Fixed", or even "Ooops" are plenty.

But there is a severe lack of 'ack' packets in the communication link between users/testers and development/administration.
ID: 23763
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15483
Netherlands
Message 23792 - Posted: 19 Mar 2009, 17:34:37 UTC - in response to Message 23791.  

Changeset 17608:

Manager: show elapsed time instead of CPU time in Task tab. CPU time is visible in task Properties.

Manager: in task Properties, show final CPU and elapsed times if job is finished
ID: 23792

Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.