nvlddmkm.sys bsod when running BOINC tasks

DaveG

Joined: 21 Dec 06
Posts: 28
United States
Message 23236 - Posted: 23 Feb 2009, 0:40:10 UTC

Only when BOINC is running tasks does this error occur.

The system will randomly (but many times per day) crash to a blue screen, giving nvlddmkm.sys as the cause.

I am running all the latest drivers, etc.

Again, this issue only occurs with BOINC running tasks. If I disable BOINC I never have the crash. Nor does the crash ever occur with any other app or game.

I keep trying new versions of BOINC (dev versions) but no change. Currently using 6.6.9, Windows Vista x64.
ID: 23236
Les Bayliss
Help desk expert

Joined: 25 Nov 05
Posts: 1654
Australia
Message 23237 - Posted: 23 Feb 2009, 1:29:49 UTC

One thing that I was going to suggest was to go back to an earlier version of the video drivers. (This was a trick from years ago, which seemed to work a lot.)
But I often find that Googling the program/error-message produces some interesting results.
In this case, nvlddmkm.sys produced lots of hits, of which this is one of the early ones. But perhaps one of the 52 thousand or so others may help more.

ID: 23237
DaveG

Joined: 21 Dec 06
Posts: 28
United States
Message 23238 - Posted: 23 Feb 2009, 2:10:59 UTC - in response to Message 23236.  

Yeah, I've been through all the various Google solutions.

Unfortunately, since this only relates to BOINC, I can't seem to find a fix.

I run a ton of other games, etc...with no problems. Not sure why BOINC is causing the issue. When BOINC is idle, all is well...but as soon as it kicks in...sooner or later it will BSOD with this error.

At my wits' end...after years and years of running BOINC, I may finally have to uninstall it, and forget it. :-(
ID: 23238
Gundolf Jahn
Joined: 20 Dec 07
Posts: 1069
Germany
Message 23239 - Posted: 23 Feb 2009, 2:35:40 UTC - in response to Message 23238.  

When BOINC is idle, all is well...but as soon as it kicks in...sooner or later it will BSOD with this error.

Sounds like an overheating problem. Recently chased dust bunnies?
ID: 23239
DaveG

Joined: 21 Dec 06
Posts: 28
United States
Message 23240 - Posted: 23 Feb 2009, 2:56:55 UTC - in response to Message 23239.  


Not overheating. I've monitored the temps while the app is running, and the highest it ever gets is 81 degrees C, well under the 105 degrees C limit (where the cards go into safe mode). Card fan speeds top out at 51%...so it's not even being stressed.

Also, I play games that stress out the GPUs much more than BOINC does, and it never crashes when running those.
ID: 23240
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15483
Netherlands
Message 23241 - Posted: 23 Feb 2009, 6:04:19 UTC - in response to Message 23236.  
Last modified: 23 Feb 2009, 6:09:51 UTC

Only when BOINC is running tasks does this error occur.

BOINC or Seti CUDA?

The difference is that BOINC by itself will not crunch numbers; you need the science application to do that. Those applications run either on the CPUs or on the GPU(s), and only work on the GPU(s) will cause the blue screen of death you're talking about.

I want to point people to David's entries in the thread at Seti as well, to point out all the things he strongly believes are not the cause of this.

A little recap, according to him it's not:
- Heat.
- Drivers.
- the GPU(s).
- anything else but BOINC.

I'd also like to quote the one piece of evidence for him that clinches it:
Again, I play extremely graphics intensive games, that frequently push the GPUs to the limit, and have the fans running at 100%, but it NEVER crashes. EVER.

and
Also, I play games that stress out the GPUs much more than BOINC does, and it never crashes when running those.


He has not tried to run Seti without the GPU, even though that option is available in the Seti Project Preferences: take the check off of "Use Graphics Processing Unit (GPU) if available" and save the changes. Any work for the GPU already on the system will still have to be run on it, or aborted. Seti CUDA work can be recognized by the 6.08 application number that the tasks are linked to in the Tasks tab in BOINC Manager.

-------------------

Dave, you said that your games stress the GPU more than BOINC does. How do you know this? Do you know how Seti CUDA runs on the GPU? Did you know that Seti CUDA saturates the GPU? That it uses 200MB+ of the memory on the videocard to store the task you're doing? That you cannot do much of anything else with the videocard then, when you are running Seti CUDA? Or don't you believe that?
ID: 23241
DaveG

Joined: 21 Dec 06
Posts: 28
United States
Message 23251 - Posted: 23 Feb 2009, 16:41:46 UTC - in response to Message 23241.  


Thanks for the in-depth response.

I have not yet tried to disable CUDA in my SETI preferences. I will do so now, and see what the results are. I will post back here once I have seen whether the problem still occurs.

One question, should I also disable the "Use GPU while computer is in use" in the manager preferences as well? I assume so.

Also, I know SETI saturates the GPU...I've set it to run while I am sitting here, and watched the results via GPU-Z. That is how I know the max heat it hits when running SETI. I've then checked that against the numbers I see after playing an intensive game, and the heat levels are always higher after gaming than when running BOINC.

Also, it's interesting that sometimes the BSOD occurs when BOINC has barely just started running, sometimes after quite some time. So something there triggers the error, versus the GPU being overheated. Perhaps something that needs to be worked on in conjunction with Nvidia?

Again, I'm disabling the CUDA support, and will report back soon.


ID: 23251
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15483
Netherlands
Message 23252 - Posted: 23 Feb 2009, 17:00:35 UTC - in response to Message 23251.  

One question, should I also disable the "Use GPU while computer is in use" in the manager preferences as well? I assume so.

Only when you use the local preferences. If you weren't using them before, there's no need to start using them now. They will override any of the same preferences from the web-site.


ID: 23252
Claggy

Joined: 23 Apr 07
Posts: 1112
United Kingdom
Message 23253 - Posted: 23 Feb 2009, 18:42:21 UTC - in response to Message 23251.  

I haven't seen any mention of whether your processor is overclocked, or what model your power supply is or its wattage.
It's quite possible that it's getting tripped over the edge when doing work on the CPU and GPU.
My 9800GTX+ says a minimum of 24A on the 12V line; expect your GTX280 to need more.

Claggy
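
The 12V figure above can be turned into watts with P = V x I. This is only a rough sketch: the 24A minimum is the number quoted in the post, while the 30A rail rating is a made-up placeholder and not a specification for any particular power supply or for the GTX280.

```python
def rail_power_watts(volts: float, amps: float) -> float:
    """Power on a supply rail: P = V * I."""
    return volts * amps


def rail_has_headroom(rail_amps: float, required_amps: float) -> bool:
    """True if the PSU's rated 12V current meets the card's stated minimum."""
    return rail_amps >= required_amps


# 24A is the minimum quoted above for the 9800GTX+.
print(rail_power_watts(12.0, 24.0))   # 288.0

# Hypothetical PSU rated for 30A on the 12V rail (assumed value).
print(rail_has_headroom(30.0, 24.0))  # True
```

The current rating per 12V rail is normally printed on the label of the power supply, and that is the number to compare, not just the total wattage.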
ID: 23253
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5082
United Kingdom
Message 23254 - Posted: 23 Feb 2009, 19:33:19 UTC

Don't forget also that BOINC/SETI will be putting a sustained, 100% load on the power supply and all GPU components for hours, perhaps days, continuously. That's likely to be a different working envelope than even the most intensive games.
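
To illustrate the difference between a peak and a sustained load, here is a small sketch that looks at a series of temperature readings and reports both the highest value and the longest unbroken stretch at or above a threshold. The readings are invented for illustration (they are not measurements from this thread), and the 78 degree threshold is an arbitrary assumption.

```python
def summarize(samples, threshold_c):
    """Return (peak temperature, longest run of consecutive samples >= threshold)."""
    peak = max(samples)
    longest = current = 0
    for t in samples:
        # Extend the current run while at or above the threshold, else reset it.
        current = current + 1 if t >= threshold_c else 0
        longest = max(longest, current)
    return peak, longest


# Hypothetical one-per-minute GPU readings in degrees C (made-up data).
gaming = [60, 78, 83, 70, 62, 80, 84, 65]
crunching = [79, 80, 81, 81, 80, 81, 80, 81]

print(summarize(gaming, 78))     # (84, 2)
print(summarize(crunching, 78))  # (81, 8)
```

In this made-up data the game reaches the higher peak, but the crunching run never drops below the threshold, which is the kind of difference a maximum-temperature comparison alone would not show.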
ID: 23254
Joseph Stateson
Volunteer tester
Joined: 27 Jun 08
Posts: 641
United States
Message 23259 - Posted: 23 Feb 2009, 23:34:00 UTC
Last modified: 23 Feb 2009, 23:36:54 UTC

I am running nvidia 182.06 and BOINC 6.6.9 on two Vista 64 systems. One has a 9800gtx+, the other a gtx280, and both run 24/7. My temps are 65c or under. I saw a few nvidia kernel errors (not the same as a bsod) at 81c, so I set my fan speed to 100% (on the gtx280). It went down to 62c and I have not seen any nvidia kernel errors since. I have a 750 watt supply for the 280 and a 550 for the 9800gtx+.
See if you can drop the temps down to under 65c, and make sure you are running 182.06.

If you keep getting bsod's, try switching boinc project to gpugrid instead of seti cuda and see if the bsod's go away. I have not seen any bsod's since about december when seti cuda first came out. At that time gpugrid was more stable than seti, but in the last couple of weeks I have not seen any problems like kernel restarts. I do not run any games at all, so possibly my systems cannot really be compared to yours. Currently, seti and gpugrid seem well behaved and neither one is popping up kernel restart reports, let alone BSOD.

hth
ID: 23259
Fred - efmer.com
Joined: 8 Aug 08
Posts: 570
Netherlands
Message 23267 - Posted: 24 Feb 2009, 13:17:32 UTC - in response to Message 23241.  

That you cannot do much of anything else with the videocard then, when you are running Seti CUDA? Or don't you believe that?
That may not be entirely correct. I base this on some measurements I did. I noticed that when I overclock the CPU by 20%, the GPU tasks tend to run faster as well, going from 20 min to 17 min (average). The task will stress out the GPU all right when it has enough data, but it takes CPU time to keep the GPU happy. At first I thought data is fed to the GPU and then it runs, the data is taken out of the GPU, and that's it. But that is not the case. I had some requests for GPU throttling, and to my surprise this worked pretty well, just by throttling the CUDA (CPU) task. Below 100% you can see the temperature of the GPU drop, and at 70% there is a really noticeable drop. I'm not sure if the cause is the lack of work or that they wait for each other to communicate; the result is the same: less GPU run time.
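
One way to picture the throttling percentages mentioned above is as a duty cycle: the CPU-side task runs for part of each period and pauses for the rest, so the GPU gets fed less often. The sketch below is an assumption about the general idea only; the post does not say how the throttling is actually implemented, and the 10 second period is an arbitrary choice.

```python
def duty_cycle(throttle_percent: float, period_s: float = 10.0):
    """Split one period into (run seconds, pause seconds) for a throttle level."""
    if not 0 <= throttle_percent <= 100:
        raise ValueError("throttle_percent must be between 0 and 100")
    run = period_s * throttle_percent / 100.0
    return run, period_s - run


print(duty_cycle(100))  # (10.0, 0.0)  no throttling
print(duty_cycle(70))   # (7.0, 3.0)   task paused 3 seconds out of every 10
```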
ID: 23267
Fred - efmer.com
Joined: 8 Aug 08
Posts: 570
Netherlands
Message 23273 - Posted: 24 Feb 2009, 14:47:47 UTC - in response to Message 23267.  

This may not help, but while testing a driver on my machine, I suddenly got a lot of crashes in the same driver, nvlddmkm.sys. Try executing verifier.exe. After that, select (depending on the language) something like "remove current settings" (the third option from the top). Next, reboot the system. Sometimes this debugging tool gets linked in to unsigned drivers like nvlddmkm.sys.
ID: 23273
DaveG

Joined: 21 Dec 06
Posts: 28
United States
Message 23279 - Posted: 24 Feb 2009, 17:19:13 UTC - in response to Message 23273.  


Ok...no luck. Even with GPU use turned off, I am still getting the nvlddmkm.sys bsod. But still only when BOINC is running tasks.

If I exit BOINC, my computer will run all day and night without a single BSOD. Turn it back on, and right back to crashing.

I'm at my wits' end. Going to have to uninstall BOINC...I really, really don't want to, but this is getting ridiculous. :-(
ID: 23279
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15483
Netherlands
Message 23281 - Posted: 24 Feb 2009, 17:43:17 UTC - in response to Message 23279.  
Last modified: 24 Feb 2009, 17:44:25 UTC

When applications under BOINC are running, they put the hardware under immense pressure; your CPUs will be under constant load. Any driver or interrupt that isn't the correct one for a piece of hardware on your system will make it flunk out at that point.

If you want to try to debug the problem, please state:

- What make and model your motherboard is.
- Which chipset driver version you are using.
- Which DirectX version you are using and if you ever updated it since you installed it.
- Sound card make and model and driver version. (is it embedded?)
- Video card make and model and driver version.
- Network card make and model and driver version. (is it embedded?)
- Any other hardware make and model and driver version.

Post a list of your interrupts (IRQs). Perhaps you have a problem there.
Tell us if you still have open PCI and PCIe slots to swap cards around, if necessary.

Check with Prime95 if you get the same BSOD. If you do, you do have a hardware or driver problem.
Ask at different (not necessarily BOINC) forums for help on this issue.

And if you can't be bothered, then that's too bad. It's not for lack of trying to give help.
ID: 23281
DaveG

Joined: 21 Dec 06
Posts: 28
United States
Message 23341 - Posted: 27 Feb 2009, 20:17:37 UTC - in response to Message 23281.  


If you want to try to debug the problem, please state:

- What make and model your motherboard is.
-- EVGA 780i SLi

- Which chipset driver version you are using.
-- nForce 15.23
- Which DirectX version you are using and if you ever updated it since you installed it.
--- DirectX 10
- Sound card make and model and driver version. (is it embedded?)
--- SoundBlaster X-Fi ExtremeMusic --- 6.00.0001.1284
- Video card make and model and driver version.
--- EVGA GTX 280 --- 7.15.0011.8206 (English)
- Network card make and model and driver version. (is it embedded?)
---- Nvidia nForce 10/100/1000 -- 67.8.9.0 -- yes
- Any other hardware make and model and driver version.

Post a list of your interrupts (IRQs). Perhaps you have a problem there.
Tell us if you still have open PCI and PCIe slots to swap cards around, if necessary.

---- IRQs are fine, no conflicts.


Check with Prime95 if you get the same BSOD. If you do, you do have a hardware or driver problem.
Ask at different (not necessarily BOINC) forums for help on this issue.

---- I'll try Prime95 and report back

And if you can't be bothered, then that's too bad. It's not for lack of trying to give help.


I can be bothered, but there is a limit as to how much time I can spend trying to fix BOINC, since nothing else I do on my system causes the issue.



ID: 23341
DaveG

Joined: 21 Dec 06
Posts: 28
United States
Message 23342 - Posted: 27 Feb 2009, 20:42:09 UTC - in response to Message 23341.  

Update:
Prime95 has been running for over 30 minutes, passed all tests, and is starting another round. System is fully stressed. No crashes.

ID: 23342
DaveG

Joined: 21 Dec 06
Posts: 28
United States
Message 23344 - Posted: 27 Feb 2009, 22:10:11 UTC - in response to Message 23342.  


2 hours straight of full-on Prime95, and no BSODs (I have BOINC disabled).

So far, it's all BOINC, all the time. Going to let Prime95 continue to run for quite some time to be sure, but it's looking more and more like my components and drivers are sound.
ID: 23344
DaveG

Joined: 21 Dec 06
Posts: 28
United States
Message 23346 - Posted: 27 Feb 2009, 23:16:11 UTC - in response to Message 23344.  


Still no crashes with Prime95.

Going to have to uninstall BOINC for now until the issue is resolved. I can't have the system rebooting constantly all day long. It's a shame, I've been running it for years.

Definitely the app though, nothing else ever causes a crash, and all my drivers and components are sound.

ID: 23346
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5082
United Kingdom
Message 23357 - Posted: 28 Feb 2009, 11:47:43 UTC - in response to Message 23279.  

Ok...no luck. Even with GPU use turned off, I am still getting the nvlddmkm.sys bsod. But still only when BOINC is running tasks.

You never told us what non-GPU tasks were running when you got these driver crashes.

The answer isn't 'BOINC' - as Ageless said a while back, BOINC doesn't do any computing, and very little graphics (unless you call up the statistics charts). The 'tasks' come from any one of 70+ science projects, with up to four different experiments (science applications) each. Which were you running?
ID: 23357

Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.