All projects with CUDA capabilities freeze, with "Display driver stopped responding and has recovered"

pubodee

Joined: 18 Oct 13
Posts: 5
Thailand
Message 50894 - Posted: 18 Oct 2013, 16:45:11 UTC

18/10/2556 22:15:23 | | Starting BOINC client version 7.0.28 for windows_x86_64
18/10/2556 22:15:23 | | log flags: file_xfer, sched_ops, task
18/10/2556 22:15:23 | | Libraries: libcurl/7.25.0 OpenSSL/1.0.1 zlib/1.2.6
18/10/2556 22:15:23 | | Data directory: C:\ProgramData\BOINC
18/10/2556 22:15:23 | | Running under account Pubodee
18/10/2556 22:15:23 | | Processor: 8 GenuineIntel Intel(R) Core(TM) i7 CPU 950 @ 3.07GHz [Family 6 Model 26 Stepping 5]
18/10/2556 22:15:23 | | Processor: 256.00 KB cache
18/10/2556 22:15:23 | | Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 cx16 sse4_1 sse4_2 syscall nx lm vmx tm2 popcnt pbe
18/10/2556 22:15:23 | | OS: Microsoft Windows 7: Professional x64 Edition, Service Pack 1, (06.01.7601.00)
18/10/2556 22:15:23 | | Memory: 5.99 GB physical, 11.98 GB virtual
18/10/2556 22:15:23 | | Disk: 931.51 GB total, 648.33 GB free
18/10/2556 22:15:23 | | Local time is UTC +7 hours
18/10/2556 22:15:23 | | NVIDIA GPU 0 (ignored by config): GeForce GTX 480 (driver version 327.23, CUDA version 5.50, compute capability 2.0, 1536MB, 8381757MB available, 1444 GFLOPS peak)
18/10/2556 22:15:23 | | OpenCL: NVIDIA GPU 0 (ignored by config): GeForce GTX 480 (driver version 327.23, device version OpenCL 1.1 CUDA, 1536MB, 8381757MB available)
18/10/2556 22:15:23 | | No usable GPUs found

This system used to run BOINC fine on both CPU and GPU. However, three days ago it had a BSOD
for no apparent reason, maybe due to heavy rain and lightning. (I didn't capture the BSOD screen.)

When BOINC runs, the screen goes blank and freezes, and sometimes shows
"Display driver stopped responding and has recovered".
After that I can't do anything, not even open Task Manager to kill the BOINC process; because the
screen is frozen, all I can do is press Ctrl+Alt+Del to log off or restart.

After setting <ignore_nvidia_dev> for the GTX 480, BOINC runs perfectly fine on the CPU.
However, after several rounds of uninstalling and reinstalling both the NVIDIA driver (310.90 -> 331.40)
and BOINC (7.0.25 -> 7.0.64), the problem with CUDA projects still persists.
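
For readers who want to reproduce this workaround, here is a minimal sketch of how such an exclusion might be generated. It assumes the option sits in the <options> section of cc_config.xml in the BOINC data directory, and that the device number is 0, as in the "NVIDIA GPU 0 (ignored by config)" log line above. The helper name build_cc_config is made up for illustration, and the exact file the poster used is not shown in this thread.

```python
import xml.etree.ElementTree as ET


def build_cc_config(ignored_nvidia_devices):
    """Return cc_config.xml text that tells the client to ignore the given NVIDIA devices.

    Assumption: one <ignore_nvidia_dev> element per device number, inside <options>.
    """
    root = ET.Element("cc_config")
    options = ET.SubElement(root, "options")
    for device_number in ignored_nvidia_devices:
        # Device numbers are the ones BOINC prints at start-up ("NVIDIA GPU 0").
        ET.SubElement(options, "ignore_nvidia_dev").text = str(device_number)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Device 0 is the GTX 480 in the start-up log above.
    print(build_cc_config([0]))
```

The resulting text would be saved as cc_config.xml in the data directory (C:\ProgramData\BOINC in the log above). The client probably has to be restarted before its list of usable GPUs changes.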

However, when I test with several games, FurMark, 3DMark etc., there is no problem at all.
So I can't tell whether the root of the problem is a hardware failure or software.

Thanks for all assistance!


PS.
GTX 480 temp is 45c/75c
ID: 50894 · Report as offensive
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5082
United Kingdom
Message 50895 - Posted: 18 Oct 2013, 17:14:53 UTC - in response to Message 50894.  

You don't say which project(s) you're running, but it might be similar to a problem we're seeing at GPUGrid - http://www.gpugrid.net/forum_thread.php?id=3491.

With a BSOD or other sudden, unplanned, closedown, BOINC doesn't get a chance to finish writing the files for work done so far on jobs in progress - the 'checkpoint' files. If they're in a mixed-up, partially-written, state, it's likely that the project application won't be able to use them.

Most times, the job just exits with an unrecoverable error, and BOINC moves on to the next one. But the current GPUGrid app behaves just as you describe, with continual driver restarts. They've just found out what the problem is, apparently, but haven't had time to fix it yet.

Since you've found a way of getting control of the machine and letting BOINC run, the next move might be to abort any partially-run CUDA tasks that will be waiting to run next time the GPU is available. Tasks that haven't been started yet should be OK. Make sure you exit BOINC fully after aborting the tasks - BOINC doesn't complete its clean-up until the next restart.

Give that a try and report back, please.
ID: 50895 · Report as offensive
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15483
Netherlands
Message 50896 - Posted: 18 Oct 2013, 17:17:55 UTC - in response to Message 50894.  

Use Blue Screen View to see what the BSOD said and post that information here.

The system freezing could point to:
1. A corrupted driver. Have you tried to fully uninstall the display driver before reinstalling it? The motherboard driver?
2. A bad Power Supply Unit. Especially if there was a spike due to lightning.
3. Damaged hardware, and then more specifically, the GPU. Games and benchmark programs use the GPU differently; they don't put the GPU completely under load for several minutes to hours. A power spike may have damaged your video card. Or RAM. Or (part of) your motherboard.
ID: 50896 · Report as offensive
pubodee

Joined: 18 Oct 13
Posts: 5
Thailand
Message 50902 - Posted: 19 Oct 2013, 2:16:36 UTC

I have 4 projects with CUDA:
-Einstein@Home
-GPUGRID
-Milkyway@Home
-SETI@Home

Resuming a partially completed CUDA task causes the freeze and the display driver reset.
However, tasks that haven't been started yet have the same problem.
I'll check the GPUGrid forum and try a few things.

This is the BlueScreenView output from the latest attempt to run BOINC.


101813-23088-01.dmp 18/10/2556 9:26:39 0x00000116 fffffa80`05c58010 fffff880`099ff6b4 ffffffff`c0000001 00000000`00000005 dxgkrnl.sys dxgkrnl.sys+5d140 DirectX Graphics Kernel Microsoft® Windows® Operating System Microsoft Corporation 6.1.7601.18228 (win7sp1_gdr.130731-2222) x64 ntoskrnl.exe+75bc0 C:\Windows\Minidump\101813-23088-01.dmp 8 15 7601 406,936 18/10/2556 9:32:24



dxgkrnl.sys dxgkrnl.sys+5d140 fffff880`080c8000 fffff880`081bc000 0x000f4000 0x51fa153d 1/8/2556 14:58:53 Microsoft® Windows® Operating System DirectX Graphics Kernel 6.1.7601.18228 (win7sp1_gdr.130731-2222) C:\Windows\system32\drivers\dxgkrnl.sys
ntoskrnl.exe ntoskrnl.exe+3122ea fffff800`04618000 fffff800`04bfd000 0x005e5000 0x521ea035 29/8/2556 8:13:25 Microsoft® Windows® Operating System NT Kernel & System 6.1.7601.18247 (win7sp1_gdr.130828-1532) C:\Windows\system32\ntoskrnl.exe
nvlddmkm.sys nvlddmkm.sys+14d6b4 fffff880`098b2000 fffff880`0a3a9000 0x00af7000 0x52314e10 12/9/2556 12:16:00 NVIDIA Windows Kernel Mode Driver, Version 327.23 NVIDIA Windows Kernel Mode Driver, Version 327.23 9.18.13.2723 C:\Windows\system32\drivers\nvlddmkm.sys
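
A note on reading this dump: bug check 0x00000116 is the code Windows documents as VIDEO_TDR_FAILURE, meaning the display driver failed to recover from a timeout, which matches the "Display driver stopped responding" message. For this bug check the second parameter is documented as a pointer into the responsible driver module. The sketch below uses only the addresses pasted above, plus hypothetical helper names (parse_addr, locate), to check which of the listed modules that pointer falls in.

```python
# All values are copied from the BlueScreenView output in this post.
BUG_CHECK_CODE = 0x00000116
BUG_CHECK_NAMES = {0x116: "VIDEO_TDR_FAILURE"}
PARAM_2 = "fffff880`099ff6b4"

# module name -> (from address, to address), as listed by BlueScreenView
MODULES = {
    "dxgkrnl.sys": ("fffff880`080c8000", "fffff880`081bc000"),
    "ntoskrnl.exe": ("fffff800`04618000", "fffff800`04bfd000"),
    "nvlddmkm.sys": ("fffff880`098b2000", "fffff880`0a3a9000"),
}


def parse_addr(text):
    """Convert a 64-bit address written as 'xxxxxxxx`xxxxxxxx' to an int."""
    return int(text.replace("`", ""), 16)


def locate(address, modules):
    """Return (module name, offset) for the module containing address, or None."""
    for name, (start_text, end_text) in modules.items():
        start, end = parse_addr(start_text), parse_addr(end_text)
        if start <= address < end:
            return name, address - start
    return None


if __name__ == "__main__":
    print(BUG_CHECK_NAMES[BUG_CHECK_CODE])
    name, offset = locate(parse_addr(PARAM_2), MODULES)
    print(f"{name}+{offset:x}")  # prints "nvlddmkm.sys+14d6b4"
```

The result agrees with the nvlddmkm.sys+14d6b4 entry in the module list, so the pointer lies inside the NVIDIA kernel-mode driver. That alone does not distinguish a driver bug from failing hardware or a misbehaving application.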


Ageless, about your suggestions:
1. I uninstalled and reinstalled only the display driver, through Windows, not with Driver Sweeper.
2. I'm not sure about this because I don't have a PSU tester. By the way, my PSU is a Corsair HX-1000,
which has survived many lightning storms since 2008.
3. That is the part I can't tell, because a glitch somewhere in the circuitry could cause the
problem. The RAM tests OK with Memtest86+, but I haven't tested the motherboard yet.


Actually, my system also has a GTX 275 for PhysX, and it runs CUDA too.
When BOINC started freezing, I removed it to isolate the problem before reinstalling it.
ID: 50902 · Report as offensive
pubodee

Joined: 18 Oct 13
Posts: 5
Thailand
Message 50904 - Posted: 19 Oct 2013, 12:02:13 UTC

After reading about junk calculations and checkpoint files when BOINC crashes:

It looks like when the program crashes at a certain point and then resumes, the calculation
loop runs into trouble because the checkpoint file is corrupt or contains junk.

This is different from pausing or suspending BOINC, because in that case the program puts its files
in order so the calculation can resume. So I set "No new tasks" for all CUDA projects, let WCG finish,
then restarted the program and turned on "Allow new tasks".

Voilà! My BOINC is back to normal.

Actually, I have used a scientific calculation program before: Gaussian.
That program has a checkpoint file system too, but it saves a checkpoint automatically when
certain loops are complete. When the system crashes you lose some calculation,
because the internal function ignores the junk calculation and restarts from the checkpoint file
of the last completed loop. That is unlike BOINC, which ends up in a CUDA driver reset loop.

If they can't fix this bug, it should be added to the FAQ.


Thanks!
Pubodee
ID: 50904 · Report as offensive
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5082
United Kingdom
Message 50905 - Posted: 19 Oct 2013, 13:14:11 UTC - in response to Message 50904.  

If the crash happens between checkpoints, that should already happen. The risk is that the crash may happen during checkpointing, when the files are inconsistent. That can be mitigated by not removing the old checkpoint until the new one is complete - then a restart would go two steps back instead of just one - but there's still a moment of risk while the file names are being switched over.
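
The mitigation described above is often implemented as "write the new checkpoint under a temporary name, then rename it over the old one". Here is a minimal sketch under stated assumptions: the JSON format and the function names write_checkpoint and read_checkpoint are purely illustrative, and this is not the code of any BOINC project application.

```python
import json
import os
import tempfile


def write_checkpoint(path, state):
    """Write state to path without ever leaving a half-written file at path."""
    directory = os.path.dirname(os.path.abspath(path))
    # The new checkpoint is written next to the old one, under a temporary name.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".ckpt-", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(state, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())  # ask the OS to push the data to disk
        # The old checkpoint stays intact until this single rename step.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def read_checkpoint(path):
    """Return the saved state, or None if the file is missing or unreadable."""
    try:
        with open(path, encoding="utf-8") as handle:
            return json.load(handle)
    except (FileNotFoundError, json.JSONDecodeError):
        return None  # caller should start the task from the beginning


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        checkpoint = os.path.join(workdir, "state.ckpt")
        write_checkpoint(checkpoint, {"iteration": 42})
        print(read_checkpoint(checkpoint))
```

Two design points matter here. The rename narrows the window of risk to the switch-over itself, and how atomic that step is depends on the operating system and file system. The reader treats an unparsable file as "no checkpoint" rather than feeding junk back into the calculation.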

The checkpoint files themselves are the responsibility of the individual projects. Better to report any identified problems with them on the project's own message board, not here. As I said (and linked) in my first reply, GPUGrid are aware they have a problem, and have diagnosed the cause. They're just waiting on some (scarce) developer time to fix it.
ID: 50905 · Report as offensive
pubodee

Joined: 18 Oct 13
Posts: 5
Thailand
Message 50907 - Posted: 19 Oct 2013, 15:08:14 UTC - in response to Message 50905.  

Thanks.

However, it happened with all the CUDA projects, not with the CPU projects:
-Einstein@Home
-GPUGRID
-Milkyway@Home
-SETI@Home
Does this mean I must report it on each project's own message board?
ID: 50907 · Report as offensive
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5082
United Kingdom
Message 50908 - Posted: 19 Oct 2013, 15:15:20 UTC - in response to Message 50907.  

Well, only after you've confirmed that the application from that particular project has a problem.

MilkyWay doesn't have a CUDA application, only OpenCL.
ID: 50908 · Report as offensive
Claggy

Joined: 23 Apr 07
Posts: 1112
United Kingdom
Message 50909 - Posted: 19 Oct 2013, 15:26:01 UTC - in response to Message 50908.  

MilkyWay doesn't have a CUDA application, only OpenCL.

And Setiathome has Cuda32, Cuda42, Cuda5 and OpenCL apps suitable for your GPU:

Setiathome Applications

Is it just one of these apps that causes problems, or all of them?

Claggy
ID: 50909 · Report as offensive
pubodee

Joined: 18 Oct 13
Posts: 5
Thailand
Message 50910 - Posted: 19 Oct 2013, 15:41:05 UTC - in response to Message 50909.  

-Einstein@Home Binary Radio Pulsar Search (Arecibo, GPU) v1.39 (BRP4G-cuda32-nv301)
-SETI@home v7 7.00 (cuda50)
-Milkyway@Home 1.02 (opencl_nvidia)
-Milkyway@Home Separation (Modified Fit) 1.28 (opencl_nvidia)

All of these run on the GPU, and all of them have caused the problem since the BSOD.
ID: 50910 · Report as offensive
relaxed83

Joined: 18 Oct 13
Posts: 8
United States
Message 50921 - Posted: 20 Oct 2013, 14:58:44 UTC - in response to Message 50894.  

I had issues like this and RMAed my GPU; it has worked ever since. Check your CPU's thermal paste too; I had to re-seat my heatsink.
ID: 50921 · Report as offensive
david johnson

Joined: 9 Jun 09
Posts: 14
United States
Message 50943 - Posted: 21 Oct 2013, 15:23:08 UTC

I just recently got back to running BOINC after a short hiatus. I had stopped running BOINC because of a lot of stability problems. I completely removed and reinstalled the latest version of BOINC, and again I am having problems with frequent reboots and BSODs. I am running Windows 7 x64 with up-to-date patches, I am the administrator on all the machines I installed it on, and I only use NVIDIA graphics cards. I don't have a screen saver enabled.

My main system is a recently reloaded (less than a month ago) Windows 7 x64 Ultimate. I have 12 processors/cores (Intel Core i7-3930K CPU @ 3.2G) and 8G of ram and a brand new NVIDIA 770 that replaced an NVIDIA 260. I have no other issues with the machine except when I run BOINC. I recently installed a new heatsink on the CPU (even when crunching, it stays below 50 °C).

I am fairly certain it has happened when the GPU was not running (I clicked Snooze GPU), but it's possible the snooze expired, as I had walked away from the computer for some time. I have the problem on multiple machines with various models of NVIDIA cards (all Windows 7 x64).

When I look at tasks there are frequently tasks with "computation error". Please help! I will disable the GPU for now and see if it gets more stable. I'm assuming the issue is in the display drivers, and even with 12 processors I don't see much point in running it without the GPU.

STDERRGUI.TXT
Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x00000001401DCE1A read attempt to address 0x00000000

Engaging BOINC Windows Runtime Debugger...

********************
BOINC Windows Runtime Debugger Version 7.0.64

Dump Timestamp : 10/20/13 22:09:32
Loaded Library : dbghelp.dll
Loaded Library : symsrv.dll
Loaded Library : srcsrv.dll
Loaded Library : version.dll


STDOUTDAE.TXT
20-Oct-2013 22:08:58 [---] Starting BOINC client version 7.0.64 for windows_x86_64
20-Oct-2013 22:08:58 [---] log flags: file_xfer, sched_ops, task
20-Oct-2013 22:08:58 [---] Libraries: libcurl/7.25.0 OpenSSL/1.0.1 zlib/1.2.6
20-Oct-2013 22:08:58 [---] Data directory: D:\ProgramData\BOINC
20-Oct-2013 22:08:58 [---] Running under account Administrator
20-Oct-2013 22:08:58 [---] Processor: 12 GenuineIntel Intel(R) Core(TM) i7-3930K CPU @ 3.20GHz [Family 6 Model 45 Stepping 7]
20-Oct-2013 22:08:58 [---] Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 cx16 sse4_1 sse4_2 popcnt aes syscall nx lm vmx tm2 dca pbe
20-Oct-2013 22:08:58 [---] OS: Microsoft Windows 7: Ultimate x64 Edition, Service Pack 1, (06.01.7601.00)
20-Oct-2013 22:08:58 [---] Memory: 7.94 GB physical, 15.88 GB virtual
20-Oct-2013 22:08:58 [---] Disk: 931.51 GB total, 275.65 GB free
20-Oct-2013 22:08:58 [---] Local time is UTC -4 hours
20-Oct-2013 22:08:58 [---] CUDA: NVIDIA GPU 0: GeForce GTX 770 (driver version 327.23, CUDA version 5.50, compute capability 3.0, 2048MB, 1892MB available, 3411 GFLOPS peak)
20-Oct-2013 22:08:58 [---] OpenCL: NVIDIA GPU 0: GeForce GTX 770 (driver version 327.23, device version OpenCL 1.1 CUDA, 2048MB, 1892MB available, 3411 GFLOPS peak)

STDOUTGUI.TXT
[10/18/13 03:46:39] TRACE [4464]: init_poll(): connected

3:48:00 AM: Error: Memory VFS already contains file 'webexternallink.xpm'!
3:48:00 AM: Error: Memory VFS already contains file 'nvidiaicon.xpm'!
3:48:00 AM: Error: Memory VFS already contains file 'atiicon.xpm'!
3:48:16 AM: Error: Memory VFS already contains file 'webexternallink.xpm'!
3:48:16 AM: Error: Memory VFS already contains file 'nvidiaicon.xpm'!
3:48:16 AM: Error: Memory VFS already contains file 'atiicon.xpm'!
3:48:27 AM: Error: Memory VFS already contains file 'webexternallink.xpm'!
3:48:27 AM: Error: Memory VFS already contains file 'nvidiaicon.xpm'!
3:48:27 AM: Error: Memory VFS already contains file 'atiicon.xpm'!
3:48:35 AM: Error: Memory VFS already contains file 'webexternallink.xpm'!
3:48:35 AM: Error: Memory VFS already contains file 'nvidiaicon.xpm'!
3:48:35 AM: Error: Memory VFS already contains file 'atiicon.xpm'!
3:48:46 AM: Error: Memory VFS already contains file 'webexternallink.xpm'!
3:48:46 AM: Error: Memory VFS already contains file 'nvidiaicon.xpm'!
3:48:46 AM: Error: Memory VFS already contains file 'atiicon.xpm'!
3:49:54 AM: Error: Memory VFS already contains file 'webexternallink.xpm'!
3:49:54 AM: Error: Memory VFS already contains file 'nvidiaicon.xpm'!
3:49:54 AM: Error: Memory VFS already contains file 'atiicon.xpm'!
[10/20/13 17:07:51] TRACE [3612]: init_asynch() boinc_socket: 752

[10/20/13 17:07:51] TRACE [3612]: init_asynch() connect: -1


CLIENT_STATE.XML
<client_state>
<host_info>
<timezone>-14400</timezone>
<domain_name>DEV11</domain_name>
<ip_addr>10.254.85.29</ip_addr>
<host_cpid>2bf2ec63014425252842caa44f35bee7</host_cpid>
<p_ncpus>12</p_ncpus>
<p_vendor>GenuineIntel</p_vendor>
<p_model> Intel(R) Core(TM) i7-3930K CPU @ 3.20GHz [Family 6 Model 45 Stepping 7]</p_model>
<p_features>fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 cx16 sse4_1 sse4_2 popcnt aes syscall nx lm vmx tm2 dca pbe</p_features>
<p_fpops>3886147826.751692</p_fpops>
<p_iops>13659924974.119431</p_iops>
<p_membw>83333333.333333</p_membw>
<p_calculated>1382227939.455654</p_calculated>
<p_vm_extensions_disabled>0</p_vm_extensions_disabled>
<m_nbytes>8527855616.000000</m_nbytes>
<m_cache>262144.000000</m_cache>
<m_swap>17053802496.000000</m_swap>
<d_total>1000200994816.000000</d_total>
<d_free>293459722240.000000</d_free>
<os_name>Microsoft Windows 7</os_name>
<os_version>Ultimate x64 Edition, Service Pack 1, (06.01.7601.00)</os_version>
<coprocs>
<coproc_cuda>
<count>1</count>
<name>GeForce GTX 770</name>
<available_ram>1984212992.000000</available_ram>
<have_cuda>1</have_cuda>
<have_opencl>1</have_opencl>
<peak_flops>3411456000000.000000</peak_flops>
<cudaVersion>5050</cudaVersion>
<drvVersion>32723</drvVersion>
<totalGlobalMem>2147483648.000000</totalGlobalMem>
<sharedMemPerBlock>49152.000000</sharedMemPerBlock>
<regsPerBlock>65536</regsPerBlock>
<warpSize>32</warpSize>
<memPitch>2147483647.000000</memPitch>
<maxThreadsPerBlock>1024</maxThreadsPerBlock>
<maxThreadsDim>1024 1024 64</maxThreadsDim>
<maxGridSize>2147483647 65535 65535</maxGridSize>
<clockRate>1110500</clockRate>
<totalConstMem>65536.000000</totalConstMem>
<major>3</major>
<minor>0</minor>
<textureAlignment>512.000000</textureAlignment>
<deviceOverlap>1</deviceOverlap>
<multiProcessorCount>8</multiProcessorCount>
<coproc_opencl>
<name>GeForce GTX 770</name>
<vendor>NVIDIA Corporation</vendor>
<vendor_id>4318</vendor_id>
<available>1</available>
<half_fp_config>0</half_fp_config>
<single_fp_config>63</single_fp_config>
<double_fp_config>63</double_fp_config>
<endian_little>1</endian_little>
<execution_capabilities>1</execution_capabilities>
<extensions>cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_d3d9_sharing cl_nv_d3d10_sharing cl_khr_d3d10_sharing cl_nv_d3d11_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 </extensions>
<global_mem_size>2147483648</global_mem_size>
<local_mem_size>49152</local_mem_size>
<max_clock_frequency>1110</max_clock_frequency>
<max_compute_units>8</max_compute_units>
<opencl_platform_version>OpenCL 1.1 CUDA 4.2.1</opencl_platform_version>
<opencl_device_version>OpenCL 1.1 CUDA</opencl_device_version>
<opencl_driver_version>327.23</opencl_driver_version>
</coproc_opencl>
<pci_info>
<bus_id>1</bus_id>
<device_id>0</device_id>
<domain_id>0</domain_id>
</pci_info>
</coproc_cuda>
</coprocs>
</host_info>
<time_stats>
<on_frac>0.449535</on_frac>
<connected_frac>0.892453</connected_frac>
<cpu_and_network_available_frac>0.957078</cpu_and_network_available_frac>
<active_frac>0.859951</active_frac>
<gpu_active_frac>0.844445</gpu_active_frac>
<client_start_time>1382363049.424505</client_start_time>
<previous_uptime>4874.534945</previous_uptime>
<last_update>1382367919.957451</last_update>
</time_stats>
<net_stats>
<bwup>78960.710248</bwup>
<avg_up>2102294.572708</avg_up>
<avg_time_up>1382367923.369452</avg_time_up>
<bwdown>797694.292073</bwdown>
<avg_down>120754839.865780</avg_down>
<avg_time_down>1382346698.857296</avg_time_down>
</net_stats>
*** rest of file not included...
ID: 50943 · Report as offensive
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15483
Netherlands
Message 50945 - Posted: 21 Oct 2013, 15:53:49 UTC - in response to Message 50943.  

I have 12 processors/cores (Intel Core i7-3930K CPU @ 3.2G) and 8G of ram and a brand new NVIDIA 770 that replaced an NVIDIA 260.

Did you change power supply units when you went from the one to the next GPU? If not, what is your present PSU rated at?

When I do the calculations using http://support.asus.com/powersupply.aspx and http://www.corsair.com/us/learn_n_explore/?psu=yes (thanks, Claggy, for the links), they give a minimum PSU rating of 750 watts. If you have anything less, that can easily account for the instability problems that you see.
ID: 50945 · Report as offensive
david johnson

Joined: 9 Jun 09
Posts: 14
United States
Message 50951 - Posted: 21 Oct 2013, 21:30:53 UTC - in response to Message 50945.  

I have fairly new 750W PS installed.
ID: 50951 · Report as offensive
david johnson

Joined: 9 Jun 09
Posts: 14
United States
Message 50953 - Posted: 21 Oct 2013, 21:41:02 UTC - in response to Message 50951.  

Also, I had similar issues with the GTX 260 that was previously installed, and in that case the calculated total came out to 550 watts.
ID: 50953 · Report as offensive
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15483
Netherlands
Message 50955 - Posted: 21 Oct 2013, 21:51:12 UTC - in response to Message 50951.  

I have fairly new 750W PS installed.

A bit of an A-brand, or a cheap PSU?

When I look at tasks there are frequently tasks with "computation error"

Well, you can check on those tasks what their stderr.txt said. If you don't want to, give at least a link to such an erroneous task, but a link to your system at the project page would be nice as well.

Also, which projects do you run, and if more than one, do you see this behaviour at all of them? Only on CPU tasks, on GPU tasks, or on both?

ID: 50955 · Report as offensive
david johnson

Joined: 9 Jun 09
Posts: 14
United States
Message 50960 - Posted: 22 Oct 2013, 7:20:33 UTC - in response to Message 50955.  

Here are my projects:

http://boincstats.com/signature/-1/user/7275/sig.png

I set all projects to "No new tasks" and am letting them all finish until I can solve the issue; I can't have the machine rebooting 10 times a day.
ID: 50960 · Report as offensive
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15483
Netherlands
Message 50961 - Posted: 22 Oct 2013, 7:37:36 UTC - in response to Message 50960.  

That's not exactly useful. We can't do anything with that information, since there are plenty of David Johnsons at the projects, but none of them are you. I checked.

So please, post the first 40 lines of a BOINC start-up sequence; this will show the host IDs of your computer at the projects as well.
ID: 50961 · Report as offensive
david johnson

Joined: 9 Jun 09
Posts: 14
United States
Message 50964 - Posted: 22 Oct 2013, 17:13:35 UTC - in response to Message 50961.  

Not sure why that wasn't useful? Didn't it show all the projects I'm attached to? In any case, here is the start of my BOINC event log. I have suspended all projects except Einstein@Home and it's been stable for over a day (the longest it's been since the reload). I'm going to install a 900-1000 watt power supply today to make sure it's not related.

10/22/2013 3:11:37 AM | | No config file found - using defaults
10/22/2013 3:11:37 AM | | Starting BOINC client version 7.0.64 for windows_x86_64
10/22/2013 3:11:37 AM | | log flags: file_xfer, sched_ops, task
10/22/2013 3:11:37 AM | | Libraries: libcurl/7.25.0 OpenSSL/1.0.1 zlib/1.2.6
10/22/2013 3:11:37 AM | | Data directory: D:\ProgramData\BOINC
10/22/2013 3:11:37 AM | | Running under account Administrator
10/22/2013 3:11:37 AM | | Processor: 12 GenuineIntel Intel(R) Core(TM) i7-3930K CPU @ 3.20GHz [Family 6 Model 45 Stepping 7]
10/22/2013 3:11:37 AM | | Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 cx16 sse4_1 sse4_2 popcnt aes syscall nx lm vmx tm2 dca pbe
10/22/2013 3:11:37 AM | | OS: Microsoft Windows 7: Ultimate x64 Edition, Service Pack 1, (06.01.7601.00)
10/22/2013 3:11:37 AM | | Memory: 7.94 GB physical, 15.88 GB virtual
10/22/2013 3:11:37 AM | | Disk: 931.51 GB total, 273.36 GB free
10/22/2013 3:11:37 AM | | Local time is UTC -4 hours
10/22/2013 3:11:37 AM | | CUDA: NVIDIA GPU 0: GeForce GTX 770 (driver version 331.58, CUDA version 6.0, compute capability 3.0, 2048MB, 1802MB available, 3411 GFLOPS peak)
10/22/2013 3:11:37 AM | | OpenCL: NVIDIA GPU 0: GeForce GTX 770 (driver version 331.58, device version OpenCL 1.1 CUDA, 2048MB, 1802MB available, 3411 GFLOPS peak)
10/22/2013 3:11:37 AM | rosetta@home | URL http://boinc.bakerlab.org/rosetta/; Computer ID 1518755; resource share 100
10/22/2013 3:11:37 AM | superlinkattechnion | URL http://cbl-boinc-server2.cs.technion.ac.il/superlinkattechnion/; Computer ID 109671; resource share 0
10/22/2013 3:11:37 AM | Einstein@Home | URL http://einstein.phys.uwm.edu/; Computer ID 4676852; resource share 100
10/22/2013 3:11:37 AM | SETI@home | URL http://setiathome.berkeley.edu/; Computer ID 6418760; resource share 300
10/22/2013 3:11:37 AM | Spinhenge@home | URL http://spin.fh-bielefeld.de/; Computer ID 223333; resource share 50
10/22/2013 3:11:37 AM | | General prefs: from http://bam.boincstats.com/ (last modified 09-Jun-2009 15:15:06)
10/22/2013 3:11:37 AM | | Computer location: home
10/22/2013 3:11:37 AM | | General prefs: no separate prefs for home; using your defaults
10/22/2013 3:11:37 AM | | Reading preferences override file
10/22/2013 3:11:37 AM | | Preferences:
10/22/2013 3:11:37 AM | | max memory usage when active: 8132.80MB
10/22/2013 3:11:37 AM | | max memory usage when idle: 8132.80MB
10/22/2013 3:11:37 AM | | max disk usage: 0.00GB
10/22/2013 3:11:37 AM | | don't compute while active
10/22/2013 3:11:37 AM | | don't use GPU while active
10/22/2013 3:11:37 AM | | suspend work if non-BOINC CPU load exceeds 25 %
10/22/2013 3:11:37 AM | | (to change preferences, visit a project web site or select Preferences in the Manager)
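
The "Computer ID" values in the project lines above are the host IDs that were asked for. As a rough illustration, the following sketch pulls them out of pasted event-log text. The regular expression is an assumption based only on the line format visible in this post, and extract_host_ids is a hypothetical helper, not part of BOINC.

```python
import re

# Matches lines such as:
# "10/22/2013 3:11:37 AM | Einstein@Home | URL http://einstein.phys.uwm.edu/; Computer ID 4676852; resource share 100"
PROJECT_LINE = re.compile(
    r"\|\s*(?P<project>[^|]*?)\s*\|\s*URL\s+(?P<url>\S+);\s*Computer ID\s+(?P<host_id>\d+);"
)


def extract_host_ids(log_text):
    """Map project name -> (project URL, host ID) for every project line found."""
    hosts = {}
    for line in log_text.splitlines():
        match = PROJECT_LINE.search(line)
        if match:
            hosts[match.group("project")] = (match.group("url"), int(match.group("host_id")))
    return hosts


if __name__ == "__main__":
    sample = (
        "10/22/2013 3:11:37 AM | | Local time is UTC -4 hours\n"
        "10/22/2013 3:11:37 AM | Einstein@Home | URL http://einstein.phys.uwm.edu/; "
        "Computer ID 4676852; resource share 100\n"
        "10/22/2013 3:11:37 AM | SETI@home | URL http://setiathome.berkeley.edu/; "
        "Computer ID 6418760; resource share 300\n"
    )
    for project, (url, host_id) in extract_host_ids(sample).items():
        print(project, host_id, url)
```

Lines without a project name, such as the hardware and preference lines, are simply skipped.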
ID: 50964 · Report as offensive
david johnson

Joined: 9 Jun 09
Posts: 14
United States
Message 50966 - Posted: 22 Oct 2013, 19:37:53 UTC - in response to Message 50964.  

I also just installed GPU-Z, and when a CUDA task is running the GPU temperature doesn't get above 55 °C, with the fan at 60% and the GPU load at 41%. Maybe it's just the work unit, but why only 41% load on the GPU? I'll keep an eye on it and see if other work units use more GPU.
ID: 50966 · Report as offensive

Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.