Computer reboots when running BOINC

Message boards : Questions and problems : Computer reboots when running BOINC

Harry

Joined: 4 Dec 20
Posts: 6
Canada
Message 101963 - Posted: 5 Dec 2020, 4:03:36 UTC

My system is running the latest 64-bit Windows Pro release (all patches installed) on an Asus motherboard with an Intel i7 multicore processor, 8GB RAM, and HDDs in a RAID1 configuration.

Back in the summer I noticed that my computer had been rebooting overnight. This happened so often that I decided to shut down BOINC at startup. I've been running this way for months now and my computer has not rebooted. Yesterday I decided to get BOINC running again, so I downloaded the latest version from this web site, including Oracle VirtualBox. I restarted for the VirtualBox install, then started BOINC again, and it came up fine with fresh work units. I am connected to 6 projects: Asteroids, ClimatePrediction, Cosmology, Rosetta, Seti, and World Community Grid. After it connected with the servers I had 5 tasks active: 1 Asteroids, 2 Rosetta and 2 World Community Grid. After about 30-60 minutes, my system did a hard reboot. The Windows Event Viewer doesn't show much other than the event denoting a restart without a clean shutdown. I run my HDDs in a RAID1 configuration, so when this happens my system is difficult to use while the RAID1 goes through verification.

If there are BOINC or VBOX logs that might help, please tell me where they are and I can try to get them to someone so they can investigate. I am at a loss what is wrong. My only suspicion is that it has to do with WCG since I only started including them this past summer around the time the problem started. I have to keep BOINC down until this is resolved.

Harry
ID: 101963 · Report as offensive
Les Bayliss
Help desk expert

Joined: 25 Nov 05
Posts: 1486
Australia
Message 101964 - Posted: 5 Dec 2020, 4:47:04 UTC - in response to Message 101963.  

Just for starters, there's no Windows work for ClimatePrediction at present, and, as has been posted a lot here for months, SETI is effectively no more.
So you can take those 2 off your list.

So the next step is, do any of the remaining projects really need a VM?
ID: 101964 · Report as offensive
Jord
Volunteer tester
Help desk expert

Joined: 29 Aug 05
Posts: 14628
Netherlands
Message 101966 - Posted: 5 Dec 2020, 10:02:11 UTC

I would install a temperature monitoring program such as Core Temp and turn its logging on (F4); it will log to a .csv file in C:\Program Files\Core Temp. After the computer has rebooted, open that file (Notepad will do) and check what the temperatures were at the time of the crash.
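The log check described above can also be scripted. The following is only a rough sketch: it assumes a simplified CSV layout with a `Time` column and one column per core, which may not match the file Core Temp actually writes, so the column names and the helper `hottest_readings` are hypothetical and would need adjusting to the real header row. It lists the log rows where the hottest core came within a chosen margin of TjMax.

```python
import csv
import io


def hottest_readings(csv_text, temp_columns, tj_max=100.0, margin=10.0):
    """Return (time, hottest core temp) for rows within `margin` degrees of tj_max.

    The column names passed in `temp_columns` are assumptions; a real
    temperature log may label its columns differently.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    hot = []
    for row in reader:
        temps = []
        for col in temp_columns:
            try:
                temps.append(float(row[col]))
            except (KeyError, TypeError, ValueError):
                continue  # skip missing or non-numeric cells
        if temps and max(temps) >= tj_max - margin:
            hot.append((row.get("Time", ""), max(temps)))
    return hot


# Made-up sample data in the assumed layout, for illustration only.
sample = """Time,Core 0,Core 1,Core 2,Core 3
04:01:10,62,60,61,59
04:01:20,93,91,95,88
04:01:30,99,97,98,92
"""

for when, temp in hottest_readings(sample, ["Core 0", "Core 1", "Core 2", "Core 3"]):
    print(when, temp)
```

If the last rows before the reboot show up in this list, overheating is a plausible cause; if they do not, the power supply becomes the more likely suspect.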

You don't say if you run work on a GPU as well, and if so, what brand and model GPU.
How old is the system? Have you ever opened it up to remove dust build-up from the inside?
ID: 101966 · Report as offensive
ProDigit

Joined: 8 Nov 19
Posts: 633
United States
Message 101990 - Posted: 5 Dec 2020, 23:27:26 UTC

I'm also thinking either insufficient cooling, or an insufficient/broken PSU, or at least one that can't handle a load that maxes out your hardware.
ID: 101990 · Report as offensive
Harry

Joined: 4 Dec 20
Posts: 6
Canada
Message 102028 - Posted: 8 Dec 2020, 9:49:46 UTC - in response to Message 101966.  

I do have a GPU: GEForce GTX 750 Ti.

Thanks for the advice about Core Temp; I just installed it. The max column showed values in the high 90s and a TJmax of 100, so I guess I have been running hot. I just reset the max values, enabled overheat protection and enabled logging. Tomorrow I will run BOINC to see what happens.

The funny thing is, my ASUS Z97 motherboard came with some software to monitor the core temps but I couldn't find any documentation about normal operating temps or maximums for my processor so I turned off the alarms; yeah I know, pretty dumb but it was the middle of summer and I couldn't be sure they were telling me something really important or just giving me false alarms.

I only added one extra fan when I built the system in 2014. The case came with one for the power supply, the GPU has its own, and the processor has the one that came with the Intel processor, which I thought would be sufficient. So, I figured I only needed to add one more to the front of the case to push air directly over the two HDDs. I thought I read recently that the thermal paste could need replacing after a few years, but I've never done this and I don't know how easy it is to use too much or not enough. Cleaning off the existing paste looks like a delicate operation using a lint-free cloth.

Harry
ID: 102028 · Report as offensive
Harry

Joined: 4 Dec 20
Posts: 6
Canada
Message 102029 - Posted: 8 Dec 2020, 9:54:02 UTC - in response to Message 101964.  

I thought SETI was on hiatus till they got more data. If you say it will no longer do that then I can remove it.

If none of the active tasks need VM then I would think they won't start the VM environment. Some of my tasks do use VM so that is why I installed it.
ID: 102029 · Report as offensive
Jord
Volunteer tester
Help desk expert

Joined: 29 Aug 05
Posts: 14628
Netherlands
Message 102031 - Posted: 8 Dec 2020, 10:42:49 UTC - in response to Message 102028.  
Last modified: 8 Dec 2020, 10:43:26 UTC

Thanks for the advice about Core Temp; I just installed it. The max column showed values in the high 90s and a TJmax of 100, so I guess I have been running hot.
While running idle? Did you ever reopen the case and clean out the accumulated dust from fans and filters?

..and the processor has one that came with the Intel processor which I thought would be sufficient.
On the AMD Ryzen CPUs the complimentary Wraith cooler is sufficient. On any Intel CPU you require aftermarket cooling, as these CPUs get hot just by looking at the ceiling. You don't immediately need to go for a full AIO watercooling solution; air coolers such as the Hyper Evo are good as well. If you have clearance in your case, that is.

So, I figured I only needed to add one more to the front of the case to push air directly over the two HDDs. I thought I read recently that the thermal paste could need replacing after a few years, but I've never done this and I don't know how easy it is to use too much or not enough. Cleaning off the existing paste looks like a delicate operation using a lint-free cloth.
Lint-free cloth and isopropyl alcohol. It's rather easy to redo the thermal compound, and there are enough videos about it on YT. You'll need a blob about the size of a grain of rice on Intel and AMD CPUs, or you can spread it out with an old credit card so it lightly covers the whole top of the CPU.

There's no real need to actively cool HDDs; the air hole at the top isn't big enough to be affected by an air stream over them.

I like the channel of Linus Tech Tips the most, they explain things quite well. So when searching on YT, do for instance a search on "ltt air cooling case" or "ltt clean cpu" or "ltt cpu compound application".
ID: 102031 · Report as offensive
Harry

Joined: 4 Dec 20
Posts: 6
Canada
Message 102092 - Posted: 11 Dec 2020, 1:15:03 UTC - in response to Message 102031.  

Well, so much for my attempt to replace the thermal paste. I discovered why my system was overheating: the Intel heat sink that came with the processor was not fully seated. When I removed it I discovered that one of the four locking pins that was supposed to go into one of the four holes on the MB around the socket was partially bent. The old paste was spread out in a big dot on the lid, so I guess it was seated enough to work but not 100% correctly; I was lucky it ran since I built it in 2014. So, I removed the old paste as per the instructions you provided and some YT videos :). Everything looked clean and I tried to reseat the original heatsink/fan but couldn't get the four posts to go in the holes at the same time. When I rechecked I saw that one of the four pins was bent. I tried to straighten it out and somehow broke one of the retaining clips inside the post. So, I will go to the store tomorrow to buy a new (hopefully more efficient) CPU cooling fan.

As a side note, who thought it was a good idea to imprint black directional arrows on the tops of black posts for insertion into a rather dark cramped case. Any way I moved I seemed to leave some portion of the CPU socket in some or complete shadow making it impossible to locate the holes and then determine what direction to turn the darn posts to tighten things.
ID: 102092 · Report as offensive
Harry

Joined: 4 Dec 20
Posts: 6
Canada
Message 102574 - Posted: 15 Jan 2021, 1:09:09 UTC

Well, after cobbling together some replacement legs and pins from an older PC I was able to get my heat sink reinstalled with new Arctic Silver 5 thermal paste after a thorough cleaning using isopropyl alcohol. The heat sink posts went into their holes with a nice satisfying couple of clicks for each post. However, I am still running hot. The first three cores reach 100 C pretty quickly when running BOINC. Even at idle activity running under 10% load my cores are running in the mid 30-40C. Looks like I really do have to get a better heat sink but that is going to be a very big job since with my case I cannot access the bottom of the motherboard in order to attach the metal bracket those new heat sinks use. I will have to remove the MB to do this :(

So, waiting for the lock down to end!
ID: 102574 · Report as offensive
Dave

Joined: 28 Jun 10
Posts: 1380
United Kingdom
Message 102575 - Posted: 15 Jan 2021, 6:05:15 UTC - in response to Message 102574.  

Well, after cobbling together some replacement legs and pins from an older PC I was able to get my heat sink reinstalled with new Arctic Silver 5 thermal paste after a thorough cleaning using isopropyl alcohol. The heat sink posts went into their holes with a nice satisfying couple of clicks for each post. However, I am still running hot. The first three cores reach 100 C pretty quickly when running BOINC. Even at idle activity running under 10% load my cores are running in the mid 30-40C. Looks like I really do have to get a better heat sink but that is going to be a very big job since with my case I cannot access the bottom of the motherboard in order to attach the metal bracket those new heat sinks use. I will have to remove the MB to do this :(

So, waiting for the lock down to end!

Are there no heatsinks that are better than the original and don't require the metal bracket?
ID: 102575 · Report as offensive
Harry

Joined: 4 Dec 20
Posts: 6
Canada
Message 103262 - Posted: 27 Feb 2021, 0:43:42 UTC - in response to Message 102575.  
Last modified: 27 Feb 2021, 0:54:24 UTC

I decided to approach INTEL about my problem because I could see others with the same processor (I7 4790K) were also reporting temperature throttling and some were delidding theirs in order to replace the internal thermal paste; not something I wanted to contemplate. INTEL had me run several of their utilities and their Extreme Tuning benchmark to gather data and about a week after I did this they said my processor was no longer under warranty but they would send me a replacement heat sink. I couldn't get them to tell me anything about what the "happy numbers" should be for the processor voltage, frequency etc.

While doing this investigation I found I was overclocking but don't recall ever setting this up; such are the mysteries of life. In any case, I reset my BIOS to defaults and then restored the settings I need, including RAID1 support, so that my processor was now running at 4100MHz. At that point, with the replacement fan installed and using Core Temp (many thanks to Jord who recommended I run that, and yes, I did blow out the dust!), I could see a big initial improvement. But over about a 5 hour period with BOINC tasks running, the max temperatures per core kept slowly spiking into the high 90C range, starting from the low to mid 70s. To me that seemed to be saying the processor was still having overheating problems.

I took a look at the BOINC settings and changed the computing preferences and reduced the "% of CPU time" from 75 to 50, leaving the "% of the CPUs" at 50. This did the trick for me. Now, after several days, my max temperatures haven't exceeded 81C for any of the four cores.
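For anyone who prefers to set the same two limits in a file instead of through the preferences dialog, here is a rough sketch. It assumes that BOINC reads a local override file named `global_prefs_override.xml` from its data directory and that the elements `max_ncpus_pct` ("% of the CPUs") and `cpu_usage_limit` ("% of CPU time") are the right names; please check both against the BOINC documentation before relying on them. The helpers `build_override` and `effective_load` are made up for illustration, and `effective_load` is only a first-order estimate of average CPU load, not a temperature model.

```python
import xml.etree.ElementTree as ET


def build_override(cpu_pct, cpu_time_pct):
    """Build override XML text for the two CPU limits.

    Element names are assumed to match BOINC's local preferences override
    file; verify them before use.
    """
    for name, value in (("cpu_pct", cpu_pct), ("cpu_time_pct", cpu_time_pct)):
        if not 0 < value <= 100:
            raise ValueError(f"{name} must be in (0, 100], got {value}")
    root = ET.Element("global_preferences")
    ET.SubElement(root, "max_ncpus_pct").text = f"{cpu_pct:.6f}"
    ET.SubElement(root, "cpu_usage_limit").text = f"{cpu_time_pct:.6f}"
    return ET.tostring(root, encoding="unicode")


def effective_load(cpu_pct, cpu_time_pct):
    """Rough fraction of total CPU capacity BOINC may use on average."""
    return cpu_pct / 100 * cpu_time_pct / 100


# The change described above: "% of CPU time" lowered from 75 to 50,
# with "% of the CPUs" left at 50.
print(effective_load(50, 75))
print(effective_load(50, 50))
print(build_override(50, 50))
```

By this estimate the change cuts BOINC's average share of the processor from about 37.5% to 25%, which fits the lower temperatures reported.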

Bottom line for me: reduce the overclocking, replace the fan (it looks identical to the original) and reduce the BOINC %CPU. When I was speaking to the INTEL support person he mentioned a web site where you can check out processor tuning, and I wrote down ARK.FREQUENCY.COM, but I must have written it down incorrectly because it doesn't exist. They closed my case as soon as they shipped the fan, without asking me if the problem was resolved, and haven't responded to me since, so who knows.

Yes, I know this processor is supposed to have a base speed of 4000MHz, but it was also supposed to allow some overclocking. I think there is another BIOS setting I can change to get it below 4100MHz, but at this point I am happy with the current results. I am back to running BOINC tasks! As I write this I have 5 tasks running with temperatures peaking in the very low 70s and no temperature throttling. Perhaps my experience can be a lesson to other folks who are not experts in processor tuning.
ID: 103262 · Report as offensive
Jord
Volunteer tester
Help desk expert

Joined: 29 Aug 05
Posts: 14628
Netherlands
Message 103265 - Posted: 27 Feb 2021, 14:26:11 UTC - in response to Message 103262.  

When I was speaking to the INTEL support person he mentioned a web site where you can check out processor tuning, and I wrote down ARK.FREQUENCY.COM
The only Intel addresses I know that start with ARK are of this sort: https://ark.intel.com/content/www/us/en/ark/products/80807/intel-core-i7-4790k-processor-8m-cache-up-to-4-40-ghz.html

90C is still within the margin of the i7-4790K's TjMax (maximum temperature on the die) of around 100C, but I wouldn't run it continuously at such temperatures. I assume Intel sent you another of their default fans?
I would look for aftermarket cooling if I were you. But that depends on your case. If you have a midi tower or tower, you can look for the Cooler Master Hyper 212 EVO, the Corsair A500 or some other aftermarket package.
ID: 103265 · Report as offensive


Copyright © 2021 University of California. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.