boinc-client crash and reboot my machine

computezrmle
Joined: 2 Feb 22
Posts: 53
Germany
Message 107696 - Posted: 3 Apr 2022, 18:00:30 UTC - in response to Message 107695.  

... a clean REBOOT and that broke my RAID-1 I have a disk in resynchronization...

An old rule claims:
"Keep it simple and stupid!"


If there's a use case where RAID is a must, it would be better to limit the applications on that system to the required minimum.
=> BOINC should not run on that system.

Running BOINC on an arbitrary system means that system most likely doesn't need a RAID.
ID: 107696
Dave
Help desk expert
Joined: 28 Jun 10
Posts: 1995
United Kingdom
Message 107698 - Posted: 3 Apr 2022, 20:40:34 UTC - in response to Message 107696.  

Running BOINC on an arbitrary system means that system most likely doesn't need a RAID.


I have RAID for my data but not for my system drive. I have yet to have a problem with BOINC related to the RAID.

And most people I know who use it don't have RAID but RAED (Random Array of Expensive Disks! At least by my standards).
ID: 107698
computezrmle
Joined: 2 Feb 22
Posts: 53
Germany
Message 107702 - Posted: 4 Apr 2022, 5:30:47 UTC - in response to Message 107701.  

... you're calling somebody stupid ...

I never called anybody stupid.
The generic "old rule" should just remind one (BTW: me too) that solutions appearing simple and stupid are often the better choice.
The more complex a solution is, the more error-prone it becomes.
I didn't expect that this would need to be explained.



How on earth can BOINC not like RAID?

As Dave already mentioned:
"Whatever it is almost certain that the problem is not BOINC itself but that BOINC is exposing that the problem is there."
+1


Back to the OP:
It might solve the problem to move BOINC to a disk that is not part of the RAID or (simple ...) not to use RAID at all.
ID: 107702
sprzyswa
Joined: 2 Feb 21
Posts: 30
France
Message 107704 - Posted: 4 Apr 2022, 9:13:27 UTC - in response to Message 107693.  
Last modified: 4 Apr 2022, 9:35:45 UTC

A (very long) while ago I had a power outage due to a heavy thunderstorm.
After that all my machines rebooted and seemed to work fine for a couple of weeks.
Then, all of a sudden, one machine crashed (rebooted) every time BOINC started a GPU task.

I was finally able to trace the error to a corrupted filesystem (multiple sector allocation on the hard disk).
The solution was to
- back up all data
- reformat the disk
- restore all data
- force a reinstall of the OS, drivers and all applications

Since then the machine runs fine again.


I checked the RAID-1 disks with smartctl and e2fsck and they are clean. I checked the RAM, which is clean too, and I also reinstalled the whole system (Ubuntu 20.04). I think it is a hardware problem, motherboard or CPU, because I have the same problem even with the application on a bootable USB key.

With VirtualBox on my machine I got the same problem...

Sam.
Powered by Debian & Ubuntu 20.04 LTS
Boinc version 7.16.6 x86_64-pc-linux-gnu
ID: 107704
computezrmle
Joined: 2 Feb 22
Posts: 53
Germany
Message 107705 - Posted: 4 Apr 2022, 12:01:58 UTC

An additional idea that came to my mind.

One of the first tests the BOINC client does is to launch a subprocess that checks the GPU capabilities.
If the system uses a wrong or somehow broken GPU driver this may be a possible source for the trouble.

This could be tested by starting BOINC without GPU support.
See options "<ignore_ati_dev>N</ignore_ati_dev>" ... "<ignore_nvidia_dev>N</ignore_nvidia_dev>":
https://boinc.berkeley.edu/wiki/Client_configuration
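
For illustration only, here is a minimal sketch of what such a configuration could look like, written as a small Python script that produces the XML. The helper name `build_cc_config` and the device number 0 are assumptions made for the example; the option names are the ones quoted above, placed inside the `<options>` section of `cc_config.xml` as described on the wiki page.

```python
import xml.etree.ElementTree as ET


def build_cc_config(ignore_nvidia=(), ignore_ati=()):
    """Return cc_config.xml text asking the client to ignore the given GPU device numbers.

    build_cc_config is a hypothetical helper written for this example.
    """
    root = ET.Element("cc_config")
    options = ET.SubElement(root, "options")
    # One element per device number to ignore.
    for dev in ignore_nvidia:
        ET.SubElement(options, "ignore_nvidia_dev").text = str(dev)
    for dev in ignore_ati:
        ET.SubElement(options, "ignore_ati_dev").text = str(dev)
    # Pretty-print when available (ET.indent exists from Python 3.9 on).
    if hasattr(ET, "indent"):
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Assumption: the suspect GPU is NVIDIA device number 0.
    print(build_cc_config(ignore_nvidia=[0]))
```

The output would be saved as `cc_config.xml` in the BOINC data directory, and the client restarted so that the GPU detection runs again with the new settings. If the reboot no longer occurs, the GPU driver becomes a more likely suspect.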
ID: 107705
sprzyswa
Joined: 2 Feb 21
Posts: 30
France
Message 107708 - Posted: 4 Apr 2022, 20:31:30 UTC - in response to Message 107705.  

An additional idea that came to my mind.

One of the first tests the BOINC client does is to launch a subprocess that checks the GPU capabilities.
If the system uses a wrong or somehow broken GPU driver this may be a possible source for the trouble.

This could be tested by starting BOINC without GPU support.
See options "<ignore_ati_dev>N</ignore_ati_dev>" ... "<ignore_nvidia_dev>N</ignore_nvidia_dev>":
https://boinc.berkeley.edu/wiki/Client_configuration


I am trying to recompile boinc-client and I had the same problem: while compiling on 3 CPUs that were running at more than 90% load, the machine rebooted. So BOINC is not the cause; I think it is a hardware problem, either the motherboard or the processor. Besides, I also had this problem during memory tests using the 4 CPUs, just as it occurs with BOINC when the 4 CPUs are at more than 90% load. So sorry to have incriminated BOINC, which I think has nothing to do with it.
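
For anyone wanting to reproduce this kind of reboot without BOINC or a compiler in the picture, a generic CPU load can be generated with a few lines of Python. This is only a rough sketch: the function name `load_cores`, the 60-second duration and the step-by-step increase in core count are assumptions for the example, not a calibrated stress test.

```python
import os
import subprocess
import sys

# Small program run in each child process: spin until the deadline passes.
BURN_SNIPPET = (
    "import sys, time\n"
    "end = time.monotonic() + float(sys.argv[1])\n"
    "while time.monotonic() < end:\n"
    "    pass\n"
)


def load_cores(workers, seconds):
    """Start `workers` busy-loop processes for `seconds` seconds each.

    Returns the list of exit codes (0 means the child finished normally).
    """
    procs = [
        subprocess.Popen([sys.executable, "-c", BURN_SNIPPET, str(seconds)])
        for _ in range(workers)
    ]
    return [p.wait() for p in procs]


if __name__ == "__main__":
    # Increase the number of loaded cores one at a time to see where trouble starts.
    for workers in range(1, (os.cpu_count() or 1) + 1):
        print(f"loading {workers} core(s) for 60 s", flush=True)
        load_cores(workers, 60)
```

If the machine also reboots under this load, that would support the suspicion that the cause lies in the hardware (power supply, motherboard, CPU or cooling) rather than in any particular application.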

Sam.
Powered by Debian & Ubuntu 20.04 LTS
Boinc version 7.16.6 x86_64-pc-linux-gnu
ID: 107708
Richard Haselgrove
Volunteer tester
Help desk expert
Joined: 5 Oct 06
Posts: 4926
United Kingdom
Message 107709 - Posted: 4 Apr 2022, 20:43:46 UTC - in response to Message 107708.  

BOINC (the program) is highly unlikely to stress "4 CPUs are at more than 90% load", except for 60 seconds or less at startup during benchmarking. That level of CPU activity is more likely to be attributable to one or more science projects, running under the direction of BOINC. We haven't discussed projects yet in this thread.
ID: 107709
sprzyswa
Joined: 2 Feb 21
Posts: 30
France
Message 107712 - Posted: 4 Apr 2022, 21:21:39 UTC - in response to Message 107709.  

BOINC (the program) is highly unlikely to stress "4 CPUs are at more than 90% load", except for 60 seconds or less at startup during benchmarking. That level of CPU activity is more likely to be attributable to one or more science projects, running under the direction of BOINC. We haven't discussed projects yet in this thread.


I'm on Einstein@Home and LHC@home, and often the 4 CPUs are at more than 90% load. I had to install a liquid cooler to keep the processor temperature from rising too much.
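
To see how hot the processor actually gets under that load, the temperatures the Linux kernel exposes can be read from sysfs. A minimal sketch, assuming the machine provides `thermal_zone*` entries under `/sys/class/thermal` (not every board does); the function name `read_thermal_zones` is made up for the example.

```python
from pathlib import Path


def read_thermal_zones(base="/sys/class/thermal"):
    """Return {zone name: temperature in degrees Celsius} for each thermal zone found.

    The kernel reports the `temp` file in millidegrees Celsius.
    Zones that cannot be read are skipped.
    """
    readings = {}
    for zone in sorted(Path(base).glob("thermal_zone*")):
        try:
            millideg = int((zone / "temp").read_text().strip())
        except (OSError, ValueError):
            continue
        readings[zone.name] = millideg / 1000.0
    return readings


if __name__ == "__main__":
    for name, celsius in read_thermal_zones().items():
        print(f"{name}: {celsius:.1f} C")
```

Running this in a loop while the projects are crunching would show whether the temperature climbs shortly before a reboot.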

Sam.
Powered by Debian & Ubuntu 20.04 LTS
Boinc version 7.16.6 x86_64-pc-linux-gnu
ID: 107712
Keith Myers
Volunteer tester
Help desk expert
Joined: 17 Nov 16
Posts: 735
United States
Message 107714 - Posted: 5 Apr 2022, 1:39:16 UTC - in response to Message 107713.  

Sam.
I've never found liquid cooling to be necessary. Just use decent heatsinks and fans. Water cooling is for quietness.

Nobody would ever proclaim any of my hosts as quiet and they are ALL water cooled.
GPU fans make the most noise, even when they are hybrid water cooled. And most of them also sing quite loudly. (coil whine)


ID: 107714
BOINC Moderator
Volunteer moderator
Project administrator
Joined: 10 Mar 20
Posts: 60
Message 107726 - Posted: 5 Apr 2022, 10:45:10 UTC
Last modified: 5 Apr 2022, 23:37:39 UTC

Nice conversation here on (water) cooling but completely off topic in this thread. Please start a thread outside this one on the subject and I'll happily move your earlier posts over. But for the topic, let's go back to sprzyswa and his problem(s).

Edit: the thread on water cooling vs air cooling continues at https://boinc.berkeley.edu/forum_thread.php?id=14632
ID: 107726
sprzyswa
Joined: 2 Feb 21
Posts: 30
France
Message 107730 - Posted: 5 Apr 2022, 20:13:18 UTC - in response to Message 107726.  

Nice conversation here on (water) cooling but completely off topic in this thread. Please start a thread outside this one on the subject and I'll happily move your earlier posts over. But for the topic, let's go back to sprzyswa and his problem(s).


Thanks, but for me Boinc is not the cause of my problem, so for me this discussion is closed.

Sam.
Powered by Debian & Ubuntu 20.04 LTS
Boinc version 7.16.6 x86_64-pc-linux-gnu
ID: 107730

Copyright © 2022 University of California. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.