Posts by floyd

41) Message boards : Questions and problems : How to solve libcurl3 dependency in Boinc Manager (Message 88527)
Posted 19 Oct 2018 by floyd
Post:
What do you mean the one from your libcurl3-less distribution?
I mean BOINC as packaged by Ubuntu. I assumed they do package it and that version of course can't depend on libcurl3 if there is no libcurl3.
Does an online package list like https://www.debian.org/distrib/packages exist for Ubuntu?
42) Message boards : Questions and problems : How to solve libcurl3 dependency in Boinc Manager (Message 88519)
Posted 19 Oct 2018 by floyd
Post:
Is there a version of Boinc Manager that does not require libcurl3?
Sure, the one from your libcurl3-less distribution. By the way, I think the client depends on libcurl, not the manager.
43) Message boards : Questions and problems : Not detecting GPU on Linux (Message 86843)
Posted 3 Jul 2018 by floyd
Post:
Glad you got it working, but I'd like to add a short off-topic remark. Maybe you should revise your cooling solution. I see no reason why just a CPU and a single GTX 750 wouldn't be able to run full throttle, even in summer, unless you live in a particularly hot place.
44) Message boards : Questions and problems : Not detecting GPU on Linux (Message 86780)
Posted 30 Jun 2018 by floyd
Post:
Same problem, every time I install a new version of Debian, boinc loses the GPU(s)!

lspci | grep VGA :

"01:00.0 VGA compatible controller: NVIDIA Corporation GM107 [GeForce GTX 750 Ti] (rev a2)"
I think we're talking about an upgrade from Debian jessie to stretch here. One thing that comes to mind is that all the fglrx stuff is no longer available in stretch. If your old installation relied on any of that, I think the result may not be fully functional, depending on how you upgraded. But never mind, you'll probably want to do a fresh install of the nvidia driver anyway. Debian stretch comes with everything required, for the GTX 750 at least.

I can add at least one more NVIDIA GTX750 (two if the power supply will stand them)
If the power supply will stand them? If there's any doubt, don't even try it! Pushing your hardware to the limits, and possibly beyond, is a great way to burn it. And a dying PSU can kill anything in your system. Sounds familiar?

I have another machine with the same motherboard, CPU and memory plus two NVIDIA GTX1070 Ti GPUs, but no point in switching it on since the motherboard says it needs Windo$e10
They'll always tell you you need the latest Windows for anything to work.

works OK with Windoze7
See? You can't give a d*n about what they say. Since you already have the hardware, why not at least try it with Linux if neither Windows 7 nor 10 is an option for you? That stuff is too expensive to just sit idle.

If I "apt-get install boinc-client-nvidia-cuda" (as root) the cards work and the machine dies! Normally the motherboard, power supply or hard drive burn out! I've got through 3 motherboards, 2 power supplies and 7 2Tb/3Tb hard drives! I can't afford to keep replacing things.
I've never seen BOINC kill any hardware. How would it do that? It can't put more stress on the hardware than anything else can, and if your system can't take 100% load it's defective from the start.

I don't really know how you can burn that much hardware, but one thing I could imagine is a not properly dimensioned or not properly connected PSU. There's more to look at than just a nice big total power output. Another thing could be heat. Multi GPU crunchers can generate a lot of heat, and you need to take care of that.

Back to the original topic, running BOINC with Nvidia GPUs on Debian stretch in this case. I added an Nvidia GPU to a headless cruncher only a few days ago, and all I had to install in addition to the already installed plain boinc-client package (which in your case probably is boinc aka boinc-client + boinc-manager) was nvidia-opencl-icd. That pulled in everything required. You'll probably have to do some configuration with your X setup, but computation works just like that.

Don't install any of the boinc-client-whatever packages. They're there for convenience, but if you don't exactly understand what they do they just add to the confusion. And don't install the original Nvidia driver. Sure, first thing everyone tells you is to go to nvidia.com and install the latest binary blob from there, but if your distribution's packaged driver is recent enough, it's much more convenient and less error-prone.

When you install nvidia-opencl-icd, you want not only its dependencies (obviously) but also the recommended packages. That way you eventually get nvidia-kernel-dkms, which keeps the Nvidia kernel module up to date. No more need to reinstall the driver after kernel upgrades. But apt-get doesn't subsequently install the recommendations of already installed packages, so you'd better make sure there are no remnants of old driver installations before you install nvidia-opencl-icd.

I don't think the driver in stretch is recent enough for a 1070 Ti, but the one in stretch-backports should do. There's a possible pitfall however. Just ask if you need to know.
45) Message boards : BOINC client : Upgraded Boinc to 7.10.1 and Manager can't connect to localhost anymore (Message 86062)
Posted 1 May 2018 by floyd
Post:
Quoting the linked bug report:

WorkingDirectory option in boinc-client.service was changed from

WorkingDirectory=~

to

WorkingDirectory=/var/lib/boinc

in upstream PR 2419 and this prevents starting boinc-client on Debian based
distros. The change was because enterprisey distros have older systemd's that
don't support ~ in paths.

Just a thought from someone using BOINC 7.6.33 on Debian 9.4 with systemd 232. In that version BOINC does not include a service file, and the Debian package has its own that does NOT set WorkingDirectory. man systemd.exec states:

If not set, defaults to the root directory when systemd is running as a system instance and the respective user's home directory if run as user.
I think that's just what's needed here if those older systemds use the same defaults.
46) Message boards : Questions and problems : BOINC downclocks my GPU's RAM by 500 MHz; doesn't happen with heavy gaming (Message 85936)
Posted 19 Apr 2018 by floyd
Post:
This is likely not a BOINC issue. You could try Folding@Home for comparison.
There was a thread at Einstein a few years ago, when the 970 was new. I haven't read through it though.
47) Message boards : Questions and problems : "No work available to Process" (Message 85554)
Posted 30 Mar 2018 by floyd
Post:
You haven't looked closely enough if you think your first computer is working fine. It completed SixTrack tasks successfully, as the other one would probably do if they hadn't run out. BOTH computers show 100% error rate for VirtualBox tasks, and the error messages suggest something is wrong with hardware support. My first action would be to check the BIOS settings regarding virtualization. For other options, please wait for somebody else to step up.
48) Message boards : Questions and problems : BOINC 7.8.3 is not picking up the avx, avx2 flags in Ubuntu 16.04.3 (Message 84469)
Posted 21 Jan 2018 by floyd
Post:
Here is the BOINC.log on the i7-4770 machine:
21-Jan-2018 11:05:39 [---] Starting BOINC client version 7.8.3 for x86_64-pc-linux-gnu
21-Jan-2018 11:05:39 [---] log flags: file_xfer, sched_ops, task
21-Jan-2018 11:05:39 [---] Libraries: libcurl/7.47.0 OpenSSL/1.0.2g zlib/1.2.8 libidn/1.32 librtmp/2.3
21-Jan-2018 11:05:39 [---] Data directory: /var/lib/boinc-client
21-Jan-2018 11:05:40 [---] CUDA: NVIDIA GPU 0: GeForce GTX 1070 (driver version 387.34, CUDA version 9.1, compute capability 6.1, 4096MB, 3984MB available, 6561 GFLOPS peak)
21-Jan-2018 11:05:40 [---] OpenCL: NVIDIA GPU 0: GeForce GTX 1070 (driver version 387.34, device version OpenCL 1.2 CUDA, 8105MB, 3984MB available, 6561 GFLOPS peak)
21-Jan-2018 11:05:40 [---] Host name: i7-4770-PC
21-Jan-2018 11:05:40 [---] Processor: 8 GenuineIntel Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz [Family 6 Model 60 Stepping 3]
21-Jan-2018 11:05:40 [---] Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm cpuid_fault invpcid_single pti tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid xsaveopt dtherm ida arat pln pts
21-Jan-2018 11:05:40 [---] OS: Linux Ubuntu: Ubuntu 16.04.3 LTS [4.13.0-26-generic]

That part was not much in question insofar as I was concerned, since the Ryzen 1700 machine showed that avx was there
To me it sounded as if you claimed AVX was not detected on the Intel machine. Now we know it is, and likely the server got that information too. Just to avoid another misunderstanding - BOINC does nothing with this information but pass it on. If you still don't get AVX tasks that's the project's fault or their decision. There's nothing you can do about it IMO.
49) Message boards : Questions and problems : BOINC 7.8.3 is not picking up the avx, avx2 flags in Ubuntu 16.04.3 (Message 84466)
Posted 21 Jan 2018 by floyd
Post:
avx and avx2 are both there.

I am running BoincTasks on a Win7 64-bit machine connected to the Ubuntu machine over the LAN. How are you running it?
Your BOINC command tool, be it BoincTasks or boincmgr or boinccmd, does not matter and its output is not decisive. What's important on your part is the BOINC client on the Ubuntu machine.

I have notified BoincTasks about it, and maybe they can find the problem.
It seems you still haven't checked the log files. Please do so before you run around pointing everywhere. The problem is with your client, or the LHC server, or the communication between them. Now it's your turn to verify the first.
50) Message boards : Questions and problems : BOINC 7.8.3 is not picking up the avx, avx2 flags in Ubuntu 16.04.3 (Message 84456)
Posted 21 Jan 2018 by floyd
Post:
That is curious. I don't see "stdoutdae.txt" anywhere in the /var/lib/boinc-client directory, where all the other logs are kept. (I do see it in my Windows 7 machine though.)

In later Debian (based) builds it goes to /var/log/boinc.log.
An even better place to check would be /var/lib/boinc-client/sched_request_*, to see what is actually reported to the server.
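A minimal Python sketch of that check follows. It assumes the request file lists the CPU flags space-separated inside a <p_features> element; the function name and the sample text are made up for illustration and not copied from a real request.

```python
import re

def reported_cpu_features(request_text):
    """Return the set of CPU flags found in a scheduler request, or an empty set."""
    # Assumption: the flags are listed space-separated inside <p_features>.
    match = re.search(r"<p_features>(.*?)</p_features>", request_text, re.DOTALL)
    return set(match.group(1).split()) if match else set()

# Hypothetical excerpt, not taken from a real request file.
sample = "<host_info>\n<p_features>fpu sse sse2 avx avx2</p_features>\n</host_info>"
features = reported_cpu_features(sample)
print("avx" in features, "avx2" in features)
```

To use it on a real machine, read the text of each /var/lib/boinc-client/sched_request_* file and pass it to the function.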
51) Message boards : Questions and problems : Is there a "device_nums" for app_config? (Message 84439)
Posted 19 Jan 2018 by floyd
Post:
I want 4 Einstein on the 4gb board but only 2 on the 2gb board.
Make sure the 4GB board is #0, then use app_config to set a maximum of 6 concurrent tasks. If you're lucky #0 is filled first, resulting in a 4+2 distribution.
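As a rough sketch, the limit could be generated like this in Python. The app name "example_app" is a placeholder, not a real Einstein application name; look up the actual name on the project's applications page and save the result as app_config.xml in the project's directory.

```python
import xml.etree.ElementTree as ET

def build_app_config(app_name, max_concurrent):
    """Build an app_config.xml body that caps concurrent tasks for one app."""
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name  # placeholder name, see lead-in
    ET.SubElement(app, "max_concurrent").text = str(max_concurrent)
    return ET.tostring(root, encoding="unicode")

print(build_app_config("example_app", 6))
```

Note that this only caps the total at 6. Which board each task lands on is still up to the client, so the 4+2 split remains a matter of luck.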
52) Message boards : Questions and problems : "Suspended - user request" vs "Task suspended by user" (Message 84383)
Posted 14 Jan 2018 by floyd
Post:
Turned off all automatic suspending options such as CPU usage or memory in the options.
Activity -> Run based on preferences
53) Message boards : GPUs : 1 of 2 WUs suspended due to activity on another GPU. (Message 83922)
Posted 17 Dec 2017 by floyd
Post:
I expected that the board it would run on would have its 2 Einstein tasks bumped off into suspension. Not only were they bumped off, but the one of the 2 Einstein tasks on the other board was also suspended. Thus I had only 2 active tasks, 1 on each GPU and 2 unused CPU cores.

This should not have happened.
Oh, I think it should, but "this" is not what you think it is.

Why was one of the Einstein tasks suspended from that other graphics board?
BOINC is clearing the other GPU for another Prime task. There's two individual Einstein tasks to suspend and restrictions to the timing to follow. Just watch for a while, the last Einstein task will be suspended soon.
This will continue to happen unless you can make Prime run on 0.5 GPU.
54) Message boards : Questions and problems : "Could not open directory 'slots' from '/home/user'" (Message 81888)
Posted 9 Oct 2017 by floyd
Post:
I upgraded from Debian 9.1 to 9.2 this morning and have since been unable to start BOINC 7.6.33 x86_64-pc-linux-gnu.
That upgrade didn't include BOINC and shouldn't have done anything to your user account. I don't see how that could break BOINC. However, the kernel and Nvidia drivers were upgraded. You did restart the computer after that, didn't you?

brian@brian:~$ boinc
Uhm, do you always start BOINC like that? And is /home/brian really your data directory? That's not how it's supposed to be. But again, IF this worked before the upgrade it should still work now.
55) Message boards : Questions and problems : How to recover BOINC? (Message 76149)
Posted 26 Feb 2017 by floyd
Post:
I think you missed two of the most important points about backups: Don't make a backup of data that could be changing during that process, and never ever overwrite data that something is working on. In this case that means before you even think of touching /var/lib/boinc-client, make sure BOINC is shut down.

I'd have done it this way:
0. Suspend network activity in BOINC.
1. Stop BOINC. Don't suspend it, shut it down!
2. Back up /var/lib/boinc-client and /etc/boinc-client. Verify the backup.
3. Reinstall as needed. Remember, if you do a fresh install of BOINC, it will be running and it will have a new identity.
4. Shut down BOINC. This is even more important now than before.
5. Restore /var/lib/boinc-client and /etc/boinc-client. Call yourself names because you forgot to back up the latter.
6. Start BOINC and make sure everything is working.
7. Resume network activity.

boinccmd --project http://www.worldcommunitygrid.org/ resume

BUT, get
Operation failed: authentication error

hmm.
I ran this as myself in /var/lib/boinc-client
which should pick up the auth in the current directory.

Which is of course not readable. Or is it? That would make the whole authentication useless.

Any ideas?

20 GOTO 10. Uh, make that GOTO 4. And hope your backup is good.
56) Message boards : Questions and problems : connecting to boinc remotely for rosetta@home (Message 76040)
Posted 20 Feb 2017 by floyd
Post:
1) What is the correct format of /etc/boinc-client/gui_rpc_auth.cfg ? I would like to supply a password for connect and the file is empty.
Just the password, without a LF at the end.
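Many editors silently append a newline when saving, so it may be easier to write the file from a script. A minimal Python sketch, with made-up function names; writing to /etc/boinc-client needs suitable permissions:

```python
from pathlib import Path

def password_bytes(password):
    """Encode the password with any trailing CR/LF stripped."""
    return password.rstrip("\r\n").encode("utf-8")

def write_rpc_password(path, password):
    """Write the password as the file's only content, without a trailing LF."""
    Path(path).write_bytes(password_bytes(password))

def has_trailing_newline(path):
    """Report whether an existing file ends in LF, e.g. one added by an editor."""
    return Path(path).read_bytes().endswith(b"\n")
```

For example, write_rpc_password("/etc/boinc-client/gui_rpc_auth.cfg", "secret") would leave a file of exactly six bytes.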

2) Do I need anything else other than "allow_remote_gui_rpc" set in /etc/boinc-client/cc_config.xml to allow connection?
You do not want allow_remote_gui_rpc, instead add your desktop to remote_hosts.cfg.

3) Did I read somewhere that setting #2 overrides password/other settings ??
I don't think so, but that would be another reason why you don't want it.

4) is 1043 the correct port for connect?
Don't know. Why don't you just ssh to the server and check? Besides, if you don't use some kind of filter you don't have to care.

5) I don't see anything in /var/log/boinc.log on my desktop machine to indicate any problems when I try and connect to the remote machine. Should I be looking elsewhere?
Sorry, I don't have any remote machines right now so I can't check. If you use DNS, try the server name with or without domain. For me, the Manager would not connect to "yeenoghu", without a message IIRC, though that name did resolve. "yeenoghu.home" did work. I seem to remember it was the other way around before.

Edit: typo
57) Message boards : Questions and problems : ** resolved ** GPU not being used / CPU High Priority Mode / Einstein (Message 75706)
Posted 4 Feb 2017 by floyd
Post:
What is high priority mode?
BOINC has an internal high priority mode (not to be confused with the OS level priority) that is activated when tasks are in danger of missing the deadline. Resources are assigned to high priority tasks first, and those tasks are not interrupted by the usual task switch. One problem with that is that BOINC no longer tells you when high priority is active. Many people who don't know about it or misinterpret the effects take it for a malfunction.
It is possible that a GPU is idle because no CPU is available to run a task on it. Somehow BOINC seems to always run one GPU task though. I don't know the reason behind that, it doesn't seem intentional to me.

Why did it happen?
As mentioned, high priority is activated when a task could miss the deadline. Possible reasons are
(a) A task is assigned too shortly before its deadline. Rosetta is an example of this case. There are tasks that were assigned just 2 (now 3) days before deadline and if you don't run a very small cache this will be too close if they aren't pushed ahead. There were complaints about the effects we see here and I've experienced it myself.
(b) Tasks take longer than expected. I'll take Einstein as an example because I think that's where your problem is in this case. You have some pretty fast GPUs and I think you'll be running single tasks on them. That means you're doing many fast tasks. Watch the expected run times. They'll be going down and BOINC will fetch more and more tasks to keep your cache filled. Unfortunately the same happens for CPU tasks and those are not that fast at all. When one of them finishes, the expected run times for the whole lot jump up and BOINC suddenly notices it has more work than it can handle. BANG, trouble.
(c) You can't run as many tasks as expected. (Again taking Einstein as an example and your 4 CPU cores, though the startup message says 8.) You define your cache size in days of work and in the process of translating that to tasks, BOINC assumes 4 CPU cores available, not taking into account that in normal operation you'll run at least 2 GPU tasks and those permanently block 2 CPU cores, leaving only 2 available. So your cache of CPU tasks actually lasts twice as long as expected. And if you run 2 tasks on each of your 2 GPUs, that means NO free CPU cores. Likewise if you have more GPUs. In any case, your cache is probably larger than you think, and certainly larger than BOINC thinks.
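The arithmetic behind case (c) can be sketched in a few lines of Python. This is a deliberately simplified model, assuming the amount of fetched work scales with the number of cores the client counts and drains at the number of cores actually free; the real work fetch takes more factors into account.

```python
def effective_cache_days(cache_days, assumed_cores, free_cores):
    """Days the CPU work queue really lasts when fewer cores are free than assumed."""
    if free_cores <= 0:
        raise ValueError("no free CPU cores: the CPU queue never drains")
    # Fetched work is cache_days * assumed_cores core-days;
    # it drains at free_cores core-days per day.
    return cache_days * assumed_cores / free_cores

# 1 day of cache, 4 cores assumed, 2 of them blocked by GPU tasks.
print(effective_cache_days(1.0, 4, 2))
```

With 4 cores assumed and only 2 free, a cache configured for 1 day lasts 2 days, which is the doubling described above.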

What will stop it happening again?
I don't think there's anything you can do to guarantee it won't happen again, but you can adjust your settings to reduce the effects. First, make your cache as small as possible. It's only meant for occasional network outages or project downtimes. With a reliable ISP and some backup projects you should be fine with a day or two. Set "Store up to an additional" to a low value to avoid fetching large chunks of work. That could reduce the effects of case (a) above. Personally, I try to avoid running both CPU and GPU tasks for the same project on a single machine to avoid case (b). That's the most likely one, the one with the largest effects and I think it is what we are seeing here, triggered at Einstein it seems.
58) Message boards : Questions and problems : ** resolved ** GPU not being used / CPU High Priority Mode / Einstein (Message 75668)
Posted 3 Feb 2017 by floyd
Post:
I'm wondering if this isn't just another case of unnoticed high priority mode. That is very often the reason when BOINC doesn't work as expected - still it does work as designed. In earlier BOINC versions the Manager would indicate when tasks ran in high priority mode but that feature was removed. I really think this was a bad decision and should be reverted.
If in this case the CPU tasks were high priority, BOINC would run a full set of them (8 I think) plus one GPU task even if that takes another full core. With the high priority tasks suspended, enough CPU is available for more GPU support.
59) Message boards : Questions and problems : BOINC Manager on Ubuntu - how??? (Message 73819)
Posted 6 Nov 2016 by floyd
Post:
I had the same issue once, the reason was a permission error in the configuration directory. It should look like this:

$ ls -al /etc/boinc-client/
total 32
drwxr-xr-x   2 root  root   4096 Oct 29 09:43 .
drwxr-xr-x 144 root  root  12288 Nov  6 06:39 ..
-rw-rw-r--   1 boinc boinc   520 Oct 26 11:52 cc_config.xml
-rw-rw-r--   1 root  boinc  1449 Oct 26 11:56 global_prefs_override.xml
-rw-r-----   1 root  boinc     1 Oct  9 20:33 gui_rpc_auth.cfg
-rw-r--r--   1 root  boinc   296 Feb 23  2016 remote_hosts.cfg
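
To compare a directory against that listing without eyeballing it, a Python sketch along these lines may help. It checks only the mode strings, not owner and group, and the helper name is made up for illustration.

```python
import os
import stat

# Mode strings taken from the listing above.
EXPECTED_MODES = {
    "cc_config.xml": "-rw-rw-r--",
    "global_prefs_override.xml": "-rw-rw-r--",
    "gui_rpc_auth.cfg": "-rw-r-----",
    "remote_hosts.cfg": "-rw-r--r--",
}

def mode_mismatches(directory, expected=EXPECTED_MODES):
    """Return (name, actual, wanted) for each file whose mode string differs."""
    problems = []
    for name, wanted in expected.items():
        path = os.path.join(directory, name)
        if not os.path.exists(path):
            problems.append((name, "missing", wanted))
            continue
        actual = stat.filemode(os.stat(path).st_mode)
        if actual != wanted:
            problems.append((name, actual, wanted))
    return problems
```

Calling mode_mismatches("/etc/boinc-client") returns an empty list when all four files exist with the modes shown.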
60) Message boards : GPUs : AMD Drivers for Debian Jessie - Installing not successful (Message 73625)
Posted 29 Oct 2016 by floyd
Post:
I'm afraid I don't fully understand how you got it working, but good to hear you did.

Install BOINC from the Debian Jessie back-port repository. The version in the standard repository did not recognize the GPU even though it was there and the driver was up-to-date.

Older BOINC versions expect to find the OpenCL library under a name that's not provided by current OpenCL packages. You would have needed to install the OpenCL development package. Recent BOINC versions, as included in jessie-backports, have a workaround for this. If you encounter a suggestion to manually add a link somewhere, ignore it.

I have removed my modifications to boinc-client and still need to do a restart so that the GPU is recognized. I can repair it if manually restarting the client becomes too much of an irritation.

If "repair it" means editing /etc/init.d/boinc-client, I strongly recommend against it. This file is not meant to be modified and unexpected things can happen if you do so. In general, don't make manual changes to system files or directories that aren't explicitly meant for it. This includes the link mentioned above.

As a side note, the BOINC packages in jessie-backports have been updated and a new /etc/init.d/boinc-client has been installed today. Not sure what would have happened to your modified file. Probably you would have been asked if you wanted to keep your modified version or install the new one. Neither is exactly what you would have wanted, but some predefined default action could have been really problematic.

You seem to be running lightdm. Open /etc/lightdm/lightdm.conf, find the line
#display-setup-script=
and change it to
display-setup-script=/usr/bin/xhost +si:localuser:boinc
That's not "xhost local:boinc" as has been mentioned before.

Now for the dummy plug. I've built quite a few GPU crunchers running Linux, both nVidia and AMD, including headless or dual GPU with one monitor, and I have never needed a dummy plug. Have you verified that it's still not working for you? If so, please show your xorg.conf.



Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.