Posts by Joseph Stateson

21) Message boards : GPUs : PCI express risers to use multiple GPUs on one motherboard - not detecting card? (Message 95349)
Posted 20 Jan 2020 by Joseph Stateson
Circling back to the original discussion: I signed up for Einstein@Home and did some PCIe testing.

I had always heard from other users that Einstein was PCIe dependent, to the point that anything less than x16 links caused tasks to run slower. But actual testing on numerous different cards and PCIe lane widths shows that's not true. Einstein is even less PCIe dependent, on both the Gamma Ray and Gravity Wave tasks. I saw about 1% PCIe bus use on both types of tasks on just a PCIe 3.0 x1 link, so it's no surprise that you haven't seen a slowdown. In light of this, it looks like SETI actually uses more PCIe bandwidth (at least on the optimized CUDA special app). Maybe in the past, with old tasks, Einstein relied more on PCIe, but that does not appear to be the case anymore.

As far as how many cards you can run, you will have to test and find the limiting factor of how many GPUs can be attached before the system will no longer boot. My guess is it will be somewhere between 3 and 7 GPUs. No way to tell without

The next limit will be CPU resources to support the GPU tasks. You only have a 4c/4t CPU, and a rather old/weak one at that compared to modern chips. Luckily Gamma Ray tasks don't seem to mind running on a weak CPU, but you may have bad results with the Gravity Wave tasks.

You'll have to test the impact on Milkyway though. I'm not going to attach to that one with the machines I have now, since it relies so heavily on DP (double-precision) performance and recent Nvidia cards like mine have abysmal DP performance for the cost/power use. I might build a Radeon VII based system in the future for Milkyway though; that's the best bang-for-buck card on that project.

That Gravity Wave "2.07" app consistently takes 100% of a CPU thread on my 4c/8t CPU, and I had to limit concurrent tasks to 6 (not the full 8) and also exclude the Zotac P106-90 card, which was OK on SETI but not too useful on Einstein. However, Asteroids@home uses only 0.01 CPU and that seems to work OK on the two slowest GPUs. Currently running 6 Gravity Wave and 2 Asteroids tasks. Maybe you can comment on this post and bump up my question.

[edit] Should be running 3 Asteroids, as there are a total of 9 GPUs. The CPU count should be six at 1.0 and three at 0.01, but unaccountably only 2 Asteroids are running. It took a while and the scheduling priority went from -1,000 to only -0.29, but I am still not running the additional Asteroids task. Something in 7.16.3, I think.
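
The CPU arithmetic in this post can be checked with a short sketch. It assumes only the figures quoted above (8 threads, 1.0 CPU per Gravity Wave task, 0.01 CPU per Asteroids task); the helper name `cpu_demand` is hypothetical, and this is not how the BOINC scheduler itself decides what to run.

```python
# Figures are taken from the post; the helper name is hypothetical.
THREADS = 8  # 4 cores / 8 threads


def cpu_demand(tasks):
    """Sum CPU usage over (task_count, cpu_per_task) pairs."""
    return sum(count * cpu for count, cpu in tasks)


# Six Gravity Wave tasks at 1.0 CPU plus three Asteroids tasks at 0.01 CPU
expected = [(6, 1.0), (3, 0.01)]
demand = cpu_demand(expected)

print(f"CPU demand: {demand:.2f} of {THREADS} threads")
print("fits" if demand <= THREADS else "over budget")
```

By this count the nine tasks fit comfortably within 8 threads, so CPU budget alone would not seem to explain why the third Asteroids task is not started.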
22) Message boards : GPUs : PCI express risers to use multiple GPUs on one motherboard - not detecting card? (Message 95320)
Posted 19 Jan 2020 by Joseph Stateson
Well, there are no publishable pictures of the complete beast. Indeed the complete beast is very boring to look at, just a large box with power, data and cooling connections.
Initially a small system was tested using RTX2080 to give an idea of what feeders were going to be needed. Next tests were with earlier Quadro cards, which left the RTX2080 behind; after six months (and some mods to the cooling) the RTX8000 were installed, and they are a step up again. The trouble with benchmarks and specs is that they don't always reflect what happens in real life under very high stress.

Being a totally air-cooled system, the GPUs were obtained without their fans, etc. Blast air at ~4°C keeps everything in check.

But we digress.

I dare you to run BOINC on it, just for a day.

Some Russian scientists tried something similar on their own "super beast". It did not go well.
23) Message boards : GPUs : PCI express risers to use multiple GPUs on one motherboard - not detecting card? (Message 95281)
Posted 18 Jan 2020 by Joseph Stateson
Also, why do people bother mining? I've tried it on GPUs and ASICs, it just is not profitable. The electricity cost is approximately twice the coins you earn.

I have been "mining" since classic SETI, but it was not called that back in 1999. Three (?) years ago I quit the Texas A&M club and joined the Gridcoin club. At the time I joined, a single GRC was just under a quarter USD as I recall. If it had risen to a full quarter I would have 61,000 * 0.25 = $15,250. Unfortunately it is currently worth less than 1/4 cent. I will let you do the math. The conclusion of this exercise is that I get a small return of something more valuable than just mining for "credits".
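
For anyone who does want to do the math, here is a minimal sketch using only the numbers in this post. The variable names are made up for illustration, and since the post only says "less than 1/4 cent", the second figure is an upper bound rather than an actual price.

```python
from decimal import Decimal

grc_held = Decimal("61000")        # balance quoted in the post
price_hoped = Decimal("0.25")      # a full quarter USD per GRC
price_now_max = Decimal("0.0025")  # "less than 1/4 cent": an upper bound

value_hoped = grc_held * price_hoped
value_now_max = grc_held * price_now_max

print(f"At $0.25 per GRC: ${value_hoped:,.2f}")
print(f"At under 1/4 cent per GRC: less than ${value_now_max:,.2f}")
```

Decimal is used instead of float so the dollar amounts come out exact.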
24) Message boards : Questions and problems : 7.16.3 has idle GPU: which parameter is causing the delay? (Message 95271)
Posted 18 Jan 2020 by Joseph Stateson
Did not notice the "debug", so I set just the sched_op to 1, not the debug flag.

I think the problem is that the project is out of work or the server is busy, and it is a coincidence that it happened at this time.

However, the "resetting" of the parameters after requesting a "read config" was not expected. Why would backoff times be reset to 0?

Scheduling priority is back to -1,000.97 as shown by both BoincTasks and BOINC Manager.
BT shows a 20-minute backoff interval for NVIDIA. I assume this is all correct as the server has problems. Should have checked their servers before posting. My other systems were crunching SETI just fine, but I didn't check to see if they were getting new work.
25) Message boards : Questions and problems : 7.16.3 has idle GPU: which parameter is causing the delay? (Message 95268)
Posted 18 Jan 2020 by Joseph Stateson
Set <sched_op_debug> and see what you're actually asking for. You need to distinguish between "SETI doesn't have any work available" and "I didn't even ask for any work, available or not".

Issuing "read config" seems to have messed with those parameters. I did not expect to see "0" for priority. I also verified this using BOINC Manager, not just BoincTasks.

108			1/18/2020 10:29:14 AM	Re-reading cc_config.xml	
140	Einstein@Home	1/18/2020 10:29:19 AM	Sending scheduler request: To report completed tasks.	
141	Einstein@Home	1/18/2020 10:29:19 AM	Reporting 1 completed tasks	
142	Einstein@Home	1/18/2020 10:29:19 AM	Not requesting tasks: "no new tasks" requested via Manager	
143	Einstein@Home	1/18/2020 10:29:21 AM	Scheduler request completed	
144	SETI@home	1/18/2020 10:29:47 AM	update requested by user	
145	SETI@home	1/18/2020 10:29:51 AM	Sending scheduler request: Requested by user.	
146	SETI@home	1/18/2020 10:29:51 AM	Requesting new tasks for NVIDIA GPU	
147	SETI@home	1/18/2020 10:30:36 AM	Scheduler request completed: got 0 new tasks

Duration correction factor	1.0000000000
Scheduling priority	0.00
CPU backoff time	--
Backoff Interval	-
NVIDIA backoff time	--
Backoff Interval	-

163	SETI@home	1/18/2020 10:37:52 AM	Sending scheduler request: To fetch work.	
164	SETI@home	1/18/2020 10:37:52 AM	Requesting new tasks for NVIDIA GPU	
165	SETI@home	1/18/2020 10:38:34 AM	Scheduler request completed: got 0 new tasks	

So my theory that getting the priority positive would fix things seems too simple a solution to a complex problem.

Not getting any tasks, and not a lot of help from the "project properties" toward diagnosing the problem. This might even be a problem with the project servers. I just got a timeout trying to access my account at the site.
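
The distinction quoted at the top of this post ("no work available" versus "didn't even ask") can be read off the event-log text. Below is a rough sketch that assumes the message wording matches the log excerpt above; `classify` is a hypothetical helper and not part of BOINC.

```python
def classify(messages):
    """Give a rough diagnosis from the event-log messages of one scheduler request."""
    requested = any(m.startswith("Requesting new tasks") for m in messages)
    declined = any(m.startswith("Not requesting tasks") for m in messages)
    got_zero = any("got 0 new tasks" in m for m in messages)
    if declined and not requested:
        return "client did not ask for work"
    if requested and got_zero:
        return "client asked, server sent nothing"
    if requested:
        return "client asked, outcome not shown"
    return "unknown"


# Message texts copied from the log excerpt in this post
einstein = [
    "Sending scheduler request: To report completed tasks.",
    "Reporting 1 completed tasks",
    'Not requesting tasks: "no new tasks" requested via Manager',
    "Scheduler request completed",
]
seti = [
    "Sending scheduler request: To fetch work.",
    "Requesting new tasks for NVIDIA GPU",
    "Scheduler request completed: got 0 new tasks",
]

print("Einstein@Home:", classify(einstein))
print("SETI@home:", classify(seti))
```

Applied to the excerpt, the SETI requests fall in the "asked, server sent nothing" group, which is consistent with the server-side explanation rather than a client setting.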
26) Message boards : Questions and problems : 7.16.3 has idle GPU: which parameter is causing the delay? (Message 95266)
Posted 18 Jan 2020 by Joseph Stateson
I am trying to reduce my count of Einstein tasks that I accidentally downloaded, and had set SETI to NNT (no new tasks) to concentrate on getting rid of the backlog of Einstein tasks. Due to limited CPU threads, the Einstein project is set to a maximum of 6 concurrent tasks, one for each of the first 6 GPUs. This leaves 3 GPUs idle, and there are two threads available.

After about 18 hours I decided to let SETI start downloading but I set the resource to 0 by selecting the "work=0" venue and requesting an update. Unlike my attempt at Einstein, I verified the resource was 0 before allowing more tasks. I am not getting any tasks, three GPUs are idle so I requested another update and nothing happened. I looked at the SETI project properties and am posting some of what I see as I suspect something there is causing the lack of work.
Duration correction factor	1.0000000000
Scheduling priority	-1,012.61
CPU backoff time	--
Backoff Interval	-
NVIDIA backoff time	1/18/2020 9:57:21 AM
Backoff Interval	00:10:00

in the time it took me to write this post (10 minutes?) the above changed as follows:
Duration correction factor	1.0000000000
Scheduling priority	-1,000.97
CPU backoff time	--
Backoff Interval	-
NVIDIA backoff time	1/18/2020 10:12:21 AM
Backoff Interval	00:20:00

I noticed the scheduling priority is slowly getting back to a positive number. When that becomes positive, will I start getting tasks? What can be done to speed this up, assuming my guess is correct?

Assuming it increased 10 points in 10 minutes (about 1 point per minute), it looks like 1000 / 1 = 1000 minutes to wait, which is over 16 hours. Can the priority be set to a value that is not a huge amount under zero?
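
A slightly more careful version of that estimate, as a sketch: it uses the two priority readings posted above and the rough 10-minute gap between them, and it assumes the priority recovers linearly, which BOINC may well not do. The variable names are illustrative only.

```python
p_first, p_second = -1012.61, -1000.97  # the two readings posted above
elapsed_min = 10.0                      # rough gap between readings (a guess from the post)

# Linear extrapolation: assumes the recovery rate stays constant
rate = (p_second - p_first) / elapsed_min   # points per minute
minutes_to_zero = -p_second / rate

print(f"rate: {rate:.3f} points/min")
print(f"time to reach 0: {minutes_to_zero:.0f} min (~{minutes_to_zero / 60:.1f} h)")
```

With these inputs the wait comes out on the order of 14 hours, so waiting for the priority to climb back on its own is not a quick fix.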
27) Message boards : GPUs : PCI express risers to use multiple GPUs on one motherboard - not detecting card? (Message 95261)
Posted 18 Jan 2020 by Joseph Stateson

I wonder if you can daisychain the 4 way splitters to get infinite cards?

I suspect you can have an infinite number of "bus id's" but, as suggested by pro digit, if a unique lane must be associated with each "bus id" then there is a limit.

On the other hand, if the driver is smart enough, it could use the same lane for all the traffic to the multiplexer (the 4-in-1) but that is a guess as I have no knowledge of the workings of the multiplexer.

Looking at this and assuming it is not "fake news" one would think that 104 boards on risers would need 104 lanes.
28) Message boards : GPUs : PCI express risers to use multiple GPUs on one motherboard - not detecting card? (Message 95237)
Posted 17 Jan 2020 by Joseph Stateson
I finally received the x1 to x16 USB risers. I don't yet have the 4 way version, it's in the post.

I connected an AMD R9 280x via one of the risers to the PCI Express 2.0 x16 slot, and it ran Milkyway or Einstein at full speed (two tasks per GPU). Same full speed when connecting it to the PCI Express 1.0 x1 slot.

I'm not sure it is really only PCI Express 1.0 though. This specs page doesn't state the version for the x1 slots:
I find it hard to believe they'd use both versions on the same motherboard, I'm just going by what someone wrote above.

I ran SETI on P5K, P5E and P7N boards using Core 2 Quad CPUs for years and gave most of them away. I did get one down from the attic that had a lot of x1 slots and tried risers, but when adding more GPUs the problem was the CPU, not the risers, and more so on Windows.

One test I would like to run, but I no longer have socket 775 boards, would be a load test on Einstein to see if the problem is the number of boards:

Using a Core 2 Duo, run 2 concurrent tasks on each of 2 boards and compare that to 1 task each on 4 boards. I have been wondering if dedicating a core to a single board with 2 tasks is more efficient than 2 cores allocated to 4 boards.

[edit] Forgot to mention in my earlier post: I bought a second set of x1-to-x16 risers from the same company as the first set. The new purchase came with a warning that the manufacturer had released a number of risers that had the polarity reversed on the capacitors. The seller included a picture of an incorrect assembly: the shaded half at the top of the capacitor was not on the same side as the colored design on the board where it was soldered. This would mean the + and - were reversed. The seller said to return any defective ones to him for replacement. I went and checked all my risers and all were OK.
29) Message boards : GPUs : PCI express risers to use multiple GPUs on one motherboard - not detecting card? (Message 95230)
Posted 17 Jan 2020 by Joseph Stateson
I had mixed results with risers on old motherboards and especially those 4-in-1 risers.

An older X8DTL (1366 socket) required a board in the x16 slot to install Ubuntu 18.04. After installing Ubuntu I was able to replace it with a riser. A 4-in-1 riser only showed one board when more than 1 ATI was used, so I never got more than 4 boards to work.

My TB-85 (8 slot) worked fine with 8 risers, all GTX 1060, Ubuntu 18.04. For a "SETI wow event" I temporarily added first a GTX 1070 and then a 1070 Ti. Things quickly went south, probably because of the different mix of boards.

I would see the following about twice a week:
"Unable to determine the device handle for GPU 0000:01:00.0: GPU is lost. Reboot the system to recover this GPU" **
In addition, the fan sensors frequently reported "ERR" instead of RPM.

I saw that "reboot" message daily when I added the second "extra" board. I tried a 2nd splitter thinking that keeping similar boards on the same splitter would help. I ended up getting an H110BTC that has 12 x1 slots and the 4-in-1s are in the scrap pile.

The TB-85 had settings for lane speed and I tried a lot of variations where I set the lane speed to spec 2 for the slot that had the 4-in-1, but eventually left it all as "default" as things got even worse.

The problem I have now with the TB-85 and H110BTC is projects like Einstein and GPUgrid that use almost a full CPU per task, while SETI and Milkyway use a small fraction. My gen 6 & 7 CPUs only support 8 threads, so there is a problem on the H110BTC: I cannot feed Einstein fast enough, and 10-minute work units stretch to 30+ minutes with 9 boards. I solve this by limiting the number of concurrent tasks and reporting fewer GPUs to the project than I have.

** I created a program that shuts down the GPUs and reports using a text message here, but I have not had a problem since I quit using those 4-in-1 risers, and I have a mix of 1660, 1060, 1070, P102-100, P104-100, P104-90 and all work fine. I had to do this because, more often than not, the work units would "time out" and another job was assigned, and very quickly I would have hundreds of errored-out tasks.
30) Message boards : Questions and problems : Move data dir on Ubuntu ? (Message 95168)
Posted 15 Jan 2020 by Joseph Stateson
You might want to add the "ReadWritePaths" and also the "EnvironmentFile" settings as shown below. Change the path "/var/lib/boinc" to what you want and move the files there.
After editing "/lib/systemd/system/boinc-client.service" you will have to run "systemctl daemon-reload".
A discussion of systemctl is here

If something goes wrong, use this for debugging:
journalctl -xe

I have not moved my files, but I have used the environment file at "/etc/default/boinc-client" to pass parameters to BOINC.
Post if you have problems, and also confirm if you got it working.

Description=Berkeley Open Infrastructure Network Computing Client

ReadWritePaths=-/var/lib/boinc -/etc/boinc-client
ExecStop=/usr/bin/boinccmd --quit
ExecReload=/usr/bin/boinccmd --read_cc_config
ExecStopPost=/bin/rm -f lockfile

31) Message boards : The Lounge : The Seti is Slumbering Cafe (Message 95092)
Posted 15 Jan 2020 by Joseph Stateson
Collatz has always made me feel stupid.

41 valid tasks and an RAC of 114k or so.

How about 320,000 credits every 5 and 1/2 seconds?

The project is good for credit points only and ranks up there with Bitcoin Utopia. No scientific value whatsoever, but that is just my honest opinion, worth about 2c. I did run up a lot of points on it and also on Bitcoin Utopia, but could have been finding solutions for medical problems over at WCG or doing other more useful work. Again, just IMHO, but I didn't know better.
32) Message boards : The Lounge : The Seti is Slumbering Cafe (Message 95084)
Posted 15 Jan 2020 by Joseph Stateson
I don't know what method or program you use to spoof the GPU count, but I can tell you for sure that max concurrent and the scheduler work totally differently (not broken) in BOINC 7.16 than in the previous versions. That is why we do not use it with the spoofed client we use. Instead, we manage the number of active cores/threads with CPU usage.

BTW I will remain at the outrage pub for about 1/2 hour, as I need to work tomorrow; hope that will be enough to satisfy the SETI Gods and bring the servers back to life. Tried to find a virgin here to sacrifice at the volcano and that was impossible.

I made a change to my program, as I had been applying the 64 to all projects. I am now using the project app_config and setting the number of GPUs depending on the project. Since this system has 9 GPUs, the below just limits the count to 4 instead of 9. SETI still has 64 to get through the offline time. However, the 4000 limit I use did not get me through the 13+ hours.
root@h110btc:/var/lib/boinc/projects/ cat app_config.xml

I set the value in cs_scheduler
    // update hardware info, and write host info
    iGPU = (gstate.spoof_gpus == -1) ? 0 : gstate.spoof_gpus;
    if(p->app_configs.spoofedgpus > 0) iGPU = p->app_configs.spoofedgpus;
    host_info.write(mf, !cc_config.suppress_net_info, false, iGPU);
33) Message boards : The Lounge : The Seti is Slumbering Cafe (Message 95080)
Posted 15 Jan 2020 by Joseph Stateson
My contention is that GPUs should never sit idle, regardless of any perceived debt. Apparently, the software feels otherwise.
I'd be interested to see if you experience anything like this.

Exactly what I have been looking at in the last 2 hours and trying to figure out. I had 4 GPUs idle that should have been running Einstein, and the other 5 GPUs are running Milkyway. This system normally runs SETI and GPUgrid at 100% and Einstein at 0%. I added Milkyway at 0 and after a while the Einstein GPUs went idle.

The work count in excess of 64 seems to be "lost work units" and I am guessing that number is not used when checking the GPU count. Both mining systems had a lot of "lost work units". However, I cannot account for something like 300 lost units. I only run Einstein when SETI is offline. I clicked on Einstein's "www host schedule log", which duplicates info shown in the event viewer: "...lost tasks..." However, I also saw a strange message, "..[CRITCAL] … two instances of the scheduler running..", or something to that wording. I am not running two instances of BOINC. The so-called "schedule" is an Einstein app that (my understanding) arranges to download database items, not just project work units.

There is no reason for the 4 GPUs to be idle. I aborted the Milkyway tasks as I didn't want them stopping Einstein from running. Einstein then started up and, !INCREDIBLY!, I got 3 GPUgrid work units. Probably been a week or more since any showed up. 7 of the 9 GPUs are at 100% utilization, but I have 2 idle due to the CPU not having enough threads.
34) Message boards : The Lounge : Help Desk Expert? (Message 95076)
Posted 15 Jan 2020 by Joseph Stateson
Yeah, happened to me too here. At DVDFab I posted CUDA Blu-ray movie "rip" times for various NVidia boards and became their first "Knowledgebase Contributor". Not sure if that was a good idea, but I think I am still allowed one backup for each movie I buy.
35) Message boards : The Lounge : The Seti is Slumbering Cafe (Message 95072)
Posted 15 Jan 2020 by Joseph Stateson
The extended outage allowed me to notice that a 4-core (8-thread) CPU cannot feed 9 GPUs running Einstein. I had to configure for 4 concurrent Einstein and 5 concurrent Milkyway tasks, and in addition had to scrap the "64" spoofed GPUs as that got too many Einstein tasks. I had resources set to 0 but got way more than 64 work units. Should have gotten 1 for each GPU, but I am looking at 110 on one mining system and 241 on another. Resource share for Einstein was 0 on both, so something is not right.
36) Message boards : Questions and problems : Building client only, fails because of missing libnotify (Message 94650)
Posted 2 Jan 2020 by Joseph Stateson
Thanks for your answer. I'm building in a , with no packages available.

I'm more concerned by the fact that libnotify is required while I'm trying to not build the manager :)

As mentioned by Keith, _autosetup is the key, and if errors show up there then that is a problem.

There is no configure file. Those 2 lines of code are in
running _autosetup uses and possibly other files to create "configure" is supposed to have that test for the manager as that is how it is determined if the manager gets created or not.
If you see an error message like "can't find wxwidgets" then you accidentally included the manager.

Walk-through below.

I have no idea what "simili linux-from-scratch" is. Do you have bash or dash or something else? What version?
I myself got caught by a script that behaved differently than I expected because it used sh instead of bash. Does your configure file have "#! /bin/sh" at the top or something else?
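
One way to answer the shebang question without opening an editor is sketched below. The helper `read_shebang` is hypothetical, and the demo runs on a throwaway file standing in for the real configure script; point it at your own ./configure to check which shell it asks for.

```python
import os
import tempfile


def read_shebang(path):
    """Return the interpreter line of a script, or None if it has no shebang."""
    with open(path, "rb") as f:
        first = f.readline().decode("utf-8", errors="replace").strip()
    return first if first.startswith("#!") else None


# Demo on a temporary file that stands in for ./configure
with tempfile.NamedTemporaryFile("w", suffix=".sh", delete=False) as tmp:
    tmp.write("#! /bin/sh\necho hello\n")
demo_path = tmp.name

shebang = read_shebang(demo_path)
os.unlink(demo_path)
print(shebang)
```

If the line names sh, remember that sh may be dash, bash, or something else on a given system, which is exactly where the behavior can differ.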

If you have
./configure --disable-server --disable-manager
then that test at line 36044 will take the "no" path and continue on;
otherwise it will take the "yes" path and continue on.


Suggestion: write a small script containing those lines of code and run it, something like
if test "$whatever" = yes; then
  echo "found a yes"
fi
where "$whatever" stands in for the actual expression being tested. If you get a syntax error then the problem is with whatever is running the script.

Also good is "bash -x ./configure", assuming you have bash. Else source it; else I don't know.


[EDIT] My opinion is only worth 2c. It used to be worth a lot less but I got promoted to "Help Desk Expert" so maybe the 2c is good.

The above assumes you got the source from GitHub after having selected the branch "client 7.16.3"; else all bets are off.
37) Message boards : Projects : Access Android desktop remotely? (Message 94646)
Posted 1 Jan 2020 by Joseph Stateson
You can use add-ons such as BOINCTasks Mobile (requires Windows) and AndroBOINC

1) Was not aware of AndroBOINC. Went there, looked around and found screenshots.

They will not display even with JavaScript enabled on Google's Chrome. Edge requires a policy change to run JavaScript, and I suspect the problem is something else. /svn/www/projects.png is not a valid URL; it is a folder. I tried the export to GitHub but that failed.

Where can I see screenshots? I do not have Android devices.

2) I use Splashtop to access the Windows system running BoincTasks. Probably not much different than using mobile BoincTasks on an iPad under Safari, except the tiny screen on the iPhone requires zooming and panning. $10 a year gets me remote access from anywhere, not just the subnet.
38) Message boards : BOINC client : Support for Visual Studio versions newer than 2013? (Message 94640)
Posted 1 Jan 2020 by Joseph Stateson
I have been building the client using VS2013 and also the latest Linux gcc (GitHub).
Recently I was able to build the Milkyway app for Windows on Linux using the mingw cross compiler (GitHub).
Also built TBar's "special seti" source on Linux using the latest gcc and CUDA libs (found zip at forum).

Was looking at building a Windows version of that SETI app and found a problem:

1>CUDACOMPILE : nvcc warning : nvcc support for Microsoft Visual Studio 2013 and earlier has been deprecated and is no longer being maintained
1>  support for this version of Microsoft Visual Studio has been deprecated! Only the versions between 2015 and 2019 (inclusive) are supported!

This is not a problem for the client, as it does not run any CUDA code. However, the app clearly needs to be built with CUDA, and I am guessing the newer libraries from NVidia might not be linkable with object code built by VS2013. That SETI app uses source code from the client, especially include files, and using VS2017 will require mods to the sources. I can try building the SETI app using VS2017 and was wondering if there is any active work on making the client compatible with VS2015 or later. Possibly the SETI app is best built with the mingw cross compiler instead of any MS product.
39) Message boards : Questions and problems : Data breach notification on (Message 94626)
Posted 31 Dec 2019 by Joseph Stateson
I have never seen that pic, nor was I even aware of this capability. I do use Chrome for first visits or searching, as I have Chrome locked down. My other browser is Edge. I don't like it, but it does work better on forms, mainly because I keep Chrome on a tight leash. Sometimes Chrome won't even show a required "captcha" popup because I loaded it with so many blocking extensions.

My normal desktop "office" system has McAfee via Dell and I pay for a subscription to McAfee. OTOH my Surface Pro has only Windows 10, plus I do pay for Malwarebytes Premium. One thing I noticed on the Surface Pro: if I browse to SETI@home and select "Number Crunching" and then the most popular thread, "server panic", I ALWAYS get a warning that a trojan was found. Some site in had or has a trojan or is well known for poor security and is on Malwarebytes' list. McAfee shows no problem, but who knows? The following is a screen grab from my SP4. BTW SETI has a "server panic" so often they start a new thread when the messages get too long. Currently #118. If you read the message behind the "trojan warning" you can understand why they constantly have panics: a 20,000 WU cache size and any number of GPUs you want (as long as you are a member of the club).

[edit] Thanks for letting me correct this post.
40) Message boards : Questions and problems : Big-little configuration and Boinc setup (Message 94625)
Posted 31 Dec 2019 by Joseph Stateson
Thanks for posting this. I was unaware of the big.LITTLE terminology and went and read up on it here

My take: Intel extends battery life by reducing the clock speed when the CPU is not being used much.
ARM has the potential of switching to a core that has fewer transistors, in addition to reducing the clock speed.
However, the OS has to implement the strategy and the applications need to be tailored.
The article indicates that if one app needs a big core then all switch, and vice versa, but better operating systems and better tailored apps can be more efficient.

I remember running a BOINC app on a BlackBerry; I forget what the Android version was, but it made for a really good hand warmer when crunching.


Copyright © 2020 University of California. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.