Posts by JStateson

1) Message boards : GPUs : AMD GPU Task Turns Computer Off Immediately (Message 92805)
Posted 3 days ago by Profile JStateson
Post:
.... So this tells me it is not a heat issue, I would expect the computer to run for a little while before shutting down.
Whilst you would not expect the temperature to rise instantly and therefore might expect to see a bit of a delay, maybe the firmware is using something other than a temperature change to invoke a protection mechanism. I have no idea if this is ever done but perhaps it might be current draw that triggers the response. You don't mention what project is supplying the GPU work but if that work is really compute intensive, maybe some current limit is being tripped. I could imagine that happening quite quickly - almost instantly.



Try underclocking the GPU using MSI Afterburner or, if supported, AMD's Wattman. If it runs fine at the lower clock, then a current surge could be the problem.
2) Message boards : BOINC client : strange error msg: could not assign boinc user to group render (Message 92645)
Posted 19 days ago by Profile JStateson
Post:
I did an update to Ubuntu 18.04 followed by an upgrade and saw the following error messages:
Setting up boinc-client (7.16.1+dfsg+201908161115~ubuntu18.04.1) ...
usermod: group 'render' does not exist
Could not assign boinc user to group 'render'


Everything seems to be working fine. The BOINC client did terminate during that upgrade but a restart worked fine and I rebooted just to make sure.

I assume the errors are ignorable.
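If you want to silence the warning on future upgrades, one option (my own workaround, not something the package requires; the 'render' group is only used for GPU access via /dev/dri render nodes on newer setups, so on 18.04 this is purely cosmetic) is to create the group the postinst script expects:

sudo groupadd -r render          # create 'render' as a system group
sudo usermod -aG render boinc    # add the boinc user to it
sudo systemctl restart boinc-client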
3) Message boards : GPUs : Client wont start Getting Stuck at OpenCL (Message 92644)
Posted 19 days ago by Profile JStateson
Post:
How did you resolve that? The same issue is happening to me.


I never heard back from the original poster; it would be nice to know, even if the answer seems "stupid". The saying "the only stupid question is the one that is not asked" can apply to the answer too. OTOH the post had been up almost a week; maybe they gave up and left.

My suggestion was to look in the Windows event log for errors. The point where the client got hung up appears to be where it asks for hardware info, which it gets from the OS: 16-Jul-2019 01:07:01

Pressing Ctrl-C a couple of times could actually have caused that exit at 16-Jul-2019 01:15:21.

There are debug flags that can provide help, but I am not familiar with how to use them or how to interpret the results.

What problem are you seeing yourself?
4) Message boards : Questions and problems : Windows install issues (Message 92634)
Posted 20 days ago by Profile JStateson
Post:
On more than one occasion I have accidentally installed 32-bit BOINC on 64-bit Windows. I remember seeing errors with libraries. Just a guess. However, I was looking at the following:


Faulting module name: LIBEAY32.dll, version: 1.0.2.7, time stamp: 0x56d5fc8e


Using https://www.freeformatter.com/epoch-timestamp-to-date-converter.html

Your LIBEAY32.dll is dated 3/1/2016, 2:33:18 PM

I just checked two of my 7.14.2 Win10 x64 systems and they show 12/18/2016 4:46 PM for the same version 1.0.2.7.

Not sure of the significance. If your version includes the VirtualBox installer, possibly the DLL package is older.

I have not used service installs of BOINC since I got rid of my XP systems, so I cannot advise other than to say netplwiz can be used to log in automatically.
5) Message boards : Questions and problems : Windows install issues (Message 92625)
Posted 21 days ago by Profile JStateson
Post:
Is this 64-bit Windows? A 32-bit runtime problem?
6) Message boards : Questions and problems : need help debugging a problem: Linux 7.16.1 (Message 92621)
Posted 21 days ago by Profile JStateson
Post:
Have had this happen again. AFAICT it is caused by the GPUs that are on a splitter.

Was thinking about something along this line:

Instead of
 boinc_temporary_exit(180,"Cuda device initialisation failed");


do this instead, since the app knows which device failed:


 boinc_temporary_exit(180,"Cuda device=7 initialisation failed");


Unless I am mistaken, that string is passed back to the BOINC client as it shows up in the message log.
The device id could be extracted by the client and it would know which CUDA device was defective.
Could this info be used to prevent tasks being assigned to that device?
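Until something like that exists, I believe the client already supports excluding a known-bad device by hand with an <exclude_gpu> block in cc_config.xml. A minimal sketch (the device_num shown is a guess at the client's own 0-based number for the bad card, which may not match the app's 1-based numbering):

<cc_config>
  <options>
    <exclude_gpu>
      <url>http://setiathome.berkeley.edu/</url>
      <device_num>6</device_num>  <!-- assumed: the client's number for the defective GPU -->
    </exclude_gpu>
  </options>
</cc_config>

Of course that only helps once you already know which device number is bad, which is the hard part.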

Another piece of the puzzle from this task (not sure how long it stays in the SETI database):

The SETI app reports the following:
In cudaAcc_initializeDevice(): Boinc passed DevPref 7
setiathome_CUDA: CUDA Device 7 specified, checking...
   Device cannot be used
  Cuda device initialisation retry 1 of 6, waiting 5 secs...
setiathome_CUDA: Found 6 CUDA device(s):
  Device 1: GeForce GTX 1660 Ti, 5944 MiB, regsPerBlock 65536
     computeCap 7.5, multiProcs 24 
     pciBusID = 4, pciSlotID = 0
  Device 2: GeForce GTX 1070 Ti, 8117 MiB, regsPerBlock 65536
     computeCap 6.1, multiProcs 19 
     pciBusID = 1, pciSlotID = 0
  Device 3: GeForce GTX 1070, 8119 MiB, regsPerBlock 65536
     computeCap 6.1, multiProcs 15 
     pciBusID = 5, pciSlotID = 0
  Device 4: GeForce GTX 1060 3GB, 3019 MiB, regsPerBlock 65536
     computeCap 6.1, multiProcs 9 
     pciBusID = 3, pciSlotID = 0
  Device 5: GeForce GTX 1060 3GB, 3019 MiB, regsPerBlock 65536
     computeCap 6.1, multiProcs 9 
     pciBusID = 8, pciSlotID = 0
  Device 6: GeForce GTX 1060 3GB, 3019 MiB, regsPerBlock 65536
     computeCap 6.1, multiProcs 9 
     pciBusID = 10, pciSlotID = 0
In cudaAcc_initializeDevice(): Boinc passed DevPref 7


I assume the Stderr output above is from the app, not BOINC.

Clearly, the client says to use device 7 (DevPref 7). It does not know that the device is defective; if it did, it would not have recommended that device. It needs some feedback from the app to make that decision. Complicating this, the "Device cannot be used" might apply only to this project's app, and some other project's app might have no problem using the device. However, the above series of messages seems strange: why is the app even trying other devices if it was given a preference of "7"? This is best answered by the project, but the BOINC developers should be aware that the app is trying devices other than what was recommended because 7 had a problem. The client (IMHO) would never launch an app unless a resource was available. ******

Another factor is the pciBusID. As listed above they are numbered 4, 1, 5, 3, 8, 10. Note that "2" is missing.

When I ran nvidia-smi on the system that generated the above Stderr output, I got the following: ***
jstateson@tb85-nvidia:~$ nvidia-smi
Unable to determine the device handle for GPU 0000:02:00.0: GPU is lost.  Reboot the system to recover this GPU 


What is interesting is that BOINC runs just fine, and the SETI apps on the other 6 GPUs also run fine, but nvidia-smi cannot get the handle to one of its GPUs and simply says to reboot the system. Handles are provided by the OS (Ubuntu 18.04). That GPU, "Device 7", is hung; nvidia-smi says the bus id is 2, the client in the message log uses devices D0...D6, and the SETI app uses 1..7. I do not know how the numbering of the bus-id works. One would think that the device driver's numbering would be used rather than a made-up number (1..7) or (0..6), etc.

Also, I suspect this forum is not the place to offer constructive criticism. It is a public forum for questions / problems about running the client or manager, and criticism here tends to bring out tribal instincts from non-programmers. For Gridcoin, the programmers tend to use steemit or reddit. GitHub also has a forum. Maybe one of those is a better place to discuss this, assuming anyone really wants to.

*** I thought that was funny. It reminded me of a project for the Canadian Navy I worked on. The contract specified that the system had to run a minimum of 24 hours without rebooting. Here, there is a problem with the GPU, the driver has lost communication, and nvidia-smi recommends a reboot of the system. If the GPUs were each assigned target acquisition, that could be a real problem in a naval conflict. Fortunately, BOINC is not a mission critical app, nor is SETI.

****** If a resource is available and an app is launched, and that app fails to use that resource and then repeatedly tries to find another resource, it seems this could cause a race between itself and the client, as I assume the client is also looking for open resources. If a resource is freed, say GPU-x, and the client hands out x as a resource, possibly the app could also grab that same x, which could cause a conflict. I am also seeing leftover tasks that the client cannot terminate:
7209	SETI@home	8/10/2019 3:34:45 PM	[error] garbage_collect(); still have active task for acked result blc32_2bit_guppi_58643_76143_HIP73005_0101.26078.409.23.46.97.vlar_0; state 5

It is just a guess/speculation that these are related, but they only show up on my systems that use splitters to add additional GPUs.
7) Message boards : Questions and problems : need help debugging a problem: Linux 7.16.1 (Message 92595)
Posted 23 days ago by Profile JStateson
Post:

From what I read around, if you set environment variable
export CUDA_DEVICE_ORDER=PCI_BUS_ID
the GPU IDs will be ordered by pci bus IDs and will show the same output as in nvidia-smi.


Thanks Jord!

Tried that, first in bash, and then ran /etc/init.d/boinc-client restart.
That did not work, so I then edited "profile" and rebooted.
Had the same problem, but at least the variable was not missing when I logged in using xterm.
I then did an /etc/init.d/boinc-client restart while in bash, but no change.
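Worth noting: an export in a login shell or in "profile" never reaches the boinc-client daemon, since systemd starts it with its own environment. A sketch of handing the variable to the service instead (assuming the stock Ubuntu systemd unit; the client may still apply its own ranking regardless, so no guarantee it changes the D0..D6 order):

sudo systemctl edit boinc-client
#   then add these two lines in the override file that opens:
#   [Service]
#   Environment=CUDA_DEVICE_ORDER=PCI_BUS_ID
sudo systemctl daemon-reload
sudo systemctl restart boinc-client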

I get the following all the time. As you can see, nvidia-smi reports a different order. The BOINC Manager order matches the coproc_info.xml file.

TERM=xterm
SHELL=/bin/bash
CUDA_DEVICE_ORDER=PCI_BUS_ID
SHLVL=1
LOGNAME=jstateson
DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/1000/bus
XDG_RUNTIME_DIR=/run/user/1000
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
LESSOPEN=| /usr/bin/lesspipe %s
_=/usr/bin/printenv
jstateson@tb85-nvidia:~$ cd /var/lib/boinc-client/
jstateson@tb85-nvidia:/var/lib/boinc-client$ grep -i gtx coproc_info.xml
   <name>GeForce GTX 1660 Ti</name>
   <name>GeForce GTX 1070 Ti</name>
   <name>GeForce GTX 1070</name>
   <name>GeForce GTX 1070</name>
   <name>GeForce GTX 1060 3GB</name>
   <name>GeForce GTX 1060 3GB</name>
   <name>GeForce GTX 1060 3GB</name>
      <name>GeForce GTX 1660 Ti</name>
      <name>GeForce GTX 1070 Ti</name>
      <name>GeForce GTX 1070</name>
      <name>GeForce GTX 1070</name>
      <name>GeForce GTX 1060 3GB</name>
      <name>GeForce GTX 1060 3GB</name>
      <name>GeForce GTX 1060 3GB</name>
jstateson@tb85-nvidia:/var/lib/boinc-client$ nvidia-smi
Mon Aug 26 15:52:26 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.40       Driver Version: 430.40       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 107...  Off  | 00000000:01:00.0 Off |                  N/A |
|100%   41C    P8    13W / 180W |     12MiB /  8117MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 1070    Off  | 00000000:02:00.0 Off |                  N/A |
|100%   46C    P8    12W / 151W |      9MiB /  8119MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  GeForce GTX 106...  Off  | 00000000:03:00.0 Off |                  N/A |
|100%   40C    P8     8W / 120W |      9MiB /  3019MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   3  GeForce GTX 166...  Off  | 00000000:04:00.0  On |                  N/A |
|100%   42C    P8    16W / 120W |     17MiB /  5944MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   4  GeForce GTX 1070    Off  | 00000000:05:00.0  On |                  N/A |
|100%   37C    P8     9W / 151W |     18MiB /  8119MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   5  GeForce GTX 106...  Off  | 00000000:08:00.0 Off |                  N/A |
|100%   41C    P5     7W / 120W |      9MiB /  3019MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   6  GeForce GTX 106...  Off  | 00000000:0A:00.0 Off |                  N/A |
|100%   41C    P8     9W / 120W |      9MiB /  3019MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+


I then went to boinc-master and did a recursive grep for CUDA_DEVICE_ORDER.
Nothing showed up, but I did get a hit on PCI_BUS_ID; pretty sure it is not used for ranking.
---------- GPU_NVIDIA.CPP
    CU_DEVICE_ATTRIBUTE_PCI_BUS_ID = 33,
        (*p_cuDeviceGetAttribute)(&cc.pci_info.bus_id, CU_DEVICE_ATTRIBUTE_PCI_BUS_ID, device);


It would really be useful for debugging purposes (hardware or software) if the GPU0...GPU6 shown by nVidia matched the D0..D6 shown by BoincTasks or the BOINC Manager.

Back around 2007, before I retired, I took a picture of myself standing in front of a 4096-blade system that took up an entire bay, maybe 8 huge racks of servers, that was being shipped to an Okinawa army base. There was a problem with the 1394a control interface. No one pointed fingers or complained about hardware. It just had to be fixed, and fixed it was, in software. I know of no way to identify GPUs other than stopping the fan and making a note of which device stopped. Will be more careful of where I put my finger in the future.
8) Message boards : Questions and problems : need help debugging a problem: Linux 7.16.1 (Message 92576)
Posted 24 days ago by Profile JStateson
Post:
I'd be prepared to place a small bet that this is a hardware problem, not related to the software version (of either BOINC or SETI) in use.


I agree 100%, as this has never happened to me on a machine without GPU riser cards and cables.


Risers and cables are a symptom of adding more GPUs to a motherboard than it was designed to use, or than the OS can manage, or the drivers can handle.

I can run nvidia-smi in a loop all day with 2 or 3 video boards and the fan speeds and usage are reported just fine. When I add additional GPUs I start seeing "ERR" under fan speed on random GPUs, and usage varies erratically.

We are pushing the envelope: "going where no BOINC program has gone before." At least, for the two-week WOW! event.
9) Message boards : Questions and problems : need help debugging a problem: Linux 7.16.1 (Message 92569)
Posted 25 days ago by Profile JStateson
Post:

My dual GTX 1660 Ti machine is currently drawing about 360W from the wall, falling to a little over 300W when the CPU is idled


This system draws 670W at the wall, and the power supply is either a 750W or 850W Seasonic Gold. I will have to pull it out to see exactly which it is. There are two GTX 1060s on a 4-in-1 splitter and possibly those are the problem. Next time it fails I will remove the splitter and go with just 4.

[EDIT] It is 850 watts. I used a DeWalt inspection camera to read the label. I managed to avoid knocking any of the x1 adapters loose on the rig under the power supply.
10) Message boards : Questions and problems : need help debugging a problem: Linux 7.16.1 (Message 92564)
Posted 25 days ago by Profile JStateson
Post:
Using the following two error messages
   Device cannot be used
  Cuda device initialisation retry 1 of 6, waiting 5 secs


I cannot find any matching phrase searching recursively through the 7.16.1 source:

grep -r "Cuda device initialisation retry"

grep -r "Device cannot be used" .

I did find the "Cuda device initialisation retry" in the SETI source and spotted the following as an exit during Cuda initialization:
	  boinc_temporary_exit(180,"Cuda device initialisation failed");


Somehow this error needs to get more visibility to the user. Possibly it is buried in the event log. All I see in the manager is a lot of tasks "waiting to run", which is NOT an error but a symptom.

I was unable to find "Device cannot be used" anywhere, but if
	  boinc_temporary_exit(180,"Cuda device initialisation failed");

is reported to the client, then the app did its job, even if not much.
11) Message boards : Questions and problems : What happened to "requested" and "granted" credits? (Message 92563)
Posted 25 days ago by Profile JStateson
Post:
WCG does as well.


Thanks, I knew I had seen it somewhere.

I have a program that graphs work unit elapsed time by GPU. I had the idea of graphing credit instead. According to Richard (another post somewhere, but I don't remember where), the credit estimate is based on the expected work to be done, which goes into calculating the time estimate. If I could figure out how to get that original "requested credit" it might make a more accurate plot of credit vs. runtime. There is a nice plot of actual credit vs. runtime here, but it was done manually by looking up 100 values from the project web site. I would like to implement something like this in my BoincTasks history analyzer, but I need something close to the actual credit since I cannot get the true credit.

Another problem: for some reason, WCG is the only project I cannot scrape for statistics.

It appears they do more security checking than any other project, and my program cannot access my data even though I have auto logon enabled. There is probably a way to access it, but it is not worth the trouble to debug.
12) Message boards : Questions and problems : need help debugging a problem: Linux 7.16.1 (Message 92562)
Posted 25 days ago by Profile JStateson
Post:
This is likely a hardware problem. It is solved by rebooting, but I would like to know what could cause this. I now have the capability of building the (Linux) client and could look at where this occurs and possibly come up with an error message to notify the user that the problem has started.

---once every couple of days----

On a 5 GPU rig, one of the GPUs crunches for 0-1 seconds then goes on to another work unit. A queue of "waiting to run" starts building up. Because there are 4 other working GPUs, they pull from this queue so the queue grows only slowly. After about an hour or two there might be 40 items in the queue.

There are no error messages in the event log, and the work units all eventually finish and report back OK. There is just no productivity from the GPU that has the problem (assuming it is the same GPU).

There are "error" messages in the stderr file associated with the task.
https://setiathome.berkeley.edu/result.php?resultid=7986887720

Another problem (may be a feature): The GPUs are numbered 0..X where 0 is given to the "best" GPU and larger numbers to the "weaker" ones. I do not know why BOINC bothers to rank GPUs. There seems to be no need, and it makes it difficult to find which GPU is causing the problem, assuming the problem is a unique GPU. Why can't BOINC use the same GPU number that nVidia uses in nvidia-smi, or that ATI uses in "sensors"? Currently, I have to stop the fan spinning on a GPU, look at nvidia-smi to see which GPU has the fan stopped, make a note of the BUS-ID, look up that BUS-ID in the file coproc_info.xml, and then look at the BOINC Manager to see if that "Dx" is the same "Dx" that is crunching for only 0-1 seconds. This is very awkward, plus dangerous depending on the fan type. I can post a picture of my bloody finger if anyone wants to see it.
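A less bloody way to line the two numberings up, assuming coproc_info.xml carries the PCI bus IDs the client detected (the exact tag names may differ between client versions), would be something like:

# driver's view: index, PCI bus id, name
nvidia-smi --query-gpu=index,pci.bus_id,name --format=csv
# client's view: names and PCI info in detection order
grep -iE "<name>|bus_id" /var/lib/boinc-client/coproc_info.xml

Matching the bus IDs between the two lists would give the nvidia-smi index for each Dx without touching a fan.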

[edit] I may have looked in the wrong event log using BoincTasks. Next time I will check the event log more carefully for an error message.
13) Message boards : Questions and problems : What happened to "requested" and "granted" credits? (Message 92515)
Posted 17 Aug 2019 by Profile JStateson
Post:
At one time there was a report that showed credits requested & granted, and, after validation, the granted credits were filled in. Was that classic SETI or one of the current projects? I looked at some project stats but didn't see any statistical column for "granted". I thought I had seen that somewhere. If that feature were available, it would be nice to know at the time of upload what the requested credit was. Even if the value was wrong, it would be useful for a rough estimate of credit performance.
14) Message boards : Questions and problems : Ubuntu: how to kill a task owned by boinc? (Message 92466)
Posted 12 Aug 2019 by Profile JStateson
Post:
The task is (pardon the screen/text grab)
3376 jstateson	20	0 29696	-3808	3284 S	0.0	0.1	0:00.03 -bash
3407 boinc	30	10 78.8G	79014	36014 S	0.0 10.0	0:00.00 ../../projects/setiathome.berkeley.edu/setiathome_x41p_V0.98bl_x86_64-pc-linux-gnu_cuda90


so, the owner of task 3407 is "boinc" and I own 3376

using sudo kill -9 3407 has no effect when that task is "hung"

Can I log in as "boinc" to terminate it? Is there a password?
Maybe it is hung so badly it can't receive the terminate signal.
Is there another way to terminate it?

$ sudo reboot

If a sigkill is not working, it isn't a BOINC issue. Are you sure that the reaper (launchd PID 1) is able to run? It may take the O/S a while to get to it.


I think that is what happened - the O/S was unable to respond. This was a 4-core system with 10 GPUs, and as the motherboard had only 6 slots I had to use splitters. I was even getting periodic messages from nVidia that it had "lost" a GPU or two when reporting fan speeds. I am back to a single GPU for each slot and have no problems. It did work well for a long time with 8, but when I added the last pair of GTX 1060s and a second splitter all hell broke loose. It was difficult to determine which board had the problem as they had the same name "gtx1060", and I had to stop the fan with my finger and check the fan speed report to identify which board I was testing. On rare occasions I have seen the Windows 10 task manager unable to terminate a task, but usually I don't have to manually reboot Windows when this happens as it reboots itself within a second or two after a blue screen. This was the first time kill -9 did not work, which I did not expect.
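One thing worth checking next time before rebooting: if the process is stuck in an uninterruptible kernel/driver call it will show state D in ps, and even SIGKILL from root will not take effect until that call returns. A quick way to look (using the PID from the listing above as an example):

# STAT: S = sleeping, R = running, Z = zombie (already dead, waiting to be reaped),
#       D = uninterruptible sleep (stuck in the kernel/driver; kill -9 cannot interrupt it)
ps -o pid,user,stat,wchan:32,cmd -p 3407

The wchan column shows which kernel function it is waiting in, which at least hints at whether the GPU driver is the culprit.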
15) Message boards : Questions and problems : Ubuntu: how to kill a task owned by boinc? (Message 92456)
Posted 11 Aug 2019 by Profile JStateson
Post:
The task is (pardon the screen/text grab)
3376 jstateson	20	0 29696	-3808	3284 S	0.0	0.1	0:00.03 -bash
3407 boinc	30	10 78.8G	79014	36014 S	0.0 10.0	0:00.00 ../../projects/setiathome.berkeley.edu/setiathome_x41p_V0.98bl_x86_64-pc-linux-gnu_cuda90


so, the owner of task 3407 is "boinc" and I own 3376

using sudo kill -9 3407 has no effect when that task is "hung"

Can I log in as "boinc" to terminate it? Is there a password?
Maybe it is hung so badly it can't receive the terminate signal.
Is there another way to terminate it?
16) Message boards : Questions and problems : strange error: garbage_collect ?cannot collect? (Message 92454)
Posted 11 Aug 2019 by Profile JStateson
Post:
Trying to debug the problem as it is happening once or twice a day.

It would appear that memory is not a problem.

Looking here
if (rp->got_server_ack) {
            // see if - for some reason - there's an active task
            // for this result.  don't want to create dangling ptr.
            //
            ACTIVE_TASK* atp = active_tasks.lookup_result(rp);
            if (atp) {
                msg_printf(rp->project, MSG_INTERNAL_ERROR,
                    "garbage_collect(); still have active task for acked result %s; state %d",
                    rp->name, atp->task_state()
                );
            }
            // ...rest of got_server_ack handling...
        }


State 5 means finished OK, from what I understand. Looks like the Linux SETI app does not realize it finished.

On my boinc manager, under status I see the following typical behavior
....running....uploading....ready-to-report

(1) At what point is the status set to 5? Is it after the upload? After the "ready to report"?
I am guessing the error occurs because the 5 is generated just after "running" finishes, but the "uploading" does not take place for some reason. So it has got the server ack but is still marked as a running, or dangling, "active task".

(2) What exactly does "uploading" mean?

(3) What exactly does "reporting" mean?

Could there be a timing problem in the app when looking for the ack from the server? Who handles the ack: BOINC or the app?
Even if this is not a BOINC problem, I would like to know the answers to 1, 2 and 3 before going over to SETI and stirring the pot.

==============some other observations=============
kill and kill -9 do not kill the "dangling" task, even under sudo. I am not an expert, but kill -9 has always worked for me. I do see that "boinc" is the owner of the dangling task. Is that what is keeping me from being able to kill it? I would rather kill it than reboot. boinccmd --quit stops boinc but not that dangling task. A restart of the service fails: I see the task with command "boinc --detactgpu xx" (don't remember exactly), and the task disappears and reappears as the service keeps trying to start, but boinc never gets past that detectgpu. I end up rebooting the system and often have to power off and on, as it never totally shuts down.
17) Message boards : Questions and problems : strange error: garbage_collect ?cannot collect? (Message 92453)
Posted 10 Aug 2019 by Profile JStateson
Post:
Two of my GPUs on a 10-GPU mining rig are stuck: 0% utilization with the work unit showing 100% done.

error messages:

7209	SETI@home	8/10/2019 3:34:45 PM	[error] garbage_collect(); still have active task for acked result blc32_2bit_guppi_58643_76143_HIP73005_0101.26078.409.23.46.97.vlar_0; state 5	
10233	SETI@home	8/10/2019 4:20:49 PM	[error] garbage_collect(); still have active task for acked result blc33_2bit_guppi_58643_86349_HIP33332_0131.3725.0.23.46.188.vlar_0; state 5	


what's happening?

Googling, I found a previous report dated 2010 over at SETI.

[EDIT] Cannot even kill boinc. Tried sudo kill -9 8109 (boinc) and just kill 8109, and task 8109 never disappears from top or htop. The argument list shows boinc with command line --detectgpu, so it (7.16.1) seems stuck trying to detect the GPU and not bothering to accept the kill signal.

This was after using /etc/init.d/boinc-client stop to try to stop it.

Going to reboot.

[EDIT 2] Suspended everything, set NNT (no new tasks), and rebooted. The two "stuck" tasks were assigned GPUs 0 and 1 and finished in minutes. Resumed the rest of the tasks and everything looks back to normal.

Maybe I ran out of memory with only 8 GB and 10 GPUs.
18) Message boards : BOINC client : Seems 7.16.1 has been released but not shown on download page (Message 92436)
Posted 9 Aug 2019 by Profile JStateson
Post:
Effectively, Gianfranco's PPA is "the" bleeding-edge test repository for BOINC under Linux. By adding that PPA, you are giving prior consent to receiving unconfirmed code.

Having said that, Gianfranco does build from official release branch code. The v7.16.1 release code branch was forked (without announcement) 10 days ago, but as yet no formal testing processes have been initiated. I installed the same version myself this morning, and I've initiated some conversations with other members of the development team. That's as much as I'm prepared to say until I hear back from the other developers.


Just wanted to get something other than the 7.9 that comes from the 18.04 default install.

Noticed the following:

8/9/2019 8:21:09 AM	GUI RPC request from non-allowed address 2.0.206.175	

That IP address is on your side of the pond. This really concerns me, but I do not know how to follow up on it, and it is scary that every install of Linux BOINC shows an attempted probe. Maybe it is just a way to count how many deployments have been made, but I have seen probes from more than one location and always just after a fresh install.
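As I understand it, that message means the client refused the connection, so nothing actually got in. On the Ubuntu package the GUI RPC access controls live in /etc/boinc-client (paths assumed from the Debian/Ubuntu packaging); it is worth confirming they are sane, and the RPC port can also be firewalled off from outside:

# only hosts listed here (if the file exists) may connect remotely, and only with the password below
cat /etc/boinc-client/remote_hosts.cfg
sudo cat /etc/boinc-client/gui_rpc_auth.cfg
# or simply block the GUI RPC port (31416) from anything but localhost
sudo ufw deny 31416/tcp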

[EDIT]
Some thoughts on dependencies and how things can quickly change.

I keep a list of what I did so I can quickly install Ubuntu followed by either AMD or NVIDIA drivers (gave up on Intel).

Things change:
Wanted to convert an AMDGPU-PRO system to NVidia (Ubuntu 18.04), but amdgpu--uninstall does not exist, even though recent posts on askubuntu and stackoverflow still show that as the way to uninstall.

Anyway, I simply put in the NVIDIA card and it took over, which was nice.
However, my driver install for 390 that worked last week does not work anymore, and googling around I found driver-430.

Anyway, got it working
19) Message boards : BOINC client : Seems 7.16.1 has been released but not shown on download page (Message 92433)
Posted 9 Aug 2019 by Profile JStateson
Post:
sudo add-apt-repository ppa:costamagnagianfranco/boinc

followed by apt-get install boinc-client, which picked up 7.16.1.

I checked the BOINC download page and only 7.14 is shown.
There is no warning about this being developmental, so it must be the real McCoy.
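If there is ever any doubt about where a package came from, the installed and candidate versions and the repository supplying them can be checked; a quick sanity check, nothing more:

apt-cache policy boinc-client   # shows versions and whether the PPA or the Ubuntu archive is the source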

1			8/9/2019 7:52:43 AM	Starting BOINC client version 7.16.1 for x86_64-pc-linux-gnu	
2			8/9/2019 7:52:43 AM	log flags: file_xfer, sched_ops, task	
3			8/9/2019 7:52:43 AM	Libraries: libcurl/7.58.0 OpenSSL/1.1.1 zlib/1.2.11 libidn2/2.0.4 libpsl/0.19.1 (+libidn2/2.0.4) nghttp2/1.30.0 librtmp/2.3	
4			8/9/2019 7:52:43 AM	Data directory: /var/lib/boinc-client	
5			8/9/2019 7:52:44 AM	CUDA: NVIDIA GPU 0: GeForce GTX 1070 (driver version 430.40, CUDA version 10.1, compute capability 6.1, 4096MB, 3972MB available, 6852 GFLOPS peak)	
6			8/9/2019 7:52:44 AM	OpenCL: NVIDIA GPU 0: GeForce GTX 1070 (driver version 430.40, device version OpenCL 1.2 CUDA, 8118MB, 3972MB available, 6852 GFLOPS peak)	


Since GPUGrid has serious problems, I am switching those GPUs to the new fast SETI app that runs on Linux. The WOW! event starts in a week. Maybe I will not go back to GPUGrid ever.
20) Message boards : Questions and problems : Trouble running 2 RX 570 on Einstein@home (Message 92419)
Posted 7 Aug 2019 by Profile JStateson
Post:
I recall that AMD can enable CrossFire mode by default, unlike NVIDIA. If it is enabled, try disabling it.


