Posts by Keith Myers

21) Message boards : Questions and problems : Why am being forced into two consecutive 24 delays with no work returned yet (Message 91302)
Posted 30 Apr 2019 by Profile Keith Myers
Post:
Fingers crossed. It might be wise to mark those library files as

<executable/>

as well - they certainly contain binary code which is going to be executed.

I don't think they work that way. At least they weren't marked executable in the CUDA90 special app's app_info, which I managed to find lying around on a forgotten disk. I looked at how the CUDA90 libcufft and libcudart libraries were referenced and used that app_info as a pattern. They certainly worked well for that app.

Agreed, fingers crossed. I think it will work this time, once I can get work again.
22) Message boards : Questions and problems : Why am being forced into two consecutive 24 delays with no work returned yet (Message 91296)
Posted 30 Apr 2019 by Profile Keith Myers
Post:
This is the app_info I am going to attempt to use.

<app_info>

  <app>
    <name>einsteinbinary_BRP4</name>
  </app>
  <file_info>
    <name>einsteinbinary_cuda64</name>
    <executable/>
  </file_info>
  <file_info>
    <name>einsteinbinary_cuda-db.dev</name>
  </file_info>
  <file_info>
    <name>einsteinbinary_cuda-dbhs.dev</name>
  </file_info>
  <file_info>
    <name>libcufft.so.8.0</name>
  </file_info>
  <file_info>
    <name>libcudart.so.8.0</name>
  </file_info>
  <app_version>
    <app_name>einsteinbinary_BRP4</app_name>
    <version_num>999</version_num>
    <api_version>7.2.2</api_version>
    <coproc>
      <type>CUDA</type>
      <count>1.0</count>
    </coproc>
    <file_ref>
      <file_name>einsteinbinary_cuda64</file_name>
      <main_program/>
    </file_ref>
    <file_ref>
      <file_name>einsteinbinary_cuda-db.dev</file_name>
      <open_name>db.dev</open_name>
      <copy_file/>
    </file_ref>
    <file_ref>
      <file_name>einsteinbinary_cuda-dbhs.dev</file_name>
      <open_name>dbhs.dev</open_name>
      <copy_file/>
    </file_ref>
    <file_ref>
      <file_name>libcufft.so.8.0</file_name>
    </file_ref>
    <file_ref>
      <file_name>libcudart.so.8.0</file_name>
    </file_ref>
  </app_version>

</app_info>


23) Message boards : Questions and problems : Why am being forced into two consecutive 24 delays with no work returned yet (Message 91295)
Posted 30 Apr 2019 by Profile Keith Myers
Post:
Well, since this is Linux and not Windows, Dependency Walker is of no use. To find the dependencies of any executable application on Linux, all you need to do is run:
ldd <application name>
in a terminal.

keith@Nano:~/boinc/projects/einstein.phys.uwm.edu$ ldd einsteinbinary_cuda64
	linux-vdso.so.1 (0x0000007f86d1f000)
	libcufft.so.8.0 => /usr/local/cuda/lib64/libcufft.so.8.0 (0x0000007f7ede5000)
	libcuda.so.1 => /usr/lib/aarch64-linux-gnu/tegra/libcuda.so.1 (0x0000007f7de9f000)
	libcudart.so.8.0 => /usr/local/cuda/lib64/libcudart.so.8.0 (0x0000007f7de2e000)
	libpthread.so.0 => /lib/aarch64-linux-gnu/libpthread.so.0 (0x0000007f7de02000)
	libc.so.6 => /lib/aarch64-linux-gnu/libc.so.6 (0x0000007f7dca9000)
	/lib/ld-linux-aarch64.so.1 (0x0000007f86cf4000)
	libstdc++.so.6 => /usr/lib/aarch64-linux-gnu/libstdc++.so.6 (0x0000007f7db14000)
	libm.so.6 => /lib/aarch64-linux-gnu/libm.so.6 (0x0000007f7da5a000)
	libdl.so.2 => /lib/aarch64-linux-gnu/libdl.so.2 (0x0000007f7da45000)
	librt.so.1 => /lib/aarch64-linux-gnu/librt.so.1 (0x0000007f7da2e000)
	libgcc_s.so.1 => /lib/aarch64-linux-gnu/libgcc_s.so.1 (0x0000007f7da0a000)
	libnvrm_gpu.so => /usr/lib/aarch64-linux-gnu/tegra/libnvrm_gpu.so (0x0000007f7d9c7000)
	libnvrm.so => /usr/lib/aarch64-linux-gnu/tegra/libnvrm.so (0x0000007f7d985000)
	libnvrm_graphics.so => /usr/lib/aarch64-linux-gnu/tegra/libnvrm_graphics.so (0x0000007f7d966000)
	libnvidia-fatbinaryloader.so.32.1.0 => /usr/lib/aarch64-linux-gnu/tegra/libnvidia-fatbinaryloader.so.32.1.0 (0x0000007f7d908000)
	libnvos.so => /usr/lib/aarch64-linux-gnu/tegra/libnvos.so (0x0000007f7d8ea000)


These are the symbolic links for the CUDA libraries the app needs.

keith@Nano:/usr/local/cuda/lib64$ ls -l
lrwxrwxrwx 1 root root        21 Apr 26 19:15 libcudart.so.8.0 -> libcudart.so.10.0.166
lrwxrwxrwx 1 root root        20 Apr 26 19:15 libcufft.so.8.0 -> libcufft.so.10.0.166
24) Message boards : Questions and problems : Why am being forced into two consecutive 24 delays with no work returned yet (Message 91288)
Posted 30 Apr 2019 by Profile Keith Myers
Post:
The Windows and Linux processes are different. BOINC started by using Linux conventions, so you may be in luck, but I remember having to introduce David Anderson to

https://docs.microsoft.com/en-us/windows/desktop/dlls/dynamic-link-library-search-order#search-order-for-desktop-applications

when the Linux techniques failed under Windows. You may have to reverse the process.


I think the problem comes from having to use the repo version of BOINC, with its stranglehold on group and user ownership. I checked every time to make sure the executable had all its dependencies satisfied, and they were, in every location the application was loaded from. But when it came to actually running the client, it always failed to find the CUDA8 libraries.

I should have just compiled a new BOINC for the aarch64 platform on my own and placed it in /home, like I do with all my x86_64 hosts. That makes it so much easier to use BOINC and to edit and move files.

I finally had enough, stripped out the main BOINC files, moved them to a new boinc folder in /home, and removed all the init files and symbolic links scattered all over the system that referenced the old repo locations of things.

I can finally run BOINC from /home, which is normal for me. After I am finally able to process a task correctly and get rid of my low daily allowance, I will try once more to make soft (symbolic) links to the CUDA8 libraries. With BOINC in /home now, and /home owned by $USER, the symbolic links will probably work as expected.
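If the symlink route gets retried, this is a minimal sketch of the idea (library versions are from my install's ls -l output earlier in the thread; adjust to whatever your CUDA toolkit actually ships):

```shell
# Sketch: from inside the project directory (now under /home and owned
# by $USER), point the CUDA 8 sonames the app requests at the installed
# CUDA 10 libraries.
cd ~/boinc/projects/einstein.phys.uwm.edu
ln -sf /usr/local/cuda/lib64/libcufft.so.10.0.166  libcufft.so.8.0
ln -sf /usr/local/cuda/lib64/libcudart.so.10.0.166 libcudart.so.8.0
```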
25) Message boards : Questions and problems : Why am being forced into two consecutive 24 delays with no work returned yet (Message 91284)
Posted 30 Apr 2019 by Profile Keith Myers
Post:
I tried for the first two days to make soft links and to export PATH and LD_LIBRARY_PATH, with no success. That is why all my first tasks had the missing libcufft and libcudart library errors.
I finally decided to try just copying the CUDA10 system libraries over, under the names the application needed, and referencing them directly in app_info.
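Concretely, the copy-and-rename step amounted to something like this (a sketch; library versions and paths are from my own system, as shown in the ldd and ls -l output elsewhere in the thread):

```shell
# Sketch: copy the installed CUDA 10 libraries into the Einstein project
# directory under the CUDA 8 sonames the application was linked against,
# so they can be referenced directly as file_info entries in app_info.xml.
PROJECT_DIR="$HOME/boinc/projects/einstein.phys.uwm.edu"
cp /usr/local/cuda/lib64/libcufft.so.10.0.166  "$PROJECT_DIR/libcufft.so.8.0"
cp /usr/local/cuda/lib64/libcudart.so.10.0.166 "$PROJECT_DIR/libcudart.so.8.0"
```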

The system already comes with all the necessary libraries pre-installed. This is a developer's kit system image made for developing apps, so CUDA10, cuDNN, GCC, and the C and C++ toolchains are already there. I shouldn't have to install another version of CUDA.

Since I no longer had any app_info version that used CUDA file references, I just patterned the CUDA references after the references for the dev files in the original app_info. That included the file copies.

Thanks for explaining that that was the culprit; I didn't know or understand what it did. I have removed the file copies from app_info. Now I just have to wait out the penalty box again.
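For anyone hitting the same wall: `<copy_file/>` tells the client to physically copy the file into each task's slot directory, and a copied CUDA library is far bigger than these workunits' small disk bound. A library reference should be a bare file_ref, so the client links it into the slot directory instead of duplicating it (a sketch, matching the corrected app_info):

```xml
<!-- Library file_ref without <copy_file/>: the client links the file
     into the slot directory rather than copying it, so it no longer
     counts against the workunit's disk bound. -->
<file_ref>
  <file_name>libcufft.so.8.0</file_name>
</file_ref>
<file_ref>
  <file_name>libcudart.so.8.0</file_name>
</file_ref>
```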
26) Message boards : Questions and problems : Why am being forced into two consecutive 24 delays with no work returned yet (Message 91273)
Posted 30 Apr 2019 by Profile Keith Myers
Post:
Help, Richard! Maybe you can tell me why I am still unable to run any GPU tasks. I can start them, but they error out with "exceeded disk limit", even though I have now increased the Home venue disk usage twice. The amount needed didn't change.

keith@Nano:~/boinc$ ./boinc
29-Apr-2019 19:51:32 [---] Starting BOINC client version 7.9.3 for aarch64-unknown-linux-gnu
29-Apr-2019 19:51:32 [---] log flags: file_xfer, sched_ops, task, cpu_sched, sched_op_debug
29-Apr-2019 19:51:32 [---] Libraries: libcurl/7.58.0 OpenSSL/1.1.0g zlib/1.2.11 libidn2/2.0.4 libpsl/0.19.1 (+libidn2/2.0.4) nghttp2/1.30.0 librtmp/2.3
29-Apr-2019 19:51:32 [---] Data directory: /home/keith/boinc
29-Apr-2019 19:51:32 [---] CUDA: NVIDIA GPU 0: NVIDIA Tegra X1 (driver version unknown, CUDA version 10.0, compute capability 5.3, 3957MB, 2179MB available, 236 GFLOPS peak)
29-Apr-2019 19:51:32 [Einstein@Home] Found app_info.xml; using anonymous platform
29-Apr-2019 19:51:33 [---] [libc detection] gathered: 2.27, Ubuntu GLIBC 2.27-3ubuntu1
29-Apr-2019 19:51:33 [---] Host name: Nano
29-Apr-2019 19:51:33 [---] Processor: 4 ARM ARMv8 Processor rev 1 (v8l) [Impl 0x41 Arch 8 Variant 0x1 Part 0xd07 Rev 1]
29-Apr-2019 19:51:33 [---] Processor features: fp asimd evtstrm aes pmull sha1 sha2 crc32
29-Apr-2019 19:51:33 [---] OS: Linux Ubuntu: Ubuntu 18.04.2 LTS [4.9.140-tegra|libc 2.27 (Ubuntu GLIBC 2.27-3ubuntu1)]
29-Apr-2019 19:51:33 [---] Memory: 3.86 GB physical, 0 bytes virtual
29-Apr-2019 19:51:33 [---] Disk: 29.21 GB total, 17.65 GB free
29-Apr-2019 19:51:33 [---] Local time is UTC -7 hours
29-Apr-2019 19:51:33 [---] Config: GUI RPC allowed from any host
29-Apr-2019 19:51:33 [---] Config: GUI RPCs allowed from:
29-Apr-2019 19:51:33 [---] 192.168.2.34
29-Apr-2019 19:51:33 [---] Config: report completed tasks immediately
29-Apr-2019 19:51:33 [Einstein@Home] URL http://einstein.phys.uwm.edu/; Computer ID 12775352; resource share 25
29-Apr-2019 19:51:33 [SETI@home] URL http://setiathome.berkeley.edu/; Computer ID 8707387; resource share 75
29-Apr-2019 19:51:33 [Einstein@Home] General prefs: from Einstein@Home (last modified ---)
29-Apr-2019 19:51:33 [Einstein@Home] Computer location: home
29-Apr-2019 19:51:33 [---] General prefs: using separate prefs for home
29-Apr-2019 19:51:33 [---] Reading preferences override file
29-Apr-2019 19:51:33 [---] Preferences:
29-Apr-2019 19:51:33 [---] max memory usage when active: 1978.28 MB
29-Apr-2019 19:51:33 [---] max memory usage when idle: 3560.91 MB
29-Apr-2019 19:51:33 [---] max disk usage: 10.00 GB
29-Apr-2019 19:51:33 [---] max CPUs used: 2
29-Apr-2019 19:51:33 [---] suspend work if non-BOINC CPU load exceeds 25%
29-Apr-2019 19:51:33 [---] (to change preferences, visit a project web site or select Preferences in the Manager)
29-Apr-2019 19:51:33 [---] Setting up project and slot directories
29-Apr-2019 19:51:33 [---] Checking active tasks
29-Apr-2019 19:51:33 [---] Setting up GUI RPC socket
29-Apr-2019 19:51:33 [---] gui_rpc_auth.cfg is empty - no GUI RPC password protection
29-Apr-2019 19:51:33 [---] Checking presence of 63 project files
29-Apr-2019 19:51:33 Initialization completed
29-Apr-2019 19:51:33 [Einstein@Home] [sched_op] Starting scheduler request
29-Apr-2019 19:51:33 [Einstein@Home] Sending scheduler request: To report completed tasks.
29-Apr-2019 19:51:33 [Einstein@Home] Reporting 5 completed tasks
29-Apr-2019 19:51:33 [Einstein@Home] Requesting new tasks for NVIDIA GPU
29-Apr-2019 19:51:33 [Einstein@Home] [sched_op] CPU work request: 0.00 seconds; 0.00 devices
29-Apr-2019 19:51:33 [Einstein@Home] [sched_op] NVIDIA GPU work request: 9504.00 seconds; 1.00 devices
29-Apr-2019 19:51:37 [Einstein@Home] Scheduler request completed: got 6 new tasks
29-Apr-2019 19:51:37 [Einstein@Home] [sched_op] Server version 611
29-Apr-2019 19:51:37 [Einstein@Home] Project requested delay of 60 seconds
29-Apr-2019 19:51:37 [Einstein@Home] New computer location: home
29-Apr-2019 19:51:37 [Einstein@Home] General prefs: from Einstein@Home (last modified ---)
29-Apr-2019 19:51:37 [Einstein@Home] Computer location: home
29-Apr-2019 19:51:37 [---] General prefs: using separate prefs for home
29-Apr-2019 19:51:37 [---] Reading preferences override file
29-Apr-2019 19:51:37 [---] Preferences:
29-Apr-2019 19:51:37 [---] max memory usage when active: 1978.28 MB
29-Apr-2019 19:51:37 [---] max memory usage when idle: 3560.91 MB
29-Apr-2019 19:51:37 [---] max disk usage: 10.00 GB
29-Apr-2019 19:51:37 [---] Number of usable CPUs has changed from 2 to 3.
29-Apr-2019 19:51:37 [---] max CPUs used: 3
29-Apr-2019 19:51:37 [---] suspend work if non-BOINC CPU load exceeds 25%
29-Apr-2019 19:51:37 [---] (to change preferences, visit a project web site or select Preferences in the Manager)
29-Apr-2019 19:51:37 [Einstein@Home] [sched_op] estimated total CPU task duration: 0 seconds
29-Apr-2019 19:51:37 [Einstein@Home] [sched_op] estimated total NVIDIA GPU task duration: 7473 seconds
29-Apr-2019 19:51:37 [Einstein@Home] [sched_op] handle_scheduler_reply(): got ack for task p2030.20170414.G44.61-02.33.N.b6s0g0.00000_1388_0
29-Apr-2019 19:51:37 [Einstein@Home] [sched_op] handle_scheduler_reply(): got ack for task p2030.20170414.G44.61-02.33.N.b6s0g0.00000_1389_0
29-Apr-2019 19:51:37 [Einstein@Home] [sched_op] handle_scheduler_reply(): got ack for task p2030.20170414.G44.61-02.33.N.b6s0g0.00000_1387_1
29-Apr-2019 19:51:37 [Einstein@Home] [sched_op] handle_scheduler_reply(): got ack for task p2030.20170414.G44.61-02.33.N.b6s0g0.00000_1391_0
29-Apr-2019 19:51:37 [Einstein@Home] [sched_op] handle_scheduler_reply(): got ack for task p2030.20170414.G44.61-02.33.N.b6s0g0.00000_1390_0
29-Apr-2019 19:51:37 [Einstein@Home] [sched_op] Deferring communication for 00:01:00
29-Apr-2019 19:51:37 [Einstein@Home] [sched_op] Reason: requested by project
29-Apr-2019 19:51:39 [Einstein@Home] Started download of p2030.20170414.G44.77-01.19.C.b0s0g0.00000_1646.bin4
29-Apr-2019 19:51:39 [Einstein@Home] Started download of p2030.20170414.G44.77-01.19.C.b0s0g0.00000.zap
29-Apr-2019 19:51:42 [Einstein@Home] Finished download of p2030.20170414.G44.77-01.19.C.b0s0g0.00000.zap
29-Apr-2019 19:51:42 [Einstein@Home] Started download of p2030.20170414.G44.77-01.19.C.b0s0g0.00000_1647.bin4
29-Apr-2019 19:51:44 [Einstein@Home] Finished download of p2030.20170414.G44.77-01.19.C.b0s0g0.00000_1646.bin4
29-Apr-2019 19:51:44 [Einstein@Home] Started download of p2030.20170414.G44.61-02.33.N.b5s0g0.00000_99.bin4
29-Apr-2019 19:51:44 [Einstein@Home] Starting task p2030.20170414.G44.77-01.19.C.b0s0g0.00000_1646_1
29-Apr-2019 19:51:44 [Einstein@Home] [cpu_sched] Starting task p2030.20170414.G44.77-01.19.C.b0s0g0.00000_1646_1 using einsteinbinary_BRP4 version 999 in slot 3
29-Apr-2019 19:51:46 [Einstein@Home] Aborting task p2030.20170414.G44.77-01.19.C.b0s0g0.00000_1646_1: exceeded disk limit: 127.11MB > 19.07MB
29-Apr-2019 19:51:46 [Einstein@Home] [sched_op] Deferring communication for 00:01:28
29-Apr-2019 19:51:46 [Einstein@Home] [sched_op] Reason: Unrecoverable error for task p2030.20170414.G44.77-01.19.C.b0s0g0.00000_1646_1
29-Apr-2019 19:51:47 [Einstein@Home] Finished download of p2030.20170414.G44.61-02.33.N.b5s0g0.00000_99.bin4
29-Apr-2019 19:51:47 [Einstein@Home] Started download of p2030.20170414.G44.61-02.33.N.b5s0g0.00000.zap
29-Apr-2019 19:51:47 [Einstein@Home] Computation for task p2030.20170414.G44.77-01.19.C.b0s0g0.00000_1646_1 finished
29-Apr-2019 19:51:47 [Einstein@Home] Output file p2030.20170414.G44.77-01.19.C.b0s0g0.00000_1646_1_0 for task p2030.20170414.G44.77-01.19.C.b0s0g0.00000_1646_1 absent
29-Apr-2019 19:51:48 [Einstein@Home] Finished download of p2030.20170414.G44.61-02.33.N.b5s0g0.00000.zap
29-Apr-2019 19:51:48 [Einstein@Home] Started download of p2030.20170414.G44.61-02.33.N.b5s0g0.00000_100.bin4
29-Apr-2019 19:51:48 [Einstein@Home] Starting task p2030.20170414.G44.61-02.33.N.b5s0g0.00000_99_0
29-Apr-2019 19:51:48 [Einstein@Home] [cpu_sched] Starting task p2030.20170414.G44.61-02.33.N.b5s0g0.00000_99_0 using einsteinbinary_BRP4 version 999 in slot 3
29-Apr-2019 19:51:50 [Einstein@Home] Aborting task p2030.20170414.G44.61-02.33.N.b5s0g0.00000_99_0: exceeded disk limit: 127.11MB > 19.07MB
29-Apr-2019 19:51:50 [Einstein@Home] [sched_op] Deferring communication for 00:03:55
29-Apr-2019 19:51:50 [Einstein@Home] [sched_op] Reason: Unrecoverable error for task p2030.20170414.G44.61-02.33.N.b5s0g0.00000_99_0
29-Apr-2019 19:51:51 [Einstein@Home] Finished download of p2030.20170414.G44.77-01.19.C.b0s0g0.00000_1647.bin4
29-Apr-2019 19:51:51 [Einstein@Home] Started download of p2030.20170414.G44.61-02.33.N.b5s0g0.00000_101.bin4
29-Apr-2019 19:51:51 [Einstein@Home] Computation for task p2030.20170414.G44.61-02.33.N.b5s0g0.00000_99_0 finished
29-Apr-2019 19:51:51 [Einstein@Home] Output file p2030.20170414.G44.61-02.33.N.b5s0g0.00000_99_0_0 for task p2030.20170414.G44.61-02.33.N.b5s0g0.00000_99_0 absent
29-Apr-2019 19:51:51 [Einstein@Home] Starting task p2030.20170414.G44.77-01.19.C.b0s0g0.00000_1647_0
29-Apr-2019 19:51:51 [Einstein@Home] [cpu_sched] Starting task p2030.20170414.G44.77-01.19.C.b0s0g0.00000_1647_0 using einsteinbinary_BRP4 version 999 in slot 3
29-Apr-2019 19:51:52 [Einstein@Home] Aborting task p2030.20170414.G44.77-01.19.C.b0s0g0.00000_1647_0: exceeded disk limit: 127.11MB > 19.07MB
29-Apr-2019 19:51:52 [Einstein@Home] [sched_op] Deferring communication for 00:07:33
29-Apr-2019 19:51:52 [Einstein@Home] [sched_op] Reason: Unrecoverable error for task p2030.20170414.G44.77-01.19.C.b0s0g0.00000_1647_0
29-Apr-2019 19:51:52 [Einstein@Home] Finished download of p2030.20170414.G44.61-02.33.N.b5s0g0.00000_100.bin4
29-Apr-2019 19:51:52 [Einstein@Home] Started download of p2030.20170414.G44.61-02.33.N.b5s0g0.00000_102.bin4
29-Apr-2019 19:51:52 [Einstein@Home] Starting task p2030.20170414.G44.61-02.33.N.b5s0g0.00000_100_0
29-Apr-2019 19:51:52 [Einstein@Home] [cpu_sched] Starting task p2030.20170414.G44.61-02.33.N.b5s0g0.00000_100_0 using einsteinbinary_BRP4 version 999 in slot 4
29-Apr-2019 19:52:01 [Einstein@Home] Aborting task p2030.20170414.G44.61-02.33.N.b5s0g0.00000_100_0: exceeded disk limit: 127.11MB > 19.07MB
29-Apr-2019 19:52:01 [Einstein@Home] [sched_op] Deferring communication for 00:13:53
29-Apr-2019 19:52:01 [Einstein@Home] [sched_op] Reason: Unrecoverable error for task p2030.20170414.G44.61-02.33.N.b5s0g0.00000_100_0
29-Apr-2019 19:52:01 [Einstein@Home] Computation for task p2030.20170414.G44.77-01.19.C.b0s0g0.00000_1647_0 finished
29-Apr-2019 19:52:01 [Einstein@Home] Output file p2030.20170414.G44.77-01.19.C.b0s0g0.00000_1647_0_0 for task p2030.20170414.G44.77-01.19.C.b0s0g0.00000_1647_0 absent
29-Apr-2019 19:52:13 [Einstein@Home] Computation for task p2030.20170414.G44.61-02.33.N.b5s0g0.00000_100_0 finished
29-Apr-2019 19:52:13 [Einstein@Home] Output file p2030.20170414.G44.61-02.33.N.b5s0g0.00000_100_0_0 for task p2030.20170414.G44.61-02.33.N.b5s0g0.00000_100_0 absent
29-Apr-2019 19:52:14 [Einstein@Home] Finished download of p2030.20170414.G44.61-02.33.N.b5s0g0.00000_101.bin4
29-Apr-2019 19:52:14 [Einstein@Home] Finished download of p2030.20170414.G44.61-02.33.N.b5s0g0.00000_102.bin4
29-Apr-2019 19:52:14 [Einstein@Home] Starting task p2030.20170414.G44.61-02.33.N.b5s0g0.00000_101_0
29-Apr-2019 19:52:14 [Einstein@Home] [cpu_sched] Starting task p2030.20170414.G44.61-02.33.N.b5s0g0.00000_101_0 using einsteinbinary_BRP4 version 999 in slot 3
29-Apr-2019 19:52:15 [Einstein@Home] Aborting task p2030.20170414.G44.61-02.33.N.b5s0g0.00000_101_0: exceeded disk limit: 127.11MB > 19.07MB
29-Apr-2019 19:52:15 [Einstein@Home] [sched_op] Deferring communication for 00:19:53
29-Apr-2019 19:52:15 [Einstein@Home] [sched_op] Reason: Unrecoverable error for task p2030.20170414.G44.61-02.33.N.b5s0g0.00000_101_0
29-Apr-2019 19:52:16 [Einstein@Home] Computation for task p2030.20170414.G44.61-02.33.N.b5s0g0.00000_101_0 finished
29-Apr-2019 19:52:16 [Einstein@Home] Output file p2030.20170414.G44.61-02.33.N.b5s0g0.00000_101_0_0 for task p2030.20170414.G44.61-02.33.N.b5s0g0.00000_101_0 absent
29-Apr-2019 19:52:16 [Einstein@Home] Starting task p2030.20170414.G44.61-02.33.N.b5s0g0.00000_102_0
29-Apr-2019 19:52:16 [Einstein@Home] [cpu_sched] Starting task p2030.20170414.G44.61-02.33.N.b5s0g0.00000_102_0 using einsteinbinary_BRP4 version 999 in slot 3
29-Apr-2019 19:52:17 [Einstein@Home] Aborting task p2030.20170414.G44.61-02.33.N.b5s0g0.00000_102_0: exceeded disk limit: 127.11MB > 19.07MB
29-Apr-2019 19:52:17 [Einstein@Home] [sched_op] Deferring communication for 00:48:42
29-Apr-2019 19:52:17 [Einstein@Home] [sched_op] Reason: Unrecoverable error for task p2030.20170414.G44.61-02.33.N.b5s0g0.00000_102_0
29-Apr-2019 19:52:19 [Einstein@Home] Computation for task p2030.20170414.G44.61-02.33.N.b5s0g0.00000_102_0 finished
29-Apr-2019 19:52:19 [Einstein@Home] Output file p2030.20170414.G44.61-02.33.N.b5s0g0.00000_102_0_0 for task p2030.20170414.G44.61-02.33.N.b5s0g0.00000_102_0 absent
^C29-Apr-2019 19:52:44 [---] Received signal 2
29-Apr-2019 19:52:45 [---] Exiting
keith@Nano:~/boinc$
27) Message boards : Questions and problems : Why am being forced into two consecutive 24 delays with no work returned yet (Message 91269)
Posted 29 Apr 2019 by Profile Keith Myers
Post:
And that deserves a big thank you. I had forgotten to do that. I am updating the app_info right now to add those file references to the existing ones. At least I did this before the next attempt at running the app, when I can get new work again.
28) Message boards : Questions and problems : Why am being forced into two consecutive 24 delays with no work returned yet (Message 91265)
Posted 29 Apr 2019 by Profile Keith Myers
Post:
I think I fixed the issue of the missing libraries a couple of days ago by using the trick of putting the libcufft and libcudart libraries directly in the project directory, like we did with the CUDA80 Seti MB app.

None of the normal and supposedly proper ways to link the system's stock libcudart and libcufft libraries to those required by the app worked. Neither did exporting the LDLibrary path.

But without any work to test with, I still don't know whether putting the files directly in the same directory as the application will work. The app is working for the developer; I just haven't figured out why it won't work on my system yet.

I didn't realize that the total of failed work units was what was applied against the daily quota limit.

Question: what could I possibly have done if I had downloaded 500 tasks the first time and instantly errored them all out? Would I have had to wait 45 days before getting work from the project again? I only have my normal 0.5 days of work cache for the host, same as all my hosts.

[Edit] I just reduced the host's venue to 0.1 days of cache. Hopefully that retrieves only 1 or 2 tasks in case they fail again.
29) Message boards : Questions and problems : Why am being forced into two consecutive 24 delays with no work returned yet (Message 91261)
Posted 29 Apr 2019 by Profile Keith Myers
Post:
Why am I being forced into two consecutive 24-hour delays with no work returned yet? I am trying to deploy a new app, and after returning all the sent work as errors, I was forced into a 24-hour delay.

However, after the first 24-hour delay expired and I requested work again, I was forced into another 24-hour delay. WHY?

I just got the same "reached daily quota of 11 tasks" message.

This is the host: https://einsteinathome.org/host/12775352
30) Message boards : BOINC client : 7.14.2 and 7.12.1 both fail to get work units on very fast systems (Message 91260)
Posted 29 Apr 2019 by Profile Keith Myers
Post:
But if there is a deficiency in the server or client code, the issue should be passed on to the developers at https://github.com/BOINC/boinc

As Richard pointed out, the feeder is likely not configured correctly for how many tasks are being requested by so many fast hosts.
31) Message boards : BOINC client : 7.14.2 and 7.12.1 both fail to get work units on very fast systems (Message 91258)
Posted 29 Apr 2019 by Profile Keith Myers
Post:
I hope this information gets passed on to Eric or Tom. Since Jake is leaving the project shortly, he won't be our project contact anymore.
32) Message boards : BOINC client : 7.14.2 and 7.12.1 both fail to get work units on very fast systems (Message 91245)
Posted 28 Apr 2019 by Profile Keith Myers
Post:
No need. Look at that event log again, without the clutter:

15402 Milkyway@Home 4/28/2019 10:12:30 AM Scheduler request completed: got 0 new tasks
15404 Milkyway@Home 4/28/2019 10:12:30 AM Project requested delay of 91 seconds
15416 Milkyway@Home 4/28/2019 10:12:30 AM [work_fetch] backing off AMD/ATI GPU 723 sec


Yes, I missed that in all the clutter. I agree with the observation that the project is rarely ever out of work; only when it is doing rare maintenance or something has broken. Now that the task limit per GPU has been increased from the historical 80 to 300, I would need many hours to work through my 0.5-day cache with only the MW project running on my hosts. But I have Nvidia, not ATI/AMD, cards.

So why does the client get assigned no work on the request when in fact the server DOES have work? Could this happen if the RTS buffer size is set too low at MW, and too many people hit the buffer just before Beemer Biker's request, exhausting the available work to 0?

Requesting new tasks for AMD/ATI GPU
15821 Milkyway@Home 4/28/2019 10:14:04 AM [sched_op] CPU work request: 0.00 seconds; 0.00 devices
15822 Milkyway@Home 4/28/2019 10:14:04 AM [sched_op] AMD/ATI GPU work request: 120960.00 seconds; 4.00 devices
15823 Milkyway@Home 4/28/2019 10:14:06 AM Scheduler request completed: got 0 new tasks

[Edit] I was incorrect about my number of allowed tasks. This is from Jake Weiss' post in the project News:

Hey guys,

So the current set up allows for users to have up to 200 workunits per GPU on their computer and another 40 workunits per CPU with a maximum of 600 possible workunits.

On the server, we try to store a cache of 10,000 workunits. Sometimes when a lot of people request work all at the same time, this cache will run low.

So all of the numbers I have listed are tunable. What would you guys recommend for changes to these numbers?

Jake
33) Message boards : BOINC client : 7.14.2 and 7.12.1 both fail to get work units on very fast systems (Message 91239)
Posted 28 Apr 2019 by Profile Keith Myers
Post:
Richard is the expert at deciphering the work-fetch debug output. Everything appears normal for the intervals, and the work requests look normal. What I don't understand is why you are getting backoffs of 10 and 5 minutes directly after the scheduler acknowledges receipt of reported work. That is coming from the scheduler, not from your host or client. Normally the scheduler backs off when there are issues contacting the servers, or when the client has issues downloading work and can't acknowledge correct reception of the sent tasks. Have you looked at the Transfers tab in the Manager after requesting work, to see whether you have task downloads in backoff?
34) Message boards : GPUs : Not getting any work for my Anonymous platform at Einstein@home (Message 91238)
Posted 28 Apr 2019 by Profile Keith Myers
Post:
Well, I finally managed to get work. It turns out BOINC only allows one BETA application per project. I already had the BETA CPU app for the RPi3+ running in the Home venue. Once I moved the Nano to the School venue and turned off CPU work, I was finally able to get some GPU work for the beta application.

Still debugging the Nano platform. I errored out all of my daily allotment of tasks when the libraries weren't found. Now I have to wait out the 24-hour penalty box before trying again.
35) Message boards : BOINC client : 7.14.2 and 7.12.1 both fail to get work units on very fast systems (Message 91231)
Posted 28 Apr 2019 by Profile Keith Myers
Post:
Maybe you can ask Richard Haselgrove where to find the #3076 AppVeyor artifact link for the Windows client so you can download it and try it.

I thought using the older 7.12.1 client would have worked for you.
36) Message boards : GPUs : Not getting any work for my Anonymous platform at Einstein@home (Message 91229)
Posted 27 Apr 2019 by Profile Keith Myers
Post:
I hope someone can help. I am trying to get work for the BRP4 application at Einstein@home. I'm not getting anything other than:
Nano

131 Einstein@Home 4/27/2019 1:57:56 PM Sending scheduler request: To fetch work.
132 Einstein@Home 4/27/2019 1:57:56 PM Requesting new tasks for NVIDIA GPU
133 Einstein@Home 4/27/2019 1:58:00 PM Scheduler request completed: got 0 new tasks
134 Einstein@Home 4/27/2019 1:58:00 PM No work sent
135 Einstein@Home 4/27/2019 1:58:00 PM Your app_info.xml file doesn't have a version of Gamma-ray pulsar binary search #1 on GPUs.
136 Einstein@Home 4/27/2019 1:58:00 PM No work available for the applications you have selected. Please check your preferences on the web site.

I don't have the Gamma-ray pulsar binary search #1 on GPUs application selected. I've been told that this is a flaw with Einstein, that the message is sent to everyone, and not to worry about it. The problem is that no work is available for the application I've selected. The BRP4 application is selected at the website and saved for the Home location, and the client has been updated multiple times as well as completely restarted.

This is my app_info.xml

<app_info>

  <app>
    <name>einsteinbinary_BRP4</name>
  </app>
  <file_info>
    <name>einsteinbinary_cuda64</name>
    <executable/>
  </file_info>
  <file_info>
    <name>einsteinbinary_cuda-db.dev</name>
  </file_info>
  <file_info>
    <name>einsteinbinary_cuda-dbhs.dev</name>
  </file_info>
  <app_version>
    <app_name>einsteinbinary_BRP4</app_name>
    <version_num>999</version_num>
    <api_version>7.2.2</api_version>
    <coproc>
      <type>CUDA</type>
      <count>1.0</count>
    </coproc>
    <file_ref>
      <file_name>einsteinbinary_cuda64</file_name>
      <main_program/>
    </file_ref>
    <file_ref>
      <file_name>einsteinbinary_cuda-db.dev</file_name>
      <open_name>db.dev</open_name>
      <copy_file/>
    </file_ref>
    <file_ref>
      <file_name>einsteinbinary_cuda-dbhs.dev</file_name>
      <open_name>dbhs.dev</open_name>
      <copy_file/>
    </file_ref>
  </app_version>

</app_info>

This is my app_version section in client_state.xml for the application

<app_version>
  <app_name>einsteinbinary_BRP4</app_name>
  <version_num>999</version_num>
  <platform>aarch64-unknown-linux-gnu</platform>
  <avg_ncpus>1.000000</avg_ncpus>
  <flops>14732440740.064838</flops>
  <api_version>7.2.2</api_version>
  <file_ref>
    <file_name>einsteinbinary_cuda64</file_name>
    <main_program/>
  </file_ref>
  <file_ref>
    <file_name>einsteinbinary_cuda-db.dev</file_name>
    <open_name>db.dev</open_name>
    <copy_file/>
  </file_ref>
  <file_ref>
    <file_name>einsteinbinary_cuda-dbhs.dev</file_name>
    <open_name>dbhs.dev</open_name>
    <copy_file/>
  </file_ref>
  <coproc>
    <type>NVIDIA</type>
    <count>1.000000</count>
  </coproc>
  <dont_throttle/>
</app_version>

There are currently 10K BRP4 tasks available in the RTS buffer for that application. GAURAV KHANNA, the developer of the app I am using, is getting work for his aarch64-unknown-linux-gnu Tegra hosts and is processing the BRP4 tasks fine.

This is my Event Log startup:
1 4/27/2019 11:54:58 AM Starting BOINC client version 7.9.3 for aarch64-unknown-linux-gnu
2 4/27/2019 11:54:58 AM log flags: file_xfer, sched_ops, task
3 4/27/2019 11:54:58 AM Libraries: libcurl/7.58.0 OpenSSL/1.1.0g zlib/1.2.11 libidn2/2.0.4 libpsl/0.19.1 (+libidn2/2.0.4) nghttp2/1.30.0 librtmp/2.3
4 4/27/2019 11:54:58 AM Data directory: /var/lib/boinc-client
5 4/27/2019 11:54:59 AM CUDA: NVIDIA GPU 0: NVIDIA Tegra X1 (driver version unknown, CUDA version 10.0, compute capability 5.3, 3957MB, 3315MB available, 236 GFLOPS peak)
6 Einstein@Home 4/27/2019 11:54:59 AM Found app_info.xml; using anonymous platform
7 4/27/2019 11:55:00 AM [libc detection] gathered: 2.27, Ubuntu GLIBC 2.27-3ubuntu1
8 4/27/2019 11:55:00 AM Host name: Nano
9 4/27/2019 11:55:00 AM Processor: 4 ARM ARMv8 Processor rev 1 (v8l) [Impl 0x41 Arch 8 Variant 0x1 Part 0xd07 Rev 1]
10 4/27/2019 11:55:00 AM Processor features: fp asimd evtstrm aes pmull sha1 sha2 crc32
11 4/27/2019 11:55:00 AM OS: Linux Ubuntu: Ubuntu 18.04.2 LTS [4.9.140-tegra|libc 2.27 (Ubuntu GLIBC 2.27-3ubuntu1)]
12 4/27/2019 11:55:00 AM Memory: 3.86 GB physical, 0 bytes virtual
13 4/27/2019 11:55:00 AM Disk: 29.21 GB total, 17.33 GB free
14 4/27/2019 11:55:00 AM Local time is UTC -7 hours
15 4/27/2019 11:55:00 AM Config: GUI RPC allowed from any host
16 4/27/2019 11:55:00 AM Config: GUI RPCs allowed from:
17 4/27/2019 11:55:00 AM 192.168.2.34
18 4/27/2019 11:55:00 AM Config: report completed tasks immediately
19 Einstein@Home 4/27/2019 11:55:00 AM URL http://einstein.phys.uwm.edu/; Computer ID 12775352; resource share 25
20 SETI@home 4/27/2019 11:55:00 AM URL http://setiathome.berkeley.edu/; Computer ID 8707387; resource share 75
21 SETI@home 4/27/2019 11:55:00 AM General prefs: from SETI@home (last modified 24-Apr-2019 09:04:51)
22 SETI@home 4/27/2019 11:55:00 AM Computer location: home
23 4/27/2019 11:55:00 AM General prefs: using separate prefs for home
24 4/27/2019 11:55:00 AM Reading preferences override file
25 4/27/2019 11:55:00 AM Preferences:
26 4/27/2019 11:55:00 AM max memory usage when active: 1978.28 MB
27 4/27/2019 11:55:00 AM max memory usage when idle: 3560.91 MB
28 4/27/2019 11:55:00 AM max disk usage: 5.00 GB
29 4/27/2019 11:55:00 AM max CPUs used: 3
30 4/27/2019 11:55:00 AM don't use GPU while active
31 4/27/2019 11:55:00 AM suspend work if non-BOINC CPU load exceeds 25%
32 4/27/2019 11:55:00 AM (to change preferences, visit a project web site or select Preferences in the Manager)
33 4/27/2019 11:55:00 AM Setting up project and slot directories
34 4/27/2019 11:55:00 AM Checking active tasks
35 4/27/2019 11:55:00 AM Setting up GUI RPC socket
36 4/27/2019 11:55:00 AM gui_rpc_auth.cfg is empty - no GUI RPC password protection
37 4/27/2019 11:55:00 AM Checking presence of 49 project files
38 4/27/2019 11:55:00 AM Suspending computation - user request
39 Einstein@Home 4/27/2019 11:56:00 AM project resumed by user
40 SETI@home 4/27/2019 11:56:03 AM project resumed by user
41 Einstein@Home 4/27/2019 11:56:16 AM update requested by user
42 Einstein@Home 4/27/2019 11:56:21 AM Sending scheduler request: Requested by user.
43 Einstein@Home 4/27/2019 11:56:21 AM Requesting new tasks for NVIDIA GPU
44 Einstein@Home 4/27/2019 11:56:24 AM Scheduler request completed: got 0 new tasks
45 Einstein@Home 4/27/2019 11:56:24 AM No work sent
46 Einstein@Home 4/27/2019 11:56:24 AM Your app_info.xml file doesn't have a version of Gamma-ray pulsar binary search #1 on GPUs.
47 Einstein@Home 4/27/2019 11:56:24 AM No work available for the applications you have selected. Please check your preferences on the web site.


Can anyone point out why I can't get work? Thanks in advance.

[Edit] This is the scheduler response for last contact:

2019-04-27 21:35:21.5051 [PID=9189] Request: [USER#xxxxx] [HOST#12775352] [IP xxx.xxx.xxx.154] client 7.9.3
2019-04-27 21:35:21.5296 [PID=9189 ] [debug] have_master:1 have_working: 1 have_db: 1
2019-04-27 21:35:21.5296 [PID=9189 ] [debug] using working prefs
2019-04-27 21:35:21.5296 [PID=9189 ] [debug] have db 1; dbmod 1556121891.000000; global mod 1556121891.000000
2019-04-27 21:35:21.5296 [PID=9189 ] [send] effective_ncpus 3 max_jobs_on_host_cpu 999999 max_jobs_on_host 999999
2019-04-27 21:35:21.5296 [PID=9189 ] [send] effective_ngpus 1 max_jobs_on_host_gpu 999999
2019-04-27 21:35:21.5296 [PID=9189 ] [send] Not using matchmaker scheduling; Not using EDF sim
2019-04-27 21:35:21.5296 [PID=9189 ] [send] CPU: req 0.00 sec, 0.00 instances; est delay 0.00
2019-04-27 21:35:21.5296 [PID=9189 ] [send] CUDA: req 51840.00 sec, 1.00 instances; est delay 0.00
2019-04-27 21:35:21.5296 [PID=9189 ] [send] work_req_seconds: 51840.00 secs
2019-04-27 21:35:21.5296 [PID=9189 ] [send] available disk 4.98 GB, work_buf_min 43200
2019-04-27 21:35:21.5296 [PID=9189 ] [send] active_frac 0.997416 on_frac 0.983078 DCF 1.000000
2019-04-27 21:35:21.5296 [PID=9189 ] Anonymous platform app versions:
2019-04-27 21:35:21.5297 [PID=9189 ] app: einsteinbinary_BRP4 ver: 999
2019-04-27 21:35:21.5305 [PID=9189 ] [mixed] sending locality work first (0.4812)
2019-04-27 21:35:21.5310 [PID=9189 ] [mixed] sending non-locality work second
2019-04-27 21:35:21.5525 [PID=9189 ] [send] Didn't find anonymous platform app for hsgamma_FGRPB1G
2019-04-27 21:35:21.5599 [PID=9189 ] [debug] [HOST#12775352] MSG(high) No work sent
2019-04-27 21:35:21.5600 [PID=9189 ] [debug] [HOST#12775352] MSG(high) Your app_info.xml file doesn't have a version of Gamma-ray pulsar binary search #1 on GPUs.
2019-04-27 21:35:21.5600 [PID=9189 ] [debug] [HOST#12775352] MSG(high) No work available for the applications you have selected. Please check your preferences on the web site.
2019-04-27 21:35:21.5600 [PID=9189 ] Sending reply to [HOST#12775352]: 0 results, delay req 60.00
2019-04-27 21:35:21.5601 [PID=9189 ] Scheduler ran 0.059 seconds
37) Message boards : The Lounge : The Seti is Down Cafe (Message 91199)
Posted 24 Apr 2019 by Profile Keith Myers
Post:
I blame Keith...just because...hahaha....

Not me! I was cursing the PC gods yesterday trying to make my new Jetson Nano SBC behave, so my normal farm was being very neglected. Very different from the Raspberry Pi 3+.
38) Message boards : The Lounge : The Seti is Down Cafe (Message 91039)
Posted 10 Apr 2019 by Profile Keith Myers
Post:
Well, as soon as I posted that the project was back up and that one host had reported, the subsequent updates on the other hosts promptly crashed the servers again.

Notice: Undefined property: stdClass::$name in /disks/carolyn/b/home/boincadm/projects/sah/html/inc/user.inc on line 236
Notice: Undefined property: stdClass::$email_addr in /disks/carolyn/b/home/boincadm/projects/sah/html/inc/user.inc on line 240
Notice: Undefined property: stdClass::$url in /disks/carolyn/b/home/boincadm/projects/sah/html/inc/user.inc on line 246
Notice: Undefined property: stdClass::$country in /disks/carolyn/b/home/boincadm/projects/sah/html/inc/user.inc on line 250
Notice: Undefined property: stdClass::$create_time in /disks/carolyn/b/home/boincadm/projects/sah/html/inc/user.inc on line 254
Notice: Undefined property: stdClass::$authenticator in /disks/carolyn/b/home/boincadm/projects/sah/html/inc/user.inc on line 255
Notice: Undefined property: stdClass::$id in /disks/carolyn/b/home/boincadm/projects/sah/html/inc/user.inc on line 276
Notice: Undefined property: stdClass::$total_credit in /disks/carolyn/b/home/boincadm/projects/sah/html/inc/user.inc on line 147
Notice: Undefined property: stdClass::$expavg_credit in /disks/carolyn/b/home/boincadm/projects/sah/html/inc/user.inc on line 148
Notice: Undefined property: stdClass::$seti_nresults in /disks/carolyn/b/home/boincadm/projects/sah/html/seti_boinc_html/project.inc on line 187
Notice: Undefined property: stdClass::$seti_total_cpu in /disks/carolyn/b/home/boincadm/projects/sah/html/seti_boinc_html/project.inc on line 193
Notice: Undefined property: stdClass::$id in /disks/carolyn/b/home/boincadm/projects/sah/html/inc/user.inc on line 173
Notice: Undefined property: stdClass::$cross_project_id in /disks/carolyn/b/home/boincadm/projects/sah/html/inc/user.inc on line 177
Notice: Undefined property: stdClass::$email_addr in /disks/carolyn/b/home/boincadm/projects/sah/html/inc/user.inc on line 177
Notice: Undefined property: stdClass::$teamid in /disks/carolyn/b/home/boincadm/projects/sah/html/inc/user.inc on line 188
Notice: Undefined property: stdClass::$id in /disks/carolyn/b/home/boincadm/projects/sah/html/inc/user.inc on line 193

SETI@home is temporarily shut down for maintenance. Please try again later.
39) Message boards : The Lounge : The Seti is Down Cafe (Message 91030)
Posted 10 Apr 2019 by Profile Keith Myers
Post:
The schedulers are currently down. Hoping for just a brief outage. Everything else is working OK on the website.
40) Message boards : The Lounge : The Seti is Down Cafe (Message 90972)
Posted 7 Apr 2019 by Profile Keith Myers
Post:
Project web site has finally appeared. Project down for maintenance.



Copyright © 2019 University of California. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.