Definition of tasks as CPU or GPU

Message boards : Questions and problems : Definition of tasks as CPU or GPU

Dave
Joined: 28 Jun 10
Posts: 1377
United Kingdom
Message 102732 - Posted: 28 Jan 2021, 19:51:51 UTC

On PrimeGrid, I have my Ryzen accepting only GPU tasks. I discovered today (I probably should have noticed ages ago) that some of them also use 100% CPU. I already understood that some CPU usage is inevitable when running a GPU task, but I didn't know they could use 100% of a core/thread. If I have my machine set to use only 25% of the available cores, does that 25% include the one being used for a GPU task, or not?

BOINC 7.17.0, Xubuntu 20.10, but I suspect what I have noticed is true more generally.
ID: 102732
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 4498
United Kingdom
Message 102733 - Posted: 28 Jan 2021, 20:16:52 UTC - in response to Message 102732.  

GPUs can do very fast maths, but are not good at the management and control - decision-making - parts of a scientific application. Different applications, different hardware, different operating system - and different programmers - have different ways of coping with this.

If a GPU application keeps a CPU running at 100%, the CPU is (in most cases - not every time: check the small print) very lightly loaded. It's in what's called a busy-wait state, where it asks - millions of times a second - "Do you need me for anything?". Usually, the answer is no, but every few milliseconds, the GPU will want to have the old results taken away, and to be given something new to do. For efficiency, it's important that the CPU is alert to respond immediately.
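The busy-wait Richard describes can be sketched roughly like this (an illustrative toy in Python; real drivers do this in native code, and all names here are made up):

```python
# Toy sketch of a busy-wait (spin) loop, as described above.
# Real GPU drivers do this in native code; all names here are made up.
import itertools

def busy_wait(gpu_has_results, max_spins=1_000_000):
    """Spin until the GPU signals results, counting loop iterations.
    The loop does no useful work, but keeps the core at 100% utilization."""
    for spins in itertools.count():
        if gpu_has_results() or spins >= max_spins:
            return spins

# Simulate a GPU that becomes ready after 1000 polls
polls = iter(range(1001))
spins = busy_wait(lambda: next(polls) >= 1000)
print(spins)  # -> 1000
```

The point of spinning instead of sleeping is latency: a sleeping thread takes far longer to wake than a spinning one takes to notice the GPU is ready.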

To answer your direct question: yes, BOINC will count that as using a whole CPU, and it'll prevent a pure CPU task from running as a result. But it won't be working power-hungry components like the floating-point unit hard, if at all. If you reduced the CPU loading for thermal reasons, you might be able - careful! - to get away with increasing the permitted percentage a bit.
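The accounting works out roughly as follows (illustrative arithmetic only, not BOINC's actual code; the 16-thread figure is an assumption for the example):

```python
# Illustrative sketch of the core accounting (not BOINC's actual code).
# Assumption: a 16-thread Ryzen, purely for the example.
import math

def cpu_tasks_allowed(total_threads, percent_cores, gpu_cpu_reservations):
    """How many pure CPU tasks fit in the budget once GPU tasks
    have reserved their share of whole CPUs."""
    budget = math.floor(total_threads * percent_cores / 100)
    reserved = sum(gpu_cpu_reservations)  # e.g. [1.0] for one "1 CPU + 1 GPU" task
    return max(0, math.floor(budget - reserved))

# 25% of 16 threads = 4; a GPU task counted as a whole CPU leaves 3
print(cpu_tasks_allowed(16, 25, [1.0]))  # -> 3
print(cpu_tasks_allowed(16, 25, []))     # -> 4
```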
ID: 102733 · Report as offensive
Dave
Joined: 28 Jun 10
Posts: 1377
United Kingdom
Message 102734 - Posted: 28 Jan 2021, 21:26:56 UTC - in response to Message 102733.  

Thanks Richard,
That fills a gap in my understanding of BOINC. I don't have to worry much about cooling, as I have liquid cooling and would only be taking my Ryzen from five cores up to six.
ID: 102734
ProDigit

Joined: 8 Nov 19
Posts: 633
United States
Message 102750 - Posted: 30 Jan 2021, 20:57:52 UTC
Last modified: 30 Jan 2021, 21:17:11 UTC

If you have an Nvidia GPU, the Nvidia driver will busy-poll from a CPU thread, which shows up as 100% CPU utilization.
Most of that is just polling, though, so the CPU is not actually processing much of anything.
It's just enough to keep that CPU thread from powering down to a sleep state or lowering its frequency, by locking the GPU's feeder work onto that one CPU core/thread.
This gives a marginal increase in GPU performance, at the cost of a small increase in CPU power consumption.
If you enable kernel times in Windows Task Manager, or use htop on Linux, you'll see that the real work of feeding the GPU (depending on the speed of the GPU) often accounts for anywhere between 25% and 75% of the thread; the remaining usage (up to 100%) isn't actual processing.
For example, an RTX 2080 might use about 50% of a single thread on a Ryzen 3900X, meaning it really only needs about 2-2.5 GHz of one thread. The rest is just there so the GPU can be handed new data more quickly.
This is because the CPU core/thread that's feeding the GPU already has some of the data loaded in cache, which speeds up moving that data to the GPU.
If Nvidia did it the way AMD does, then as soon as the CPU had fed the data, the thread could be used by other processes.
On a four-chiplet design you have four separate L3 cache blocks.
If all cores within a chiplet are occupied by another application, the data loaded in that chiplet's cache first has to be written out and loaded into another chiplet's cache, which costs compute time and energy.

This is why the Nvidia driver locks its GPU feeder work onto one CPU core/thread, and only switches threads when the needed data is no longer in the cache attached to that thread but is loaded in another cache block instead.
This is all done at the driver level, and switching to another thread may happen for a variety of reasons, including moving to a cooler (less utilized) chiplet, or away from an over-utilized one (too many programs wanting the first chiplet, keeping too many of its cores occupied).


If you run BOINC with one GPU and set the CPU to 25% utilization, then on an 8-core/16-thread CPU, BOINC will aim to use only 4 threads.
How BOINC accounts for the GPU task's CPU usage depends on the project.
Some projects declare something like "0.01 CPU + Nvidia GPU", as the Collatz Conjecture project does.
In that case, BOINC disregards the CPU needs of the GPU task and runs 4 CPU WUs, plus whatever the GPU task uses (essentially under 10% of an extra thread doing real work, with the other ~90% of that thread spent busy-waiting).
In essence, BOINC will appear to be using 5 threads.
If the project declares "0.97 CPU + Nvidia GPU", then you'll be using roughly three quarters of an extra CPU thread for real work, with the other quarter idle polling; so again 5 threads, though the fifth thread does more actual CPU computation.
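A rough sketch of that thread accounting (illustrative only, not BOINC's real scheduler; the function name is made up):

```python
# Illustrative sketch of BOINC's thread accounting for one GPU task
# (simplified; BOINC's real scheduler is more involved).
import math

def threads_in_use(cpu_percent, total_threads, gpu_cpu_fraction):
    """CPU WUs scheduled plus the extra thread the GPU task occupies."""
    cpu_wus = math.floor(total_threads * cpu_percent / 100)
    # Fractions below 1.0 are not charged against the CPU budget here,
    # but the driver's busy-wait still occupies a whole thread.
    return cpu_wus + math.ceil(gpu_cpu_fraction)

# 8-core/16-thread CPU at 25%: 4 CPU WUs + 1 thread for the GPU task,
# regardless of whether the project declares 0.01 or 0.97 CPU
print(threads_in_use(25, 16, 0.01))  # -> 5
print(threads_in_use(25, 16, 0.97))  # -> 5
```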

The same is true for multiple GPUs: you'll use one extra thread per GPU.


If, on the other hand, you limit your CPU in the BIOS to 4 cores/4 threads, have 2 GPUs, and set BOINC to use 100% of the CPU:
in the first case (0.01 CPU), BOINC will run 4 CPU WUs and let them share the cores with the GPU feeder threads.
In the second case (0.97 CPU per GPU), BOINC will reserve more than one core for feeding the 2 GPUs.
It assigns 1.94 threads' worth (2 × 0.97) to the GPUs, and the remainder is split between CPU WUs.
So in essence you'll be running 2 CPU WUs at full speed, 1.94 threads for the GPUs at full speed, and 0.06 of a thread for the remaining CPU WU.
In real life the CPU time is shared, so each CPU WU will run at roughly 97% of full speed; the GPU feeder threads get higher priority and should still run at 100% speed.
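That split works out numerically like this (plain illustrative arithmetic, not BOINC code):

```python
# Illustrative arithmetic for the 4-core, 2-GPU, 0.97-CPU case above.
total_threads = 4
gpu_feed = 2 * 0.97                    # threads reserved for feeding the two GPUs
remaining = total_threads - gpu_feed
full_speed_wus = int(remaining)        # CPU WUs that get a whole thread each
leftover = round(remaining - full_speed_wus, 2)  # fraction left for one more WU
print(gpu_feed, full_speed_wus, leftover)  # -> 1.94 2 0.06
```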

If your system has 10 GPUs, each using 0.01 CPU per GPU WU, on a quad-core system, BOINC will feed all the GPUs using a total of 0.10 threads and split the remaining CPU capacity across CPU WUs.
If the project is set to use 0.97 CPU, the CPU can only supply about 4 GPUs' worth of feeding (4 threads ÷ 0.97 ≈ 4.12).
In that scenario, 5 GPUs might be fed, each potentially running at about 80% speed (bottlenecked by the CPU).
The parameters declared by each project ("0.01 CPU + Nvidia GPU" or "0.97 CPU + Nvidia GPU") are not always correct for every system.
It's perfectly possible for a system to use more (or less) CPU than indicated.
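The feeding-capacity arithmetic from the examples above, as a hypothetical sketch (not BOINC code; the function name is made up):

```python
# Illustrative arithmetic for the multi-GPU examples above (not BOINC code).

def gpu_feed_capacity(cpu_threads, cpu_per_gpu_task, gpus_fed):
    """How many GPUs' worth of feeding the CPU can supply, and the
    per-GPU speed if that feeding is spread across `gpus_fed` GPUs."""
    capacity = cpu_threads / cpu_per_gpu_task
    per_gpu_speed = min(1.0, capacity / gpus_fed)
    return capacity, per_gpu_speed

# 10 GPUs at 0.01 CPU each on a quad-core: feeding is nowhere near the limit
cap, speed = gpu_feed_capacity(4, 0.01, 10)
print(round(cap), speed)  # -> 400 1.0

# 0.97 CPU per GPU, spread across 5 GPUs as described above
cap, speed = gpu_feed_capacity(4, 0.97, 5)
print(round(cap, 2), round(speed, 2))  # -> 4.12 0.82
```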

Hope this makes sense.
ID: 102750


Copyright © 2021 University of California. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.