An undoubtedly very fascinating thread about GPU capabilities running multiple tasks

Message boards : The Lounge : An undoubtedly very fascinating thread about GPU capabilities running multiple tasks
betreger
Volunteer tester
Help desk expert

Joined: 18 Oct 14
Posts: 1474
United States
Message 107352 - Posted: 12 Mar 2022, 23:32:37 UTC
Last modified: 12 Mar 2022, 23:39:50 UTC

Glory is crunching GWs 24/7 on 3 GPUs over at Einstein
robsmith
Volunteer tester
Help desk expert

Joined: 25 May 09
Posts: 1288
United Kingdom
Message 107356 - Posted: 13 Mar 2022, 18:59:10 UTC - in response to Message 107354.  

Please define "fancy cards" as doing so would help others see if their cards would be suitable to run Einstein's Gravitational Wave app on their GPUs.
betreger
Volunteer tester
Help desk expert

Joined: 18 Oct 14
Posts: 1474
United States
Message 107357 - Posted: 13 Mar 2022, 19:04:41 UTC - in response to Message 107354.  

One box has two 3GB GTX1060s, not a fancy card; the other has a GTX1660s, which is not very fancy. GWs want 3GB of memory and OpenCL.
robsmith
Volunteer tester
Help desk expert

Joined: 25 May 09
Posts: 1288
United Kingdom
Message 107371 - Posted: 14 Mar 2022, 14:47:16 UTC - in response to Message 107359.  

I have 3GB and OpenCL 1.2 on my 280X/7970 cards, they run, but never complete, the GPU usage is about 10%, the CPU has to do all the work.

I just wonder if the failure of your 280x to complete is in any way related to the behaviour I am seeing on my 1070ti when running Einstein gravitational wave tasks. They run to 99% complete in about 10 minutes, with the last 1% taking about 5 minutes with very little activity on either CPU or GPU, apart from a burst in the last few seconds when both are nearly 100%. I really must visit the Einstein forum and see if anything similar has been posted over there.
BOINC Moderator
Volunteer moderator
Project administrator

Joined: 10 Mar 20
Posts: 68
Message 107373 - Posted: 14 Mar 2022, 15:43:27 UTC
Last modified: 14 Mar 2022, 16:03:57 UTC

Moved everything over from the other thread
robsmith
Volunteer tester
Help desk expert

Joined: 25 May 09
Posts: 1288
United Kingdom
Message 107376 - Posted: 14 Mar 2022, 16:13:21 UTC - in response to Message 107372.  

Note - I said "related", not "identical".

That aside.
I assume you are (were) running only one task per GPU, as indeed I am.
There is a common item: the actual application. If this has simply been transliterated from a CPU-oriented one to something that is now being run on GPUs of various capabilities (such as our two), then there could (and indeed probably would) be a different set of symptoms displayed. Really frustrating for both of us: you seeing tasks failing after a fair length of time, while mine just sit there in thumb-twiddle mode for minutes.

Interesting observation, I've just swapped from running gravitational waves to Gamma-ray on the GPU and they run smoothly to completion in about 10 minutes with no pause and ~95% GPU utilisation. This is much more like the behaviour I would expect of a well formed application running on a GPU.
robsmith
Volunteer tester
Help desk expert

Joined: 25 May 09
Posts: 1288
United Kingdom
Message 107379 - Posted: 14 Mar 2022, 17:58:17 UTC

A couple of observations.
First, as you've said, at least one of your GPUs is quite old and low on resources. No problems there, but there have been quite a number of folks who have found that running multiple tasks on such GPUs actually reduces the overall performance (tasks run to completion per hour) when compared with running one task, and that the GPU usage is way below that expected. This effect does vary with GPU type and application so may be a red herring in this case.
Second, OpenCL versions. This is an interesting one. When an application is developed for OpenCL the developer decides what version is to be used; let's say they decide on OpenCL version 1.0. Running alongside this, the GPU developers (and indeed just about every processor developer) decide what the minimum and maximum OpenCL versions their hardware will support, let's say 0.8 to 1.5. When the application is run, one of the first things that is checked by the processor is the APPLICATION version of OpenCL, and the processor effectively switches into that mode and runs in that mode for the duration of that application. Now let's consider a slightly different scenario: the developer decides on OpenCL version 3, but we have the same processor in use. The version check fails and the application will not run; how graceful this failure is and what error messages are seen are in part down to the developers. This may explain the problems encountered while trying to run the beta applications.
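
The version check described above can be sketched roughly like this. It is only an illustration in Python, not how any real OpenCL runtime or Einstein application is written: the function names `parse_opencl_version` and `can_run` are made up for the example, and the device is reduced to a single "highest supported version". The one real detail it leans on is that OpenCL devices report their version as a string of the form "OpenCL <major>.<minor> <vendor info>".

```python
import re


def parse_opencl_version(version_string):
    """Extract (major, minor) from a string such as 'OpenCL 1.2 CUDA'."""
    match = re.match(r"OpenCL (\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError(f"unrecognised version string: {version_string!r}")
    return int(match.group(1)), int(match.group(2))


def can_run(app_required, device_version_string):
    """Return True if the device reports at least the version the app needs.

    app_required is a (major, minor) tuple chosen by the application developer.
    """
    return parse_opencl_version(device_version_string) >= app_required


# Hypothetical device string; the two application versions mirror the
# "version 1.0" and "version 3" scenarios in the post.
print(can_run((1, 0), "OpenCL 1.2 CUDA"))  # app built for 1.0, device offers 1.2
print(can_run((3, 0), "OpenCL 1.2 CUDA"))  # app built for 3.0, same device
```

With a device that only offers 1.2, the first check passes and the second fails, which is the "application will not run" case.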
Keith Myers
Volunteer tester
Help desk expert

Joined: 17 Nov 16
Posts: 879
United States
Message 107382 - Posted: 14 Mar 2022, 19:20:30 UTC

Both the Einstein GW and GRP apps pause around 90-98% at the end of the run to do work unit top list processing on the cpu. That is the cause of the pause on the tasks. All that is normal.
The reason is the programmers want better FP64 precision than a gpu can provide for sorting the toplist so they transfer that last bit of computing from the gpu back to the cpu.
To overcome the underutilization of the gpu during that last 10% of the WU crunching, you can run doubles or triples on each gpu if they have enough memory. By staggering the starts and endings of the work units, you can keep the gpu utilization at 98%-100%.
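
A toy model of why staggering helps, assuming (purely for illustration) that each task keeps the GPU busy for the first 90% of its run and leaves it idle for the final 10%, and ignoring any slowdown from tasks sharing the GPU. The function name `gpu_busy_fraction` and the time-slot approach are made up for this sketch.

```python
def gpu_busy_fraction(num_tasks, task_length=100, gpu_phase=90):
    """Fraction of time slots in which at least one task is in its GPU phase.

    Each task repeats back to back. Task i starts offset by
    i * task_length // num_tasks slots, i.e. the starts are evenly staggered.
    """
    busy = 0
    for t in range(task_length):
        for i in range(num_tasks):
            offset = i * task_length // num_tasks
            position = (t - offset) % task_length  # where task i is in its run
            if position < gpu_phase:
                busy += 1
                break  # one task on the GPU is enough to count the slot as busy
    return busy / task_length


print(gpu_busy_fraction(1))  # single task: GPU idle during the CPU-only tail
print(gpu_busy_fraction(2))  # two staggered tasks cover each other's tail
```

In this simplified model one task leaves the GPU idle for a tenth of the time, while two evenly staggered tasks never leave it idle. Real tasks compete for the GPU, so actual gains will be smaller.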


betreger
Volunteer tester
Help desk expert

Joined: 18 Oct 14
Posts: 1474
United States
Message 107383 - Posted: 14 Mar 2022, 19:31:52 UTC - in response to Message 107382.  

+1 well said
robsmith
Volunteer tester
Help desk expert

Joined: 25 May 09
Posts: 1288
United Kingdom
Message 107384 - Posted: 14 Mar 2022, 21:35:56 UTC - in response to Message 107382.  

Thanks for the explanation Keith.
Ian&Steve C.

Joined: 24 Dec 19
Posts: 228
United States
Message 107385 - Posted: 14 Mar 2022, 22:24:46 UTC - in response to Message 107376.  
Last modified: 14 Mar 2022, 22:25:31 UTC


Interesting observation, I've just swapped from running gravitational waves to Gamma-ray on the GPU and they run smoothly to completion in about 10 minutes with no pause and ~95% GPU utilisation. This is much more like the behaviour I would expect of a well formed application running on a GPU.


the app you're currently running (v1.22) actually has a substantial code flaw making this app not well optimized for Nvidia cards (way slower than it should be for the level of compute performance in modern Nvidia GPUs). many people tried to claim "AMD is just better at Einstein" which turned out to be wrong in actuality. The way it was coded essentially held back Nvidia GPUs and forced some of the calculations to run serially instead of in parallel like it should be. even though this app reports "95%" core use, I'm sure you'll notice that the power used is less than other apps which truly load up the GPU. The old code is like a handbrake for Nvidia GPUs.

However, if you update your current drivers from the 461 version you're currently running (which only support OpenCL 1.2), to more recent drivers 470+ (which support OpenCL 3.0), you will get the newer v1.28 gamma ray application from Einstein and your tasks will run about 40-50% faster. Petri and I worked together to improve the code in the Einstein app (mostly him, I did a bunch of testing and had very small code contributions), and I communicated the necessary changes to the Einstein project developers who incorporated it into this new application. the new app is available for everyone with an Nvidia GPU as long as you have the required drivers. The method in use to restore parallelized code for Nvidia is a method only available in OpenCL 2.0 and up. The timing of Nvidia releasing OpenCL 3.0 capable drivers aligning with Petri's interest in fixing the app couldn't have been better :).

update the drivers, the project will send you the new app, and you process work faster. as simple as that. (only new app for gamma ray at this time)


for the gravitational wave tasks, the app is fairly CPU bound since the devs couldn't put certain calculations onto the GPU. so the GPU is constantly waiting around for work from the CPU, which is why GPU utilization can be rather low with this app. using a faster CPU can help, but not enough to reach full utilization. most people just run 2-3 tasks concurrently to try to load up the GPU more. it's also normal (for this app) to pause at 99% while it does some final calculations. I can't remember if it's using the CPU or GPU double precision (fp64) to finish this bit, but both would explain the GPU low utilization at this time.
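
Whether running 2-3 tasks concurrently actually pays off can be checked with simple arithmetic on completed tasks per hour. A minimal sketch: the 15 minutes for a single task is the figure reported earlier in the thread for a 1070ti, while the 25 minutes for two concurrent tasks is an invented placeholder, not a measurement. `tasks_per_hour` is a made-up helper name.

```python
def tasks_per_hour(concurrent_tasks, minutes_per_task):
    """Completed tasks per hour when `concurrent_tasks` run side by side."""
    return concurrent_tasks * 60 / minutes_per_task


single = tasks_per_hour(1, 15)  # one task at a time, about 15 minutes each
double = tasks_per_hour(2, 25)  # hypothetical: two at once, 25 minutes each
print(single, double)
```

If the doubled-up runtime is less than twice the single-task runtime, throughput goes up; if it is more than twice, running one task at a time is the better choice.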
Ian&Steve C.

Joined: 24 Dec 19
Posts: 228
United States
Message 107386 - Posted: 14 Mar 2022, 22:39:48 UTC

and even though it seems this thread was spun off from another thread, it seems better suited to the GPU forum instead of here.

Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.