Message boards : GPUs : How to increase GPU performance?
Author | Message |
---|---|
Send message Joined: 29 Sep 20 Posts: 22 |
Hello everyone. I am running a GTX 1660 AMP under Windows 10. When I run projects like NumberFields, the GPU runs at only a small percentage of its potential: according to Windows Task Manager, 5% at 52 degrees C. Is there a way of increasing throughput? I have tried running a BOINC project and a Folding@home process simultaneously, which does increase the usage percentage and the temperature, but it also slows down both processes. |
Send message Joined: 5 Oct 06 Posts: 5112 |
The efficiency of the application, in this situation, is very much determined by the way it is coded. Eric Driver, the administrator of NumberFields, is new to GPU programming, and this is his first attempt at it. I also tried this app, and noticed a different problem: if the GPU is also providing the video signal for the monitor, screen refresh becomes noticeably sluggish. Both of these are symptoms of an immature application. They can't be cured externally, although sometimes workrounds can mitigate the effects. I would expect them to be improved in later releases. Eric has posted about releasing the source code on Github once he's sure it's working properly. If you are an experienced GPU programmer, perhaps you could help him out? |
Send message Joined: 29 Sep 20 Posts: 22 |
Thanks for your reply, Richard. Unfortunately I am not an expert on GPU programming. I was just wondering whether I could run two WUs at the same time, as I have read in several other posts, but they never explain in detail how they achieved that. |
Send message Joined: 5 Oct 06 Posts: 5112 |
By all means try it, but be prepared for disappointment. Have a look at the User manual, especially the section on Project-level configuration. You can create a configuration file that specifies how much of the GPU will be scheduled to run the app. Note that this doesn't control the app: it just means that BOINC allows extra copies to run on the same hardware. |
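As a rough illustration of the project-level configuration file mentioned above, here is a minimal sketch that generates an `app_config.xml` asking BOINC to schedule two tasks per GPU. The app name `example_gpu_app` is a hypothetical placeholder (the real name has to match the project's application name), and the 0.5 GPU / 1.0 CPU figures are assumptions to experiment with, not recommended values.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name: str, gpu_usage: float, cpu_usage: float) -> str:
    """Return the text of a BOINC app_config.xml for one GPU application.

    gpu_usage is the fraction of a GPU that BOINC budgets per task,
    so 0.5 lets the client schedule two tasks on the same card.
    """
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    gpu_versions = ET.SubElement(app, "gpu_versions")
    ET.SubElement(gpu_versions, "gpu_usage").text = str(gpu_usage)
    ET.SubElement(gpu_versions, "cpu_usage").text = str(cpu_usage)
    ET.indent(root)  # pretty-print; requires Python 3.9 or newer
    return ET.tostring(root, encoding="unicode")


# "example_gpu_app" is a made-up name used only for illustration.
config_xml = build_app_config("example_gpu_app", gpu_usage=0.5, cpu_usage=1.0)
print(config_xml)
```

The resulting text would be saved as `app_config.xml` in that project's folder under the BOINC data directory and picked up after the client re-reads its config files. As noted above, this only changes how many copies BOINC starts; it does not change how the app itself uses the GPU.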
Send message Joined: 29 Sep 20 Posts: 22 |
Before fiddling with the settings by writing XML files, I tried a simpler approach and tested my GPU with Einstein. On their settings page you can simply allow several WUs to run on your GPU at once. The outcome was that my GPU still didn't go higher than 5% usage and the temperature remained at 53 degrees, although it crunched two WUs at the same time. On the other hand, it took more than twice as long, so there was nothing to be gained there. I just wonder what those pros with high-end graphics cards do differently. They always claim that they run three or even four WUs at the same time. Maybe they just didn't realize that the calculations take longer that way, or they have a trick up their sleeves to avoid that. Curious... Thanks, Richard. |
Send message Joined: 14 Aug 19 Posts: 55 |
You're probably not seeing the true GPU utilization. I don't have Win10, but there's a setting in its task manager for seeing GPU compute loads. That's what you need to look for. You can also use a 3rd party app like Afterburner or my favorite, System Information Viewer. I'm sure your GPU load is actually much higher than 5%, but if it isn't, then your CPU is probably overloaded. Reduce the number of CPU jobs running for better output. You can't directly compare how projects run, as different apps can behave quite differently. It's true in all cases that running concurrent tasks will take longer than running one. The test is to see how it works over time. As a simple example, if a single task takes five minutes but running two at once takes less than ten minutes, that advantage adds up over time. In such a case you can keep scaling up until you hit the limit. You still have to make sure you have enough CPU support; Nvidia apps in particular tend to like having a full thread available. Team USA forum Follow us on Twitter Help us #crunchforcures! |
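The five-minute example above can be turned into a quick throughput check. This is only a sketch: the 9-minute and 11-minute batch times are made-up figures for illustration, not measurements from any project.

```python
def tasks_per_hour(concurrent_tasks: int, batch_minutes: float) -> float:
    """Completed tasks per hour when `concurrent_tasks` run side by side
    and the whole batch finishes after `batch_minutes`."""
    if concurrent_tasks < 1 or batch_minutes <= 0:
        raise ValueError("need at least one task and a positive batch time")
    return concurrent_tasks * 60.0 / batch_minutes


single = tasks_per_hour(1, 5.0)   # one task at a time, five minutes each
double = tasks_per_hour(2, 9.0)   # hypothetical: two at once finish in 9 minutes
slower = tasks_per_hour(2, 11.0)  # hypothetical: two at once finish in 11 minutes

print(f"single: {single:.1f}/h, two in 9 min: {double:.1f}/h, two in 11 min: {slower:.1f}/h")
```

Running two tasks only pays off when the batch time stays under twice the single-task time; the comparison should be made on averages over many tasks of the same type.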
Send message Joined: 17 Nov 16 Posts: 883 |
Or use the utility provided by Nvidia, if you are running drivers provided by Nvidia and not Microsoft. The nvidia-smi application shows card utilization, fan speed and temperatures when run in a command terminal. The application is located at C:\Program Files\NVIDIA Corporation\NVSMI. Run nvidia-smi -l 2 to poll the Nvidia cards in the host every 2 seconds. |
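For anyone who would rather log these readings than watch the terminal, here is a minimal sketch that asks nvidia-smi for CSV output and splits it into named fields. It assumes nvidia-smi is on the PATH; the chosen fields are just one possible selection, and the sample line is made up to show the expected shape.

```python
import subprocess

# Fields requested from nvidia-smi; this selection is an assumption.
QUERY_FIELDS = ["utilization.gpu", "temperature.gpu", "power.draw", "fan.speed"]


def parse_smi_line(line: str) -> dict:
    """Split one CSV line of nvidia-smi output into a field -> value dict.

    Values stay as strings because unsupported fields are not reported
    as numbers.
    """
    values = [value.strip() for value in line.split(",")]
    if len(values) != len(QUERY_FIELDS):
        raise ValueError(f"expected {len(QUERY_FIELDS)} values, got {len(values)}")
    return dict(zip(QUERY_FIELDS, values))


def query_gpus() -> list:
    """Run nvidia-smi once and return one dict per GPU in the host."""
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=" + ",".join(QUERY_FIELDS),
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return [parse_smi_line(line) for line in result.stdout.splitlines() if line.strip()]


# Made-up sample line, only to show the parsing without needing a GPU.
sample = parse_smi_line("97, 53, 88.41, 45")
print(sample)
```

Calling `query_gpus()` in a loop with a short sleep gives roughly the same effect as `nvidia-smi -l 2`, with the readings available for logging.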
Send message Joined: 8 Nov 19 Posts: 718 |
Windows Task Manager, GPU-Z, HW Monitor, MSI Afterburner, ... None of these programs can really tell how busy your GPU is. They will give you an estimate, but it's often dead wrong. The closest thing to GPU utilization that you can measure is wattage or temperature. If the GPU is 100% loaded, it'll draw X watts, and with the fan at 100% it will run at Y degrees. In Linux, the wattage can be read directly via the nvidia-smi command. Nvidia GPUs are notorious for not having enough actual cores doing the DPP instructions. Most of their 32-bit FPP shaders are waiting for 64-bit DPP instructions to complete. The only ones that actually have enough cores to feed all shaders are the professional GPUs (like the TITAN, Quadro, and Tesla), Quadro being high price, low performance. For such projects you're better off running AMD GPUs, like the RX5700XT or newer. |
Send message Joined: 25 May 09 Posts: 1289 |
WRONG - GPU cooling systems are not designed to be used as GPU load indicators. They are designed around the maximum of a theoretical mix of loads on the blocks that make up a modern GPU, which is below the absolute maximum that would be reached with every block running at its individual maximum. Then there is overclocking, which pushes parts of a GPU beyond their design speed, thus drawing more power than the norm. The tools you list are designed to show how much of each of the resources is being used. Most older consumer GPUs DO NOT HAVE ANY double float processing units, not just "fewer". As for the performance of the latest Quadro units - yes, they are stupidly expensive, and they make the AMD equivalents look as if they are stuck in glue for both single & double float calculations. |
Send message Joined: 14 Aug 19 Posts: 55 |
"Windows task manager, GPU-Z, HW Monitor, MSI Afterburner, ...." Often dead wrong? There's no reason to think these programs aren't accurate, as they're getting the data from the driver. nvidia-smi is part of the driver package and is also available in Windows, and all these values, including utilization, can be verified with it if one desires. I've done it, and GPU-Z and Afterburner are accurate, as well as System Information Viewer, in my testing. Afterburner in particular is really just a nice GUI for nvidia-smi; if we can trust it to set clock speed and power limits (we can), we can trust it to report utilization. If you're arguing the driver isn't reporting accurate data, that's different, but let's see the proof. There are bugs from time to time, but that's expected with software. Temperature is not a reliable indicator; ever seen the results from an incorrectly mounted or poorly functioning cooler? Conversely, a well functioning system can keep things frosty, relatively speaking, and temps might be "low". This could mislead you into thinking the GPU isn't working that hard when it actually is, especially for lower powered GPUs like the OP's. I wouldn't strictly trust the wattage reported from software either, but this is just muddying the waters anyway as it relates to the original question. |
Send message Joined: 29 Aug 05 Posts: 15533 |
"Temperature is not a reliable indicator; ever seen the results from an incorrectly mounted or poorly functioning cooler?" Nothing has been said about GPUs that are passively cooled but that may have a case fan blowing over them to cool them that way. |
Send message Joined: 14 Aug 19 Posts: 55 |
I wouldn't recommend using those for compute purposes. |
Send message Joined: 25 May 09 Posts: 1289 |
This all depends on one's definition of "passive cooling" - I know of one location that has way over a thousand very late model Quadro GPUs that only have heat sinks on them, with cooling being by forced, refrigerated, dried air. In one sense of the word they are passively cooled, in that they have no control over their cooling: it is forced upon them by the air flow, which is controlled by a vast array of temperature sensors around the system. In another sense, given the fact that there are all those temperature sensors around, they may be considered to have some form of reactive cooling... |
Send message Joined: 29 Aug 05 Posts: 15533 |
|
Send message Joined: 17 Nov 16 Posts: 883 |
Real trepidation now that I hear that Nvidia is going to utilize an algorithm to detect crypto-mining on new drivers to be released for RTX 3060 gpus this month. And the algorithm will cut crunching performance by 50%. Who wants to bet that the algorithm will also detect science application crunching as "crypto-mining" and cut our crunching performance? This supposedly is in preparation for the newly announced CMP processors Nvidia is making specifically for crypto-mining without video outputs. I don't see how this will alleviate the pressure on the fabs and wafer allotments for gaming gpu silicon either. |
Send message Joined: 8 Nov 19 Posts: 718 |
Temperature wise is under a controlled condition, with fans set at a specific speed, and at a controlled ambient temperature. Wattage is obvious. Other (software) indicators often see only a part of the GPU. E.g., if your 64bit Double Precision processors are working at 100% but none of the shaders are, the GPU may show 100% activity while pulling only a third of the wattage and, as a result, running very cool. The GPU frequency may even be boosted to max, but if you're not using the 32-bit / 16-bit shaders, then you're not using the GPU to its fullest. As for temperature not being a good indicator: temperature is directly related to power consumption, so yes, in a controlled environment, temperature can very accurately tell you how much of the GPU is used! Maybe not as a direct indicator, but it can show you an average (as temps fluctuate more slowly than wattage readings). The best way to see GPU utilization is to read its wattage, which is easy in Linux (nvidia-smi). |
Send message Joined: 29 Aug 05 Posts: 15533 |
"64bit Double Precision processors" You do know these don't exist? Single precision and double precision are forms of floating point operations within the CPU/GPU, where the format for 32-bit numbers is called single precision, and the format for 64-bit numbers is called double precision. Floating point is used to represent fractional values, or when a wider range is needed than is provided by fixed point (of the same bit width), even if at the cost of precision. Double precision may be chosen when the range or precision of single precision would be insufficient. (Source) |
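To make the difference between the two formats concrete, here is a small sketch that rounds a value through the 32-bit format and compares it with Python's native 64-bit float. It only illustrates the number formats, not how any particular GPU executes them; the helper name `to_float32` is made up for this example.

```python
import struct


def to_float32(x: float) -> float:
    """Round a Python float (64-bit double) to the nearest 32-bit single
    and return it as a Python float again."""
    return struct.unpack("f", struct.pack("f", x))[0]


big = 16777217.0             # 2**24 + 1, exactly representable as a double
as_single = to_float32(big)  # single precision has a 24-bit significand
as_double = big

print(as_single, as_double)
print(to_float32(0.1) == 0.1)
```

The integer 2**24 + 1 needs 25 significant bits, so it survives in double precision but is rounded to 16777216.0 in single precision. The same loss happens with 0.1, which is stored as a slightly different approximation in each format.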
Send message Joined: 14 Aug 19 Posts: 55 |
"Temperature wise is under a controlled condition, with fans set at a specific speed, and at a controlled ambient temperature." Not everyone is running systems in climate controlled rooms. Years ago I had a dual axial fan 760 that ran quite hot, to the point that it was severely thermal throttling. That GPU cooler was not a good choice for that system. If I had only been monitoring temps, I would have thought the card was working extremely hard, and been very confused when my output was not what it should have been. Temperature is not a good substitute regardless of the factors you state, although it's important to monitor for system health. I find it very odd that you trust nvidia-smi for temp and power reporting but not the actual utilization rate, unless your argument is that power consumption is equivalent to utilization. That's not true either, as the card can run at or near 100% without maxing out its power draw. If the compute load is 100%, the card is working to its capacity regardless of the power rate. The application's demands are what matter for utilization reporting, not whether the GPU is performing every type of calculation it's capable of. My preferred GPU monitoring app for Nvidia on Linux is nvtop. It shows you everything discussed here and more, including VRAM usage, so everyone can see their preferred metrics. To be clear, I do monitor the power rate, but that's mainly because I set power limits. If there's a driver crash, the power limit has to be reapplied, at least on Windows. |
Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License,
Version 1.2 or any later version published by the Free Software Foundation.