It appears that CPU tasks take way too long to crunch when GPU is active

Message boards : Questions and problems : It appears that CPU tasks take way too long to crunch when GPU is active
Bill
Joined: 13 Jun 17
Posts: 91
United States
Message 91930 - Posted: 25 Jun 2019, 20:24:23 UTC

For several weeks now, I have noticed that tasks are taking too long to crunch on my laptop. This is a little long and convoluted, so let me try to explain it as clearly as I can. This is all occurring on a Lenovo P51 laptop with an i7-7820HQ CPU, Intel HD Graphics 630, and an Nvidia Quadro M1200.

I started noticing that the new N-Body Milkyway tasks were taking a really long time to compute. Sometimes a day or three of total computing time! Then, a few days ago, I noticed that Einstein tasks were starting to take over a day to compute (at least by ETA), and yesterday I noticed the same for Seti tasks (the ETA increased to around a full day, although the Seti tasks appear to be taking their normal ~2 hrs as I'm typing this).

When I say it is taking a long time to compute, I mean that the elapsed compute time, the ETA, or both are really long. Additionally, the ETA timer doesn't consistently count down: sometimes a few seconds tick off, but then 5+ seconds are added back a little later. It's as if BOINC doesn't know how much time a task is going to take at any given moment.

At first, I suspected the MW tasks were simply new and perhaps too big, which would explain why they were taking so long. That theory no longer holds, since I started noticing this on the other projects. I then suspected this was happening because I had enabled iGPU tasks on Seti. However, I set Seti to NNT, and even though all iGPU tasks have since completed, the same problem is still happening.

GPU tasks for the most part do not appear to be affected by this, but they seem to need as much CPU time to complete as wall time, which is odd. The ETA clock also appears to jump around more often when GPU tasks are running.

The problem is that there seems to be a lot going on here, and I have not taken meticulous notes. I feel like I would need to record the ETA of a task when downloaded, the ETA at certain progress points, and whether it was computed with or without GPU tasks running. Obviously, that is a lot of note taking and I don't have the time to record all that information. So, I'm not sure whether the problem is related to Windows 10 Pro, the computer itself, or something BOINC/project related. I do know that the CPU ramps up to boost clock speeds and thermals are within their normal range, so I don't suspect the CPU is downclocking for any reason. CPU usage for all other programs appears normal as well.

Any ideas of where I can start to attempt to figure this out?
Keith Myers
Volunteer tester
Help desk expert
Joined: 17 Nov 16
Posts: 863
United States
Message 91934 - Posted: 26 Jun 2019, 1:52:46 UTC - in response to Message 91930.  

Not enough CPU to support all the work attempted, and not enough time slices per thread. Drop CPU utilization by either reducing CPU work or reducing GPU work that needs CPU attention.
Dave
Help desk expert
Joined: 28 Jun 10
Posts: 2518
United Kingdom
Message 91937 - Posted: 26 Jun 2019, 5:25:35 UTC - in response to Message 91930.  

Just guessing here, but could the hard disk be struggling to keep up with data from the GPU as well as the CPU?
MarkJ
Volunteer tester
Help desk expert
Joined: 5 Mar 08
Posts: 272
Australia
Message 91938 - Posted: 26 Jun 2019, 7:56:13 UTC

Are you using the iGPU (the Intel graphics)? That will slow down the CPU tasks.

The second thing to look at would be cooling: with the CPU and both GPUs going, a laptop will struggle with the heat and likely throttle to protect itself. Laptop cooling systems tend to get clogged with fluff easily, so yours may need cleaning.
MarkJ
Bill
Joined: 13 Jun 17
Posts: 91
United States
Message 91941 - Posted: 26 Jun 2019, 12:44:02 UTC

Addressing everyone's responses in order:

Keith: Perhaps, but this was not a problem before. I did update the BIOS, so perhaps recent security patches have affected the processor. I've backed down to 75% of CPUs and 90% of CPU time; we will see how that affects things.
Dave: It is an SSD, so I doubt it, but I can't say that I've looked at this.
Mark: I was, and I thought that was a problem, but I haven't been running iGPU tasks for a day and I'm still having the same problem.
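
(For reference, those two settings can also be pinned locally in a global_prefs_override.xml in the BOINC data directory; this is a minimal sketch of just the two elements mentioned above, and it overrides the corresponding web preferences when read via the Manager's "Read local prefs file":)

```xml
<!-- Sketch: limit BOINC to 75% of CPU threads and 90% of CPU time. -->
<global_preferences>
   <max_ncpus_pct>75</max_ncpus_pct>
   <cpu_usage_limit>90</cpu_usage_limit>
</global_preferences>
```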
MarkJ
Volunteer tester
Help desk expert
Joined: 5 Mar 08
Posts: 272
Australia
Message 91946 - Posted: 27 Jun 2019, 12:08:58 UTC

Have you checked the CPU and GPU temps? Use GPU-Z for the GPU details and some other utility to get the CPU's temperature readings while running. I assume you're running Windows.
MarkJ
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15477
Netherlands
Message 91949 - Posted: 27 Jun 2019, 13:53:24 UTC

I still like SpeedFan a lot for quickly checking temperatures and, where necessary, adjusting fan speeds.
Keith Myers
Volunteer tester
Help desk expert
Joined: 17 Nov 16
Posts: 863
United States
Message 91951 - Posted: 27 Jun 2019, 15:37:34 UTC - in response to Message 91941.  

Keith: Perhaps, but this was not a problem before. I did update the bios, so perhaps with recent security patches, that has affected the processor. I've backed down to 75% of CPUs and 90% of CPU time, we will see how that affects things.

You mentioned security patches. The recent security mitigations for Intel processors have reduced their performance by 30% according to many online testing reviews of the patches. You might be observing this factor.

Updating the BIOS could have changed the setup of the processor clocks and boost settings and/or the memory speeds and the motherboard settings could have changed also from previous setup.

I would avoid using the iGPU for crunching, as that normally severely increases CPU crunching times.
Bill
Joined: 13 Jun 17
Posts: 91
United States
Message 91993 - Posted: 1 Jul 2019, 13:03:53 UTC

So I have set the number of CPUs to 75% (6 of 8 threads) and CPU usage to 90%. I have also stopped crunching iGPU tasks. With rec_half_life_days set to 1, my ETA for tasks has dropped significantly. MW N-Body tasks are now down to 4-9 hours (a wide range, but much less than before). SETI tasks appear to be back to their normal ~2 hours, and Einstein is under 10 hours (depending on the application). Overall, performance is definitely better, whether the GPU is crunching or not. What gets me is that I used to run this computer at 100% of cores/threads 100% of the time, without iGPU tasks, and this was not a problem before.

Oh, I also set the work buffer back down to 0.5 days / 0.2 additional days. I always forget that higher values aren't necessarily better.

However, something still doesn't seem right. I'm still crunching on the Quadro M1200 for all three projects, and the ratio of CPU time to GPU run time seems off. For example, this Seti task shows roughly 17.5 minutes for both CPU and GPU time; similar situation for this Einstein task. Milkyway tasks don't seem to be as bad, but the ratio still looks off to me.

I have a dedicated thread for GPU tasks in the respective app_config files. Well, except for Einstein; I don't know the exact application names since they changed them, but they show 1 CPU per GPU. I am also only running one GPU task on the Quadro. I feel like I'm missing something, but I'm not sure what.
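
(The "dedicated thread per GPU task" arrangement described above looks roughly like this in a project's app_config.xml; the app_name here is a placeholder, so use the short application name from that project's client_state.xml:)

```xml
<!-- Sketch of an app_config.xml entry reserving a full CPU thread per GPU task.
     "milkyway_nbody" is a hypothetical placeholder app name. -->
<app_config>
  <app>
    <app_name>milkyway_nbody</app_name>
    <gpu_versions>
      <gpu_usage>1.0</gpu_usage>   <!-- one task per GPU -->
      <cpu_usage>1.0</cpu_usage>   <!-- reserve a full CPU thread -->
    </gpu_versions>
  </app>
</app_config>
```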
Keith Myers
Volunteer tester
Help desk expert
Joined: 17 Nov 16
Posts: 863
United States
Message 91996 - Posted: 1 Jul 2019, 14:02:14 UTC - in response to Message 91993.  

It depends on the application whether it needs just a little bit of CPU to support the GPU task or a lot. The SoG app at Seti expects a full CPU core to support the GPU task; same for Einstein, but not for MW. It all depends on how the app developer wrote the application code. At Seti, for example, the older CUDA42 or CUDA50 applications used just a fraction of cpu_time compared to the run_time. Another example is the "special app" at Seti, which also uses very little cpu_time compared to run_time in its stock configuration. If you add the -nobs (no blocking sync) parameter to the cmdline in either the app_info or app_config, the application will use a full CPU thread to support the GPU task, the run_times will match the cpu_times, and it will shave 5-10 seconds off the crunching time.
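
(An app_config.xml way of passing that cmdline might look like this sketch; the app_name and plan_class here are hypothetical placeholders and must match the actual application version shown in the client's event log:)

```xml
<!-- Sketch: pass -nobs to a GPU app version via app_config.xml.
     app_name and plan_class are placeholders, not confirmed values. -->
<app_config>
  <app_version>
    <app_name>setiathome_v8</app_name>
    <plan_class>cuda90</plan_class>
    <cmdline>-nobs</cmdline>
    <avg_ncpus>1</avg_ncpus>
  </app_version>
</app_config>
```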
Richard Haselgrove
Volunteer tester
Help desk expert
Joined: 5 Oct 06
Posts: 5077
United Kingdom
Message 91997 - Posted: 1 Jul 2019, 14:17:22 UTC - in response to Message 91996.  

It also depends on how the programming language chosen has been implemented on the hardware platform. OpenCL on NVidia hardware usually needs a lot of CPU support: OpenCL on AMD or iGPU is less likely to (though it does vary according to the problem, the programmer, and the options chosen).
Bill
Joined: 13 Jun 17
Posts: 91
United States
Message 92000 - Posted: 1 Jul 2019, 18:11:19 UTC

Ah, okay. I thought it was a universal truth that CPU usage was less for GPU tasks.
Richard Haselgrove
Volunteer tester
Help desk expert
Joined: 5 Oct 06
Posts: 5077
United Kingdom
Message 92002 - Posted: 1 Jul 2019, 18:23:23 UTC - in response to Message 92000.  

It is.

But the next question is, 'by how much?' - 2% or 98%?
Bill
Joined: 13 Jun 17
Posts: 91
United States
Message 92027 - Posted: 2 Jul 2019, 12:07:04 UTC - in response to Message 92002.  

Well, I was just looking at my Seti tasks last night and I noticed that even those GPU tasks seem to be eating up a lot of CPU time (check it out here). The first Nvidia task has a run time of 907.68 s and a CPU time of 902 s; that is, CPU time is 99% of the wall time. Many other Nvidia tasks have roughly the same percentage. That doesn't seem right to me. I get the feeling I bumped a switch walking out the door and didn't realize it.
Richard Haselgrove
Volunteer tester
Help desk expert
Joined: 5 Oct 06
Posts: 5077
United Kingdom
Message 92028 - Posted: 2 Jul 2019, 12:16:55 UTC - in response to Message 92027.  

You'll find that the CPU is actually doing very little, despite appearing to be busy.

It's what's known as a spin-wait loop: the CPU is just asking 'Do you need me yet? Do you need me yet? Do you need me yet?' over and over again. It's one (very inefficient) way of ensuring that it's ready to leap into action at a microsecond's notice when the GPU needs to be told what to do next.

The CPU should be able to power-down the floating point unit, the SIMD units, and much else while that's going on. Saves a few watts.
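
(The spin-wait effect described above can be sketched in a few lines of Python. This is an illustrative analogy, not the actual GPU runtime code: the busy-polling wait accrues CPU time roughly equal to wall time, just like those Nvidia tasks, while a blocking wait accrues almost none.)

```python
import threading
import time

def spin_wait(event):
    # Busy-poll: "Do you need me yet?" over and over; burns a CPU core.
    while not event.is_set():
        pass

def blocking_wait(event):
    # Sleep in the OS until signalled; uses almost no CPU time.
    event.wait()

def measure(waiter, delay=0.5):
    # Simulate the GPU signalling completion after `delay` seconds.
    event = threading.Event()
    threading.Timer(delay, event.set).start()
    wall0, cpu0 = time.perf_counter(), time.process_time()
    waiter(event)
    return time.perf_counter() - wall0, time.process_time() - cpu0

if __name__ == "__main__":
    wall, cpu = measure(spin_wait)
    print(f"spin-wait:     wall {wall:.2f} s, cpu {cpu:.2f} s")  # cpu roughly equals wall
    wall, cpu = measure(blocking_wait)
    print(f"blocking wait: wall {wall:.2f} s, cpu {cpu:.2f} s")  # cpu near zero
```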
Bill
Joined: 13 Jun 17
Posts: 91
United States
Message 92036 - Posted: 2 Jul 2019, 13:39:38 UTC - in response to Message 92028.  

What you're saying makes sense, but I'm still confused. When I look at my AMD APU, the GPU tasks on that processor show a very small CPU time relative to wall time. Why is that different from the tasks my Quadro is processing? Is it just a difference in architecture? If it is, okay, I'll let it be. I have been under the assumption that if the wall time and CPU time for a GPU task are about the same, something is wrong with how I'm crunching.
Richard Haselgrove
Volunteer tester
Help desk expert
Joined: 5 Oct 06
Posts: 5077
United Kingdom
Message 92037 - Posted: 2 Jul 2019, 13:53:47 UTC - in response to Message 92036.  

It's a difference between the way NVidia and AMD implement OpenCL runtime support. Allegedly.

But on the other hand, other people say that:

a) NVidia implements OpenCL by calling the underlying CUDA layer
b) CUDA has all the most efficient ways of calling the CPU only when needed.

The implication would be that NVidia has deliberately crippled OpenCL by failing to implement the full set of inter-layer calling protocols. But I don't have enough first-hand information to say that.
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15477
Netherlands
Message 92042 - Posted: 2 Jul 2019, 15:33:42 UTC - in response to Message 92037.  

As far as I have always understood, one OpenCL application should be usable on any OpenCL-capable piece of hardware out there, in the same way that OpenGL can be used on any capable hardware without having to write a specific API for that piece of hardware.
Joseph Stateson
Volunteer tester
Joined: 27 Jun 08
Posts: 641
United States
Message 92043 - Posted: 2 Jul 2019, 15:53:53 UTC - in response to Message 92042.  

As far as I always understood, one OpenCL application should be able to be used on any OpenCL capable piece of hardware out there, just in the same way that OpenGL can be used on any capable hardware out there without having to write a specific API for that piece of hardware.


I was wondering about this myself. Why can't Einstein or Seti tasks run on whichever GPU is available? I am guessing it is only for verification purposes.

What about resuming from a checkpoint? If a task on a GTX 1080 Ti is resumed from its checkpoint, I assume it could be assigned to any other 1080 Ti, but what about other models, like the much weaker 1050?
Richard Haselgrove
Volunteer tester
Help desk expert
Joined: 5 Oct 06
Posts: 5077
United Kingdom
Message 92045 - Posted: 2 Jul 2019, 17:10:57 UTC

OK, let's pick that lot apart, considering:

Environment
Source code
Compiled binaries
Data

Why cant Einstein or Seti tasks run on whichever GPU is available?
That's an environment question: it was decided when SETI switched to the BOINC platform (or possibly later, when BOINC added GPU processing) that a 'task' would be assigned to a particular device, and the result would be expected back from that device. Note the word "assigned": any task is capable of being computed on any device - it just happens to have been sent to yours. The replication copy has been sent to a device on somebody else's computer. It could be the same type of device as yours, or it could be different.

I am guessing it is only for verification purpose
Yes, it's primarily an anti-cheating device. But a weak one, as the successful deployment of rebranding tools confirms. The only real constraint is that the result must be returned from a host with the same HostID as it was issued to.

What about resuming from a checkpoint?
Happens already. If you have multiple cards of the same type - NVidia GPUs, for example - there is no certainty that a task that was running on one card when you shut down, will resume on the same card on restart. The checkpoint file - being pure data - doesn't mind.

one OpenCL application should be able to be used on any OpenCL capable piece of hardware out there
That certainly applies at the source code level. I think opinion varies at the compiled binary level: some programmers find that minor changes have to be made during compilation, others don't. I believe that you can take Einstein binaries for different GPU types and compare them, and find them identical.

Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.