GPU tasks reset to 0% when they restart from idle time

BlueEyedCockroach

Joined: 26 Mar 20
Posts: 3
France
Message 97070 - Posted: 26 Mar 2020, 9:38:33 UTC
Last modified: 26 Mar 2020, 9:45:39 UTC

Hello, I am new to BOINC and there's something weird with my GPU tasks that I can't figure out.

I am using BOINC with Science United and default parameters, so GPU tasks run during idle time, which is never more than one consecutive hour in my case. The GPU tasks I get from GPUGRID can't be completed in one run, and they resume from 0% whenever they start again. So I never get to complete any, even though my GPU is being used, which seems like kind of a waste to me.

Checkpoints work fine with CPU tasks, so I was wondering why it's not working with GPU tasks as well?

BOINC 7.14.2 (x64) running on Windows 10.
So far I only got GPU tasks from GPUGRID.
Dave
Help desk expert

Joined: 28 Jun 10
Posts: 2518
United Kingdom
Message 97071 - Posted: 26 Mar 2020, 10:02:51 UTC - in response to Message 97070.  

Checkpoints work fine with CPU tasks, so I was wondering why it's not working with GPU tasks as well?


I don't have a card capable of running GPU tasks at the moment, but I'm pretty certain it is to do with the frequency of checkpointing. If you enable checkpoint_debug under Options > Event Log options, you will be able to see whether any checkpoints are reached. There is an option to leave non-GPU tasks in memory while suspended. I don't think there is an equivalent for GPU tasks, even by altering lines in config files, though I don't know for certain. (The main reason I tick it on my machines is that it reduces the frequency of crashes on my main project.) Recently there was a batch of work from CPDN that on slow machines could take 12 hours or more to reach a checkpoint, so machines that were turned off frequently would never reach a checkpoint, never mind finish.
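For anyone who prefers setting the flag in a file rather than through the Manager dialog: the same log flag can, as far as I know, be set in cc_config.xml in the BOINC data directory. The sketch below only builds and prints the XML fragment; the helper name build_cc_config is made up for illustration, and where the file lives on a given machine is not something this thread covers.

```python
import xml.etree.ElementTree as ET


def build_cc_config(checkpoint_debug=True):
    """Build a minimal cc_config.xml fragment that sets the checkpoint_debug log flag.

    Hypothetical helper for illustration; it returns the XML as a string
    and does not touch any file on disk.
    """
    root = ET.Element("cc_config")
    flags = ET.SubElement(root, "log_flags")
    ET.SubElement(flags, "checkpoint_debug").text = "1" if checkpoint_debug else "0"
    return ET.tostring(root, encoding="unicode")


# Print the fragment so it can be merged by hand into an existing cc_config.xml.
print(build_cc_config())
```

If a cc_config.xml already exists, merge the log_flags entry into it by hand rather than overwriting the file, then have the client re-read its config files (or restart it).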
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5077
United Kingdom
Message 97072 - Posted: 26 Mar 2020, 10:26:00 UTC - in response to Message 97070.  

GPUGrid tasks (specifically) do checkpoint, but not very often. The ones I've seen recently only checkpoint when they reach 10%, 20%, 30% and so on - irrespective of how long it's taken them to reach those points. And the tasks have varying run times, so it's hard to predict when the checkpoint is going to happen.

Before it reaches the first 10% progress (but after the first 60 seconds), BOINC invents a 'pseudo progress' indication just to reassure you that the task is still alive. The pseudo progress is a guesstimate, and can't be relied on to predict when the real 10% point will be reached.

If I were you, I'd try to let a GPU task run for as long as possible, and see how far it gets if you leave the machine switched on but idle. Then, decide if GPUGrid is right for your computer and your way of working. Their tasks tend to require a lot of computation, and are best suited to a modern, powerful GPU.

If you decide that you can't run them, you can always set 'no new tasks' locally via BOINC Manager - use 'Project commands' in simple view, or the projects tab in advanced view. I don't think Science United over-rides that, but I could be wrong :-(
BlueEyedCockroach

Joined: 26 Mar 20
Posts: 3
France
Message 97073 - Posted: 26 Mar 2020, 10:38:10 UTC - in response to Message 97071.  

I don't have a card capable of running GPU tasks at the moment, but I'm pretty certain it is to do with the frequency of checkpointing. If you enable checkpoint_debug under Options > Event Log options, you will be able to see whether any checkpoints are reached.


Thank you, I have enabled this option and will see what it says.

I think I got fooled by an error in the French translation of the UI. The option "Request tasks to checkpoint at most every N seconds" is translated as "Request tasks to checkpoint at least every N seconds", so I figured that checkpoints were managed by the BOINC software, but now I understand that they are managed by the project executables themselves (seems like an "old" software design... but ok :) ).
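To make the "at most every N seconds" wording concrete, here is a toy model in Python. It assumes the application can only save its state at its own safe points and that the preference merely sets a minimum gap between saves; as far as I understand, real project applications do this through the BOINC API calls boinc_time_to_checkpoint() and boinc_checkpoint_completed(). The function checkpoint_times and its inputs are invented for this sketch.

```python
def checkpoint_times(safe_points, min_interval):
    """Return the elapsed times (seconds) at which a checkpoint gets written.

    safe_points: increasing elapsed times at which the app is able to save state.
    min_interval: the preference value N; checkpoints are written no more often
                  than once every N seconds.
    """
    written = []
    last = 0.0
    for t in safe_points:
        # The app offers a checkpoint here; it is only taken if enough time
        # has passed since the previous one.
        if t - last >= min_interval:
            written.append(t)
            last = t
    return written


# An app with a safe point every 10 seconds is throttled to one save per minute.
print(checkpoint_times(range(10, 130, 10), 60))   # [60, 120]

# An app whose safe points are 2 hours apart is not helped by the setting at all.
print(checkpoint_times([7200, 14400], 60))        # [7200, 14400]
```

The second call is the GPUGRID situation: the setting can reduce how often checkpoints are written, but it cannot create a checkpoint where the application does not offer one.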

I will see how it goes, but I may eventually turn off GPU as you did if my graphics card isn't powerful enough for the proposed projects.
BlueEyedCockroach

Joined: 26 Mar 20
Posts: 3
France
Message 97075 - Posted: 26 Mar 2020, 12:53:58 UTC - in response to Message 97072.  

If I were you, I'd try to let a GPU task run for as long as possible, and see how far it gets if you leave the machine switched on but idle. Then, decide if GPUGrid is right for your computer and your way of working. Their tasks tend to require a lot of computation, and are best suited to a modern, powerful GPU.

So I let a GPUGRID task run and it reached the first checkpoint at 20% after 2 hours, and I very rarely leave my PC idle that long.
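A rough way to see why this never gets anywhere, as a Python sketch. It assumes checkpoints are evenly spaced (20% every 2 hours, taken from the single observation above, although run times reportedly vary) and that every idle session has the same length; the function name and parameters are invented for the example.

```python
import math


def saved_progress_after(sessions, session_hours, hours_per_checkpoint, step_percent):
    """Percent of the task safely checkpointed after a number of idle sessions.

    Assumes evenly spaced checkpoints and a restart from the last checkpoint
    at the start of every session.
    """
    saved = 0.0
    for _ in range(sessions):
        # Only whole checkpoint intervals completed within one session are kept;
        # any work done after the last checkpoint is lost on suspend.
        reached = math.floor(session_hours / hours_per_checkpoint)
        saved = min(100.0, saved + reached * step_percent)
    return saved


# Fifty idle sessions of 1 hour each, with a checkpoint every 2 hours:
print(saved_progress_after(50, 1.0, 2.0, 20.0))   # 0.0

# Five sessions of 2.5 hours each would be enough to finish:
print(saved_progress_after(5, 2.5, 2.0, 20.0))    # 100.0
```

Under these assumptions, the number of short sessions does not matter: as long as each one is shorter than the gap between checkpoints, the saved progress stays at 0%.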

If you decide that you can't run them, you can always set 'no new tasks' locally via BOINC Manager - use 'Project commands' in simple view, or the projects tab in advanced view. I don't think Science United over-rides that, but I could be wrong :-(

I tried to play with the 'no new tasks' button, but Science United seems to override this choice whenever it synchronizes, and it keeps pushing tasks into my list that I will never complete :(
I guess I'll have to disable Science United to opt out of GPUGRID. That's sad, I liked the general idea.
ProDigit

Joined: 8 Nov 19
Posts: 718
United States
Message 97083 - Posted: 26 Mar 2020, 17:07:59 UTC - in response to Message 97075.  

Did you check in your account on their website whether there is an option to disable long tasks? Some longer tasks are better suited to high-power GPUs. I have them, and some tasks on prime still take well over 10 hours to do.
But on that project, you can deselect the long tasks. Not sure what options your project offers, as I don't currently run it.
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5077
United Kingdom
Message 97085 - Posted: 26 Mar 2020, 17:14:29 UTC - in response to Message 97083.  

There haven't been any short tasks for years, and the recent long tasks were a mistake and issued for a licence-expired app - they all failed. There is only one - new - app at the moment. No choice.

Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.