GPU starved and client responds "don't need"

TRuEQ & TuVaLu
Joined: 23 May 11
Posts: 108
Sweden
Message 47796 - Posted: 17 Feb 2013, 11:24:48 UTC - in response to Message 47784.  


Finally

2013-02-17 08:40:45 | SETI@home | Not requesting tasks: some download is stalled

Which is the reason you're 150 jobs short of what you expect there to be. Abort the stalled download, or, in the Tasks tab, abort the relevant task that's stuck.

That's all I see of note, and not being an expert, maybe there's more.



Thank you very much for your feedback here. I hope someone can fix this "don't need" thing.

And let the client keep requesting new work even if SETI transfers are stalled, which they will be until more bandwidth is allocated to the servers.
But that has been requested for several years now and probably will be for a long time to come.

Fix these two things and multi-GPU crunchers will get along better.


ID: 47796
Claggy

Joined: 23 Apr 07
Posts: 1112
United Kingdom
Message 47797 - Posted: 17 Feb 2013, 11:26:32 UTC - in response to Message 47794.  
Last modified: 17 Feb 2013, 11:26:55 UTC

Please get rid of the "don't need" option, if it's not there for testing/debugging anymore.


The "don't need" is an informational message, not an option. If BOINC feels it either has enough work in total, or has done enough work for a project (for its resource share), then it doesn't need any more.

BOINC used to have no informational messages about why it wasn't asking for work, so you couldn't tell why; now it does.
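
If you want the full detail of why the client isn't asking for work, the work-fetch log flags can be switched on in cc_config.xml; a minimal sketch (the file goes in the BOINC data directory, and the client re-reads it from the Manager's options):

<cc_config>
   <log_flags>
      <!-- log every work-fetch decision the client makes -->
      <work_fetch_debug>1</work_fetch_debug>
      <!-- log details of each scheduler request and reply -->
      <sched_op_debug>1</sched_op_debug>
   </log_flags>
</cc_config>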

Claggy
ID: 47797
TRuEQ & TuVaLu
Joined: 23 May 11
Posts: 108
Sweden
Message 47798 - Posted: 17 Feb 2013, 11:28:26 UTC - in response to Message 47787.  

You're right on that, Claggy. The config is interesting in that TRuEQ & TuVaLu has set 12 concurrent downloads from any project, and a bunch of us 'assumed' [which is bad] that the download was altogether borked permanently... not recoverable.


I wrote in another thread that only one download should be allowed per computer.
Then everyone would have more download "pipes" to the server.

When installing and investigating the BM 7.0.4+ options I saw this multi-transfer option and tested it.
So far no problem with it.
I get the same stalled SETI transfers and the same server backoffs as before.

The problem will come when a lot of people discover this multi option: it will need more download "pipes" from the server, which it doesn't have.
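
For reference, the transfer-concurrency knobs live in cc_config.xml; a sketch (I believe 8 in total and 2 per project are the defaults):

<cc_config>
   <options>
      <!-- maximum simultaneous file transfers overall -->
      <max_file_xfers>8</max_file_xfers>
      <!-- maximum simultaneous transfers per project -->
      <max_file_xfers_per_project>2</max_file_xfers_per_project>
   </options>
</cc_config>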
ID: 47798
TRuEQ & TuVaLu
Joined: 23 May 11
Posts: 108
Sweden
Message 47799 - Posted: 17 Feb 2013, 11:32:13 UTC - in response to Message 47788.  

Not sure about the wisdom of controlling operations with both an app_config.xml and an app_info.xml. The app_config is a replacement control file for app_info, so maybe there's a conflict between these two.

They work fine together. But if I'm using both, I'd put everything in app_info that is defined there, and only spill over into app_config for the tags which can't be defined anywhere else - basically, just max_concurrent at this stage.
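
For reference, the minimal app_config.xml described there might look like this; a sketch, where "astropulse_v6" stands in for whatever the project's short application name really is:

<app_config>
   <app>
      <!-- must match the project's short app name -->
      <name>astropulse_v6</name>
      <!-- run at most 2 tasks of this app at once -->
      <max_concurrent>2</max_concurrent>
   </app>
</app_config>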


When I have work it runs fine.
The GPUs get loaded as planned, no problems.

The problem, as with BM 7.0.x up to .24 (I think), is getting work from the servers properly.

With these two behaviours, "don't need" and no downloading while transfers are stalled, it seems like being back at the start... (almost).

That is why I created this thread.
ID: 47799
TRuEQ & TuVaLu
Joined: 23 May 11
Posts: 108
Sweden
Message 47800 - Posted: 17 Feb 2013, 11:34:27 UTC - in response to Message 47790.  

Regarding the thread title:

"don't need" is not a server response. It is a client-generated information message, explaining why the client didn't request new work: as is clear from the context

2013-02-17 08:36:04 | Moo! Wrapper | [sched_op] Starting scheduler request
2013-02-17 08:36:04 | Moo! Wrapper | [work_fetch] request: CPU (0.00 sec, 0.00 inst) ATI (0.00 sec, 0.00 inst)
2013-02-17 08:36:04 | Moo! Wrapper | Sending scheduler request: Requested by user.
2013-02-17 08:36:04 | Moo! Wrapper | Not requesting tasks: don't need
2013-02-17 08:36:04 | Moo! Wrapper | [sched_op] CPU work request: 0.00 seconds; 0.00 devices
2013-02-17 08:36:04 | Moo! Wrapper | [sched_op] ATI work request: 0.00 seconds; 0.00 devices


If the OP could amend the thread title, and focus on the real issue, there might be something there we can look into: I'm not quite sure why no work would be needed, from that mess of a log file.


If you need any other logging options set in cc_config.xml, just let me know.
ID: 47800
TRuEQ & TuVaLu
Joined: 23 May 11
Posts: 108
Sweden
Message 47801 - Posted: 17 Feb 2013, 11:36:05 UTC - in response to Message 47790.  

Regarding the thread title:

"don't need" is not a server response. It is a client-generated information message, explaining why the client didn't request new work: as is clear from the context

2013-02-17 08:36:04 | Moo! Wrapper | [sched_op] Starting scheduler request
2013-02-17 08:36:04 | Moo! Wrapper | [work_fetch] request: CPU (0.00 sec, 0.00 inst) ATI (0.00 sec, 0.00 inst)
2013-02-17 08:36:04 | Moo! Wrapper | Sending scheduler request: Requested by user.
2013-02-17 08:36:04 | Moo! Wrapper | Not requesting tasks: don't need
2013-02-17 08:36:04 | Moo! Wrapper | [sched_op] CPU work request: 0.00 seconds; 0.00 devices
2013-02-17 08:36:04 | Moo! Wrapper | [sched_op] ATI work request: 0.00 seconds; 0.00 devices


If the OP could amend the thread title, and focus on the real issue, there might be something there we can look into: I'm not quite sure why no work would be needed, from that mess of a log file.


Is the OP me or someone else?
Can I change the title? How, and what should it say?
ID: 47801
TRuEQ & TuVaLu
Joined: 23 May 11
Posts: 108
Sweden
Message 47802 - Posted: 17 Feb 2013, 11:41:34 UTC - in response to Message 47797.  

Please get rid of the "don't need" option, if it's not there for testing/debugging anymore.


The "don't need" is an informational message, not an option. If BOINC feels it either has enough work in total, or has done enough work for a project (for its resource share), then it doesn't need any more.

BOINC used to have no informational messages about why it wasn't asking for work, so you couldn't tell why; now it does.

Claggy


Well, the "new" work cache setting I have 3days + 1day of work.
But I only have like maybe 1-2 days of work and the clients response is don't need".
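
Those cache settings correspond to these tags in global_prefs_override.xml; a sketch, assuming the preferences are set locally rather than on a project website:

<global_prefs_override>
   <!-- "Store at least X days of work" -->
   <work_buf_min_days>3.0</work_buf_min_days>
   <!-- "Store up to an additional X days of work" -->
   <work_buf_additional_days>1.0</work_buf_additional_days>
</global_prefs_override>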

This "don't need" must be new in the 7.0.4+ versions.

I ran 7.0.28 and all of my options, work fetch and queue worked.
But I wanted to run more than one cal_ati beta AP task, so I added an app_config.xml to my SETI Beta folder and upgraded to BM 7.0.4+.

Now I see how I can fix this for me.
When the SETI Beta tasks are finished I will downgrade to the working 7.0.28 and continue as before.
That will solve this.

Thank you for your feedback.





ID: 47802
TRuEQ & TuVaLu
Joined: 23 May 11
Posts: 108
Sweden
Message 47803 - Posted: 17 Feb 2013, 11:43:46 UTC
Last modified: 17 Feb 2013, 11:44:22 UTC

Where do I find a moderator here to change the thread name to "GPU starved and client gives the answer 'don't need'"?

I made this post only to click the red cross and be able to contact a moderator.
ID: 47803
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5077
United Kingdom
Message 47805 - Posted: 17 Feb 2013, 11:49:09 UTC - in response to Message 47803.  

Where do I find a moderator here to change the thread name to "GPU starved and client gives the answer 'don't need'"?

I made this post only to click the red cross and be able to contact a moderator.

You can change it yourself by editing that last post.
ID: 47805
TRuEQ & TuVaLu
Joined: 23 May 11
Posts: 108
Sweden
Message 47806 - Posted: 17 Feb 2013, 12:16:53 UTC

A moderator helped me.
ID: 47806
TRuEQ & TuVaLu
Joined: 23 May 11
Posts: 108
Sweden
Message 47807 - Posted: 17 Feb 2013, 12:18:09 UTC - in response to Message 47805.  

Where do I find a moderator here to change the thread name to "GPU starved and client gives the answer 'don't need'"?

I made this post only to click the red cross and be able to contact a moderator.

You can change it yourself by editing that last post.


Now I see it.
I wonder why I never noticed that option when editing before.

:)
ID: 47807
TRuEQ & TuVaLu
Joined: 23 May 11
Posts: 108
Sweden
Message 47822 - Posted: 17 Feb 2013, 17:22:43 UTC

I removed the app_config.xml as suggested.
So now I am running only one SETI Beta task on GPU 0,
one Moo! Wrapper task on the 3 GPUs (multi=1, i.e. use all GPUs for that task),
and as many SETI APs as I am able to download on GPUs 0 and 2.

I clicked Update on SETI with one stalled download.
It says: transfers stalled = no new tasks.

I clicked Update on Moo! Wrapper, which has one task in the queue.

The queue is set to 3+1 days.

2013-02-17 18:16:09 | Moo! Wrapper | update requested by user
2013-02-17 18:16:10 | Moo! Wrapper | Sending scheduler request: Requested by user.
2013-02-17 18:16:10 | Moo! Wrapper | Not requesting tasks: project is not highest priority
2013-02-17 18:16:13 | Moo! Wrapper | Scheduler request completed

A new message for me: "project is not highest priority"...

Any more ideas before I downgrade to 7.0.28 again?
ID: 47822
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5077
United Kingdom
Message 47823 - Posted: 17 Feb 2013, 18:13:33 UTC - in response to Message 47822.  

Any more ideas before I downgrade to 7.0.28 again?

Learn to read your own logs. Here's a single work fetch cycle from lower down, and its outcome:

2013-02-17 08:45:38 | | [work_fetch] work fetch start
2013-02-17 08:45:38 | | [work_fetch] ATI: buffer_low: yes; sim_excluded_instances 0
2013-02-17 08:45:38 | | [work_fetch] set_request(): ninst 3 nused_total 1.000000 nidle_now 0.000000 fetch share 1.000000 req_inst 0.000000
2013-02-17 08:45:38 | | [work_fetch] ------- start work fetch state -------
2013-02-17 08:45:38 | | [work_fetch] target work buffer: 259200.00 + 43200.00 sec
2013-02-17 08:45:38 | | [work_fetch] --- project states ---
2013-02-17 08:45:38 | Moo! Wrapper | [work_fetch] REC 69099.529 prio -1.000982 can req work
2013-02-17 08:45:38 | OProject@Home | [work_fetch] REC 82.896 prio -0.000236 can't req work: scheduler RPC backoff (backoff: 515.56 sec)
2013-02-17 08:45:38 | SETI@home | [work_fetch] REC 59664.099 prio -0.778463 can't req work: scheduler RPC backoff (backoff: 45.02 sec)
2013-02-17 08:45:38 | SETI@home Beta Test | [work_fetch] REC 46196.671 prio -4.096825 can't req work: "no new tasks" requested via Manager
2013-02-17 08:45:38 | WUProp@Home | [work_fetch] REC 0.002 prio -0.000009 can't req work: non CPU intensive
2013-02-17 08:45:38 | FreeHAL@home | [work_fetch] REC 0.020 prio 0.000000 can't req work: non CPU intensive
2013-02-17 08:45:38 | | [work_fetch] --- state for CPU ---
2013-02-17 08:45:38 | | [work_fetch] shortfall 1209096.29 nidle 3.00 saturated 0.00 busy 0.00
2013-02-17 08:45:38 | Moo! Wrapper | [work_fetch] fetch share 0.000 (blocked by prefs) (no apps)
2013-02-17 08:45:38 | OProject@Home | [work_fetch] fetch share 0.000
2013-02-17 08:45:38 | SETI@home | [work_fetch] fetch share 0.000 (blocked by prefs) (no apps)
2013-02-17 08:45:38 | SETI@home Beta Test | [work_fetch] fetch share 0.000 (blocked by prefs)
2013-02-17 08:45:38 | | [work_fetch] --- state for ATI ---
2013-02-17 08:45:38 | | [work_fetch] shortfall 842913.07 nidle 0.00 saturated 6016.78 busy 0.00
2013-02-17 08:45:38 | Moo! Wrapper | [work_fetch] fetch share 1.000
2013-02-17 08:45:38 | OProject@Home | [work_fetch] fetch share 0.000 (no apps)
2013-02-17 08:45:38 | SETI@home | [work_fetch] fetch share 0.000
2013-02-17 08:45:38 | SETI@home Beta Test | [work_fetch] fetch share 0.000
2013-02-17 08:45:38 | | [work_fetch] ------- end work fetch state -------
2013-02-17 08:45:38 | Moo! Wrapper | [sched_op] Starting scheduler request
2013-02-17 08:45:38 | Moo! Wrapper | [work_fetch] request: CPU (0.00 sec, 0.00 inst) ATI (842913.07 sec, 0.00 inst)
2013-02-17 08:45:38 | Moo! Wrapper | Sending scheduler request: To report completed tasks.
2013-02-17 08:45:38 | Moo! Wrapper | Reporting 1 completed tasks
2013-02-17 08:45:38 | Moo! Wrapper | Requesting new tasks for ATI
2013-02-17 08:45:38 | Moo! Wrapper | [sched_op] CPU work request: 0.00 seconds; 0.00 devices
2013-02-17 08:45:38 | Moo! Wrapper | [sched_op] ATI work request: 842913.07 seconds; 0.00 devices
2013-02-17 08:45:43 | Moo! Wrapper | Scheduler request completed: got 10 new tasks

At that time, OProject had the highest priority (note that all numbers are negative); SETI@home had middle priority; and Moo! Wrapper had the lowest.

Both OP and SETI were being delayed before attempting to pester their servers again ('RPC backoff' - Remote Procedure Call). Moo was fetchable - work was requested and allocated. That's how it works.

If you try to bypass normal scheduling by clicking the update button, BOINC will only request work from the highest priority project. We have had bugs with that: it should be the highest priority fetchable project, and I think it's fixed now in v7.0.52. You would have to match up the work fetch cycle in the log with the time you clicked the button. Please do that in the peace and comfort of your own home: we don't need a new log snippet for every twist and turn in your search.
ID: 47823
TRuEQ & TuVaLu
Joined: 23 May 11
Posts: 108
Sweden
Message 47824 - Posted: 17 Feb 2013, 19:15:17 UTC - in response to Message 47823.  

Any more ideas before I downgrade to 7.0.28 again?

Learn to read your own logs. Here's a single work fetch cycle from lower down, and its outcome:

2013-02-17 08:45:38 | | [work_fetch] work fetch start
2013-02-17 08:45:38 | | [work_fetch] ATI: buffer_low: yes; sim_excluded_instances 0
2013-02-17 08:45:38 | | [work_fetch] set_request(): ninst 3 nused_total 1.000000 nidle_now 0.000000 fetch share 1.000000 req_inst 0.000000
2013-02-17 08:45:38 | | [work_fetch] ------- start work fetch state -------
2013-02-17 08:45:38 | | [work_fetch] target work buffer: 259200.00 + 43200.00 sec
2013-02-17 08:45:38 | | [work_fetch] --- project states ---
2013-02-17 08:45:38 | Moo! Wrapper | [work_fetch] REC 69099.529 prio -1.000982 can req work
2013-02-17 08:45:38 | OProject@Home | [work_fetch] REC 82.896 prio -0.000236 can't req work: scheduler RPC backoff (backoff: 515.56 sec)
2013-02-17 08:45:38 | SETI@home | [work_fetch] REC 59664.099 prio -0.778463 can't req work: scheduler RPC backoff (backoff: 45.02 sec)
2013-02-17 08:45:38 | SETI@home Beta Test | [work_fetch] REC 46196.671 prio -4.096825 can't req work: "no new tasks" requested via Manager
2013-02-17 08:45:38 | WUProp@Home | [work_fetch] REC 0.002 prio -0.000009 can't req work: non CPU intensive
2013-02-17 08:45:38 | FreeHAL@home | [work_fetch] REC 0.020 prio 0.000000 can't req work: non CPU intensive
2013-02-17 08:45:38 | | [work_fetch] --- state for CPU ---
2013-02-17 08:45:38 | | [work_fetch] shortfall 1209096.29 nidle 3.00 saturated 0.00 busy 0.00
2013-02-17 08:45:38 | Moo! Wrapper | [work_fetch] fetch share 0.000 (blocked by prefs) (no apps)
2013-02-17 08:45:38 | OProject@Home | [work_fetch] fetch share 0.000
2013-02-17 08:45:38 | SETI@home | [work_fetch] fetch share 0.000 (blocked by prefs) (no apps)
2013-02-17 08:45:38 | SETI@home Beta Test | [work_fetch] fetch share 0.000 (blocked by prefs)
2013-02-17 08:45:38 | | [work_fetch] --- state for ATI ---
2013-02-17 08:45:38 | | [work_fetch] shortfall 842913.07 nidle 0.00 saturated 6016.78 busy 0.00
2013-02-17 08:45:38 | Moo! Wrapper | [work_fetch] fetch share 1.000
2013-02-17 08:45:38 | OProject@Home | [work_fetch] fetch share 0.000 (no apps)
2013-02-17 08:45:38 | SETI@home | [work_fetch] fetch share 0.000
2013-02-17 08:45:38 | SETI@home Beta Test | [work_fetch] fetch share 0.000
2013-02-17 08:45:38 | | [work_fetch] ------- end work fetch state -------
2013-02-17 08:45:38 | Moo! Wrapper | [sched_op] Starting scheduler request
2013-02-17 08:45:38 | Moo! Wrapper | [work_fetch] request: CPU (0.00 sec, 0.00 inst) ATI (842913.07 sec, 0.00 inst)
2013-02-17 08:45:38 | Moo! Wrapper | Sending scheduler request: To report completed tasks.
2013-02-17 08:45:38 | Moo! Wrapper | Reporting 1 completed tasks
2013-02-17 08:45:38 | Moo! Wrapper | Requesting new tasks for ATI
2013-02-17 08:45:38 | Moo! Wrapper | [sched_op] CPU work request: 0.00 seconds; 0.00 devices
2013-02-17 08:45:38 | Moo! Wrapper | [sched_op] ATI work request: 842913.07 seconds; 0.00 devices
2013-02-17 08:45:43 | Moo! Wrapper | Scheduler request completed: got 10 new tasks

At that time, OProject had the highest priority (note that all numbers are negative); SETI@home had middle priority; and Moo! Wrapper had the lowest.

Both OP and SETI were being delayed before attempting to pester their servers again ('RPC backoff' - Remote Procedure Call). Moo was fetchable - work was requested and allocated. That's how it works.

If you try to bypass normal scheduling by clicking the update button, BOINC will only request work from the highest priority project. We have had bugs with that: it should be the highest priority fetchable project, and I think it's fixed now in v7.0.52. You would have to match up the work fetch cycle in the log with the time you clicked the button. Please do that in the peace and comfort of your own home: we don't need a new log snippet for every twist and turn in your search.


But even if OProject is the highest priority to fetch work...
Weren't CPU projects and GPU projects treated differently in the work-fetch cycle?
And OProject ALX is an NCI project, and those were also treated differently from ordinary CPU projects.
But that was in the early days of BM 7.x.x.


And where can I read about the RPC process in the work-fetch cycle?

I was mostly curious why the "don't need" answer came when the work cache was far from full,
and why SETI wasn't able to request more tasks while there were stalled transfers in the queue, which there are most of the day.

I can't write C++ code, so I can't do it better than the devs.

I hope I didn't bother too much.

//Me out
ID: 47824
TRuEQ & TuVaLu
Joined: 23 May 11
Posts: 108
Sweden
Message 47837 - Posted: 18 Feb 2013, 16:29:53 UTC
Last modified: 18 Feb 2013, 16:32:30 UTC

I suspended a couple of SETI tasks to make room for one SETI Beta task.
And see this...

2013-02-18 17:25:28 | SETI@home | Sending scheduler request: To report completed tasks.
2013-02-18 17:25:28 | SETI@home | Reporting 1 completed tasks
2013-02-18 17:25:28 | SETI@home | Not requesting tasks: some task is suspended via Manager
2013-02-18 17:25:34 | SETI@home | Scheduler request completed


But I still want to download the full 100-task limit into my queue.

I just resumed the tasks that were suspended, so the problem is solved.

I am still evaluating this .52 version...
I am about to run some Einstein soon.
After that I will run WCG with an app_config.xml that runs two tasks on each GPU, with a CPU setting of 0.45 and a GPU setting of 0.5 (see the sketch below).
I will run the projects one by one, since it feels more stable that way.
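
A sketch of that app_config.xml; "hcc1" is my guess at the short name of WCG's GPU app (Help Conquer Cancer), so check the project's app list for the real one:

<app_config>
   <app>
      <name>hcc1</name>
      <gpu_versions>
         <!-- 0.5 of a GPU per task = 2 tasks per GPU -->
         <gpu_usage>0.5</gpu_usage>
         <!-- CPU fraction reserved per GPU task -->
         <cpu_usage>0.45</cpu_usage>
      </gpu_versions>
   </app>
</app_config>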
ID: 47837