Cannot Get Work - Climate Prediction

Bryn Mawr
Help desk expert

Joined: 31 Dec 18
Posts: 285
United Kingdom
Message 102233 - Posted: 18 Dec 2020, 15:50:54 UTC
Last modified: 18 Dec 2020, 15:51:50 UTC

I have a long-standing problem with climateprediction.net whereby I cannot get the server to send me new work. The sched_op_debug log flag shows that I am requesting n seconds and m units of work, but no work (and no error message) is returned. All other projects are working fine.

My machines are :-

Ubuntu 20.04, BOINC 7.16.14, Ryzen 5 2600, no GPU. This machine earned 850,000 credits with CP before it stopped getting work; there was no obvious change to trigger the problem.

Ubuntu 20.04, BOINC 7.16.6, Ryzen 5 3600, Nvidia GT 710 1 GB GPU. This machine was added after the problem started and has never managed to get work.

I have sent many logs to the CP forums and they cannot see any problems with the work fetch. I can attach some here if you think it would help.

I have detached, rebooted, reattached several times, with and without setting nnt for all other projects and/or increasing the buffer size to 10+10 before reattaching. I have not set nnt and then run the existing buffers down to zero as I am loath to do that and waste good processing time.

I have set up a new user within CP and attached to that, no joy.

So, to my question. I have spent a short while looking through the server-side scheduler code on GitHub, and it seems to me that there are two conditions where it blocks the sending of work and logs the fact, but does not appear to return an error message to the user.

During a work request it sets a lock file on the host id. During the next work request it finds the lock file still exists so exits.

It receives an unrecognised code sign key.


Now, obviously, I cannot check the server for an uncleared lock file, but is there any way I can change my host ID, and is there any way I can resync my code sign key?
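For anyone following along, the first condition can be pictured with a small sketch. This is not the BOINC scheduler's code: the file naming (host_<id>.lock), the function names and the return strings are all made up for illustration, and the real scheduler may well use a different locking mechanism. It only shows why a lock file left behind by one request would make every later request for the same host ID exit early, without the client being told why.

```python
import os


def lock_path(lock_dir: str, host_id: int) -> str:
    # Hypothetical naming scheme: one lock file per host ID.
    return os.path.join(lock_dir, f"host_{host_id}.lock")


def try_acquire(lock_dir: str, host_id: int) -> bool:
    """Create the lock file atomically; return False if it already exists."""
    flags = os.O_CREAT | os.O_EXCL | os.O_WRONLY
    try:
        fd = os.open(lock_path(lock_dir, host_id), flags)
    except FileExistsError:
        return False
    os.close(fd)
    return True


def release(lock_dir: str, host_id: int) -> None:
    """Remove the lock file. A request that dies before this leaves a stale lock."""
    os.remove(lock_path(lock_dir, host_id))


def handle_request(lock_dir: str, host_id: int) -> str:
    """Handle one work request for a host, skipping it if the lock is held."""
    if not try_acquire(lock_dir, host_id):
        # In this sketch the refusal is only visible on the server side.
        return "skipped: lock held"
    try:
        return "work considered"
    finally:
        release(lock_dir, host_id)
```

If something like this were the cause, a new host ID would sidestep the stale file, which is why I am asking about changing it.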
ID: 102233
Keith Myers
Volunteer tester
Help desk expert

Joined: 17 Nov 16
Posts: 868
United States
Message 102235 - Posted: 18 Dec 2020, 19:52:16 UTC - in response to Message 102233.  

It receives an unrecognised code sign key.

Not familiar with Climate Prediction.

Does it sign its applications?

I assume you are not talking about SSL certs or something as I think you have implied you can contact the project and get a reply from the scheduler. You are just not getting work

Do you have this cc_config.xml parameter set in the Proxy Info section?

<unsigned_apps_ok>0</unsigned_apps_ok>

Does CP use this parameter and do you have it set wrong?
ID: 102235
Dave
Help desk expert

Joined: 28 Jun 10
Posts: 2534
United Kingdom
Message 102237 - Posted: 18 Dec 2020, 20:23:07 UTC - in response to Message 102235.  

Does CP use this parameter and do you have it set wrong?


Having followed and been part of the discussion over on the CPDN boards I am pretty certain that this is with a default installation of BOINC. Attach to project and request work. No fiddling with anything apart from resource shares and suspending other projects etc. to try and kick things into action.
ID: 102237
Les Bayliss
Help desk expert

Joined: 25 Nov 05
Posts: 1654
Australia
Message 102238 - Posted: 18 Dec 2020, 21:03:46 UTC

Finally found where I posted about this: Not downloading tasks

Sat 11 Jan 2020 22:54:25 AEDT | climateprediction.net | No tasks sent
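If you want to pull every scheduler message for one project out of a long log, something along these lines might help. It is only a sketch: it assumes each log line has the "time | project | message" shape of the line above, and the names EventLine, parse_event_line and project_messages are my own, not part of BOINC.

```python
from typing import Iterable, Iterator, NamedTuple, Optional


class EventLine(NamedTuple):
    timestamp: str
    project: str
    message: str


def parse_event_line(line: str) -> Optional[EventLine]:
    """Split a 'time | project | message' line; return None if it does not fit."""
    # maxsplit=2 keeps any further '|' characters inside the message text.
    parts = [part.strip() for part in line.split("|", 2)]
    if len(parts) != 3:
        return None
    return EventLine(*parts)


def project_messages(lines: Iterable[str], project: str) -> Iterator[EventLine]:
    """Yield the parsed lines that belong to the given project name."""
    for line in lines:
        event = parse_event_line(line)
        if event is not None and event.project == project:
            yield event
```

Feeding it the lines of a saved log with the project name "climateprediction.net" would leave just the requests and replies for CPDN.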
ID: 102238
Bryn Mawr
Help desk expert

Joined: 31 Dec 18
Posts: 285
United Kingdom
Message 102239 - Posted: 18 Dec 2020, 21:34:31 UTC - in response to Message 102235.  

It receives an unrecognised code sign key.

Not familiar with Climate Prediction.

Does it sign its applications?

I assume you are not talking about SSL certs or something as I think you have implied you can contact the project and get a reply from the scheduler. You are just not getting work

Do you have this cc_config.xml parameter set in the Proxy Info section?

<unsigned_apps_ok>0</unsigned_apps_ok>

Does CP use this parameter and do you have it set wrong?


I’m guessing it’s a mechanism Boinc uses to ensure that the applications have not been tampered with. I’m sure it’s not specific to CP.

The parameter is one I’ve not seen before, I’ll check on the setting in the morning and report back.
ID: 102239
Bryn Mawr
Help desk expert

Joined: 31 Dec 18
Posts: 285
United Kingdom
Message 102240 - Posted: 18 Dec 2020, 21:59:46 UTC - in response to Message 102238.  

Finally found where I posted about this: Not downloading tasks

Sat 11 Jan 2020 22:54:25 AEDT | climateprediction.net | No tasks sent


So all I need is a convenient thunderstorm? :-)

I’d guess it’s not a hard bug in any of the newer versions or they’d be swamped with error reports, and I’m fairly certain the client is actually making the request, which is why I looked at the server side: either for a bug there or for a mismatch between the two sides.
ID: 102240
Jim1348

Joined: 8 Nov 10
Posts: 310
United States
Message 102241 - Posted: 18 Dec 2020, 23:15:25 UTC - in response to Message 102233.  

I have detached, rebooted, reattached several times, with and without setting nnt for all other projects and/or increasing the buffer size to 10+10 before reattaching. I have not set nnt and then run the existing buffers down to zero as I am loath to do that and waste good processing time.

The only problem I have had with the Linux version (other than the 32-bit libraries) is the buffer being too short. It depends on what other projects you are running too. For example, Rosetta tends to be rather long also. Often the default 0.1 + 0.5 day buffer is not enough.

The world won't end (just yet) if you let it run dry, and set it to 0.5 + 1.0 days. That should work.
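To put rough numbers on the buffer settings, here is a deliberately simplified sketch. It assumes the client wants about (minimum + additional) days of work per usable core and asks for whatever is missing; the real work fetch logic also weighs resource shares, deadlines and more, so treat the function names and the formula as illustrative only.

```python
SECONDS_PER_DAY = 86400


def buffer_seconds(min_days: float, extra_days: float, ncpus: int) -> float:
    """Total work the buffer would hold, in CPU seconds (simplified model)."""
    return (min_days + extra_days) * SECONDS_PER_DAY * ncpus


def shortfall_seconds(min_days: float, extra_days: float, ncpus: int,
                      queued_seconds: float) -> float:
    """Work still missing from the buffer; zero when the queue already fills it."""
    wanted = buffer_seconds(min_days, extra_days, ncpus)
    return max(0.0, wanted - queued_seconds)
```

With 0.5 + 1.0 days on four cores, buffer_seconds(0.5, 1.0, 4) comes to 518,400 seconds. If other projects already have more than that queued, this model leaves no shortfall to request from CPDN.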
ID: 102241
Bryn Mawr
Help desk expert

Joined: 31 Dec 18
Posts: 285
United Kingdom
Message 102242 - Posted: 18 Dec 2020, 23:31:55 UTC - in response to Message 102241.  

I have detached, rebooted, reattached several times, with and without setting nnt for all other projects and/or increasing the buffer size to 10+10 before reattaching. I have not set nnt and then run the existing buffers down to zero as I am loath to do that and waste good processing time.

The only problem I have had with the Linux version (other than the 32-bit libraries) is the buffer being too short. It depends on what other projects you are running too. For example, Rosetta tends to be rather long also. Often the default 0.1 + 0.5 day buffer is not enough.

The world won't end (just yet) if you let it run dry, and set it to 0.5 + 1.0 days. That should work.


My default is 0.1 + 0.1 and I assume that setting nnt then changing the buffer to 10 + 10 before requesting work would have the same effect without requiring me to run down the existing WUs first?
ID: 102242
Les Bayliss
Help desk expert

Joined: 25 Nov 05
Posts: 1654
Australia
Message 102245 - Posted: 19 Dec 2020, 0:24:26 UTC - in response to Message 102240.  

It stopped again, and I gave up on it.
ID: 102245
Bryn Mawr
Help desk expert

Joined: 31 Dec 18
Posts: 285
United Kingdom
Message 102246 - Posted: 19 Dec 2020, 0:51:27 UTC - in response to Message 102245.  

It stopped again, and I gave up on it.


It’s like a loose tooth - I keep on going back to it!
ID: 102246
Bryn Mawr
Help desk expert

Joined: 31 Dec 18
Posts: 285
United Kingdom
Message 102253 - Posted: 19 Dec 2020, 14:09:31 UTC - in response to Message 102239.  

It receives an unrecognised code sign key.

Not familiar with Climate Prediction.

Does it sign its applications?

I assume you are not talking about SSL certs or something as I think you have implied you can contact the project and get a reply from the scheduler. You are just not getting work

Do you have this cc_config.xml parameter set in the Proxy Info section?

<unsigned_apps_ok>0</unsigned_apps_ok>

Does CP use this parameter and do you have it set wrong?


I’m guessing it’s a mechanism Boinc uses to ensure that the applications have not been tampered with. I’m sure it’s not specific to CP.

The parameter is one I’ve not seen before, I’ll check on the setting in the morning and report back.


I checked the parameter on both machines and it is unset (0).

I did notice, however, that no_alt_platform was still set from the days when I was getting 32-bit apps from ?WCG? that were crashing. I’ve reset it just in case.
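In case it saves anyone opening the file by hand on each machine, a check like this could report both settings at once. It is a sketch under assumptions: the function name read_flags is mine, it takes the XML text rather than guessing where cc_config.xml lives on a given install, and it looks for the element anywhere in the file rather than assuming a particular section.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Iterable, Optional


def read_flags(xml_text: str, names: Iterable[str]) -> Dict[str, Optional[str]]:
    """Return {name: text} for each option found in the XML, or None if absent."""
    root = ET.fromstring(xml_text)
    result: Dict[str, Optional[str]] = {}
    for name in names:
        # './/name' searches all descendants, whatever section they sit in.
        element = root.find(f".//{name}")
        if element is not None and element.text:
            result[name] = element.text.strip()
        else:
            result[name] = None
    return result
```

Calling read_flags(text, ["unsigned_apps_ok", "no_alt_platform"]) on the contents of the file returns None for any option that is not present, which the client treats as its default.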
ID: 102253
Bryn Mawr
Help desk expert

Joined: 31 Dec 18
Posts: 285
United Kingdom
Message 102267 - Posted: 20 Dec 2020, 16:49:28 UTC - in response to Message 102253.  


I did notice, however, that no_alt_platform was still set from the days when I was getting 32-bit apps from ?WCG? that were crashing. I’ve reset it just in case.


It was worth a try, made no difference though.
ID: 102267
Jim1348

Joined: 8 Nov 10
Posts: 310
United States
Message 102274 - Posted: 20 Dec 2020, 22:43:44 UTC - in response to Message 102242.  

My default is 0.1 + 0.1 and I assume that setting nnt then changing the buffer to 10 + 10 before requesting work would have the same effect without requiring me to run down the existing WUs first?
If the BOINC scheduler behaved the way we thought it should, we wouldn't be here discussing it.
ID: 102274
Bryn Mawr
Help desk expert

Joined: 31 Dec 18
Posts: 285
United Kingdom
Message 102276 - Posted: 20 Dec 2020, 23:32:25 UTC - in response to Message 102274.  

My default is 0.1 + 0.1 and I assume that setting nnt then changing the buffer to 10 + 10 before requesting work would have the same effect without requiring me to run down the existing WUs first?
If the BOINC scheduler behaved the way we thought it should, we wouldn't be here discussing it.


Touché, I offer no defence :-)
ID: 102276
robsmith
Volunteer tester
Help desk expert

Joined: 25 May 09
Posts: 1283
United Kingdom
Message 102279 - Posted: 21 Dec 2020, 8:45:37 UTC

I've been running 1 + 0.01 for some time, but CPDN does not follow the convention on sending out work, as the majority of its tasks are of extremely long duration. As far as I can see, it sends out enough work to populate all allowed CPU cores (in my case four out of eight), then only restocks as cores become available, until the work balance is restored as far as CPDN is concerned. Also, on my system at least, BOINC is pretty good at "bumping" CPDN out of the way of other work that is due to expire, and not refilling the cache until CPDN is out of the way.
ID: 102279
Bryn Mawr
Help desk expert

Joined: 31 Dec 18
Posts: 285
United Kingdom
Message 102281 - Posted: 21 Dec 2020, 10:51:31 UTC

So back to the original questions :-

Is there any way I can force a change of host id?

Is there any way I can resync the code sign key?
ID: 102281
Jim1348

Joined: 8 Nov 10
Posts: 310
United States
Message 102282 - Posted: 21 Dec 2020, 14:32:27 UTC - in response to Message 102281.  

Is there any way I can force a change of host id?
It is easy enough on Ubuntu by just updating. This is the same machine with different Linux kernels:
https://www.cpdn.org/show_host_detail.php?hostid=1507626
https://www.cpdn.org/show_host_detail.php?hostid=1508717

But you could probably do it by just changing the PC name if you are on Windows.
And you could then change the name back again I suppose.
ID: 102282
Bryn Mawr
Help desk expert

Joined: 31 Dec 18
Posts: 285
United Kingdom
Message 102284 - Posted: 21 Dec 2020, 15:30:25 UTC - in response to Message 102282.  

Is there any way I can force a change of host id?
It is easy enough on Ubuntu by just updating. This is the same machine with different Linux kernels:
https://www.cpdn.org/show_host_detail.php?hostid=1507626
https://www.cpdn.org/show_host_detail.php?hostid=1508717

But you could probably do it by just changing the PC name if you are on Windows.
And you could then change the name back again I suppose.


Hmm. I’m on Ubuntu 20.04 and the next update is 20.10 which, as far as I understand, gives problems with Boinc so I’d rather not update from the LTS version yet.

I’ve just tried a name change and it registered with CP, but it stayed as the same machine with the same host ID.
ID: 102284
robsmith
Volunteer tester
Help desk expert

Joined: 25 May 09
Posts: 1283
United Kingdom
Message 102285 - Posted: 21 Dec 2020, 15:38:46 UTC - in response to Message 102282.  
Last modified: 21 Dec 2020, 15:44:53 UTC

Changing the computer name is not one of the triggers for a new computer ID; basically, BOINC ignores the name you give to a computer.
A change of operating system will get you a new computer ID, but it has to be more substantial than just a Linux version change (Linux to Windows, for example). Changing Linux family may count, but I'm not sure about that.
BIG hardware changes (e.g. changing from an Intel to an AMD CPU) will almost certainly get you a new computer ID.

There have been times when changing BOINC version has triggered a mass computer ID change, but that was very much the luck of the draw.

edit to add:
A Linux kernel version change sometimes works, but sometimes doesn't.
ID: 102285
Dave
Help desk expert

Joined: 28 Jun 10
Posts: 2534
United Kingdom
Message 102287 - Posted: 21 Dec 2020, 15:47:31 UTC

Hmm. I’m on Ubuntu 20.04 and the next update is 20.10 which, as far as I understand, gives problems with Boinc so I’d rather not update from the LTS version yet.


I have a fresh install of 20.10 on both my Ryzen and the old Laptop and both seem to be working OK with BOINC.

I have, however, compiled BOINC from source rather than using the version supplied by the repository.
ID: 102287

Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.