Thread 'CPDN project offline again'

Jim1348

Send message
Joined: 8 Nov 10
Posts: 310
United States
Message 86357 - Posted: 29 May 2018, 17:55:23 UTC - in response to Message 86355.  

IBM sponsors WCG but I don't know if their infrastructure is part of some IBM quasi-Amazon web hosting cloud devops BS etc.

Last year, WCG moved from their own servers to "the cloud". It was a big deal. But what that really means I don't know. I expect corporate IBM required the move. It works for them.
ID: 86357 · Report as offensive
Jim1348

Send message
Joined: 8 Nov 10
Posts: 310
United States
Message 86363 - Posted: 29 May 2018, 19:02:46 UTC - in response to Message 86362.  

They had a few hiccups at the time of the move, but have resumed their usual high reliability since.
ID: 86363 · Report as offensive
Les Bayliss
Help desk expert

Send message
Joined: 25 Nov 05
Posts: 1654
Australia
Message 86368 - Posted: 29 May 2018, 20:37:24 UTC

It looks like CPDN is effectively offline, due to the huge number of access attempts.
ID: 86368 · Report as offensive
ProfileJord
Volunteer tester
Help desk expert
Avatar

Send message
Joined: 29 Aug 05
Posts: 15573
Netherlands
Message 86371 - Posted: 29 May 2018, 21:03:48 UTC - in response to Message 86368.  

That's called a denial of service attack. Distributed if there's a plan behind it.
ID: 86371 · Report as offensive
mmonnin

Send message
Joined: 1 Jul 16
Posts: 146
United States
Message 86414 - Posted: 1 Jun 2018, 19:58:36 UTC - in response to Message 86383.  

I doubt it's participants trying to get a news status - the project has been down so long there are probably a million trickles trying to get to the servers. Although I thought the BOINC client would give up on a project's comms after a few weeks of silence, which it sounds like this is.


Not sure the client gives up but the duration between attempts gets longer. I'm not sure how long it ends up being. I think from what I've seen when a project is down all the tasks have the same backoff time so I would think the attempts per PC would come in bursts with periods of radio silence.
ID: 86414 · Report as offensive
Les Bayliss
Help desk expert

Send message
Joined: 25 Nov 05
Posts: 1654
Australia
Message 86416 - Posted: 1 Jun 2018, 20:28:47 UTC

I don't know the reason for the access problems a few days ago, but I told the project IT person, who reset the server, which has been working fine ever since.

People having trouble trying to upload to the original servers is a different problem.
ID: 86416 · Report as offensive
ProfileJord
Volunteer tester
Help desk expert
Avatar

Send message
Joined: 29 Aug 05
Posts: 15573
Netherlands
Message 86424 - Posted: 2 Jun 2018, 12:31:37 UTC - in response to Message 86420.  

yeah I know there's an "exponential backoff" for a "downed project" that goes up to a week between requests I think
It's a random value with a maximum of 24 hours, which is randomly changed again once 24 hours is reached. This is to ensure that the computers reaching the 24-hour back-off don't all chime back in at the same time.
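
As an illustration of the randomized back-off described above, here is a minimal Python sketch. It is not the BOINC client's actual code: the 24-hour ceiling comes from the post, while the 60-second starting delay, the doubling rule, the half-to-full jitter range and the name `next_backoff` are assumptions made for the example.

```python
import random

MAX_BACKOFF_S = 24 * 3600  # 24-hour ceiling mentioned in the post
MIN_BACKOFF_S = 60         # assumed starting delay, not taken from BOINC


def next_backoff(failures, rng=random):
    """Return a randomized delay in seconds after `failures` consecutive failures.

    Hypothetical helper: the ceiling doubles with each failure until it
    reaches MAX_BACKOFF_S, and the delay is drawn at random below it.
    """
    ceiling = min(MAX_BACKOFF_S, MIN_BACKOFF_S * 2 ** failures)
    # Draw between half the ceiling and the ceiling, so two hosts that
    # failed at the same moment are unlikely to retry at the same moment.
    return rng.uniform(ceiling / 2, ceiling)


if __name__ == "__main__":
    for failures in (0, 5, 10, 20):
        hours = next_backoff(failures) / 3600
        print(f"after {failures} failures: retry in {hours:.2f} h")
```

Once the ceiling is reached, every further call still draws a fresh random value, which is the "randomly changed again" behaviour the post describes.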

but once upon a time there was supposed to be an "automatic detach" for a project that seemed dead after a month of trying.
I can't remember this, or if it's ever been implemented. Doesn't seem like a good thing, because then everyone who has old projects left on their Projects list, ones that have been deceased long ago, will have lost these projects, as with every BOINC start up it will contact all projects in the list, to see if they have new Notices...
ID: 86424 · Report as offensive
ProfileGary Charpentier
Avatar

Send message
Joined: 23 Feb 08
Posts: 2498
United States
Message 86425 - Posted: 2 Jun 2018, 14:14:07 UTC - in response to Message 86424.  

but once upon a time there was supposed to be an "automatic detach" for a project that seemed dead after a month of trying.
I can't remember this, or if it's ever been implemented. Doesn't seem like a good thing, because then everyone who has old projects left on their Projects list, ones that have been deceased long ago, will have lost these projects, as with every BOINC start up it will contact all projects in the list, to see if they have new Notices...
Well, thinking like a bad person, register a now dead project URL to my attack server ... .
ID: 86425 · Report as offensive
ProfileJord
Volunteer tester
Help desk expert
Avatar

Send message
Joined: 29 Aug 05
Posts: 15573
Netherlands
Message 86426 - Posted: 2 Jun 2018, 14:51:26 UTC - in response to Message 86425.  

No, that's the thing. You can't register to a dead project, because you won't be able to reach the scheduling server. But if you had the project added at the time of it going permanently dormant, after a month it won't automatically remove that project from the list.
ID: 86426 · Report as offensive
Richard Haselgrove
Volunteer tester
Help desk expert

Send message
Joined: 5 Oct 06
Posts: 5134
United Kingdom
Message 86429 - Posted: 2 Jun 2018, 17:41:25 UTC - in response to Message 86426.  

I can think of a way, but I'd better send you a PM about it.
ID: 86429 · Report as offensive
ProfileJord
Volunteer tester
Help desk expert
Avatar

Send message
Joined: 29 Aug 05
Posts: 15573
Netherlands
Message 86469 - Posted: 6 Jun 2018, 17:12:41 UTC

Andy Bowery, CPDN main hooha wrote:
Hi All,

The CPDN project has been taken offline. We are currently performing a database dump from the backup project. This is being performed in order to construct a new dedicated main database server.

Best regards,

Andy
ID: 86469 · Report as offensive
ProfileDave
Help desk expert

Send message
Joined: 28 Jun 10
Posts: 2725
United Kingdom
Message 86518 - Posted: 10 Jun 2018, 6:11:06 UTC - in response to Message 86125.  

And more from Andy,

Hi All,


The CPDN project has been taken offline. We are currently performing a database dump from the backup project. This is being performed in order to construct a new dedicated main database server.


Best regards,

Andy

ID: 86518 · Report as offensive
ProfileDave
Help desk expert

Send message
Joined: 28 Jun 10
Posts: 2725
United Kingdom
Message 86533 - Posted: 12 Jun 2018, 11:32:49 UTC - in response to Message 86525.  

I have little doubt the project will be back up. The problem lies with a major meltdown of some of Oxford Uni's computing infrastructure. However, rebuilding everything is going to take a while with the many terabytes of data involved. The aim is to make the system more robust at the same time, so once it is back up and any teething problems are ironed out we should be better off than before.
ID: 86533 · Report as offensive
Jean-David

Send message
Joined: 19 Dec 05
Posts: 93
United States
Message 86572 - Posted: 16 Jun 2018, 1:03:24 UTC - in response to Message 86424.  

I have received no work in months (Linux machine), so my back-off must be at maximum. But it seems nowhere near a week.
My experience over the last week has been:

08-Jun-2018 02:52:30 Scheduler request failed: Couldn't connect to server
08-Jun-2018 06:43:37 Scheduler request failed: Couldn't connect to server
08-Jun-2018 12:11:38 Scheduler request failed: Couldn't connect to server
08-Jun-2018 17:15:14 Scheduler request failed: Couldn't connect to server
09-Jun-2018 02:11:38 Scheduler request failed: Couldn't connect to server
09-Jun-2018 06:48:27 Scheduler request failed: Couldn't connect to server
09-Jun-2018 10:38:09 Scheduler request failed: Couldn't connect to server
09-Jun-2018 15:04:30 Scheduler request failed: Couldn't connect to server
09-Jun-2018 18:32:13 Scheduler request failed: Couldn't connect to server
09-Jun-2018 23:39:00 Scheduler request failed: Couldn't connect to server
10-Jun-2018 00:52:25 Scheduler request failed: Couldn't connect to server
10-Jun-2018 05:14:38 Scheduler request failed: Couldn't connect to server
10-Jun-2018 10:13:41 Scheduler request failed: Couldn't connect to server
10-Jun-2018 18:34:00 Scheduler request failed: Couldn't connect to server
11-Jun-2018 01:16:15 Scheduler request failed: Couldn't connect to server
11-Jun-2018 06:10:27 Scheduler request failed: Couldn't connect to server
11-Jun-2018 10:47:09 Scheduler request failed: Couldn't connect to server
11-Jun-2018 15:15:29 Scheduler request failed: Couldn't connect to server
11-Jun-2018 20:56:30 Scheduler request failed: Couldn't resolve host name
12-Jun-2018 01:12:38 Scheduler request failed: Couldn't connect to server
12-Jun-2018 05:13:42 Scheduler request failed: Couldn't connect to server
12-Jun-2018 10:32:04 Scheduler request failed: Couldn't connect to server
12-Jun-2018 15:25:49 Scheduler request failed: Couldn't connect to server
12-Jun-2018 20:49:24 Scheduler request failed: Couldn't connect to server
13-Jun-2018 02:06:54 Scheduler request failed: Couldn't connect to server
13-Jun-2018 07:11:33 Scheduler request failed: Couldn't connect to server
13-Jun-2018 14:36:32 Scheduler request failed: Couldn't connect to server
13-Jun-2018 14:38:29 Fetching scheduler list
13-Jun-2018 14:38:32 Master file download succeeded
13-Jun-2018 19:33:30 Scheduler request failed: Couldn't connect to server
13-Jun-2018 20:13:45 Scheduler request failed: Couldn't connect to server
13-Jun-2018 23:54:30 Scheduler request failed: Couldn't connect to server
14-Jun-2018 02:39:36 Scheduler request failed: Couldn't connect to server
14-Jun-2018 07:48:25 Scheduler request failed: Couldn't connect to server
14-Jun-2018 12:38:24 Scheduler request failed: Couldn't connect to server
14-Jun-2018 18:25:26 Scheduler request failed: Couldn't connect to server
15-Jun-2018 00:39:47 Scheduler request failed: Couldn't connect to server
15-Jun-2018 07:53:30 Scheduler request failed: Couldn't connect to server
15-Jun-2018 12:57:33 Scheduler request failed: Couldn't connect to server
15-Jun-2018 18:36:10 Scheduler request failed: Couldn't connect to server
15-Jun-2018 18:38:05 Fetching scheduler list
15-Jun-2018 18:38:10 Master file download succeeded
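
One way to check how long the client actually waits between attempts is to compute the gaps between failed scheduler requests. The following is a minimal Python sketch using the first five lines of the log above; the function name `failure_gaps_hours` is made up for the example, and the timestamp parsing assumes an English locale for the month abbreviation.

```python
from datetime import datetime

# First five entries copied from the log above.
LOG = """\
08-Jun-2018 02:52:30 Scheduler request failed: Couldn't connect to server
08-Jun-2018 06:43:37 Scheduler request failed: Couldn't connect to server
08-Jun-2018 12:11:38 Scheduler request failed: Couldn't connect to server
08-Jun-2018 17:15:14 Scheduler request failed: Couldn't connect to server
09-Jun-2018 02:11:38 Scheduler request failed: Couldn't connect to server
"""


def failure_gaps_hours(log_text):
    """Return the gaps, in hours, between consecutive failed scheduler requests."""
    times = []
    for line in log_text.splitlines():
        if "Scheduler request failed" not in line:
            continue  # skip "Fetching scheduler list" and similar lines
        stamp = line[:20]  # e.g. "08-Jun-2018 02:52:30"
        times.append(datetime.strptime(stamp, "%d-%b-%Y %H:%M:%S"))
    return [(b - a).total_seconds() / 3600 for a, b in zip(times, times[1:])]


if __name__ == "__main__":
    gaps = failure_gaps_hours(LOG)
    print("gaps (h):", [round(g, 2) for g in gaps])
    print("longest gap (h):", round(max(gaps), 2))
```

Pasting the full log into `LOG` gives the same measurement for the whole week.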
ID: 86572 · Report as offensive
Les Bayliss
Help desk expert

Send message
Joined: 25 Nov 05
Posts: 1654
Australia
Message 86573 - Posted: 16 Jun 2018, 1:59:29 UTC

There's been NO work for Linux for over a year, so unless you run Wine, you're wasting your time.
And the project is still offline.
ID: 86573 · Report as offensive
Jean-David

Send message
Joined: 19 Dec 05
Posts: 93
United States
Message 86577 - Posted: 16 Jun 2018, 16:02:20 UTC - in response to Message 86573.  
Last modified: 16 Jun 2018, 16:28:17 UTC

Not really wasting my time. I run World Community Grid(30), Rosetta@home(13), Seti@home(13) too. But I am really disappointed because I feel CPDN is the most important thing I could run. If they all had work units, 44% of my spare resources would go to CPDN. (Actually, those resource shares result in CPDN getting 50% of my spare CPU time.)
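
The 44% figure follows from the resource shares if CPDN's own share is 44, since the four shares then add up to 100. A minimal Python sketch of that arithmetic; the CPDN share of 44 is inferred from the percentage and is not stated in the post, and `share_fractions` is a name made up for the example.

```python
# Resource shares quoted in the post; the CPDN value of 44 is inferred, not stated.
shares = {
    "World Community Grid": 30,
    "Rosetta@home": 13,
    "Seti@home": 13,
    "CPDN": 44,
}


def share_fractions(shares):
    """Return each project's fraction of the total resource share."""
    total = sum(shares.values())
    return {name: value / total for name, value in shares.items()}


if __name__ == "__main__":
    for name, frac in share_fractions(shares).items():
        print(f"{name}: {frac:.0%}")
```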
ID: 86577 · Report as offensive
Jim1348

Send message
Joined: 8 Nov 10
Posts: 310
United States
Message 86578 - Posted: 16 Jun 2018, 16:58:03 UTC - in response to Message 86577.  

But I am really disappointed because I feel CPDN is the most important thing I could run.

I would feel that way too if they did more predictive work. But as far as I can tell, most of their studies are about why some climate event occurred ten years ago. That is nice to know, but not of as much actionable significance as something ten years in the future. Maybe it is their academic inclination, or maybe that is all the data supports.
ID: 86578 · Report as offensive
Les Bayliss
Help desk expert

Send message
Joined: 25 Nov 05
Posts: 1654
Australia
Message 86582 - Posted: 16 Jun 2018, 20:08:00 UTC
Last modified: 16 Jun 2018, 20:40:18 UTC

Climate prediction work is done by the various weather bureaus.
The research centres running models on CPDN are doing "attribution studies".

Some of these are listed on the front page, which is not BOINC related, and is still running.

It's starting to get suspicious as to why it's still down, as the last post from Andy implied that large parts of the uni had lost their servers.
I'm wondering if there's been a policy change "upstairs" which has stopped things. And if the project will reappear in a different form.

edit
I just realised: Trinity term finished yesterday, and Michaelmas term doesn't start until October.
So the University of Oxford is deserted again, except for maintenance/upgrade people.
ID: 86582 · Report as offensive
Alan K

Send message
Joined: 16 Dec 16
Posts: 6
United Kingdom
Message 86583 - Posted: 16 Jun 2018, 21:21:11 UTC - in response to Message 86582.  
Last modified: 16 Jun 2018, 21:21:32 UTC

Not forgetting all the PhD students and post docs. Does that mean Andy might get somewhere?
ID: 86583 · Report as offensive
Les Bayliss
Help desk expert

Send message
Joined: 25 Nov 05
Posts: 1654
Australia
Message 86585 - Posted: 16 Jun 2018, 21:46:56 UTC

Ah, the advanced students don't get to go on hols.
Andy works for the IT section, so he could be anywhere.

The "good news" is that there's still a message on the url for the Message board, and the front page of the project is still there.
So some vestiges of the project are still alive.

I did wonder if anyone "here" knew any students at Oxford, who might be able to find out about rumors and goings on.
But there's always the BOINC workshop in a month's time. Perhaps some news will come out of that.
Long way off though.
ID: 86585 · Report as offensive

Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.