Thread 'News on Project Outages'

Thyme Lawn
Joined: 2 Sep 05
Posts: 103
United Kingdom
Message 34330 - Posted: 23 Aug 2010, 13:40:36 UTC
Last modified: 23 Aug 2010, 13:40:50 UTC

CPDN Main Project Upload Problem

uploader.oerc.ox.ac.uk is back up but Milo is not sure how stable it is going to be (a disk in the RAID array was showing a SMART failure). He is contacting the manufacturer for support.
ID: 34330
mo.v
Joined: 13 Aug 06
Posts: 778
United Kingdom
Message 34361 - Posted: 25 Aug 2010, 13:18:56 UTC

CPDN main project upload outage

The disk in upload server uploader.oerc that caused the earlier problems has failed completely. Milo has ordered a replacement disk and must now install it. As during the previous outage, the server may be shown as running on the Server status page even though it is down.

The outage will affect FAMOUS upload files 3, 6, 9, 12, 15 and 18. If other files become stuck in Transfers please see Thyme Lawn's advice two posts above this.

You can either suspend BOINC network activity or take no action.
ID: 34361
idahofisherman
Joined: 11 Aug 06
Posts: 154
United States
Message 34363 - Posted: 25 Aug 2010, 19:14:17 UTC

What's happening with Goldbach's Conjecture? It has been down for a week or more.
ID: 34363
mo.v
Joined: 13 Aug 06
Posts: 778
United Kingdom
Message 34368 - Posted: 26 Aug 2010, 1:16:49 UTC

CPDN uploads

Uploader.oerc is running and accepting uploads.

ID: 34368
mo.v
Joined: 13 Aug 06
Posts: 778
United Kingdom
Message 34386 - Posted: 27 Aug 2010, 11:59:35 UTC

CPDN main project uploads

As you can see from the Server status page, upload server uploader.oerc is down. Its disk is nearly full, so Milo is transferring data off it. The affected uploads are FAMOUS files 3, 6, 9, 12, 15 and 18. Files with other numbers upload to different servers and will not be affected.

You can either suspend BOINC network activity or take no action.
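
The rule in this post can be sketched as a small check. This is only an illustration: the set of numbers comes from the post, while the function names are made up here and nothing in it talks to BOINC or to the CPDN servers.

```python
# FAMOUS upload file numbers that the post lists as going to uploader.oerc.
AFFECTED_FAMOUS_FILES = frozenset({3, 6, 9, 12, 15, 18})


def is_affected(file_number: int) -> bool:
    """Return True if this FAMOUS upload file number is listed as affected."""
    return file_number in AFFECTED_FAMOUS_FILES


def split_pending(file_numbers):
    """Split pending upload file numbers into (affected, unaffected), keeping order."""
    affected = [n for n in file_numbers if is_affected(n)]
    unaffected = [n for n in file_numbers if not is_affected(n)]
    return affected, unaffected
```

For example, `split_pending([1, 3, 4, 6])` separates files 3 and 6, which will wait for uploader.oerc, from files 1 and 4, which should upload to other servers as normal.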
ID: 34386
BarryAZ
Joined: 4 Sep 09
Posts: 381
United States
Message 34388 - Posted: 27 Aug 2010, 14:52:16 UTC

DNetc went offline yesterday at about 1PM PDT for a planned software update. The estimate was for a five hour outage. For several hours after that the home page and message boards were live. At about 5PM, they went offline and the site (at least til now - 8AM Friday PDT) is totally offline.

MW went offline sometime after 7AM this morning -- totally offline. For MW, this sort of thing happens often enough when the good folks at RPI are working at reconfiguring power to the building -- something they do unusually frequently.
ID: 34388
Gundolf Jahn
Joined: 20 Dec 07
Posts: 1069
Germany
Message 34392 - Posted: 27 Aug 2010, 17:59:31 UTC - in response to Message 34388.  

DNetc went offline yesterday at about 1PM PDT for a planned software update. The estimate was for a five hour outage. For several hours after that the home page and message boards were live. At about 5PM, they went offline and the site (at least til now - 8AM Friday PDT) is totally offline.

From Message 89655 at the BOINCStats News forum:
DNETC RAID Failure
One of the raid hdd's have died in half of update process - we are trying to restore project from backup. Project off - for one day (28.08)

Regards,
Gundolf
ID: 34392
BarryAZ
Joined: 4 Sep 09
Posts: 381
United States
Message 34395 - Posted: 27 Aug 2010, 21:36:47 UTC - in response to Message 34392.  

Ah --- thanks for that. I hope they were at least in RAID 5 mode (ideally RAID 5 plus a hot spare -- as disks are not that expensive). Then again, I know some folks love the speed of a multi-drive RAID 0 setup....
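
For anyone weighing that trade-off, here is a minimal Python sketch of the standard capacity and fault-tolerance arithmetic for RAID 0 and RAID 5. The function name and the example disk counts are illustrative assumptions, not details of the DNETC server.

```python
def raid_summary(level: int, disks: int, disk_tb: float) -> dict:
    """Usable capacity and tolerated disk failures for RAID 0 or RAID 5.

    'disks' counts only the array members; a hot spare is extra and adds no capacity.
    """
    if level == 0:
        if disks < 2:
            raise ValueError("RAID 0 striping needs at least 2 disks")
        # Pure striping: all capacity is usable, but any single failure loses the array.
        usable, tolerated = disks * disk_tb, 0
    elif level == 5:
        if disks < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        # One disk's worth of space holds parity, so one failure is survivable.
        usable, tolerated = (disks - 1) * disk_tb, 1
    else:
        raise ValueError("only RAID 0 and RAID 5 are covered in this sketch")
    return {"usable_tb": usable, "failures_tolerated": tolerated}
```

With four 1 TB disks, RAID 0 gives 4 TB and survives no failures, while RAID 5 gives 3 TB and survives one. A hot spare does not raise that number; it only lets the rebuild start without waiting for someone to swap the disk.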



ID: 34395
mo.v
Joined: 13 Aug 06
Posts: 778
United Kingdom
Message 34398 - Posted: 28 Aug 2010, 3:48:23 UTC

CPDN main project uploads

Uploader.oerc is up and running.

ID: 34398
BarryAZ
Joined: 4 Sep 09
Posts: 381
United States
Message 34406 - Posted: 28 Aug 2010, 17:02:52 UTC - in response to Message 34392.  

Looks like Dnetc is having further trouble with the rebuild. Late last night they had their home page address up (with no content), so it looked like some progress. Now, eight hours later, they are back to their starting point. Perhaps another drive failure?
ID: 34406
BarryAZ
Joined: 4 Sep 09
Posts: 381
United States
Message 34416 - Posted: 29 Aug 2010, 15:47:50 UTC - in response to Message 34406.  

OK -- Dnetc repeated the same cycle 24 hours later: home page up with no content, then 8 hours later back to the same 'dead' starting point. Perhaps their backups are corrupt as well.


ID: 34416
BarryAZ
Joined: 4 Sep 09
Posts: 381
United States
Message 34426 - Posted: 29 Aug 2010, 21:39:54 UTC - in response to Message 34416.  

Perhaps the folks at Dnetc are going to have to start all over from scratch -- looks like their various recovery efforts are not meeting with success at all.
ID: 34426
BarryAZ
Joined: 4 Sep 09
Posts: 381
United States
Message 34443 - Posted: 30 Aug 2010, 21:54:36 UTC - in response to Message 34426.  

Still no follow-up from Dnetc -- they have a placeholder up at their home page -- but nothing further over the past 24 hours. I do hope they can get their project back up, since it is one of only two projects with broad-based (single precision and double precision CUDA and ATI) GPU support.


Index of /
[ICO] Name Last modified Size Description
[DIR] webalizer/ 30-Aug-2010 15:16 -
Apache/2.2.16 (Debian) Server at dnetc.net Port 80
ID: 34443
Andrew Hime
Joined: 30 Aug 10
Posts: 1
United States
Message 34446 - Posted: 30 Aug 2010, 22:57:07 UTC

You guys can always go to http://www.distributed.net and download the Stream client in the meantime. The keyrate for RC5 is down 75%.
ID: 34446
BarryAZ
Joined: 4 Sep 09
Posts: 381
United States
Message 34448 - Posted: 30 Aug 2010, 23:55:41 UTC - in response to Message 34426.  

Dnetc is working through a full set of issues -- hardware first (Hard drive and memory), then software (working with a restore from backup). They hope to be back online later in the week.
ID: 34448
mo.v
Joined: 13 Aug 06
Posts: 778
United Kingdom
Message 34455 - Posted: 31 Aug 2010, 11:00:02 UTC

CPDN main project

Milo is moving data from server disks so they don't fill up. There will be some upload server outages. Here is the Server status page.
ID: 34455
Thyme Lawn
Joined: 2 Sep 05
Posts: 103
United Kingdom
Message 34462 - Posted: 31 Aug 2010, 19:56:22 UTC

CPDN main project

Milo has temporarily shut the project down to perform a database archive.

This means that all scheduler requests will fail (including trickles, work requests, reporting completed tasks and attaching to the project) and the BOINC forums are inaccessible.

uploader.oerc is also unavailable while more space is being made available by moving files elsewhere. This affects FAMOUS upload files 3, 6, 9, 12, 15 and 18.
ID: 34462
BarryAZ
Joined: 4 Sep 09
Posts: 381
United States
Message 34472 - Posted: 1 Sep 2010, 6:10:43 UTC

Both MW and Dnetc are back up and running -- good news there for GPU folks.

In addition to the ongoing storage issues for Climate, another of the larger projects (Einstein) has run into a problem on the server side: no uploads or downloads until they figure out the problem (or perhaps simply stop and restart the server).

ID: 34472
Claggy
Joined: 23 Apr 07
Posts: 1112
United Kingdom
Message 34627 - Posted: 11 Sep 2010, 20:37:38 UTC

Seti and Seti Beta are down:

The air conditioning in our server closet failed this morning, causing several of our important systems to overheat. The servers have since been shut down, but until we survey the damage and address these cooling issues all parts of the project will remain offline.


Claggy
ID: 34627
mo.v
Joined: 13 Aug 06
Posts: 778
United Kingdom
Message 34641 - Posted: 12 Sep 2010, 12:41:56 UTC

CPDN main project

Milo has had to disable climateapps1.oucs, an upload server that takes regional model files. Its disk is full and data will need to be copied from it tomorrow.

If you have HadAM3P regional model files that cannot upload, you could temporarily turn off BOINC network activity (in the Activity menu). Or you could take no action.

Here is the Server status page.

ID: 34641

Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.