Thread 'News on Project Outages'

BarryAZ
Joined: 4 Sep 09
Posts: 381
United States
Message 34645 - Posted: 12 Sep 2010, 15:52:32 UTC - in response to Message 34641.  

The Climate disk-full scenario seems to be turning chronic -- are there any plans for a longer-term fix, so that Milo doesn't have to semi-regularly shut down and move data around as a temporary solution?
ID: 34645 · Report as offensive
BarryAZ
Joined: 4 Sep 09
Posts: 381
United States
Message 34646 - Posted: 12 Sep 2010, 15:55:39 UTC - in response to Message 34627.  

I am guessing that the problems at SETI won't be looked at until Monday, and then at best won't be resolved until it is time for the weekly half-week shutdown, meaning that SETI is not going to be available for uploads/downloads until Friday this coming week, at which time the accumulated traffic jam may make things difficult for a period of days.

The thing is, the current 'half-week on/half-week off' cycle means that any problems that occur during the 'half-week on' phase become quite a bit more disruptive for SETI.
ID: 34646 · Report as offensive
BarryAZ
Joined: 4 Sep 09
Posts: 381
United States
Message 34651 - Posted: 12 Sep 2010, 19:32:57 UTC

The periodic validator/work generator memory leak bug has resurfaced for Milkyway -- no new work, completed work not getting validated. Probably will be resolved by Monday noon.
ID: 34651 · Report as offensive
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15552
Netherlands
Message 34656 - Posted: 12 Sep 2010, 21:35:27 UTC - in response to Message 34646.  

I am guessing that the problems at SETI won't be looked at until Monday, and then at best won't be resolved until it is time for the weekly half-week shutdown, meaning that SETI is not going to be available for uploads/downloads until Friday this coming week, at which time the accumulated traffic jam may make things difficult for a period of days.

Yeah, I'll wait till Tuesday before deciding if I am going to abort the 68 tasks in perpetual download/cannot download cycle. Too bad it'll reduce my next week's allowance to 30 tasks. ;-)
ID: 34656 · Report as offensive
Bernd
Joined: 24 Aug 09
Posts: 91
United States
Message 34657 - Posted: 13 Sep 2010, 3:32:27 UTC

SETI is now totally offline. The main webpage was online earlier today. I guess things are worse than originally thought.
ID: 34657 · Report as offensive
Gary Charpentier
Joined: 23 Feb 08
Posts: 2491
United States
Message 34658 - Posted: 13 Sep 2010, 6:16:56 UTC - in response to Message 34657.  

SETI is now totally offline. The main webpage was online earlier today. I guess things are worse than originally thought.

thinman is there answering pings and traceroute, but it looks like apache is not running. Don't know if it was disabled by hand or if drives went offline due to heat damage.

In any case, we know the project isn't coming up until the A/C repairman has finished. Hope he doesn't need parts that he has to order Monday AM and then wait for a shipment from a supplier.

ID: 34658 · Report as offensive
BarryAZ
Joined: 4 Sep 09
Posts: 381
United States
Message 34663 - Posted: 13 Sep 2010, 16:23:34 UTC - in response to Message 34658.  

If memory serves, this is at least the second time in the past few years that the SETI project has had an AC failure in their server closet. One may hope that from this they are able to implement physical changes that improve the reliability of their cooling there and perhaps provide a more adequate closet layout (that is, perhaps they have too much equipment in too small a space to begin with).

With their half-week weekly offline schedule, it may well be the end of this week before we see things back to 'normal'. Once they are fully online, a week-long outage is going to show up as major traffic jams (upload/download), which may take over a day to sort out.


SETI is now totally offline. The main webpage was online earlier today. I guess things are worse than originally thought.

thinman is there answering pings and traceroute, but it looks like apache is not running. Don't know if it was disabled by hand or if drives went offline due to heat damage.

In any case, we know the project isn't coming up until the A/C repairman has finished. Hope he doesn't need parts that he has to order Monday AM and then wait for a shipment from a supplier.

ID: 34663 · Report as offensive
Byron Leigh Hatch @ team Carl ...
Joined: 30 Aug 05
Posts: 505
Canada
Message 34664 - Posted: 13 Sep 2010, 16:47:59 UTC

SETI@home _ Server status page _ Just for info: nothing yet as of this post. Give Matt and crew time to do their thing.
ID: 34664 · Report as offensive
Gary Charpentier
Joined: 23 Feb 08
Posts: 2491
United States
Message 34668 - Posted: 13 Sep 2010, 23:10:21 UTC

SETI update
Monday Update: The air conditioning in our server closet failed on Saturday, causing several of our important systems to overheat. Most servers and services were shut down and will remain down as we continue to wait for the air conditioning to be fixed, which will be Tuesday at the earliest.
13 Sep 2010 22:37:07 UTC


Looks like they (Berkeley) did need to order parts.

ID: 34668 · Report as offensive
mo.v
Joined: 13 Aug 06
Posts: 778
United Kingdom
Message 34670 - Posted: 14 Sep 2010, 11:46:51 UTC

CPDN main project

Upload server climateapps1.oucs is now up and running though Milo may need to disable it later today for a short update. There is no need to take action.

ID: 34670 · Report as offensive
BarryAZ
Joined: 4 Sep 09
Posts: 381
United States
Message 34673 - Posted: 14 Sep 2010, 16:24:54 UTC

Dnetc went offline about 2 hours ago (7AM PDT) -- home page can't be reached.

SETI remains offline (at a guess it will remain offline for the week).

ID: 34673 · Report as offensive
sygopet
Joined: 9 Mar 06
Posts: 12
United Kingdom
Message 34674 - Posted: 14 Sep 2010, 17:28:53 UTC
Last modified: 14 Sep 2010, 17:33:41 UTC

Seti update, Tuesday 16:24 UTC
Most servers and services were shut down and will remain down as we continue to wait for the air conditioning to be fixed, which will be Tuesday at the earliest.


Looks like Friday at the earliest to me, if the usual midweek shutdown is included!
ID: 34674 · Report as offensive
Bernd
Joined: 24 Aug 09
Posts: 91
United States
Message 34675 - Posted: 14 Sep 2010, 18:20:36 UTC - in response to Message 34674.  

Seti update, Tuesday 16:24 UTC
Most servers and services were shut down and will remain down as we continue to wait for the air conditioning to be fixed, which will be Tuesday at the earliest.


Looks like Friday at the earliest to me, if the usual midweek shutdown is included!


Small parts came online today. It looks like they are allowing interrupted WUs to download. However, it looks like a full restore will be delayed until after the midweek shutdown passes.
ID: 34675 · Report as offensive
Gary Charpentier
Joined: 23 Feb 08
Posts: 2491
United States
Message 34676 - Posted: 14 Sep 2010, 21:34:58 UTC

Seti Update
Air Conditioning Update (Tuesday): The air conditioning in our server closet failed on Saturday, causing several of our important systems to overheat. Most servers and services were shut down to prevent damage and remained down until the air conditioner was finally checked out Monday afternoon. We tested the system under full load this morning and it failed again, so full production is delayed at least another day. Sorry for the confusion/inconvenience.
14 Sep 2010 21:31:43 UTC


Not looking good.

ID: 34676 · Report as offensive
Byron Leigh Hatch @ team Carl ...
Joined: 30 Aug 05
Posts: 505
Canada
Message 34686 - Posted: 15 Sep 2010, 18:30:44 UTC
Last modified: 15 Sep 2010, 19:30:40 UTC

SETI@home Update _ Wednesday _ 15 Sep 2010 17:17:53 UTC

Air Conditioning Update (Wednesday): The air conditioning in our server closet failed on Saturday, causing several of our important machines to overheat. Most servers and services were immediately shut down to prevent damage and mostly remained down as initial attempts to fix the air conditioning system failed and new parts are being ordered.

We just started the project up to help clear some pipes but some services will remain off, and the rest may go off again at any time. Sorry for the confusion/inconvenience.

15 Sep 2010 17:17:53 UTC

SETI@home _ Server status page
SETI@home _ Message boards _ appear to be back as of this posting.
ID: 34686 · Report as offensive
Bernd
Joined: 24 Aug 09
Posts: 91
United States
Message 34689 - Posted: 16 Sep 2010, 0:46:17 UTC

Looks like SETI is down till tomorrow. The web site is offline again. They did tell us that it might go down again, as final repairs had not been made.
ID: 34689 · Report as offensive
Gary Charpentier
Joined: 23 Feb 08
Posts: 2491
United States
Message 34691 - Posted: 16 Sep 2010, 3:37:36 UTC

SETI
http://setiathome.berkeley.edu/forum_thread.php?id=61442&nowrap=true#1033202
Jeff Cobb writes:
We're taking the project off line for the night, partly (mostly) for temperature concerns, but also to let the back end queues drain and give more I/O to the thumper root mirror re-sync that became necessary.


ID: 34691 · Report as offensive
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15552
Netherlands
Message 34709 - Posted: 16 Sep 2010, 22:17:49 UTC

Seti@Home Update:
Jeff Cobb wrote:
OK then, I think we may be up. The condenser fan motor was replaced today and we brought the projects up. It's good that we can test under load while we're still at the lab.

Assuming that there are no problems, we'll just call this the start of the server run for this week.
ID: 34709 · Report as offensive
BarryAZ
Joined: 4 Sep 09
Posts: 381
United States
Message 34728 - Posted: 17 Sep 2010, 16:57:47 UTC

Dnetc is currently offline -- they are performing a hardware upgrade (moving up to a RAID 5 array). They had shorter outages during the week as they got the hardware physically installed. I'm guessing the current outage is the 'big one' for them -- they have been offline for about 10 hours so far.

This is a planned outage though -- it was announced on the homepage as September 14 to September 18.
ID: 34728 · Report as offensive
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15552
Netherlands
Message 34729 - Posted: 18 Sep 2010, 0:08:36 UTC

Looks like Orbit@Home bit the dust already. From "In production" to "down" in two days. Must be a BOINC record. :)
ID: 34729 · Report as offensive

Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.