Thread 'News on Project Outages'

Peter
Joined: 7 Sep 09
Posts: 167
Canada
Message 52114 - Posted: 23 Jan 2014, 22:08:18 UTC - in response to Message 52113.  

You mean CPDN? There are two, climate@home and climateprediction@home.
ID: 52114
Thyme Lawn
Joined: 2 Sep 05
Posts: 103
United Kingdom
Message 52131 - Posted: 24 Jan 2014, 23:44:04 UTC
Last modified: 24 Jan 2014, 23:44:36 UTC

malariacontrol.net went offline at around 0900 UTC on Friday 24 January and will be down for at least this weekend. All pages are currently generating the following message:

Site is temporary unavailable.

Sorry, we had a major failure! We're dealing with it and should be back next week, but the server will be down at least over this weekend.

We apologize for any inconvenience.
ID: 52131
MarkJ
Volunteer tester
Help desk expert
Joined: 5 Mar 08
Posts: 272
Australia
Message 52134 - Posted: 25 Jan 2014, 3:00:56 UTC
Last modified: 25 Jan 2014, 3:01:40 UTC

Asteroids@home scheduled outage
There is going to be another server upgrade on Monday 27.01.2014 starting at 07:30 UTC. The RAID array and filesystem will have to be adjusted, so it will take more time. The server will be offline for approximately 3 days. I expect it to be back online on Wednesday 29.01.2014 by 11:00 UTC. If there is any further delay, I will announce it on Twitter.

Radim Vančo (Kyong)
ID: 52134
Blurf
Joined: 18 Jul 11
Posts: 217
United States
Message 52210 - Posted: 29 Jan 2014, 2:08:33 UTC

climateprediction.net still down. No website, nada
ID: 52210
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15572
Netherlands
Message 52211 - Posted: 29 Jan 2014, 2:27:27 UTC - in response to Message 52210.  

Their VM crashed and they haven't got a clue what caused it or how to fix that.

Jonathan Miller wrote:
There has been what appears to be a complete failure of the Virtual Machines that host our websites and upload servers.

We have had no information about what might be wrong, nor any indication of how long it will be before they are back.
ID: 52211
David Ball
Joined: 2 Dec 06
Posts: 69
United States
Message 52212 - Posted: 29 Jan 2014, 6:10:20 UTC - in response to Message 52131.  

Any news on when Malariacontrol will be back online? All these WU trying to upload or waiting to report are a pain, especially as they are now overdue. Since projects that have big crashes often have to restore from a backup and lose the record of outstanding WU, I'm wondering whether to reset the project to try to get rid of them but I'm not exactly sure what a project reset from BOINC manager would do if the project is unreachable.
ID: 52212
Peter
Joined: 7 Sep 09
Posts: 167
Canada
Message 52214 - Posted: 29 Jan 2014, 11:24:28 UTC - in response to Message 52212.  

We're all in the same boat. I've just suspended the project for now so that, at least, BOINC won't keep polling it for new work. I'm going to hang on to my completed work for now. It shouldn't be long now, I hope.

David Ball wrote:
Any news on when Malariacontrol will be back online? All these WU trying to upload or waiting to report are a pain, especially as they are now overdue. Since projects that have big crashes often have to restore from a backup and lose the record of outstanding WU, I'm wondering whether to reset the project to try to get rid of them but I'm not exactly sure what a project reset from BOINC manager would do if the project is unreachable.
ID: 52214
Claggy
Joined: 23 Apr 07
Posts: 1112
United Kingdom
Message 52220 - Posted: 29 Jan 2014, 21:33:58 UTC - in response to Message 52096.  

He's just posted an update -- not good news -- figure about 2 weeks before the server is rebuilt and online:

The Collatz server is currently down due to a hard drive controller failure. New hardware should arrive in the next day or two. A complete rebuild of the operating system, BOINC, and the Collatz software and project setup will then need to be done prior to the project coming back online. As this will be a complete rebuild, there is no sense hanging on to any completed workunits at this time. So, go ahead and abort any workunits you have whether they are complete or not. I expect the project will remain down for the next 10-15 days while I set it all back up and test it. Sorry for any inconvenience. And yes, I will make up for the lost credits once everything is back up and running.


I just checked back to the Collatz site -- no response at all again. That might actually be good news -- as for the past day or so the place holder saying 'server down, don't know when it will be back live' was showing.

The inference is that it is being worked on.

The latest Collatz update:

Collatz Status

1/23/2014
• All hard drives replaced
• Debian 7 installed
• BOINC software prerequisites installed.
• DOCSIS 3.0 modem upgrade - speed now 3-5 times faster
• Firewall upgrade (now uses iptables)

1/28/2014
• Get BOINC Source code
• Compile default BOINC server software
• Configure MySQL
• Restore Collatz database
• Write custom app to "undump" stats files back into database to restore the latest user, host, and team stats
• Identify oldest non-complete workunit (a.k.a. don't allow gaps due to the lost WUs)
• Cancel all existing workunits (I am not going to manually create WU files this time.)


1/29/2014
• Compile all server daemons
• Compile OpenCL for CPU and GPU apps


Still To Do...
• Modify web pages with Collatz changes
• Verify all daemons and cron jobs are working properly
• Configure database backup and off-site storage
• Modify the BOINC scheduler for Collatz specific apps/plan classes
• Configure Web Stats
• Add the Collatz client apps (new solo apps for CPUs!!!)
• Test each app and platform
• ...all the other things I haven't listed


Claggy
ID: 52220
BarryAZ
Joined: 4 Sep 09
Posts: 381
United States
Message 52224 - Posted: 30 Jan 2014, 5:18:01 UTC - in response to Message 52114.  

CPND -- Climate Prediction -- been offline for a couple of days

You mean CPDN? There are two, climate@home and climateprediction@home.
ID: 52224
Peter
Joined: 7 Sep 09
Posts: 167
Canada
Message 52226 - Posted: 30 Jan 2014, 17:14:45 UTC - in response to Message 52224.  

CPDN is back up.

MalariaControl is still down

Orbit@Home is still offline.
ID: 52226
Gary Charpentier
Joined: 23 Feb 08
Posts: 2497
United States
Message 52227 - Posted: 30 Jan 2014, 17:32:01 UTC

ABCLL@Home seems to have gone MIA. (Project hasn't had work for some time).
ID: 52227
David Ball
Joined: 2 Dec 06
Posts: 69
United States
Message 52232 - Posted: 31 Jan 2014, 1:58:21 UTC

Volpex ( http://volpex.cs.uh.edu/VCP/ ) hasn't had work for a while and now the website has been down for about 4 days. Does anyone know if the project is down permanently?
ID: 52232
Dr Who Fan
Joined: 10 May 07
Posts: 1451
United States
Message 52234 - Posted: 31 Jan 2014, 2:22:40 UTC - in response to Message 52232.  

No. If I recall correctly, the admins posted recently that they are working on moving the project to a new, more reliable server but still have a few bugs to work out.

It has never been at 100% uptime, even when work was available.
ID: 52234
Warped
Joined: 25 Aug 08
Posts: 40
South Africa
Message 52335 - Posted: 4 Feb 2014, 16:35:07 UTC

Malariacontrol.net remains down.

Now I see Rosetta is also offline:
http://www.downforeveryoneorjustme.com/http://boinc.bakerlab.org/rosetta/
ID: 52335
Warped
Joined: 25 Aug 08
Posts: 40
South Africa
Message 52337 - Posted: 4 Feb 2014, 19:23:11 UTC - in response to Message 52335.  

The site is showing signs of life. I must have missed this message:
Jan 29, 2014
We will be moving the hardware that supports Rosetta@Home on Tuesday, Feb 4th within the datacenter here at the UW. This will require that the entire project be taken offline for the duration. While we will try to minimize the down-time, we are planning for the system being offline from 0800 PST until 1500 PST on the fourth. -KEL & DOVA
ID: 52337
Peter
Joined: 7 Sep 09
Posts: 167
Canada
Message 52351 - Posted: 5 Feb 2014, 14:50:41 UTC - in response to Message 52335.  
Last modified: 5 Feb 2014, 15:04:17 UTC

Warped wrote:
Malariacontrol.net remains down.

Now I see Rosetta is also offline:
http://www.downforeveryoneorjustme.com/http://boinc.bakerlab.org/rosetta/


Both websites are back in operation.

physics@home remains offline.
ID: 52351
Gary Charpentier
Joined: 23 Feb 08
Posts: 2497
United States
Message 52360 - Posted: 5 Feb 2014, 17:52:11 UTC

Looks like Seti has crashed.
ID: 52360
David S
Joined: 15 Jan 13
Posts: 766
United States
Message 52372 - Posted: 5 Feb 2014, 20:01:53 UTC - in response to Message 52360.  

Gary Charpentier wrote:
Looks like Seti has crashed.

Including Beta.
ID: 52372
David S
Joined: 15 Jan 13
Posts: 766
United States
Message 52390 - Posted: 6 Feb 2014, 0:22:08 UTC

On Seti's home page:

The machine that serves the boinc database crashed a few hours ago. The project is down until the database recovery is complete.

Oscar, the BOINC database server, showed as online when I checked, but it has been reported going up and down in the "Seti is down Cafe" thread in the Lounge forum.
ID: 52390
David S
Joined: 15 Jan 13
Posts: 766
United States
Message 52393 - Posted: 6 Feb 2014, 1:59:11 UTC

Seti is back, in case no one noticed.
ID: 52393

Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.