Thread 'News on Project Outages'

Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15540
Netherlands
Message 15646 - Posted: 28 Feb 2008, 17:25:00 UTC - in response to Message 15641.  

Access to the Milkyway site also seems to be unavailable.

Whatever it was, they're back now.
ID: 15646
mo.v
Joined: 13 Aug 06
Posts: 778
United Kingdom
Message 15650 - Posted: 2 Mar 2008, 5:12:38 UTC

MikeMarsUK posted in CPDN news on Friday 29 Feb:

Following the recent maintenance earlier this week, the trickle, credit and statistics export daemons have now been restarted on the climateprediction.net servers (the other climate projects have not been affected). Accordingly you should see the lagged trickles appear on the website today, credit should appear tomorrow, and the external statistics sites should be updated with the new credit on the day after tomorrow.


Now, however...

CPDN upload server climateapps3 is not running. This means that some computers running CPDN cannot upload trickles and zip files. The best advice is:

* For CPDN-only computers, suspend BOINC network activity until the server is running again. Your models can continue to crunch.

* For multi-project computers whose network activity can't be suspended: if possible, suspend your HADCM model in the BOINC Manager Tasks window before it reaches its next trickle creation point (the beginning of December) or its next zip file creation point (the beginning of December at the end of a decade). HADSM and HADAM models trickle more frequently and can be suspended immediately. This helps avoid multiple failed upload attempts and messages.

We realise that this is inconvenient for multi-project crunchers.

* Check CPDN server status here.

BBC, SAP and CPDN Beta uploads are not affected.
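
For anyone who prefers the command line to the BOINC Manager, the advice above can be sketched roughly as follows. This is only a sketch, assuming a BOINC client that ships the `boinccmd` tool with its `--set_network_mode` and `--task` options (older clients may name the tool differently). The project URL and task name are placeholders, not real values, and the script only prints the commands rather than running them.

```python
import shlex

# Placeholder values (assumptions): use the project URL and task name
# that your own BOINC client reports.
PROJECT_URL = "http://climateprediction.net/"
TASK_NAME = "hadcm3_example_task"


def suspend_network_cmd():
    """Command that stops all BOINC network activity (CPDN-only computers)."""
    return ["boinccmd", "--set_network_mode", "never"]


def resume_network_cmd():
    """Command that returns network activity to the preference-based mode."""
    return ["boinccmd", "--set_network_mode", "auto"]


def suspend_task_cmd(project_url, task_name):
    """Command that suspends one task (multi-project computers)."""
    return ["boinccmd", "--task", project_url, task_name, "suspend"]


# Print the commands so they can be reviewed before being run by hand.
for cmd in (
    suspend_network_cmd(),
    suspend_task_cmd(PROJECT_URL, TASK_NAME),
    resume_network_cmd(),
):
    print(shlex.join(cmd))
```

Printing instead of executing is deliberate: it lets you check the task name against the Tasks window first.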
ID: 15650
mo.v
Joined: 13 Aug 06
Posts: 778
United Kingdom
Message 15651 - Posted: 2 Mar 2008, 22:56:08 UTC

The CPDN upload servers are all now functional and accepting trickles and zip files. Upload server climateapps3 had filled up so Tolu had to move data to another machine. This other disk is now almost full so Milo will deal with it this week.

So there will probably be at least one further CPDN outage during the next few days. If and when this happens the advice in my last post will again be useful. Some CPDN-only crunchers may prefer to suspend BOINC network activity all week and only enable it for short periods after checking the server status page.

At least none of the servers are broken and there are plans to provide CPDN with much more storage capacity.

Some successful trickles have not yet appeared on our models' web pages which means some members probably still have a credit delay. Many thanks to all CPDN members for your patience.

ID: 15651
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15540
Netherlands
Message 15659 - Posted: 4 Mar 2008, 11:25:35 UTC
Last modified: 4 Mar 2008, 11:25:49 UTC

Orbit@Home is back up after a server change. Their address has changed as well. You need to register again if you haven't already done so, and you need to detach your BOINC client from the old address and re-attach to the new address at http://orbit.psi.edu/oah/
ID: 15659
Pepo
Joined: 3 Apr 06
Posts: 547
Slovakia
Message 15667 - Posted: 5 Mar 2008, 0:44:31 UTC - in response to Message 14362.  

XtremLab is possibly not absolutely dead. They've [...] changed the ETA to "Our server is temporarily closed. It will be reopened in a near futur."

XtremLab's front page is live again, but only to say, among other things, that:
Our project is actually stopped and not collecting new traces. [...] We may resume the project with a new version of the application in the future.


Peter
ID: 15667
mo.v
Joined: 13 Aug 06
Posts: 778
United Kingdom
Message 15669 - Posted: 5 Mar 2008, 2:31:31 UTC


Milo is moving a lot of data around on the CPDN servers this week and will probably continue during the week beginning Mon 10 March. He hopes to do this without closing any CPDN servers down. Connections to the servers may at times be slow.
ID: 15669
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15540
Netherlands
Message 15734 - Posted: 9 Mar 2008, 22:49:47 UTC - in response to Message 15047.  
Last modified: 9 Mar 2008, 22:50:00 UTC

Enigma@Home's outage is a prolonged one.
I posted it in the forums there as well. Today I got this message from TJM:

I came back home yesterday at night, but tomorrow I'm leaving again.

The bad news is that I have no internet connection at all at my temporary "workplace" and that's not going to change anytime soon. For the last 30 days I had almost no free time; I hope it will be better this month. I almost forgot how to type on a keyboard because I haven't touched my PC/laptop since 7th Feb :D

On 28th April I'll move to another place, a bit closer to home. I doubt that I'll be able to resume the project until then. The worst-case scenario is still the end of summer; I'll be back home around 20th September (at least now I know when).

The only good thing is that I'm saving money; when I'm back here I'll upgrade my connection and both servers.


Planned outage message on Enigma@Home:

TJM wrote:
At the beginning of February I'm moving to a place where my access to the internet (and probably my amount of free time) will be very limited (yes, there are such places in Poland :-)). I guess that (at least for the first few weeks) my cellphone + GPRS will be the best connection I'll be able to get there.
That means I'll have to suspend the project for some time, because I can't just leave the servers up and hope that nothing will crash/fail while I'm hundreds of kilometers away without any way to fix problems.
Around 26th January I'll stop the workunit generator, then I'll wait for results to return. The server software will be moved to a temporary location (the same hosting where the project's http reverse-proxy and download mirror are running atm), where I'll be able to keep it online without worrying about hardware failures or longer power outages; however, it will remain suspended for at least a few weeks, until I get a decent internet connection.
I don't know (yet) when I'll be back; I'll post an update on this around mid Feb.

(Source thread)
ID: 15734
Nicolas
Joined: 19 Jan 07
Posts: 1179
Argentina
Message 15755 - Posted: 10 Mar 2008, 17:32:05 UTC - in response to Message 15753.  
Last modified: 10 Mar 2008, 17:44:32 UTC

Seems Proteins@Home has been off the air for a few hours. Result uploads fail, and visits to the home page and forum time out.

I got some workunits with a deadline in the year 1910. I reported it on the forum, and the admin said "I'm checking that. The server is down for now." Minutes later, the website became inaccessible.
ID: 15755
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15540
Netherlands
Message 15758 - Posted: 10 Mar 2008, 17:48:55 UTC - in response to Message 15755.  

Minutes later, the website became inaccessible.

It probably also trans-warped to 1910. Make sure you have your punch cards ready for when it returns. :)
ID: 15758
Nicolas
Joined: 19 Jan 07
Posts: 1179
Argentina
Message 15772 - Posted: 10 Mar 2008, 20:15:25 UTC - in response to Message 15758.  

Minutes later, the website became inaccessible.

It probably also trans-warped to 1910. Make sure you have your punch cards ready for when it returns. :)

I think it was the latest code mistake that affected them (see [boinc_projects] botched checkin).

I still get such insane deadlines, so I set No New Work (NNW) for now.
ID: 15772
mo.v
Joined: 13 Aug 06
Posts: 778
United Kingdom
Message 15839 - Posted: 12 Mar 2008, 19:15:29 UTC

Several CPDN members have reported today that their model zip files have failed to upload even though the CPDN server status shows all green, i.e. up. There is still heavy load on the CPDN servers. The best idea (as usual when this happens) is to suspend BOINC network activity if you can and try again a few hours later or tomorrow. Suspending network activity avoids multiple failed upload attempts and multiple BOINC messages.

We understand that multi-project crunchers may not be able to suspend BOINC network activity.

Thanks to all crunchers for your patience.
ID: 15839
KSMarksPsych
Joined: 30 Oct 05
Posts: 1239
United States
Message 15861 - Posted: 13 Mar 2008, 10:31:38 UTC - in response to Message 15860.  
Last modified: 13 Mar 2008, 10:32:37 UTC

Seems QMC@Home is off the air. No work, no home page, no forum.

13/03/2008 10:41:20|QMC@HOME|Sending scheduler request: Requested by user. Requesting 0 seconds of work, reporting 0 completed tasks
13/03/2008 10:41:25|QMC@HOME|Scheduler request failed: Couldn't connect to server
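
The two client log lines above appear to follow a `timestamp|project|message` layout, which makes failed scheduler requests easy to pick out. Here is a minimal sketch; the field layout and the day/month/year date format are inferred from these two lines only, and the helper names are made up for illustration.

```python
from datetime import datetime
from typing import NamedTuple


class LogLine(NamedTuple):
    when: datetime
    project: str
    message: str


def parse_line(line: str) -> LogLine:
    """Split one 'timestamp|project|message' line into its three fields."""
    # Split on the first two pipes only, so a '|' inside the message survives.
    stamp, project, message = line.strip().split("|", 2)
    # Assumption: dates are day/month/year, as in the sample (13/03/2008).
    return LogLine(datetime.strptime(stamp, "%d/%m/%Y %H:%M:%S"), project, message)


def failed_requests(lines):
    """Return the parsed lines that report a failed scheduler request."""
    parsed = (parse_line(line) for line in lines if line.strip())
    return [p for p in parsed if p.message.startswith("Scheduler request failed")]


SAMPLE = [
    "13/03/2008 10:41:20|QMC@HOME|Sending scheduler request: Requested by user. "
    "Requesting 0 seconds of work, reporting 0 completed tasks",
    "13/03/2008 10:41:25|QMC@HOME|Scheduler request failed: Couldn't connect to server",
]

for entry in failed_requests(SAMPLE):
    print(entry.when.isoformat(), entry.project, entry.message)
```

Run against the sample, this reports only the second line, the one where the client could not connect to the server.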



Martin took the server down for a hardware upgrade. I just checked email. Sorry for the delayed notification.
Kathryn :o)
ID: 15861
mo.v
Joined: 13 Aug 06
Posts: 778
United Kingdom
Message 15888 - Posted: 14 Mar 2008, 3:21:36 UTC

Several CPDN Beta2 servers are down:

http://cpdnbeta.oerc.ox.ac.uk/server_status.php

Although the Beta2 server status page currently shows the upload server as running, some (perhaps all) zip file uploads are failing.
ID: 15888
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15540
Netherlands
Message 15897 - Posted: 14 Mar 2008, 19:56:37 UTC

Planned outage news for Hydrogen@Home:

Jack Shultz wrote:
Since we are moving the project to a better server, I should let the job queue clear. I will stop generating workunits today and allow the current workunits to get credits.

Jack
ID: 15897
mo.v
Joined: 13 Aug 06
Posts: 778
United Kingdom
Message 15948 - Posted: 16 Mar 2008, 14:24:58 UTC

There will be a planned power outage in the Oxford Atmospheric, Oceanic and Planetary Physics department on Tuesday 18 March from 07.00 UTC for about half a day. The servers will need to be powered down on Monday 17 from about 17.00 UTC. Because Milo will be away on Tuesday, he may not be able to reactivate the servers until Wed 19 at approx 09.00 UTC.

The BBC servers will not be affected.
The SAP project website, forum and some uploads will be affected.
The CPDN website, forum and some computers' uploads (trickles and zip file uploads to upload server uploadatm) will be affected.
CPDN stats exports of credit for trickles received should not be affected.
The CPDN independent forum will not be affected. It's a good idea for all members to register there.

The usual advice for outages applies. CPDN and SAP members should if possible suspend BOINC network activity before the outage begins. Some multi-project crunchers may wish to fetch extra work from other projects before the outage begins so they can also suspend network activity.
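
To judge how much extra work to fetch, it helps to know how long the servers may be unreachable. A rough calculation from the times given above, assuming the power-down happens at 17.00 UTC on Monday 17 March and the servers return at about 09.00 UTC on Wednesday 19 March (both approximate, and the restart could be later):

```python
from datetime import datetime, timezone

# Times taken from the announcement above; both are approximate.
POWER_DOWN = datetime(2008, 3, 17, 17, 0, tzinfo=timezone.utc)
EXPECTED_RESTART = datetime(2008, 3, 19, 9, 0, tzinfo=timezone.utc)


def outage_hours(start: datetime, end: datetime) -> float:
    """Length of the outage window in hours."""
    return (end - start).total_seconds() / 3600


hours = outage_hours(POWER_DOWN, EXPECTED_RESTART)
print(f"Servers unreachable for about {hours:.0f} hours ({hours / 24:.1f} days)")
```

That comes to about 40 hours, or roughly 1.7 days, so a multi-project cruncher who wants to keep network activity suspended throughout would need at least that much work from other projects on hand.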

The server status of ClimatePrediction projects can be checked here:

CPDN
http://climateapps2.oucs.ox.ac.uk/cpdnboinc/server_status.php
BBC
http://bbc.cpdn.org/server_status.php
SAP
http://attribution.cpdn.org/server_status.php
ID: 15948
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15540
Netherlands
Message 15992 - Posted: 19 Mar 2008, 8:18:00 UTC

News from Cosmology@Home:

March 18, 2008

Project Temporarily Suspended
We are suspending WU generation until the "Rombint" errors are solved and a new generator is up and running.
ID: 15992
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15540
Netherlands
Message 16002 - Posted: 19 Mar 2008, 19:19:05 UTC

Einstein@Home is off line until they have their database problems fixed.
ID: 16002
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15540
Netherlands
Message 16006 - Posted: 19 Mar 2008, 22:02:44 UTC - in response to Message 16002.  

Einstein@Home is off line until they have their database problems fixed.

And they're back. David Hammer fixed it.
ID: 16006
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15540
Netherlands
Message 16010 - Posted: 20 Mar 2008, 12:42:06 UTC
Last modified: 20 Mar 2008, 12:42:16 UTC

Milkyway@Home news:

March 19, 2008

Work Generation Downtime
We're running a purge on the database to try and get things sped up. During this time we won't be generating any new work, but this should be completed sometime tonight and there will be new work after that.

March 20, 2008

Forums Down for Upgrade
I just upgraded the BOINC server software. The database is a little shaky right now as we're trying to upgrade to PHP5 to facilitate it better. Therefore the forums will be down until we can get PHP5 up, hopefully it won't take more than a day. However, the assimilator problem seems to be fixed with this release and hopefully so will the freezeups.
ID: 16010
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15540
Netherlands
Message 16032 - Posted: 20 Mar 2008, 22:02:06 UTC

With Rytis on an Easter vacation to Amsterdam, it would appear that the Primegrid database thought it was a good time to crash. So the project is temporarily unavailable.
ID: 16032

Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.