Thread 'News on Project Outages'

Blurf
Joined: 18 Jul 11
Posts: 217
United States
Message 40939 - Posted: 3 Nov 2011, 19:39:04 UTC

Milkyway is back up.

Looks like there was a problem with one of the hard drives. We've replaced it and the database appears to be back in RAM, so hopefully we'll have some smooth sailing from here on out.
ID: 40939
Hadrian
Joined: 5 Nov 11
Posts: 2
United Kingdom
Message 40983 - Posted: 5 Nov 2011, 8:45:50 UTC

There are significant problems on POEM.

The system has been changed but is far from working.

Firstly, it is impossible to tell POEM about the problems: I cannot log on to their system because I am told the URL is wrong.

Every work unit has the same characteristics: 347 GFLOPs. On my system this would take two minutes, yet so far it has run for nearly two hours.

I have subsequently aborted another 20 units, as they all say 347 GFLOPs and 49 secs.

No work unit has a proper time estimate; each one says 49 secs, and as it processes, the time remaining goes up, not DOWN.

This really is a very bad implementation of a system change.

Considering we "donate" computer time, this change seems to be a waste of resources.
ID: 40983
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15566
Netherlands
Message 40989 - Posted: 5 Nov 2011, 11:25:33 UTC - in response to Message 40983.  

This thread is for reporting project outages; the POEM project isn't really down.
If you want to rant about them, or ask for help, there's a whole forum outside this thread where you can start a new thread. :-)

In the meantime, to log in on POEM, go to http://boinc.fzk.de/poem/get_passwd.php, fill in your email address, and click the OK button under it. Then go to your email account, open the email from the POEM project, and click the temporary login link in it. This link allows you to log in for 24 hours. Use it to change or redo your password. The old password isn't needed to change to a new password.

After that you are logged in again. Then don't dilly dally, but go to this thread and tell Timo what you see.
ID: 40989
Jesse Viviano
Joined: 14 Feb 11
Posts: 63
United States
Message 41063 - Posted: 8 Nov 2011, 7:29:07 UTC

It looks like the server at ABC@home has gone down. I can't upload anything, make any scheduler requests, or connect to its website.
ID: 41063
Jesse Viviano
Joined: 14 Feb 11
Posts: 63
United States
Message 41064 - Posted: 8 Nov 2011, 7:36:27 UTC - in response to Message 41063.  

It looks like ABC@home is back up again.
ID: 41064
mo.v
Joined: 13 Aug 06
Posts: 778
United Kingdom
Message 41103 - Posted: 10 Nov 2011, 1:51:50 UTC

CPDN main project

CPDN will be taken offline at 9.00 am GMT on Thursday 10 November 2011.

This is to facilitate the relocation of our project servers. It is anticipated that the downtime will be for no longer than 48 hours, but the project should be considered 'at-risk' until noon on 14 November 2011 GMT.

Jonathan

_________________
Jonathan Miller
CPDN SysAdmin

It will still be possible to register and post on the independent forum:

http://climateprediction.net/board/index.php
ID: 41103
NullCoding*
Joined: 10 Jan 11
Posts: 58
United States
Message 41118 - Posted: 11 Nov 2011, 23:17:30 UTC

FreeHAL is MIA. Or AWOL. Can't decide.

From what BOINC and Firefox tell me, the server(s) are down. Odd that all my machines were able to upload their work, but I can't get any more. :|

ID: 41118
BarryAZ
Joined: 4 Sep 09
Posts: 381
United States
Message 41163 - Posted: 14 Nov 2011, 17:29:18 UTC - in response to Message 41103.  

CPDN remains offline (5 PM GMT, 14 November). Not really a surprise; it seems the nature of CPDN servers and environment in general lends itself to '2x' downtime cycles when doing planned maintenance.

Hopefully, we will see some form of update somewhere from one of the folks at CPDN with a revised 'return to life' estimate.

ID: 41163
mo.v
Send message
Joined: 13 Aug 06
Posts: 778
United Kingdom
Message 41178 - Posted: 16 Nov 2011, 2:11:33 UTC
Last modified: 16 Nov 2011, 2:32:41 UTC

Both CPDN and its Beta project are up and running again.

Jonathan worked on the servers through the weekend but was faced with a network problem which couldn't be sorted out until the non-CPDN Oxford Uni IT staff were back at work on Monday.

Some hadam3p regional model files are still unable to upload; this problem will probably be fixed on Wednesday UK time but it will take some time for the large backlog of waiting files to upload.
ID: 41178
Jimmy G (BA)
Joined: 26 Sep 11
Posts: 41
Message 41227 - Posted: 19 Nov 2011, 12:40:27 UTC - in response to Message 41178.  

Both CPDN and its Beta project are up and running again.

Jonathan worked on the servers through the weekend but was faced with a network problem which couldn't be sorted out until the non-CPDN Oxford Uni IT staff were back at work on Monday.

Some hadam3p regional model files are still unable to upload; this problem will probably be fixed on Wednesday UK time but it will take some time for the large backlog of waiting files to upload.


...and, from...

http://climateprediction.net/board/viewtopic.php?f=36&t=5927&start=150#p97158 ...

Re: News

Post by Les Bayliss » Fri Nov 18, 2011 1:12 am
There's still a problem for a few uploads, as the university's IT people haven't yet updated the relevant DNS records for the servers that were moved.
So more thumb twiddling. :roll:
(Can't post to that thread...)

Hi Folks,

FYI: As of a.m. Nov 19 my CPDN Transfers queue is populated with 23 items totaling over 500 MB of .zip files, and my CPDN Tasks queue is populated with 3 completed HADAM3Ps, all waiting for upload. The pile has continued to grow since the Nov 10th "offline". Is this to be expected?

Thankfully my service provider doesn't cap my uploads... :)

Peace,
Jimmy G
ID: 41227
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15566
Netherlands
Message 41228 - Posted: 19 Nov 2011, 12:42:40 UTC

The Albert@Home project (Einstein's test project) will not have any downloads this weekend due to server trouble. Nothing has been said yet about Einstein itself.
ID: 41228
MarkJ
Volunteer tester
Help desk expert
Joined: 5 Mar 08
Posts: 272
Australia
Message 41230 - Posted: 19 Nov 2011, 13:38:43 UTC - in response to Message 41228.  

There is this in their Tech News forum, posted on 18 Nov:

BRP4 & FGRP1 download (server) problems
We are experiencing problems with the download server for BRP4 and FGRP1. We're working on it. During that time we may send out only work for S6Bucket or no work at all.

BM
ID: 41230
Jimmy G (BA)
Joined: 26 Sep 11
Posts: 41
Message 41246 - Posted: 21 Nov 2011, 12:21:36 UTC - in response to Message 41227.  
Last modified: 21 Nov 2011, 12:23:41 UTC

FYI: As of a.m. Nov 19 my CPDN Transfers queue is populated with 23 items totaling over 500 MB of .zip files, and my CPDN Tasks queue is populated with 3 completed HADAM3Ps, all waiting for upload. The pile has continued to grow since the Nov 10th "offline". Is this to be expected?


FYI: As of a.m. Nov 21 everything in my CPDN Transfers queue has uploaded successfully, as has everything in my CPDN Tasks queue. Kudos, CPDN team!

:)
JG
ID: 41246
Ralph
Joined: 30 Sep 05
Posts: 50
Message 41272 - Posted: 22 Nov 2011, 16:50:47 UTC

The Milkyway site is unavailable.
ID: 41272
Blurf
Joined: 18 Jul 11
Posts: 217
United States
Message 41273 - Posted: 22 Nov 2011, 21:17:54 UTC - in response to Message 41272.  

Sorry to get back to y'all so late. I've been in and out of meetings since noon Eastern. I escalated this at 2:30.
ID: 41273
BarryAZ
Joined: 4 Sep 09
Posts: 381
United States
Message 41277 - Posted: 23 Nov 2011, 4:38:26 UTC - in response to Message 41273.  

Here's hoping that things are fixed on Wednesday; otherwise this outage might stretch into Monday. Hope folks have other GPU processing projects in place.
ID: 41277
KAMasud
Joined: 13 Feb 07
Posts: 21
Pakistan
Message 41281 - Posted: 23 Nov 2011, 9:22:58 UTC - in response to Message 41277.  

:) Yes, thank you. Education curve.
ID: 41281
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15566
Netherlands
Message 41283 - Posted: 23 Nov 2011, 13:28:26 UTC

Pirates@Home is in drydock.

22 November 2011 20:18 UTC
    Pirates@Home is undergoing a server software upgrade.
    Check back later to see if we are still hosed.
                          - Captain Wormholio 
ID: 41283
BarryAZ
Joined: 4 Sep 09
Posts: 381
United States
Message 41285 - Posted: 23 Nov 2011, 14:55:31 UTC

And so we begin day two of the Milkyway outage -- since there is no home page access, no information is available regarding the outage. The lack of home page access suggests the possibility of an RPI problem rather than something within the purview of project folks.
ID: 41285
BarryAZ
Joined: 4 Sep 09
Posts: 381
United States
Message 41286 - Posted: 23 Nov 2011, 14:58:21 UTC

SETI, which had not yet recovered from its weekly maintenance outage, is now fully offline. At times it seems their 4-hour maintenance cycle on Tuesdays is more of an 'unmaintenance' cycle, as recovering from the maintenance on occasion takes many times longer than the maintenance itself.
ID: 41286

Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.