Thread 'News on Project Outages'

Bernd
Joined: 24 Aug 09
Posts: 91
United States
Message 38624 - Posted: 22 Jun 2011, 3:49:28 UTC

Collatz seems to have dropped off the net. Last summer he had power problems; it looks like they may have started again this year.
ID: 38624
BarryAZ
Joined: 4 Sep 09
Posts: 381
United States
Message 38625 - Posted: 22 Jun 2011, 4:04:12 UTC - in response to Message 38624.  

Right -- they had rain there; it could also be his ISP. Both seem to work to 21st century American rather than European standards. You know the deal: infrastructure is a dirty word.

ID: 38625
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15569
Netherlands
Message 38627 - Posted: 22 Jun 2011, 6:22:50 UTC - in response to Message 38625.  

Oh damn, I hope it had nothing to do with the request I sent Jon, on giving one of the users a new authenticator key. ;-)
ID: 38627
BarryAZ
Joined: 4 Sep 09
Posts: 381
United States
Message 38630 - Posted: 22 Jun 2011, 15:47:56 UTC

Good news -- Collatz is back online. Apparently the storms killed off a NIC on the firewall. Things are back in place there for now.

Dnet and Aqua remain vapor so far today -- 4 days for Dnet, 2 days for Aqua.
ID: 38630
BarryAZ
Joined: 4 Sep 09
Posts: 381
United States
Message 38632 - Posted: 22 Jun 2011, 17:01:05 UTC

I've heard or seen nothing regarding the extended Dnet and Aqua outages. With Dnet, that is really not all that surprising: they have very thin resources and are not all that good at communications.

With Aqua though, that is surprising -- I was hoping to hear something about their status by now from some source....
ID: 38632
BarryAZ
Joined: 4 Sep 09
Posts: 381
United States
Message 38634 - Posted: 22 Jun 2011, 21:42:06 UTC - in response to Message 38632.  

Collatz has a few more issues -- the home page is up, but the databases are offline at the moment (2:40 PDT) and have been for about half an hour so far.

Dnet and Aqua remain totally vaporized at the moment.
ID: 38634
Peter
Joined: 7 Sep 09
Posts: 167
Canada
Message 38635 - Posted: 22 Jun 2011, 21:47:35 UTC

Lattice is back up.
ID: 38635
BarryAZ
Joined: 4 Sep 09
Posts: 381
United States
Message 38636 - Posted: 22 Jun 2011, 22:36:46 UTC - in response to Message 38634.  

Collatz is back up for now -- Slicker has been working on it (there were feeder issues).
ID: 38636
Bernd
Joined: 24 Aug 09
Posts: 91
United States
Message 38638 - Posted: 23 Jun 2011, 2:18:32 UTC - in response to Message 38635.  

And Lattice has work, too. Good news all around, I say. I hope they have time in the morning to post a news update on what happened.
ID: 38638
Peter
Joined: 7 Sep 09
Posts: 167
Canada
Message 38647 - Posted: 23 Jun 2011, 12:32:12 UTC - in response to Message 38638.  

Yes, I got 1 WU. It would be interesting to know what happened.
ID: 38647
BarryAZ
Joined: 4 Sep 09
Posts: 381
United States
Message 38648 - Posted: 23 Jun 2011, 15:40:00 UTC

Dnet and Aqua remain vapor for another day with no information regarding their apparent demise.
ID: 38648
BarryAZ
Joined: 4 Sep 09
Posts: 381
United States
Message 38655 - Posted: 23 Jun 2011, 23:51:15 UTC

Is there any source of project outage status aside from here that one might use to check on project status?
ID: 38655
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15569
Netherlands
Message 38656 - Posted: 24 Jun 2011, 2:03:07 UTC - in response to Message 38655.  

What, like http://www.esea.dk/esea/boincschedulers.php? I don't think anyone else took it up after Bruno left the scene. So the list there isn't updated anymore. However, BOINCStats has an updated list, one that you can even drag up on the screen, if you want to see it first.
ID: 38656
BarryAZ
Joined: 4 Sep 09
Posts: 381
United States
Message 38658 - Posted: 24 Jun 2011, 2:54:02 UTC - in response to Message 38656.  

No, not so much that -- status is not that difficult. I mean that when a project is offline (Dnet and Aqua, for example) and there is an absolute informational void as to what is going on with them, some additional information would be a good thing. In both cases their home page is not available, so they are REALLY vapor at the moment.

With Dnet, there is at least a chance that they are totally gone -- there have been serious issues there over the past several months. In any event they have been vapor for working on 6 days.

With Aqua though, their vapor status is really a surprise -- they have been pretty solid for a long time. They have been vapor now working on 4 days. For Aqua, I think they had some sort of disk or RAID array problem and things have gone from bad to worse for them over the past few days.

But with both projects, that is really pure supposition -- I don't know the admins to email them, and there has been absolutely nothing regarding what their problems are or when (if) they will be able to resolve them.


ID: 38658
Peter
Joined: 7 Sep 09
Posts: 167
Canada
Message 38662 - Posted: 24 Jun 2011, 17:23:43 UTC - in response to Message 38658.  

I discovered an email address online for someone at Aqua, so I tried it, tongue in cheek... it worked, and this is what Neil said:

Hi Peter!

I'm Neo on the AQUA@home forum. Unfortunately, I ran a maintenance
script in the wrong directory on the AQUA@home server (on Monday, I
think), which then deleted a few critical BOINC config files instead of
the old files it was supposed to delete. After that, pretty much
everything slowly stopped working, and I didn't figure out what had
happened until the next day. To avoid more problems, we've turned the
server off until Boinc Admin gets back from vacation on Monday. He'd
probably (hopefully) know enough of what was in those configuration
files that we can start from a very old backup and make the necessary
updates to those. If not, there's a very small chance that we can recover
the deleted files.

Morals of the story:
* Don't assume that there are recent backups
* Scripts that delete files en masse should require the user to specify
the directory or use an absolute address instead of assuming the current
directory

Hopefully we can recover, and hopefully people aren't too angry at us.
Sorry for the inconvenience.

Sincerely,
Neil
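
Neil's second moral can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the script that was actually run on the AQUA@home server: the function names, the age-based rule for what counts as an "old" file, and the dry-run default are all made up for the example. The point is that the target directory must be passed in explicitly as an absolute path, so nothing depends on whatever the current directory happens to be.

```python
import time
from pathlib import Path
from typing import List, Optional


def find_old_files(directory: str, max_age_days: float,
                   now: Optional[float] = None) -> List[Path]:
    """Return files directly inside `directory` older than `max_age_days`.

    Hypothetical helper: relative paths are rejected so the result never
    depends on the current working directory.
    """
    root = Path(directory)
    if not root.is_absolute():
        raise ValueError(f"refusing relative path {directory!r}; pass an absolute directory")
    if not root.is_dir():
        raise ValueError(f"{directory!r} is not an existing directory")
    cutoff = (time.time() if now is None else now) - max_age_days * 86400
    return sorted(p for p in root.iterdir()
                  if p.is_file() and p.stat().st_mtime < cutoff)


def delete_old_files(directory: str, max_age_days: float,
                     dry_run: bool = True) -> List[Path]:
    """List old files in `directory`; delete them only when dry_run is False."""
    victims = find_old_files(directory, max_age_days)
    if not dry_run:
        for path in victims:
            path.unlink()
    return victims
```

Calling delete_old_files with its default dry_run=True only reports what would go, so the list can be checked before anything is removed. It does nothing for the first moral: a recent backup is still needed.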

ID: 38662
BarryAZ
Joined: 4 Sep 09
Posts: 381
United States
Message 38663 - Posted: 24 Jun 2011, 18:58:42 UTC - in response to Message 38662.  

Thanks much for the information regarding Aqua -- looks like they may be in recovery mode next week.
ID: 38663
Peter
Joined: 7 Sep 09
Posts: 167
Canada
Message 38664 - Posted: 24 Jun 2011, 19:02:56 UTC - in response to Message 38663.  
Last modified: 24 Jun 2011, 19:07:34 UTC

Recovery Mode...yes indeed, and that's putting it mildly. He says that once they're up and running again, any work units with imminent deadlines will be extended so they aren't lost. I'm lucky, mine aren't due until August. Let's hope it's fixed long before then...LOL.

Edit: I forgot to add that it might be an idea to suspend Aqua in the meantime so Boinc doesn't waste time trying to contact Aqua's servers.
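
For anyone who would rather script the suspend than click through the manager, here is a rough Python sketch around the boinccmd command-line tool, which accepts "--project URL suspend" and "--project URL resume". The helper names are made up, and the URL in the usage note is a placeholder, not Aqua's real master URL; use the one shown for the project in your own client. Depending on the setup, boinccmd may also need to be run from the BOINC data directory or given the GUI RPC password.

```python
import subprocess
from typing import List

# Subset of boinccmd project operations this sketch allows.
ALLOWED_OPS = {"suspend", "resume", "update"}


def boinccmd_project_args(project_url: str, operation: str) -> List[str]:
    """Build the argument list for `boinccmd --project URL OPERATION`."""
    if operation not in ALLOWED_OPS:
        raise ValueError(f"unsupported operation: {operation!r}")
    return ["boinccmd", "--project", project_url, operation]


def run_project_op(project_url: str, operation: str) -> int:
    """Run boinccmd and return its exit code (boinccmd must be on PATH)."""
    return subprocess.run(boinccmd_project_args(project_url, operation)).returncode
```

Usage would be something like run_project_op("http://example.invalid/aqua/", "suspend") now, and the same call with "resume" once the servers are back.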
ID: 38664
Byron Leigh Hatch @ team Carl ...
Joined: 30 Aug 05
Posts: 505
Canada
Message 38665 - Posted: 24 Jun 2011, 23:21:03 UTC - in response to Message 38608.  

ClimatePrediction.net News and Announcements


To make it a bit more "official", I'm re-posting part of one of my posts from another thread:

It must also be kept in mind, as was probably posted a long time ago now, that there were only about 3,000 models issued for the spinup part of the hadcm3n series. And each subsequent batch will be the next phase, automatically generated from those in the first batch that completed.
Some of them will have been aborted, some abandoned, and some will still be running on computers with low resources for this project.
So the number issued for the 2nd batch will be less than 3,000. And the project's front page says that there are currently 34,812 active computers. Down a fair bit from the 40,000+ of a few weeks ago, but still a less than 1 in 10 chance of getting a hadcm3n.

There'll be a better chance when the next lot of regional models get released, but that won't happen until after the ongoing problems with the RAPIT lot are sorted out.

Basically, there's little to no work from this project at the moment.


Les Bayliss
Forum moderator
ID: 38665
BarryAZ
Joined: 4 Sep 09
Posts: 381
United States
Message 38674 - Posted: 26 Jun 2011, 5:24:24 UTC - in response to Message 38664.  

Yup -- I did that early in the week. I've other CPU projects to pick up the slack (Einstein, Spinhenge, POEM, plus some others).



ID: 38674
ritterm
Joined: 4 Jul 08
Posts: 82
United States
Message 38676 - Posted: 26 Jun 2011, 12:42:18 UTC

Any news on EDGeS? I've been getting "server can't open database" and "feeder not running" messages for at least 2 days.
ID: 38676

Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.