Thread 'News on Project Outages'

Message boards : Projects : News on Project Outages

BarryAZ

Joined: 4 Sep 09
Posts: 381
United States
Message 61126 - Posted: 22 Mar 2015, 6:00:00 UTC - in response to Message 61116.  

Collatz was running solid this time for a bit longer than it has over the past several months -- almost 10 days before a breakdown.

In the past 3 months it has 'blown up' 11 times. Worst case was over the new year when it was offline for 12 days. Average down time is around 3 days -- though that varies from one day to 12 days.
ID: 61126

Tigers Dave
Joined: 24 Dec 05
Posts: 52
United States
Message 61129 - Posted: 22 Mar 2015, 18:01:40 UTC - in response to Message 61116.  

Collatz came back up several hours ago.
ID: 61129

BarryAZ
Joined: 4 Sep 09
Posts: 381
United States
Message 61140 - Posted: 23 Mar 2015, 17:35:58 UTC - in response to Message 61129.  

And Collatz went back down overnight again <sigh>
ID: 61140

Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15552
Netherlands
Message 61141 - Posted: 23 Mar 2015, 17:50:43 UTC - in response to Message 61140.  

Should I rename this thread to "News on when Collatz is down"? ;-)
ID: 61141

BarryAZ
Joined: 4 Sep 09
Posts: 381
United States
Message 61142 - Posted: 23 Mar 2015, 18:14:48 UTC - in response to Message 61141.  

Perhaps we need a one key macro.

I've taken to considering an easier copy and paste post.

"Collatz is down again"


"Collatz is up"


From the traffic that's over on Collatz when it is up, the underlying cause remains a mystery. So the project will continue to ping pong.

As for me, it really has resulted in my running Collatz on a short leash, with available alternatives increasingly becoming my primaries.
ID: 61142

Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15552
Netherlands
Message 61148 - Posted: 24 Mar 2015, 13:45:19 UTC
Last modified: 24 Mar 2015, 13:45:34 UTC

CPDN news:

Jonathan Miller, CPDN wrote:
Hi Chaps,

Following a brief outage yesterday, we are having issues with the database server, and other VMs.

The project is effectively down, and I can't yet say what is running or not.

More info as I get it.

Jonathan Miller, CPDN, 2 hours later wrote:
We appear to be back up and running.

Yesterday we had a problem with the VMs' data store, which seems to have cascaded into today's problem with the database server's drives.
Both are now fixed.

We also had a problem with the networking of the database server, and we have worked around that.
ID: 61148

Bill Walker
Joined: 13 Dec 07
Posts: 24
Canada
Message 61212 - Posted: 25 Mar 2015, 23:27:13 UTC

This was posted Monday on the Constellation Questions and Answers forum:

"We had a server crash resulting in a DB crash. At the moment, the server is now hosted on emergency hardware with a slower internet connection. We are working to get a new server and move it back to the data center."
ID: 61212

Blurf
Joined: 18 Jul 11
Posts: 217
United States
Message 61226 - Posted: 26 Mar 2015, 20:30:18 UTC

WuProp was down last night but is up now.
ID: 61226

Tigers Dave
Joined: 24 Dec 05
Posts: 52
United States
Message 61235 - Posted: 27 Mar 2015, 11:12:42 UTC
Last modified: 27 Mar 2015, 11:12:58 UTC

Collatz is back up.
ID: 61235

Skivelitis2
Joined: 8 Nov 14
Posts: 11
United States
Message 61393 - Posted: 4 Apr 2015, 16:42:46 UTC

Collatz down again.
ID: 61393

Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15552
Netherlands
Message 61506 - Posted: 9 Apr 2015, 17:56:19 UTC

Seti@Home is out of work, due to database trouble.

Matt Lebofsky, project scientist, wrote in the news here:
So! All the recent headaches are due to continuing issues with the master science database. While all the data seems to be intact, there's something fundamentally wrong causing informix to keep hanging up (usually when we are continuing work on reconnecting the fragmented result tables).

During the previous recent crashes I would clean up what informix was complaining about in various error messages (always the result table indexes) but this time I'm in the process of doing a comprehensive check of everything in the database just to be sure. And, in fact, I'm seeing minor problems that I've been able to clean up thus far (once again, no loss in data - just internal bookkeeping and broken index issues).

I thought this full check would be done by now (ha ha) but it's not even close. Meanwhile we *should* be able to do Astropulse work, but the software blanking engine requires the master science database to do some integrity checks, so that is all offline as well.

There are ways to speed up such events in the future. We're working on enacting several improvements. Yes, we here are all beyond tired of our project grinding to a halt and things will change for the better.

- Matt

He also wrote earlier on the 8th of April:
Hey - just so y'all know it's the science database again. This time enough is enough and I'm doing a comprehensive set of integrity checks on everything in that database before starting it up again. So no MB splitting or assimilating. Eventually some work will show up for AP splitting in the meantime...

Might be back up by the end of the work day, if not shortly after that. I did find one problem which was obscured while checking things out after previous crashes, so there's hope.

- Matt
ID: 61506

Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15552
Netherlands
Message 61509 - Posted: 9 Apr 2015, 22:07:38 UTC

More from Matt at Seti:
Due to recent problems we are doing a deep cleaning of one of our larger databases. This is unfortunately progressing slower than expected, and keeping us from generating new work. We might be in this state until next week.
ID: 61509

Dr Who Fan
Joined: 10 May 07
Posts: 1439
United States
Message 61510 - Posted: 9 Apr 2015, 23:11:15 UTC

Whoa what happened here at the BOINC BOARDS...????? NOT Reachable for past 15-20 minutes.
ID: 61510

Gary Charpentier
Joined: 23 Feb 08
Posts: 2491
United States
Message 61519 - Posted: 10 Apr 2015, 3:57:38 UTC - in response to Message 61510.  

Whoa what happened here at the BOINC BOARDS...????? NOT Reachable for past 15-20 minutes.

Known issue. Change from http:// to https://
ID: 61519

Dr Who Fan
Joined: 10 May 07
Posts: 1439
United States
Message 61520 - Posted: 10 Apr 2015, 4:10:07 UTC - in response to Message 61519.  

http vs https was not the problem. Site was inaccessible.

My Chrome web browser automatically selects https when it's available, via an extension called HTTPS Everywhere.

Link to the Chrome Store extension: "Encrypt the Web! Automatically use HTTPS security on many sites."
HTTPS Everywhere is a Firefox, Chrome, and Opera extension that encrypts your communications with many major websites, making your browsing more secure.
ID: 61520

Dr Who Fan
Joined: 10 May 07
Posts: 1439
United States
Message 61978 - Posted: 29 Apr 2015, 4:13:53 UTC

Collatz server(s) have choked & gone into "down mode" again
ID: 61978

Richard Haselgrove
Volunteer tester
Help desk expert
Joined: 5 Oct 06
Posts: 5124
United Kingdom
Message 61984 - Posted: 29 Apr 2015, 15:45:40 UTC

ID: 61984

Brent
Joined: 30 Mar 10
Posts: 14
United States
Message 62076 - Posted: 7 May 2015, 22:38:25 UTC

What is up with the Collatz project? It has been like a Yo-Yo the last couple of weeks, up-down-up-down-up-down!!!
ID: 62076

BarryAZ
Joined: 4 Sep 09
Posts: 381
United States
Message 62077 - Posted: 7 May 2015, 22:50:06 UTC - in response to Message 62076.  

Brent, Collatz has been in 'yo-yo mode' for the past few *months*.

What is going on there is that it is a one-person project with limited resources, and it has also seen quite a lot of user population growth over the past month: over 3000 new users.

Frankly, I've been surprised that it has been as stable as it has been over the past several weeks. I expected it to revert to the serious ping pong it was going through from November through March.
ID: 62077

BarryAZ
Joined: 4 Sep 09
Posts: 381
United States
Message 62114 - Posted: 11 May 2015, 18:13:46 UTC

GPUGrid went unreachable about 3 hours ago. (8AM PDT)
ID: 62114

Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.