News on Project Outages

Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15233
Netherlands
Message 14962 - Posted: 17 Jan 2008, 2:07:38 UTC
Last modified: 17 Jan 2008, 2:23:52 UTC

Looks like TSP did an update that has gone horribly wrong. Replace with project name?

Ah, it's moving over to the new server. Will keep checking the situation.

From TSP yesterday: Good news. After sending an email to the project list, David Anderson informed me that I can just stop the project, move everything over and start the project on the new server, and there is no need for canceling any WUs. Hence that is the new plan: stop the project, change the nameservers over, migrate the files, and restart the project on the new server. I'll start tonight 9pm ay.....

And today:
I'm going to stop the project today, change over the name servers, and restart the project on Friday. Let's hope this works.
TSP
ID: 14962
zombie67
Joined: 14 Feb 06
Posts: 136
United States
Message 14965 - Posted: 17 Jan 2008, 8:09:11 UTC - in response to Message 14962.  

Give the man a break. ;)

He's on an island that shares a dual 128k, and that's over cellular. It will take about a day to get everything transferred to the new server.
Reno, NV
Team: SETI.USA
ID: 14965
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15233
Netherlands
Message 15022 - Posted: 18 Jan 2008, 13:43:47 UTC

TSP is back from the server move.

Markus wrote:
2008-01-18
If you are reading this it means you are looking at the new server and long-time host for TSP. A special thanks to Tank_Master for the fire under my pants to get it done, and xname.org for DNS, #centOS for help with said DNS, David Anderson for the migration tips, and of course DrMcLovein for all his help and generous spirit. Things are still a bit buggy at the moment but the major tests have been passed; now it's just some cleanup work. Thank you all for your patients. New workunits after I get some sleep.


TSP, the new hospital site!
ID: 15022
mo.v
Joined: 13 Aug 06
Posts: 778
United Kingdom
Message 15038 - Posted: 19 Jan 2008, 20:02:22 UTC

A few hours ago (Saturday) the CPDN upload server uploadatm was down temporarily but it's now up again.


However, the CPDN servers have been problematic for over a month and this situation may continue for some time. Even when all the CPDN servers are up and show as normal on the CPDN server status page, they have been under stress because of large amounts of data extraction for the researchers and because the disks contain so much data. There may be other undiagnosed problem(s). Tolu and Milo have various plans for the CPDN servers but cannot implement them all yet.

As a result of the current situation:

* The CPDN-BOINC forum is sometimes difficult to access and we see a 'too many connections' message. Try later or use the independent forum instead.

* Sometimes the servers temporarily refuse to accept trickles and zip file uploads. BOINC has a rich variety of descriptions for this problem. You may see messages such as:

HTTP error
HTTP internal server error
Project communication failed
Scheduler request failed: failed sending data to the peer
Project servers may be temporarily down

These messages can appear even when all the servers seem normal on the server status page. The best idea is to suspend BOINC network activity if possible and try again later.
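For anyone who keeps a saved copy of the BOINC message log, the messages above can be picked out with a short script. What follows is only a minimal sketch in Python, not anything that ships with BOINC. It assumes each log line has the pipe-separated `timestamp|project|message` layout, and the names `TRANSIENT_MARKERS`, `parse_line`, `is_transient_failure` and `transient_failures` are made up for illustration.

```python
# Hypothetical helper, not part of BOINC: flags log lines that look like
# the temporary server problems listed above.
# Assumption: each line is "timestamp|project|message".

TRANSIENT_MARKERS = (
    "http error",
    "http internal server error",
    "project communication failed",
    "scheduler request failed",
    "project servers may be temporarily down",
)


def parse_line(line):
    """Split one log line into (timestamp, project, message), or return None."""
    parts = line.strip().split("|", 2)
    if len(parts) != 3:
        return None
    timestamp, project, message = (part.strip() for part in parts)
    return timestamp, project, message


def is_transient_failure(message):
    """True if the message contains one of the known outage phrases."""
    lowered = message.lower()
    return any(marker in lowered for marker in TRANSIENT_MARKERS)


def transient_failures(lines):
    """Return the parsed lines whose message looks like a temporary outage."""
    hits = []
    for line in lines:
        parsed = parse_line(line)
        if parsed is not None and is_transient_failure(parsed[2]):
            hits.append(parsed)
    return hits
```

Matching is by case-insensitive substring, so a longer message such as "Scheduler request failed: failed sending data to the peer" is still caught. If the script finds hits, that is a hint to suspend network activity and try again later, not proof that a server is down.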

There have also been delays recently with the appearance of our successful trickles on our models' project web pages.

These problems do not affect the progress of our models and we are all receiving correct credits. Only the CPDN servers are affected, not BBC or SAP.

Many thanks to all CPDN crunchers for your patience!

ID: 15038
mo.v
Joined: 13 Aug 06
Posts: 778
United Kingdom
Message 15046 - Posted: 20 Jan 2008, 15:13:00 UTC


It's possible that sometimes a CPDN server shows on the server status page as down when in fact it's running! So in future I will only announce server outages when we have evidence that they're real....
ID: 15046
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15233
Netherlands
Message 15047 - Posted: 20 Jan 2008, 15:20:53 UTC

Planned outage message on Enigma@Home:

TJM wrote:
At the beginning of February I'm moving to a place where my access to the internet (and probably the amount of free time) will be very limited (yes, there are such places in Poland :-)). I guess that (at least for the first few weeks) my cellphone + GPRS will be the best connection I'll be able to get there.
That means I'll have to suspend the project for some time, because I can't just leave the servers up and hope that nothing will crash or fail while I'm hundreds of kilometers away without any way to fix problems.
Around 26th January I'll stop the workunit generator, then I'll wait for results to return. The server software will be moved to a temporary location (the same hosting where the project's HTTP reverse proxy and download mirror are running at the moment), where I'll be able to keep it online without worrying about hardware failures or longer power outages; however, it will remain suspended for at least a few weeks, until I get a decent internet connection.
I don't know (yet) when I'll be back; I'll post an update on this around mid-February.

(Source thread)
ID: 15047
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15233
Netherlands
Message 15158 - Posted: 29 Jan 2008, 23:01:18 UTC
Last modified: 29 Jan 2008, 23:01:28 UTC

Looks like Einstein has some forum/DB problems. All threads give back a message like "Thread with id xxxx created but nothing returned from DB layer" when clicked on. As if they're hidden... Perhaps they are. ;-)

Anyway, waiting for EAH to repair it and tell what it was.
ID: 15158
KSMarksPsych
Joined: 30 Oct 05
Posts: 1239
United States
Message 15160 - Posted: 30 Jan 2008, 12:13:34 UTC - in response to Message 15158.  

Jord wrote:
Anyway, waiting for EAH to repair it and tell what it was.

Should be fixed.

It was a few corrupt database tables.

Kathryn :o)
ID: 15160
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15233
Netherlands
Message 15411 - Posted: 14 Feb 2008, 2:11:42 UTC

Artificial Intelligence System is still down. Now with an updated message on the site, though.

Dear Participant,

We are experiencing technical issues with the server. We will be back online in a couple of days. Because the web site is now hosted on our internal network, the http://www.intrealm.com/aisystem beta test web site was detected by the BOINC client.

It wasn't our intention to make that public. The master URL is http://www.intelligencerealm.com/aisystem. We do not intend to change that. We also have a database backup from 2 days before the crash, and once the server is retrieved we look forward to restoring the latest data and files. The beta test web site had an older database and older web pages. We apologize for the confusion.

We have retrieved the database and the rest of the files. We will continue to investigate but right now it appears that there is no data loss.

On a positive note we would like to mention that we have finalized the new work generator and generated over 20,000 WUs. We have also finalized the development of the Windows client. We will be testing it over the next couple of days.

We are looking into various options so that we can minimize the server downtime in the future. The project will continue as planned.

This crash is a minor issue. The real challenge will begin when we have to generate networks of neurons consisting of hundreds of parameters that need to be optimized. With a couple thousand machines we should overcome that one too.

Thank you.
ID: 15411
KSMarksPsych
Joined: 30 Oct 05
Posts: 1239
United States
Message 15418 - Posted: 14 Feb 2008, 11:02:36 UTC

QMC@Home is currently down doing some server upgrades.
Kathryn :o)
ID: 15418
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15233
Netherlands
Message 15430 - Posted: 14 Feb 2008, 23:26:24 UTC
Last modified: 21 Feb 2008, 13:25:48 UTC

Predictor@Home is back. Their server is, at least.
ID: 15430
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15233
Netherlands
Message 15502 - Posted: 20 Feb 2008, 16:40:28 UTC - in response to Message 15411.  
Last modified: 20 Feb 2008, 20:28:13 UTC

Jord wrote:
Artificial Intelligence System is still down. Now with an updated message on the site, though.

They are back up, albeit very slow.

Read their news on http://www.intelligencerealm.com/aisystem/news.php
ID: 15502
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15233
Netherlands
Message 15514 - Posted: 22 Feb 2008, 15:26:37 UTC - in response to Message 15502.  
Last modified: 23 Feb 2008, 12:59:55 UTC

Artificial Intelligence System will take their server down for a few hours today.

Dear Participant,

The site will be down today for a couple of hours while we move the servers to the hosting company.

WUs that finish during this time can be reported later on.

Thank you.
ID: 15514
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15233
Netherlands
Message 15533 - Posted: 23 Feb 2008, 13:00:15 UTC - in response to Message 15514.  
Last modified: 23 Feb 2008, 13:01:06 UTC

Jord wrote:
Artificial Intelligence System will take their server down for a few hours today.

They're back, with a nippy web site.
ID: 15533
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15233
Netherlands
Message 15535 - Posted: 24 Feb 2008, 0:23:16 UTC

BOINC Trac is inaccessible to most of us for the time being. Account creation has been disabled, and those who have an account cannot log in. Apparently the server deleted the password file. Until the file has been restored, Trac is offline.
ID: 15535
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15233
Netherlands
Message 15583 - Posted: 26 Feb 2008, 0:16:00 UTC - in response to Message 15535.  

BOINC Trac is back. The password file has been recovered successfully.
ID: 15583
mo.v
Joined: 13 Aug 06
Posts: 778
United Kingdom
Message 15587 - Posted: 26 Feb 2008, 3:51:36 UTC


Some of the CPDN main project servers are down for maintenance.

http://climateapps2.oucs.ox.ac.uk/cpdnboinc/server_status.php

The CPDN-BOINC forum and our CPDN account and model web pages are not accessible. Trickles and zip file uploads may fail and produce messages like

26/02/2008 00:50:30|climateprediction.net|Message from server: Project is temporarily shut down for maintenance

The best idea is to suspend BOINC network activity until the CPDN servers are up again, though we realise that this is not always possible for multi-project crunchers. The BBC, SAP and Beta2 servers are all functioning normally.

ID: 15587
mo.v
Joined: 13 Aug 06
Posts: 778
United Kingdom
Message 15601 - Posted: 26 Feb 2008, 18:40:57 UTC

Tolu worked all Monday and a good part of the night on unscheduled server maintenance, mostly moving large volumes of data. All the CPDN servers are again fully functional.

ID: 15601
mo.v
Joined: 13 Aug 06
Posts: 778
United Kingdom
Message 15638 - Posted: 28 Feb 2008, 5:21:54 UTC
Last modified: 28 Feb 2008, 5:22:22 UTC

Following the recent CPDN main project server outage there are still delays in successful CPDN trickles appearing on our models' web pages and our credits being reported to the stats sites. We cannot at the moment say exactly how long these delays will last.

These delays do not affect the running of our models.

Thanks to all crunchers for your continued patience.
ID: 15638
Ralph
Joined: 30 Sep 05
Posts: 50
Message 15641 - Posted: 28 Feb 2008, 12:44:57 UTC

Hi,

"28/02/2008 7:39:29 AM|Milkyway@home|Started upload of gs_271_1204187078_13659_0_0
28/02/2008 7:39:50 AM||Project communication failed: attempting access to reference site
28/02/2008 7:39:51 AM||Access to reference site succeeded - project servers may be temporarily down."

Access to the Milkyway site also seems to be unavailable.

ID: 15641

Copyright © 2023 University of California. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.