News on Project Outages

Dave
Help desk expert

Joined: 28 Jun 10
Posts: 2517
United Kingdom
Message 105051 - Posted: 12 Aug 2021, 17:27:43 UTC - in response to Message 105048.  

But their having to roll back the network changes, on top of overrunning the scheduled time, has me joining your verdict on the IT staff at Oxford.
ID: 105051
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5077
United Kingdom
Message 105053 - Posted: 12 Aug 2021, 17:29:52 UTC - in response to Message 105051.  

Staff, or outside contractors?
ID: 105053
Dave
Help desk expert

Joined: 28 Jun 10
Posts: 2517
United Kingdom
Message 105055 - Posted: 12 Aug 2021, 18:39:41 UTC - in response to Message 105053.  

Richard Haselgrove wrote:
Staff, or outside contractors?


I have no idea but the level of incompetence is the same.
ID: 105055
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5077
United Kingdom
Message 105126 - Posted: 14 Aug 2021, 11:16:06 UTC

Andy reports that a new problem - tentatively identified as a hardware failure - has been observed on the CPDN 'dev' (test) servers. He has shut down "the project" to minimise data loss, and fears that this closure may last for several days.

The main, production, CPDN server is also reporting 'shut down for maintenance'. I have asked Andy to clarify whether both versions of the project need to be shut down because of the single hardware failure, and am awaiting his reply.
ID: 105126
Jim1348

Joined: 8 Nov 10
Posts: 310
United States
Message 105129 - Posted: 14 Aug 2021, 14:30:45 UTC - in response to Message 105126.  

I have been caught twice now, once with a stuck download that I ended up aborting.
And now I have two N144 running, but they will complete tomorrow with no immediate prospects of upload.

But not all is lost. The ARP machine is running fine.
ID: 105129
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5077
United Kingdom
Message 105137 - Posted: 15 Aug 2021, 10:42:40 UTC

Andy has confirmed that both the main CPDN project and the test project have been deliberately stopped as a result of the suspected hardware failure. Further news will be posted as it becomes available.
ID: 105137
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5077
United Kingdom
Message 105194 - Posted: 19 Aug 2021, 10:21:00 UTC

Further news from CPDN:

Andy Bowery wrote:
Just to let you know the affected RAM board of the dev site has now been removed and both the dev and main projects are now back online.
I have been able to report overdue trickles.
ID: 105194
Dave
Help desk expert

Joined: 28 Jun 10
Posts: 2517
United Kingdom
Message 105196 - Posted: 19 Aug 2021, 10:42:52 UTC - in response to Message 105194.  

Trickles uploaded here and six completed tasks reported. 48 minutes till I can confirm downloads are working again.
ID: 105196
Dave
Help desk expert

Joined: 28 Jun 10
Posts: 2517
United Kingdom
Message 105228 - Posted: 24 Aug 2021, 8:18:24 UTC

Not strictly an outage, but Rosetta seems to have run out of tasks to send. At some point yesterday I stopped getting tasks for my phone, and this morning I found their server status page showing 0 tasks unsent.
ID: 105228
Harri Liljeroos

Joined: 25 Jul 18
Posts: 62
Finland
Message 105263 - Posted: 26 Aug 2021, 14:29:59 UTC
Last modified: 26 Aug 2021, 15:29:42 UTC

The LHC@home website is down. Notices in BOINC Manager report some identification/authentication service problems for CMS tasks, and generation of those has been stopped. That may be the reason why their forums cannot be reached.
[edit]There is currently a power outage in the CERN computer centre. LHC@home services may be affected.
ID: 105263
Harri Liljeroos

Joined: 25 Jul 18
Posts: 62
Finland
Message 105264 - Posted: 26 Aug 2021, 17:38:33 UTC - in response to Message 105263.  

Harri Liljeroos wrote:
The LHC@home website is down. Notices in BOINC Manager report some identification/authentication service problems for CMS tasks, and generation of those has been stopped. That may be the reason why their forums cannot be reached.
[edit]There is currently a power outage in the CERN computer centre. LHC@home services may be affected.

Everything seems to be back to normal.
ID: 105264
Keith Myers
Volunteer tester
Help desk expert
Joined: 17 Nov 16
Posts: 863
United States
Message 105311 - Posted: 3 Sep 2021, 16:51:40 UTC

Universe@home is down for scheduled network maintenance. Possibly not returning until Monday.
ID: 105311
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5077
United Kingdom
Message 105346 - Posted: 8 Sep 2021, 8:37:55 UTC

CPDN downloads are failing again.
ID: 105346
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5077
United Kingdom
Message 105347 - Posted: 8 Sep 2021, 10:24:45 UTC

Andy Bowery has noticed the problem:

The project has had to be taken offline. New RAM was installed on one of the key servers. This new RAM has unfortunately failed. It is anticipated that this downtime will last until tomorrow.

I've replied to say:

Could you please check that all essential servers are working and accessible from the internet before re-starting the project? A half-way restart is very wasteful of unissued tasks.
ID: 105347
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15477
Netherlands
Message 105362 - Posted: 9 Sep 2021, 16:47:24 UTC - in response to Message 105347.  

Andy Bowery wrote:
Hi All,

The offending RAM has now been removed and the main project and dev projects have now been started again and are up and running.

Best regards,

Andy
ID: 105362
Harri Liljeroos

Joined: 25 Jul 18
Posts: 62
Finland
Message 105465 - Posted: 23 Sep 2021, 18:03:54 UTC

The CPDN website seems to be down.
ID: 105465
Les Bayliss
Help desk expert

Joined: 25 Nov 05
Posts: 1654
Australia
Message 105466 - Posted: 23 Sep 2021, 19:11:37 UTC - in response to Message 105465.  

It appears to be a network issue at the department level.
ID: 105466
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5077
United Kingdom
Message 105467 - Posted: 23 Sep 2021, 20:56:18 UTC - in response to Message 105466.  

Les Bayliss wrote:
It appears to be a network issue at the department level.
Not the Department of Engineering IT Support again?
ID: 105467
Les Bayliss
Help desk expert

Joined: 25 Nov 05
Posts: 1654
Australia
Message 105468 - Posted: 24 Sep 2021, 2:07:17 UTC - in response to Message 105467.  

Whoever it is, there's been an Oops.
ID: 105468
KAMasud

Joined: 13 Feb 07
Posts: 21
Pakistan
Message 105475 - Posted: 25 Sep 2021, 9:44:33 UTC - in response to Message 105468.  

Les Bayliss wrote:
Whoever it is, there's been an Oops.

___________________________

How come the Oooops is always on a weekend?
ID: 105475

Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.