Thread 'World Community Grid has announced an extended outage from Feb 14 to April 22, 2022'

PMH_UK
Joined: 24 Dec 10
Posts: 36
United Kingdom
Message 110140 - Posted: 20 Oct 2022, 9:14:37 UTC - in response to Message 110136.  

Workunits downloading, with usual re-tries.
Website down with Error 500 (was up for a while last night).

Paul.
ID: 110140
Grumpy Swede
Joined: 30 Mar 20
Posts: 421
Sweden
Message 110150 - Posted: 20 Oct 2022, 13:41:35 UTC

20.5 hours later, and the site is still down. Not a peep from the team on Facebook or Twitter.
And of course, the BOINC part is slow on downloads, and full of HTTP errors as usual.

But in 11 days that won't matter, for me at least. On Nov 1st my new electricity contract begins,
and with a cost per kWh up to 5 times higher than now, crunching will be the last thing on my mind.
ID: 110150
Grumpy Swede
Joined: 30 Mar 20
Posts: 421
Sweden
Message 110152 - Posted: 20 Oct 2022, 15:34:29 UTC
Last modified: 20 Oct 2022, 15:35:15 UTC

Finally, after almost 23 hours downtime, they posted the following on Facebook:

"We are currently suffering a communication issue between the website and our database,
resulting in an inability to log in properly (or access the website in some browsers).
We apologize and will notify you when it has been resolved."
ID: 110152
Grumpy Swede
Joined: 30 Mar 20
Posts: 421
Sweden
Message 110157 - Posted: 21 Oct 2022, 6:57:43 UTC
Last modified: 21 Oct 2022, 7:02:36 UTC

Well, the site came back for a few hours, but again it went down after the staff abandoned their workplace, and went home.
No reaction on Facebook, which clearly shows that they do not have anyone checking the projects 24/7, or use some
automatic means to get a message when things go wrong.

Not good for such a big project as WCG. If things don't improve soon, WCG will not continue to be a big project though...
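
For readers wondering what "some automatic means to get a message when things go wrong" might look like, here is a minimal sketch in Python. It is not anything WCG is known to run. The function names `probe` and `check_and_alert`, the `notify` callback, and the idea of running it on a timer are all assumptions made for illustration. It fetches a page once and reports a failure, such as the Error 500 seen earlier in this thread, through whatever notifier is passed in.

```python
from urllib.error import HTTPError
from urllib.request import urlopen


def probe(url, opener=urlopen, timeout=10):
    """Fetch url once and return (ok, detail). Hypothetical helper."""
    try:
        with opener(url, timeout=timeout) as response:
            status = response.status
    except HTTPError as err:
        # urlopen raises HTTPError for 4xx/5xx replies, e.g. an Error 500.
        return False, f"HTTP {err.code}"
    except OSError as err:
        # Covers URLError, refused connections, DNS failures and timeouts.
        return False, f"unreachable ({err})"
    return 200 <= status < 300, f"HTTP {status}"


def check_and_alert(url, notify=print, opener=urlopen):
    """Call notify(message) when the site does not answer normally."""
    ok, detail = probe(url, opener=opener)
    if not ok:
        notify(f"ALERT: {url} is down ({detail})")
    return ok
```

Run from a scheduler every few minutes, with `notify` replaced by something that sends a mail or a chat message, this would flag an outage without anyone having to watch the site by hand.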
ID: 110157
Richard Haselgrove
Volunteer tester
Help desk expert
Joined: 5 Oct 06
Posts: 5129
United Kingdom
Message 110158 - Posted: 21 Oct 2022, 9:16:12 UTC

And there's no sign of any new work today, either - except the occasional one or two stragglers limping in.

I hope Krembil are aware - but I fear that they are not - that for a successful scientific research project under BOINC, all the moving parts have to be working at the same time. Work has to be generated, allocated, downloaded, processed, uploaded, validated, and assimilated. The entire pipeline moves at the speed of the slowest. And if one piece breaks, they're all broken.
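
The point that the pipeline moves at the speed of its slowest part can be shown with a toy calculation. The sketch below is purely illustrative: the stage names come from the list above, but the capacities are invented numbers, not WCG measurements, and `pipeline_throughput` is a hypothetical helper.

```python
# Hypothetical capacity of each stage in workunits per hour.
# The figures are made up for illustration only.
stage_rates = {
    "generate": 500,
    "allocate": 450,
    "download": 40,
    "process": 300,
    "upload": 350,
    "validate": 400,
    "assimilate": 420,
}


def pipeline_throughput(rates):
    """Return (bottleneck_stage, rate) for stages that run in series.

    Sustained throughput of a serial pipeline cannot exceed the rate
    of its slowest stage, so the minimum rate is the answer.
    """
    bottleneck = min(rates, key=rates.get)
    return bottleneck, rates[bottleneck]


stage, rate = pipeline_throughput(stage_rates)
print(f"Bottleneck: {stage} at {rate} workunits/hour")
```

Setting any single stage to 0 makes the whole pipeline's throughput 0, which is the "if one piece breaks, they're all broken" case.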
ID: 110158
Grumpy Swede
Joined: 30 Mar 20
Posts: 421
Sweden
Message 110159 - Posted: 21 Oct 2022, 9:35:17 UTC

I did get 19 new (_0) OPNG tasks from batch 0160806 at 09:45 UTC+2, but they are crunched, and gone. The last two uploaded and reported at 10:39 UTC+2.
Since then, nothing, nada, not even any resends.

Krembil/Jurisica really have to get their act together, or just throw in the towel, and admit that they're not up to running WCG.
ID: 110159
PMH_UK
Joined: 24 Dec 10
Posts: 36
United Kingdom
Message 110160 - Posted: 21 Oct 2022, 10:42:01 UTC - in response to Message 110159.  

I just got some more _0 OPNG with no retries, 7 of 8 slots filled.

Paul.
ID: 110160
Grumpy Swede
Joined: 30 Mar 20
Posts: 421
Sweden
Message 110162 - Posted: 21 Oct 2022, 14:36:47 UTC

After my 19 OPNG tasks at 09:45 UTC+2, I've only received 2 (_2) OPNG resends (not asking for CPU tasks).

Website still down. Apart from fixing their inability to tame the IBM WCG system, they really need to work
on their communication skills. Not a peep from them on their social media sites, regarding this latest kaboom.
ID: 110162
Jim1348
Joined: 8 Nov 10
Posts: 310
United States
Message 110163 - Posted: 21 Oct 2022, 16:02:37 UTC - in response to Message 110162.  

"Website still down. Apart from fixing their inability to tame the IBM WCG system, they really need to work
on their communication skills. Not a peep from them on their social media sites, regarding this latest kaboom."

No one in a large organization communicates bad results, only good ones.
ID: 110163
Grumpy Swede
Joined: 30 Mar 20
Posts: 421
Sweden
Message 110164 - Posted: 21 Oct 2022, 16:22:04 UTC

New post from WCG on Facebook:

Access to the WCG website is restricted for a short time while we work to expand the storage capacity of the WCG. This should not prevent workunits from being sent out.
This will help identify the underlying system issues and crashes that we have been having trouble with for the past few weeks.
ID: 110164
Grumpy Swede
Joined: 30 Mar 20
Posts: 421
Sweden
Message 110165 - Posted: 21 Oct 2022, 18:23:17 UTC

And the website is up again. However I doubt it will stay up for many hours.
No tasks (OPNG) for many hours now.
ID: 110165
Grumpy Swede
Joined: 30 Mar 20
Posts: 421
Sweden
Message 110170 - Posted: 22 Oct 2022, 11:30:55 UTC
Last modified: 22 Oct 2022, 11:32:34 UTC

Poof, and website down again. Rinse and repeat.....
It's quite obvious by now, that they haven't got a clue about why this happens time after time.
ID: 110170
Grumpy Swede
Joined: 30 Mar 20
Posts: 421
Sweden
Message 110177 - Posted: 22 Oct 2022, 17:37:48 UTC

Geez, that was surprising. Someone in the WCG team is actually working on a Saturday.
The website is back up again. But of course, I do not expect it to last for long.
ID: 110177
PMH_UK
Joined: 24 Dec 10
Posts: 36
United Kingdom
Message 110203 - Posted: 24 Oct 2022, 15:42:52 UTC - in response to Message 110177.  

Website down, scheduler still responding.

Paul.
ID: 110203
Dr Who Fan
Joined: 10 May 07
Posts: 1444
United States
Message 110256 - Posted: 27 Oct 2022, 17:04:21 UTC

Updated "news" from WCG
2022-10-27 Update (Workunits & storage update)
Hi everyone, we’re happy to see that volunteers are receiving more OPN1 workunits than last week. We recently increased our DB2 storage pool and switched to a more coarse-grained scheduling method for creating and packaging new workunits for each project. This change may have temporarily disrupted WU scheduling, but we will need to monitor further and likely explore additional possible causes before we can consider the issue resolved.

Another (less optimistic) theory is that other tasks, specifically OPNG, were the cause of our recent storage issues and database-wide system errors. We have no solid evidence yet, only an observation that there is typically a decline in available OPNG work around the same time the download issues are less prevalent. A high load on the storage server and scheduler coincide with the database crashes and a phenomenon whereby the download/upload server groups intermittently register as down from the perspective of our load balancer.

We continue to monitor the system to determine what the best course of action is to stabilize our internal network.

Thank you for your support, patience and understanding.

WCG team at Krembil Research Institute

[Oct 27, 2022 8:39:27 AM]
ID: 110256
Grumpy Swede
Joined: 30 Mar 20
Posts: 421
Sweden
Message 110308 - Posted: 3 Nov 2022, 15:24:21 UTC
Last modified: 3 Nov 2022, 15:25:24 UTC

And the WCG website just crashed again.
ID: 110308
PMH_UK
Joined: 24 Dec 10
Posts: 36
United Kingdom
Message 110309 - Posted: 3 Nov 2022, 15:30:03 UTC - in response to Message 110308.  

Scheduler still responding, so far.

Paul.
ID: 110309
Dave
Help desk expert
Joined: 28 Jun 10
Posts: 2706
United Kingdom
Message 110311 - Posted: 3 Nov 2022, 16:26:48 UTC - in response to Message 110308.  

Website back up but still no ARP work.
ID: 110311
Grumpy Swede
Joined: 30 Mar 20
Posts: 421
Sweden
Message 110319 - Posted: 4 Nov 2022, 18:40:04 UTC

And down again.
ID: 110319
Richard Haselgrove
Volunteer tester
Help desk expert
Joined: 5 Oct 06
Posts: 5129
United Kingdom
Message 110320 - Posted: 4 Nov 2022, 19:10:25 UTC

And the download servers are playing hard to get.
ID: 110320

Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.