Thread 'Anything and Everything to do with (WCG) World Community Grid'

kasdashdfjsah
Joined: 29 Jan 24
Posts: 60
Message 115478 - Posted: 21 Feb 2025, 8:55:08 UTC

still no tasks available
ID: 115478
Dave
Help desk expert
Joined: 28 Jun 10
Posts: 2789
United Kingdom
Message 115481 - Posted: 21 Feb 2025, 10:20:11 UTC - in response to Message 115478.  

In reply to kasdashdfjsah's message of 21 Feb 2025:
still no tasks available

The last task I got was about 14 hours ago, an ARP resend. Until they release the next batch of ARP, resends are about all we are likely to get. I have my system set to take only ARP, so I don't know about MCM.
ID: 115481
Dave
Help desk expert
Joined: 28 Jun 10
Posts: 2789
United Kingdom
Message 115482 - Posted: 21 Feb 2025, 11:26:40 UTC - in response to Message 115478.  

In reply to kasdashdfjsah's message of 21 Feb 2025:
still no tasks available

ARP141s now released.
ID: 115482
Doug
Joined: 11 Mar 22
Posts: 4
Message 115490 - Posted: 22 Feb 2025, 18:25:47 UTC

Hi all,

Does anyone have any actual info on whether there will ever be any more OPN work? There seems to be no useful info on the WCG site.

Thanks.

Doug
ID: 115490
[CSF] Aleksey Belkov
Joined: 3 Mar 23
Posts: 15
Russia
Message 115491 - Posted: 22 Feb 2025, 23:13:17 UTC - in response to Message 115490.  
Last modified: 22 Feb 2025, 23:13:37 UTC

In reply to Doug's message of 22 Feb 2025:
Does anyone have any actual info on whether there will ever be any more OPN work?

Hi
Obviously, you should ask that on the WCG forum.
ID: 115491
kasdashdfjsah
Joined: 29 Jan 24
Posts: 60
Message 115512 - Posted: 28 Feb 2025, 18:33:35 UTC - in response to Message 115482.  

Still no tasks, for over a week now on my M4 Mac mini
ID: 115512
kasdashdfjsah
Joined: 29 Jan 24
Posts: 60
Message 115539 - Posted: 2 Mar 2025, 21:01:37 UTC - in response to Message 115512.  

In reply to kasdashdfjsah's message of 28 Feb 2025:
Still no tasks, for over a week now on my M4 Mac mini


Update:

Now getting 10 concurrent WCG tasks again, but only from the Mapping Cancer Markers subproject, not from the Africa Rainfall Project or OpenPandemics - COVID-19 subprojects. Didn't get anything from those before, but still.
ID: 115539
Dave
Help desk expert
Joined: 28 Jun 10
Posts: 2789
United Kingdom
Message 115547 - Posted: 4 Mar 2025, 10:02:10 UTC

And now I'm getting a "Feeder not running" error from WCG.
ID: 115547
PMH_UK
Joined: 24 Dec 10
Posts: 50
United Kingdom
Message 115549 - Posted: 4 Mar 2025, 14:27:10 UTC - in response to Message 115547.  

From https://www.cs.toronto.edu/~juris/jlab/wcg.html
March 4, 2025

Services seem to be down. We are working on identifying and fixing the issue.
Paul.
ID: 115549
PMH_UK
Joined: 24 Dec 10
Posts: 50
United Kingdom
Message 115550 - Posted: 4 Mar 2025, 19:22:41 UTC - in response to Message 115549.  

March 4, 2025

Services seem to be down. We are working on identifying and fixing the issue.
BOINC db node crashed. Thus, all running BOINC services, API services and message queues that need to talk to db01 die similarly; the connection is closed, although the node itself is still running.
10:38 am ET: Crash recovery starting now. We should be able to restart all the services soon.
12:21 pm ET: crash recovery successful; bounced all services; restarted the feeder; should start to see work going out again.
Paul.
ID: 115550
Dave
Help desk expert
Joined: 28 Jun 10
Posts: 2789
United Kingdom
Message 115551 - Posted: 5 Mar 2025, 13:45:20 UTC

March 5, 2025

The system seems to be down (again) - we will investigate.
https://www.cs.toronto.edu/~juris/jlab/wcg.html
I managed to get one ARP task this morning before it all fell over again. I can't get onto any of the user pages on their site currently, but BOINC seems to be contacting the server OK. I just can't change my project settings to allow MCM tasks. Getting "no tasks available" for Africa Rainfall Project at the moment.
ID: 115551
Grumpy Swede
Joined: 30 Mar 20
Posts: 451
Sweden
Message 115554 - Posted: 5 Mar 2025, 19:34:43 UTC

Still down, but the BOINC part is working. I'm getting MCM tasks.
ID: 115554
PMH_UK
Joined: 24 Dec 10
Posts: 50
United Kingdom
Message 115555 - Posted: 5 Mar 2025, 20:01:46 UTC - in response to Message 115554.  

March 5, 2025

The DHCP lease issue, or whatever the root cause of our production VMs losing all network access at an increasing rate such that we are almost sure to experience a server crash multiple times a week, is being investigated by hosting.
Our plan to resolve this regardless of the outcome of the investigation is to fully migrate most production boxes to Kubernetes including the DB2, Websphere, and IBM MQ "axis" of the website/forums and webservices provided by WCG.
Previously, we had provisioned only QA on the Kubernetes cluster. We intended to further provision and deploy containers running Mesos workers as our first production boxes orchestrated by Kubernetes on the new hardware, to blue/green deploy, and eventually to move the coordinator responsibilities and finally all workunit management pipeline responsibilities to Kubernetes running Mesos. That would give us fault tolerance at last as we pick apart all the old Mesos job descriptions and crontabs to fully migrate to Kubernetes, Slurm, Redpanda, and distributed Postgres (Citus Data).
Once finished, we will decommission Aurora/Mesos and the old CentOS 7 boxes that run and coordinate the Mesos cluster, and provision new VMs with an LTS version of Ubuntu, as we have on the new hardware, to add that capacity to the Kubernetes cluster.
We apologize for the delays to the start of the MAM project. We did not account for the Sword of Damocles hanging over every production server and falling with increasing frequency, nor for the reduced capacity of the environment in the new year. Thank you for your patience and understanding; we will be starting MAM shortly, as soon as we are through this issue.
Paul.
ID: 115555
Dr Who Fan
Joined: 10 May 07
Posts: 1490
United States
Message 115556 - Posted: 5 Mar 2025, 22:14:05 UTC - in response to Message 115555.  

The latest update on WCG:
4:27pm ET: db02 is back online; the website is back up; the DHCP agents were flushed on the old WCG nodes, which should resolve the issue

I can confirm the website is back online.
ID: 115556

Copyright © 2025 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.