Anything and Everything to do with (WCG) World Community Grid

Grumpy Swede
Joined: 30 Mar 20
Posts: 376
Sweden
Message 111394 - Posted: 23 Mar 2023, 19:18:41 UTC

Some people now seem to have been able to at least upload their finished work. Not so lucky here though, at least not yet.
https://www.worldcommunitygrid.org/forums/wcg/viewpostinthread?post=683363
ID: 111394
[CSF] Aleksey Belkov
Joined: 3 Mar 23
Posts: 14
Russia
Message 111395 - Posted: 23 Mar 2023, 21:56:54 UTC - in response to Message 111392.  

Threaten me with nukes?

You got me!
LOL (=
ID: 111395
Grumpy Swede
Joined: 30 Mar 20
Posts: 376
Sweden
Message 111399 - Posted: 24 Mar 2023, 0:26:40 UTC

Apparently, there was a short period today when at least the upload server was running, since some people managed to upload some of their finished tasks.
Or their tasks simply went into the wide blue yonder, never to be seen again.

But:
Last update host XML 2023-02-28 13:11:22 UTC (22 days 20:59:04 old)
Last update user XML 2023-03-01 01:21:01 UTC (22 days 08:49:25 old)
Last update team XML 2023-03-01 01:21:01 UTC (22 days 08:49:25 old)
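For anyone wanting to reproduce those "old" figures, here is a minimal Python sketch of the age arithmetic. The function name `export_age` is made up for illustration, and the snapshot time used in the example is back-computed from the ages quoted above, not stated anywhere in the post.

```python
from datetime import datetime, timezone


def export_age(last_update: str, now: datetime) -> str:
    """Format the time elapsed since last_update (UTC) as 'D days HH:MM:SS'."""
    stamp = datetime.strptime(last_update, "%Y-%m-%d %H:%M:%S").replace(tzinfo=timezone.utc)
    total = int((now - stamp).total_seconds())
    days, rest = divmod(total, 86400)      # whole days, leftover seconds
    hours, rest = divmod(rest, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{days} days {hours:02d}:{minutes:02d}:{seconds:02d}"


# Assumed snapshot time, inferred from the quoted ages (not given in the post).
snapshot = datetime(2023, 3, 23, 10, 10, 26, tzinfo=timezone.utc)
print(export_age("2023-02-28 13:11:22", snapshot))  # host XML
print(export_age("2023-03-01 01:21:01", snapshot))  # user and team XML
```

Both the host and the user/team ages point to the same snapshot moment, so the figures look like they come from one cached stats page rather than a live check.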

ID: 111399
Dr Who Fan
Joined: 10 May 07
Posts: 1348
United States
Message 111400 - Posted: 24 Mar 2023, 1:11:45 UTC

WuProp website & project servers UNREACHABLE AGAIN !!!!!
Been down about 45 minutes
ID: 111400
Robokapp
Joined: 8 Mar 23
Posts: 10
Message 111404 - Posted: 24 Mar 2023, 5:00:46 UTC - in response to Message 111399.  
Last modified: 24 Mar 2023, 5:01:09 UTC

There's a good chance these technical issues will result in loss of data. WUs completed months or years ago could be lost - and I bet they'll never tell us.
ID: 111404
Grumpy Swede
Joined: 30 Mar 20
Posts: 376
Sweden
Message 111408 - Posted: 24 Mar 2023, 12:38:54 UTC - in response to Message 111404.  

There's a good chance these technical issues will result in loss of data. WUs completed months or years ago could be lost - and I bet they'll never tell us.
Yes, I have also thought about that. That's why I wrote about "the wide blue yonder". And I agree that it's likely that they will never say anything about that.
ID: 111408
robsmith
Volunteer tester
Help desk expert
Joined: 25 May 09
Posts: 1283
United Kingdom
Message 111409 - Posted: 24 Mar 2023, 13:44:35 UTC - in response to Message 111408.  

While they may not know about an individual user losing data, the way BOINC works on the server makes sure that a task sent out but never returned is sent out to another user. This may be a bit hard on the individual user, but the science data is pretty well protected.
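As a rough illustration of that safety net, here is a simplified Python model of deadline-based reissue. It is not BOINC's actual server code: the `Workunit` class, its fields, and `reissue_overdue` are hypothetical names, and real projects also deal with quorums, validation, and error limits that are left out here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Workunit:
    """Toy stand-in for a server-side work unit (all fields hypothetical)."""
    name: str
    target_results: int = 1                               # results the project wants back
    returned: int = 0                                     # results received so far
    deadlines: List[float] = field(default_factory=list)  # deadlines of copies still out


def reissue_overdue(wu: Workunit, now: float, deadline_span: float) -> int:
    """Drop copies whose deadline has passed and send out replacements.

    Returns the number of new copies created.
    """
    still_out = [d for d in wu.deadlines if d >= now]
    # Copies needed = wanted - already returned - still in the field and on time.
    needed = max(0, wu.target_results - wu.returned - len(still_out))
    wu.deadlines = still_out + [now + deadline_span] * needed
    return needed


wu = Workunit("wu_example", deadlines=[100.0])
print(reissue_overdue(wu, now=50.0, deadline_span=200.0))   # copy still on time
print(reissue_overdue(wu, now=150.0, deadline_span=200.0))  # deadline missed, new copy goes out
```

The point of the sketch is only that the work unit survives even when one volunteer's copy never comes back; the volunteer's credit for that copy is a separate matter.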
ID: 111409
Grumpy Swede
Joined: 30 Mar 20
Posts: 376
Sweden
Message 111410 - Posted: 24 Mar 2023, 18:20:12 UTC - in response to Message 111409.  

While they may not know about an individual user losing data, the way BOINC works on the server makes sure that a task sent out but never returned is sent out to another user. This may be a bit hard on the individual user, but the science data is pretty well protected.
A bit hard yes. I have 10 tasks ready and waiting to be uploaded. If I lose those, my life will be over, and I might as well off myself :-)
ID: 111410
Sir LanDroid
Joined: 7 Apr 13
Posts: 63
United States
Message 111411 - Posted: 24 Mar 2023, 19:03:12 UTC
Last modified: 24 Mar 2023, 19:04:20 UTC

on a quick update, finally, /science filesystem is on the move to the new storage from the recovery storage unit. As of last night, after 3 hours, the new storage /science filesystem shows 1.4TB used. Assuming such average rate of file transfer, it will take about 74 hours. Hopefully, we will be able to restart BOINC from the new storage and finally put the failure behind us. We will keep you posted.

sincerely
igor
[Mar 24, 2023 11:18:51 AM]
https://www.worldcommunitygrid.org/forums/wcg/viewthread_thread,44980_offset,140
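The numbers in that update can be sanity-checked with a few lines of Python. This is only a sketch under two assumptions that the post does not confirm: that the copy rate stays constant, and that "about 74 hours" means the total duration rather than the time remaining. The function names are made up for illustration.

```python
def implied_total_tb(copied_tb: float, elapsed_h: float, estimated_total_h: float) -> float:
    """Total data volume implied by a constant copy rate and a total-time estimate."""
    rate_tb_per_h = copied_tb / elapsed_h
    return rate_tb_per_h * estimated_total_h


def remaining_hours(total_tb: float, copied_tb: float, elapsed_h: float) -> float:
    """Hours still needed at the observed rate."""
    rate_tb_per_h = copied_tb / elapsed_h
    return (total_tb - copied_tb) / rate_tb_per_h


# Figures from the quoted update: 1.4 TB after 3 hours, about 74 hours overall.
total = implied_total_tb(1.4, 3.0, 74.0)
print(f"implied /science size: {total:.1f} TB")                        # roughly 34.5 TB
print(f"hours left at that rate: {remaining_hours(total, 1.4, 3.0):.0f}")
```

Under those assumptions the /science filesystem would be in the region of 34.5 TB, copied at a little under 0.5 TB per hour.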


That's an update from a few hours ago, from someone who appears to be on Krembil / Jurisica team.
Here's another post looking ahead - doesn't make much sense and does not sound good...

as for the help - logistic is tricky considering we run from a different data centre - and of course we cannot give access to a broad group - but once we can at lest walk again, there are things we plan on our side, and other with the broader community. Briefly - we need to simplify the backend - at the moment, we often run into multi points of failure, instead of robustness. But - once we will be in such a position - we want to run hackathons - this can substantially help with optimizing code we run on the grid, and bring new projects. So far, nVidia is interested to discuss this further - as our plan is to bring more GPU projects. But - of course the backend has to be upgraded before that - as peak performance during GPU stress test in 2021 was around 16PFLOPS.

thank you all for your support
Igor
[Mar 24, 2023 12:47:48 PM]
(Same link as above)
ID: 111411
Grumpy Swede
Joined: 30 Mar 20
Posts: 376
Sweden
Message 111412 - Posted: 24 Mar 2023, 19:29:14 UTC

Igor Jurisica is the boss of the Jurisica lab, and not just "someone who appears to be on Krembil / Jurisica team."

https://www.cs.toronto.edu/~juris/jlab/members.html
https://www.cs.toronto.edu/~juris/jlab/contact.html
ID: 111412
Dave
Help desk expert
Joined: 28 Jun 10
Posts: 2536
United Kingdom
Message 111416 - Posted: 25 Mar 2023, 17:44:11 UTC

Getting close to a full day without a post or grumble in the thread!
ID: 111416
Dr Who Fan
Joined: 10 May 07
Posts: 1348
United States
Message 111417 - Posted: 25 Mar 2023, 19:53:22 UTC - in response to Message 111416.  

Getting close to a full day without a post or grumble in the thread!

OK Dave
/Grumble activated/
Get off my lawn!
/Grumble deactivated/
ID: 111417
Dr Who Fan
Joined: 10 May 07
Posts: 1348
United States
Message 111422 - Posted: 26 Mar 2023, 21:37:11 UTC

Found this while browsing the WCG forums today>>>
RE: Recovery Update and Donations...

See --> https://www.worldcommunitygrid.org/forums/wcg/viewthread_thread,45040_offset,10

"... Fundraising update:
Thank you all - as of this morning UHN Foundation confirmed that we have received $2,227 in the World Community Grid Fund. Thank you.

We have until the end of 2023 to pay for the new storage - and these funds will help us achieve it, although we will need CAD 57 thousand..."
ID: 111422
Sir LanDroid
Joined: 7 Apr 13
Posts: 63
United States
Message 111423 - Posted: 27 Mar 2023, 7:03:52 UTC

C'mon folks we are slipping - there were zero complaints about WCG being down yesterday. This is unacceptable especially since they don't even work on weekends - we must do better.
Here's the first complaint today, can someone sign up for this afternoon? Ni! 😆
ID: 111423
Grumpy Swede
Joined: 30 Mar 20
Posts: 376
Sweden
Message 111427 - Posted: 27 Mar 2023, 15:25:50 UTC
Last modified: 27 Mar 2023, 15:27:26 UTC

Last update host XML 2023-02-28 13:11:22 UTC (27 days 00:45:54 old)
Last update user XML 2023-03-01 01:21:01 UTC (26 days 12:36:15 old)
Last update team XML 2023-03-01 01:21:01 UTC (26 days 12:36:15 old)

No further comments needed.
ID: 111427
Dr Who Fan
Joined: 10 May 07
Posts: 1348
United States
Message 111429 - Posted: 27 Mar 2023, 15:52:13 UTC - in response to Message 111423.  

Okay, I'll start with:
It's Monday almost mid-day in the great white north and the silence from Krembil/WCG is nearly deafening over the crickets chirping.
ID: 111429
Grumpy Swede
Joined: 30 Mar 20
Posts: 376
Sweden
Message 111430 - Posted: 27 Mar 2023, 16:14:08 UTC

Well, I seriously doubt that this day will be the day WCG restarts, despite the terabyte calculations Igor Jurisica posted on Friday, March 24.
https://www.worldcommunitygrid.org/forums/wcg/viewthread_thread,44980_offset,140#683387
ID: 111430
Sir LanDroid
Joined: 7 Apr 13
Posts: 63
United States
Message 111431 - Posted: 27 Mar 2023, 20:18:58 UTC
Last modified: 27 Mar 2023, 20:20:59 UTC

...it will take about 74 hours.

Sounds good, but don't assume that is around the clock crunching. Spread over 8 hour days and no weekends or a standard 40 hour week, that could be 2 more weeks. So maybe 4/10?
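The two readings of "74 hours" can be put side by side with a small Python sketch. The function names and the 8-hour, 5-day schedule are illustrative assumptions taken from the guess above, not anything WCG has stated.

```python
def continuous_days(total_hours: float) -> float:
    """Calendar days needed if the job runs around the clock."""
    return total_hours / 24.0


def working_weeks(total_hours: float, hours_per_day: float = 8.0, days_per_week: float = 5.0) -> float:
    """Weeks needed if progress only happens during staffed hours."""
    return total_hours / (hours_per_day * days_per_week)


print(f"around the clock: {continuous_days(74):.1f} days")   # a little over 3 days
print(f"8 h/day, 5 days/week: {working_weeks(74):.2f} weeks")  # just under 2 weeks
```

Which figure applies depends on whether the job needs someone at the keyboard or runs unattended.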
ID: 111431
Grumpy Swede
Joined: 30 Mar 20
Posts: 376
Sweden
Message 111432 - Posted: 27 Mar 2023, 20:29:08 UTC - in response to Message 111431.  
Last modified: 27 Mar 2023, 20:30:41 UTC

...it will take about 74 hours.

Sounds good, but don't assume that is around the clock crunching. Spread over 8 hour days and no weekends or a standard 40 hour week, that could be 2 more weeks. So maybe 4/10?
It's not about any crunching; it's about moving/copying the /science filesystem to the new storage from the recovery storage unit. And that should be done without any human intervention, as far as I understood it from other posts by Igor.

But of course, I would not be surprised at all, if the whole procedure crashed and burned, some time during the weekend.
ID: 111432
Dave
Help desk expert
Joined: 28 Jun 10
Posts: 2536
United Kingdom
Message 111433 - Posted: 27 Mar 2023, 20:39:09 UTC - in response to Message 111431.  

Having been a little bit on the inside in my role as a moderator for CPDN when they had hardware problems with running out of space from a new model type producing data a lot faster than it could be moved, I also would not be surprised if things take longer than predicted.
ID: 111433

Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.