World Community Grid has announced an extended outage from Feb 14 to April 22, 2022
Author | Message |
---|---|
Send message Joined: 30 Mar 20 Posts: 419 |
Yes, the website came back, and at the same time I got 50 new OPNG tasks. The first 23 downloaded without a single HTTP error, but now the HTTP errors are back again, just as bad as ever. I also have an update script (batch file), but set at 6 minutes. Otherwise, the chance of getting any OPNGs would be almost 0%. Edit, added: After a lot of retries, I managed to download the rest of my 50 OPNGs + 20 more. I can't continue to babysit my computer, so I have set NNT, and will shut it down after the cached OPNG tasks are crunched, uploaded, and reported. I will try again in a couple of days, and if things haven't improved then, I'll wait a week or more before I try again. |
Send message Joined: 5 Oct 06 Posts: 5129 |
New error mode: "04/10/2022 20:11:33 | World Community Grid | [http] HTTP error: Error in the HTTP2 framing layer" drops the attempted connection immediately. I'm taking a break. |
Send message Joined: 10 May 07 Posts: 1444 |
"New error mode:" Getting same message... Found a closed GitHub issue from 2019 in the curl library code that mentions the primary error. With all the burps and farts Krembil has given us in the past too many months, could they possibly be running outdated server software? Doubt it, but anything is possible. Most likely, it's the various real & virtual WCG server(s) that are completely overwhelmed with everyone trying to view the website, send/receive work, etc. |
Send message Joined: 24 Dec 19 Posts: 229 |
WCG runs their own custom server software. They are not a normal BOINC project. Wondering whether it is "outdated" or not isn't even applicable, given how different they are. |
Send message Joined: 25 May 09 Posts: 1301 |
Knowing IBM, it was highly customised to run on their hardware, and thus is a grade one pain in the chair polisher to get running properly on "normal" hardware. |
Send message Joined: 5 Oct 06 Posts: 5129 |
Well, they backed out of that one pretty quickly (temporarily or permanently, time will tell) - I've been back to 'normal' delays for a while. I've given up trying for tonight. |
Send message Joined: 30 Mar 20 Posts: 419 |
boinccmd.exe --network_available in a batch file on Windows, run every 20 seconds, seems to keep my computer fed now, without manual intervention. It will auto-retry any stalled download. Simple example: cd /d "C:\Program Files\BOINC" :loop boinccmd.exe --network_available TIMEOUT /T 20 /nobreak cls goto loop |
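For anyone not on Windows, or who prefers a script to a batch file, the same retry loop can be sketched in Python. This is only a rough equivalent, not a tested replacement: the install path in `BOINCCMD` is an assumption, and the `retry_transfers` function with its `max_rounds`, `run` and `sleep` parameters is a hypothetical helper added here for illustration. The only BOINC piece it relies on is the `boinccmd --network_available` option from the post above.

```python
import subprocess
import time
from typing import Callable, Optional, Sequence

# Assumed install location; adjust to wherever boinccmd lives on your machine.
BOINCCMD = r"C:\Program Files\BOINC\boinccmd.exe"


def _run_command(cmd: Sequence[str]) -> int:
    """Run the command and return its exit code without raising on failure."""
    return subprocess.run(list(cmd), check=False).returncode


def retry_transfers(
    command: Sequence[str] = (BOINCCMD, "--network_available"),
    interval_seconds: float = 20.0,
    max_rounds: Optional[int] = None,
    run: Callable[[Sequence[str]], int] = _run_command,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Ask the client to retry deferred transfers every interval_seconds.

    max_rounds=None loops forever, like the batch file's "goto loop".
    Returns the number of times the command was issued.
    """
    rounds = 0
    while max_rounds is None or rounds < max_rounds:
        run(command)
        rounds += 1
        # Skip the final wait when a round limit has been reached.
        if max_rounds is None or rounds < max_rounds:
            sleep(interval_seconds)
    return rounds
```

Calling `retry_transfers()` with no arguments behaves like the batch loop and runs until interrupted; `retry_transfers(max_rounds=30)` would stop after about ten minutes.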
Send message Joined: 30 Mar 20 Posts: 419 |
Sigh... Again the WCG website went down: "503 Service Unavailable No server is available to handle this request", or "System error World Community Grid is currently experiencing an unexpected error. Please check Facebook or Twitter for more information." |
Send message Joined: 17 Nov 16 Posts: 890 |
I see that. Same issue as before. Same error messages. Guess their bandaid fix did not correct the real issue, i.e. they really don't know how to manage a project of the scope and size that IBM could manage with one hand tied behind their back. |
Send message Joined: 5 Oct 06 Posts: 5129 |
As folks have said. I've got many OPNG transfers waiting from last night, and they're clearing very, very slowly under my OCD. But I caught an interesting new error message when checking that it was the same problem as before: 05/10/2022 08:53:34 | World Community Grid | [http] [ID#45505] Received header from server: HTTP/2 503 05/10/2022 08:53:34 | World Community Grid | [http] [ID#45505] Received header from server: <html><body><h1>503 Service Unavailable</h1> 05/10/2022 08:53:34 | World Community Grid | [http] [ID#45505] Received header from server: No server is available to handle this request. 05/10/2022 08:53:34 | World Community Grid | [http] [ID#45505] Info: Connection cache is full, closing the oldest one. I'll look into that one. |
Send message Joined: 30 Mar 20 Posts: 419 |
"As folks have said. I've got many OPNG transfers waiting from last night, and they're clearing very, very slowly under my OCD." That seems to be a Curl thing. CURLMOPT_MAXCONNECTS explained: "When the cache is full, curl closes the oldest one in the cache to prevent the number of open connections from increasing." https://curl.se/libcurl/c/CURLMOPT_MAXCONNECTS.html |
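To picture what "Connection cache is full, closing the oldest one" means, here is a toy model in Python. It is not libcurl's code and makes no claim about how curl actually picks the connection to close; `ConnectionCache`, `open_connection` and the connection names are all hypothetical. It only illustrates the rule quoted from the docs: once the cache holds the maximum number of connections, adding another one closes the oldest.

```python
from collections import OrderedDict
from typing import List


class ConnectionCache:
    """Toy bounded cache: adding beyond the limit closes the oldest entry."""

    def __init__(self, max_connects: int) -> None:
        if max_connects < 1:
            raise ValueError("max_connects must be at least 1")
        self.max_connects = max_connects
        # Insertion order doubles as age order: the first key is the oldest.
        self._connections: "OrderedDict[str, None]" = OrderedDict()
        self.closed: List[str] = []

    def open_connection(self, name: str) -> None:
        """Add a connection, closing the oldest one first if the cache is full."""
        if name in self._connections:
            return  # already cached, nothing to do in this simplified model
        if len(self._connections) >= self.max_connects:
            oldest, _ = self._connections.popitem(last=False)
            self.closed.append(oldest)
        self._connections[name] = None

    @property
    def open_connections(self) -> List[str]:
        """Names of the currently cached connections, oldest first."""
        return list(self._connections)


cache = ConnectionCache(max_connects=2)
for conn in ("conn-1", "conn-2", "conn-3"):
    cache.open_connection(conn)
print(cache.closed)            # connections evicted so far
print(cache.open_connections)  # connections still cached
```

With a limit of 2, opening a third connection evicts `conn-1`, which is the situation the client log line reports each time it prints that message.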
Send message Joined: 28 Jun 10 Posts: 2703 |
"As folks have said. I've got many OPNG transfers waiting from last night, and they're clearing very, very slowly under my OCD." Same here but ARP, that being the only WCG project I run at the moment. At least I can see all the downloads again now without having to scroll! |
Send message Joined: 5 Oct 06 Posts: 5129 |
"That seems to be a Curl thing." I was assuming that, but thanks for confirming and the link. But what's the limit in our clients, and is it of the order of that 45 thousand current ID count? If so, is it required/efficient in a client setting? I'll probably post that question on GitHub. |
Send message Joined: 30 Mar 20 Posts: 419 |
Well, regarding the situation at the moment of WCG at Krembil/Jurisica Lab, all I have to say is: "Luke 23:34" :-) |
Send message Joined: 5 Oct 06 Posts: 5129 |
Opened https://github.com/BOINC/boinc/issues/4952 |
Send message Joined: 10 May 07 Posts: 1444 |
Well, regarding the situation at the moment of WCG at Krembil/Jurisica Lab, all I have to say is: "Luke 23:34" :-) Yes, we forgive them. They supposedly "started working" on the physical hosting Feb 14 -- almost 6 months ago but took virtual control a bit over a year ago. Has anyone ever found out how many real full-time positions are actually working for WCG staffing at Krembil? I'm more inclined to say they are deep (over their heads?) into the s**t. |
Send message Joined: 8 Nov 10 Posts: 310 |
Well, regarding the situation at the moment of WCG at Krembil/Jurisica Lab, all I have to say is: "Luke 23:34" :-) I thought maybe you were going to say "Physician, heal thyself". |
Send message Joined: 25 May 09 Posts: 1301 |
"Has anyone ever found out how many real full-time positions are actually working for WCG staffing at Krembil?" The number of FTEs employed on a task is all too often a totally irrelevant metric, as one may have a thousand employed on a task and none of them doing any productive work, or one may have one employed and working very effectively. Guess which set will get the task done first, and at a lower cost. |
Send message Joined: 25 May 09 Posts: 1301 |
One thing I would like to see is a comparison between the code handed over by IBM and a comparable age "stable, production" version of server-side BOINC. I have a gut feeling that there would be a substantial difference between the two. Someone said (some time ago) that it may well have been an easier task to take the data and load it onto "clean" servers running a standard version of BOINC. While this may have sounded a simple task it could well be an uphill struggle unless the data structure deployed by IBM was identical to (or readily converted to) that used by "standard" BOINC. |
Send message Joined: 30 Mar 20 Posts: 419 |
"Opened https://github.com/BOINC/boinc/issues/4952" Thank you, Richard! Edit, added: WCG admins just have to be totally out of this world, or just don't understand what they are (not) doing. On their Facebook page, some 25 minutes ago, they posted: "We are experiencing a brief system error with our website and database. We apologize and will notify you when it has been resolved." "A brief system error"? No, it's not "brief" when it's been going on for many hours, with the webpage unusable, and since the same issue happened just a couple of days ago, they obviously did not fix it then. Sigh, I really don't know what to say... |
Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License,
Version 1.2 or any later version published by the Free Software Foundation.