Stalled downloads

Mad_Max

Joined: 29 Apr 19
Posts: 19
Russia
Message 96216 - Posted: 29 Feb 2020, 19:31:56 UTC
Last modified: 29 Feb 2020, 20:25:56 UTC

Yep, it continues to happen from time to time. It has just become rarer than at the beginning of February.
Here is a fresh example of a stuck R@H download.
The client still does not request any new work while even a single file download is stalled.
And there are still no log entries, even with http_debug enabled, when I abort such stalled downloads manually.
And still get new work after stale download already aborted.
29/02/2020 21:32:46 | Rosetta@home | [http] HTTP_OP::init_get(): http://boinc.bakerlab.org/rosetta/download/fc/rb_02_24_16848_16671_ab_t000__h002_robetta.zip
29/02/2020 21:32:46 | Rosetta@home | [http] HTTP_OP::libcurl_exec(): ca-bundle 'D:\Boinc\ca-bundle.crt'
29/02/2020 21:32:46 | Rosetta@home | [http] HTTP_OP::libcurl_exec(): ca-bundle set
29/02/2020 21:32:46 | Rosetta@home | Started download of rb_02_24_16848_16671_ab_t000__h002_robetta.zip
29/02/2020 21:32:46 | Rosetta@home | [http] [ID#10370] Info:  Connection 4149 seems to be dead!
29/02/2020 21:32:46 | Rosetta@home | [http] [ID#10370] Info:  Closing connection 4149
29/02/2020 21:32:46 | Rosetta@home | [http] [ID#10370] Info:  Connection 4150 seems to be dead!
29/02/2020 21:32:46 | Rosetta@home | [http] [ID#10370] Info:  Closing connection 4150
29/02/2020 21:32:47 | Rosetta@home | [http] [ID#10370] Info:    Trying 128.95.160.157...
29/02/2020 21:32:47 | Rosetta@home | [http] [ID#10370] Info:  Connected to boinc.bakerlab.org (128.95.160.157) port 80 (#4151)
29/02/2020 21:32:47 | Rosetta@home | [http] [ID#10370] Sent header to server: GET /rosetta/download/fc/rb_02_24_16848_16671_ab_t000__h002_robetta.zip HTTP/1.1
29/02/2020 21:32:47 | Rosetta@home | [http] [ID#10370] Sent header to server: Host: boinc.bakerlab.org
29/02/2020 21:32:47 | Rosetta@home | [http] [ID#10370] Sent header to server: User-Agent: BOINC client (windows_x86_64 7.6.22)
29/02/2020 21:32:47 | Rosetta@home | [http] [ID#10370] Sent header to server: Accept: */*
29/02/2020 21:32:47 | Rosetta@home | [http] [ID#10370] Sent header to server: Accept-Encoding: deflate, gzip
29/02/2020 21:32:47 | Rosetta@home | [http] [ID#10370] Sent header to server: Content-Type: application/x-www-form-urlencoded
29/02/2020 21:32:47 | Rosetta@home | [http] [ID#10370] Sent header to server: Accept-Language: en_GB
29/02/2020 21:32:47 | Rosetta@home | [http] [ID#10370] Sent header to server:
29/02/2020 21:32:48 | Rosetta@home | [http] [ID#10370] Received header from server: HTTP/1.1 200 OK
29/02/2020 21:37:48 | Rosetta@home | [http] [ID#10370] Info:  Operation too slow. Less than 10 bytes/sec transferred the last 300 seconds
29/02/2020 21:37:48 | Rosetta@home | [http] [ID#10370] Info:  Closing connection 4151
29/02/2020 21:37:48 | Rosetta@home | [http] HTTP error: Timeout was reached
29/02/2020 21:37:48 | Rosetta@home | Temporarily failed download of rb_02_24_16848_16671_ab_t000__h002_robetta.zip: transient HTTP error
29/02/2020 21:37:48 | Rosetta@home | Backing off 03:06:40 on download of rb_02_24_16848_16671_ab_t000__h002_robetta.zip
29/02/2020 21:37:49 |  | Project communication failed: attempting access to reference site
.......................
I aborted the stuck download of rb_02_24_16848_16671_ab_t000__h002_robetta.zip and pressed "Update project":
29/02/2020 22:17:16 | Rosetta@home | update requested by user
29/02/2020 22:17:16 |  | [http] HTTP_OP::init_get(): http://boinc.bakerlab.org/rosetta/notices.php?userid=xxxx
29/02/2020 22:17:16 |  | [http] HTTP_OP::libcurl_exec(): ca-bundle set
29/02/2020 22:17:16 |  | [http] [ID#0] Info:  Connection 4172 seems to be dead!
29/02/2020 22:17:16 |  | [http] [ID#0] Info:  Closing connection 4172
29/02/2020 22:17:16 |  | [http] [ID#0] Info:  Connection 4173 seems to be dead!
29/02/2020 22:17:16 |  | [http] [ID#0] Info:  Closing connection 4173
29/02/2020 22:17:16 |  | [http] [ID#0] Info:  Connection 4174 seems to be dead!
29/02/2020 22:17:16 |  | [http] [ID#0] Info:  Closing connection 4174
29/02/2020 22:17:17 |  | [http] [ID#0] Info:    Trying 128.95.160.156...
29/02/2020 22:17:17 |  | [http] [ID#0] Info:  Connected to boinc.bakerlab.org (128.95.160.156) port 80 (#4175)
29/02/2020 22:17:17 |  | [http] [ID#0] Sent header to server: GET /rosetta/notices.php?userid=xxxxx
29/02/2020 22:17:17 |  | [http] [ID#0] Sent header to server: Host: boinc.bakerlab.org
29/02/2020 22:17:17 |  | [http] [ID#0] Sent header to server: User-Agent: BOINC client (windows_x86_64 7.6.22)
29/02/2020 22:17:17 |  | [http] [ID#0] Sent header to server: Accept: */*
29/02/2020 22:17:17 |  | [http] [ID#0] Sent header to server: Accept-Encoding: deflate, gzip
29/02/2020 22:17:17 |  | [http] [ID#0] Sent header to server: Content-Type: application/x-www-form-urlencoded
29/02/2020 22:17:17 |  | [http] [ID#0] Sent header to server: Accept-Language: en_GB
29/02/2020 22:17:17 |  | [http] [ID#0] Sent header to server:
29/02/2020 22:17:17 |  | [http] [ID#0] Received header from server: HTTP/1.1 200 OK
29/02/2020 22:17:17 |  | [http] [ID#0] Received header from server: Date: Sat, 29 Feb 2020 19:17:08 GMT
29/02/2020 22:17:17 |  | [http] [ID#0] Received header from server: Server: Apache/2.4.18
29/02/2020 22:17:17 |  | [http] [ID#0] Received header from server: Expires: Sat, 29 Feb 2020 19:17:08 GMT
29/02/2020 22:17:17 |  | [http] [ID#0] Received header from server: Last-Modified: Sat, 29 Feb 2020 19:17:08 GMT
29/02/2020 22:17:17 |  | [http] [ID#0] Received header from server: Vary: Accept-Encoding
29/02/2020 22:17:17 |  | [http] [ID#0] Received header from server: Content-Length: 1968
29/02/2020 22:17:17 |  | [http] [ID#0] Received header from server: Content-Type: application/xml
29/02/2020 22:17:17 |  | [http] [ID#0] Received header from server:
29/02/2020 22:17:17 |  | [http] [ID#0] Info:  Connection #4175 to host boinc.bakerlab.org left intact
29/02/2020 22:17:20 | Rosetta@home | Sending scheduler request: Requested by user.
29/02/2020 22:17:20 | Rosetta@home | Not requesting tasks: some download is stalled
29/02/2020 22:17:20 | Rosetta@home | [http] HTTP_OP::init_post(): http://srv4.bakerlab.org/rosetta_cgi/cgi
29/02/2020 22:17:20 | Rosetta@home | [http] HTTP_OP::libcurl_exec(): ca-bundle set
29/02/2020 22:17:21 | Rosetta@home | [http] [ID#1] Info:    Trying 128.95.160.156...
29/02/2020 22:17:21 | Rosetta@home | [http] [ID#1] Info:  Connected to srv4.bakerlab.org (128.95.160.156) port 80 (#4176)
29/02/2020 22:17:21 | Rosetta@home | [http] [ID#1] Sent header to server: POST /rosetta_cgi/cgi HTTP/1.1
29/02/2020 22:17:21 | Rosetta@home | [http] [ID#1] Sent header to server: Host: srv4.bakerlab.org
29/02/2020 22:17:21 | Rosetta@home | [http] [ID#1] Sent header to server: User-Agent: BOINC client (windows_x86_64 7.6.22)
29/02/2020 22:17:21 | Rosetta@home | [http] [ID#1] Sent header to server: Accept: */*
29/02/2020 22:17:21 | Rosetta@home | [http] [ID#1] Sent header to server: Accept-Encoding: deflate, gzip
29/02/2020 22:17:21 | Rosetta@home | [http] [ID#1] Sent header to server: Content-Type: application/x-www-form-urlencoded
29/02/2020 22:17:21 | Rosetta@home | [http] [ID#1] Sent header to server: Accept-Language: en_GB
29/02/2020 22:17:21 | Rosetta@home | [http] [ID#1] Sent header to server: Content-Length: 15400
29/02/2020 22:17:21 | Rosetta@home | [http] [ID#1] Sent header to server: Expect: 100-continue
29/02/2020 22:17:21 | Rosetta@home | [http] [ID#1] Sent header to server:
29/02/2020 22:17:22 | Rosetta@home | [http] [ID#1] Received header from server: HTTP/1.1 100 Continue
29/02/2020 22:17:22 | Rosetta@home | [http] [ID#1] Info:  We are completely uploaded and fine
29/02/2020 22:17:22 | Rosetta@home | [http] [ID#1] Received header from server: HTTP/1.1 200 OK
29/02/2020 22:17:22 | Rosetta@home | [http] [ID#1] Received header from server: Date: Sat, 29 Feb 2020 19:17:12 GMT
29/02/2020 22:17:22 | Rosetta@home | [http] [ID#1] Received header from server: Server: Apache/2.4.18
29/02/2020 22:17:22 | Rosetta@home | [http] [ID#1] Received header from server: Vary: Accept-Encoding
29/02/2020 22:17:22 | Rosetta@home | [http] [ID#1] Received header from server: Content-Length: 2389
29/02/2020 22:17:22 | Rosetta@home | [http] [ID#1] Received header from server: Content-Type: text/xml
29/02/2020 22:17:22 | Rosetta@home | [http] [ID#1] Received header from server:
29/02/2020 22:17:22 | Rosetta@home | [http] [ID#1] Info:  Connection #4176 to host srv4.bakerlab.org left intact
29/02/2020 22:17:22 | Rosetta@home | Scheduler request completed


Work fetch resumes only after manually aborting the stalled download AND restarting BOINC.
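
For anyone who wants to pull the stalled transfers out of a long event log, here is a minimal Python sketch. It assumes the `date time | project | message` layout of the excerpt above and relies only on the two message texts that appear in it ("Temporarily failed download of ..." and "Backing off HH:MM:SS on download of ..."). The function name `summarize_stalls` is made up for illustration and is not part of BOINC.

```python
import re
from collections import defaultdict

# Assumed layout, taken from the log excerpt above:
# "DD/MM/YYYY HH:MM:SS | project | message"
LINE_RE = re.compile(r"^(\S+ \S+) \| ([^|]*) \| (.*)$")
FAILED_RE = re.compile(r"Temporarily failed download of (\S+): (.*)")
BACKOFF_RE = re.compile(r"Backing off (\d+):(\d\d):(\d\d) on download of (\S+)")


def summarize_stalls(log_text):
    """Return {file_name: {"failures": int, "last_backoff_s": int or None}}."""
    stalls = defaultdict(lambda: {"failures": 0, "last_backoff_s": None})
    for raw in log_text.splitlines():
        m = LINE_RE.match(raw.strip())
        if not m:
            continue  # skip lines that are not in the expected layout
        message = m.group(3)
        failed = FAILED_RE.search(message)
        if failed:
            stalls[failed.group(1)]["failures"] += 1
            continue
        backoff = BACKOFF_RE.search(message)
        if backoff:
            hours, minutes, seconds = (int(backoff.group(i)) for i in (1, 2, 3))
            # Store the most recent back-off interval in seconds.
            stalls[backoff.group(4)]["last_backoff_s"] = (
                hours * 3600 + minutes * 60 + seconds
            )
    return dict(stalls)
```

Feeding it the saved event log text returns one entry per file that failed or was backed off, which makes it easier to see which files keep coming back.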
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5081
United Kingdom
Message 96217 - Posted: 29 Feb 2020, 19:59:27 UTC - in response to Message 96216.  

I tried pasting that download URL into a browser, and got the same thing:

[Screenshot: the same download stalled in the browser]

It's still stalled in the browser, after all the time it took me to capture and upload that screenshot, and even after a pause/resume. So it seems to be a real problem with the Rosetta server - feel free to copy and show them that image.

Next question - how to get BOINC to handle it with a simple 'Abort transfer'?
Mad_Max

Joined: 29 Apr 19
Posts: 19
Russia
Message 96220 - Posted: 29 Feb 2020, 20:41:38 UTC - in response to Message 96217.  
Last modified: 29 Feb 2020, 20:43:33 UTC

Yes, these are two separate issues:
First: stalled downloads of some files from R@H. It's not related to the BOINC client (I also can't download such files in a browser or any download manager), although it may be due to errors in the BOINC server code running on the R@H server, or to some problem with the R@H servers themselves. I and a few other volunteers have already reported it on R@H. It's their responsibility to check and fix it (or to report it to the BOINC devs if they think the servers are OK).

Second: the BOINC client handles such a "stuck download" situation very badly. It gets stuck in an indefinite loop, trying (and failing each time) to download the same file every ~3 hours, and it completely stops requesting new work from a project where a single stalled download is present until the user intervenes manually. Even if the work queue runs empty, it still refuses to ask for new work from a project if it sees any stalled download.
If the user has set a "backup" project, BOINC switches to the backup project completely. If there is no "backup" project (or it has issues too), the client just idles.
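
To make the second issue concrete, here is a small Python model of the behavior as described in this post. It is only a sketch of what was observed, not BOINC's actual work-fetch code; the `Project` class, the `is_backup` flag and the function name are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Project:
    name: str
    # Names of files whose download is currently stalled (hypothetical field).
    stalled_downloads: list = field(default_factory=list)
    # Hypothetical flag standing in for a user-configured "backup" project.
    is_backup: bool = False


def projects_to_ask_for_work(projects):
    """Model of the reported behavior: a project with any stalled download
    is never asked for work; backup projects are used only when no regular
    project is eligible. An empty result means the client sits idle."""
    eligible = [p for p in projects if not p.stalled_downloads]
    regular = [p.name for p in eligible if not p.is_backup]
    if regular:
        return regular
    return [p.name for p in eligible if p.is_backup]
```

In this model a single stalled file is enough to take a project out of work fetch entirely, which matches the "Not requesting tasks: some download is stalled" line in the log earlier in the thread.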
Jim1348

Joined: 8 Nov 10
Posts: 310
United States
Message 96242 - Posted: 1 Mar 2020, 17:55:19 UTC - in response to Message 95857.  

1) if you see it happening, set <http_debug> in Event Log options, and retry the transfer - find out what's happening behind that 'transient HTTP error'.
2) make a careful and exact note of the file name in question. Cancel the download, and make sure it disappears from the transfers tab. Restart the client, and if the 'stalled download' message reappears, have a very careful 'read only' (no edits) peek inside client_state.xml - same folder. Find the reference (if any) to the file you cancelled, and post the whole of the
<file>
...
</file>
section it's enclosed in.


Well I tried it too without much success. Here is the BOINC log when it is stalled:
189 Rosetta@home 3/1/2020 12:03:45 PM Started download of PKY1232uM_9mer_gly_0918_1056_0001_5_SSC_matched_9_FD_C_GP_B_0001_notail.zip
190 3/1/2020 12:08:53 PM Project communication failed: attempting access to reference site
191 Rosetta@home 3/1/2020 12:08:53 PM Temporarily failed download of PKY1232uM_9mer_gly_0918_1056_0001_5_SSC_matched_9_FD_C_GP_B_0001_notail.zip: transient HTTP error
192 Rosetta@home 3/1/2020 12:08:53 PM Backing off 04:25:30 on download of PKY1232uM_9mer_gly_0918_1056_0001_5_SSC_matched_9_FD_C_GP_B_0001_notail.zip
193 3/1/2020 12:08:55 PM Internet access OK - project servers may be temporarily down.

I then cancelled the download (in the Transfers tab of BOINCTasks) and rebooted the Ubuntu 18.04.4 machine (BOINC 7.16.3).
But the file by that name did not show up in the "client_state.xml".

I think we are stuck, so to speak.
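
The read-only check of client_state.xml suggested in step 2 above can also be scripted, which avoids opening the file in an editor at all. The Python sketch below assumes that the file parses as well-formed XML and that each transfer is stored as a `<file>` element with a `<name>` child, as the quoted instructions imply; other client versions may use a different layout. The function name `find_file_entries` is made up for illustration.

```python
import xml.etree.ElementTree as ET


def find_file_entries(xml_text, file_name):
    """Return each <file>...</file> section whose <name> equals file_name,
    serialized back to text. Nothing is written to disk."""
    root = ET.fromstring(xml_text)
    matches = []
    for elem in root.iter("file"):
        name = elem.findtext("name")
        if name is not None and name.strip() == file_name:
            matches.append(ET.tostring(elem, encoding="unicode"))
    return matches
```

Read the file in text mode (for example `open(path, "r")`, where `path` is wherever client_state.xml lives in your BOINC data directory) and pass its contents together with the exact file name. An empty list means no `<file>` section with that name was found, which is the situation described in this post.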

Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.