Thread 'Reducing download errors in retried transfers'

Message boards : BOINC client : Reducing download errors in retried transfers

Ananas
Joined: 27 Jun 06
Posts: 305
Germany
Message 9313 - Posted: 2 Apr 2007, 18:19:01 UTC
Last modified: 2 Apr 2007, 18:23:38 UTC

Currently, a download retry often fails the MD5 checksum test because BOINC skips the bytes already downloaded and appends only the rest of the file.

The resulting file has an HTML error page in its first few hundred bytes, followed by correct data starting at the offset equal to the size of that error page.

The size is OK, but the MD5 check and any unzip attempt fail.


A possible fix :

Insert the DWORD value of the first 4 bytes of the workunit into the scheduler reply that assigns the workunit to a host.

When a download retry starts and the target file is not empty, compare its first 4 bytes against this DWORD value to decide whether to rewind the file or append at the end.


Maybe even 2 bytes would do, because that's often sufficient for *magic* ;-)

0x1f 0x8b for GZip
0x50 0x4b for PKZip
0x42 0x5a for BZip2
0x37 0x7a for 7-Zip
...
(the scheduler can know those without knowing the individual files)
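
The check described above could look roughly like the following. This is only a sketch under stated assumptions: `resume_offset` and `expected_prefix` are hypothetical names, not part of BOINC, and the prefix is assumed to have arrived in the scheduler reply (or to be a known 2-byte magic value such as 0x1f 0x8b).

```python
import os

# Hypothetical helper, not BOINC code: decide where a retried download continues.
def resume_offset(path: str, expected_prefix: bytes) -> int:
    """Return 0 to rewind and start over, or the current size to append."""
    try:
        size = os.path.getsize(path)
    except OSError:
        return 0  # no partial file yet, so start from the beginning
    if size < len(expected_prefix):
        return 0  # too short to verify; restarting costs almost nothing
    with open(path, "rb") as f:
        head = f.read(len(expected_prefix))
    # A mismatch means something else (e.g. an HTML error page) is at the front.
    return size if head == expected_prefix else 0
```

A partial file that starts with `<!DO` instead of the expected bytes would give 0, so the retry would overwrite the error page instead of appending behind it. A matching prefix only shows that the start of the file is plausible; the MD5 test is still needed at the end.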
ID: 9313
Nicolas
Joined: 19 Jan 07
Posts: 1179
Argentina
Message 9315 - Posted: 2 Apr 2007, 19:14:39 UTC

If the HTTP status code isn't 200, it should always rewind to the beginning. Or, if it downloads 1 MB just fine, then stops for whatever reason, and the attempt to resume gives a status code other than 200, it should rewind back to that known-good 1 MB. In other words, it should only increase the "amount of data already downloaded" counter if the download actually succeeds.

If this isn't what it's currently doing, it should. An HTML error page should never get into the file.
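
A minimal sketch of that bookkeeping, assuming a hypothetical `PartialFile` class (not BOINC's actual transfer code). It also accepts 206 Partial Content, which is what a server normally returns for a successful ranged resume, and treats a 200 as a full body that restarts the file.

```python
class PartialFile:
    """Hypothetical tracker for how many bytes of a download are known good."""

    def __init__(self, path):
        self.path = path
        self.good_bytes = 0
        open(path, "wb").close()  # start from an empty file

    def apply_attempt(self, status, body):
        """Record one transfer attempt; return True if its data was kept."""
        if status not in (200, 206):
            # Failed attempt: discard anything past the known-good offset.
            with open(self.path, "r+b") as f:
                f.truncate(self.good_bytes)
            return False
        if status == 200:
            # Full body (the server ignored the range), so start over.
            self.good_bytes = 0
        with open(self.path, "r+b") as f:
            f.seek(self.good_bytes)
            f.write(body)
            f.truncate()  # cut off any stale bytes after the new data
        self.good_bytes += len(body)
        return True
```

The counter only moves forward inside the success path, so an error page can never shift the resume offset.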
ID: 9315
Ananas
Posts: 305
Germany
Message 9318 - Posted: 2 Apr 2007, 19:48:57 UTC
Last modified: 2 Apr 2007, 19:49:53 UTC

Some server-side problems that are not direct HTTP transport errors between client and server cause this, because their error pages end up in the file. Maybe cURL does not recognize those as HTTP errors. Example:

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>502 Bad Gateway</title>
</head><body>
<h1>Bad Gateway</h1>
<p>The proxy server received an invalid
response from an upstream server.<br />
...

(That is a server-side proxy; I'm not using one myself.)
ID: 9318
Nicolas
Joined: 19 Jan 07
Posts: 1179
Argentina
Message 9319 - Posted: 2 Apr 2007, 20:32:45 UTC

But in that case, the HTTP header carries a '502' status code. I think it shouldn't be hard to start writing to the file only after seeing a 200 OK.
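
As an illustration of gating writes on the status line, here is a small sketch. `body_may_be_written` is a hypothetical name, and accepting 206 alongside 200 is my assumption, since a ranged resume normally answers with 206 Partial Content.

```python
# Hypothetical gate, not BOINC code: inspect the status line before any
# byte of the response body is allowed into the target file.
def body_may_be_written(status_line: str) -> bool:
    parts = status_line.split(None, 2)  # e.g. ["HTTP/1.1", "502", "Bad Gateway"]
    if len(parts) < 2 or not parts[0].startswith("HTTP/"):
        return False  # not a recognizable status line, so refuse to write
    return parts[1] in ("200", "206")
```

With this, `"HTTP/1.1 502 Bad Gateway"` is rejected before its HTML body can reach the file, while `"HTTP/1.1 200 OK"` passes.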
ID: 9319

Copyright © 2025 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.