Proxy servers


SETI@home Classic benefited from "proxy servers" such as SETIQueue, which store work units and results and transfer them between participant computers and the main SETI@home server. Proxies provide a smooth supply of work even when the main server is down, and they make it possible to run SETI@home Classic on computers that are not connected directly to the Internet.

These programs won't work with BOINC (see below), but some of their benefits can be achieved in other ways:

  • The buffering of multiple work units is provided by the BOINC client itself: you can specify how much work your computer should get each time it contacts the server.
  • Hosts that are not directly connected to the Internet, but share a LAN with one that is, can participate in BOINC using an HTTP 1.0 proxy such as Squid for Unix or FreeProxy for Windows.

Why won't SETIQueue work with BOINC?

Unlike SETI@home Classic, with its "one size fits all" work units, BOINC allows work units that have extreme requirements (memory, disk, CPU) and makes sure they're sent only to hosts that can handle them. In BOINC, a client communicates directly with the server, telling the server about its hardware (memory size, CPU speed, etc.), and the server chooses work for it accordingly. A store-and-forward proxy that fetches work without knowing which host will eventually run it does not fit this model. Furthermore, BOINC has separate scheduling and data servers (in SETI@home Classic, a single server played both roles).
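
The matching described above can be illustrated with a minimal sketch. The class names, field names and the eligibility rule are assumptions made for illustration; this is not BOINC's actual scheduler code.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HostInfo:
    """Hardware description reported by a client (hypothetical fields)."""
    memory_mb: int
    disk_mb: int
    cpu_gflops: float


@dataclass
class WorkUnit:
    """A work unit with minimum hardware requirements (hypothetical fields)."""
    name: str
    min_memory_mb: int
    min_disk_mb: int
    min_cpu_gflops: float


def eligible(host: HostInfo, wu: WorkUnit) -> bool:
    """Return True if the host meets every requirement of the work unit."""
    return (host.memory_mb >= wu.min_memory_mb
            and host.disk_mb >= wu.min_disk_mb
            and host.cpu_gflops >= wu.min_cpu_gflops)


def choose_work(host: HostInfo, queue: List[WorkUnit]) -> List[WorkUnit]:
    """Keep only the work units this particular host can handle."""
    return [wu for wu in queue if eligible(host, wu)]


def demo() -> List[str]:
    queue = [
        WorkUnit("small", min_memory_mb=256, min_disk_mb=100, min_cpu_gflops=0.5),
        WorkUnit("huge", min_memory_mb=16384, min_disk_mb=50000, min_cpu_gflops=5.0),
    ]
    laptop = HostInfo(memory_mb=4096, disk_mb=20000, cpu_gflops=2.0)
    return [wu.name for wu in choose_work(laptop, queue)]
```

In this sketch the server needs a HostInfo for the machine that will actually run the work, so the selection depends on the server knowing which host it is choosing for.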

What benefits could task proxies achieve?

Buffering work units centrally would let the proxy keep an amount of prefetched work sized to the average of your environment as a whole, rather than to the average of a single computer. Individual computers would then hold no prefetched work units (WUs), which reduces the risk of late delivery. Ideally, a computing node would not store the state of a WU at all. Instead, the state would be updated in the central cache, so that a partially computed WU could be handed over to another node when a given node is shut down. This would allow nodes with any usage pattern to participate in BOINC computation.
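
As a rough illustration of this idea, the sketch below models a central cache that keeps both the work units and their progress, so that another node can resume a partially computed WU. All names (CentralCache, checkout, checkin) are hypothetical, and progress is reduced to a single fraction; a real system would have to store checkpoint files.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class CachedWU:
    name: str
    progress: float = 0.0          # fraction done, 0.0 to 1.0
    holder: Optional[str] = None   # node currently computing this WU, if any


class CentralCache:
    """Holds all prefetched WUs and their state; nodes hold nothing."""

    def __init__(self) -> None:
        self._wus: Dict[str, CachedWU] = {}

    def add(self, name: str) -> None:
        self._wus[name] = CachedWU(name)

    def checkout(self, node: str) -> Optional[CachedWU]:
        """Hand an unfinished, unassigned WU to a node.

        Partially computed WUs are preferred so interrupted work resumes first.
        """
        free = [w for w in self._wus.values()
                if w.holder is None and w.progress < 1.0]
        if not free:
            return None
        wu = max(free, key=lambda w: w.progress)
        wu.holder = node
        return wu

    def checkin(self, name: str, progress: float) -> None:
        """Store the latest state centrally and release the WU."""
        wu = self._wus[name]
        wu.progress = progress
        wu.holder = None


def demo() -> CachedWU:
    cache = CentralCache()
    cache.add("wu1")
    cache.add("wu2")
    first = cache.checkout("node-a")        # node-a starts a fresh WU
    cache.checkin(first.name, 0.4)          # node-a shuts down at 40%
    return cache.checkout("node-b")         # node-b resumes the same WU
```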

Proxies could reduce network traffic by distributing copies of common computation code or WU runtime data to several nodes, instead of each node downloading it via the Internet.

A proxy could provide aggregated local statistics.

How a BOINC proxy system might work

Here's a sketch of a proxy system based on a modified core client. We assume that there's a "proxy" host that does only communication and storage, and a number of "worker" hosts that do computation. The core client must be modified to accept -proxy and -worker options:

  • With the -proxy option, the client does network communication (scheduler RPC, file upload and download) and no computation, CPU benchmarking, or measurement of other hardware info like memory and disk size. (It does, however, measure and store network speed.) It exits when network communication is finished.
  • With the -worker option, the client does the complement: computation and CPU benchmarking but no network communication, etc. It exits when computation is finished (or perhaps when a CPU becomes idle, or when a project is starved).
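
The division of labour between the two options can be written down as a small sketch. The activity names below are labels invented for illustration, and since neither option is implemented, this only restates the description above in code form.

```python
# Everything an unmodified client does, using hypothetical labels.
ALL_ACTIVITIES = frozenset({
    "scheduler_rpc",
    "file_transfer",
    "network_speed_measurement",
    "computation",
    "cpu_benchmark",
    "hardware_measurement",   # memory and disk size
})

# What the client would do when started with -proxy.
PROXY_ACTIVITIES = frozenset({
    "scheduler_rpc",
    "file_transfer",
    "network_speed_measurement",
})


def activities_for(mode: str) -> frozenset:
    """Return the set of activities performed in the given mode."""
    if mode == "proxy":
        return PROXY_ACTIVITIES
    if mode == "worker":
        # The worker does the complement of the proxy.
        return ALL_ACTIVITIES - PROXY_ACTIVITIES
    raise ValueError(f"unknown mode: {mode!r}")
```

Taken together, the two modes cover everything a normal client does, and no activity is performed in both.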

The proxy host would maintain a set of separate BOINC directories, one for each worker host. The high-level logic is (for each worker host):

  • Run the core client with -worker on the worker host.
  • When it exits, synchronize its directory with the corresponding directory on the proxy host.
  • Run the core client with -proxy on the proxy host.
  • When it exits, synchronize its directory with the corresponding directory on the worker host.
  • Repeat.
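
The steps above can be sketched as a small driver. Because nothing here is implemented, the three callables (run_worker, run_proxy, sync) are placeholders that a real system would replace with, for example, a remote invocation of the modified client and a directory synchronization tool; the directory paths are likewise hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class WorkerHost:
    name: str
    worker_dir: str   # BOINC directory on the worker host
    proxy_dir: str    # corresponding directory on the proxy host


def run_cycle(host: WorkerHost,
              run_worker: Callable[[WorkerHost], None],
              run_proxy: Callable[[WorkerHost], None],
              sync: Callable[[str, str], None]) -> None:
    """One pass of the loop for a single worker host."""
    run_worker(host)                        # client with -worker; blocks until it exits
    sync(host.worker_dir, host.proxy_dir)   # worker -> proxy
    run_proxy(host)                         # client with -proxy; blocks until it exits
    sync(host.proxy_dir, host.worker_dir)   # proxy -> worker


def demo() -> List[str]:
    """Run one cycle with stand-in callables that only record what happened."""
    log: List[str] = []
    host = WorkerHost("w1", "/worker/boinc", "/proxy/hosts/w1")
    run_cycle(
        host,
        run_worker=lambda h: log.append(f"worker {h.name}"),
        run_proxy=lambda h: log.append(f"proxy {h.name}"),
        sync=lambda src, dst: log.append(f"sync {src} -> {dst}"),
    )
    return log
```

Calling run_cycle repeatedly for each worker host gives the "Repeat" step. Each call blocks until both client runs and both synchronizations have finished.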

Note: none of the above is implemented. If you are a programmer and would like to help, please let us know. Also note: as described above, the system is not asynchronous (computation and communication don't overlap) and the proxy doesn't act as a buffer. It could be modified to have these properties.