Version 1 (modified by davea, 14 years ago)

BOINC support for science portals

A science portal is a web site serving a group of scientists in a particular area. In addition to other features, science portals typically offer a facility for computing, in which scientists can submit, via a web interface, sets of jobs to be run against standard applications. Currently these jobs are run on clusters or Grids.

This document proposes additions to BOINC to facilitate using a BOINC project as a computing resource for science portals. There are two related parts:

  • A mechanism for allocating computing resources among scientists
  • A mechanism for handling batches of jobs

Computing share

Demand for computing power may exceed supply, in which case we need a mechanism for dividing the supply among scientists. Assume that every job is associated with a particular BOINC user, representing either a scientist or some other organizational entity, and that each such user has a numeric computing share proportional to the fraction of the computing resource they should get.

The way in which computing shares are determined is outside the scope of this document, but some possibilities are:

  • Each scientist attaches their own PCs to the BOINC project, and their computing share is the recent average credit of these hosts.
  • Volunteers attached to the project can assign shares of their resource to scientists, and a scientist's overall share is the sum of these, weighted by each volunteer's average credit.
  • A scientist's share is increased if they contribute to the portal, e.g. by participating in the message boards.
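As an illustration of the second possibility, the following is a minimal sketch, not part of the proposal itself. It assumes each volunteer splits their resource into fractions per scientist, and that a per-volunteer average credit figure is available. The function name, the data layout, and the sample names are all hypothetical.

```python
from collections import defaultdict


def computing_shares(assignments, avg_credit):
    """Sum each scientist's assigned fractions, weighted by volunteer credit.

    assignments: {volunteer: {scientist: fraction of that volunteer's resource}}
    avg_credit:  {volunteer: average credit} (hypothetical input format)
    """
    shares = defaultdict(float)
    for volunteer, split in assignments.items():
        credit = avg_credit.get(volunteer, 0.0)
        for scientist, fraction in split.items():
            shares[scientist] += fraction * credit
    return dict(shares)


# Hypothetical data: vol_a splits evenly, vol_b gives everything to alice.
assignments = {
    "vol_a": {"alice": 0.5, "bob": 0.5},
    "vol_b": {"alice": 1.0},
}
avg_credit = {"vol_a": 200.0, "vol_b": 100.0}
shares = computing_shares(assignments, avg_credit)
```

Here alice ends up with twice bob's share, since she receives half of the larger volunteer's credit plus all of the smaller one's.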

What are the precise semantics of computing share? Ideally this should accommodate two classes of scientists:

  • Throughput-oriented: those who have an infinite stream of jobs. They want maximum throughput, and don't care about the turnaround times of individual jobs.

  • Latency-oriented: those who occasionally have large batches of jobs, and who want the entire batch to be finished fast. Such a user, for example, might not submit any jobs for a month, then submit a batch that takes a day to complete using the entire resource.

To accommodate both extremes, we propose a system in which each user has an associated debt, i.e. how much computation the system owes to that user.

A user's debt continually increases at a rate equal to their computing share. Debt is capped at a value corresponding to, say, use of the entire computing resource for 1 week.
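The accrual rule can be sketched as follows. This is only an illustration under stated assumptions: the computing share is normalized to a fraction of the whole resource, debt is measured in floating-point operations, and the total peak speed of the resource is known. The function and parameter names are hypothetical.

```python
SECONDS_PER_WEEK = 7 * 24 * 3600


def accrue_debt(debt, share_fraction, dt_seconds, total_flops,
                cap_seconds=SECONDS_PER_WEEK):
    """Return the user's debt after dt_seconds of accrual.

    share_fraction: the user's share as a fraction of the resource (assumption)
    total_flops:    peak speed of the entire resource in FLOPS (assumption)
    cap_seconds:    cap expressed as time on the entire resource (1 week here)
    """
    cap = total_flops * cap_seconds
    accrued = debt + share_fraction * total_flops * dt_seconds
    return min(accrued, cap)


# A user with a 10% share of a 1 TFLOPS resource, after one hour.
debt_after_hour = accrue_debt(0.0, 0.1, 3600, 1e12)

# Accrual never exceeds the one-week cap, however long the user waits.
debt_capped = accrue_debt(0.0, 1.0, 10 * SECONDS_PER_WEEK, 1e12)
```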

When a job completes (successfully or not), its user's debt is decremented by the amount of computing done (more precisely, by the amount of computing that would have been done by the job's resources running at peak speed for the job's elapsed time).
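A minimal sketch of this charge, assuming the peak speed of the job's resources is expressed in FLOPS and debt in floating-point operations. The names are hypothetical. The text does not say whether debt may go negative, so the sketch does not clamp it.

```python
def job_charge(peak_flops, elapsed_seconds):
    """Computing the job's resources could have done at peak speed."""
    return peak_flops * elapsed_seconds


def charge_on_completion(debt, peak_flops, elapsed_seconds):
    """Decrement debt when a job finishes, whether it succeeded or failed."""
    return debt - job_charge(peak_flops, elapsed_seconds)


# A job that held a 2 GFLOPS host for 1000 seconds costs 2e12 operations.
new_debt = charge_on_completion(1e15, 2e9, 1000)
```

Charging by elapsed time at peak speed, rather than by operations actually performed, means a user pays for the resources their job occupied even if the job failed or ran inefficiently.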

In this scheme, a latency-oriented user can build up a debt that allows their batches (if they are sufficiently infrequent) to get the full computing resource and finish as fast as possible.
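To make "sufficiently infrequent" concrete, here is a rough back-of-the-envelope sketch. It assumes the share is a fraction of the whole resource and measures debt in days of the entire resource. Under those assumptions, a user must wait the batch size divided by their share for the debt to regrow, and no batch larger than the cap can ever be fully covered. The function name and the 7-day cap default are illustrative.

```python
def days_to_accumulate(batch_days_on_full_resource, share_fraction, cap_days=7.0):
    """Days of idle accrual needed to cover a batch of the given size.

    batch_days_on_full_resource: batch cost, in days of the entire resource
    share_fraction:              user's share as a fraction (assumption)
    cap_days:                    debt cap, in days of the entire resource
    """
    if batch_days_on_full_resource > cap_days:
        raise ValueError("batch exceeds the debt cap and cannot be fully covered")
    return batch_days_on_full_resource / share_fraction


# A user with a 1/30 share who wants a one-day, full-resource batch.
wait_days = days_to_accumulate(1.0, 1 / 30)
```

With a 1/30 share the wait comes out to about 30 days, which matches the earlier example of a user who submits nothing for a month and then runs a batch that takes a day on the entire resource.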