wiki:PortalFeatures

Version 3 (modified by davea, 14 years ago)

--

BOINC support for science portals

A science portal is a web site serving a group of scientists. Science portals typically offer a computing service in which scientists can submit, via a web interface, sets of jobs to be run against standard applications. Current science portals run these jobs on clusters or Grids.

This document proposes two related additions to BOINC to facilitate using a BOINC project as a computing resource for science portals:

  • A mechanism for allocating computing resources among scientists
  • A mechanism for handling batches of jobs

Computing share

Demand for a portal's computing power may exceed supply, in which case we need a mechanism for dividing the supply among scientists. Assume that every job is associated with a particular BOINC user, representing either a scientist or some other organizational entity, and that each such user has a numeric computing share representing the fraction of the computing resource they should get.

The way in which computing shares are determined is outside our scope, but some possibilities are:

  • Each scientist attaches their own PCs to the BOINC project, and their computing share is the recent average credit of these hosts.
  • Volunteers attached to the project can assign shares of their resource to scientists; a scientist's overall share is the sum of these shares, each weighted by the volunteer's recent average credit.
  • A scientist's share is increased if they contribute to the portal, e.g. by participating in the message boards.

What are the precise semantics of computing share? Ideally the mechanism should accommodate two classes of scientists:

  • Throughput-oriented: those who have an infinite stream of jobs, want maximum throughput, and don't care about the turnaround times of individual jobs.

  • Latency-oriented: those who occasionally have large batches of jobs, and who want the entire batch to be finished fast. Such a user, for example, might not submit any jobs for a month, then submit a batch that takes a day to complete using the entire resource.

To accommodate both extremes, we propose a system in which each user has an associated debt, i.e. how much computation the system owes to that user.

A user's debt continually increases at a rate equal to their computing share. Debt is capped at a value corresponding to, say, use of the entire computing resource for 1 week.

When a job completes (successfully or not), its user's debt is decremented by the amount of computing done (more precisely, by the amount of computing that would have been done by the job's resources running at peak speed for the job's elapsed time).

In this scheme, latency-oriented users can build up a debt that allows their batches (if they are sufficiently infrequent) to get the full computing resource and finish as fast as possible.
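The debt mechanism above can be sketched as follows. This is an illustrative model only, not BOINC code: the class and field names, and the choice of FLOPs as the unit of debt, are assumptions made for the example.

```python
class UserAccount:
    """Illustrative model of the per-user debt mechanism described above."""

    def __init__(self, share, project_flops, cap_seconds=7 * 86400):
        self.share = share                  # user's computing share (fraction)
        self.project_flops = project_flops  # peak FLOPS of the whole resource
        self.debt = 0.0                     # FLOPs the system owes this user
        # cap: e.g. use of the entire resource for 1 week
        self.cap = project_flops * cap_seconds

    def accrue(self, dt):
        """Debt grows continually at a rate equal to the share, up to the cap."""
        self.debt = min(self.debt + self.share * self.project_flops * dt,
                        self.cap)

    def job_completed(self, peak_flops, elapsed):
        """On completion (success or failure), decrement by the computing
        the job's resources would have done at peak speed for the
        elapsed time."""
        self.debt -= peak_flops * elapsed
```

A latency-oriented user who submits nothing for a month accrues debt up to the cap, which the scheduler can then use to justify giving their next batch the full resource.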

Batch scheduling

The second requirement of science portals is support for batches: groups of jobs that have value only when all of the jobs have been completed. The interval between batch submission and completion is called the makespan.

Minimizing makespan

BOINC's scheduler should be modified to reduce makespan. The scheduler has control over two things:

  • what hosts to send which job instances to (and when to create new instances)
  • what deadline to assign to each job

A variety of techniques are possible:

  • Use slower hosts for the first jobs of a batch, with the expectation that each such host can finish one job within the batch's duration (by extension, very slow hosts may receive no jobs at all for some batches).
  • Use faster, more reliable hosts for the last jobs of a batch.
  • Use increased replication for the last jobs of a batch to reduce the expected minimum turnaround.

These techniques can be combined into policies. To compare the performance of these policies, we could develop a simulator that takes as input a description of a real volunteer host population (e.g., SETI@home's) and the parameters of a random batch-arrival process, and that computes makespan statistics, and perhaps the difference between predicted and actual makespan (see below).
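As a toy example of how the first two techniques above might combine into a policy, the sketch below prefers slow hosts early in a batch and fast, reliable hosts for the tail. The host fields, the 20% threshold, and the dictionary representation are all assumptions made for illustration.

```python
def pick_host(hosts, jobs_remaining, batch_size):
    """Toy host-selection policy: slow hosts for the first jobs of a
    batch, fast and reliable hosts for the last jobs.

    hosts: list of dicts with illustrative keys "flops" (peak speed)
    and "reliability" (probability of returning a valid result on time).
    """
    frac_left = jobs_remaining / batch_size
    if frac_left > 0.2:
        # Early phase: hand out the slowest host first, keeping
        # faster hosts in reserve for the batch's tail.
        return min(hosts, key=lambda h: h["flops"])
    # Tail phase: the batch's completion now hinges on a few jobs,
    # so favor speed weighted by reliability.
    return max(hosts, key=lambda h: h["flops"] * h["reliability"])
```

A real policy would also decide when to create extra replicas of tail jobs; the replication technique is omitted here for brevity.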

Estimating makespan

Given the description of a proposed new batch, the system should be able to provide a reasonably accurate estimate of its makespan (perhaps with error bars). The computation of this estimate must take into account:

  • The computing resource (i.e. the number of volunteer hosts, and the distributions of their speed and availability)
  • The existing workload (batches currently in progress, and their state)
  • The parameters of the batch
  • The computing share and debt of the requesting user
  • The batch scheduling policy
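A crude first-order version of such an estimate is sketched below. It accounts only for the first and fourth factors above (host speed and availability, and the user's effective fraction of the resource); the function name and parameters are illustrative, and a real estimator would also model the existing workload and the scheduling policy.

```python
def estimate_makespan(batch_flops, host_flops, availability, user_frac):
    """Crude makespan estimate: total work divided by the effective
    delivery rate the batch can expect.

    batch_flops:  total FLOPs the batch requires
    host_flops:   peak FLOPS of each volunteer host
    availability: fraction of time each host is available (0..1)
    user_frac:    fraction of the resource this user's share/debt earns
    """
    effective_rate = sum(f * a for f, a in zip(host_flops, availability))
    return batch_flops / (effective_rate * user_frac)
```

Error bars could then be derived by sampling host speed and availability from their measured distributions rather than using point values.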

In addition, the system should provide users with a web-based interface for viewing batches in progress, showing their fraction done, an updated estimate of their completion time, information about errors, and a button for aborting the batch.

Job DAGs

Batches are a special case of Directed Acyclic Graphs (DAGs) of jobs. The nodes in such a graph represent jobs; there is an edge from job A to job B if A must be completed before B can be started.

A batch can be viewed as a graph consisting of a "start node" fanning out to N job nodes, then fanning back in to a "finish node".

Science portals may offer the ability to submit collections of work that can be described as DAGs but not as batches: for example, a collection of units, each of which consists of running program A and using its output as input to program B.

(Note: BOINC's wrapper mechanism, which allows multiple applications to be run in sequence as part of a BOINC job, supports some simple cases, but it is not general because it requires dependent jobs to run on a single host.)

As an extension of the work described above, we could support DAGs rather than just batches.
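A minimal representation of such a job DAG is sketched below; the class and method names are illustrative. It tracks, for each job, which prerequisites have not yet finished, so the scheduler can ask which jobs are ready to dispatch.

```python
from collections import defaultdict

class JobDAG:
    """Minimal job DAG: an edge A -> B means job A must complete
    before job B can start. Illustrative sketch only."""

    def __init__(self):
        self.deps = defaultdict(set)      # job -> unfinished prerequisites
        self.children = defaultdict(set)  # job -> dependent jobs

    def add_edge(self, a, b):
        self.deps[b].add(a)
        self.children[a].add(b)
        self.deps.setdefault(a, set())    # ensure a is known even with no deps

    def ready(self):
        """Jobs whose prerequisites have all completed."""
        return {job for job, d in self.deps.items() if not d}

    def complete(self, job):
        """Mark a job finished, releasing its dependents."""
        for child in self.children[job]:
            self.deps[child].discard(job)
        del self.deps[job]
```

A batch is then the special case of edges from a start node to each of the N jobs and from each job to a finish node, while the two-program example above is simply the edge A -> B per unit.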