Version 5 (modified by davea, 16 years ago)

--

Bossa reference manual

Abstractions

  • A Bossa project has one or more applications.
  • An application has a dynamic set of jobs.
  • An application has a dynamic set of batches, which are used to group jobs.
  • Each job has a set of job instances, in progress or completed.
  • A user is a person who volunteers to perform jobs.

Each of these is represented by a table in a MySQL database.

Jobs may be marked as calibration jobs. These are jobs for which the answer is known in advance; their purpose is to estimate the accuracy of each user. Each application has a calibration job fraction; this is the probability with which calibration jobs are assigned.
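As an illustration of the calibration job fraction, the sketch below decides whether the next job issued should be a calibration job. The function name `pick_calibration` and its parameters are hypothetical and are not part of Bossa; the only thing taken from the text is that the fraction is the probability of assigning a calibration job.

```python
import random


def pick_calibration(calibration_fraction, rng=random):
    """Return True if the next job issued should be a calibration job.

    calibration_fraction is the per-application probability described
    above; this helper itself is a hypothetical illustration.
    """
    if not 0.0 <= calibration_fraction <= 1.0:
        raise ValueError("calibration_fraction must be in [0, 1]")
    # rng.random() is uniform on [0, 1), so the comparison is True
    # with probability calibration_fraction.
    return rng.random() < calibration_fraction
```

With a fraction of 0.2, roughly one job in five issued to a user would be a calibration job; a fraction of 0 turns calibration off.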

Each job has a state, one of:

BOSSA_JOB_EMBARGOED: the job is not yet eligible to be issued.

BOSSA_JOB_IN_PROGRESS: the job is eligible to be issued.

BOSSA_JOB_DONE: the job has been finished successfully.

BOSSA_JOB_INCONCLUSIVE: the job has finished unsuccessfully (typically because a consensus was not reached).

Each job has a floating-point priority. Jobs are assigned to users in order of decreasing priority.
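The interaction between job state and priority can be sketched as follows. This is a minimal model, not Bossa's implementation: the `Job` class, the `next_job` function, and the use of the state names as enum values are assumptions made for illustration. Only jobs in the BOSSA_JOB_IN_PROGRESS state are eligible, and the eligible job with the highest priority is issued first.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, Optional


class JobState(Enum):
    # The string values are placeholders; the real constants' values
    # are not given in this document.
    EMBARGOED = "BOSSA_JOB_EMBARGOED"
    IN_PROGRESS = "BOSSA_JOB_IN_PROGRESS"
    DONE = "BOSSA_JOB_DONE"
    INCONCLUSIVE = "BOSSA_JOB_INCONCLUSIVE"


@dataclass
class Job:
    # Hypothetical in-memory stand-in for a row of the job table.
    job_id: int
    state: JobState
    priority: float


def next_job(jobs: Iterable[Job]) -> Optional[Job]:
    """Return the eligible job with the highest priority, or None."""
    eligible = [j for j in jobs if j.state is JobState.IN_PROGRESS]
    if not eligible:
        return None
    # max() returns the first of several equal-priority jobs, so ties
    # are broken by input order in this sketch.
    return max(eligible, key=lambda j: j.priority)
```

Embargoed, done, and inconclusive jobs are skipped regardless of their priority.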

Each job, job instance, and user has associated app data: a PHP structure whose contents are determined by the project, not by Bossa. These structures are stored in the database.

Volunteer characteristics

For each application and each user, Bossa maintains a skill estimate: an estimate of the user's skill at that application's jobs. It is stored in the user's database record. Normally it is a single number in [0..1], and it is initially zero.

The skill estimate can be computed from any of several sources:

  • The results of the user's interaction with a Bolt course associated with the application.
  • The user's performance on calibration jobs mixed into the job stream.
  • The fraction of the user's results classified as invalid by redundancy.
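One simple way to turn calibration performance into a number in [0..1] is the fraction of calibration jobs answered correctly. The sketch below assumes that formula; the document does not say how Bossa or a project actually combines results, and the name `skill_from_calibration` is hypothetical.

```python
def skill_from_calibration(num_correct, num_attempted):
    """Estimate skill as the fraction of calibration jobs answered correctly.

    This formula is an assumption for illustration; projects are free to
    compute the estimate differently.
    """
    if num_attempted < 0 or not 0 <= num_correct <= num_attempted:
        raise ValueError("need 0 <= num_correct <= num_attempted")
    if num_attempted == 0:
        # No evidence yet: keep the initial value of zero.
        return 0.0
    return num_correct / num_attempted
```

A user who has answered 3 of 4 calibration jobs correctly would get an estimate of 0.75 under this scheme.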

Skill estimates are used for two purposes:

  • To decide whether to give jobs to a user;
  • To decide how many redundant instances of a given job are needed.
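The two uses of the skill estimate could look like the sketch below. Every name and threshold in it (`may_receive_jobs`, `instances_needed`, 0.9, 0.5, and the instance counts) is a hypothetical choice made for illustration; the document does not specify a policy, and a project would pick its own.

```python
def may_receive_jobs(skill, min_skill):
    """Decide whether a user is skilled enough to be given jobs."""
    return skill >= min_skill


def instances_needed(skill, high=0.9, medium=0.5):
    """Return how many redundant instances of a job to issue.

    Hypothetical policy: results from higher-skill users are trusted
    more, so fewer redundant instances are requested.
    """
    if skill >= high:
        return 1
    if skill >= medium:
        return 2
    return 3
```

Under these assumed thresholds, a job first handled by a user with skill 0.95 needs no further instances, while one handled by a new user (skill 0) is replicated three times.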

Possible extensions

Integration with BOINC

Some offline jobs may involve computation done through BOINC. For example, if the job is assigned to a team, the computation is queued in the project's BOINC server and dispatched to members of the team. If the job is assigned to a user with many computers, those computers are used.

  • Tasks may be short (performed online via a single web page) or may take several weeks and involve running separate programs.
  • Tasks may be performed by a single user or by a group of cooperating users.
  • Tasks may be unvalidated, automatically validated, or validated by comparing redundant instances.

Teams as volunteers

Each job instance is assigned either to a user or to a team.

Offline jobs

  • Offline: jobs that are not performed online, e.g. because they may be handled by a group of users or require other asynchronous activity.

A project can configure:

  • A maximum number of outstanding offline jobs per user or group
  • A maximum number of jobs per day issued per user or group
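A check against these two limits might be sketched as follows. The `OfflineLimits` class, its field names, and the `can_issue` function are assumptions for illustration; the document only states that both maxima are configurable per user or group.

```python
from dataclasses import dataclass


@dataclass
class OfflineLimits:
    # Hypothetical container for the two project-configured maxima.
    max_outstanding: int  # offline jobs issued but not yet completed
    max_per_day: int      # jobs issued to one user or group per day


def can_issue(limits, outstanding, issued_today):
    """Return True if one more job may be issued to this user or group."""
    return (outstanding < limits.max_outstanding
            and issued_today < limits.max_per_day)
```

For example, with limits of 2 outstanding and 5 per day, a user holding 2 unfinished offline jobs would be refused a third until one is completed.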