Version 10 (modified by Nicolas, 9 years ago)

Introduction to job processing

In conventional distributed processing systems a job consists of an executable and some input files. The system runs the job on a remote computer and gives you the output files when it's done.

You can do this type of single job submission in BOINC. However, BOINC is designed for situations where

  • You need to process thousands or millions of jobs, using only a few applications.
  • You want jobs to be platform-independent, i.e. you don't have to specify what kind of computer to use.
  • You don't trust remote computers, so you want to "validate" their results.

To handle these requirements, BOINC defines an architecture in which jobs are processed in a pipeline consisting of the following steps:

  • You submit jobs.
  • The jobs are executed on remote hosts. In some cases, jobs may be "replicated", i.e. executed on multiple hosts.
  • A "validator" examines the replicas and selects one of them as "canonical", or correct.
  • An "assimilator" handles validated results, storing them in an archive or database.
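
The replication and validation steps can be illustrated with a small sketch. This is not BOINC's validator framework; it is a hypothetical Python model in which each replica's output is reduced to a string and an output becomes canonical once enough replicas agree exactly. The names `pick_canonical` and `min_quorum` are assumptions made for illustration.

```python
from collections import Counter
from typing import Optional, Sequence


def pick_canonical(outputs: Sequence[str], min_quorum: int = 2) -> Optional[str]:
    """Return the output that at least `min_quorum` replicas agree on, else None.

    Hypothetical model: replicas "agree" when their outputs are identical.
    A real validator may use an application-specific comparison instead.
    """
    if not outputs:
        return None
    # Count identical outputs and take the most common one.
    value, count = Counter(outputs).most_common(1)[0]
    return value if count >= min_quorum else None


replicas = ["3.14159", "3.14159", "2.71828"]  # three replicas of one job
print(pick_canonical(replicas))    # two replicas agree, so prints 3.14159
print(pick_canonical(["a", "b"]))  # no agreement yet, so prints None
```

Exact string comparison is the simplest possible agreement test. Applications whose results vary slightly between platforms would need a looser, application-specific comparison.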

Typically validators and assimilators are application-specific, and you'll need to develop them (usually in C++) using frameworks supplied by BOINC. In some cases you may be able to use sample versions that are part of BOINC. Each program should be listed as a daemon in the config.xml file.

Submitting jobs

To set up for submitting jobs, you must

  1. Create an app and app versions.
  2. Write XML template files that describe the job's input and output files (typically the same template files can be used for many jobs).
  3. Set up a validator and assimilator for the application.

Then, for each job, you must

  1. Stage the job's input file(s) on the BOINC server.
  2. Submit the job, using any of several methods (see below).

Once this is done, BOINC takes over: it creates one or more instances of the job, distributes them to client hosts, and collects the output files. It then validates and assimilates the results, and deletes the input and output files.

There are three general methods for job submission:

Local submission
    In this approach, jobs are submitted by programs or scripts run on the BOINC server. The job submitter must have login access to the server.
Web-based submission
    In this approach, jobs are submitted via web pages running on the BOINC server. The job submitter need not have login access to the server.
Remote submission
    In this approach, jobs are submitted by programs or web pages running outside the BOINC server.
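
For local submission, the two per-job steps (stage the input files, then submit the job) could be scripted roughly as follows. This sketch assumes BOINC's `bin/stage_file` and `bin/create_work` tools are run from the project directory. The job name and file path are made-up placeholders, and the options should be checked against your BOINC version. The function only builds the command lines; it does not execute them.

```python
import shlex
from typing import List


def submission_commands(app_name: str, job_name: str, input_files: List[str]) -> List[List[str]]:
    """Build (but do not run) the commands to stage inputs and submit one job.

    Assumes the `bin/stage_file` and `bin/create_work` tools of a BOINC
    project directory; verify their options for your server version.
    """
    # One staging command per input file.
    commands = [["bin/stage_file", path] for path in input_files]
    # create_work is assumed here to take file names rather than full paths.
    names = [path.rsplit("/", 1)[-1] for path in input_files]
    commands.append(
        ["bin/create_work", "--appname", app_name, "--wu_name", job_name, *names]
    )
    return commands


# Placeholder job: one input file for the "uppercase" application.
for cmd in submission_commands("uppercase", "job_0001", ["/tmp/in_0001.txt"]):
    print(shlex.join(cmd))
```

Returning the commands as lists rather than running them makes the script easy to inspect, and each list can later be passed to `subprocess.run` on the server.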

Batches

For convenience, jobs may be collected into batches. Batches may be monitored and controlled as a unit.

Ownership of jobs and batches

In projects with multiple job submitters, each submitter is identified by a user account on the project. These are the same kind of account that volunteers use, so a job submitter can also act as a resource provider. Jobs and batches are associated with the user ID of their submitter.
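
The relationship between jobs, batches and their owners can be pictured with a small data model. This is a hypothetical sketch, not BOINC's database schema: the class and field names (`Job`, `Batch`, `owner_user_id`, `fraction_done`, `batches_for`) are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Job:
    name: str
    done: bool = False


@dataclass
class Batch:
    """A group of jobs owned by one submitter (hypothetical model)."""
    owner_user_id: int
    jobs: List[Job] = field(default_factory=list)

    def fraction_done(self) -> float:
        """Share of jobs finished; 0.0 for an empty batch."""
        if not self.jobs:
            return 0.0
        return sum(job.done for job in self.jobs) / len(self.jobs)


def batches_for(user_id: int, batches: List[Batch]) -> List[Batch]:
    """Return only the batches submitted by `user_id`."""
    return [b for b in batches if b.owner_user_id == user_id]


batch = Batch(owner_user_id=42, jobs=[Job("j1", done=True), Job("j2")])
print(batch.fraction_done())  # one of two jobs is done, so prints 0.5
```

Treating the batch as the unit of monitoring means progress is reported once per batch, and filtering by owner ID keeps each submitter's view limited to their own work.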

Job submission control panel

For projects that use batches and job ownership, the web page submit.php allows users to view submitted batches and jobs and retrieve their output files. A link to this page should be shown only to authorized users. You can do this, e.g., by putting the following on your home page (index.php):

$user = get_logged_in_user(false);
// Show the links only to logged-in users who are authorized to submit jobs.
if ($user && BoincUserSubmit::lookup_userid($user->id)) {
    echo '
        <li><a href="submit.php">Job submission</a>
        <li><a href="sandbox.php">File sandbox</a>
    ';
}

(Include the "File sandbox" link only if you use that feature.)

If you use web-based job submission, you can optionally put links to the job-submission pages on the control panel. To do so, add something like the following to your html/project/project.inc:

$submit_urls = array(
    "uppercase" => "uppercase_submit.php",
    "remote_test" => "submit_example.php",
);

Each array element maps an application name to the job submission form for that application.