Version 3 (modified by davea, 11 years ago)

Introduction to job processing

Most distributed processing systems let you submit jobs: you supply an executable and some input files, and the system runs the job on a remote computer and gives you the output files when it's done. You can do this type of single-job submission in BOINC.

However, BOINC is designed for a different kind of situation, in particular one where:

  • You need to process thousands or millions of jobs, using only a few applications.
  • You want jobs to be platform-independent, i.e. you don't have to specify what kind of computer to use.
  • You want job results to be validated by replication.

To meet these requirements, BOINC defines an architecture in which jobs are processed in a pipeline consisting of the following programs:

  • A work generator creates jobs.
  • A validator compares replicated results and selects one of them as 'canonical', or correct.
  • An assimilator handles validated results, storing them in an archive or database.
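
The division of labor between the three programs can be illustrated with a toy, in-memory model. This is only a conceptual sketch, not BOINC's framework: every function name here (`generate_jobs`, `validate`, `assimilate`, `run_pipeline`) and the `quorum` parameter are hypothetical, and outputs are plain Python values rather than files. It assumes the simplest possible validation rule, exact agreement among replicas.

```python
from collections import Counter


def generate_jobs(inputs):
    """Work generator (toy): turn each input value into a job record."""
    return [{"id": i, "input": x} for i, x in enumerate(inputs)]


def validate(replica_outputs, quorum=2):
    """Validator (toy): pick the canonical output by exact agreement.

    Returns the most common output if at least `quorum` replicas
    produced it, otherwise None. Outputs must be hashable.
    """
    if not replica_outputs:
        return None
    output, votes = Counter(replica_outputs).most_common(1)[0]
    return output if votes >= quorum else None


def assimilate(archive, job, canonical):
    """Assimilator (toy): store the validated result in an archive."""
    archive[job["id"]] = canonical


def run_pipeline(inputs, compute_replicas):
    """Run every job through generate -> compute -> validate -> assimilate.

    `compute_replicas(job)` stands in for the client hosts and returns
    the list of outputs produced by the job's replicated instances.
    """
    archive = {}
    for job in generate_jobs(inputs):
        outputs = compute_replicas(job)
        canonical = validate(outputs)
        if canonical is not None:
            assimilate(archive, job, canonical)
    return archive


def fake_hosts(job):
    """Three simulated hosts square the input; one is faulty for job 1."""
    good = job["input"] ** 2
    return [good, good, good + 1 if job["id"] == 1 else good]


archive = run_pipeline([2, 3, 4], fake_hosts)
print(archive)  # {0: 4, 1: 9, 2: 16}
```

In the example, job 1 receives one wrong answer, but two matching replicas still meet the quorum, so the agreed value is stored. A real validator would compare output files and might need a tolerance-based comparison rather than exact equality.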

Typically these programs are application-specific, and you will need to develop them (usually in C++) using frameworks supplied by BOINC. In some cases you may be able to use sample versions that are part of BOINC. Each program should be listed as a daemon in the config.xml file.

Submitting jobs

To submit a job, you must:

  1. Write XML template files that describe the job's input and output files (typically the same template files can be used for many jobs).
  2. Stage the job's input file(s).
  3. Invoke a BOINC function or script that submits the job.

Once this is done, BOINC takes over: it creates one or more instances of the job, distributes them to client hosts, and collects the output files. It then validates and processes the results, and finally deletes the input and output files.

Typically, steps 2 and 3 are done by a work generator program that creates many jobs.
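
As a rough sketch of what such a work generator does, the following builds (but does not run) one staging command and one submission command per input file. The tool names `bin/stage_file` and `bin/create_work`, and the `--appname` and `--wu_name` flags, are assumptions about the server-side scripts; check them against your BOINC version. The function `submission_commands` and the job-naming scheme are hypothetical. The template files from step 1 are assumed to already be where the submission tool expects them.

```python
import os


def submission_commands(app_name, input_paths):
    """Return command lines that stage and submit one job per input file.

    The commands are returned as argument lists instead of being executed,
    so they can be inspected before being run from the project directory.
    """
    commands = []
    for n, path in enumerate(input_paths):
        file_name = os.path.basename(path)
        # Step 2: stage the input file (assumed tool name).
        commands.append(["bin/stage_file", path])
        # Step 3: submit a job that uses the staged file
        # (assumed tool name and flags; job name is illustrative).
        commands.append([
            "bin/create_work",
            "--appname", app_name,
            "--wu_name", f"{app_name}_{n}",
            file_name,
        ])
    return commands


for cmd in submission_commands("example_app", ["/data/a.txt", "/data/b.txt"]):
    print(" ".join(cmd))
```

Each argument list could then be passed to `subprocess.run(cmd, check=True)`. Keeping command construction separate from execution makes the generator easy to test without a running project.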