Last modified on 10/24/14 14:18:23

Handling completed jobs

Completed jobs are handled by programs called assimilators. These are generally application-specific: they might copy output files from the BOINC upload directory to a permanent location, or they might parse the output files and insert results into a database.

Creating an assimilator

To create an assimilator, link the program sched/assimilator.cpp with a function of the form

int assimilate_handler(
    WORKUNIT& wu, vector<RESULT>& results, RESULT& canonical_result
);

This function is called when either of the following holds:

  • The workunit has a nonzero error mask (indicating, for example, too many error results). In this case the handler might write a message to a log or send an email to the project administrator.
  • The workunit has a canonical result. In this case wu.canonical_resultid will be nonzero, and canonical_result will contain the canonical result.

In both cases the 'results' vector will be populated with all the workunit's results (including unsuccessful and unsent ones). All files (both input and output) will generally be on disk.

Both conditions can hold at the same time.

The return values of assimilate_handler() are:

  • 0: success: the workunit will be marked as assimilated.
  • DEFER_ASSIMILATION: the workunit will be processed again when another instance finishes. This is useful for applications where you want to see all the completed results.
  • Other nonzero values: the assimilator will log an error message and exit. Typically assimilate_handler() should return nonzero for any error condition. This way the system administrator can fix the problem before any completed or erroneous workunits are mis-handled by BOINC.

You can use BOINC's back-end utility functions to get file pathnames and open files.

Running assimilators

Run assimilators as BOINC daemons: that is, add an entry

<daemon>
   <cmd> my_assimilator --app APPNAME </cmd>
</daemon>

to your project's configuration file.

Assimilators have the following command-line options:

--app name
the application name
[ --mod N R ]
process only jobs with mod(ID, N) == R. This lets you run multiple assimilators in parallel to increase throughput.
[ -d N ]
set verbosity level (1 = least, 3 = most)
[ --dont_update_db ]
don't mark jobs as assimilated (for testing)
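
The --mod N R rule can be illustrated with a small Python sketch. It only models the arithmetic stated above (a job belongs to the instance whose R equals ID mod N); the function names are hypothetical and are not part of BOINC.

```python
def handled_by(job_id, n, r):
    """Return True if an assimilator started with '--mod n r' would pick up job_id."""
    return job_id % n == r


def partition(job_ids, n):
    """Group job IDs by the remainder R of the instance that would process them."""
    buckets = {r: [] for r in range(n)}
    for job_id in job_ids:
        buckets[job_id % n].append(job_id)
    return buckets
```

With N = 2, for example, one instance run with --mod 2 0 would take the even IDs and another run with --mod 2 1 would take the odd IDs, so each job is handled by exactly one instance.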

Using scripting languages

The assimilator script_assimilator lets you write assimilator logic in your language of choice (Python, Perl, PHP, bash, etc.). script_assimilator takes a command-line argument

--script "filename arg1 ... argn"
script to handle a completed job

arg1 ... argn are the command-line arguments passed to the script for successful workunits. Each can be one of:

files
list of output files of the job's canonical result
wu_id
workunit ID
result_id
ID of the canonical result
runtime
runtime of the canonical result, in seconds

If no arguments are specified, the script is invoked as

scriptname wu_id files

If the workunit has no canonical result (i.e., it failed), the script is invoked as

scriptname --error N wu_id

where N is an integer encoding the reasons for the job's failure (see WU_ERROR_* in html/inc/common_defs.inc).

The script must be put in your project's bin/ directory.
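
As an illustration, a handler script for the default invocation (no args given in --script) might look like the following Python sketch. It is not taken from BOINC: the names parse_invocation and main are hypothetical, and it assumes that the output files arrive as separate trailing arguments and that exiting with status 0 signals success.

```python
#!/usr/bin/env python3
import sys


def parse_invocation(argv):
    """Classify one invocation; argv excludes the script name.

    Failure form:  --error N wu_id
    Success form:  wu_id file1 file2 ...   (assumed: one argument per file)
    """
    if argv and argv[0] == "--error":
        return {"ok": False, "error_mask": int(argv[1]), "wu_id": int(argv[2])}
    return {"ok": True, "wu_id": int(argv[0]), "files": list(argv[1:])}


def main(argv):
    job = parse_invocation(argv)
    if job["ok"]:
        # Application-specific work goes here, e.g. copying or parsing the files.
        print(f"workunit {job['wu_id']}: {len(job['files'])} output file(s)")
    else:
        # N is kept as an opaque integer; decoding it requires the WU_ERROR_* values.
        print(f"workunit {job['wu_id']} failed, error mask {job['error_mask']}",
              file=sys.stderr)
    return 0


# Run only when executed as a script with arguments.
if __name__ == "__main__" and len(sys.argv) > 1:
    raise SystemExit(main(sys.argv[1:]))
```

If --script lists its own arguments (for example wu_id, result_id, and runtime), the parsing for successful workunits has to follow that order instead.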

Python assimilator framework

A framework for all-Python assimilators can be found in sched/assimilator.py. See the comments for instructions.

The sample assimilator

BOINC includes a sample assimilator, sample_assimilator. It does the following:

  • For successful workunits, it writes the canonical instance's output files to the directory PROJECT/sample_results/. If there is only one output file, it is named WU_NAME. If there is more than one, they are named WU_NAME_0, WU_NAME_1, etc. If there are no output files, an empty file WU_NAME_no_output_files is created.
  • If the workunit failed (e.g., too many errors) it appends a line to sample_results/errors containing the workunit name and the error code.
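
The naming rule for successful workunits can be restated as a short Python sketch. This models only the rule described above and is not the sample assimilator's actual code; the function name is hypothetical.

```python
def sample_output_names(wu_name, n_files):
    """Return the file names used under sample_results/ for a successful workunit."""
    if n_files == 0:
        # No output files: a single empty marker file is created.
        return [f"{wu_name}_no_output_files"]
    if n_files == 1:
        # One output file: named after the workunit itself.
        return [wu_name]
    # Several output files: numbered from 0.
    return [f"{wu_name}_{i}" for i in range(n_files)]
```

For a workunit named wu_42 with three output files, this gives wu_42_0, wu_42_1, and wu_42_2.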

The sample assimilator can be used as a placeholder while you are developing your application. In some cases you may be able to use it in production.