[[PageOutline]]

= Condor-B: BOINC/Condor integration =

This document describes the design of Condor-B, extensions to BOINC and Condor so that a BOINC-based volunteer computing project can provide resources to a Condor pool.

Goals:

 * From the job submitter's viewpoint, things should look as much like Condor as possible: i.e. they prepare Condor submit files and use condor_submit.
 * Exception to the above: applications to be used in this way must be set up ahead of time on the BOINC server.

== Issues ==

Condor-B must address some basic differences between Condor and BOINC:

 * Data model: in the BOINC model, files have both logical and physical names. Physical names are unique within a project, and the file associated with a given physical name is immutable. Files may be used by many jobs. In Condor, a file is associated with a job, and has a single name.
 * Application concept: In Condor, a job is associated with a single executable, and can run only on hosts of the appropriate platform (and possibly other attributes, as specified by the job's !ClassAd). In BOINC, there may be many app versions for a single application: e.g. versions for different platforms, GPU types, etc. A job is associated with an application, not an app version.

== Proposed architecture ==

We'll use Condor's existing mechanism for sending jobs to non-Condor back ends. This will involve 2 components:

 * A "BOINC GAHP" program.
 * A new class in Condor's job_router for managing communication with the BOINC GAHP.

=== GAHP protocol ===

The API exported by the BOINC GAHP has the following functions:

{{{
submit_jobs()
    inputs:
        batch_name (unique within project)
        app_name
        jobs
            job name
            cmdline
            list of input files
    output:
        error code
}}}

Each input file is described by its path on the submit node. The file name is the name by which the app will refer to the file.

What the BOINC GAHP does:

 * From the list of input files, filter out application files.
   Need to figure out how to do this: it could be an attribute specified in the Condor submit file, or the list could be fetched from the BOINC server.
 * Eliminate duplicates in file list
 * Compute MD5s of files
 * BOINC physical name of each file is condorv_(md5)
 * Do query_files() RPC to see which files are already on BOINC server
 * Do upload_files() RPC to copy needed files to BOINC server
 * Do submit_jobs() RPC to BOINC server; create batch, jobs

{{{
query_batch
    in:
        batch name
    out:
        list of jobs
            job name
            status (done/error/in prog/not in prog)
}}}

{{{
query_job
    in:
        job name
    out:
        status
        list of URLs of output files
}}}

{{{
abort_jobs
    in:
        list of job names
}}}

{{{
set_lease
    in:
        batch name
        new lease end time
}}}

=== BOINC Web RPCs ===

{{{
query_files()
    in:
        list of physical file names
    out:
        list of those not present on server
}}}

{{{
upload_files()
    in:
        batch name
        filename
        file contents
    out:
        error code
    uploads files and creates DB records (see below)
}}}

{{{
submit_jobs()
    in:
        same as for GAHP, except include both logical and physical name
    out:
        error code
}}}

=== Atomicity ===

(We need to decide about this).

=== Authentication ===

All the above APIs will take a "credentials" argument, which may be either a BOINC authenticator or an X.509 certificate; we'll need to decide this.

Two general approaches:

 * Each job submitter has a separate account on the BOINC project (created ahead of time in a way TBD). This is preferred because it allows BOINC to enforce quotas.
 * All jobs belong to a single BOINC account.

== File management mechanism ==

To keep track of input files, we'll add the following to BOINC:

 * DB tables for files, and for batch/file associations
 * A daemon for deleting files, and their DB records, that have no associations or are past all lease ends

== Changes to BOINC ==

 * The job creation primitives (create_work()) will let you directly specify the logical names of input files, rather than specifying them in a template.
 * Add lease_end field to batch

== Implementation notes ==

The BOINC GAHP could be implemented in PHP, Python, or C++. My inclination is to use Python.
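
As an illustration of the Python option, the file-handling steps listed under "GAHP protocol" (filter out application files, eliminate duplicates, compute MD5s, derive physical names, ask the server which files are missing) could be sketched roughly as follows. This is only a sketch under assumptions: `plan_uploads`, `md5_of_file`, and the `app_file_names` parameter are hypothetical names, and `query_files` is passed in as a plain callable standing in for the query_files() Web RPC, whose wire format is not yet defined. Only the `condorv_` prefix comes from the design above.

```python
import hashlib
from pathlib import Path
from typing import Callable, Dict, Iterable, List, Set


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def plan_uploads(
    input_paths: Iterable[str],
    app_file_names: Set[str],
    query_files: Callable[[List[str]], List[str]],
) -> Dict[str, str]:
    """Map BOINC physical name -> local path for files that need uploading.

    Assumptions (not part of the design): app_file_names is however the
    application files end up being identified, and query_files stands in
    for the query_files() RPC, returning the names absent from the server.
    """
    physical_to_path: Dict[str, str] = {}
    for p in dict.fromkeys(input_paths):      # drop duplicate paths, keep order
        if Path(p).name in app_file_names:    # filter out application files
            continue
        physical = "condorv_" + md5_of_file(Path(p))
        # Files with identical contents share one physical name.
        physical_to_path.setdefault(physical, p)
    missing = query_files(list(physical_to_path))
    return {name: physical_to_path[name] for name in missing}
```

The returned mapping is what an upload_files() call would then iterate over. Keeping the RPC behind a callable means the planning logic can be exercised without a BOINC server, whichever transport and credentials scheme is eventually chosen.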