
Version 10 (modified by kadam, 16 years ago)


PyBOINC: simplified BOINC application development in Python


This is a proposed design for making BOINC application development as simple as possible. PyBOINC provides a master/slave model: the master runs on the server, and the slaves are distributed to client machines.

Here's an example, which sums the squares of integers from 1 to 100. The application consists of three files. The first, app_types.py, defines the input and output types:

class Input:
    def __init__(self, arg):
        self.value = arg

class Output:
    def __init__(self, arg):
        self.value = arg

The second file, app_master.py, is the master program:

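from app_types import Input

# pyboinc_call and pyboinc_master are provided by PyBOINC;
# how they are imported is not specified in this design.

total = 0

def make_calls():
    # Create one job per integer from 1 to 100.
    for i in range(1, 101):
        pyboinc_call('app_slave.py', Input(i))

def handle_result(output):
    # Called once per completed job; accumulate the squares.
    global total
    total += output.value

pyboinc_master(make_calls, handle_result)

print("The answer is %d" % total)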

The third file, app_slave.py, is the slave program:

from app_types import Output

# pyboinc_get_input and pyboinc_return_output are provided by PyBOINC.
input = pyboinc_get_input()
output = Output(input.value * input.value)
pyboinc_return_output(output)
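
To check the expected answer without a BOINC project, the example can be simulated in a single process. This is only a sketch: local_call and local_master are hypothetical stand-ins for pyboinc_call and pyboinc_master, and the slave is an ordinary function rather than a separate file.

```python
# Local, single-process stand-in for the example above. Nothing here talks to
# BOINC: local_call and local_master are hypothetical substitutes for
# pyboinc_call and pyboinc_master, used only to check the arithmetic.

class Input:
    def __init__(self, arg):
        self.value = arg

class Output:
    def __init__(self, arg):
        self.value = arg

pending = []   # jobs queued by local_call
total = 0      # running sum, updated by handle_result

def slave(inp):
    # Same computation as app_slave.py: square the input value.
    return Output(inp.value * inp.value)

def local_call(slave_func, inp):
    # Queue one job instead of creating a BOINC work unit.
    pending.append((slave_func, inp))

def local_master(make_calls, handle_result):
    # Create all jobs, then run each one and pass its output to the callback.
    make_calls()
    while pending:
        func, inp = pending.pop(0)
        handle_result(func(inp))

def make_calls():
    for i in range(1, 101):
        local_call(slave, Input(i))

def handle_result(output):
    global total
    total += output.value

local_master(make_calls, handle_result)
print("The answer is %d" % total)  # prints "The answer is 338350"
```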

The procedure for running this program is:

  • Create a BOINC project
  • Run a script ops/py_boinc.php that configures the project to use PyBOINC
  • Set an environment var PYBOINC_DIR to the root directory of the project
  • Create a directory (anywhere) containing the above files
  • In that directory, type
    python app_master.py
    
  • This command may take a long time. If it is aborted with ^C, it can be rerun later. In that case no new jobs are created, and the master waits for the remaining slaves to complete.

Implementation

PyBOINC uses a new table, 'batch', which represents a group of jobs. Its fields are:

  • ID
  • ID of user who submitted this batch
  • path of 'batch directory'
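
As an illustration of the three fields, here is a minimal sketch of such a table. It uses an in-memory SQLite database as a stand-in for the project's MySQL database, and the column names (id, user_id, batch_dir) are assumptions; the design lists the fields but does not name them.

```python
import sqlite3

# Hypothetical column names; the design only lists the three fields.
SCHEMA = """
CREATE TABLE batch (
    id        INTEGER PRIMARY KEY,
    user_id   INTEGER NOT NULL,
    batch_dir TEXT    NOT NULL
)
"""

def create_batch(conn, user_id, batch_dir):
    # Insert one batch record and return its ID.
    cur = conn.execute(
        "INSERT INTO batch (user_id, batch_dir) VALUES (?, ?)",
        (user_id, batch_dir),
    )
    conn.commit()
    return cur.lastrowid

def lookup_batch_dir(conn, batch_id):
    # Return the batch directory for a batch ID, or None if there is no such batch.
    row = conn.execute(
        "SELECT batch_dir FROM batch WHERE id = ?", (batch_id,)
    ).fetchone()
    return row[0] if row else None

conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
batch_id = create_batch(conn, 42, "/tmp/my_batch")
```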

PyBOINC uses the following files and subdirectories in the batch directory:

  • pyboinc_checkpoint: if present, contains the ID of the batch (called jobID in the pseudocode below)
  • new/: result files not yet handled
  • old/: result files already handled

PyBOINC uses the Pickler and Unpickler classes of Python's pickle module for serialization.
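
A minimal sketch of the round trip, using an in-memory buffer in place of a BOINC input or output file. The payload here is a plain dict so the snippet is self-contained. Instances of classes such as Input and Output pickle the same way, provided the defining module (app_types.py in the example) can be imported on both the master and the slave side.

```python
import io
import pickle

# Serialize a value with Pickler, as the master would for a job's input.
buffer = io.BytesIO()
pickle.Pickler(buffer).dump({"value": 7})

# Rewind and deserialize with Unpickler, as the slave would on the client.
buffer.seek(0)
restored = pickle.Unpickler(buffer).load()
```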

The PyBOINC setup script creates an application 'pyboinc'. Its work units have two input files: a Python program and a data file. The application runs a Python interpreter on the program file. The application's executable is a shell script on Linux/Mac and a batch file on Windows, which runs the Python interpreter on the client code:

python app_client.py

  • Question: what if no Python interpreter is present on a Windows machine? Does the Python license allow the interpreter to be distributed?

PyBOINC uses the following daemons:

  • validator: uses the sample bitwise validator (it remains to be checked whether Python produces identical floating-point results on different platforms)
  • assimilator: uses a variant of sample_assimilator. Given a completed result, it looks up the batch record, then copies the output file to BATCH_DIR/new/
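
The assimilator's copy step might look like the sketch below. The function name assimilate and its arguments are hypothetical; a real assimilator would obtain the batch directory from the batch record and the output file path from the completed result.

```python
import os
import shutil
import tempfile

def assimilate(output_path, batch_dir):
    # Copy one completed result file into BATCH_DIR/new/ and return the new path.
    new_dir = os.path.join(batch_dir, "new")
    os.makedirs(new_dir, exist_ok=True)
    return shutil.copy(output_path, new_dir)

# Demonstration with temporary directories standing in for the upload
# hierarchy and the batch directory.
upload_dir = tempfile.mkdtemp()
batch_dir = tempfile.mkdtemp()
result_file = os.path.join(upload_dir, "result_1")
with open(result_file, "wb") as f:
    f.write(b"pickled output would go here")

copied = assimilate(result_file, batch_dir)
```

Because the master polls new/, copying under a temporary name and then renaming would keep it from seeing a partly written file; the sketch omits this.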

Pseudocode for the various PyBOINC functions:

static jobID

pyboinc_call(slave_filename, input)
    create a uniquely-named file x in the download hierarchy, file name should contain batch ID
    Pickler(x).dump(input)
    create_work()

pyboinc_master(make_calls, handle_result)
    read jobID from pyboinc_checkpoint
    if none
        create a batch record; jobID = its ID
        make_calls()
        write jobID to checkpoint file
    move all files from old/ to new/
    while (not all jobs done)
        if there is a file x in new/
            output = Unpickler(x).load()
            handle_result(output)
            move x to old/
        else
            sleep(1)

pyboinc_get_input()
    boinc_resolve_filename("input", infile)
    return Unpickler(infile).load()

pyboinc_return_output(output)
    boinc_resolve_filename("output", outfile)
    Pickler(outfile).dump(output)
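
The control flow of pyboinc_master can be tried locally with the following sketch. It is not the PyBOINC implementation: run_master and its n_jobs parameter are assumptions (the pseudocode does not say how "all jobs done" is determined), the checkpoint holds a placeholder instead of a real batch ID, and make_calls writes result files directly in place of remote execution and the assimilator.

```python
import os
import pickle
import shutil
import tempfile
import time

CHECKPOINT = "pyboinc_checkpoint"

def run_master(batch_dir, n_jobs, make_calls, handle_result, poll_seconds=1.0):
    """Hypothetical local version of pyboinc_master's control flow."""
    new_dir = os.path.join(batch_dir, "new")
    old_dir = os.path.join(batch_dir, "old")
    os.makedirs(new_dir, exist_ok=True)
    os.makedirs(old_dir, exist_ok=True)

    checkpoint = os.path.join(batch_dir, CHECKPOINT)
    if not os.path.exists(checkpoint):
        # First run: create the jobs, then record that they exist.
        make_calls()
        with open(checkpoint, "w") as f:
            f.write("1")  # placeholder for the batch ID

    # After a restart, results handled by the aborted run are handled again,
    # because state built up by handle_result was lost with the old process.
    for name in os.listdir(old_dir):
        shutil.move(os.path.join(old_dir, name), os.path.join(new_dir, name))

    done = 0
    while done < n_jobs:
        names = sorted(os.listdir(new_dir))
        if not names:
            time.sleep(poll_seconds)
            continue
        path = os.path.join(new_dir, names[0])
        with open(path, "rb") as f:
            output = pickle.Unpickler(f).load()
        handle_result(output)
        shutil.move(path, os.path.join(old_dir, names[0]))
        done += 1

batch_dir = tempfile.mkdtemp()

def make_calls():
    # Stand-in for job creation plus remote execution: write results directly.
    for i in range(1, 4):
        with open(os.path.join(batch_dir, "new", "result_%d" % i), "wb") as f:
            pickle.Pickler(f).dump(i * i)

results = []
run_master(batch_dir, 3, make_calls, results.append)

# A second run simulates restarting the master after ^C: no new jobs are
# created, and the existing results are handled again.
rerun = []
run_master(batch_dir, 3, make_calls, rerun.append)
```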

PyMW - Master Worker Computing in Python

PyMW is a Python module for parallel master-worker computing in a variety of environments, including BOINC. With the PyMW module, users can write a single program that scales from multi-core machines to global computing platforms.

PyMW can be downloaded from here.

A BOINC project must be prepared before it can accept applications written in Python. BOINC ships with a setup script for PyMW, pymw_setup.py, located in the bin directory of the BOINC project. The absolute path of the PyMW working directory must be passed to the setup script with the -p or --pymw switch; the PyMW assimilator copies results to this directory. For example:

cd /home/kadam/projects/sandbox/bin
pymw_setup -p /home/kadam/pymw/examples/tasks

The PyMW setup script performs the following steps, in order:

  • Insert pymw_assimilator in the daemons section of config.xml
  • Insert pymw application in project.xml
  • Create pymw directory in the app directory of the project
  • Create client application executables for the Linux platform
  • Create client application executables for the Windows platform
  • Call xadd
  • Call update_versions

Once successfully executed, the project is ready to handle PyMW jobs. The setup script can be called multiple times. However, if the pymw client application is already registered with the project, the setup script will only alter the working directory of the pymw_assimilator in the config.xml file.
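
The "insert once, then only update" behavior for config.xml can be sketched as follows. This is not the code of pymw_setup.py: the function set_daemon is hypothetical, and so is the -pymw_dir argument shown on the assimilator command line. The only assumption about config.xml is the usual layout of daemon entries inside a daemons element.

```python
import xml.etree.ElementTree as ET

def set_daemon(config_text, name, cmd):
    """Add a <daemon> whose command starts with `name`, or update it if present."""
    root = ET.fromstring(config_text)
    daemons = root.find("daemons")
    if daemons is None:
        daemons = ET.SubElement(root, "daemons")
    for cmd_el in daemons.iter("cmd"):
        if (cmd_el.text or "").split()[:1] == [name]:
            cmd_el.text = cmd  # already registered: only change the command
            break
    else:
        daemon = ET.SubElement(daemons, "daemon")
        ET.SubElement(daemon, "cmd").text = cmd
    return ET.tostring(root, encoding="unicode")

sample = (
    "<boinc><config></config><daemons>"
    "<daemon><cmd>feeder -d 3</cmd></daemon>"
    "</daemons></boinc>"
)

# First call registers the assimilator; the second only changes its directory.
# The "-pymw_dir" argument is a hypothetical placeholder.
once = set_daemon(sample, "pymw_assimilator", "pymw_assimilator -pymw_dir /tmp/first")
twice = set_daemon(once, "pymw_assimilator", "pymw_assimilator -pymw_dir /tmp/second")
```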

To use PyMW with BOINC, select the BOINC interface with the -i or --interface switch, and pass the home directory of the BOINC project with the -p or --project_home switch. For example:

cd /home/kadam/pymw/examples
python monte_pi.py -i boinc -p /home/kadam/projects/sandbox/
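
For readers writing their own scripts, the sketch below shows one way a program could accept the same two switches using the standard argparse module. It is not taken from PyMW or from monte_pi.py; the function parse_args and the help texts are illustrative assumptions.

```python
import argparse

def parse_args(argv):
    # Hypothetical option parsing that mirrors the switches described above.
    parser = argparse.ArgumentParser(description="Example master-worker script")
    parser.add_argument("-i", "--interface", default=None,
                        help="interface to run on, e.g. boinc")
    parser.add_argument("-p", "--project_home", default=None,
                        help="home directory of the BOINC project")
    return parser.parse_args(argv)

args = parse_args(["-i", "boinc", "-p", "/home/kadam/projects/sandbox/"])
```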

For further information on using PyMW, please refer to the documentation of PyMW.