= PyBOINC: simplified BOINC application development in Python =

[[T(DesignDocument)]]

This is a proposed design for making the development of BOINC applications as simple as possible.

PyBOINC provides a master/slave model: the master runs on the server, and the slave is distributed.

Here's an example, which sums the squares of the integers from 1 to 100. The application consists of three files. The first, '''app_types.py''', defines the input and output types:
{{{
class Input:
    def __init__(self, arg):
        self.value = arg

class Output:
    def __init__(self, arg):
        self.value = arg
}}}

The second file, '''app_master.py''', is the master program:
{{{
from app_types import Input

# pyboinc_call and pyboinc_master are provided by PyBOINC

total = 0

def make_calls():
    # create one job per integer from 1 to 100
    for i in range(1, 101):
        job_input = Input(i)
        pyboinc_call('app_slave.py', job_input)

def handle_result(output):
    # called once for each completed job
    global total
    total += output.value

pyboinc_master(make_calls, handle_result)
print("The answer is %d" % total)
}}}

The third file, '''app_slave.py''', is the slave function:
{{{
from app_types import Output

# pyboinc_get_input and pyboinc_return_output are provided by PyBOINC
job_input = pyboinc_get_input()
output = Output(job_input.value * job_input.value)
pyboinc_return_output(output)
}}}

The procedure for running this program is:
 * Create a BOINC project
 * Run a script ops/py_boinc.php that configures the project to use PyBOINC
 * Set an environment variable PYBOINC_DIR to the root directory of the project
 * Create a directory (anywhere) containing the above files
 * In that directory, type
   {{{
   python app_master.py
   }}}
 * This command may take a long time. If it's aborted via !^C, it may be repeated later. In that case no new jobs are created, and the master waits for the completion of the remaining slaves.

== Implementation ==

PyBOINC uses a new table, 'batch', which represents a group of jobs.
Its fields are:
 * ID
 * ID of the user who submitted this batch
 * path of the 'batch directory'

PyBOINC uses the following files and subdirectories in the batch directory:
 * pyboinc_checkpoint: if present, this contains a job ID
 * new/: result files not yet handled
 * old/: result files already handled

PyBOINC uses Python's [http://docs.python.org/lib/node317.html Pickler] class for serialization.

The PyBOINC setup script creates an application 'pyboinc'. Its work units have two input files: a Python program and a data file. The application runs a Python interpreter on the program file.

The executable of the application is a shell script on Linux/Mac and a batch file on Windows, which runs the Python interpreter on the client code:
{{{
python app_client.py
}}}
 * Question: what if the Python interpreter is not present on a Windows box? Does the Python license allow distribution of the interpreter?

PyBOINC uses the following daemons:
 * validator: uses the sample bitwise validator (need to check what Python produces for floating-point operations on the different platforms)
 * assimilator: uses a variant of sample_assimilator.
   Given a completed result, it looks up the batch record, then copies the output file to BATCH_DIR/new/

Pseudocode for the various PyBOINC functions:
{{{
static jobID

pyboinc_call(slave_filename, input)
    create a uniquely-named file x in the download hierarchy;
        the file name should contain the batch ID
    Pickler(x).dump(input)
    create_work()

pyboinc_master(make_calls, handle_result)
    read jobID from pyboinc_checkpoint
    if none
        create a batch record; jobID = its ID
        make_calls()
        write jobID to checkpoint file
    move all files from old/ to new/
    while (not all jobs done)
        if there is a file x in new/
            output = Unpickler(x).load()
            handle_result(output)
            move x to old/
        else
            sleep(1)

pyboinc_get_input()
    boinc_resolve_filename("input", infile)
    return Unpickler(infile).load()

pyboinc_return_output(output)
    boinc_resolve_filename("output", outfile)
    Pickler(outfile).dump(output)
}}}

= PyMW - Master Worker Computing in Python =

PyMW is a Python module for parallel master-worker computing in a variety of environments, including BOINC. With the PyMW module, users can write a single program that scales from multi-core machines to global computing platforms.

== Download ==

PyMW can be downloaded from [http://pymw.sourceforge.net/ here].

== Installation ==

A BOINC project has to be prepared to accept applications written in Python. BOINC comes with an install script for PyMW. The setup script, pymw_setup.py, can be found in the bin directory of the BOINC project. An absolute path to the PyMW working directory has to be passed to the setup script with the -p or --pymw switch. The assimilator program for PyMW will copy the results to this directory.
Example of running the setup script:
{{{
cd /home/kadam/projects/sandbox/bin
pymw_setup -p /home/kadam/pymw/examples/tasks
}}}

The PyMW setup script performs the following steps, in order:
 * Insert pymw_assimilator in the daemons section of config.xml
 * Insert the pymw application in project.xml
 * Create the pymw directory in the app directory of the project
 * Create client application executables for the Linux platform
 * Create client application executables for the Windows platform
 * Call xadd
 * Call update_versions

== Platforms ==

By default PyMW assumes that the Python interpreter is installed on the client computers. The Python interpreter is more likely to be installed on Linux clients. On Windows, PyMW assumes that Python is installed and that the path environment variable contains the path to the interpreter. If Python is installed on a Windows box but is not in the path, then PyMW will look in the default installation directory, which is C:\Python25 with the current version of Python. An OS X client is coming soon.

If Python is not installed on the client computer, PyMW jobs will fail. The [http://www.python.org/psf/license/ license of Python] makes it possible to deploy the interpreter as part of the client application. However, the current version of the interpreter is about 35 MB, which is not very network friendly. A later version of PyMW is planned to transfer the interpreter over the network in compressed form, reducing network load while keeping PyMW widely available.

== Usage ==

Once the setup script has executed successfully, the project is ready to handle PyMW jobs. The setup script can be called multiple times. However, if the pymw client application is already registered with the project, the setup script will only alter the working directory of the pymw_assimilator in the config.xml file.

To use PyMW with the BOINC interface, PyMW has to be configured with the -i or --interface switch.
The home directory of the BOINC project also needs to be passed to PyMW with the -p or --project_home switch.

Example of running PyMW with the BOINC interface:
{{{
cd /home/kadam/pymw/examples
python monte_pi.py -i boinc -p /home/kadam/projects/sandbox/
}}}

For further information on using PyMW, please refer to the documentation of PyMW.
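
The two switches described above could be handled by a script along the following lines. This is only a sketch, not the actual monte_pi.py, whose contents are not shown on this page. It assumes, from the file name, that the example estimates pi by Monte Carlo sampling. The names parse_args, count_hits and estimate_pi, and the default interface name "local", are hypothetical. No PyMW API is used: the jobs are simply run in a local loop where a real program would hand them to PyMW.

```python
import argparse
import random


def parse_args(argv=None):
    """Parse the -i/--interface and -p/--project_home switches (hypothetical helper)."""
    parser = argparse.ArgumentParser(description="Monte Carlo estimate of pi")
    # "local" is a placeholder default for this sketch, not a PyMW interface name.
    parser.add_argument("-i", "--interface", default="local",
                        help="interface to use, e.g. boinc")
    parser.add_argument("-p", "--project_home", default=None,
                        help="home directory of the BOINC project")
    args = parser.parse_args(argv)
    if args.interface == "boinc" and not args.project_home:
        parser.error("--project_home is required with the boinc interface")
    return args


def count_hits(seed, num_points):
    """Worker-side task: count random points that fall inside the quarter circle."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_points):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits


def estimate_pi(num_tasks, points_per_task):
    """Master-side logic: each count_hits call stands in for one distributed job."""
    total = sum(count_hits(seed, points_per_task) for seed in range(num_tasks))
    return 4.0 * total / (num_tasks * points_per_task)


def main(argv=None):
    args = parse_args(argv)
    print("interface=%s project_home=%s" % (args.interface, args.project_home))
    print("pi is approximately %.4f" % estimate_pi(10, 10000))


# Demo invocation with the same switches as the command shown above.
main(["-i", "boinc", "-p", "/home/kadam/projects/sandbox/"])
```

Splitting the work into independently seeded tasks keeps each job self-contained, which is the property a master-worker system needs in order to run the jobs on different machines.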