wiki:WrapperApp

Version 13 (modified by davea, 10 years ago) (diff)

--

Legacy applications

A legacy application is one which doesn't use the BOINC API (for example, because the source code is not available). Such applications can be run under BOINC using a 'wrapper application' supplied by BOINC. The wrapper handles all communication with the core client, and runs the legacy application as a subprocess:

http://boinc.berkeley.edu/wrapper.png

The wrapper program (called wrapper) is in boinc_samples. It reads a file with logical name 'job.xml'. This file has the format:

<job_desc>
    <task>
        <application>worker_5.10_windows_intelx86.exe</application>
        [ <stdin_filename>stdin_file</stdin_filename> ]
        [ <stdout_filename>stdout_file</stdout_filename> ]
        [ <stderr_filename>stderr_file</stderr_filename> ]
        [ <command_line>--foo bar</command_line> ]
    </task>
    [ ... ]
</job_desc>

The job file specifies a sequence of tasks. The descriptor for each task includes the name of the application. If the application uses standard I/O (stdin, stdout or stderr) the descriptor specifies the logical names of the files to which these are to be connected. The descriptor may also specify command-line argments to be passed to the application. wrapper itself may be passed command-line arguments (specified in the workunit template); these are passed to each of the applications after those specified in the job file.

The job file can specify multiple tasks. This is useful for two purposes:

  • To handle jobs that involve multiple steps (e.g., preprocessing and postprocessing).
  • To break a long job up into smaller pieces. This provides a form of checkpointing: wrapper does checkpointing at the task level, so that lost CPU time is limited even if the legacy applications themselves are not restartable.

Notes:

  • The job file can be part of the workunit (e.g. if its command line elements differ between workunits) or the application version (if it's the same between workunits).
  • Files opened directly by a worker program must have the <copy_file/> tag. This requires version 5.5 or higher of the BOINC core client (you can specify this limit at either the application or project level.
  • If you run wrapper in standalone mode (while debugging), you must provide input files with the proper logical, not physical, names.

Example

Here's an example of this mechanism:

  • Compile the program 'worker' from the boinc_samples tree, producing (say) 'worker_5.10_windows_intelx86.exe'. This is the legacy app. If reads from stdin and writes to stdout; it also opens and reads a file 'in', and opens and writes a file 'out'. It takes one command-line argument: the number of CPU seconds to use.
  • Compile the program 'wrapper' from the boinc_samples tree, producing (say) 'wrapper_5.10_windows_intelx86.exe'. This program executes your legacy application, and acts as a proxy for it (to report CPU time etc.).
  • Create an application named 'worker', and a corresponding directory 'project/apps/worker'. In this directory, create a directory 'wrapper_5.10_windows_intelx86.exe'. Put the files 'wrapper_5.10_windows_intelx86.exe', and 'worker_5.10_windows_intelx86.exe' there.
  • In the same directory, create a file 'job.xml=job_1.12.xml' (1.12 is a version number) containing
    <job_desc>
        <task>
            <application>worker_5.10_windows_intelx86.exe</application>
            <stdin_filename>stdin</stdin_filename>
            <stdout_filename>stdout</stdout_filename>
            <command_line>10</command_line>
        </task>
    </job_desc>
    
    This file (which has logical name 'job.xml' and physical name 'job_1.12.xml') is read by 'wrapper'; it tells it the name of the legacy program, what files to connect to its stdin/stdout, and a command line.
  • Create a workunit template file
    <file_info>
        <number>0</number>
    </file_info>
    <file_info>
        <number>1</number>
    </file_info>
    <workunit>
        <file_ref>
            <file_number>0</file_number>
            <open_name>in</open_name>
            <copy_file/>
        </file_ref>
        <file_ref>
            <file_number>1</file_number>
            <open_name>stdin</open_name>
        </file_ref>
        <rsc_fpops_bound>1000000000000</rsc_fpops_bound>
        <rsc_fpops_est>1000000000000</rsc_fpops_est>
    </workunit>
    
    and a result template file
    <file_info>
        <name><OUTFILE_0/></name>
        <generated_locally/>
        <upload_when_present/>
        <max_nbytes>5000000</max_nbytes>
        <url><UPLOAD_URL/></url>
    </file_info>
    <file_info>
        <name><OUTFILE_1/></name>
        <generated_locally/>
        <upload_when_present/>
        <max_nbytes>5000000</max_nbytes>
        <url><UPLOAD_URL/></url>
    </file_info>
    <result>
        <file_ref>
            <file_name><OUTFILE_0/></file_name>
            <open_name>out</open_name>
            <copy_file/>
        </file_ref>
        <file_ref>
            <file_name><OUTFILE_1/></file_name>
            <open_name>stdout</open_name>
        </file_ref>
    </result>
    
  • Run update_versions to create an app version.
  • Run a script like
    cp download/input `bin/dir_hier_path input`
    cp download/input2 `bin/dir_hier_path input2`
    
    bin/create_work -appname worker -wu_name worker_nodelete \
    -wu_template templates/worker_wu \
    -result_template templates/worker_result \
    input input2
    
    to generate a workunit.

To understand how all this works: at the beginning of execution, the file layout is:

Project directoryslot directory
inputin (copy of project/input)
job_1.12.xmljob.xml (link to project/job_1.12.xml)
input2stdin (link to project/input2)
worker_nodelete_0stdout (link to project/worker_nodelete_0)
worker_5.10_windows_intelx86.exeworker_5.10_windows_intelx86.exe (link to project/worker_5.10_windows_intelx86.exe)
wrapper_5.10_windows_intelx86.exewrapper_5.10_windows_intelx86.exe(link to project/wrapper_5.10_windows_intelx86.exe)

The wrapper program executes the worker, connecting its stdin to project/input2 and its stdout to project/worker_nodelete_0. The worker program opens 'in' for reading and 'out' for writing.

When the worker program finishes, the wrapper sees this and exits. Then the BOINC core client copies slot/out to project/worker_nodelete_1.