
Multi-size apps

The difference in throughput between a slow processor (e.g. an Android device that runs infrequently) and a fast processor (e.g. a GPU that's always on) can be a factor of 1,000 or more. Having a single job size can therefore present problems:

  • If the size is small, hosts with GPUs get huge numbers of jobs. This causes performance problems on the client and a high DB load on the server.
  • If the size is large, slow hosts can't get jobs, or they get jobs that take weeks to finish.

To address this, BOINC provides a mechanism that tries to send large jobs to fast devices and small jobs to slow devices.

How it works

A multi-size application has a set of N size classes, 0 ... N-1. Each job belongs to a size class. Jobs of size class i are smaller than those of size class i+1. You decide how many size classes to have, and how large the jobs of a given size class are.

A size_census script periodically computes statistics about the "effective speed" of devices for each multi-size app, where effective speed is the device speed times host availability. In particular, it computes and maintains the boundaries of the N quantiles.
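To make the idea of quantile boundaries concrete, here is a minimal Python sketch. It is not the code of the size_census script; the function names and the nearest-rank rule used to pick the boundaries are assumptions made for illustration only.

```python
def effective_speed(device_flops, availability):
    """Effective speed: device speed times the fraction of time the host is available."""
    return device_flops * availability


def quantile_boundaries(speeds, n_size_classes):
    """Return the N-1 interior boundaries that split `speeds` into N groups
    of roughly equal size (hypothetical helper, nearest-rank style)."""
    if n_size_classes < 1:
        raise ValueError("n_size_classes must be at least 1")
    ordered = sorted(speeds)
    if not ordered:
        return []
    boundaries = []
    for k in range(1, n_size_classes):
        # Index of the first element that belongs to group k.
        idx = (k * len(ordered)) // n_size_classes
        boundaries.append(ordered[min(idx, len(ordered) - 1)])
    return boundaries
```

With nine devices and three size classes, the two boundaries fall at the fourth and seventh slowest devices, so each class covers about a third of the population.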

When a host requests work for a particular device, the scheduler computes that device's quantile for each multi-size application and preferentially sends the host jobs of the corresponding size class. If it must send jobs of a different size class, it prefers smaller classes.
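The selection rule can be sketched in Python as follows. This is an illustration, not the scheduler's code: the `boundaries` list (sorted interior quantile boundaries), both function names, and the exact fallback order (nearest smaller class first, larger classes last) are assumptions.

```python
import bisect


def size_class_for(speed, boundaries):
    """Map an effective speed to a size class, given sorted interior boundaries.
    A speed equal to a boundary falls into the upper class."""
    return bisect.bisect_right(boundaries, speed)


def preference_order(target, n_size_classes):
    """Size classes in the order to try them: the target class first,
    then smaller classes (nearest first), then larger ones (assumed)."""
    smaller = list(range(target - 1, -1, -1))
    larger = list(range(target + 1, n_size_classes))
    return [target] + smaller + larger
```

For example, with boundaries [4, 7] a device of effective speed 5 maps to size class 1, and the order in which classes would be tried is 1, 0, 2.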

Set up the application

To make an app multi-size, set the n_size_classes field of its database entry. Currently this must be done manually, e.g.

update app set n_size_classes=3 where id=14;

Job creation

Set the size class of jobs as you create them. From C++:

...
wu.size_class = 2;
ret = create_work(wu, ...);

From scripts or command line:

create_work ... --size_class 2

Also set wu.rsc_fpops_est and wu.rsc_fpops_bound to match the size of each job; jobs in larger size classes need correspondingly larger values.

You may want your work generator to maintain a supply of jobs of each size class. To find the number of unsent jobs of a given size class, use

int count_unsent_results(int&, int appid, int size_class);
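A work generator that keeps a supply of every size class might follow the logic below. This is a Python sketch of the bookkeeping only: `count_unsent` is a hypothetical stand-in for a call to count_unsent_results(), and `target_per_class` is an assumed tuning parameter, not a BOINC setting.

```python
def plan_job_creation(count_unsent, n_size_classes, target_per_class):
    """Return {size_class: number of jobs to create} for every class whose
    unsent count is below the target.

    count_unsent: callable taking a size class and returning the number of
    unsent jobs in that class (hypothetical stand-in for a database query).
    """
    plan = {}
    for size_class in range(n_size_classes):
        shortfall = target_per_class - count_unsent(size_class)
        if shortfall > 0:
            plan[size_class] = shortfall
    return plan
```

The generator would then create the planned number of jobs for each class, setting the size class on each one as described above.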

Daemon configuration

The script size_census.php computes effective speed statistics for multi-size apps, and writes them to flat files (named size_census_APPNAME) in the project directory. Arrange to run it periodically by putting the following in your config.xml:

    <task>
      <cmd>run_in_ops size_census.php</cmd>
      <output>size_census.out</output>
      <period>24 hour</period>
    </task>

If you run the script with the --all_apps option, it will compute the statistics of all apps, not just multi-size ones. This is useful when you're getting things set up.

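For each multi-size app, you must also run a size_regulator daemon. It regulates the flow of jobs into the shared-memory job cache so that the cache doesn't get clogged with jobs of a single size. The example below is for an app named uppercase. Note that the entry is marked disabled; set <disabled> to 0 or remove the element to run the daemon.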

    <daemon>
      <cmd>size_regulator --app_name uppercase --lo 10 --hi 30 --sleep_time 10</cmd>
      <output>size_regulator_uppercase.out</output>
      <pid_file>size_regulator_uppercase.pid</pid_file>
      <disabled>1</disabled>
    </daemon>

The command-line options of size_regulator are:

--app_name
    name of the application
--lo
    keep at least this many jobs of each size class in cache
--hi
    keep at most this many jobs of each size class in cache
--sleep_time
    sleep this long if nothing to do

The following options correspond to those of the feeder; pass size_regulator the same ones you pass to the feeder.

--random_order
--priority_asc
--priority_order
--priority_order_create_time
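One plausible reading of how --lo and --hi interact is a low/high watermark rule: when a size class drops below the low mark, top it up to the high mark. The Python sketch below shows that rule only. It is an assumption made for illustration, not taken from the size_regulator source, and the function and parameter names are hypothetical.

```python
def jobs_to_release(available_in_class, lo, hi):
    """Number of additional jobs of one size class to make available.

    Assumed rule: do nothing while the class holds at least `lo` jobs;
    otherwise release enough to bring it up to `hi`.
    """
    if lo > hi:
        raise ValueError("lo must not exceed hi")
    if available_in_class < lo:
        return hi - available_in_class
    return 0
```

Under this reading, with --lo 10 --hi 30, a class that has dropped to 4 available jobs would be topped up with 26 more, while a class holding 10 or more would be left alone until the next check.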

Configuration

To use this feature, you must include the following in your config.xml:

<job_size_matching>1</job_size_matching>
Last modified on May 27, 2015, 12:33:25 PM