
Version 19 (modified by davea, 16 years ago)


API for multi-thread apps


Why write a multi-threaded app?

The average number of cores per PC will increase over the next few years, possibly at a faster rate than the average amount of available RAM.

Depending on your application and project, it may be desirable to develop a multi-threaded application. Possible reasons to do this:

  • Your application's memory footprint is large enough that, on some PCs, there's not enough RAM to run a separate copy of the app on each CPU.
  • You want to reduce the turnaround time of your jobs (either because of human factors, or to reduce server occupancy).

Writing and debugging a multi-threaded app is hard. You may be able to reduce the effort by using parallel languages such as Titanium or Cilk, or libraries of multi-threaded numerical "kernels".

Deploying a multi-threaded app version

BOINC uses the application planning mechanism to coordinate the scheduling of multi-threaded applications.

Suppose you've developed a multi-threaded program, and that it achieves a linear speedup on up to 16 processors, and no additional speedup beyond that. To deploy it:

  • Choose a "planning class" name for the program, say "par16" (see below).
  • Create an app version. Include a file app_plan containing "par16".
  • Link the following function into your scheduler:
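    // Returns true and fills in hu if plan_class is recognized; false otherwise.
    bool app_plan(SCHEDULER_REQUEST& sreq, const char* plan_class, HOST_USAGE& hu) {
        if (!strcmp(plan_class, "par16")) {
            if (sreq.host.p_ncpus < 16) {
                // Fewer than 16 CPUs: use all of them.
                hu.ncpus = sreq.host.p_ncpus;
                hu.flops = sreq.host.p_fpops*sreq.host.p_ncpus;
            } else {
                // No additional speedup beyond 16 CPUs, so cap usage there.
                hu.ncpus = 16;
                hu.flops = sreq.host.p_fpops*16;
            }
            return true;
        }
        return false;
    }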
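
To check what the function above reports for a given host, it can be compiled outside the scheduler against small stand-in types. The sketch below is only illustrative: the HOST, SCHEDULER_REQUEST and HOST_USAGE structs are hypothetical and model only the fields used here (the real scheduler types have many more), and the two branches are folded into a single minimum.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-ins for the scheduler's types; only the fields
// that app_plan() touches are modeled.
struct HOST {
    int p_ncpus;      // number of CPUs on the host
    double p_fpops;   // floating-point ops/sec of one CPU
};
struct SCHEDULER_REQUEST { HOST host; };
struct HOST_USAGE {
    double ncpus;     // CPUs the app version will use on this host
    double flops;     // estimated FLOPS of the app version on this host
};

// Same policy as the deployment example: linear speedup up to 16 CPUs.
bool app_plan(SCHEDULER_REQUEST& sreq, const char* plan_class, HOST_USAGE& hu) {
    assert(plan_class != nullptr);
    if (!std::strcmp(plan_class, "par16")) {
        // Use every CPU the host has, but never more than 16.
        int n = sreq.host.p_ncpus < 16 ? sreq.host.p_ncpus : 16;
        hu.ncpus = n;
        hu.flops = sreq.host.p_fpops * n;
        return true;
    }
    return false;   // unknown planning class
}
```

Under these assumptions, a 4-CPU host at 1e9 FLOPS per CPU is assigned ncpus = 4 and flops = 4e9, while a 32-CPU host at 2e9 FLOPS per CPU is capped at ncpus = 16 and flops = 32e9.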
    

Client scheduling policy

Suppose an app A uses NT(A) threads. Ideally, on a host with N CPUs, we want NT(A), summed over running apps, to be at least N; otherwise CPU time is wasted. However, if it's much more than N, we increase latency without increasing throughput, we may increase synchronization overhead, and we may use more RAM than needed.

Given a list of runnable applications (ordered by priority or deadline), the CPU scheduler runs applications until the number of threads reaches a limit. If the user has specified a limit N on the number of CPUs to use, the number of threads cannot exceed N. Otherwise, the scheduler stops when the number of threads is >= NCPUS.
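
The policy above can be sketched as a single pass over the ordered list. This is not the client's actual code: RunnableApp and choose_apps are hypothetical names, and one detail is an assumption, namely that under a user limit an app that would exceed the limit is skipped and the scan continues to lower-priority apps.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical description of a runnable app; not a BOINC client type.
struct RunnableApp {
    std::string name;
    int nthreads;   // NT(A): number of threads the app uses
};

// apps:       runnable apps, already ordered by priority or deadline
// ncpus:      number of CPUs on the host (NCPUS)
// user_limit: user's cap on CPUs to use, or 0 if none was specified
std::vector<std::string> choose_apps(const std::vector<RunnableApp>& apps,
                                     int ncpus, int user_limit) {
    assert(ncpus > 0 && user_limit >= 0);
    std::vector<std::string> chosen;
    int threads = 0;
    for (const RunnableApp& app : apps) {
        if (user_limit > 0) {
            // Hard cap: the thread count may never exceed the user's limit.
            // Assumption: skip an app that does not fit and keep scanning.
            if (threads + app.nthreads > user_limit) continue;
        } else if (threads >= ncpus) {
            // No user limit: stop once the thread count has reached NCPUS.
            break;
        }
        chosen.push_back(app.name);
        threads += app.nthreads;
    }
    return chosen;
}
```

For apps a, b, c, d with 4, 2, 8 and 1 threads on an 8-CPU host, the sketch runs a, b and c when there is no user limit (14 threads, overshooting NCPUS), and only a and b under a user limit of 6.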

Notes

  • The average number of CPUs used by an app may be less than its number of threads (because of I/O or synchronization). Ideally the client should count the number of CPUs, not the number of threads.
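
One way to estimate the average number of CPUs an app actually uses is to divide the CPU time it consumed over an interval by the wall-clock length of that interval. The helper below is a hypothetical sketch of that arithmetic, not part of the client; how the two times would be measured is left open.

```cpp
#include <cassert>

// Hypothetical helper: average number of CPUs an app kept busy over an
// interval, given CPU time summed over all its threads and elapsed time.
double avg_cpus_used(double cpu_seconds, double wall_seconds) {
    assert(cpu_seconds >= 0 && wall_seconds > 0);
    return cpu_seconds / wall_seconds;
}
```

For example, an app with 4 threads that accumulates 30 seconds of CPU time in a 10-second interval has used 3 CPUs on average, so counting it as 4 would overstate its load.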