= API for multi-threaded apps =

[[T(DesignDocument)]]

== Why write a multi-threaded app? ==

The average number of cores per PC will increase over the next few years,
possibly at a faster rate than the average amount of available RAM.

Depending on your application and project, it may be desirable
to develop a multi-threaded application.
Possible reasons to do this:

 * Your application's memory footprint is large enough that, on some PCs, there is not enough RAM to run a separate copy of the app on each CPU.

 * You want to reduce the turnaround time of your jobs (either because of human factors, or to reduce server occupancy).

Writing and debugging a multi-threaded app is hard.
You may be able to reduce the effort by using parallel languages such as Titanium or Cilk,
or libraries of multi-threaded numerical "kernels".

== Assumptions ==

Suppose an app A uses NT(A) threads.

Ideally, on a host with N CPUs, we want
NT(A), summed over running apps, to be about N.
If the sum is less than N, some CPU time goes unused.
If it is more than N, then
 * we increase latency without increasing throughput
 * we use more RAM than needed
 * synchronization overhead increases

We assume that applications may be able to change NT(A) dynamically
in response to suggestions from BOINC.

Example: suppose
 * the host has 80 cores
 * app A can use 1, 2, 4, 8, 16, or 32 threads
 * app B can use 1, 2, 4, 8, 16, 32, or 64 threads

Then we want the thread counts of the running jobs to sum to 80 most of the time,
for example (16, 64) with two jobs or (32, 32, 16) with three.
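
The arithmetic in this example can be checked with a small brute-force search.
This is only an illustrative sketch, not a proposal for how the client should choose thread counts;
{{{best_assignment()}}} and {{{Choices}}} are hypothetical names.
{{{
#include <cassert>
#include <cstddef>
#include <vector>

// Allowed thread counts for one job, e.g. {1, 2, 4, 8, 16, 32}.
using Choices = std::vector<int>;

// Try every allowed count for job i and keep the assignment with the
// largest total that does not exceed ncpus.
static void search(const std::vector<Choices>& jobs, std::size_t i, int used,
                   int ncpus, std::vector<int>& current,
                   std::vector<int>& best, int& best_total) {
    if (i == jobs.size()) {
        if (used > best_total) {
            best_total = used;
            best = current;
        }
        return;
    }
    for (int n : jobs[i]) {
        if (used + n > ncpus) continue;  // would oversubscribe the host
        current.push_back(n);
        search(jobs, i + 1, used + n, ncpus, current, best, best_total);
        current.pop_back();
    }
}

// Returns one thread count per job, or an empty vector if no
// combination fits within ncpus.
std::vector<int> best_assignment(const std::vector<Choices>& jobs, int ncpus) {
    assert(ncpus > 0);
    std::vector<int> current, best;
    int best_total = -1;
    search(jobs, 0, 0, ncpus, current, best, best_total);
    return best;
}
}}}
For one job of app A and one of app B on 80 cores, this returns (16, 64).
With three jobs, several assignments reach 80, and the function returns the first one it finds.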

== Proposal ==

API functions:
{{{
int boinc_nthreads_hint();
}}}
An application calls {{{boinc_nthreads_hint()}}} periodically,
at points where it is able to change its number of threads.
It returns a suggested number N of threads.
The application should change its number of threads to
a value as large as possible but no greater than N.
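
A minimal sketch of this rule for an app that supports only certain thread counts.
{{{pick_nthreads()}}} is a hypothetical helper, and falling back to the smallest supported count
when the hint is below all of them is an assumption, not part of the proposal.
{{{
#include <cassert>
#include <vector>

// Returns the largest supported thread count that does not exceed hint.
// Assumption: if hint is smaller than every supported count, use the
// smallest supported count, since the app cannot run with fewer.
int pick_nthreads(const std::vector<int>& supported, int hint) {
    assert(!supported.empty());
    int best = 0;
    int smallest = supported[0];
    for (int n : supported) {
        if (n < smallest) smallest = n;
        if (n <= hint && n > best) best = n;
    }
    return best > 0 ? best : smallest;
}
}}}
For example, an app that supports 1, 2, 4, 8, 16, or 32 threads and receives a hint of 20 would run 16 threads.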
{{{
void boinc_nthreads(int actual, int possible);
}}}
An application calls this to report its actual number of threads,
and its maximum possible number of threads.
It should call this whenever either quantity changes.
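
The two calls might be combined as follows at a point where the app can resize its thread pool.
This is a sketch only: the two API functions are replaced by stand-in stubs so that the example compiles on its own,
and {{{App}}}, {{{adjust_threads()}}} and the {{{g_}}} variables are hypothetical.
{{{
#include <cassert>

// Stand-ins for the proposed API (not the real implementation).
static int g_hint = 8;               // value the stub "client" suggests
static int g_reported_actual = 0;    // last values the app reported
static int g_reported_possible = 0;

int boinc_nthreads_hint() { return g_hint; }
void boinc_nthreads(int actual, int possible) {
    g_reported_actual = actual;
    g_reported_possible = possible;
}

// Hypothetical application state.
struct App {
    int nthreads = 1;      // threads currently in use
    int max_threads = 32;  // most threads the app could use
};

// Call where the app can safely change its number of threads.
// Returns true if the thread count changed.
bool adjust_threads(App& app) {
    assert(app.max_threads >= 1);
    int target = boinc_nthreads_hint();
    if (target > app.max_threads) target = app.max_threads;
    if (target < 1) target = 1;
    if (target == app.nthreads) return false;
    app.nthreads = target;  // a real app would resize its pool here
    boinc_nthreads(app.nthreads, app.max_threads);  // report the change
    return true;
}
}}}
The report is sent only when the actual count changes; an app whose maximum possible count varies would also report when that changes.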

A workunit (WU) DB record can specify "max average threads",
an estimate of the average value of NT(A) on a host with arbitrarily many CPUs.
The client and scheduler use this to estimate completion time.
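
One way this value could enter a completion-time estimate is sketched below.
The formula is an assumption made for illustration (linear speedup up to the smaller of the host's CPU count and "max average threads");
the proposal does not specify it, and {{{estimate_elapsed_seconds()}}} is a hypothetical name.
{{{
#include <algorithm>
#include <cassert>

// single_thread_seconds: estimated run time with one thread.
// max_avg_threads: the WU's "max average threads" value.
// ncpus: number of CPUs on the host.
// Assumes linear speedup, which real apps rarely achieve.
double estimate_elapsed_seconds(double single_thread_seconds,
                                double max_avg_threads, int ncpus) {
    assert(single_thread_seconds >= 0 && max_avg_threads > 0 && ncpus > 0);
    double threads = std::min(max_avg_threads, static_cast<double>(ncpus));
    return single_thread_seconds / threads;
}
}}}
Under these assumptions, a job that takes 3600 seconds on one thread, with "max average threads" of 8,
is estimated at 900 seconds on a 4-CPU host and 450 seconds on a 16-CPU host.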

== Implementation ==

Shared-memory messages:
 * core->app (process control channel): {{{<target_nthreads>}}}
 * app->core (process control channel): {{{<actual_nthreads>}}}
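
A sketch of how such a message could be written and read.
The layout {{{<tag>N</tag>}}} is an assumption, since only the tag names are given above,
and {{{make_target_msg()}}} and {{{parse_int_tag()}}} are hypothetical helpers.
{{{
#include <cassert>
#include <cstdio>
#include <cstring>
#include <string>

// Builds the core->app message, assuming an integer between open and close tags.
std::string make_target_msg(int n) {
    return "<target_nthreads>" + std::to_string(n) + "</target_nthreads>\n";
}

// Looks for <tag> in buf and reads the integer that follows it.
// Returns false if the tag is missing or not followed by a number.
bool parse_int_tag(const char* buf, const char* tag, int& value) {
    std::string open = std::string("<") + tag + ">";
    const char* p = std::strstr(buf, open.c_str());
    if (!p) return false;
    return std::sscanf(p + open.size(), "%d", &value) == 1;
}
}}}
The app->core direction would use the same helper with the tag name {{{actual_nthreads}}}.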

The client maintains a per-job estimate of CPU efficiency
and uses it to scale {{{target_nthreads}}}.
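
A sketch of one possible scaling, under stated assumptions:
efficiency is taken to be CPU time divided by (elapsed time x threads),
and the target is the number of CPUs the client wants the job to occupy divided by that efficiency.
Neither definition comes from the proposal, and both function names are hypothetical.
{{{
#include <cassert>
#include <cmath>

// Fraction of its threads' time that a job spends on a CPU, clamped to [0, 1].
double cpu_efficiency(double cpu_seconds, double wall_seconds, int nthreads) {
    assert(wall_seconds > 0 && nthreads > 0);
    double e = cpu_seconds / (wall_seconds * nthreads);
    if (e > 1.0) e = 1.0;
    if (e < 0.0) e = 0.0;
    return e;
}

// Number of threads needed to keep cpus_available CPUs busy at the
// given efficiency. Always suggests at least one thread.
int scaled_target_nthreads(double cpus_available, double efficiency) {
    assert(cpus_available >= 0);
    if (efficiency <= 0.0) return 1;  // no usable estimate: assumed fallback
    int n = static_cast<int>(std::floor(cpus_available / efficiency));
    return n < 1 ? 1 : n;
}
}}}
With these definitions, a job that used 400 CPU seconds over 100 elapsed seconds with 8 threads has efficiency 0.5,
so filling 8 CPUs would call for a target of 16 threads.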

Scheduling ({{{enforce_schedule()}}}):
as each job is scheduled, decrement the count of available CPUs by that job's scaled {{{actual_nthreads}}}.
{{{rr_simulation()}}} must be modified accordingly.
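
The CPU accounting could look roughly like this.
It is a simplified sketch and not the actual {{{enforce_schedule()}}} code:
{{{Job}}} and {{{schedule()}}} are hypothetical, and "scaled" is assumed to mean {{{actual_nthreads}}} multiplied by the job's estimated CPU efficiency.
{{{
#include <cassert>
#include <cstddef>
#include <vector>

struct Job {
    int actual_nthreads;    // as reported by the app
    double cpu_efficiency;  // client's estimate, 0 < e <= 1
};

// Walks jobs in priority order and schedules them while CPUs remain.
// Returns the indices of the scheduled jobs. The last job chosen may
// need more CPUs than are left; a real scheduler would have to decide
// whether to allow that.
std::vector<std::size_t> schedule(const std::vector<Job>& ordered, int ncpus) {
    assert(ncpus > 0);
    double cpus_left = ncpus;
    std::vector<std::size_t> chosen;
    for (std::size_t i = 0; i < ordered.size() && cpus_left > 0; ++i) {
        chosen.push_back(i);
        cpus_left -= ordered[i].actual_nthreads * ordered[i].cpu_efficiency;
    }
    return chosen;
}
}}}
On an 80-CPU host, jobs with 16, 64 and 4 threads at efficiency 1.0 fill the host after the first two.
At efficiency 0.5 the first two account for only 40 CPUs, so the third is scheduled as well.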

== Notes ==
The average number of processors used, Ncpus(A), may be less than NT(A),
because threads spend some of their time waiting on I/O or synchronization.