= Multicore apps =

The average number of cores per PC will increase over the next few years,
possibly at a faster rate than the average amount of available RAM.

Depending on your application and project,
it may be desirable to develop a multi-threaded application.
Possible reasons to do this:

 * Your application's memory footprint is large enough that, on some PCs, there's not enough RAM to run a separate copy of the app on each CPU.
 * You want to reduce the turnaround time of your jobs (either because of human factors, or to reduce server occupancy).

You may be able to use OpenCL, MPI, OpenMP,
[http://blogs.nvidia.com/2011/06/cuda-now-available-for-multiple-x86-processors/ CUDA],
languages like Titanium or Cilk,
or libraries of multi-threaded numerical "kernels", to develop a multi-threaded app.

== Initialization ==

Depending on whether your application uses multiple threads,
multiple processes, or both,
you will need to call
[BasicApi#init the appropriate initialization function].

== Thread priorities ==

You should set the priority of new threads to that of the main thread.
{{{
#ifdef _WIN32
#include <windows.h>
#define getThreadPriority() GetThreadPriority(GetCurrentThread())
#define setThreadPriority(num) SetThreadPriority(GetCurrentThread(), num)
#else
#include <sys/resource.h>
#define getThreadPriority() getpriority(PRIO_PROCESS, 0)
// setpriority() sets an absolute nice value; nice() would add num as an increment
#define setThreadPriority(num) setpriority(PRIO_PROCESS, 0, num)
#endif
}}}
Call getThreadPriority() in the main thread before the worker threads are created
(in AQUA's case, OpenMP creates them),
then have each worker thread call setThreadPriority() with the value obtained.
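
The call order described above could be sketched as follows. This is a minimal illustration, not AQUA's actual code: the function names `get_thread_priority`, `set_thread_priority` and `run_workers_at_main_priority` are hypothetical, and the non-Windows branch assumes a system (such as Linux) where `setpriority(PRIO_PROCESS, 0, ...)` applies to the calling thread.

```cpp
#ifdef _WIN32
#include <windows.h>
static int get_thread_priority() {
    return GetThreadPriority(GetCurrentThread());
}
static bool set_thread_priority(int p) {
    return SetThreadPriority(GetCurrentThread(), p) != 0;
}
#else
#include <sys/resource.h>
static int get_thread_priority() {
    return getpriority(PRIO_PROCESS, 0);
}
static bool set_thread_priority(int p) {
    // Assumption: with who == 0 this affects the calling thread (true on Linux).
    return setpriority(PRIO_PROCESS, 0, p) == 0;
}
#endif

// Hypothetical driver: reads the main thread's priority, then applies it
// in every worker. Returns the number of workers where setting it failed.
int run_workers_at_main_priority() {
    // Read the priority in the main thread, before any workers exist.
    const int main_priority = get_thread_priority();
    int failures = 0;

    // If OpenMP is not enabled, the pragma is ignored and the block
    // simply runs once in the main thread.
    #pragma omp parallel reduction(+:failures)
    {
        if (!set_thread_priority(main_priority)) failures++;
        // ... per-thread work goes here ...
    }
    return failures;
}
```

Reading the priority once, outside the parallel region, ensures every worker receives the main thread's value rather than whatever it happened to inherit.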

== Deploying a multicore app version ==

BOINC uses the [AppPlan application planning] mechanism to
coordinate the scheduling of multicore applications.

Suppose you've developed a multicore program,
and that it achieves a linear speedup on up to 64 processors, and no additional speedup beyond that.
To deploy it:

 * Choose a "plan class" name for the program, say "mt" (see below).
 * Create an [AppVersionNew app version], specifying its plan class as "mt".
 * Edit the following in sched/sched_customize.cpp if needed:
{{{
// the following is for an app that can use anywhere from 1 to 64 threads
//
static inline bool app_plan_mt(
    SCHEDULER_REQUEST& sreq, HOST_USAGE& hu
) {
    double ncpus = g_wreq->effective_ncpus;
        // number of usable CPUs, taking user prefs into account
    int nthreads = (int)ncpus;
    if (nthreads > 64) nthreads = 64;
    hu.avg_ncpus = nthreads;
    hu.max_ncpus = nthreads;
    sprintf(hu.cmdline, "--nthreads %d", nthreads);
    hu.projected_flops = sreq.host.p_fpops*hu.avg_ncpus*.99;
        // the .99 ensures that on uniprocessors a sequential app
        // will be used in preference to this
    hu.peak_flops = sreq.host.p_fpops*hu.avg_ncpus;
    return true;
}

}}}
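
The plan function above passes the thread count to the application as `--nthreads N` on its command line, so the application has to read it. A minimal sketch of the application side might look like this; `parse_nthreads` and its `fallback` parameter are hypothetical names, not part of the BOINC API.

```cpp
#include <climits>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: returns the integer following "--nthreads",
// or `fallback` if the option is absent or its value is not a positive integer.
int parse_nthreads(int argc, char** argv, int fallback = 1) {
    for (int i = 1; i + 1 < argc; i++) {
        if (std::strcmp(argv[i], "--nthreads") != 0) continue;
        char* end = nullptr;
        long n = std::strtol(argv[i + 1], &end, 10);
        // Accept only a fully parsed, positive value that fits in an int.
        bool ok = end != argv[i + 1] && *end == '\0' && n >= 1 && n <= INT_MAX;
        return ok ? static_cast<int>(n) : fallback;
    }
    return fallback;
}
```

An OpenMP application could, for example, pass the result to `omp_set_num_threads()` before entering its first parallel region.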