= Multicore apps =

The average number of cores per PC will increase over the next few years,
possibly at a faster rate than the average amount of available RAM.

Depending on your application and project,
it may be desirable to develop a multithreaded application.
Possible reasons to do this:

 * Your application's memory footprint is large enough that, on some PCs, there's not enough RAM to run a separate copy of the app on each CPU.
 * You want to reduce the turnaround time of your jobs (either because of human factors, or to reduce server occupancy).

You may be able to use OpenCL, MPI, OpenMP,
[http://blogs.nvidia.com/2011/06/cuda-now-available-for-multiple-x86-processors/ CUDA],
languages like Titanium or Cilk,
or libraries of multi-threaded numerical "kernels", to develop a multi-threaded app.

== Initialization ==

Depending on whether your application uses multiple threads,
multiple processes, or both,
you will need to call
[BasicApi#init the appropriate initialization function].

== Thread priorities ==

You should set the priority of new threads to that of the main thread.
{{{
#ifdef MSVC
#include <windows.h>
#define getThreadPriority() GetThreadPriority(GetCurrentThread())
#define setThreadPriority(num) SetThreadPriority(GetCurrentThread(), num)
#else
#include <sys/resource.h>
// nice() takes an increment, so use setpriority() to set an absolute value
#define getThreadPriority() getpriority(PRIO_PROCESS, 0)
#define setThreadPriority(num) setpriority(PRIO_PROCESS, 0, num)
#endif
}}}
Call getThreadPriority() in the main thread before the worker threads are created
(in AQUA's case, they are created by OpenMP) and save the result.
Each worker thread then calls setThreadPriority() with the saved value.
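
The save-then-apply pattern described above might look like the following with OpenMP. This is a minimal POSIX-only sketch, not AQUA's actual code: work_item() and run_with_inherited_priority() are hypothetical names, and it assumes a platform (such as Linux) where setpriority(PRIO_PROCESS, 0, ...) affects the calling thread.

```cpp
#include <sys/resource.h>   // getpriority, setpriority (POSIX)

// Hypothetical stand-in for the app's real per-item computation.
static double work_item(int i) { return i * 0.5; }

// Processes n work items in an OpenMP parallel region. Every worker
// thread first adopts the priority the main thread had beforehand.
double run_with_inherited_priority(int n) {
    // Save the main thread's priority before OpenMP creates workers.
    const int main_prio = getpriority(PRIO_PROCESS, 0);
    double sum = 0.0;
    #pragma omp parallel reduction(+:sum)
    {
        // Each worker applies the saved priority to itself.
        // Assumption: on this platform the call affects only the caller.
        setpriority(PRIO_PROCESS, 0, main_prio);
        #pragma omp for
        for (int i = 0; i < n; i++) {
            sum += work_item(i);
        }
    }
    return sum;
}
```

If the code is compiled without OpenMP support, the pragmas are ignored and the loop runs in the main thread, so the function still returns the same result.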

== Deploying a multicore app version ==

BOINC uses the [AppPlan application planning] mechanism to
coordinate the scheduling of multicore applications.

Suppose you've developed a multicore program
that achieves a near-linear speedup on up to 64 processors, and no additional speedup beyond that.
To deploy it:

 * Choose a "plan class" name for the program, say "par64" (see below).
 * Create an [UpdateVersions app version], specifying its plan class as "par64".
 * Link the following function into your scheduler (customized as needed):
{{{
bool app_plan(SCHEDULER_REQUEST& sreq, const char* plan_class, HOST_USAGE& hu) {
    if (!strcmp(plan_class, "par64")) {
        // the following is for an app that can use anywhere
        // from 1 to 64 threads, can control this exactly,
        // and whose speedup is .95N
        // (on a uniprocessor, we'll use a sequential app if one is available)
        //
        int ncpus, nthreads;
        bool bounded;

        get_ncpus(sreq, ncpus, bounded);
        nthreads = ncpus;
        if (nthreads > 64) nthreads = 64;
        hu.avg_ncpus = nthreads;
        hu.max_ncpus = nthreads;
        sprintf(hu.cmdline, "--nthreads %d", nthreads);
        hu.flops = 0.95*sreq.host.p_fpops*nthreads;
        if (config.debug_version_select) {
            log_messages.printf(MSG_NORMAL,
                "[version] Multi-thread app estimate %.2f GFLOPS\n",
                hu.flops/1e9
            );
        }
        return true;
    }
...
}
}}}
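
To make the arithmetic in the function above concrete, the thread count and FLOPS estimate can be computed in isolation. This is only an illustrative sketch: the Estimate struct and estimate_par64() are hypothetical names, not part of the BOINC scheduler, and the 0.95 factor and 64-thread cap are taken from the example above.

```cpp
#include <algorithm>

// Hypothetical container for the two values the plan function derives.
struct Estimate {
    int nthreads;   // threads the app will be told to use
    double flops;   // projected speed in FLOPS
};

// Mirrors the example's arithmetic: cap the thread count at 64 and
// assume each thread delivers 95% of the host's per-core speed.
Estimate estimate_par64(int ncpus, double p_fpops) {
    const int nthreads = std::min(ncpus, 64);
    return Estimate{nthreads, 0.95 * p_fpops * nthreads};
}
```

For example, a host with 8 CPUs at 3 GFLOPS each gets 8 threads and an estimate of about 22.8 GFLOPS, while a host with 128 CPUs at 1 GFLOPS each is capped at 64 threads and about 60.8 GFLOPS.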

The BOINC client schedules the application based
on the average CPU usage (hu.avg_ncpus) returned by this function.
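
The example plan function writes a "--nthreads N" option into hu.cmdline, so the application needs to read that option at startup. One way to do this is sketched below, assuming the option arrives as ordinary command-line arguments; parse_nthreads() is a hypothetical helper, not a BOINC API function.

```cpp
#include <cstdlib>
#include <cstring>

// Returns the value following "--nthreads" in argv, or fallback if the
// option is absent, has no value, or the value is not a positive integer.
int parse_nthreads(int argc, const char* const* argv, int fallback) {
    for (int i = 1; i + 1 < argc; i++) {
        if (std::strcmp(argv[i], "--nthreads") == 0) {
            char* end = nullptr;
            const long n = std::strtol(argv[i + 1], &end, 10);
            // Accept only a fully parsed number that is at least 1.
            if (end != argv[i + 1] && *end == '\0' && n >= 1) {
                return static_cast<int>(n);
            }
            return fallback;
        }
    }
    return fallback;
}
```

An application could call parse_nthreads(argc, argv, 1) from main() and pass the result to its threading library, for example to omp_set_num_threads() when using OpenMP.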