
Multicore apps

The average number of cores per PC will increase over the next few years, possibly at a faster rate than the average amount of available RAM.

Depending on your application and project, it may be desirable to develop a multi-threaded application. Possible reasons to do this:

  • Your application's memory footprint is large enough that, on some PCs, there's not enough RAM to run a separate copy of the app on each CPU.
  • You want to reduce the turnaround time of your jobs (either because of human factors, or to reduce server occupancy).

You may be able to use OpenCL, MPI, OpenMP, CUDA, languages like Titanium or Cilk, or libraries of multi-threaded numerical "kernels" to develop a multi-threaded app.
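
For example, OpenMP lets you parallelize an existing compute loop with a single directive. A minimal sketch (the loop and data are hypothetical placeholders, not project code):

#include <omp.h>

// Scale an array in parallel: OpenMP splits the loop iterations
// across the available worker threads.
void scale(double* a, int n, double k) {
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        a[i] *= k;
    }
}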

Initialization

Depending on whether your application uses multiple threads, multiple processes, or both, you will need to call the appropriate initialization function.
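
For a purely multi-threaded app, this typically means calling boinc_init_parallel() where a single-threaded app would call boinc_init(). A minimal sketch, assuming the standard BOINC C API (the computation itself is omitted):

#include <cstdlib>
#include "boinc_api.h"

int main(int argc, char** argv) {
    // Initialize the BOINC runtime for a multi-threaded application.
    int retval = boinc_init_parallel();
    if (retval) exit(retval);

    // ... create worker threads and run the computation ...

    boinc_finish(0);    // reports completion to the client; does not return
    return 0;
}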

Thread priorities

You should set the priority of newly created worker threads to that of the main thread. The following macros can be used to read and set a thread's priority:

#ifdef _WIN32
#include <windows.h>
// Windows: get/set the priority of the calling thread.
#define getThreadPriority() GetThreadPriority(GetCurrentThread())
#define setThreadPriority(num) SetThreadPriority(GetCurrentThread(), num)
#else
#include <sys/resource.h>
#include <unistd.h>
// Unix: get the nice value of the calling thread; nice() adds to the
// current value (the result is clamped to the allowed range).
#define getThreadPriority() getpriority(PRIO_PROCESS, 0)
#define setThreadPriority(num) nice(num)
#endif

getThreadPriority() is called in the main thread before the worker threads are created (by OpenMP, in AQUA's case), and setThreadPriority() is then called by each worker thread with that value.
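
A minimal sketch of this pattern, assuming the macros above and OpenMP-created worker threads (the work inside the parallel region is a placeholder):

// Sketch: record the main thread's priority, then have each OpenMP
// worker thread adopt it before doing its share of the work.
void run_workers() {
    int main_priority = getThreadPriority();  // read in the main thread

    #pragma omp parallel
    {
        setThreadPriority(main_priority);     // each worker sets its own priority
        // ... per-thread computation ...
    }
}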

Deploying a multicore app version

BOINC uses the application planning mechanism to coordinate the scheduling of multicore applications.

Suppose you've developed a multicore program that achieves linear speedup on up to 64 processors, and no additional speedup beyond that. To deploy it:

  • Choose a "plan class" name for the program, say "mt" (see below).
  • Create an app version, specifying its plan class as "mt".
  • Edit the following function in sched/sched_customize.cpp as needed:
    // the following is for an app that can use anywhere from 1 to 64 threads
    //
    static inline bool app_plan_mt(
        SCHEDULER_REQUEST& sreq, HOST_USAGE& hu
    ) {
        double ncpus = g_wreq->effective_ncpus;
            // number of usable CPUs, taking user prefs into account
        int nthreads = (int)ncpus;
        if (nthreads > 64) nthreads = 64;
        hu.avg_ncpus = nthreads;
        hu.max_ncpus = nthreads;
        sprintf(hu.cmdline, "--nthreads %d", nthreads);
        hu.projected_flops = sreq.host.p_fpops*hu.avg_ncpus*.99;
            // the .99 ensures that on uniprocessors a sequential app
            // will be used in preference to this
        hu.peak_flops = sreq.host.p_fpops*hu.avg_ncpus;
        return true;
    }
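
On the application side, the program must honor the --nthreads argument that app_plan_mt() puts on the command line. A hypothetical sketch of that parsing (the helper function and the use of OpenMP are assumptions for illustration, not part of BOINC):

// Hypothetical sketch: parse the "--nthreads N" argument passed by the
// scheduler and limit the app to that many threads (here via OpenMP).
#include <cstdlib>
#include <cstring>
#include <omp.h>

static int parse_nthreads(int argc, char** argv, int dflt) {
    for (int i = 1; i < argc - 1; i++) {
        if (!strcmp(argv[i], "--nthreads")) return atoi(argv[i+1]);
    }
    return dflt;    // flag absent: fall back to a single thread
}

int main(int argc, char** argv) {
    int nthreads = parse_nthreads(argc, argv, 1);
    omp_set_num_threads(nthreads);   // use only the CPUs the scheduler allotted
    // ... boinc_init_parallel(), computation, boinc_finish() ...
    return 0;
}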