wiki:AppCoprocessor

Version 18 (modified by davea, 15 years ago)

Applications that use coprocessors

This document describes BOINC's support for applications that use coprocessors such as

  • GPUs
  • Cell SPEs

We'll assume that these resources are allocated rather than scheduled: i.e., an application using a coprocessor has it locked while the app is in memory, even if the app is suspended by BOINC or descheduled by the OS.

The BOINC client probes for coprocessors and reports them in scheduler requests. The client keeps track of coprocessor allocation, i.e. how many instances of each are free. It only runs an app if enough instances are available.
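
The allocation bookkeeping described above can be sketched roughly as follows. This is an illustrative model only, not the actual BOINC client data structures; the names (CoprocPool, can_run, etc.) are invented for the example:

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical sketch: each coprocessor type has a fixed number of detected
// instances, and a job may be started only if enough instances are free.
// An app holds its instances for as long as it is in memory, even if it is
// suspended, matching the "allocated rather than scheduled" model.
struct CoprocPool {
    std::map<std::string, int> total;   // instances detected at startup
    std::map<std::string, int> in_use;  // instances held by apps in memory

    bool can_run(const std::string& type, int n) const {
        auto t = total.find(type);
        if (t == total.end()) return false;
        int used = 0;
        auto u = in_use.find(type);
        if (u != in_use.end()) used = u->second;
        return t->second - used >= n;
    }
    void allocate(const std::string& type, int n) { in_use[type] += n; }
    void release(const std::string& type, int n)  { in_use[type] -= n; }
};
```

With two CUDA instances detected, a job needing one GPU can start; once both instances are allocated, further GPU jobs must wait until an instance is released.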

Deploying a coprocessor app

BOINC uses the application planning mechanism to coordinate the scheduling of coprocessor applications.

Suppose you've developed a coprocessor program that uses a CUDA GPU together with 1 GFLOPS of CPU, and delivers a total of 100 GFLOPS. To deploy it:

  • Choose a "plan class" name for the program, say "cuda" (see below).
  • Create an app version, specifying its plan class as "cuda".
  • Link the following function into your scheduler (customize as needed):
    int app_plan(SCHEDULER_REQUEST& sreq, char* plan_class, HOST_USAGE& hu) {
        if (!strcmp(plan_class, "cuda")) {
            // the following is for an app that uses a CUDA GPU
            //
            COPROC_CUDA* cp = (COPROC_CUDA*)sreq.coprocs.lookup("CUDA");
            if (!cp) {
                if (config.debug_version_select) {
                    log_messages.printf(MSG_NORMAL,
                        "[version] Host lacks CUDA coprocessor for plan class cuda\n"
                    );
                }
                return PLAN_REJECT_CUDA_NO_DEVICE;
            }
            int v = (cp->prop.major)*100 + cp->prop.minor;
            if (v < 100) {
                if (config.debug_version_select) {
                    log_messages.printf(MSG_NORMAL,
                        "[version] CUDA version %d < 1.0\n", v
                    );
                }
                return PLAN_REJECT_CUDA_VERSION;
            } 
    
            if (cp->drvVersion && cp->drvVersion < PLAN_CUDA_MIN_DRIVER_VERSION) {
                if (config.debug_version_select) {
                    log_messages.printf(MSG_NORMAL,
                        "[version] NVIDIA driver version %d < PLAN_CUDA_MIN_DRIVER_VERSION\n",
                        cp->drvVersion
                    );
                }
                return PLAN_REJECT_NVIDIA_DRIVER_VERSION;
            }
    
            if (cp->prop.dtotalGlobalMem < PLAN_CUDA_MIN_RAM) {
                if (config.debug_version_select) {
                    log_messages.printf(MSG_NORMAL,
                        "[version] CUDA mem %.0f < %.0f\n",
                        cp->prop.dtotalGlobalMem, (double)PLAN_CUDA_MIN_RAM
                    );
                }
                return PLAN_REJECT_CUDA_MEM;
            }
            hu.flops = cp->flops_estimate();
    
            // assume we'll need 0.5% as many CPU FLOPS as GPU FLOPS
            // to keep the GPU fed.
            //
            double x = (hu.flops*0.005)/sreq.host.p_fpops;
            hu.avg_ncpus = x;
            hu.max_ncpus = x;
    
            hu.ncudas = 1;
    
            if (config.debug_version_select) {
                log_messages.printf(MSG_NORMAL,
                    "[version] CUDA app estimated %.2f GFLOPS (clock %d count %d)\n",
                    hu.flops/1e9, cp->prop.clockRate,
                    cp->prop.multiProcessorCount
                );
            }
            return 0;
        }
        log_messages.printf(MSG_CRITICAL,
            "Unknown plan class: %s\n", plan_class
        );
        return PLAN_REJECT_UNKNOWN;
    }
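
The CPU-usage estimate in app_plan() can be checked with a quick worked example. The 0.005 factor comes from the code above; the GFLOPS figures below are assumptions chosen for illustration:

```cpp
#include <cassert>
#include <cmath>

// Illustrative arithmetic only: app_plan() assumes the CPU must supply
// 0.5% as many FLOPS as the GPU to keep it fed. For a GPU app estimated
// at 100 GFLOPS on a host whose CPU benchmarks at 2 GFLOPS, the app is
// charged (100e9 * 0.005) / 2e9 = 0.25 of a CPU.
double estimated_avg_ncpus(double gpu_flops, double host_p_fpops) {
    return (gpu_flops * 0.005) / host_p_fpops;
}
```

This is why a fast GPU on a slow host can still be assigned a noticeable CPU fraction: the charge scales with the ratio of GPU to CPU speed.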
    

Questions

  • How does BOINC know if non-BOINC applications are using resources?