Version 35 (modified by davea, 13 years ago) (diff)

Applications that use coprocessors

BOINC supports applications that use coprocessors. The supported coprocessor types (as of [18892]) are NVIDIA and ATI GPUs.

The BOINC client probes for coprocessors and reports them in scheduler requests. The client keeps track of coprocessor allocation, i.e. how many instances of each are free. It only runs an app if enough instances are available.

You can develop your application using any programming system, e.g. CUDA (for NVIDIA), Brook+ (for ATI) or OpenCL.

Dealing with GPU memory allocation failures

GPUs don't have virtual memory. GPU memory allocations may fail because other applications are using the GPU. This is typically a temporary condition. Rather than exiting with an error in this case, call

boinc_temporary_exit(60);

This exits the application and tells the BOINC client to restart it after at least 60 seconds.
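The pattern can be sketched as follows. This is an illustration, not BOINC code: alloc_or_defer is a hypothetical helper, try_gpu_alloc stands in for whatever allocation call your GPU toolkit provides (for example cudaMalloc), and temporary_exit stands in for boinc_temporary_exit(), which in a real application ends the process instead of returning.

```cpp
#include <cstddef>
#include <functional>

// Outcome of one attempt to obtain GPU memory.
enum class AllocOutcome { Ok, Deferred };

// Hypothetical helper: try the allocation once and, on failure, ask to be
// restarted later rather than reporting an error.
//   try_gpu_alloc  - stand-in for the toolkit allocation call; returns true
//                    on success.
//   temporary_exit - stand-in for boinc_temporary_exit(delay_seconds).
inline AllocOutcome alloc_or_defer(
    std::size_t bytes,
    const std::function<bool(std::size_t)>& try_gpu_alloc,
    const std::function<void(int)>& temporary_exit,
    int delay_seconds = 60
) {
    if (try_gpu_alloc(bytes)) {
        return AllocOutcome::Ok;
    }
    // Allocation failed, most likely because another program holds GPU memory.
    temporary_exit(delay_seconds);
    return AllocOutcome::Deferred;
}
```

Passing the two calls in as parameters keeps the sketch independent of any particular GPU toolkit; in an application you would call the toolkit and boinc_temporary_exit() directly.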

Device selection

Some hosts have multiple GPUs. When your application is run by BOINC, it will be passed the command-line arguments

--gpu_type X --device N

where X is the GPU type (e.g., 'nvidia' or 'ati') and N is the device number of the GPU that is to be used. If your application uses multiple GPUs, it will be passed multiple --device arguments, e.g.

--gpu_type X --device 0 --device 3
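One way to read these arguments is sketched below. GpuSelection and parse_gpu_args are hypothetical names chosen for this example and are not part of the BOINC API; the sketch assumes each flag is followed by its value as a separate argument, as in the examples above.

```cpp
#include <cstdlib>
#include <cstring>
#include <string>
#include <vector>

// GPU selection passed by the BOINC client on the command line.
struct GpuSelection {
    std::string gpu_type;      // e.g. "nvidia" or "ati"; empty if not given
    std::vector<int> devices;  // one entry per --device argument, in order
};

// Hypothetical helper: collect --gpu_type X and every --device N from argv.
// Unrecognized arguments are ignored so the application can define its own.
inline GpuSelection parse_gpu_args(int argc, const char* const* argv) {
    GpuSelection sel;
    // i + 1 < argc guarantees that a value follows the flag at argv[i].
    for (int i = 1; i + 1 < argc; i++) {
        if (!std::strcmp(argv[i], "--gpu_type")) {
            sel.gpu_type = argv[++i];
        } else if (!std::strcmp(argv[i], "--device")) {
            sel.devices.push_back(std::atoi(argv[++i]));
        }
    }
    return sel;
}
```

Called with the arguments --gpu_type nvidia --device 0 --device 3, the helper yields the type "nvidia" and the device list 0, 3.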

Cleanup on premature exit

The BOINC client may kill your application at any point during execution. If this happens while the GPU is in use, the GPU may be left in a bad state. To prevent this, call

boinc_begin_critical_section();

before using the GPU, and between GPU kernels do

if (boinc_status.quit_request || boinc_status.abort_request) {
    // cudaThreadSynchronize(); or whatever is needed
    boinc_end_critical_section();
    while (1) boinc_sleep(1);
}

Plan classes

Each coprocessor application has an associated plan class which determines the hardware and software resources that are needed to run the application.

The following plan classes for NVIDIA are pre-defined:

cuda: NVIDIA GPU, compute capability 1.0+, driver version 177.00+, 254+ MB RAM.
cuda23: Requires driver version 190.38+, 384+ MB RAM.
cuda_fermi: Requires compute capability 2.0+ and CUDA version 3.0+
opencl_nvidia_101: Requires OpenCL 1.1+ support

For ATI the situation is more complex because AMD changed the DLL names from amd* to ati* midstream; applications are linked against a particular name and will fail if it's not present.

ati: CAL version 1.0.0+, amd* DLLs
ati13amd: CAL version 1.3+, amd* DLLs
ati13ati: CAL version 1.3+, ati* DLLs
ati14: CAL version 1.4+, ati* DLLs
opencl_ati_101: OpenCL 1.1+

You can verify which DLLs your application is linked against by using Dependency Walker against your application. If your executable contains DLL names prefixed with 'amd' then your plan class will be ati or ati13amd depending on which version of the CAL SDK you are using. If the DLL names are prefixed with 'ati' then use the ati13ati or ati14 plan classes.
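The choice described above can be written down as a small lookup. This is only a sketch: ati_plan_class is a hypothetical helper, not part of BOINC, and it assumes you already know the DLL prefix (from Dependency Walker) and the CAL version your executable was built against.

```cpp
#include <string>

// Hypothetical helper: map the DLL name prefix ("amd" or "ati") and the CAL
// version the executable was built against to one of the plan classes listed
// above. Returns an empty string when no listed class matches.
inline std::string ati_plan_class(const std::string& dll_prefix,
                                  int cal_major, int cal_minor) {
    const int v = cal_major * 100 + cal_minor;  // 1.3 -> 103, 1.4 -> 104
    if (dll_prefix == "amd") {
        return v >= 103 ? "ati13amd" : (v >= 100 ? "ati" : "");
    }
    if (dll_prefix == "ati") {
        if (v >= 104) return "ati14";
        if (v >= 103) return "ati13ati";
    }
    return "";
}
```

For example, an executable linked against amd* DLLs and built with CAL 1.3 maps to ati13amd, while one linked against ati* DLLs and built with CAL 1.4 maps to ati14.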

In all cases (NVIDIA and ATI), the application is assumed to use 1 GPU, and the CPU usage is assumed to be 0.5% of the FLOPS of the GPU. If there is a choice, the scheduler will give preference to later classes, i.e. it will pick cuda23 over cuda.

Once you have chosen a plan class for your executable, create an app version, specifying its plan class.

Defining a custom plan class

If your application has properties that differ from any of the pre-defined classes, you can define your own. To do this, you must modify the application planning function that you link into your scheduler.

To see how to do this, let's look at the default function. First, we check if the host has an NVIDIA GPU.

bool app_plan(SCHEDULER_REQUEST& sreq, char* plan_class, HOST_USAGE& hu) {
    ...
    if (!strcmp(plan_class, "cuda")) {
        COPROC_CUDA* cp = (COPROC_CUDA*)sreq.coprocs.lookup("CUDA");
        if (!cp) {
            if (config.debug_version_select) {
                log_messages.printf(MSG_NORMAL,
                    "[version] Host lacks CUDA coprocessor for plan class cuda\n"
                );
            }
            add_no_work_message("Your computer has no NVIDIA GPU");
            return false;
        }

Check the compute capability (1.0 or better):

        // encode the compute capability as major*100 + minor, e.g. 1.3 -> 103
        int v = (cp->prop.major)*100 + cp->prop.minor;
        if (v < 100) {
            if (config.debug_version_select) {
                log_messages.printf(MSG_NORMAL,
                    "[version] Compute capability %d < 100\n", v
                );
            }
            add_no_work_message(
                "Your NVIDIA GPU lacks the needed compute capability"
            );
            return false;
        }

Check the CUDA runtime version. As of client version 6.10, all clients report the CUDA runtime version (cp->cuda_version); use that if it's present. In 6.8 and earlier, the CUDA runtime version isn't reported. Windows clients report the driver version, from which the CUDA version can be inferred; Linux clients don't return the driver version, so we don't know what the CUDA version is.

        // for CUDA 2.3, we need to check the CUDA RT version.
        // Old BOINC clients report display driver version;
        // newer ones report CUDA RT version
        //
        if (!strcmp(plan_class, "cuda23")) {
            if (cp->cuda_version) {
                if (cp->cuda_version < 2030) {
                    add_no_work_message("CUDA version 2.3 needed");
                    return false;
                }
            } else if (cp->display_driver_version) {
                if (cp->display_driver_version < PLAN_CUDA23_MIN_DRIVER_VERSION) {
                    sprintf(buf, "NVIDIA display driver %d or later needed",
                        PLAN_CUDA23_MIN_DRIVER_VERSION
                    );
                    add_no_work_message(buf);
                    return false;
                }
            } else {
                add_no_work_message("CUDA version 2.3 needed");
                return false;
            }
        }
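The constant 2030 in the check above follows the CUDA convention of encoding a runtime version as major*1000 + minor*10, so 2.3 becomes 2030. The sketch below spells that out; both function names are hypothetical and are not part of the BOINC scheduler.

```cpp
// Encode a CUDA runtime version as major*1000 + minor*10, so 2.3 -> 2030.
constexpr int encode_cuda_version(int major, int minor) {
    return major * 1000 + minor * 10;
}

// Hypothetical helper mirroring the runtime-version branch of the cuda23
// check. A reported value of 0 means the client did not report a runtime
// version; this sketch treats that as "not satisfied" and leaves any
// fallback (such as the display driver version) to the caller.
constexpr bool cuda_version_at_least(int reported, int major, int minor) {
    return reported != 0 && reported >= encode_cuda_version(major, minor);
}
```

With this encoding, a client reporting 2020 (CUDA 2.2) fails a 2.3 requirement, while one reporting 3000 (CUDA 3.0) passes.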

Check for the amount of video RAM:

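        if (cp->prop.dtotalGlobalMem < PLAN_CUDA_MIN_RAM) {
            if (config.debug_version_select) {
                // cast so the arguments match the %f conversions
                log_messages.printf(MSG_NORMAL,
                    "[version] CUDA mem %.0f < %.0f\n",
                    (double)cp->prop.dtotalGlobalMem, (double)PLAN_CUDA_MIN_RAM
                );
            }
            // report the requirement in MB, assuming PLAN_CUDA_MIN_RAM is
            // in bytes, as the comparison above implies
            sprintf(buf,
                "Your NVIDIA GPU has insufficient memory (need %.0fMB)",
                (double)PLAN_CUDA_MIN_RAM/(1024.*1024.)
            );
            add_no_work_message(buf);
            return false;
        }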

Estimate the FLOPS:

        hu.flops = cp->flops_estimate();

Estimate its CPU usage:

        // assume we'll need 0.5% as many CPU FLOPS as GPU FLOPS
        // to keep the GPU fed.
        //
        double x = (hu.flops*0.005)/sreq.host.p_fpops;
        hu.avg_ncpus = x;
        hu.max_ncpus = x;
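To make the arithmetic concrete, here is the same formula as a standalone function. The function name and the sample figures are assumptions made for illustration; they do not come from the scheduler source.

```cpp
// Fraction of a CPU assumed necessary to keep the GPU fed, following the
// rule in the snippet above: 0.5% of the GPU's FLOPS, divided by the host's
// measured CPU FLOPS (p_fpops).
inline double cpu_fraction_for_gpu(double gpu_flops, double cpu_flops) {
    return (gpu_flops * 0.005) / cpu_flops;
}
```

For a GPU estimated at 100 GFLOPS on a host whose p_fpops is 2.5 GFLOPS, this gives (100e9 * 0.005) / 2.5e9, or about 0.2 CPUs.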

Indicate the number of GPUs used. Typically this will be 1. If your application uses only a fraction X<1 of the GPU's processors and a fraction Y<1 of its video RAM, report the number of GPUs as min(X, Y). In this case BOINC will attempt to run multiple jobs per GPU if possible.

        hu.ncudas = 1;

Return true to indicate that the application can be run on the host:

        return true;
