= Applications that use coprocessors =
BOINC supports applications that use coprocessors.
The supported coprocessor types (as of [24404]) are NVIDIA and ATI GPUs.

The BOINC client probes for coprocessors and reports them in scheduler requests.
It keeps track of how many instances of each coprocessor are free,
and runs an app only if enough instances are available.

You can develop your application using any programming system,
e.g. CUDA (for NVIDIA), CAL (for ATI) or OpenCL.

== Dealing with GPU memory allocation failures ==

GPUs don't have virtual memory.
GPU memory allocations may fail because other applications are using the GPU.
If an allocation fails, instead of exiting with an error, call

{{{
boinc_temporary_exit(60);
}}}
This will exit the application and tell the BOINC client to restart
it in at least 60 seconds,
at which point memory may be available.

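For example, a CUDA application might wrap its GPU allocations as follows.
This is a minimal sketch: the helper name '''alloc_gpu_buffer''' is
illustrative, not part of the BOINC API.

{{{
#include <cuda_runtime.h>
#include "boinc_api.h"

// Hypothetical helper: try to allocate GPU memory; if the GPU is
// busy, exit and let the client retry us in at least 60 seconds.
void* alloc_gpu_buffer(size_t nbytes) {
    void* p;
    if (cudaMalloc(&p, nbytes) != cudaSuccess) {
        boinc_temporary_exit(60);   // exits; does not return
    }
    return p;
}
}}}
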
== Device selection ==

Some hosts have multiple GPUs.
When your application is run by BOINC, it receives information
about which GPU instance to use.
This is passed as a command-line argument

{{{
--device N
}}}
where N is the device number of the GPU that is to be used.
If your application uses multiple GPUs, it will be passed multiple --device arguments, e.g.

{{{
--device 0 --device 3
}}}

Some OpenCL apps can use either NVIDIA or ATI GPUs,
so they must also be told which type of GPU to use.
This is passed in the APP_INIT_DATA structure returned by '''boinc_get_init_data()''':
{{{
char gpu_type[64];     // "nvidia" or "ati"
int gpu_device_num;
}}}

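A sketch of how an application might pick up this information at startup
(the argument-parsing loop is illustrative; a multi-GPU app would collect
every --device argument rather than just the last one):

{{{
#include <cstdlib>
#include <cstring>
#include "boinc_api.h"

int main(int argc, char** argv) {
    boinc_init();

    // device number from the command line
    int device_num = 0;
    for (int i = 1; i < argc - 1; i++) {
        if (!strcmp(argv[i], "--device")) {
            device_num = atoi(argv[i + 1]);
        }
    }

    // GPU type (needed by OpenCL apps) from APP_INIT_DATA
    APP_INIT_DATA aid;
    boinc_get_init_data(aid);
    // aid.gpu_type is "nvidia" or "ati";
    // aid.gpu_device_num also identifies the device

    // ... bind to the selected GPU and do the computation ...

    boinc_finish(0);
}
}}}
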
== Cleanup on premature exit ==
The BOINC client may kill your application during execution.
This may leave the GPU in a bad state. To prevent this, call

{{{
boinc_begin_critical_section();
}}}
before using the GPU, and between GPU kernels do

{{{
if (boinc_status.quit_request || boinc_status.abort_request) {
    // cudaThreadSynchronize(); or whatever is needed
    boinc_end_critical_section();
    exit(0);
}
}}}

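Putting this together, the GPU portion of an application might be
structured as follows. This is a sketch: '''computation_done()''' and
'''run_next_kernel()''' are hypothetical placeholders for your own code.

{{{
boinc_begin_critical_section();
while (!computation_done()) {
    run_next_kernel();          // hypothetical kernel launch
    cudaThreadSynchronize();    // wait for the GPU to finish
    if (boinc_status.quit_request || boinc_status.abort_request) {
        boinc_end_critical_section();
        exit(0);
    }
}
boinc_end_critical_section();
}}}
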
== Plan classes ==
Each coprocessor application has an associated [wiki:AppPlan plan class]
which determines the hardware and software resources that are needed to run the application.

A number of plan classes for NVIDIA and ATI GPUs are pre-defined
(see sched/sched_customize.cpp for the complete list).

== Defining a custom plan class ==

If your application has properties that differ from any of the pre-defined classes,
you can modify them, or better yet define your own.

To define a new NVIDIA/CUDA plan class, add a new clause
to '''app_plan_cuda()''' in sched/sched_customize.cpp.
For example, the plan class '''cuda23''' is defined by:

{{{
    ...
    if (!strcmp(plan_class, "cuda23")) {
        if (!cuda_check(c, hu,
            100,        // minimum compute capability (1.0)
            200,        // max compute capability (2.0)
            2030,       // min CUDA version (2.3)
            19500,      // min display driver version (195.00)
            384*MEGA,   // min video RAM
            1.,         // # of GPUs used (may be fractional, or an integer > 1)
            .01,        // fraction of FLOPS done by the CPU
            .21         // estimated GPU efficiency (actual/peak FLOPS)
        )) {
            return false;
        }
    }
}}}

To define a new ATI/CAL plan class, add a new clause
to '''app_plan_ati()'''.
For example:
{{{
    if (!strcmp(plan_class, "ati14")) {
        if (!ati_check(c, hu,
            1004000,    // min display driver version (10.4)
            false,      // require libraries named "ati", not "amd"
            384*MEGA,   // min video RAM
            1.,         // # of GPUs used (may be fractional, or an integer > 1)
            .01,        // fraction of FLOPS done by the CPU
            .21         // estimated GPU efficiency (actual/peak FLOPS)
        )) {
            return false;
        }
    }
}}}

To define a new OpenCL plan class, add a new clause to
'''app_plan_opencl()'''.
For example:

{{{
    if (!strcmp(plan_class, "opencl_nvidia_101")) {
        return opencl_check(
            c, hu,
            101,        // OpenCL version (1.1)
            256*MEGA,   // min video RAM
            1,          // # of GPUs used
            .1,         // fraction of FLOPS done by the CPU
            .21         // estimated GPU efficiency (actual/peak FLOPS)
        );
    }
}}}
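
As an illustration, the clause below defines a hypothetical
'''cuda_fermi_hd''' plan class (the name and thresholds are invented
for this example, and it assumes that passing 0 for the maximum compute
capability means no upper limit) requiring a Fermi-class GPU and 1 GB
of video RAM:

{{{
    if (!strcmp(plan_class, "cuda_fermi_hd")) {
        if (!cuda_check(c, hu,
            200,        // min compute capability (2.0, i.e. Fermi)
            0,          // no maximum compute capability
            3000,       // min CUDA version (3.0)
            19500,      // min display driver version (195.00)
            1024*MEGA,  // min video RAM
            1.,         // # of GPUs used
            .01,        // fraction of FLOPS done by the CPU
            .21         // estimated GPU efficiency (actual/peak FLOPS)
        )) {
            return false;
        }
    }
}}}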