Version 3 (modified by davea, 17 years ago)

--

Dealing with numerical discrepancies

Most numerical applications produce different outcomes for a given workunit depending on the machine architecture, operating system, compiler, and compiler flags. For some applications these discrepancies produce only small differences in the final output, and results can be validated using a 'fuzzy comparison' function that allows for deviations of a few percent.
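A fuzzy comparison of this kind could be sketched as follows. This is an illustration only, not BOINC's validator API: the function name fuzzy_equal, the 2% tolerance, and the sample values are all assumptions made for the example.

```python
import math

def fuzzy_equal(a, b, rel_tol=0.02):
    """Return True if two result vectors agree element-wise within rel_tol.

    rel_tol=0.02 (2%) is an assumed tolerance; a real project would pick
    a value suited to its own application.
    """
    if len(a) != len(b):
        return False
    # abs_tol guards the comparison of values that are very close to zero
    return all(
        math.isclose(x, y, rel_tol=rel_tol, abs_tol=1e-12)
        for x, y in zip(a, b)
    )

# Hypothetical outputs of the same workunit from three hosts
result_linux = [1.000, 2.500, 3.750]
result_windows = [1.001, 2.498, 3.760]   # small platform-dependent deviations
result_bad = [1.000, 2.500, 5.000]       # one value is far off

print(fuzzy_equal(result_linux, result_windows))
print(fuzzy_equal(result_linux, result_bad))
```

The first pair agrees within the tolerance and the second does not, so only the first two results would count as matching.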

Other applications are 'divergent' in the sense that small numerical differences lead to unpredictably large differences in the final output. For such applications it may be difficult to distinguish between results that are correct but differ because of numerical discrepancies, and results that are erroneous. The 'fuzzy comparison' approach does not work for such applications.
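To illustrate what 'divergent' means, the sketch below iterates the logistic map, a standard textbook example of a chaotic system, from two starting values that differ by 1e-12. The map, the starting values, and the step counts are chosen for illustration and are not taken from any BOINC application.

```python
def logistic_orbit(x0, steps, r=4.0):
    """Iterate x -> r * x * (1 - x) and return the whole orbit."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

# Two runs whose inputs differ by about one part in 10^12, standing in
# for a tiny platform-dependent rounding difference.
a = logistic_orbit(0.3, 100)
b = logistic_orbit(0.3 + 1e-12, 100)

gaps = [abs(x - y) for x, y in zip(a, b)]
early_gap = max(gaps[:6])   # largest difference over the first 5 iterations
late_gap = max(gaps)        # largest difference over all 100 iterations

print("early gap:", early_gap)
print("late gap:", late_gap)
```

Early in the run the two orbits are practically indistinguishable, but later the gap becomes comparable to the values themselves, so no fixed tolerance separates a correct result from an erroneous one.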

Eliminating discrepancies

One approach is to eliminate numerical discrepancies. Some notes on how to do this for Fortran programs are given in a paper, Massive Tracking on Heterogeneous Platforms, and in an earlier text document, both courtesy of Eric McIntosh from CERN.

Homogeneous redundancy

BOINC provides a feature called homogeneous redundancy to handle divergent applications. You can enable it for a project by including the line

<homogeneous_redundancy>N</homogeneous_redundancy>

in the config.xml file.

Alternatively, you can enable it selectively for a single application by setting the homogeneous_redundancy field in its database record.
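As a quick check of the project-wide setting, a script along the following lines could read the value back from config.xml. It is a sketch under assumptions: the sample document assumes the option sits inside the <config> element under a <boinc> root, the helper name hr_level is hypothetical, and treating a missing element as 0 is a choice made for this example.

```python
import xml.etree.ElementTree as ET

# Minimal sample; a real config.xml contains many more options.
SAMPLE_CONFIG = """<boinc>
  <config>
    <homogeneous_redundancy>2</homogeneous_redundancy>
  </config>
</boinc>"""

def hr_level(config_xml):
    """Return the homogeneous_redundancy value, or 0 if the element is absent."""
    root = ET.fromstring(config_xml)
    node = root.find("config/homogeneous_redundancy")
    if node is None or not node.text:
        return 0   # assumption for this sketch: absent means HR is off
    return int(node.text.strip())

print(hr_level(SAMPLE_CONFIG))
```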

Homogeneous redundancy (HR) divides hosts into 'numerical equivalence classes': two hosts are in the same class if they return identical results for your applications. The BOINC scheduler sends all results (job instances) of a given workunit only to hosts in a single class, which lets you use strict equality to compare redundant results.
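The idea can be sketched in a few lines. This is a simplified model, not the BOINC scheduler or validator code: the Workunit class, the rule that the first host to receive an instance fixes the workunit's class, and the quorum of 2 are all assumptions made for illustration.

```python
from collections import Counter

class Workunit:
    def __init__(self, name):
        self.name = name
        self.hr_class = None   # assumed: set by the first host that gets an instance

def send(wu, host_class):
    """Send an instance only if the workunit is unbound or bound to this class."""
    if wu.hr_class is not None and wu.hr_class != host_class:
        return False           # host is in a different equivalence class
    wu.hr_class = host_class
    return True

def validate(outputs, quorum=2):
    """Strict equality: return the output that `quorum` hosts agree on
    byte for byte, or None if there is no such agreement."""
    if not outputs:
        return None
    value, count = Counter(outputs).most_common(1)[0]
    return value if count >= quorum else None

wu = Workunit("wu_1")
sent = [send(wu, "Linux"), send(wu, "Windows"), send(wu, "Linux")]
print(sent)

# Within one class, correct outputs are expected to be identical, so any
# difference at all marks a result as suspect.
agreed = validate([b"\x01\x02\x03", b"\x01\x02\x03"])
disputed = validate([b"\x01\x02\x03", b"\x01\x02\x04"])
print(agreed, disputed)
```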

N specifies the granularity of host classification:

0
Don't use homogeneous redundancy (all hosts are numerically equivalent)
1
Use a fine-grained classification with 80 classes (4 OS types and 20 CPU types).
2
Use a coarse-grained classification in which there are 4 classes: Windows, Linux, Mac-PPC and Mac-Intel.
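The three settings could be modeled roughly as below. The Host fields and their labels are hypothetical, and the fine-grained branch simply pairs an OS label with a CPU label supplied by the caller, because the actual list of 20 CPU types is not given here.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Host:
    os_family: str    # hypothetical labels: "windows", "linux", "mac"
    cpu_family: str   # hypothetical labels: "intel", "ppc"
    cpu_model: str    # free-form CPU type label, used only when N = 1

def hr_class(host, n):
    """Map a host to its equivalence class for granularity n (0, 1 or 2)."""
    if n == 0:
        return "any"   # HR off: every host is treated as equivalent
    if n == 2:
        # Coarse: Windows, Linux, Mac-PPC, Mac-Intel
        if host.os_family == "mac":
            return "Mac-PPC" if host.cpu_family == "ppc" else "Mac-Intel"
        return {"windows": "Windows", "linux": "Linux"}[host.os_family]
    if n == 1:
        # Fine: one class per (OS, CPU type) pair
        return f"{host.os_family}/{host.cpu_model}"
    raise ValueError(f"unsupported granularity: {n}")

h1 = Host("linux", "intel", "cpu-type-a")
h2 = Host("linux", "intel", "cpu-type-b")
h3 = Host("mac", "ppc", "cpu-type-c")

print(hr_class(h1, 2), hr_class(h2, 2), hr_class(h3, 2))
print(hr_class(h1, 1), hr_class(h2, 1))
```

In this model the two Linux hosts fall into the same class under the coarse setting but into different classes under the fine setting, which is the trade-off the choice of N controls: finer classes make identical results more likely but leave fewer eligible hosts per workunit.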

The proper classification depends on your application and on how it is compiled (compiler, compiler options, math libraries) on the various platforms. World Community Grid (WCG) reports that the following gcc options (on Linux) cause their apps to produce identical results on all processor types:

-mieee-fp -O3 -fno-rtti -ffor-scope -DNDEBUG