Version 26 (modified by davea, 10 years ago)

Developing a custom validator

To create a validator, you must supply three functions:

extern int init_result(RESULT& result, void*& data);

This takes a result, reads its output file(s), parses them into a memory structure, and returns (via the 'data' argument) a pointer to this structure. The return value is:

Zero on success,
ERR_OPENDIR if there was a transient error, e.g. the output file is on a network volume that is not available. The validator will try this result again later.
Any other return value indicates a permanent error. Example: an output file is missing, or has a syntax error. The result will be marked as invalid.

To locate and open the result's output files, use utility functions such as get_output_file_path() and try_fopen() (see example below).

extern int compare_results(
    RESULT& r1, void* data1, RESULT const& r2, void* data2, bool& match
);

This takes two results and their associated memory structures. It returns (via the 'match' argument) true if the two results are equivalent (within the tolerances of the application).
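
When a result holds many floating-point values, "equivalent within the tolerances of the application" could mean an element-by-element check. The following is a minimal sketch of that idea, not a BOINC utility: the helper name values_match() and the use of an absolute (rather than relative) tolerance are assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of BOINC): returns true if both vectors
// have the same length and every pair of corresponding elements differs
// by at most abs_tol.
bool values_match(
    const std::vector<double>& a, const std::vector<double>& b, double abs_tol
) {
    assert(abs_tol >= 0);
    if (a.size() != b.size()) return false;
    for (std::size_t i = 0; i < a.size(); i++) {
        // Written as !(diff <= tol) so that a NaN difference, which fails
        // every comparison, is treated as a mismatch.
        if (!(std::fabs(a[i] - b[i]) <= abs_tol)) return false;
    }
    return true;
}
```

A compare_results() implementation could call such a helper on the two parsed structures and store the outcome in match. Whether an absolute or a relative tolerance is appropriate depends on the application.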

extern int cleanup_result(RESULT const& r, void* data);

This frees the structure pointed to by data, if it's non-NULL.

You must link these functions with the files validator.cpp, validate_util.cpp, and validate_util2.cpp. The result is your custom validator.

If you need to access the WORKUNIT from init_result() or the other functions, look it up by ID:

DB_WORKUNIT wu;
wu.lookup_id(result.workunitid);

Runtime outliers

BOINC's mechanisms for estimating job runtimes assume that a job's computation is roughly proportional to its FLOPs estimate (workunit.rsc_fpops_est). If some jobs are exceptions to this (e.g. jobs that exit immediately because of unusual input data), these mechanisms will work better if you label those jobs as outliers. To do this, set

result.runtime_outlier = true;

in init_result() for these results.
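
How an outlier is recognized is up to the application. The sketch below assumes a purely hypothetical convention in which the application writes the token "early_exit" at the start of its output when it stops immediately. RESULT_STANDIN is a stand-in type so the block compiles on its own; a real validator would set the field on BOINC's RESULT instead.

```cpp
#include <string>

// Stand-in for the single RESULT field used here (assumption: a real
// validator uses RESULT from boinc_db.h, not this struct).
struct RESULT_STANDIN {
    bool runtime_outlier = false;
};

// Hypothetical convention: the application writes "early_exit" as the
// first token of its output when it stops on unusual input.
bool is_early_exit(const std::string& first_token) {
    return first_token == "early_exit";
}

// What init_result() might do once it has read the first output token.
void label_outlier(RESULT_STANDIN& result, const std::string& first_token) {
    if (is_early_exit(first_token)) {
        result.runtime_outlier = true;
    }
}
```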

Example

Here's an example for an application whose output file contains an integer and a double. Two results are considered equivalent if the integers are equal and the doubles differ by no more than 0.01.

#include <string>
#include <vector>
#include <stdio.h>
#include <math.h>
#include "error_numbers.h"
#include "boinc_db.h"
#include "sched_util.h"
#include "validate_util.h"
using std::string;
using std::vector;

struct DATA {
    int i;
    double x;
};

int init_result(RESULT& result, void*& data) {
    FILE* f;
    OUTPUT_FILE_INFO fi;
    int i, n, retval;
    double x;

    // Locate and open the result's output file.
    retval = get_output_file_path(result, fi.path);
    if (retval) return retval;
    retval = try_fopen(fi.path.c_str(), f, "r");
    if (retval) return retval;

    // Parse an integer and a double; fscanf needs %lf to read a double.
    n = fscanf(f, "%d %lf", &i, &x);
    fclose(f);
    if (n != 2) return ERR_XML_PARSE;

    // Return the parsed values to the validator via 'data'.
    DATA* dp = new DATA;
    dp->i = i;
    dp->x = x;
    data = (void*) dp;
    return 0;
}

int compare_results(
    RESULT& r1, void* _data1, RESULT const& r2, void* _data2, bool& match
) {
    DATA* data1 = (DATA*)_data1;
    DATA* data2 = (DATA*)_data2;
    match = true;
    if (data1->i != data2->i) match = false;
    if (fabs(data1->x - data2->x) > 0.01) match = false;
    return 0;
}

int cleanup_result(RESULT const& r, void* data) {
    if (data) delete (DATA*) data;
    return 0;
}
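
The parsing step of init_result() can be tried outside BOINC by factoring it into a small function. The following sketch is an illustration under assumptions: parse_output() is a hypothetical helper name, not a BOINC utility, and DATA repeats the struct from the example so the block compiles on its own.

```cpp
#include <cstdio>

struct DATA {
    int i;
    double x;
};

// Hypothetical helper (not part of BOINC): reads an integer and a double
// from an open stream. Returns true and fills 'out' only if both values
// were parsed.
bool parse_output(FILE* f, DATA& out) {
    int i;
    double x;
    // %lf tells fscanf that the target is a double, not a float.
    if (fscanf(f, "%d %lf", &i, &x) != 2) return false;
    out.i = i;
    out.x = x;
    return true;
}
```

With a helper like this, init_result() would only need to open the file, call the helper, and map a failed parse to an error code such as ERR_XML_PARSE.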

Using scripting languages

The validator script_validator lets you write your validation logic in the language of your choice (e.g. Python, PHP, Perl, Java, or bash). script_validator takes two additional command-line arguments:

--init_script filename: name of "init script" to check a result.
--compare_script filename: name of script to compare two results.

The init script is called as

filename f1 ... fn

where f1 ... fn are the output files of a job (there may be just one). It exits with zero if the files are valid.

The compare script is called as

filename f1 ... fn g1 ... gn

where f1 ... fn are the output files of one job, and g1 ... gn are the output files of another job. It exits with zero if the files are equivalent.

The scripts must be put in your project's bin/ directory.

For applications that don't use replication, the compare script need not be given.

As an example, the following PHP script, used as a compare script for jobs with a single output file, would require that results match exactly:

#!/usr/bin/env php
<?php

$f1 = $argv[1];
$f2 = $argv[2];
if (md5_file($f1) != md5_file($f2)) {
    fwrite(STDERR, "$f1 and $f2 don't match\n");
    exit(1);
}

?>

Testing your validator

While you're developing a validator, it's convenient to run it in "standalone mode", i.e. run it manually against particular output files. BOINC provides a test harness that lets you do this:

In boinc/sched/, copy makefile_validator_test to your own file, say makefile_vt.
Edit this makefile, changing VALIDATOR_SRC to refer to the .cpp file containing your init_result(), compare_results(), and cleanup_result() functions.
Do make -f makefile_vt.

This creates a program validator_test. Do

validator_test file1 file2

to test your code against the given output files. It will show the result of each function call.

Notes:

Currently this assumes that results have a single output file. If you need this generalized, let us know.
