[[PageOutline]]

= Developing a custom validator =

To create a validator, you must supply three functions:
{{{
extern int init_result(RESULT& result, void*& data);
}}}
This takes a result, reads its output file(s), parses them into a memory structure, and returns (via the 'data' argument) a pointer to this structure. The return value is:
 * Zero on success.
 * ERR_OPENDIR if there was a transient error, e.g. the output file is on a network volume that is not available. The validator will try this result again later.
 * Any other return value, indicating a permanent error. Example: an output file is missing, or has a syntax error. The result will be marked as invalid.
To locate and open the result's output files, use [BackendUtilities utility functions] such as '''get_output_file_path()''' and '''try_fopen()''' (see the example below).
{{{
extern int compare_results(
    RESULT& r1, void* data1, RESULT& r2, void* data2, bool& match
);
}}}
This takes two results and their associated memory structures, and returns (via the 'match' argument) true if the two results are equivalent (within the tolerances of the application).
{{{
extern int cleanup_result(RESULT& r, void* data);
}}}
This frees the structure pointed to by 'data', if it's non-NULL.

Link these functions with the files validator.cpp, validate_util.cpp, and validate_util2.cpp; the result is your custom validator.

If for some reason you need to access the WORKUNIT in your init_result() etc. functions, look it up by ID:
{{{
DB_WORKUNIT wu;
wu.lookup_id(result.workunitid);
}}}
'''Note:''' You need commit [[https://github.com/BOINC/boinc/commit/b065526b7eae842997765523f3223138f59f3ab3|b065526]] in order to access result.workunit inside e.g. `init_result()`.

== Runtime outliers ==

BOINC's mechanisms for estimating job runtimes are based on the assumption that a job's computation is roughly proportional to its FLOPS estimate (workunit.rsc_fpops_est). If there are exceptions to this (e.g.
jobs that exit immediately because of unusual input data), these mechanisms will work better if you label such jobs as outliers. To do this, set
{{{
result.runtime_outlier = true;
}}}
in init_result() for these results.

== Example ==

Here's an example for an application whose output file contains an integer and a double. Two results are considered equivalent if the integers are equal and the doubles differ by no more than 0.01.
{{{
#include <cstdio>
#include <cmath>
#include <string>
#include <vector>

#include "error_numbers.h"
#include "boinc_db.h"
#include "sched_util.h"
#include "validate_util.h"

using std::string;
using std::vector;

struct DATA {
    int i;
    double x;
};

int init_result(RESULT& result, void*& data) {
    FILE* f;
    OUTPUT_FILE_INFO fi;
    int i, n, retval;
    double x;

    retval = get_output_file_path(result, fi.path);
    if (retval) return retval;
    retval = try_fopen(fi.path.c_str(), f, "r");
    if (retval) return retval;
    n = fscanf(f, "%d %lf", &i, &x);
    fclose(f);
    if (n != 2) return ERR_XML_PARSE;
    DATA* dp = new DATA;
    dp->i = i;
    dp->x = x;
    data = (void*) dp;
    return 0;
}

int compare_results(
    RESULT& r1, void* _data1, RESULT& r2, void* _data2, bool& match
) {
    DATA* data1 = (DATA*)_data1;
    DATA* data2 = (DATA*)_data2;
    match = true;
    if (data1->i != data2->i) match = false;
    if (fabs(data1->x - data2->x) > 0.01) match = false;
    return 0;
}

int cleanup_result(RESULT& r, void* data) {
    if (data) delete (DATA*) data;
    return 0;
}
}}}

== Using scripting languages ==

The validator '''script_validator''' allows you to write your validation logic in your language of choice (Python, PHP, Perl, Java, bash). '''script_validator''' takes two additional command-line arguments:

 '''--init_script "filename arg1 ... argn"''':: script to check the validity of a result; it should exit zero if the result is valid.
 '''--compare_script "filename arg1 ... argn"''':: script to compare two results; it should exit zero if the outputs are equivalent.

'''arg1 ... argn''' represent command-line arguments to be passed to the scripts.
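A compare script need not be written in an interpreted language: any executable that follows the exit-code convention above will do. Here's a hedged sketch of the core of such a program in C++; '''slurp()''' and '''compare_files()''' are illustrative names, not BOINC APIs, and exact byte-for-byte matching is assumed as the equivalence rule:

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Read a whole file into a string (binary-safe).
// Illustrative helper; not part of the BOINC API.
static std::string slurp(const char* path) {
    std::ifstream in(path, std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}

// Returns the exit status for the compare-script convention:
// 0 = outputs equivalent (here: byte-identical), 1 = different.
int compare_files(const char* path1, const char* path2) {
    return slurp(path1) == slurp(path2) ? 0 : 1;
}
```

Wrapped in a `main()` that returns `compare_files(argv[1], argv[2])`, this compiles to an executable you can pass as a compare script.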
The options for init_script are:
 '''files''':: list of paths of the result's output files
 '''result_id''':: result ID
 '''runtime''':: task runtime in seconds

Additional options for compare_script, giving the same information for the second result:
 '''files2''':: list of paths of output files
 '''result_id2''':: result ID
 '''runtime2''':: task runtime in seconds

'''arg1 ... argn''' can be omitted, in which case only the output file paths are passed to the scripts.

The scripts must be put in your project's bin/ directory. For applications that don't use replication, the compare script need not be given. For applications that don't need output file syntax checking, the init script need not be given.

As an example, the following PHP script, used as a compare script, requires that results match exactly:
{{{
#!/usr/bin/env php
<?php
$f1 = file_get_contents($argv[1]);
$f2 = file_get_contents($argv[2]);
exit($f1 === $f2 ? 0 : 1);
?>
}}}
The corresponding entry in config.xml would look like:
{{{
script_validator --app uppercase -d 3 --compare_script compare.php
}}}

== Testing your validator ==

While you're developing a validator, it's convenient to run it in "standalone mode", i.e. to run it manually against particular output files. BOINC provides a test harness that lets you do this:
 * In boinc/sched/, copy '''makefile_validator_test''' to your own file, say '''makefile_vt'''.
 * Edit this makefile, changing VALIDATOR_SRC to refer to the .cpp file containing your init_result(), compare_results(), and cleanup_result() functions.
 * Do '''make -f makefile_vt'''. This creates a program '''validator_test'''.
Then do
{{{
validator_test file1 file2
}}}
to test your code against the given output files. It will show the result of each function call.

Notes:
 * Currently this assumes that results have a single output file. If you need this generalized, let us know.
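A complementary tactic while developing is to unit-test your comparison logic in isolation, before involving the harness or the BOINC headers. This sketch pulls the parsing and equivalence rule out of the example validator above (same DATA layout and 0.01 tolerance); '''parse_output()''' and '''equivalent()''' are illustrative names, not BOINC functions:

```cpp
#include <cmath>
#include <cstdio>

// Same structure as in the example validator above.
struct DATA {
    int i;
    double x;
};

// Parse "int double" from a text buffer, mirroring what the
// example's init_result() reads from the output file.
bool parse_output(const char* buf, DATA& d) {
    return sscanf(buf, "%d %lf", &d.i, &d.x) == 2;
}

// The example's equivalence rule: integers equal,
// doubles differing by no more than 0.01.
bool equivalent(const DATA& a, const DATA& b) {
    if (a.i != b.i) return false;
    if (std::fabs(a.x - b.x) > 0.01) return false;
    return true;
}
```

Once these pass your own tests, drop the same logic into init_result() and compare_results() and build the harness as described above.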