Last modified on 04/19/09 08:19:13

Low-level validator framework

BOINC's simple validator framework is sufficient in almost all cases. If for some reason you need more control, you can use the low-level framework (on which the simple framework is based).

To make a validator program using the low-level framework, link validator.cpp with two application-specific functions:

int check_set(
    vector<RESULT> results, DB_WORKUNIT& wu, int& canonicalid,
    double& credit, bool& retry
);
  • check_set() takes a set of results (all with outcome=SUCCESS). It reads and compares their output files. If there is a quorum of matching results, it selects one of them as the canonical result, returning its ID. In this case it also returns the credit to be granted for correct results for this workunit.
  • If an output file for a result has a nonrecoverable error (e.g. the directory is there but the file isn't, or the file is present but has invalid contents), check_set() must set, for that result (in memory, not in the database), outcome=RESULT_OUTCOME_VALIDATE_ERROR and validate_state=VALIDATE_STATE_INVALID.

Use BOINC's back-end utility functions (in sched/validate_util.cpp) to get file pathnames and to distinguish recoverable and nonrecoverable file-open errors.

  • If a canonical result is found, check_set() must set the validate_state field of each non-ERROR result (in memory, not database) to either validate_state=VALIDATE_STATE_VALID or validate_state=VALIDATE_STATE_INVALID.
  • If a recoverable error occurs while reading output files (e.g. a directory wasn't visible due to NFS mount failure) then check_set() should return retry=true. This tells the validator to arrange for this WU to be processed again in a few hours.
  • check_set() should return nonzero if a major error occurs. This tells the validator to write an error message and exit.
int check_pair(RESULT& new_result, RESULT& canonical_result, bool& retry);
  • check_pair() compares a new result to the canonical result. In the absence of errors, it sets the new result's validate_state to either VALIDATE_STATE_INVALID or VALIDATE_STATE_VALID.
  • If it has a nonrecoverable error reading an output file of either result, or if the new result's output file is invalid, it must set the new result's outcome (in memory, not database) to RESULT_OUTCOME_VALIDATE_ERROR.
  • If it has a recoverable error while reading an output file of either result, it returns retry=true, which causes the validator to arrange for the WU to be examined again in a few hours.
  • check_pair() should return nonzero if a major error occurs. This tells the validator to write an error message and exit.

Neither function should delete files or access the BOINC database.

Examples of these two functions may be found in validate_util2.cpp, which implements the simple validator framework.
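
The check_pair() rules above can be sketched as follows. This is only an illustration under stated assumptions, not the code in validate_util2.cpp: the RESULT struct, the numeric constant values, the ReadStatus enum and the read_output() helper are all stand-ins invented so that the sketch compiles without the BOINC headers. A real validator would read the output files from disk instead. Checking for recoverable errors before nonrecoverable ones is also a choice made here; the rules above do not say which takes priority.

```cpp
#include <string>

// Stand-ins for the BOINC definitions; the numeric values are placeholders.
enum { RESULT_OUTCOME_SUCCESS = 1, RESULT_OUTCOME_VALIDATE_ERROR = 2 };
enum { VALIDATE_STATE_INIT = 0, VALIDATE_STATE_VALID = 1, VALIDATE_STATE_INVALID = 2 };

// Hypothetical outcome of trying to read a result's output file.
enum ReadStatus { READ_OK, READ_RECOVERABLE, READ_NONRECOVERABLE };

// Hypothetical, cut-down RESULT. The last two fields replace real file I/O.
struct RESULT {
    int id;
    int outcome;
    int validate_state;
    ReadStatus read_status;  // what a read attempt would report
    std::string output;      // contents of the output file, if readable
};

// Hypothetical helper: "reads" the output file of r into data.
static ReadStatus read_output(const RESULT& r, std::string& data) {
    if (r.read_status == READ_OK) data = r.output;
    return r.read_status;
}

int check_pair(RESULT& new_result, RESULT& canonical_result, bool& retry) {
    std::string new_data, canonical_data;
    ReadStatus s_new = read_output(new_result, new_data);
    ReadStatus s_can = read_output(canonical_result, canonical_data);

    // Recoverable error on either file: ask the validator to try again later.
    if (s_new == READ_RECOVERABLE || s_can == READ_RECOVERABLE) {
        retry = true;
        return 0;
    }
    // Nonrecoverable error on either file: flag the new result (in memory only).
    if (s_new == READ_NONRECOVERABLE || s_can == READ_NONRECOVERABLE) {
        new_result.outcome = RESULT_OUTCOME_VALIDATE_ERROR;
        return 0;
    }
    // No errors: the new result is valid only if it matches the canonical one.
    new_result.validate_state = (new_data == canonical_data)
        ? VALIDATE_STATE_VALID
        : VALIDATE_STATE_INVALID;
    return 0;
}
```

The exact string comparison stands in for whatever application-specific comparison the project needs, for example a tolerance-based comparison of floating-point output.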

Pseudocode

int check_set(
  vector<RESULT> results,
  DB_WORKUNIT&   wu,
  int&           canonicalid,
  double&        credit,
  bool&          retry
);

Define N := length of result vector, and let M := N.

check_set() will ALWAYS be called with N>=wu.min_quorum.

check_set() will ALWAYS be called with ALL results satisfying
    result.outcome == RESULT_OUTCOME_SUCCESS
    result.validate_state == VALIDATE_STATE_INIT

check_set() should NEVER modify wu [although it is not declared
    const]

[1] Syntax pass (optional)

 for (all N results) {

   if (one or more of the result's output files can be read
       and one or more of those files contains erroneous or
       invalid or incorrect output, i.e. bad file syntax)
   {
     set result.outcome=RESULT_OUTCOME_VALIDATE_ERROR; 
     set result.validate_state=VALIDATE_STATE_INVALID;
     decrement counter: M = M-1;
   } // erroneous or incorrect or invalid output files

   else if (result has a potentially recoverable error,
            e.g. NFS directory not mounted, server or
            upload server unreachable)
   {
     do not modify result.validate_state;
     do not modify result.outcome;
     decrement counter M = M-1;
     set retry=true;
   } // recoverable error
   
   else if (every output file of the result is unreadable or
            fails to exist)
   {
     set result.outcome=RESULT_OUTCOME_VALIDATE_ERROR;
     set result.validate_state=VALIDATE_STATE_INVALID;
     decrement counter: M = M-1;
   } // all result output files unreadable or nonexistent
   
 } // end of syntax pass loop over all N results

 Define REMAINING RESULTS to be those that do NOT fall into one of
 the three categories above. There are M of these. If the syntax pass
 has been skipped, then M == N.

 if (M < wu.min_quorum)
 {
   don't modify canonicalid;
   don't modify credit;
   leave retry as set above;
   leave result.outcome unchanged for M remaining results;
   leave result.validate_state unchanged for M remaining results;
   return 0;
 } // fewer than min_quorum results remain

 At any point in this process, if a major error occurs, check_set()
 should return nonzero.  This will cause the validator to exit.  If
 this happens, it does not matter how you have set or modified
 result.outcome, result.validate_state, retry, credit, or canonicalid.

// END OF OPTIONAL SYNTAX PASS
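
As a rough C++ illustration of the syntax pass, the sketch below classifies each result and returns the count M of remaining results. It rests on several assumptions: RESULT is a cut-down stand-in, the constant values are placeholders, and the FileCheck enum and the syntax_pass() name are hypothetical. Actually inspecting the output files is application-specific and is replaced here by a precomputed field. The sketch takes the vector by reference so that the in-memory changes are visible to the caller.

```cpp
#include <cstddef>
#include <vector>

// Stand-ins for the BOINC definitions; the numeric values are placeholders.
enum { RESULT_OUTCOME_SUCCESS = 1, RESULT_OUTCOME_VALIDATE_ERROR = 2 };
enum { VALIDATE_STATE_INIT = 0, VALIDATE_STATE_VALID = 1, VALIDATE_STATE_INVALID = 2 };

// Hypothetical classification of a result's output files.
enum FileCheck {
    FILES_OK,                 // readable and apparently valid
    FILES_BAD_SYNTAX,         // readable, but contents are invalid
    FILES_RECOVERABLE_ERROR,  // e.g. NFS directory not mounted
    FILES_MISSING             // unreadable or nonexistent
};

// Hypothetical, cut-down RESULT.
struct RESULT {
    int id;
    int outcome;
    int validate_state;
    FileCheck files;  // stands in for inspecting the real files
};

// Returns M, the number of results that survive the syntax pass.
// Sets retry to true if any result hit a recoverable error.
std::size_t syntax_pass(std::vector<RESULT>& results, bool& retry) {
    std::size_t m = results.size();
    for (RESULT& r : results) {
        switch (r.files) {
        case FILES_BAD_SYNTAX:
        case FILES_MISSING:
            // Nonrecoverable: mark the result, following the check_set()
            // description for nonrecoverable errors.
            r.outcome = RESULT_OUTCOME_VALIDATE_ERROR;
            r.validate_state = VALIDATE_STATE_INVALID;
            --m;
            break;
        case FILES_RECOVERABLE_ERROR:
            // Leave outcome and validate_state alone; just ask for a retry.
            retry = true;
            --m;
            break;
        case FILES_OK:
            break;
        }
    }
    return m;
}
```

If the returned count is below wu.min_quorum, check_set() would return 0 at this point without touching canonicalid or credit.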


[2] Comparison pass (required).  We have
    M>=wu.min_quorum REMAINING RESULTS with
      result.outcome == RESULT_OUTCOME_SUCCESS
      result.validate_state == VALIDATE_STATE_INIT

   All the output files of all of these results are
   readable.  All of the output files for a given result
   are, when taken "in isolation" apparently valid.  [If
   these conditions are not met then you must do the
   "syntax pass" above.]

   if (one of these results is determined to be THE correct
       [canonical] result)
   {

     for (correct result) {
       set result.validate_state=VALIDATE_STATE_VALID;
       set canonicalid=result.id;
     } // canonical result

     for (the REMAINING M - 1 results)
     {
       // NOTE: what is below can be done by calling
       // check_pair(result, canonical_result, retry)
       if (result is correct, matches canonical)
       {
         result.validate_state=VALIDATE_STATE_VALID;
       }
       else
       {
         result.validate_state=VALIDATE_STATE_INVALID;
       }
     } // loop over remaining M-1 results

     set credit;

     leave retry as set from the syntax pass above;

     return 0;
   } // found canonical result
   else
   {
     // You are UNABLE to determine if one of the M REMAINING RESULTS
     // is correct, so:
    
     do not modify result.outcome for ANY of M remaining results;
     do not modify result.validate_state for ANY of M remaining results;
     do not set credit;
     do not set canonicalid;
     leave retry as set from the syntax pass;
     return 0;
    
   } // did not find canonical result
    
 At any point in this process, if a major error occurs, check_set()
 should return nonzero.  This will cause the validator to exit.  If
 this happens, it does not matter how you have set result.outcome,
 result.validate_state, retry, credit, or canonicalid for ANY of the
 results.
    
// end of Comparison pass
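
One possible shape for the comparison pass is sketched below in C++. It is a sketch under assumptions, not the simple framework's implementation: RESULT is a cut-down stand-in whose output field replaces the real output files, the constant values are placeholders, and the comparison_pass() name and its credit_if_valid parameter are hypothetical. How credit is computed is left to the project. The function receives only the M remaining results and treats a result as canonical when at least min_quorum results, itself included, have identical output.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Stand-ins for the BOINC definitions; the numeric values are placeholders.
enum { RESULT_OUTCOME_SUCCESS = 1, RESULT_OUTCOME_VALIDATE_ERROR = 2 };
enum { VALIDATE_STATE_INIT = 0, VALIDATE_STATE_VALID = 1, VALIDATE_STATE_INVALID = 2 };

// Hypothetical, cut-down RESULT.
struct RESULT {
    int id;
    int outcome;
    int validate_state;
    std::string output;  // stands in for the contents of the output files
};

// Returns true if a canonical result was found. In that case canonicalid,
// credit and every validate_state are set; otherwise nothing is modified.
bool comparison_pass(std::vector<RESULT>& remaining, std::size_t min_quorum,
                     double credit_if_valid, int& canonicalid, double& credit) {
    for (std::size_t i = 0; i < remaining.size(); ++i) {
        // Count the results that agree with candidate i (including itself).
        std::size_t matches = 0;
        for (const RESULT& r : remaining) {
            if (r.output == remaining[i].output) ++matches;
        }
        if (matches < min_quorum) continue;

        // Candidate i belongs to a quorum: make it the canonical result.
        canonicalid = remaining[i].id;
        for (RESULT& r : remaining) {
            r.validate_state = (r.output == remaining[i].output)
                ? VALIDATE_STATE_VALID
                : VALIDATE_STATE_INVALID;
        }
        credit = credit_if_valid;
        return true;
    }
    // No quorum of matching results: leave all outputs untouched.
    return false;
}
```

In either case check_set() would then return 0, leaving retry as set by the syntax pass.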