Changes between Initial Version and Version 1 of BackendState


Timestamp:
Apr 25, 2007, 2:55:54 PM (17 years ago)
Author:
Nicolas
Comment:

Converted by an automatic script

  • BackendState

     1= Workunit and result state transitions =
     2
     3The processing of workunits and results can be described in terms of transitions of their state variables.
     4
     5
     6=== Workunit state variables ===
     7 Workunit parameters are described [JobIn here].
     8
     9Workunit state variables are as follows:
     10
     11
     12
     13||'''canonical_resultid'''||The ID of the canonical result for this workunit, or zero.
     14 * Initially zero
     15 * Set by the validator (by check_set())
     16
     17||||'''transition_time'''||The next time to check for state transitions for this WU.
     18 * Initially now.
     19 * Set to now by scheduler when it gets a result for this WU.
     20 * Set to min(current value, now + delay_bound) by scheduler when it sends a result for this WU.
     21 * Set to min(x.sent_time + wu.delay_bound) over IN_PROGRESS results x by transitioner when it is done handling this WU.
     22 * Set to now by validator if it finds a canonical result, or if there is already a canonical result and some other results have validate_state = INIT, or if there is no consensus and the number of successful results is > wu.max_success_results.
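
The transitioner rule above (the minimum of x.sent_time + wu.delay_bound over IN_PROGRESS results) can be sketched as follows. This is an illustrative sketch only, not the actual transitioner code: the `Result` class and the function name are hypothetical, and returning infinity when no result is in progress is an assumption chosen to match the "eventually transition_time = infinity" invariant.

```python
import math
from dataclasses import dataclass

IN_PROGRESS = "IN_PROGRESS"


@dataclass
class Result:
    """Hypothetical stand-in for a result row; only the fields used here."""
    server_state: str
    sent_time: float = 0.0


def next_transition_time(results, delay_bound):
    """Return the earliest deadline among IN_PROGRESS results.

    Assumption: with no IN_PROGRESS results there is nothing to wait for,
    so the next check time is infinity.
    """
    deadlines = [
        r.sent_time + delay_bound
        for r in results
        if r.server_state == IN_PROGRESS
    ]
    return min(deadlines, default=math.inf)
```

Results in any other state do not contribute a deadline, so a workunit whose results are all OVER is never rechecked on a timer under this sketch.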
     23
     24||||'''file_delete_state'''||Indicates whether input files should be deleted.
     25 * Initially INIT
     26 * Set to READY by transitioner when all results have server_state=OVER and wu.assimilate_state=DONE. Note: db_purge purges a WU and all its results when file_delete_state=DONE; it is therefore critical that file_delete_state be set to DONE only if all results have server_state=OVER.
     27 * Set to DONE by file_deleter when it has attempted to delete files.
     28
     29||||'''assimilate_state'''||Indicates whether the workunit should be assimilated.
     30 * Initially INIT
     31 * Set to READY by transitioner if wu.assimilate_state=INIT and the WU has an error condition
     32 * Set to READY by validator when it finds a canonical result and wu.assimilate_state=INIT
     33 * Set to DONE by assimilator when done
     34
     35||||'''need_validate'''||Indicates that the workunit has a result that needs validation.
     36 * Initially FALSE
     37 * Set to TRUE by transitioner if the number of success results is at least wu.min_quorum and there is a success result that has not yet been validated
     38 * Set to FALSE by validator
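
The condition under which the transitioner sets need_validate can be sketched as below. This is a minimal sketch under stated assumptions: results are modeled as plain dictionaries, the function name is hypothetical, and "not validated yet" is taken to mean validate_state = INIT.

```python
def needs_validation(results, min_quorum):
    """Return True if need_validate should be set for a workunit.

    Assumption: a "success result" is one with server_state OVER and
    outcome SUCCESS, and it is unvalidated while validate_state is INIT.
    """
    successes = [
        r for r in results
        if r["server_state"] == "OVER" and r.get("outcome") == "SUCCESS"
    ]
    has_unvalidated = any(r["validate_state"] == "INIT" for r in successes)
    return len(successes) >= min_quorum and has_unvalidated
```

Both parts of the condition matter: a quorum of already validated results does not trigger the validator again, and a single unvalidated success below the quorum does not trigger it either.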
     39
     40||||'''error_mask'''||A bit mask for error conditions.
     41 * Initially zero
     42 * Transitioner sets COULDNT_SEND_RESULT if some result couldn't be sent.
     43 * Transitioner sets TOO_MANY_RESULTS if there are too many error results.
     44 * Transitioner sets TOO_MANY_TOTAL_RESULTS if there are too many total results.
     45 * Validator sets TOO_MANY_SUCCESS_RESULTS if there is no consensus and there are too many success results.
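
As an illustration of how a bit mask like error_mask accumulates conditions, here is a small sketch. The flag names follow the list above, but the numeric bit values and the class and function names are assumptions made for this example, not the values used by the server code.

```python
from enum import IntFlag


class WuError(IntFlag):
    """Hypothetical bit assignments for the error conditions listed above."""
    COULDNT_SEND_RESULT = 1
    TOO_MANY_RESULTS = 2
    TOO_MANY_SUCCESS_RESULTS = 4
    TOO_MANY_TOTAL_RESULTS = 8


def has_error_condition(error_mask):
    """A workunit has an error condition once any bit is set."""
    return int(error_mask) != 0


# Usage: bits are OR-ed in as daemons detect conditions.
mask = WuError(0)                       # initially zero
mask |= WuError.COULDNT_SEND_RESULT     # set by the transitioner
mask |= WuError.TOO_MANY_TOTAL_RESULTS  # set by the transitioner
```

Because each condition has its own bit, several errors can be recorded on the same workunit without overwriting one another.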
     46
     47||

Workunit invariants:
     48
     49
     50 * Eventually either canonical_resultid or error_mask is set.
     51 * Eventually transition_time = infinity.
     52 * Each WU is assimilated exactly once.
     53
     54Notes on deletion of input files:
     55
     56
     57 * Input files are eventually deleted, but only when all results have server_state=OVER (so that clients don't get download failures) and the WU has been assimilated (in case the project wants to examine input files in error cases).
     58
     59
     60=== Result state variables ===
     61 Result state variables are listed in the following table:
     62
     63||'''report_deadline'''||Give up on the result (and possibly delete input files) if no reply is received by this time.
     64 * Set by scheduler to now + wu.delay_bound when it sends the result
     65
     66||||'''server_state'''||Values: UNSENT, IN_PROGRESS, OVER
     67 * Initially UNSENT
     68 * Set by scheduler to IN_PROGRESS when it sends the result
     69 * Set by scheduler to OVER when the result is reported in a request message from the client.
     70 * Set by scheduler to OVER when it thinks the host has detached from the project.
     71 * Set by transitioner to OVER if now > result.report_deadline
     72 * Set by transitioner to OVER if the WU has an error condition and result.server_state=UNSENT
     73 * Set by validator to OVER if the WU has a canonical result and result.server_state=UNSENT
     74
     75||||'''outcome'''||Values: SUCCESS, COULDNT_SEND, CLIENT_ERROR, NO_REPLY, DIDNT_NEED, VALIDATE_ERROR, CLIENT_DETACHED. Defined iff result.server_state=OVER
     76 * Set by scheduler to SUCCESS if it gets a reply and there is no client error
     77 * Set by scheduler to CLIENT_ERROR if it gets a reply and there is a client error
     78 * Set by scheduler to NO_REPLY if it thinks the host has detached from the project.
     79 * Set by transitioner to NO_REPLY if server_state=IN_PROGRESS and now > report_deadline
     80 * Set by transitioner to DIDNT_NEED if the WU has an error condition and result.server_state=UNSENT
     81 * Set by validator to DIDNT_NEED if the WU has a canonical result and result.server_state=UNSENT
     82 * Set by validator to VALIDATE_ERROR if outcome was initially SUCCESS, but the validator had a permanent error reading a result file, or a file had a syntax error. This prevents the validator from trying again.
     83 * Set by scheduler to CLIENT_DETACHED if it gets a request indicating that the client detached, then reattached
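
The transitioner's part of the server_state and outcome rules can be sketched together, since outcome is defined only once server_state is OVER. This is a minimal sketch, not the real transitioner: results are modeled as dictionaries, the function name is hypothetical, and the timeout test follows the server_state rule "now > result.report_deadline".

```python
def transitioner_pass(result, now, wu_has_error):
    """Apply the transitioner's rules to one result, in place.

    Assumptions: `result` is a dict with server_state and report_deadline,
    and `wu_has_error` means the workunit's error_mask is nonzero.
    """
    state = result["server_state"]
    if state == "IN_PROGRESS" and now > result["report_deadline"]:
        # The client missed the deadline: give up on this result.
        result["server_state"] = "OVER"
        result["outcome"] = "NO_REPLY"
    elif state == "UNSENT" and wu_has_error:
        # The workunit already failed: never send this result.
        result["server_state"] = "OVER"
        result["outcome"] = "DIDNT_NEED"
    return result
```

A result that is still IN_PROGRESS before its deadline is left untouched, so its outcome stays undefined until a later pass or a scheduler report.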
     84
     85||||'''client_state'''||Records the client state (DOWNLOADING, DOWNLOADED, COMPUTE_ERROR, UPLOADING, UPLOADED, ABORTED) where an error occurred. Defined if outcome is CLIENT_ERROR.
     ||||'''file_delete_state'''||
     86 * Initially INIT
     87 * Set by transitioner to READY if this is the canonical result, and file_delete_state=INIT, and wu.assimilate_state=DONE, and all the results have server_state=OVER, and all the results with outcome=SUCCESS have validate_state!=INIT
     88 * Set by transitioner to READY if wu.assimilate_state=DONE and (result.outcome=CLIENT_ERROR or result.validate_state!=INIT)
     89
     90||||'''validate_state'''||Defined iff result.outcome=SUCCESS
     91 * Initially INIT
     92 * Set by validator to VALID if outcome=SUCCESS and matches canonical result
     93 * Set by validator to INVALID if outcome=SUCCESS and doesn't match canonical result
     94 * Set by transitioner to NO_CHECK if the WU had an error; this avoids showing claimed credit as 'pending'.
     95 * Set by validator to ERROR if outcome=SUCCESS and it had a permanent error trying to read an output file, or an output file had a syntax error.
     96 * Set by validator to INCONCLUSIVE if check_set() didn't find a consensus in a set of results containing this one.
     97 * Set by scheduler to TOO_LATE if the result was reported after the canonical result's files were deleted.
     98
     99||
     100
     101Result invariants:
     102
     103
     104 * Eventually server_state = OVER.
     105 * Output files are eventually deleted.
     106
     107 Notes on deletion of output files:
     108 * Non-canonical results can be deleted as soon as the WU is assimilated.
     109 * Canonical results can be deleted only when all results have server_state=OVER and all success results are validated.
     110 * If a result reply arrives after its timeout, the output files can be immediately deleted.
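
The first two notes above can be expressed as a single predicate. This is a sketch under stated assumptions: workunits and results are modeled as dictionaries, the function name is hypothetical, and the canonical case also requires the WU to be assimilated, as in the file_delete_state rules for results.

```python
def can_delete_output_files(result, wu, results):
    """Return True if this result's output files may be deleted.

    Assumptions: `wu` has assimilate_state and canonical_resultid,
    each result has id, server_state, validate_state and, once it is
    OVER, an outcome.
    """
    if wu["assimilate_state"] != "DONE":
        return False
    if result["id"] != wu["canonical_resultid"]:
        # Non-canonical: deletable as soon as the WU is assimilated.
        return True
    # Canonical: later results may still need to be compared against it.
    all_over = all(r["server_state"] == "OVER" for r in results)
    all_validated = all(
        r["validate_state"] != "INIT"
        for r in results
        if r.get("outcome") == "SUCCESS"
    )
    return all_over and all_validated
```

The canonical result is held longer because a late success result can only be validated while the canonical output files still exist.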
     111
     112 How do we delete output files that arrive REALLY late (e.g. uploaded after all results have timed out, and never reported)? Possible answer: let X = create time of the oldest unassimilated WU. Any output files created before X can be deleted.
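
The possible answer above can be sketched as a cutoff computation. This is only an illustration of the proposal, not an existing tool: the dictionary fields and function names are hypothetical, and deleting nothing when every WU is already assimilated is a conservative assumption, since X is undefined in that case.

```python
def late_file_cutoff(workunits):
    """Return X, the create time of the oldest unassimilated WU, or None."""
    pending = [
        wu["create_time"]
        for wu in workunits
        if wu["assimilate_state"] != "DONE"
    ]
    return min(pending) if pending else None


def deletable_output_files(files, workunits):
    """Return the names of output files created before the cutoff X.

    Assumption: when there is no unassimilated WU, X is undefined and
    nothing is selected for deletion.
    """
    cutoff = late_file_cutoff(workunits)
    if cutoff is None:
        return []
    return [f["name"] for f in files if f["create_time"] < cutoff]
```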