Changes between Version 8 and Version 9 of CreditNew

Author: davea (IP: 128.32.18.181)
Timestamp: 11/04/09 12:24:38 (3 weeks ago)
Comment: --

  • CreditNew

    v8  v9
    134 134  so that the average is the same for each version.
    135 135  The adjustment is always downwards:
    136      we maintain the average PFC*(V) of PFC() for each app version V,
        136  we maintain the average PFC^mean^(V) of PFC() for each app version V,
    137 137  find the minimum X.
    138 138  An app version V's jobs are then scaled by the factor
    139      {{{
    140      S(V) = (X/PFC*(V))
    141      }}}
        139
        140   S(V) = (X/PFC^mean^(V))
        141
    142 142
    143 143  The result for a given job J
    144 144  is called "Version-Normalized Peak FLOP Count", or VNPFC(J):
    145      {{{
    146      VNPFC(J) = PFC(J) * (X/PFC*(V))
    147      }}}
        145
        146   VNPFC(J) = PFC(J) * (X/PFC^mean^(V))
    148 147
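The version scaling in the hunk above can be sketched in a few lines. This is only an illustration, not BOINC server code: the function names and the dictionary of per-version averages are hypothetical, the averages are assumed to be already computed, and the numbers are made up.

```python
def version_scale_factors(pfc_mean):
    """Return S(V) = X / PFC^mean^(V) for each app version V.

    pfc_mean is a hypothetical dict mapping an app version name to its
    average PFC. X is the minimum of those averages.
    """
    x = min(pfc_mean.values())
    return {version: x / mean for version, mean in pfc_mean.items()}


def vnpfc(pfc_job, version, scale):
    """Version-Normalized Peak FLOP Count of one job: PFC(J) * S(V)."""
    return pfc_job * scale[version]


# Made-up averages for two versions of the same app.
pfc_mean = {"cpu": 2.0e12, "gpu": 8.0e12}
scale = version_scale_factors(pfc_mean)
gpu_job_vnpfc = vnpfc(8.0e12, "gpu", scale)
```

Because X is the minimum of the averages, every factor is at most 1, which is consistent with the statement that the adjustment is always downwards.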
    149 148  Notes:
    162 161     (e.g., workunit.rsc_fpops_est)
    163 162     we can normalize by this to reduce the variance,
    164         and make PFC*(V) converge more quickly.
        163     and make PFC^mean^(V) converge more quickly.
    165 164   * ''a posteriori'' estimates of job size may exist also
    166 165     (e.g., an iteration count reported by the app)
    204 203  then, for that app,
    205 204  hosts should get the same average granted credit per job.
    206      To ensure this, for each application A we maintain the average VNPFC*(A),
    207      and for each host H we maintain VNPFC*(H, A).
        205  To ensure this, for each application A we maintain the average VNPFC^mean^(A),
        206  and for each host H we maintain VNPFC^mean^(H, A).
    208 207  The '''claimed credit''' for a given job J is then
    209      {{{
    210      VNPFC(J) * (VNPFC*(A)/VNPFC*(H, A))
    211      }}}
        208
        209   VNPFC(J) * (VNPFC^mean^(A)/VNPFC^mean^(H, A))
        210
    212 211
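As a rough illustration of the host normalization in the hunk above, the claimed-credit expression can be written as a single function. The function and argument names are hypothetical, the two averages are assumed to be maintained elsewhere, and the sample values are made up.

```python
def claimed_credit(vnpfc_job, vnpfc_mean_app, vnpfc_mean_host):
    """Return VNPFC(J) * (VNPFC^mean^(A) / VNPFC^mean^(H, A)).

    vnpfc_mean_app  -- average VNPFC over the jobs of application A
    vnpfc_mean_host -- average VNPFC over host H's jobs for application A
    """
    return vnpfc_job * (vnpfc_mean_app / vnpfc_mean_host)


# A host whose average is twice the application average has its
# claim for a job halved.
example_claim = claimed_credit(4.0e12, 2.0e12, 4.0e12)
```

A host whose average equals the application average is left unchanged by this factor. A host with a lower average than the application has its claims scaled up.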
    213 212  There are some cases where hosts are not sent jobs uniformly:
    219 218
    220 219  This can be done by dividing
    221      each sample in the computation of VNPFC* by WU.rsc_fpops_est
        220  each sample in the computation of VNPFC^mean^ by WU.rsc_fpops_est
    222 221  (in fact, there's no reason not to always do this).
    223 222
    227 226     and increases the claimed credit of hosts that are more efficient
    228 227     than average.
    229       * VNPFC* is averaged over jobs, not hosts.
        228   * VNPFC^mean^ is averaged over jobs, not hosts.
    230 229
    231 230  == Computing averages ==
    312 311
    313 312   * One-time cheats (like claiming 1e304) can be prevented by
    314         capping VNPFC(J) at some multiple (say, 10) of VNPFC*(A).
        313     capping VNPFC(J) at some multiple (say, 10) of VNPFC^mean^(A).
    315 314   * Cherry-picking: suppose an application has two types of jobs,
    316 315    which run for 1 second and 1 hour respectively.
    319 318    Suppose a client systematically refuses the 1 hour jobs
    320 319    (e.g., by reporting a crash or never reporting them).
    321        Its VNPFC*(H, A) will quickly decrease,
        320    Its VNPFC^mean^(H, A) will quickly decrease,
    322 321    and soon it will be getting several thousand times more credit
    323 322    per actual work than other hosts!
    325 324    whenever a job errors out, times out, or fails to validate,
    326 325    set the host's error rate back to the initial default,
    327        and set its VNPFC*(H, A) to VNPFC*(A) for all apps A.
        326    and set its VNPFC^mean^(H, A) to VNPFC^mean^(A) for all apps A.
    328 327    This puts the host to a state where several dozen of its
    329 328    subsequent jobs will be replicated.
    335 334
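The two countermeasures in the hunks above (capping a single claim, and resetting a host's averages after a failed job) might look roughly like this. It is a sketch under stated assumptions: the names are hypothetical, the multiple of 10 is the "say, 10" from the text rather than a fixed constant, and the error-rate reset is left out because its default value is not given in this excerpt.

```python
CAP_MULTIPLE = 10  # assumed value, taken from "say, 10" in the text


def capped_vnpfc(vnpfc_job, vnpfc_mean_app, multiple=CAP_MULTIPLE):
    """Cap VNPFC(J) at multiple * VNPFC^mean^(A) to blunt one-time cheats."""
    return min(vnpfc_job, multiple * vnpfc_mean_app)


def reset_host_means(app_means):
    """Return fresh per-app host averages after an error, timeout, or
    failed validation: VNPFC^mean^(H, A) is set to VNPFC^mean^(A) for
    every app A. app_means is a hypothetical dict of app name -> average.
    """
    return dict(app_means)


# An absurd claim is cut down to ten times the application average.
capped = capped_vnpfc(1e304, 2.0e12)
# A normal claim passes through unchanged.
untouched = capped_vnpfc(1.0e12, 2.0e12)
```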
    336 335  Unrelated to the credit proposal, but in a similar spirit.
    337      The server will maintain ET*(H, V), the statistics of
        336  The server will maintain ET^mean^(H, V), the statistics of
    338 337  job runtimes (normalized by wu.rsc_fpops_est) per
    339 338  host and application version.
    340 339
    341 340  The server's estimate of a job's runtime is then
    342      {{{
    343      R(J, H) = wu.rsc_fpops_est * ET^mean^(H, V)
    344      }}}
        341
        342   R(J, H) = wu.rsc_fpops_est * ET^mean^(H, V)
        343
    345 344
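The runtime estimate in the hunk above can be illustrated as follows. Two points are assumptions rather than anything the page states: "normalized by wu.rsc_fpops_est" is read as dividing each elapsed time by the estimate, and a plain arithmetic mean stands in for whatever averaging the server actually uses. All names and numbers are hypothetical.

```python
def normalized_runtime(elapsed_seconds, rsc_fpops_est):
    """One ET sample: elapsed time divided by the job's FLOP estimate
    (assumed meaning of "normalized by wu.rsc_fpops_est")."""
    return elapsed_seconds / rsc_fpops_est


def estimated_runtime(rsc_fpops_est, et_mean):
    """R(J, H) = wu.rsc_fpops_est * ET^mean^(H, V)."""
    return rsc_fpops_est * et_mean


# Made-up history for one host and app version: (elapsed seconds, rsc_fpops_est).
history = [(1000.0, 1.0e12), (2200.0, 2.0e12)]
samples = [normalized_runtime(t, f) for t, f in history]
et_mean = sum(samples) / len(samples)  # plain mean, an assumption

# Estimated runtime in seconds for a new job with rsc_fpops_est = 4e12.
estimate = estimated_runtime(4.0e12, et_mean)
```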
    346 345  == Implementation ==
