2) data flow options
   jobs, job instances
   work generator, assimilator
   simple case: no files
   single input, output file, no sticky
   sticky input files
   sticky output files
   locality scheduling
   querying/deleting files
   long-running jobs
   trickle messages
   intermediate file upload