
Version 16 (modified by Christian Beer, 10 years ago)

updated to reflect current code

Server-side file deletion

Files are deleted from the data server's upload and download directories by two programs:

  • The file_deleter daemon, which deletes input and output files as jobs are completed.
  • The antique file deleter, which deletes files that have "fallen through the cracks".

The File Deleter

Typically you don't need to customize this. The default file deletion policy is:

  • A workunit's input files are deleted when all results are 'over' (reported or timed out) and the workunit is assimilated.
  • A result's output files are deleted after the workunit is assimilated. The canonical result is handled differently, since its output files may be needed to validate results that are reported after assimilation; hence its files are deleted only when all results are over, and all successful results have been validated.
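The default policy above can be sketched as two predicates. This is only an illustration of the rules as stated, not the daemon's actual code; the `Workunit` and `Result` classes and their field names are hypothetical and do not mirror the real database schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Result:
    over: bool               # reported or timed out
    success: bool = False    # completed successfully
    validated: bool = False  # validation has been carried out
    canonical: bool = False  # chosen as the canonical result


@dataclass
class Workunit:
    assimilated: bool
    results: List[Result] = field(default_factory=list)


def all_results_over(wu: Workunit) -> bool:
    return all(r.over for r in wu.results)


def can_delete_input_files(wu: Workunit) -> bool:
    # Input files go once every result is over and the WU is assimilated.
    return wu.assimilated and all_results_over(wu)


def can_delete_output_files(wu: Workunit, result: Result) -> bool:
    if not wu.assimilated:
        return False
    if not result.canonical:
        # Non-canonical output files go as soon as the WU is assimilated.
        return True
    # The canonical result may still be needed to validate late results.
    return all_results_over(wu) and all(
        r.validated for r in wu.results if r.success
    )
```

In this sketch, a workunit with one outstanding result keeps its input files and its canonical output files, while the output files of its other results are already eligible for deletion.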

Command-line options:

-d N
set debug output level (1/2/3/4)
--mod M R
handle only WUs with ID mod M == R
--one_pass
exit after one pass through DB
--dry_run
don't update DB (for debugging only)
--download_dir D
override download_dir from project config with D
--sleep_interval N
sleep for N seconds between scans (default 5)
--appid N
only process workunits with appid=N
--app S
only process workunits of app with name S
--dont_retry_errors
don't retry file deletions that failed previously
--preserve_wu_files
update the DB, but don't delete input files
--preserve_result_files
update the DB, but don't delete output files
--dont_delete_batches
don't delete anything with positive batch number
--input_files_only
don't delete output files. If you store input and output files on different servers, you can improve performance by running separate file deleters (one with --input_files_only, one with --output_files_only), each on the machine where the corresponding files are stored.
--output_files_only
don't delete input files
--xml_doc_like L
only process workunits where xml_doc LIKE 'L'
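As an illustration of the --mod option, the following sketch shows how N instances could split the workunits between them so that each workunit ID is handled by exactly one instance. The helper names are hypothetical; only the file_deleter name and the --mod M R syntax come from the list above.

```python
def handles(wu_id: int, m: int, r: int) -> bool:
    """True if an instance started with `--mod m r` would handle wu_id."""
    return wu_id % m == r


def file_deleter_commands(n_instances: int, extra_args=()):
    """Build one argument list per instance, covering remainders 0..n-1."""
    return [
        ["file_deleter", "--mod", str(n_instances), str(r), *extra_args]
        for r in range(n_instances)
    ]


if __name__ == "__main__":
    for cmd in file_deleter_commands(3, ["-d", "2"]):
        print(" ".join(cmd))
```

Because the remainders 0 through M-1 partition the IDs, no two instances in such a set work on the same workunit.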

In some cases you may not want files to be deleted. There are three ways to accomplish this:

  • Use the --preserve_wu_files and/or the --preserve_result_files command-line options.
  • Include <no_delete/> in the <file_info> element for a file in a workunit or result template. This lets you suppress deletion on a file-by-file basis.
  • Include nodelete in the workunit name.
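The two per-item mechanisms above can be pictured as simple checks. This is a hedged sketch: the function names are hypothetical, the <file_info> fragment is reduced to the bare minimum, and the real daemon may well detect these markers differently (for example by plain string matching).

```python
import xml.etree.ElementTree as ET


def file_marked_no_delete(file_info_xml: str) -> bool:
    """True if a <file_info> element contains a <no_delete/> child."""
    elem = ET.fromstring(file_info_xml)
    return elem.tag == "file_info" and elem.find("no_delete") is not None


def workunit_name_blocks_deletion(wu_name: str) -> bool:
    """True if the workunit name contains the substring 'nodelete'."""
    return "nodelete" in wu_name


if __name__ == "__main__":
    print(file_marked_no_delete("<file_info><no_delete/></file_info>"))
    print(workunit_name_blocks_deletion("test_nodelete_0001"))
```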

The Antique File Deleter

The antique file deleter runs as a periodic task. It removes 'antiques': output files that are older than the oldest WU in the database. These files are created when BOINC clients return results after the corresponding WU has been deleted from the database.

The antique files are deleted by using a Unix 'find' command to locate files that are older than the oldest workunit. The find command will work on NFS mounted file systems, and will ignore .nfs stale file markers. The output of find is limited by a 'head' to 50000 files by default.
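The selection step described above can be approximated in Python as follows. This is a sketch under assumptions, not the program's implementation: the real program shells out to find and head, whereas this version walks the directory itself, compares modification times, and takes the cutoff as a plain timestamp. The function and parameter names are hypothetical.

```python
import os
from itertools import islice


def find_antiques(upload_dir: str, oldest_wu_time: float, limit: int = 50000):
    """Return up to `limit` files older than the oldest workunit."""

    def candidates():
        for dirpath, _dirnames, filenames in os.walk(upload_dir):
            for name in filenames:
                if name.startswith(".nfs"):
                    continue  # skip NFS stale file markers
                path = os.path.join(dirpath, name)
                try:
                    mtime = os.stat(path).st_mtime
                except OSError:
                    continue  # file vanished while scanning
                if mtime < oldest_wu_time:
                    yield path

    # Equivalent in spirit to piping the listing through `head`.
    return list(islice(candidates(), limit))
```

Capping the list bounds the work done per run; files that do not make it into one batch remain candidates for the next periodic run.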

If the web-server account on your system is not 'apache', add a <httpd_user> element to your config.xml file. Otherwise antique deletion won't work.

Command-line options:

-d N
set debug output level (1/2/3/4)
--dry_run
don't delete any files, just log what would be deleted
--usleep N
sleep N microseconds after each examined file (throttles I/O if there are many files)
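The interplay of --dry_run and --usleep can be pictured with a small loop. This is an illustrative sketch only; the function name, its parameters, and the log message format are hypothetical and are not taken from the actual program.

```python
import os
import time


def delete_files(paths, dry_run=False, usleep=0, log=print):
    """Delete each path, or only log it when dry_run is set.

    Sleeps `usleep` microseconds after each examined file to throttle I/O.
    Returns the number of files actually deleted.
    """
    deleted = 0
    for path in paths:
        if dry_run:
            log(f"would delete {path}")
        else:
            os.unlink(path)
            deleted += 1
        if usleep > 0:
            time.sleep(usleep / 1_000_000)  # microseconds to seconds
    return deleted
```

Running with the dry-run flag first and inspecting the log is a low-risk way to confirm that only the expected files would be removed.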