Changes between Initial Version and Version 1 of DbPurge


Timestamp: Apr 24, 2007, 2:37:52 PM
Author: Nicolas

     1= Database purging utility =
     2
     3As a BOINC project operates, the size of its workunit and result tables increases. Eventually they become so large that adding a field or building an index may take hours or days.
     4
     5To address this problem, BOINC provides a utility '''db_purge''' that writes result and WU records to XML-format archive files, then deletes them from the database.
     6
     7Workunits are purged only when their input files have been deleted. Because of BOINC's file-deletion policy, this implies that all results are completed. So when a workunit is purged, all its results are purged too.
     8
     9Run db_purge from the project's bin/ directory. It will create an archive/ directory and store archive files there.
     10
     11db_purge is normally run as a daemon, specified in the [ProjectConfigFile config.xml] file. It has the following command-line options:
     12
     13 '''-min_age_days N '''::
     14        Purge only WUs with mod_time at least N days in the past. Recommended value: 7 or so. This lets users examine their recent results.
     15 '''-max N'''::
     16        Purge at most N WUs, then exit.
     17 '''-max_wu_per_file N'''::
     18        Write at most N WUs to each archive file. Recommended value: 10,000 or so.
     19 '''-zip'''::
     20        Compress archive files using zip.
     21 '''-gzip'''::
     22        Compress archive files using gzip.
     23 '''-d N'''::
     24        Set logging verbosity to N (1, 2, or 3).
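
The options above can be combined into a single command line. The following is a minimal Python sketch that assembles such a command line; the helper `build_db_purge_args` and its parameter names are hypothetical and exist only for illustration, while the flags themselves are the ones listed above.

```python
def build_db_purge_args(min_age_days=7, max_wu_per_file=10000, max_wus=None,
                        compression=None, verbosity=None):
    """Return an argv list for db_purge (hypothetical helper, not part of BOINC).

    Only the options documented on this page are used.
    """
    if compression not in (None, "zip", "gzip"):
        raise ValueError("compression must be None, 'zip', or 'gzip'")
    args = ["db_purge",
            "-min_age_days", str(min_age_days),
            "-max_wu_per_file", str(max_wu_per_file)]
    if max_wus is not None:
        # -max N: purge at most N WUs, then exit
        args += ["-max", str(max_wus)]
    if compression is not None:
        # -zip or -gzip
        args.append("-" + compression)
    if verbosity is not None:
        # -d N: logging verbosity
        args += ["-d", str(verbosity)]
    return args


print(" ".join(build_db_purge_args(compression="gzip", verbosity=2)))
```

The defaults here simply mirror the recommended values mentioned above (7 days, 10,000 WUs per file); a project may well choose different ones.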
     25
     26== Archive file format ==
     27
     28The archive files have names of the form `wu_archive_TIME` and `result_archive_TIME` where `TIME` is the Unix time the file was created. In addition, db_purge generates index files `wu_index` and `result_index` associating each WU and result ID with the timestamp of its archive file.
     29
     30Both types of index file consist of rows of the form:
     31{{{
     32ID     TIME
     33}}}
     34Each row contains the ID of the WU or result, 5 spaces, and the timestamp part of the name of the archive file that holds the record with that ID.
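
As an illustration of how an index file might be used, here is a small Python sketch that maps an ID to the name of its archive file. The function names `load_index` and `archive_name_for` and the sample IDs and timestamps are made up for the example. The sketch assumes the row layout described above, and it does not account for any suffix that compressed archive files may carry.

```python
def load_index(lines):
    """Parse index rows of the form 'ID     TIME' into a dict {id: time}."""
    index = {}
    for line in lines:
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed rows
        record_id, archive_time = fields
        index[int(record_id)] = archive_time
    return index


def archive_name_for(index, record_id, prefix="wu_archive"):
    """Return the archive file name (prefix_TIME) holding the given ID."""
    return "%s_%s" % (prefix, index[record_id])


# Made-up sample rows; a real script would read them from wu_index.
sample = [
    "1001     1177425472",
    "1002     1177425472",
    "2001     1177511872",
]
idx = load_index(sample)
print(archive_name_for(idx, 2001))
```

For results, the same lookup would use `result_index` and the prefix `result_archive`.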
     35
     36The format of a record in the result archive file is:
     37
     38{{{
     39<result_archive>
     40  <id>%d</id>
     41  <create_time>%d</create_time>
     42  <workunitid>%d</workunitid>
     43  <server_state>%d</server_state>
     44  <outcome>%d</outcome>
     45  <client_state>%d</client_state>
     46  <hostid>%d</hostid>
     47  <userid>%d</userid>
     48  <report_deadline>%d</report_deadline>
     49  <sent_time>%d</sent_time>
     50  <received_time>%d</received_time>
     51  <name>%s</name>
     52  <cpu_time>%.15e</cpu_time>
     53  <xml_doc_in>%s</xml_doc_in>
     54  <xml_doc_out>%s</xml_doc_out>
     55  <stderr_out>%s</stderr_out>
     56  <batch>%d</batch>
     57  <file_delete_state>%d</file_delete_state>
     58  <validate_state>%d</validate_state>
     59  <claimed_credit>%.15e</claimed_credit>
     60  <granted_credit>%.15e</granted_credit>
     61  <opaque>%f</opaque>
     62  <random>%d</random>
     63  <app_version_num>%d</app_version_num>
     64  <appid>%d</appid>
     65  <exit_status>%d</exit_status>
     66  <teamid>%d</teamid>
     67  <priority>%d</priority>
     68  <mod_time>%s</mod_time>
     69</result_archive>
     70}}}
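
A result archive can be read back with a simple line-oriented scan. The Python sketch below is one possible approach, not part of BOINC. It assumes that each scalar field occupies a single line, as in the format above, and that the contents of `xml_doc_in`, `xml_doc_out` and `stderr_out` may span several lines and can be skipped. The function name `iter_records` and the sample record are hypothetical.

```python
import re

# Matches a one-line field such as "  <id>42</id>".
FIELD_RE = re.compile(r"^<(\w+)>(.*)</\1>$")

# Fields whose content may span several lines; they are skipped here.
MULTILINE = ("xml_doc_in", "xml_doc_out", "stderr_out")


def iter_records(lines, tag="result_archive"):
    """Yield one dict of scalar fields per <tag>...</tag> record."""
    record = None
    skipping = None  # name of the multi-line field currently being skipped
    for line in lines:
        s = line.strip()
        if skipping:
            if s.endswith("</%s>" % skipping):
                skipping = None
            continue
        if s == "<%s>" % tag:
            record = {}
        elif s == "</%s>" % tag:
            if record is not None:
                yield record
            record = None
        elif record is not None:
            m = FIELD_RE.match(s)
            if m:
                if m.group(1) not in MULTILINE:
                    record[m.group(1)] = m.group(2)
            else:
                for name in MULTILINE:
                    if s.startswith("<%s>" % name):
                        skipping = name


# Made-up sample record with only a few of the fields.
sample = """<result_archive>
  <id>42</id>
  <workunitid>7</workunitid>
  <name>wu_7_0</name>
  <cpu_time>1.234500000000000e+03</cpu_time>
  <stderr_out>
<name>not a field</name>
</stderr_out>
  <granted_credit>1.050000000000000e+01</granted_credit>
</result_archive>
""".splitlines()

records = list(iter_records(sample))
print(records[0]["name"], float(records[0]["cpu_time"]))
```

Values are kept as strings so that the caller can decide how to convert them, for example `int` for the `%d` fields and `float` for the `%.15e` fields.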
     71
     72The format of a record in the WU archive file is:
     73{{{
     74<workunit_archive>
     75  <id>%d</id>
     76  <create_time>%d</create_time>
     77  <appid>%d</appid>
     78  <name>%s</name>
     79  <xml_doc>%s</xml_doc>
     80  <batch>%d</batch>
     81  <rsc_fpops_est>%.15e</rsc_fpops_est>
     82  <rsc_fpops_bound>%.15e</rsc_fpops_bound>
     83  <rsc_memory_bound>%.15e</rsc_memory_bound>
     84  <rsc_disk_bound>%.15e</rsc_disk_bound>
     85  <need_validate>%d</need_validate>
     86  <canonical_resultid>%d</canonical_resultid>
     87  <canonical_credit>%.15e</canonical_credit>
     88  <transition_time>%d</transition_time>
     89  <delay_bound>%d</delay_bound>
     90  <error_mask>%d</error_mask>
     91  <file_delete_state>%d</file_delete_state>
     92  <assimilate_state>%d</assimilate_state>
     93  <hr_class>%d</hr_class>
     94  <opaque>%f</opaque>
     95  <min_quorum>%d</min_quorum>
     96  <target_nresults>%d</target_nresults>
     97  <max_error_results>%d</max_error_results>
     98  <max_total_results>%d</max_total_results>
     99  <max_success_results>%d</max_success_results>
     100  <result_template_file>%s</result_template_file>
     101  <priority>%d</priority>
     102  <mod_time>%s</mod_time>
     103</workunit_archive>
     104}}}
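
To show what the printf-style specifiers in these record formats produce, the following Python sketch formats a few workunit fields. Python's `%` operator follows the C `printf` conventions for `%d`, `%s` and `%.15e`. The helper `format_fields`, the field subset and the sample values are chosen only for illustration and are not taken from db_purge itself.

```python
# A subset of the workunit fields and their format specifiers, as listed above.
WU_FIELD_FORMATS = [
    ("id", "%d"),
    ("name", "%s"),
    ("rsc_fpops_est", "%.15e"),
    ("min_quorum", "%d"),
]


def format_fields(values, formats=WU_FIELD_FORMATS):
    """Format each field as '  <name>value</name>' using its specifier."""
    return ["  <%s>%s</%s>" % (name, fmt % values[name], name)
            for name, fmt in formats]


# Made-up sample values.
lines = format_fields({
    "id": 7,
    "name": "wu_7",
    "rsc_fpops_est": 3.6e12,
    "min_quorum": 2,
})
print("\n".join(lines))
```

The `%.15e` specifier writes floating-point values in scientific notation with 15 digits after the decimal point, so a reader should convert such fields with a floating-point parser rather than an integer one.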