| 1 | = Database purging utility = |
| 2 | |
| 3 | As a BOINC project operates, the size of its workunit and result tables increases. Eventually they become so large that adding a field or building an index may take hours or days. |
| 4 | |
| 5 | To address this problem, BOINC provides a utility '''db_purge''' that writes result and WU records to XML-format archive files, then deletes them from the database. |
| 6 | |
| 7 | Workunits are purged only when their input files have been deleted. Because of BOINC's file-deletion policy, this implies that all results are completed. So when a workunit is purged, all its results are purged too. |
| 8 | |
| 9 | Run db_purge from the project's bin/ directory. It will create an archive/ directory and store archive files there. |
| 10 | |
| 11 | db_purge is normally run as a daemon, specified in the [ProjectConfigFile config.xml] file. It has the following command-line options: |
| 12 | |
| 13 | '''-min_age_days N ''':: |
| 14 | Purge only WUs with mod_time at least N days in the past. Recommended value: 7 or so. This lets users examine their recent results. |
| 15 | '''-max N''':: |
| 16 | Purge at most N WUs, then exit |
| 17 | '''-max_wu_per_file N''':: |
| 18 | Write at most N WUs to each archive file. Recommended value: 10,000 or so. |
| 19 | '''-zip''':: |
| 20 | Compress archive files using zip |
| 21 | '''-gzip''':: |
| 22 | Compress archive files using gzip |
| 23 | '''-d N''':: |
| 24 | Set logging verbosity to N (1,2,3) |
| 25 | |
| 26 | == Archive file format == |
| 27 | |
| 28 | The archive files have names of the form `wu_archive_TIME` and `result_archive_TIME` where `TIME` is the Unix time the file was created. In addition, db_purge generates index files 'wu_index' and 'result_index' associating each WU and result ID with the timestamp of its archive file. |
| 29 | |
| 30 | The format of both type of index files is a number of rows each containing: |
| 31 | {{{ |
| 32 | ID TIME |
| 33 | }}} |
| 34 | The ID field of the WU or result, 5 spaces, and the timestamp part of the archive filename where the record with that ID can be found. |
| 35 | |
| 36 | The format of a record in the result archive file is: |
| 37 | |
| 38 | {{{ |
| 39 | <result_archive> |
| 40 | <id>%d</id> |
| 41 | <create_time>%d</create_time> |
| 42 | <workunitid>%d</workunitid> |
| 43 | <server_state>%d</server_state> |
| 44 | <outcome>%d</outcome> |
| 45 | <client_state>%d</client_state> |
| 46 | <hostid>%d</hostid> |
| 47 | <userid>%d</userid> |
| 48 | <report_deadline>%d</report_deadline> |
| 49 | <sent_time>%d</sent_time> |
| 50 | <received_time>%d</received_time> |
| 51 | <name>%s</name> |
| 52 | <cpu_time>%.15e</cpu_time> |
| 53 | <xml_doc_in>%s</xml_doc_in> |
| 54 | <xml_doc_out>%s</xml_doc_out> |
| 55 | <stderr_out>%s</stderr_out> |
| 56 | <batch>%d</batch> |
| 57 | <file_delete_state>%d</file_delete_state> |
| 58 | <validate_state>%d</validate_state> |
| 59 | <claimed_credit>%.15e</claimed_credit> |
| 60 | <granted_credit>%.15e</granted_credit> |
| 61 | <opaque>%f</opaque> |
| 62 | <random>%d</random> |
| 63 | <app_version_num>%d</app_version_num> |
| 64 | <appid>%d</appid> |
| 65 | <exit_status>%d</exit_status> |
| 66 | <teamid>%d</teamid> |
| 67 | <priority>%d</priority> |
| 68 | <mod_time>%s</mod_time> |
| 69 | </result_archive> |
| 70 | }}} |
| 71 | |
| 72 | The format of a record in the WU archive file is: |
| 73 | {{{ |
| 74 | <workunit_archive> |
| 75 | <id>%d</id> |
| 76 | <create_time>%d</create_time> |
| 77 | <appid>%d</appid> |
| 78 | <name>%s</name> |
| 79 | <xml_doc>%s</xml_doc> |
| 80 | <batch>%d</batch> |
| 81 | <rsc_fpops_est>%.15e</rsc_fpops_est> |
| 82 | <rsc_fpops_bound>%.15e</rsc_fpops_bound> |
| 83 | <rsc_memory_bound>%.15e</rsc_memory_bound> |
| 84 | <rsc_disk_bound>%.15e</rsc_disk_bound> |
| 85 | <need_validate>%d</need_validate> |
| 86 | <canonical_resultid>%d</canonical_resultid> |
| 87 | <canonical_credit>%.15e</canonical_credit> |
| 88 | <transition_time>%d</transition_time> |
| 89 | <delay_bound>%d</delay_bound> |
| 90 | <error_mask>%d</error_mask> |
| 91 | <file_delete_state>%d</file_delete_state> |
| 92 | <assimilate_state>%d</assimilate_state> |
| 93 | <hr_class>%d</hr_class> |
| 94 | <opaque>%f</opaque> |
| 95 | <min_quorum>%d</min_quorum> |
| 96 | <target_nresults>%d</target_nresults> |
| 97 | <max_error_results>%d</max_error_results> |
| 98 | <max_total_results>%d</max_total_results> |
| 99 | <max_success_results>%d</max_success_results> |
| 100 | <result_template_file>%s</result_template_file> |
| 101 | <priority>%d</priority> |
| 102 | <mod_time>%s</mod_time> |
| 103 | </workunit_archive> |
| 104 | }}} |