An Incentive System for Volunteer Computing

David Anderson
Janus Kristensen

Abstract
1) Introduction

Volunteer computing is a computing paradigm that uses large numbers of computers, volunteered by members of the general public, to provide computing and storage resources. Early volunteer computing projects include the Great Internet Mersenne Prime Search (GIMPS) [GIMPS], SETI@home [SAH], and Folding@home [FAH]. Volunteer computing is being used in high-energy physics, molecular biology, medicine, astrophysics, climate study, and other areas.

Volunteer computing participants have a wide range of motivations. Some are interested primarily in the science being done, others in the social aspects of the projects. One thing that nearly every participant wants is a numeric representation of their computational contribution. We call this credit, in the sense not of money but of acknowledgement.

A participant's credit can be presented in several different contexts. For example, Figure X shows the screensaver graphics of Einstein@home, which displays the participant's credit as a number. Figure X shows the BOINC Manager, which displays a graph of credit versus time. The purpose, in both cases, is to reassure the participant that he or she is continuing to contribute (or, if credit is not increasing, to alert the participant to problems).

Other contexts involve comparing the credit of volunteers or groups of volunteers. For example, Figure X shows a web page with a list of Einstein@home volunteers, ranked by credit. Figure Y shows a web page with a list of Einstein@home 'teams' (self-organized groups of volunteers) ranked by credit. Such pages (often called 'leaderboards') encourage competition between volunteers; to increase their standing on the leaderboards, volunteers may harness otherwise unused computers, upgrade existing computers, or buy new computers expressly for the purpose of getting more credit.
Competition between teams has an especially important role: team members recruit new volunteers (family, friends, co-workers) to their team for the purpose of increasing its ranking. This creates a 'viral marketing' effect that leads to exponential growth in the number of volunteers and the amount of computing power.
Another interesting context is 'signature images', typically used as signatures for message-board posts. These images show a compact list of the projects in which the person is participating, and the credit on each project (see Figure X).

SETI@home, which is representative of early volunteer computing projects, had a primitive credit system. It gave one unit of credit for each task, or 'workunit', completed. This system had many shortcomings. It didn't reflect the fact that some tasks used much more processing time than others. It gave credit even for incorrect results, which led to malicious users getting undeserved credit.

BOINC (Berkeley Open Infrastructure for Network Computing) is a middleware system for volunteer computing. It is being used by a number of volunteer computing projects, including SETI@home, Climateprediction.net [CPDN], LHC@home, Predictor@home, and Einstein@Home [EAH]. Volunteers participate by running a BOINC client program on their computers. They can 'attach' each computer to any set of projects.

1.1) Design Goals

In designing BOINC's credit system, we had two high-level goals:
These led to several subsidiary goals:
1.2) Defining credit

How are we to define a unit of credit? We might say, for example, that the use of one gigabyte of disk storage for one day earns one unit of credit. But is this adequate? The storage probably has greater utility if the host is highly available and/or has a high-bandwidth network connection. Should we give more credit in that case? There are countless dimensions of this sort.

Given our goal of a single credit scale for incommensurable quantities (e.g., computation and storage), it is tempting to use the common denominator of money: the credit given for a resource is related to its cost. But cost to whom? There are two alternatives:
A second general issue is whether to grant credit when a resource is volunteered but not used. For example, a participant's PC might contact the project during a period when the project has no work, and then sit idle for an hour. Should credit be granted? From the participant's point of view, perhaps yes. However, it is not clear how this could be made cheat-resistant; while it's possible to show with high probability that a computer did work, it's not easy to show that it was willing to do work.

Finally, the magnitude of the credit scale is significant in terms of volunteer perception. A volunteer might find it discouraging to contribute a day of computer time and get 0.01 units of credit. Conversely, an inflated scale can lead to huge numbers that are hard to visually parse. BOINC's scale gives a typical computer about 100 units of credit per day.

2) Computation credit

Most applications that use BOINC do primarily floating-point-intensive computations. Therefore we decided to base computational credit on floating-point operations. Currently, the credit system doesn't reflect memory usage, nor does it take disk, network, or GPU usage into account.

2.1) Generic benchmarking

Depending on how much the application developer wants to get involved, BOINC offers three ways of determining computational credit. The first of these is based on CPU benchmarks:
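The conversion from a benchmark result and CPU time to claimed credit can be sketched as follows. This is a minimal sketch in C++; the calibration constant is an assumption based on the statement above that a typical computer earns about 100 units of credit per day, taking 'typical' to mean roughly 1 GFLOPS.

    // Sketch: convert a measured benchmark rate and CPU time into
    // claimed credit. The calibration (100 credits per GFLOPS-day)
    // is an assumption, not a documented BOINC constant.
    #include <cstdio>

    const double CREDIT_PER_GFLOPS_DAY = 100.0;  // assumed calibration
    const double SECONDS_PER_DAY = 86400.0;

    double claimed_credit(double benchmark_flops_per_sec, double cpu_seconds) {
        double gflops = benchmark_flops_per_sec / 1e9;
        double cpu_days = cpu_seconds / SECONDS_PER_DAY;
        return CREDIT_PER_GFLOPS_DAY * gflops * cpu_days;
    }

    int main() {
        // A 1 GFLOPS host computing for a full day claims ~100 units.
        printf("%.1f\n", claimed_credit(1e9, 86400.0));
    }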
This is a bit more complex on multiprocessors, since resources that are shared between processors may cause a CPU to get fewer FLOPS when other CPUs are running. The most extreme case is Intel hyperthreaded processors, in which each pair of logical CPUs shares a single FPU. To reflect these factors, BOINC runs the benchmarks on all CPUs simultaneously.
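A minimal sketch of simultaneous benchmarking, using one thread per CPU so that contention for shared resources is reflected in each per-CPU result. The FP kernel here is a stand-in; a real client would run Whetstone.

    // Sketch: run a floating-point benchmark on all CPUs at once.
    #include <cstdio>
    #include <thread>
    #include <vector>
    #include <chrono>

    // Stand-in FP kernel; roughly 3 FP operations per iteration.
    double fp_kernel(long iters) {
        volatile double x = 1.0001;
        double y = 0;
        for (long i = 0; i < iters; i++) y += x * x + x;
        return y;
    }

    int main() {
        unsigned ncpus = std::thread::hardware_concurrency();
        if (ncpus == 0) ncpus = 1;
        const long ITERS = 100000000L;
        std::vector<std::thread> threads;
        std::vector<double> elapsed(ncpus);

        // Start one benchmark thread per CPU so they contend realistically.
        for (unsigned i = 0; i < ncpus; i++) {
            threads.emplace_back([&elapsed, i, ITERS] {
                auto t0 = std::chrono::steady_clock::now();
                fp_kernel(ITERS);
                auto t1 = std::chrono::steady_clock::now();
                elapsed[i] = std::chrono::duration<double>(t1 - t0).count();
            });
        }
        for (auto& t : threads) t.join();

        for (unsigned i = 0; i < ncpus; i++)
            printf("CPU %u: %.0f FLOPS\n", i, 3.0 * ITERS / elapsed[i]);
    }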
There are a number of scenarios in which this mechanism doesn't work well:

{SOMETHING ABOUT BENCHMARK WEIGHTING; PICTURE}

2.2) Application benchmark

The second approach requires the application to implement its own benchmark. It may use a standard benchmark such as Whetstone, or preferably an application-specific benchmark that more closely mimics the application's memory access pattern and FP operation mix. The results of this benchmark are reported to the BOINC core client via an API call

    boinc_fpops_per_cpu_sec(double x);

that must be made before the application terminates.

2.3) Application measurement

The third approach is for the application to do its own accounting of floating-point operations. This may be feasible for applications whose inner loop is deterministic and allows counting of FP ops (for example, FFT). The total number of FP operations is reported via an API call

    boinc_fpops_cumulative(double x);

that must be made before the application terminates.

{Have we got any measurements of how well these three methods actually work across projects - does a computer get about the same credit for one day's work in SETI@home as in Climateprediction.net? This would be a valid success criterion.}

2.4) Redundant computation and credit

The claimed credit of a result, like its output, cannot be trusted. A participant can modify the client software so that it returns an exaggerated claimed credit. A hardware or software problem can result in a claimed credit of zero. BOINC provides a mechanism called persistent redundant computing, in which each task is performed on multiple computers. The results of these redundant computations are compared and are accepted as correct only if they agree. The number of instances required for a quorum, and the criteria for agreement, are project-specific. This mechanism can be extended to minimize the impact of erroneously claimed credit. The idea is as follows: if n > 1 correct results are identified, with claimed credits C1 < ... < Cn, then
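One robust choice, shown here as an illustrative assumption rather than as BOINC's actual rule, is to grant each participant the median of the claimed credits, so that a single exaggerated or zero claim cannot shift the granted value:

    // Sketch: grant the median of the claimed credits from a quorum.
    // This is one plausible instantiation of the idea above.
    #include <algorithm>
    #include <vector>
    #include <cstdio>

    double granted_credit(std::vector<double> claimed) {
        std::sort(claimed.begin(), claimed.end());   // C1 <= ... <= Cn
        size_t n = claimed.size();
        if (n % 2 == 1) return claimed[n / 2];
        return (claimed[n / 2 - 1] + claimed[n / 2]) / 2;
    }

    int main() {
        // One host claims wildly exaggerated credit; the median ignores it.
        printf("%.1f\n", granted_credit({9.8, 10.1, 500.0}));  // -> 10.1
    }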
2.5) Trickle messages and credit

Climateprediction.net has jobs that run for months, so it can't wait until a job finishes to grant credit. Instead, credit is granted via 'trickle messages' sent periodically from the client to the server. Each message includes a summary of the computation's state (e.g., global temperature). The server's message handler checks this summary (a quick-and-dirty form of cheating detection), then grants credit based on the fraction of the job completed.

3) Credit for storage and network transfer

Storage proof:
- if the server has a copy of the file: generate a random string X, have the client compute the MD5 of the file with X appended, and check that the result is correct
- if multiple clients have a copy: issue the same challenge to each and compare their replies

Network transfer:
- download, direct proof: use the 'has file' proof above
- download, indirect proof: the client computed a correct result, which required downloading the executable and input files
- upload: count the bytes received
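A minimal sketch of the challenge-response storage proof for the case where the server keeps its own copy of the file. MD5 is computed via OpenSSL here; all function names are illustrative, not BOINC APIs.

    // Sketch: server challenges a client to prove it stores a file.
    #include <openssl/md5.h>   // link with -lcrypto
    #include <cstdio>
    #include <random>
    #include <string>

    // Hex-encoded MD5 of a byte string.
    std::string md5_hex(const std::string& data) {
        unsigned char d[MD5_DIGEST_LENGTH];
        MD5(reinterpret_cast<const unsigned char*>(data.data()), data.size(), d);
        char buf[2 * MD5_DIGEST_LENGTH + 1];
        for (int i = 0; i < MD5_DIGEST_LENGTH; i++)
            sprintf(buf + 2 * i, "%02x", d[i]);
        return std::string(buf, 2 * MD5_DIGEST_LENGTH);
    }

    // Server side: pick a fresh random challenge string X.
    std::string make_challenge() {
        std::random_device rd;
        std::string x;
        for (int i = 0; i < 16; i++) x += "0123456789abcdef"[rd() % 16];
        return x;
    }

    // Client side: prove possession by hashing the file with X appended.
    std::string storage_proof(const std::string& file_contents,
                              const std::string& x) {
        return md5_hex(file_contents + x);
    }

    int main() {
        std::string file = "...file contents...";    // server's copy
        std::string x = make_challenge();            // sent to the client
        std::string reply = storage_proof(file, x);  // computed by the client
        // X is fresh, so the client cannot have precomputed the answer
        // without actually storing the file.
        printf("%s\n", reply == md5_hex(file + x) ? "verified" : "rejected");
    }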
4) Credit reporting and display

4.1) Total and recent average credit

For each entity (user, team, and host) BOINC maintains and uses two credit-related quantities: total credit (the sum of all credit granted to the entity) and recent average credit (an exponentially weighted average of the rate at which credit is granted).

In our experience it is important to use both of these; for example, to have one leaderboard ranked by total credit, and another ranked by recent average credit. This rewards both new volunteers who contribute many computers (and thus get high recent average credit) and volunteers who join the project early and participate for a long time (and thus get high total credit).

In BOINC, recent average credit (RAC) is computed as an exponentially weighted average with a half-life of one week. In other words, if an entity is not granted any credit for one week, its RAC will decline by 50%. An entity's RAC is updated whenever credit is granted. To avoid expensive database queries, the update does not scan the history of previous credit grants; the only information available is the previous RAC and the time it was computed. BOINC uses the following approximation (all times are in days):

    half_life = 7;
    delta_t = now - last_update;
    weight = exp(-delta_t*ln(2)/half_life);
    avg *= weight;
    avg += (1-weight)*(work/delta_t);

In other words, we suppose that the credit was granted at a constant rate R between last_update and now, and compute the new RAC as a weighted average of the old RAC and R, weighted to give the desired half-life if work is zero.

The above expression blows up (0 * 1/0) when now = last_update, i.e. when two credits are granted at the same moment. (This frequently occurs in practice.) To address this, note that as x -> 0,

    exp(x) = 1 + x + O(x^2)

So as delta_t -> 0, weight approaches

    1 - delta_t*ln(2)/half_life

and we have

    (1-weight)*(work/delta_t) ~= (delta_t*ln(2)/half_life) * (work/delta_t) = work*ln(2)/half_life

So we can replace the last line with

    if (1-weight > 1e-6) {
        avg += (1-weight)*(work/delta_t);
    } else {
        avg += work*ln(2)/half_life;
    }
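Putting the pieces together, the RAC update can be written as a single self-contained function. This is a sketch; the function and variable names are ours, not BOINC's.

    // Consolidated sketch of the RAC update, including the small-delta_t
    // branch derived above. All times are in days.
    #include <cmath>

    const double HALF_LIFE = 7.0;  // one week

    // Update 'avg' when 'work' units of credit are granted at time 'now'.
    void update_rac(double& avg, double& last_update, double now, double work) {
        double delta_t = now - last_update;
        double weight = std::exp(-delta_t * std::log(2.0) / HALF_LIFE);
        avg *= weight;
        if (1 - weight > 1e-6) {
            avg += (1 - weight) * (work / delta_t);
        } else {
            avg += work * std::log(2.0) / HALF_LIFE;  // limit as delta_t -> 0
        }
        last_update = now;
    }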
4.2) Accounting group credit

Team membership is dynamic; users can quit a team and join a different one. There are two approaches to accounting credit for a team or other collection of users:

1) compute the group's credit on demand by summing the credit of its current members;
2) maintain credit quantities for the group itself, updating them whenever credit is granted to a member.

Approach 2) can be efficiently implemented because teams have their own database records, which can be used to store total and recent-average credit. For other groupings of volunteers this is not the case. For example, each volunteer has an associated country (specified at signup, and mutable). The volunteers from a particular country form an implicit group, and there can be competition between countries. However, there is no separate database table for countries, so approach 1) must be used. Similarly for implicit groups such as 'users who own a single computer' or 'users who signed up in August 2003'.

4.3) Exporting credit data

BOINC-based projects can export statistics data as a set of compressed XML files available via HTTP (these files are generated by a BOINC utility program, which is typically run once a day). Examples of the XML elements:

    <team>
        <id>5</id>
        <name>Broadband Reports Team Starfire</name>
        <total_credit>153402.872429</total_credit>
        <expavg_credit>503030.483254</expavg_credit>
        <expavg_time>1087542007.701900</expavg_time>
        <nusers>14</nusers>
    </team>
    <user>
        <id>12</id>
        <name>John Keck</name>
        <country>United States</country>
        <total_credit>42698.813543</total_credit>
        <expavg_credit>117348.653646</expavg_credit>
        <expavg_time>1087542007.701900</expavg_time>
        <cpid>283472938743489759837498347</cpid>
        <teamid>5</teamid>
    </user>
    <host>
        <id>102</id>
        <userid>12</userid>
        <total_credit>42698.813543</total_credit>
        <expavg_credit>117348.653646</expavg_credit>
        <expavg_time>1087542007.701900</expavg_time>
        <p_vendor>GenuineIntel</p_vendor>
        <p_model>Pentium</p_model>
        <os_name>Windows XP</os_name>
        <os_version>5.1</os_version>
        <create_time>1040170006</create_time>
        <timezone>28800</timezone>
        <ncpus>2</ncpus>
        <p_fpops>45724737.082762</p_fpops>
        <p_iops>43233895.373973</p_iops>
        <p_membw>4032258.064516</p_membw>
        <m_nbytes>670478336.000000</m_nbytes>
        <m_swap>1638260736.000000</m_swap>
        <d_total>9088008192.000000</d_total>
        <d_free>3788505088.000000</d_free>
        <n_bwup>24109.794088</n_bwup>
        <n_bwdown>57037.049858</n_bwdown>
        <avg_turnaround>465609.562145</avg_turnaround>
        <host_cpid>e129b5fa44ed8ba58e41c472822f2807</host_cpid>
    </host>

4.4) Cross-project user identification

Accounts on different projects are considered equivalent if they have the same email address (we have considered other concepts, but they all lead to extreme complexity). Projects can't simply export email addresses in statistics files; email addresses are private. Nor is it desirable to export hashed email addresses, because spammers could enumerate feasible email addresses and compare them with the hashed values. Instead, BOINC uses the following system: each account is assigned a random 'internal' cross-project ID; the BOINC client propagates this ID among the projects to which it is attached, so that accounts with the same email address converge on the same internal ID; and each project exports an 'external' cross-project ID, obtained by hashing the internal ID together with the email address.
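A sketch of the external ID computation. The hash function is not named in this draft; MD5 via OpenSSL is assumed here by analogy with the host_cpid values exported above.

    // Sketch: derive the public external cross-project ID from the
    // private internal ID and email address.
    #include <openssl/md5.h>   // link with -lcrypto
    #include <cstdio>
    #include <string>

    std::string md5_hex(const std::string& data) {
        unsigned char d[MD5_DIGEST_LENGTH];
        MD5(reinterpret_cast<const unsigned char*>(data.data()), data.size(), d);
        char buf[2 * MD5_DIGEST_LENGTH + 1];
        for (int i = 0; i < MD5_DIGEST_LENGTH; i++)
            sprintf(buf + 2 * i, "%02x", d[i]);
        return std::string(buf, 2 * MD5_DIGEST_LENGTH);
    }

    int main() {
        std::string internal_cpid = "283472938743489759837498347"; // random, private
        std::string email = "volunteer@example.com";               // private
        // The exported ID reveals neither the email nor the internal ID.
        printf("%s\n", md5_hex(internal_cpid + email).c_str());
    }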
This system provides cross-project identification based on email address, without publicizing any information from which email addresses could be derived.

4.5) Host identification

Each host generates a random internal cross-project ID, which is reported to the projects to which the host is attached. The projects convert it to an external cross-project ID by hashing it with the owner's email address; this prevents spoofing. The external ID is exported in statistics files.

4.6) Credit statistics web sites

A number of web sites have been created to show BOINC credit statistics. They typically maintain relational databases with tables representing entities (accounts, hosts, teams) both on particular projects and aggregated over all projects. They may maintain entities (e.g. countries, single-computer participants) that are not explicitly represented in the projects' BOINC databases. These tables are indexed so that ranges of entities (ordered by total or recent-average credit) can be enumerated efficiently. Periodically (typically once per day) the sites download XML statistics data from all BOINC projects and update their database tables.

The sites offer web pages that show leaderboards, i.e. lists of top users, hosts, or teams, or of derived groups such as countries, operating systems, or processor types. These boards may be filtered according to a range of criteria: for example, a page might show the top Macintosh owners in Paraguay. With some additional database complexity, the sites can show historical data, e.g. the change in leaderboard rankings over the past month. This type of data is very good at fueling team competition. In addition, the sites may offer other web-based services, such as
4.7) Cross-project issues

What if a project gives more credit per FLOP than others? What if a project gives credit 'prizes'?

5) Related work

Economic systems: not relevant to volunteer computing. In economic systems, the owner of a resource tries to make as much money as possible. Here, the owner

United Devices

6) Conclusion

There is extensive discussion on BOINC message boards about
Future work: