= BOINC client emulator =

The BOINC client emulator (BCE) simulates a single BOINC client interacting with one or more projects. BCE uses the same source code as the client for the CPU scheduling and work-fetch policies, so it models the BOINC client accurately. The intended uses of BCE include:

 * Identifying scenarios (combinations of host and project characteristics) where the current scheduling policies don't behave well.
 * Studying experimental policies.

However, BCE is not necessarily perfect - in some cases its results may differ significantly from what the actual client would do, or its inputs may be inadequate to describe a real-life scenario. If you find such cases, please send email to [ProjectPeople David Anderson].

You can use BCE in either of two ways:

 * Through a [//sim_form.php web interface]. This lets you do one simulation at a time, and shows you results graphically.
 * Compile it yourself and run it from the command line. This provides a more flexible interface.

== Input files ==

The input consists of the following files:

=== client_state.xml === #input_client_state

This describes a set of attached projects. The format is an extension of the state file generated by the client; you can use the state file of a running client as an input to the simulator. The fields used by the simulator are as follows (fields marked with * are not generated by the client):

{{{
host_info
    p_ncpus
    p_fpops
    m_nbytes
    coprocs
}}}

These describe the host's processing hardware. The simulator doesn't model disk usage.

{{{
time_stats
    on_frac
    connected_frac
    active_frac
    gpu_active_frac
    *on_lambda
    *connected_lambda
    *active_lambda
    *gpu_active_lambda
}}}

These describe the host's availability:

 * on_frac: the fraction of total time this host runs the client.
 * connected_frac: of the time this host runs the client, the fraction it is connected to the Internet.
 * active_frac: of the time this host runs the client, the fraction it is enabled to use the CPU.
 * gpu_active_frac: of the time this host runs the client, the fraction it is enabled to use the GPU (always <= active_frac).

Periods of activity and inactivity are exponentially distributed. The mean of the activity periods can be specified with '''on_lambda''' etc.; the default is 1 hour.
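To make the availability model concrete, here is a minimal sketch of such an on/off process. This is illustrative, not the simulator's actual code; the values are arbitrary, and deriving the off-period mean from the on fraction is one plausible way (an assumption, not documented behavior) to realize the stated model:

{{{
#include <cstdio>
#include <random>

int main() {
    // Illustrative values: on_frac = 0.8, on_lambda = 1 hour (the default).
    double frac = 0.8;
    double mean_on = 3600;
    // Chosen so that mean_on / (mean_on + mean_off) == frac in the long run.
    double mean_off = mean_on * (1 - frac) / frac;

    std::mt19937 rng(42);
    // std::exponential_distribution takes a rate; mean = 1/rate.
    std::exponential_distribution<double> on_period(1 / mean_on);
    std::exponential_distribution<double> off_period(1 / mean_off);

    double t = 0, on_time = 0;
    bool on = true;
    while (t < 30 * 86400.0) {   // simulate 30 days
        double dt = on ? on_period(rng) : off_period(rng);
        if (on) on_time += dt;
        t += dt;
        on = !on;
    }
    printf("simulated on fraction: %.3f (target %.3f)\n", on_time / t, frac);
    return 0;
}
}}}

The same construction applies to the other pairs (connected_frac/connected_lambda, etc.). The remaining fields describe each attached project: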
{{{
project
    project_name
    resource_share
    *available
        frac
        lambda
    app
        name
        *latency_bound
        *fpops_est
        *fpops_actual
            mean
            stddev
        *weight
        *max_concurrent
    app_version
        app_name
        avg_ncpus
        flops
        plan_class
        coproc
            type
            count
            gpu_ram
        *working_set
    workunit
        app_name
        rsc_fpops_est
        rsc_fpops_bound
    result
        name
        report_deadline
        received_time
    active_task
        result_name
        working_set_size
}}}

Notes:

 * Each application has a fixed latency bound. It can be specified in app.latency_bound. If not, and there is a result for that app, it is computed as report_deadline - received_time for one such result. If there is no result, it is 1 week.
 * Each application has a fixed FLOP count estimate. It can be specified as app.fpops_est. If not, and there is a workunit for that app, it is wu.rsc_fpops_est. Otherwise it is 3600*1e9 (i.e., one hour at 1 GFLOPS).
 * The actual FLOP count of an application's jobs is normally distributed. The distribution can be specified as app.fpops_actual; otherwise the mean is app.fpops_est and the standard deviation is 0.
 * Each application has an associated weight that determines the fraction of the project's jobs dispatched from that application. This defaults to 1.
 * Each application version has a fixed working set size. This can be specified as app_version.working_set. If not, and there is an active task for that app version, active_task.working_set_size is used. Otherwise it defaults to 0.

The availability of a project (i.e., the periods when scheduler RPCs succeed) is modeled with two parameters: available periods are exponentially distributed with the given mean (lambda), and unavailable periods are exponentially distributed with a mean chosen to achieve the given available fraction (frac). Availability can be specified as project.available; otherwise the project is always available.

The algorithm for simulating a scheduler RPC to project P is:

{{{
while need more work
    X = list of P's apps with versions for requested resources
    if X is empty
        break
    choose an app A from X, randomly based on weights
    V = version that uses requested resources and has highest FLOPS
    J = generate job
    if J is feasible
        update request
    else
        infeasible_count++
        if infeasible_count == 10
            break
}}}

The available periods (i.e., when BOINC is running) and the idle periods (i.e., when there is no user input) are modeled as above.

=== global_prefs.xml ===

Its format is described [PrefsImpl here].

=== cc_config.xml === #input_cc_config

Its format is described [ClientMessages here].

== Building and running the simulator == #build_and_run

The simulator can be built with 'makefile_sim' on Unix, or with the 'sim' project on Windows. The usage is:

{{{
sim [--duration X] [--delta X] [--server_uses_workload] [--dirs d1 ...]
}}}

 --duration:: simulate this much time.
 --delta:: the time step of the simulation.
 --server_uses_workload:: servers take their existing workload into account when deciding whether to send jobs.
 --dcf_dont_use:: the duration correction factor (DCF) is always one.
 --dcf_stats:: use a formula for DCF based on the mean/stdev of completion time.
 --dirs d1 ...:: chdir into each of the given directories and run a simulation based on the input files there; print a summary of each one separately, plus an overall summary.

== Output files == #output

The simulator creates several output files:

 '''index.html''':: an index of the other files.
 '''log.txt''':: the message log (the same as would be generated by the client). Its contents are controlled by [#input_cc_config cc_config.xml].
 '''time_line.html''':: when viewed in a web browser, a 'time line' showing what's running when.
 '''summary.xml''':: contains four performance metrics:

 wasted_frac:: of the total CPU time, the fraction spent computing results that missed their deadline.
 idle_frac:: of the total CPU time, the fraction spent not computing.
 share_violation:: a measure (0 to 1) of how badly resource shares were violated.
 monotony:: a measure (0 to 1) of how long a single project used all CPUs (so that the user would see only that project in their screensaver, and get bored).

In addition, information is printed about the per-project CPU time and waste.
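As a concrete illustration, here is a minimal sketch of how the first two metrics follow from their definitions above. This is not the simulator's actual code; the accumulator names and values are hypothetical:

{{{
#include <cstdio>

// Hypothetical per-run accumulators; names are illustrative only.
struct SimTotals {
    double total_cpu_time;   // total CPU time available during the run
    double busy_cpu_time;    // CPU time actually spent computing
    double missed_cpu_time;  // CPU time spent on results that missed their deadline
};

int main() {
    SimTotals t = {1000000, 900000, 20000};  // example values

    // wasted_frac: of the total CPU time, the fraction spent computing
    // results that missed their deadline.
    double wasted_frac = t.missed_cpu_time / t.total_cpu_time;

    // idle_frac: of the total CPU time, the fraction spent not computing.
    double idle_frac = 1 - t.busy_cpu_time / t.total_cpu_time;

    printf("wasted_frac: %.3f, idle_frac: %.3f\n", wasted_frac, idle_frac);
    return 0;
}
}}}

share_violation and monotony depend on how usage is distributed over time, so they are not simple ratios of this kind.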