wiki:WorkShop07/BoincGrid

Version 6 (modified by Jack, 17 years ago)

--

Why should BOINC use grids

  • Resources are generally more secure, and their owners are trusted
    • Can be used for result verification
  • Resources are more stable, available, and often underutilized
    • Makes it easier to support low-latency jobs, for example
  • Resources are often connected by high-bandwidth links
    • Could support data-intensive or data-parallel jobs
  • Environment (in terms of software/libraries) tends to be more homogeneous and configurable
  • Easy to run the BOINC client on clusters and supercomputers by statically compiling a stand-alone client
    • Example: Condor pools that run BOINC jobs when their machines are not in use
  • Leverage existing grid software
    • Job submission often simpler with web portals
  • A single scheduling environment for both high-performance computing (HPC) resources (grids) and volunteer resources

Why should grids use BOINC

  • An order of magnitude more computing power and storage at a fraction of the cost
  • Many grid jobs are already task-parallel
  • Challenges
    • Lack of mechanisms and standards to allow a job submitted on a grid to run easily in BOINC
      • Cannot rely on the existence of a software stack
      • Concept of an individual user or job (and access rights) in BOINC
      • Credit accounting for these jobs
    • Some pitfalls have been identified by users running BOINC on some grid computing systems. Most of the problems identified in this session dealt with HPC/grid submission systems and the efficient use of grid resources (allotted grid computing hours).

-- Problem 1: Scheduled Grid client runs when server has no work

The client would sit idle, burning time from the user's allotted computing hours.

-- Problem 2: Grid Scheduler kills jobs after time expires

Even with checkpointing, the data processed on the last work unit (WU) would be lost. The workspace would be scratched.

Action Items

Incorporate some mechanism into the BOINC client to make it aware of a limited-runtime environment. Potential fixes:

If no work is available from the servers, the BOINC client would exit instead of deferring communication.

  • MaxWUCount Parameter

Once the maximum number of WUs is reached, the new client would stop requesting work and exit gracefully.
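
The two proposed fixes can be pictured as a single client loop. The following is only a minimal sketch of that policy, not BOINC client code: `run_client`, `fetch_work`, `process`, and `max_wu_count` are hypothetical names chosen for illustration, and the server is stood in for by an in-memory queue.

```python
from collections import deque


def run_client(fetch_work, process, max_wu_count):
    """Process work units until the server has none or the cap is reached.

    fetch_work   -- hypothetical callable returning a work unit, or None
                    when the server has no work
    process      -- hypothetical callable that computes one work unit
    max_wu_count -- illustrative stand-in for the proposed MaxWUCount
    Returns (number of completed WUs, reason the client stopped).
    """
    completed = 0
    while completed < max_wu_count:
        wu = fetch_work()
        if wu is None:
            # Fix 1: exit rather than defer communication and sit idle,
            # which would burn the user's allotted grid hours.
            return completed, "no_work"
        process(wu)
        completed += 1
    # Fix 2: cap reached, so stop requesting work and exit gracefully.
    return completed, "max_wu_count_reached"


if __name__ == "__main__":
    # Simulated server holding three work units.
    server_queue = deque(["wu1", "wu2", "wu3"])
    finished = []
    result = run_client(
        lambda: server_queue.popleft() if server_queue else None,
        finished.append,
        max_wu_count=2,
    )
    print(result, finished)
```

In a grid batch job, a cap like this would be chosen so that the expected runtime of the allowed WUs fits inside the scheduler's time limit, which is the situation Problem 2 describes.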