Work fetch and GPUs

Current policy

  • Weighted round-robin simulation
    • get per-project and overall CPU shortfalls
    • see what misses deadline
  • If overall shortfall, get work from project with highest LTD
  • Scheduler request includes just "work_req_seconds".


There may be no CPU shortfall, but GPU is idle

If GPU is idle, we should get work from a project that potentially has jobs for it.

If the project has both CPU and GPU jobs, we may need to tell it to send only GPU jobs.

LTD isn't meaningful with GPUs

New policy

A CPU job is one that uses only CPU time
A CUDA job is one that uses CUDA (and may use CPU as well)
	base class for the work-fetch policy of a resource
	derived classes include all RR sim - related data

		called before RR sim

		called before exists_fetchable_project()
		checks whether there's a project to request work from for this resource,
		and caches the result

	bool exists_fetchable_project()
		there's a project we can ask for work for this resource

	select_project(priority, char buf)
		if the importance of getting work for this resource is P,
		chooses and returns a PROJECT to request work from,
		and a string to put in the request message
		Choose the project for which LTD + expected payoff is largest
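
The selection rule above (largest LTD + expected payoff) could look roughly like the sketch below. The struct ProjectInfo, its fields, and the helper choose_project() are hypothetical names for illustration. These notes do not define how "expected payoff" is computed, so it is treated here as a caller-supplied number.

```cpp
#include <string>
#include <vector>

// Hypothetical per-project record for one resource; names are illustrative.
struct ProjectInfo {
    std::string name;
    double long_term_debt;   // LTD for this resource
    double expected_payoff;  // assumed to be supplied by the caller
    bool fetchable;          // assumed: false if we should not ask this project
};

// Return the fetchable project with the largest LTD + expected payoff,
// or nullptr if no project is fetchable.
const ProjectInfo* choose_project(const std::vector<ProjectInfo>& projects) {
    const ProjectInfo* best = nullptr;
    double best_score = 0;
    for (const ProjectInfo& p : projects) {
        if (!p.fetchable) continue;
        double score = p.long_term_debt + p.expected_payoff;
        if (!best || score > best_score) {
            best = &p;
            best_score = score;
        }
    }
    return best;
}
```

Ties go to the first project encountered; the notes do not say how ties should be broken.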

	values for priority:
		DONT_NEED: no shortfalls
		NEED: a shortfall, but no idle devices right now
		NEED_NOW: idle devices right now
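
As a sketch, the priority values could be an enum with a small classifier. DONT_NEED is the name used in the work-fetch pseudocode later in these notes. The helper classify() and its parameters are hypothetical; it assumes the RR sim has already produced the shortfall and the number of currently idle instances.

```cpp
// Priority of getting work for one resource.
enum class FetchPriority { DONT_NEED, NEED, NEED_NOW };

// Hypothetical helper: derive the priority from RR-sim results.
// shortfall: unmet demand for this resource (instance-seconds)
// nidle_now: instances of this resource that are idle right now
FetchPriority classify(double shortfall, int nidle_now) {
    if (nidle_now > 0) return FetchPriority::NEED_NOW;  // idle devices right now
    if (shortfall > 0) return FetchPriority::NEED;      // shortfall, nothing idle yet
    return FetchPriority::DONT_NEED;                    // no shortfalls
}
```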



	bool count_towards_share(PROJECT p)
		whether to count p's resource share in the total for this rsc
		== whether we've got a job of this type in last 30 days

	add_shortfall(PROJECT, dt)
		add x to this project's shortfall,
		where x = dt*(share - instances used)

	double total_share()
		total resource share of projects we're counting
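
A minimal sketch of count_towards_share() and total_share() as described above. The struct ProjectRsc and the explicit "now" parameter are assumptions made to keep the example self-contained. It also assumes last_job == 0 means "never had a job of this type".

```cpp
#include <vector>

const double SECONDS_PER_DAY = 86400.0;
const double RECENT_JOB_WINDOW = 30 * SECONDS_PER_DAY;  // the 30 days above

// Hypothetical per-project data for one resource.
struct ProjectRsc {
    double resource_share;  // the project's resource share (RS)
    double last_job;        // last time we had a job using this rsc; 0 = never
};

// True if we had a job of this type from the project in the last 30 days.
bool count_towards_share(const ProjectRsc& p, double now) {
    return p.last_job > 0 && now - p.last_job <= RECENT_JOB_WINDOW;
}

// Total resource share of the projects we're counting for this resource.
double total_share(const std::vector<ProjectRsc>& projects, double now) {
    double total = 0;
    for (const ProjectRsc& p : projects) {
        if (count_towards_share(p, now)) total += p.resource_share;
    }
    return total;
}
```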

		for each project p
			x = insts of this device used by P's running jobs
			y = P's share of this device
			update P's LTD

The following defined in base class:
	accumulate_shortfall(dt, i, n)
		i = instances in use, n = total instances
		nidle = n - i
		max_nidle = max(max_nidle, nidle)
		shortfall += dt*(nidle)
		for each project p for which count_towards_share(p)
			add_shortfall(p, dt)
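
The shortfall bookkeeping above, written out as a sketch. The struct layout is an assumption. The per-project step uses x = dt*(share - instances used) from add_shortfall(). These notes do not say whether a negative x should be clamped to zero, so this sketch does not clamp.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical per-project data for one resource.
struct ProjectRsc {
    double share;                // instances this project should get
    double instances_used;       // instances used by its jobs in this interval
    double shortfall;            // accumulated per-project shortfall
    bool counts_towards_share;   // result of count_towards_share(p)
};

struct RscWorkFetch {
    double shortfall = 0;   // overall shortfall, in instance-seconds
    double max_nidle = 0;   // largest number of idle instances seen
    std::vector<ProjectRsc> projects;

    // Add x = dt*(share - instances used) to the project's shortfall.
    void add_shortfall(ProjectRsc& p, double dt) {
        p.shortfall += dt * (p.share - p.instances_used);
    }

    // Called at the end of an RR-sim interval of length dt.
    // i = instances in use, n = total instances.
    void accumulate_shortfall(double dt, double i, double n) {
        double nidle = n - i;
        max_nidle = std::max(max_nidle, nidle);
        shortfall += dt * nidle;
        for (ProjectRsc& p : projects) {
            if (p.counts_towards_share) add_shortfall(p, dt);
        }
    }
};
```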

	data members:
		double shortfall
		double max_nidle

	data per project: (* means save in state file)
		double shortfall
		int last_job*
			last time we had a job from this proj using this rsc
			if the time is within last N days (30?)
			we assume that the project may possibly have jobs of that type
		bool runnable
		max deficit
		backoff timer*
			how long to wait before asking the project for work only for this rsc
			double this any time we ask only for work for this rsc and get none
			(maximum 24 hours)
			clear it when we have a job that uses the rsc
		double share
			# of instances this project should get based on RS
		double long_term_debt*
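
The per-resource backoff timer in the list above might be handled as in this sketch. The doubling, the 24-hour cap, and the clear-on-job rule come from the notes. The starting interval (BACKOFF_MIN) and all names are assumptions.

```cpp
#include <algorithm>

const double BACKOFF_MIN = 60.0;         // assumed starting interval, seconds
const double BACKOFF_MAX = 24 * 3600.0;  // maximum 24 hours

// Hypothetical per-project, per-resource backoff state.
struct RscBackoff {
    double interval = 0;  // current backoff interval; 0 = not backed off
    double until = 0;     // don't ask for work for this rsc before this time

    // We asked only for work for this rsc and got none.
    void request_failed(double now) {
        interval = (interval > 0) ? std::min(interval * 2, BACKOFF_MAX)
                                  : BACKOFF_MIN;
        until = now + interval;
    }

    // We have a job that uses the rsc: clear the backoff.
    void got_job() {
        interval = 0;
        until = 0;
    }

    bool can_ask(double now) const { return now >= until; }
};
```

Keeping this state per resource means a project that never supplies GPU work is backed off only for GPU requests, not for CPU requests.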

derived classes:
		we could eventually subclass this from COPROC_WORK_FETCH
debt accounting
	for each resource type
RR sim

do simulation as current
on completion of an interval dt

scheduler request msg
double work_req_seconds
double cuda_req_seconds
bool send_only_cpu
bool send_only_cuda
double ninstances_cpu
double ninstances_cuda
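
The fields above, gathered into a struct, with a hypothetical helper for the motivating case (no CPU shortfall, but the GPU is idle). The helper name and parameters are assumptions. The notes do not spell out the exact meaning of ninstances_cpu and ninstances_cuda, so they are passed through unchanged here.

```cpp
// Work-request part of the scheduler request message.
struct SchedulerRequest {
    double work_req_seconds = 0;
    double cuda_req_seconds = 0;
    bool send_only_cpu = false;
    bool send_only_cuda = false;
    double ninstances_cpu = 0;
    double ninstances_cuda = 0;
};

// Hypothetical helper: ask a project for CUDA work only.
SchedulerRequest make_cuda_only_request(double cuda_shortfall,
                                        double ninstances_cuda) {
    SchedulerRequest req;
    req.cuda_req_seconds = cuda_shortfall;
    req.ninstances_cuda = ninstances_cuda;
    req.send_only_cuda = true;  // project may have CPU jobs too; skip them
    return req;
}
```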

work fetch

We need to deal w/ situation where there's GPU shortfall
	but no projects are supplying GPU work.
	We don't want an overall backoff from those projects.
	Solution: maintain separate backoff timer per resource

	switch cpu_work_fetch.priority
		case DONT_NEED
			set no_cpu in req message
		case NEED, NEED_NOW:
			work_req_sec = p.cpu_shortfall
			ncpus_idle = p.max_idle_cpus
	switch cuda_work_fetch.priority
		case DONT_NEED
			set no_cuda in the req message
		case NEED, NEED_NOW:

for prior = NEED_NOW, NEED
	for each coproc C (in decreasing order of importance)
	p = C.work_fetch.select_project(prior, msg);
		if p
			put msg in req message
	p = cpu_work_fetch(prior)
		if p
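
A sketch of the loop above. It assumes the search stops at the first project found, which the notes do not state. It also reduces select_project() to returning a precomputed candidate when the resource's priority matches. All type names, fields, and the message text are hypothetical.

```cpp
#include <string>
#include <vector>

enum class FetchPriority { DONT_NEED, NEED, NEED_NOW };

struct Project { std::string name; };

// Hypothetical per-resource work-fetch state.
struct RscWorkFetch {
    std::string rsc_name;
    FetchPriority priority;  // importance of getting work for this resource
    Project* candidate;      // project to ask for this resource, or nullptr

    // If this resource's priority is `prior`, return the project to ask
    // and fill in a string for the request message.
    Project* select_project(FetchPriority prior, std::string& msg) {
        if (priority != prior || !candidate) return nullptr;
        msg = "request " + rsc_name + " work";
        return candidate;
    }
};

struct FetchChoice {
    Project* project;  // nullptr if there is nothing to fetch
    std::string msg;
};

// Try NEED_NOW before NEED; within a priority, coprocs (assumed to be
// sorted in decreasing order of importance) come before the CPU.
FetchChoice choose_fetch(std::vector<RscWorkFetch>& coprocs, RscWorkFetch& cpu) {
    const FetchPriority order[] = {FetchPriority::NEED_NOW, FetchPriority::NEED};
    for (FetchPriority prior : order) {
        for (RscWorkFetch& c : coprocs) {
            std::string msg;
            if (Project* p = c.select_project(prior, msg)) return FetchChoice{p, msg};
        }
        std::string msg;
        if (Project* p = cpu.select_project(prior, msg)) return FetchChoice{p, msg};
    }
    return FetchChoice{nullptr, ""};
}
```

With this ordering, an idle CPU (NEED_NOW) is served before a GPU that merely has a shortfall (NEED).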

When get scheduler reply
	if request.