I just don't understand how BOINC schedules tasks

Paul Schauble

Joined: 29 Aug 05
Posts: 68
Message 100250 - Posted: 9 Aug 2020, 7:16:19 UTC

I am running BOINC 7.16.7 (x64) on Windows 10 v1909 x64, with the projects srbase, LHC, and Asteroids. The CPU has 4 hyperthreaded cores, with BOINC set to allow 3 cores (yes, a different definition of "cores").

I just watched BOINC suspend two Asteroids tasks that were very near their deadlines, should have been high priority, and had less than 1 hour left to run, in favor of an LHC ATLAS task with an 8 hour estimate that is due in 4 days.

The two Asteroids tasks were the last two in the queue, so everything would have completed successfully had they been allowed to finish.

The BOINC event log shows this when the Asteroids tasks were suspended:
2020-08-08 10:00:33 | Asteroids@home | Computation for task ps_200721_input_411087_1_0 finished
2020-08-08 10:00:33 | LHC@home | Starting task hEKKDmLfLNxnsSi4apGgGQJmABFKDmABFKDmEoBPDmABFKDmq6rqen_0
2020-08-08 10:00:33 | LHC@home | [cpu_sched] Starting task hEKKDmLfLNxnsSi4apGgGQJmABFKDmABFKDmEoBPDmABFKDmq6rqen_0 using ATLAS version 200 (vbox64_mt_mcore_atlas) in slot 0
2020-08-08 10:00:35 | Asteroids@home | Started upload of ps_200721_input_411087_1_0_0
2020-08-08 10:00:38 | Asteroids@home | Finished upload of ps_200721_input_411087_1_0_0
2020-08-08 10:00:43 | Asteroids@home | Sending scheduler request: To report completed tasks.
2020-08-08 10:00:43 | Asteroids@home | Reporting 1 completed tasks
2020-08-08 10:00:43 | Asteroids@home | Not requesting tasks: don't need (CPU: not highest priority project; NVIDIA GPU: )
2020-08-08 10:00:44 | Asteroids@home | Scheduler request completed
2020-08-08 10:00:44 | Asteroids@home | Project requested delay of 7 seconds
2020-08-08 10:01:33 | Asteroids@home | [cpu_sched] Preempting ps_200721_input_411087_2_0 (removed from memory)
2020-08-08 10:01:33 | Asteroids@home | [cpu_sched] Preempting ps_200721_input_411152_2_1 (removed from memory)

I think the sequence here was

  1. The last three Asteroids tasks are running. One completes.
  2. BOINC starts the LHC ATLAS task, which requires 3 CPUs, without taking notice of the 3 CPU requirement.
  3. After running for 1 minute, BOINC notices the requirement for 3 CPUs and preempts the two Asteroids tasks to provide the other two CPUs. This preemption is without regard to priorities or deadlines.
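To make the suspected sequence concrete, here is a toy model of it in Python. This is only a sketch of my guess, not BOINC's actual scheduler code. The function `start_mt_task_naively` is hypothetical, the Asteroids remaining time and deadline figures are made-up placeholders (the log does not show them), and the start and the preemption are collapsed into one step even though the log shows them a minute apart.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    cpus: int                 # CPUs the task needs
    hours_left: float         # estimated remaining run time
    hours_to_deadline: float  # time until the report deadline
    running: bool = False


def start_mt_task_naively(running, new_task, cpu_limit):
    """Start new_task, then preempt running tasks until the CPU limit is met.

    Hypothetical model of the suspected behaviour: victims are taken in
    list order, and deadlines are never consulted.
    """
    new_task.running = True
    used = new_task.cpus + sum(t.cpus for t in running)
    preempted = []
    for task in running:
        if used <= cpu_limit:
            break
        task.running = False
        preempted.append(task)
        used -= task.cpus
    return preempted


# Two Asteroids tasks still running after the third one finished.
# 0.9 h left and 1.5 h to deadline are assumed values for illustration.
asteroids = [
    Task("ps_200721_input_411087_2_0", 1, 0.9, 1.5, running=True),
    Task("ps_200721_input_411152_2_1", 1, 0.9, 1.5, running=True),
]
# ATLAS: 3 CPUs, 8 hour estimate, due in 4 days (96 hours).
atlas = Task("ATLAS", 3, 8.0, 96.0)

preempted = start_mt_task_naively(asteroids, atlas, cpu_limit=3)
print([t.name for t in preempted])
```

In this model both Asteroids tasks are preempted even though ATLAS has far more slack than they do, which matches what the log shows.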


It seems obviously wrong to sacrifice two tasks that could have finished within deadline to run a task that will finish within deadline in any case.

Can someone please explain why it works this way?

Thanks,
++PLS


ID: 100250 · Report as offensive
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5081
United Kingdom
Message 100251 - Posted: 9 Aug 2020, 8:00:47 UTC - in response to Message 100250.  

Probably because the multithreaded / virtual machine rules ('preempt other tasks so that the MT task can have everything it needs') were added piecemeal after the 'run earliest deadline first if a task is in danger of missing deadline' rule. The interaction between these different rules is very subtle, and if they're not all added together as a coherent, thought-through, set, there can be conflicts - as you have found.
ID: 100251 · Report as offensive
robsmith
Volunteer tester
Help desk expert

Joined: 25 May 09
Posts: 1283
United Kingdom
Message 100253 - Posted: 9 Aug 2020, 9:32:17 UTC

Also, depending on the hardware, it is possible that the VM used by LHC can over-commit the CPU, as the VM uses a single core to look after itself (well, actually a small fraction of a core), and so BOINC sees it as a single core, not the three virtualised cores in use. VMs are weird that way.
ID: 100253 · Report as offensive
Paul Schauble

Joined: 29 Aug 05
Posts: 68
Message 100343 - Posted: 19 Aug 2020, 10:45:10 UTC - in response to Message 100251.  

I thought it was something like that. In other words, the whole scheduler needs to be rethought in the light of multi-CPU tasks.

It would likely not be that hard to make the scheduler not preempt a task that was in danger of missing its deadline. The scheduler supposedly already handles tasks like that. But doing so would likely just cause a different set of weirdness.
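As a rough illustration of what I mean, here is a minimal Python sketch of such a guard. It is not BOINC code. The names `in_deadline_danger` and `victims_for`, the 2 hour safety margin, and the sample task figures are all assumptions of mine.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    cpus: int
    hours_left: float         # estimated remaining run time
    hours_to_deadline: float  # time until the report deadline


def slack(task):
    """Spare hours before the deadline if the task runs without interruption."""
    return task.hours_to_deadline - task.hours_left


def in_deadline_danger(task, margin_hours=2.0):
    # Assumed rule: a task is "in danger" when its slack is under the margin.
    return slack(task) < margin_hours


def victims_for(new_task, running, cpu_limit):
    """Return the tasks to preempt so new_task fits, or None if it must wait.

    Tasks in deadline danger are never chosen; the tasks with the most
    slack are considered first.
    """
    free = cpu_limit - sum(t.cpus for t in running)
    victims = []
    for task in sorted(running, key=slack, reverse=True):
        if free >= new_task.cpus:
            break
        if in_deadline_danger(task):
            continue
        victims.append(task)
        free += task.cpus
    return victims if free >= new_task.cpus else None


atlas = Task("ATLAS", 3, 8.0, 96.0)
# Assumed figures: under 1 hour left, deadline close behind.
urgent = [Task("asteroids_a", 1, 0.9, 1.5), Task("asteroids_b", 1, 0.9, 1.5)]
relaxed = [Task("other_a", 1, 2.0, 100.0), Task("other_b", 1, 2.0, 100.0)]

wait_result = victims_for(atlas, urgent, cpu_limit=3)
preempt_result = victims_for(atlas, relaxed, cpu_limit=3)
print(wait_result, [t.name for t in preempt_result])
```

With the urgent tasks the function returns None, so ATLAS has to wait. That is exactly the sort of new weirdness I would expect: one allowed CPU sits idle until the Asteroids tasks finish, unless something else is found to fill it.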

Just have to put up with it.

++PLS
ID: 100343 · Report as offensive

Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.