Resource sharing is not properly functional

Matthew Burch (Joined: 1 Oct 11, Posts: 17)
Message 59184 - Posted: 31 Dec 2014, 14:07:10 UTC

Currently, I am running BOINC 7.4.27 (x64). I have two projects that I am trying to run.
SETI@home is set for 100 resource share.
Asteroids@home is set for 5 resource share.

Unfortunately, Asteroids@home is completely ignoring its resource share and utilizing 100% of my machine's computing resources, and it has been doing so for long enough that it's clearly not a fluke.

At this time:
"Avg. work done" for Asteroids is 28,000+
"Avg. work done" for SETI is 2,600+
Clearly, if resource share is supposed to be based on "Avg. work done", then resource share is being ignored.

BUT what if BOINC is calculating over the long haul?
"Work done" for Asteroids is over 1,000,000
"Work done" for SETI is less than 12,000,000
Clearly, if resource share is supposed to be based on "Work done", then resource share is being ignored.

So, what is happening here? Why is BOINC defying my wishes as to how to distribute the computing resources that I am donating?

After bouncing the question around in the SETI forum to see if I was just being dense, the answer seems pretty obvious. I may be partly wrong. I'm no code monkey, but I'm certain that I am mostly right.

When I tell BOINC that I want to devote 100 resource share to SETI and 5 resource share to Asteroids, and tell the projects that I want 10 days of data for each of them, BOINC is giving bad data to the projects, telling BOTH projects what the TOTAL computing power of my machine is. BOTH projects then send me ten days' worth of work, as if they were each 100% of the workload on my machine. Now I have 20 days of work on my machine, and 10 days to do the work. This is clearly broken, unless BOINC would like to loan me a Tardis.

To make things worse, some projects give all their work units deadlines in a very close timeframe, less than ten days out, which forces BOINC to treat their work as high priority. This seems to be why Asteroids is running away with my machine.

How can this be fixed? Logically, it's very simple, and would require three obvious implementations.

1) When BOINC tells a project that I am requesting X days of work, that should be sent as X days of work at the calculated % of my machine's CPU/GPU capacity that I am willing to devote to it. The project should NOT be given my machine's total computational capacity.

2) To prevent abuse of work unit calculations, the BOINC client should monitor ACTUAL CPU/GPU usage and make certain it matches the resource share distribution the user requests. This calculation would completely ignore work unit numbers as calculated by individual projects, and focus entirely on processor utilization numbers. If a project goes significantly beyond its allowed allocation of work based on resource share, the project is throttled until the other project catches up. If this happens regularly, then it is reported to BOINC.

3) If one project runs out of work, its effective resource share is reduced to zero, numbers are reset, and the client checks every now and then for work. When new work is found, resource share is reverted, numbers are reset, and calculations of comparative resource share resume.

I would like the BOINC team to remember that I am donating my time here. Your software is the 'vehicle' of my donation of computing resources. When I say I want to donate 100 resource share to one project and 5 resource share to the other, that's what I mean. I expect the BOINC manager to make what I want happen.

If I were to go to an accountant and tell them I wanted to give $100 to Charity A and $5 to Charity B, but then noticed that a check had been cut from my account for $105 to Charity B, I would immediately demand an explanation. If the accountant could only tell me "Oh, they gamed the system so I gave them all the money you set aside for donations," I would instantly fire them.
That being said, BOINC is not a paid service, and because of that, I will not hold BOINC to the same standards that I would hold a paid service. However, I feel that it IS appropriate for me to hold your feet over the fire a little bit here, because BOINC is not properly managing my resource allocations, as I wish them to be allocated. The methodology above would allow this.

Logs

12/31/2014 7:29:47 AM | Asteroids@home | Computation for task ps_141205_60445_8_2 finished
12/31/2014 7:29:47 AM | Asteroids@home | Starting task ps_141205_60445_5_2
12/31/2014 7:29:49 AM | Asteroids@home | Started upload of ps_141205_60445_8_2_0
12/31/2014 7:29:57 AM | Asteroids@home | Finished upload of ps_141205_60445_8_2_0
12/31/2014 7:29:59 AM | Asteroids@home | Sending scheduler request: To report completed tasks.
12/31/2014 7:29:59 AM | Asteroids@home | Reporting 1 completed tasks
12/31/2014 7:29:59 AM | Asteroids@home | Not requesting tasks: don't need (CPU: not highest priority project; NVIDIA GPU: not highest priority project)
12/31/2014 7:30:01 AM | Asteroids@home | Scheduler request completed
12/31/2014 7:49:37 AM | Asteroids@home | Computation for task ps_141205_60444_26_2 finished
12/31/2014 7:49:37 AM | Asteroids@home | Starting task ps_141205_60444_15_2
12/31/2014 7:49:39 AM | Asteroids@home | Started upload of ps_141205_60444_26_2_0
12/31/2014 7:49:46 AM | Asteroids@home | Finished upload of ps_141205_60444_26_2_0
12/31/2014 7:49:47 AM | Asteroids@home | Sending scheduler request: To report completed tasks.
12/31/2014 7:49:47 AM | Asteroids@home | Reporting 1 completed tasks
12/31/2014 7:49:47 AM | Asteroids@home | Not requesting tasks: don't need (CPU: not highest priority project; NVIDIA GPU: not highest priority project)
12/31/2014 7:49:49 AM | Asteroids@home | Scheduler request completed
12/31/2014 7:51:06 AM | Asteroids@home | Computation for task ps_141205_60425_16_2 finished
12/31/2014 7:51:06 AM | Asteroids@home | Starting task ps_141205_60416_3_2
12/31/2014 7:51:08 AM | Asteroids@home | Started upload of ps_141205_60425_16_2_0
12/31/2014 7:51:16 AM | Asteroids@home | Finished upload of ps_141205_60425_16_2_0
12/31/2014 7:51:20 AM | Asteroids@home | Sending scheduler request: To report completed tasks.
12/31/2014 7:51:20 AM | Asteroids@home | Reporting 1 completed tasks
12/31/2014 7:51:20 AM | Asteroids@home | Not requesting tasks: don't need (CPU: not highest priority project; NVIDIA GPU: not highest priority project)
12/31/2014 7:51:22 AM | Asteroids@home | Scheduler request completed
12/31/2014 8:20:59 AM | Asteroids@home | Computation for task ps_141205_60442_20_2 finished
12/31/2014 8:20:59 AM | Asteroids@home | Starting task ps_141205_60441_15_2
12/31/2014 8:21:01 AM | Asteroids@home | Started upload of ps_141205_60442_20_2_0
12/31/2014 8:21:09 AM | Asteroids@home | Finished upload of ps_141205_60442_20_2_0
12/31/2014 8:21:14 AM | Asteroids@home | Sending scheduler request: To report completed tasks.
12/31/2014 8:21:14 AM | Asteroids@home | Reporting 1 completed tasks
12/31/2014 8:21:14 AM | Asteroids@home | Not requesting tasks: don't need (CPU: not highest priority project; NVIDIA GPU: not highest priority project)
12/31/2014 8:21:17 AM | Asteroids@home | Scheduler request completed
12/31/2014 8:21:27 AM | Asteroids@home | Sending scheduler request: To report completed tasks.
12/31/2014 8:21:27 AM | Asteroids@home | Reporting 1 completed tasks
12/31/2014 8:21:27 AM | Asteroids@home | Not requesting tasks: don't need (CPU: not highest priority project; NVIDIA GPU: not highest priority project)
12/31/2014 8:21:30 AM | Asteroids@home | Scheduler request completed
12/31/2014 8:27:08 AM | Asteroids@home | Computation for task ps_141205_60414_23_2 finished
12/31/2014 8:27:08 AM | Asteroids@home | Starting task ps_141205_60436_9_2
12/31/2014 8:27:10 AM | Asteroids@home | Started upload of ps_141205_60414_23_2_0
12/31/2014 8:27:18 AM | Asteroids@home | Finished upload of ps_141205_60414_23_2_0
12/31/2014 8:27:20 AM | Asteroids@home | Sending scheduler request: To report completed tasks.
12/31/2014 8:27:20 AM | Asteroids@home | Reporting 1 completed tasks
12/31/2014 8:27:20 AM | Asteroids@home | Not requesting tasks: don't need (CPU: not highest priority project; NVIDIA GPU: not highest priority project)
12/31/2014 8:27:22 AM | Asteroids@home | Scheduler request completed
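
The first of the three proposals in the post above (size each project's request to its share of the machine, not to the whole machine) comes down to simple arithmetic. The sketch below only illustrates that arithmetic for the 100/5 split and a 10-day buffer. The names `share_fraction` and `days_to_request` are hypothetical, and nothing here is taken from the BOINC client or claims to describe how it sizes its requests.

```python
def share_fraction(shares: dict[str, float], project: str) -> float:
    """Fraction of the machine a project is entitled to under its resource share."""
    total = sum(shares.values())
    if total <= 0:
        raise ValueError("total resource share must be positive")
    return shares[project] / total


def days_to_request(shares: dict[str, float], project: str, buffer_days: float) -> float:
    """Full-machine days of work to request so the project fills only its share of the buffer."""
    return buffer_days * share_fraction(shares, project)


# Resource shares from the post; the 10-day buffer is the poster's setting.
shares = {"SETI@home": 100, "Asteroids@home": 5}
for name in shares:
    print(f"{name}: {days_to_request(shares, name, 10):.2f} days")
```

Under this assumption the two requests always add up to the buffer size, so the machine is never handed more work than it can finish in the buffer period.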

Claggy (Joined: 23 Apr 07, Posts: 1112)
Message 59186 - Posted: 31 Dec 2014, 14:30:38 UTC - in response to Message 59184. Last modified: 31 Dec 2014, 14:30:46 UTC

> and tell the projects that I want 10 days of data for each of them

The cache settings are global: a cache setting of 10 days means 10 days overall, not 10 days for SETI and 10 days for Asteroids.

Claggy

Jazzop (Joined: 19 Dec 06, Posts: 90)
Message 59194 - Posted: 31 Dec 2014, 17:00:01 UTC - in response to Message 59184.

Stop trying to understand the work-sharing algorithm. I gave up a long time ago, when it was significantly changed after version 5.x or 6.x (I don't remember). There are plenty of detailed explanations floating around on this forum regarding the rationale behind the algorithm. I think it fails to satisfy the observer mainly because so many projects do a bad job of estimating their tasks' Time to Completion, especially when that results in a very short deadline and BOINC gets all panicky and urgent.

To get the outcome you seek, you either have to micromanage BOINC constantly or leave it alone for a long period of time so it can "learn" how long the assortment of projects/tasks you have chosen actually takes on your machine.

Matthew Burch (Joined: 1 Oct 11, Posts: 17)
Message 59196 - Posted: 31 Dec 2014, 17:13:57 UTC - in response to Message 59186. Last modified: 31 Dec 2014, 17:14:56 UTC

>> and tell the projects that I want 10 days of data for each of them
> The Cache settings are Global, setting a Cache setting of 10 days means 10 days overall, Not 10 days for Seti and 10 days for Asteroids.
> Claggy

I will edit my post appropriately to take this into account. Regardless of how one looks at it, it's not working properly when I'm not spending my donated processor time the way I wish to.

EDIT: Actually, no, I will not edit the post, because I forgot that posts can only be edited for a short time after posting.

Matthew Burch (Joined: 1 Oct 11, Posts: 17)
Message 59198 - Posted: 31 Dec 2014, 17:23:16 UTC - in response to Message 59194. Last modified: 31 Dec 2014, 17:25:21 UTC

> Stop trying to understand the work-sharing algorithm. I gave up a long time ago, when it was significantly changed after version 5.x or 6.x (I don't remember). There are plenty of detailed explanations floating around on this forum regarding the rationale behind the algorithm. I think it fails to satisfy the observer mainly because so many projects do a bad job of estimating their tasks' Time to Completion, especially when that results in a very short deadline and BOINC gets all panicky and urgent.
> To get the outcome you seek, you either have to micromanage BOINC constantly or leave it alone for a long period of time so it can "learn" how long the assortment of projects/tasks you have chosen actually perform on your machine.

You're correct. I do not need to understand the algorithm. All I have to understand is that BOINC is not donating my processor time the way I want it to.

I am donating my CPU and GPU time and have indicated how I want my resources shared. I need BOINC to prevent projects from ignoring my wishes. I will be generous and hope that Asteroids isn't doing it intentionally. The end result for continued failure to split work the way I want it split will be the same though, whether the system gaming is accidental or intentional. Eventually I will drop alternate projects and go back 100% to one project, SETI, if BOINC does not allow me to simply help out other projects while concentrating most of my processing on SETI.

Aurora Borealis (Joined: 8 Jan 06, Posts: 448)
Message 59200 - Posted: 31 Dec 2014, 18:46:45 UTC - in response to Message 59198.

As I pointed out on the SETI forum, stick to a small cache and leave BOINC to stabilize. Then you can make incremental changes to resource share until things reflect your wishes. Don't forget that if you're using Credits as your benchmark, then you have to take into account that some projects pay out a lot more per unit of computing time than others. GPU crunching especially has a very large discrepancy between projects.

Boinc V 7.4.36 Win7 i5 3.33G 4GB NVidia 470

Matthew Burch (Joined: 1 Oct 11, Posts: 17)
Message 59202 - Posted: 31 Dec 2014, 20:09:23 UTC - in response to Message 59200. Last modified: 31 Dec 2014, 20:13:15 UTC

> As I pointed out on the SETI forum. Stick to a small cache, and leave BOINC to stabilize. Then you can make incremental changes to resource share until things reflect your wishes. Don't forget that if you're using Credits as your benchmark then you have to take into account that some projects do pay out a lot more per computing time than others. GPU's crunching especially have a very large discrepancy between projects.

BOINC has had over 1 million Asteroids "Work Done" units to stabilize, and has not yet done so.

I'm not only using Credits for my benchmark. I'm also looking at the same SETI work units that have been sitting untouched since I got more Asteroids work the other day, after that project had been down for a while.

I first started doing Asteroids work when the database at SETI needed to be rebuilt. Asteroids immediately took over every available resource on my machine. I didn't care that much then because I was fairly short on SETI work anyway, and knew that there wasn't any more on the way for a while. The SETI work had deadlines several weeks out. So Asteroids had a couple of weeks to establish itself on my machine.

Then the Asteroids servers stopped giving out data, and I worked through the entire downloaded selection of Asteroids work units before the SETI work units restarted. SETI provided a few more work units, and I was steadily working SETI until the other day.

Then the Asteroids project returned to service. My machine reported a large number of Asteroids results, and downloaded a lot of new Asteroids work units. Immediately, all SETI work stopped, again. Since Asteroids returned to service, my machine has not even touched any of the nine SETI work units that it was working on.

I cannot see how cache size can possibly have any impact on this. It's abundantly clear that BOINC is not properly arranging for my CPU and GPU time to be split between projects the way I want it to be. Why exactly this is happening, I can't say, but I can say that something is clearly broken.

Gary Charpentier (Joined: 23 Feb 08, Posts: 2465)
Message 59207 - Posted: 31 Dec 2014, 23:32:58 UTC. Last modified: 31 Dec 2014, 23:34:17 UTC

There are a couple of other recent threads on this board about this issue. The figures that BOINC uses in work fetch are REC numbers, not RAC numbers, assuming you are using a reasonably recent version of BOINC. If Asteroids is using credit inflation (and they are), then the RAC numbers you see will be way out of line. This, in addition to the new fetching, can cause users to think BOINC is berserk.

The new fetch will grab as much work as possible, until the buffer is full, from a SINGLE project. So if it attempts to get work from SETI and there isn't any -- AT THAT PRECISE MICROSECOND -- then it will go to Asteroids and gorge itself, no matter the relative work fractions.
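
The fetch behaviour described in this post (fill the whole buffer from one project, and fall through to the next project if the preferred one has nothing to send at that moment) can be sketched as follows. This is a simplified model of the description above, not BOINC's work-fetch code. The function `pick_fetch_target` and its parameters are hypothetical, and the priority order is simply given as input rather than derived from REC.

```python
from typing import Optional


def pick_fetch_target(
    priority_order: list[str],
    has_work: dict[str, bool],
    buffer_shortfall_days: float,
) -> Optional[tuple[str, float]]:
    """Return (project, days requested) for a single work-fetch pass.

    The whole shortfall goes to the first project in priority order that
    has work right now; resource share plays no part once it is chosen.
    """
    if buffer_shortfall_days <= 0:
        return None  # buffer already full, nothing to fetch
    for project in priority_order:
        if has_work.get(project, False):
            return project, buffer_shortfall_days
    return None  # no project has work available


# SETI has the higher priority but happens to have no work at this instant.
order = ["SETI@home", "Asteroids@home"]
print(pick_fetch_target(order, {"SETI@home": False, "Asteroids@home": True}, 10.0))
```

In this model a momentary shortage at the high-share project is enough to hand the entire buffer to the low-share one, which matches the outcome reported earlier in the thread.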

Matthew Burch (Joined: 1 Oct 11, Posts: 17)
Message 59215 - Posted: 1 Jan 2015, 0:39:11 UTC - in response to Message 59207. Last modified: 1 Jan 2015, 0:42:31 UTC

> There are another couple recent threads on this board about this issue. The figures that BOINC uses in the work fetch are REC numbers, not RAC numbers, assuming you are using a reasonably recent version of BOINC. If Asteroids is using credit inflation, they are, then the RAC numbers you see will be way out of line. This in addition to the new fetching can cause users to think BOINC is berserk.
> The new fetch will grab as much work as possible, until the buffer is full, from a SINGLE project. So if it attempts to get work from SETI and there isn't any -- AT THAT PRECISE MICROSECOND -- then it will go to Asteroids and gorge itself, no matter the relative work fractions.

I am arguing that there are a whole lot of meaningless calculations going on.

If I have SETI@home set to 100 resource share, and Asteroids@home set to 5 resource share, then BOINC should monitor how much actual, REAL CPU/GPU time is devoted to each project on the local machine.

For example, using 8 cores of CPU, with SETI set to 100 and Asteroids set to 5:

Once SETI has completed 100 hours of CPU work, a series of Asteroids tasks is started. Once those jobs are completed, Asteroids is determined to have done 20 hours of work. That's fine. BOINC then has SETI do 300 hours of work before starting any more Asteroids work. When SETI is at 400 and Asteroids is at 20, BOINC starts two more Asteroids jobs. If that ends up being less than 5 hours, BOINC pulls another job for Asteroids. From that point on, the client would start one Asteroids job every time the actual CPU time ratio of SETI to Asteroids exceeds 20:1.

It's not rocket science. If a local method of tracking resources is used, then it doesn't matter what individual projects do to try to skew the numbers, either intentionally or by accident. In the example above, 100 hours of CPU is spent on one project for every 5 hours spent on another. Period. Done. No shenanigans allowed.

What this will do is encourage projects to develop more efficient algorithms, so they get more bang per hour.

Other than the time it would take to code the changes, I can think of no legitimate reason for NOT doing resource sharing calculations at the client. That being said, BOINC is being actively developed, so the time and effort required to make BOINC capable of doing its job and letting donors reliably and accurately manage their resource allocations would seem to be a strong priority.
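
The rule proposed in this post (run whichever project is furthest behind its share of measured CPU time, so an Asteroids job starts only once the SETI:Asteroids ratio exceeds 20:1) can be sketched as a small local scheduler. This is an illustration of the poster's idea, not of anything the BOINC client does. The function `next_project`, the one-hour time slices, and the rule that ties go to the larger share are all assumptions made for the example.

```python
def next_project(shares: dict[str, float], cpu_hours: dict[str, float]) -> str:
    """Pick the project that is furthest below its share of measured CPU time.

    Each project's usage is normalised by its resource share; the lowest
    value runs next. Ties go to the project with the larger share.
    """
    candidates = [name for name, share in shares.items() if share > 0]
    if not candidates:
        raise ValueError("no project has a positive resource share")
    return min(
        candidates,
        key=lambda name: (cpu_hours.get(name, 0.0) / shares[name], -shares[name]),
    )


shares = {"SETI@home": 100, "Asteroids@home": 5}

# Simulate 1050 one-hour slices of CPU time handed out by the rule above.
hours = {name: 0.0 for name in shares}
for _ in range(1050):
    hours[next_project(shares, hours)] += 1.0

print(hours)
```

Because the decision uses only CPU time measured on the local machine, credit values reported by the projects have no influence on it, which is the point the post is making.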

Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.