New ResourceShare values are not "read" by some hosts

Profile Tuna Ertemalp
Joined: 23 Dec 13
Posts: 45
United States
Message 77058 - Posted: 31 Mar 2017, 1:05:45 UTC
Last modified: 31 Mar 2017, 1:06:18 UTC

I have 12 hosts. I use BAM as AcctMgr. All my projects were at ResourceShare=100 for the last 2 years. Yesterday, I assigned them values 1/5/10/25/50/100/200/500 in BAM, and BAM successfully set those in actual project sites. Then I forced an UPDATE on all projects on each host. To my surprise, some hosts got the whole set of new values for the projects they run, and some hosts got those only for a few projects. I cannot understand what is going on...

As an example, I will use GPUGRID. I set it to 200 in BAM, BAM relayed that info to the project site, and I see it at 200 there. I don't use any "computer location" stuff; all hosts are at the default location. Yet, some hosts received 200 as the new value, and some hosts are stuck at 100. For kicks, on such a host, I suspended & NNT'd everything except for GPUGRID, changed the value to 201 on the GPUGRID site (to take any issues with BAM or stale files or file timestamps off the table), ran an update on GPUGRID on that host, and it received 8 tasks (4xTitanX, baby!), yet the resshare stayed at 100! I changed it back to 200 on the site, reran the update, still 100. I looked at all *gpugrid*.xml files under ProgramData/BOINC; the resource_share entries are all 200, with not a single instance of 100.
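For anyone who wants to repeat the file check described above without opening each file by hand, here is a minimal Python sketch. It assumes the values appear as plain `<resource_share>…</resource_share>` elements, which is how the post describes them. The function names are made up for illustration and are not part of BOINC. A regex is used rather than an XML parser because the files are not assumed to be strictly well-formed XML.

```python
import re
from pathlib import Path

# Matches <resource_share>VALUE</resource_share>, tolerating surrounding whitespace.
RESOURCE_SHARE = re.compile(r"<resource_share>\s*([^<\s]+)\s*</resource_share>")


def extract_resource_shares(text):
    """Return every numeric resource_share value found in `text`, in order."""
    values = []
    for raw in RESOURCE_SHARE.findall(text):
        try:
            values.append(float(raw))
        except ValueError:
            # Skip anything that is not a number instead of failing the scan.
            pass
    return values


def find_resource_shares(data_dir, pattern="*gpugrid*.xml"):
    """Map each matching file name in `data_dir` to the values it contains."""
    found = {}
    for path in sorted(Path(data_dir).glob(pattern)):
        text = path.read_text(encoding="utf-8", errors="replace")
        found[path.name] = extract_resource_shares(text)
    return found


def files_disagreeing(found, expected):
    """List the files holding at least one value that differs from `expected`."""
    return [name for name, values in found.items()
            if any(v != expected for v in values)]
```

Calling `find_resource_shares()` on the BOINC data directory and then `files_disagreeing(found, 200)` would list any file that still holds a value other than 200. An empty list would match what the post reports: the files on disk look right, so the stale 100 shown in the Manager must come from somewhere else.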

And, a good number of projects are at this state on some number of my hosts.

What to do? How can it be XXX in all *projname*.xml files, yet be 100 for the projname in BOINCMgr?? If it were for one project across all hosts, or all projects across one host, I could understand, and blame a project or host being stuck at something, but this? :(

Yes, everything is the latest version.

Thanks
Tuna
ID: 77058
Profile Tuna Ertemalp
Joined: 23 Dec 13
Posts: 45
United States
Message 77258 - Posted: 10 Apr 2017, 4:58:54 UTC

So, I PAINSTAKINGLY went through all of my 12 hosts with ~50 projects they are attached to, and made sure every project on each host has the correct resshare that I set under MyProjects in BAM, and I also made sure that each project's own site also showed the same resshare in their ProjectPreferences under YourAccount. To do this, I had to identify all the projects for each of my hosts that for some strange reason wouldn't get the new resshare from the project's XML, drain it of any remaining tasks with NoNewTasks+AbortNotStartedWork+DelayedDetach, wait for detach, and then reattach it using BAM. Somehow this initialized the project on that host with the correct resshare value coming from the project. So, now the default values under BAM's MyProjects page, the project sites themselves and my hosts attached to those projects are all in sync.
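The host-by-host comparison described above can be written down as a small Python sketch, assuming the per-host values have already been collected by hand or by some other means. The host names and the `find_mismatches` helper are hypothetical and only illustrate the bookkeeping. The sketch does not talk to BOINC, BAM, or any project site.

```python
def find_mismatches(expected, observed):
    """Compare the shares each host shows against the shares that were set.

    expected: {project: share} as set in the account manager / project site.
    observed: {host: {project: share}} as shown on each host.
    Returns a sorted list of (host, project, observed_share, expected_share).
    """
    mismatches = []
    for host, projects in observed.items():
        for project, share in projects.items():
            want = expected.get(project)
            # Projects with no expected value are ignored rather than flagged.
            if want is not None and share != want:
                mismatches.append((host, project, share, want))
    return sorted(mismatches)


# Illustrative values only: GPUGRID was set to 200, one host still shows 100.
expected = {"GPUGRID": 200}
observed = {
    "host-a": {"GPUGRID": 200},
    "host-b": {"GPUGRID": 100},
}
stuck = find_mismatches(expected, observed)
```

Each tuple in `stuck` names a host/project pair that would need the drain, detach, and reattach treatment described in this post.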

But there clearly is a bug somewhere in BOINCMgr that prevents it from accepting the ResShare value from the XML file sent by the project, sometimes. Working around it was very very very time consuming.

Thanks
Tuna
ID: 77258
Profile Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15477
Netherlands
Message 77270 - Posted: 10 Apr 2017, 8:34:02 UTC

I'll wait until the developers at BAM have said anything about this: https://boincstats.com/en/forum/18/11507,1
ID: 77270
Profile Tuna Ertemalp
Joined: 23 Dec 13
Posts: 45
United States
Message 77277 - Posted: 10 Apr 2017, 16:33:29 UTC - in response to Message 77270.  

I'll wait until the developers at BAM have said anything about this: https://boincstats.com/en/forum/18/11507,1


Certainly. But note that my report here is about seemingly random hosts not respecting the resshare of seemingly random projects even though the XML sent to the host by that project during an update contains the correct value (since the project site itself has the correct value under MyAccount-->ProjectSettings), unless I detach n' reattach.

On the other hand, my report on BAM is that BAM doesn't seem to read back (or display) correctly the current resshare value of some projects on some hosts after an AccountMgrUpdate, regardless of whether that value on the host is correct based on what the project site has, even though the host sends an XML to BAM with the correct values.

I acknowledge that they sound similar, but I'll be surprised if they are the same issue. The two ends and the direction of the flow of information seem to be different in each case...

Tuna
ID: 77277
ChristianB
Volunteer developer
Volunteer tester

Joined: 4 Jul 12
Posts: 321
Germany
Message 77293 - Posted: 11 Apr 2017, 10:28:20 UTC

The problem with issues like that is that they are hard to reproduce. If it is only happening on some hosts but not on others, I can't reproduce this on my end. What I could do, if I find the time, is check where the value the Manager shows to the user comes from. Normally this should come directly from the project-specific preferences XML. Do you still have hosts that show this discrepancy, or did you clean them all?
ID: 77293
Profile Tuna Ertemalp
Joined: 23 Dec 13
Posts: 45
United States
Message 77308 - Posted: 11 Apr 2017, 13:40:14 UTC - in response to Message 77293.  

Unfortunately, I cleaned them all up, which took 2-3 days, after leaving them untouched for about a week to see if the problem would resolve itself.
ID: 77308


Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.