Stuck in gear

Profile Bobby

Joined: 27 Oct 19
Posts: 8
United States
Message 93335 - Posted: 27 Oct 2019, 20:20:17 UTC

I have been using the software since reinstalling on October 7 this year. Things had been going well until last night, when my Internet connection was knocked offline. When I got up this morning, I reset my router and the Internet was back online. I restarted my computer. When I went back to BOINC Manager, my project and the 4 tasks I had were gone. I attempted to get back on by loading a project, but BOINC Manager cannot stay connected: it connects and then disconnects after a few seconds. I am getting a message that I am using the wrong password. I can get onto Einstein@Home and my account is still active. BOINC Manager seems to be stuck in a loop, and stdoutdae.txt, now >30 MB, shows the loop that it is stuck in. I can't find where to enter the correct password for BOINC Manager. Below is a sample of the loop that the program is stuck in.

27-Oct-2019 12:08:17 [---] [gui_rpc] GUI RPC reply: '<boinc_gui_rpc_reply>
<handle_get_screensaver_tasks>
<suspend_reason>0</suspend_reason>
<result>
<name>h1_0412.75_O2C02C'
27-Oct-2019 12:08:18 [---] [suspend] net_susp: no; file_xfer_susp: no; reason: unknown reason
27-Oct-2019 12:08:18 [Einstein@Home] [heartbeat] Heartbeat sent to task h1_0412.75_O2C02Cl1In0__O2MD1G2_G34731_412.90Hz_24_1
27-Oct-2019 12:08:18 [---] [poll] CLIENT_STATE::do_something(): End poll: 0 tasks active
27-Oct-2019 12:08:18 [---] [gui_rpc] GUI RPC Command = '<boinc_gui_rpc_request>
<get_screensaver_tasks/>
</boinc_gui_rpc_request>
'
27-Oct-2019 12:08:18 [---] [gui_rpc] GUI RPC reply: '<boinc_gui_rpc_reply>
<handle_get_screensaver_tasks>
<suspend_reason>0</suspend_reason>
<result>

Please Help
ID: 93335 · Report as offensive
Profile Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15483
Netherlands
Message 93337 - Posted: 27 Oct 2019, 21:31:48 UTC - in response to Message 93335.  

You don't say which OS you use, so you may have to look around for where the file is. Completely exit BOINC (client and Manager), go to the BOINC data directory and remove the gui_rpc_auth.cfg file, then restart BOINC Manager (which, if all's right, should start the client and connect to it).

If you ever edited the gui_rpc_auth.cfg file and put in your own password, you will have to do that again after this.
ID: 93337 · Report as offensive
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5082
United Kingdom
Message 93338 - Posted: 27 Oct 2019, 21:40:20 UTC - in response to Message 93335.  
Last modified: 27 Oct 2019, 21:43:00 UTC

Are you the same 'Bobby' as the 'BOBBY CONGER' that I replied to at Einstein, earlier this evening? If so, let's try and tie the two threads together.

I see that the Einstein thread now has a brief snippet from an Event Log, with several debug flags set. It would probably help if you could post a longer snippet, from the top, showing the complete sequence of events from computer restart to the recurrence of the error messages. (Edit - since others are getting involved who don't usually read Einstein, the message in question is https://einsteinathome.org/content/stuck-last-gear-0#comment-174084)

As I said at Einstein, I think the 'bad password' error message is bad programming: it's the fallthrough message when all known reasons for failure have been excluded. So we're dealing with something unknown to the programmers (although it's been around for years).

The password in question is described in Controlling BOINC remotely. It's a string placed in the file gui_rpc_auth.cfg on the computer you're trying to control. Under Windows, it may be a random 32-character string created when BOINC is first installed, or you can place something simpler there yourself. If you are connecting to a BOINC client on the same machine as the Manager you're using, you should never need to supply a password: the Manager can look up the password from the same place as the client, and self-authenticate.
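
For anyone who wants to check what is actually stored there, here is a minimal Python sketch (not part of BOINC) that reads the first line of gui_rpc_auth.cfg from a data directory. The function name is made up for illustration, and the assumption that only the first line matters is mine; a missing or empty file is treated as "no password".

```python
from pathlib import Path


def read_gui_rpc_password(data_dir: str) -> str:
    """Return the first line of gui_rpc_auth.cfg, or "" if the file is missing or empty.

    Hypothetical helper for illustration; it is not part of BOINC.
    """
    cfg = Path(data_dir) / "gui_rpc_auth.cfg"
    if not cfg.is_file():
        return ""
    lines = cfg.read_text(encoding="utf-8", errors="replace").splitlines()
    # Assumption: only the first line is significant; surrounding whitespace is ignored.
    return lines[0].strip() if lines else ""
```

On a default Windows install you might call it as read_gui_rpc_password(r"C:\ProgramData\BOINC"); adjust the path for your own OS and install.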

The password field needs to be populated if you are genuinely needing to control a different, remote client on another machine. Then you would go to the 'Select computer...' menu entry on the File menu in BOINC Manager (Advanced View, if you're not using that already). You should fill in BOTH boxes (for a remote computer), or NEITHER box (for a local client).

But as I've said, I think this whole message is a red herring - we need to find a different cause. Hold on while I find the source code for the bad message, so I can explain why I think it's wrong.
ID: 93338 · Report as offensive
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5082
United Kingdom
Message 93339 - Posted: 27 Oct 2019, 21:55:36 UTC
Last modified: 27 Oct 2019, 21:56:43 UTC

I think the problem starts at https://github.com/BOINC/boinc/blob/master/clientgui/BOINCBaseFrame.cpp#L479.

There are a couple of genuine causes listed, then - at line 518 - there's an

    } else {
If we ever reach this point, there's only one answer we can give - "The password you have provided is incorrect, please try again.": whatever the true failure.
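
To make the pattern concrete, here is a small Python sketch of that kind of if/else chain. It is not the BOINC source: the failure names and the first two messages are invented for illustration, and only the final message text is the one quoted above.

```python
def connection_failure_message(reason: str) -> str:
    """Map a failure reason to a user-facing message.

    Illustrative only: the reason strings are hypothetical, not BOINC's.
    """
    if reason == "client_not_running":
        return "The client is not running."
    elif reason == "host_unreachable":
        return "The host could not be reached."
    else:
        # Catch-all branch: any failure nobody anticipated ends up here,
        # so the user is told about a password whatever really went wrong.
        return "The password you have provided is incorrect, please try again."
```

Calling connection_failure_message("client_too_busy") returns the password message, even though no password was involved.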

And that's as far as I'm going tonight - late evening, European time. I'll think again in the morning.
ID: 93339 · Report as offensive
Profile Bobby

Joined: 27 Oct 19
Posts: 8
United States
Message 93340 - Posted: 28 Oct 2019, 2:17:28 UTC - in response to Message 93337.  

I'm using Windows 10 on a 64-bit computer. Hope this helps.
ID: 93340 · Report as offensive
Profile Bobby

Joined: 27 Oct 19
Posts: 8
United States
Message 93341 - Posted: 28 Oct 2019, 2:18:27 UTC - in response to Message 93338.  

Yes I am the same. I guess my desperation is showing. I'd like to get the computer back to work, but nothing is helping right now.
Thanks
ID: 93341 · Report as offensive
Profile Bobby

Joined: 27 Oct 19
Posts: 8
United States
Message 93342 - Posted: 28 Oct 2019, 2:25:40 UTC - in response to Message 93337.  

I did as you instructed and renamed the gui_rpc_auth.cfg file to gui_rpc_auth.old and restarted BOINC Manager. It did the same thing as before: it connects for 10 seconds and then disconnects. I assume it is filling up the stdoutdae.txt file, because that has now more than doubled in size, from ~30 MB to 72 MB.
ID: 93342 · Report as offensive
Profile Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15483
Netherlands
Message 93349 - Posted: 28 Oct 2019, 9:58:48 UTC - in response to Message 93342.  

Do you have BOINC installed as a service, or as a (normal) user install?
In the event log and stdoutdae.txt this will show as a line stating that BOINC runs as a daemon.

Quoting Bobby: "I did as you instructed and renamed the gui_rpc_auth.cfg file to gui_rpc_auth.old and restarted BOINC manager."
BOINC consists of two things, a client and a Manager (the GUI). Exiting the Manager doesn't necessarily exit the client, just as starting the Manager doesn't necessarily start the client. That's all user-definable. The Manager is there to allow easy control of the client; the client does all the heavy work of scheduling, storing work, contacting the projects, etc.

That is why I instructed you to quit BOINC, both client and Manager, before deleting the gui_rpc_auth.cfg file. If the client is still running when you tamper with this file, it will just keep using the previous values after you restart the Manager (the GUI).

So go to Windows Task Manager (CTRL + Shift + Esc), choose BOINC, right click on it, Go to details, right click on boinc.exe, End process tree (Jeez Microsoft, can we do this any more circuitous?)
Make sure it doesn't come back (when installed as a service it may restart, then you'll have to stop the service via the services app).

Now, to make sure that the Manager runs the client when it starts, we need to be in the registry.
In the search box on the taskbar, type regedit. Then, select the top result for Registry Editor (Desktop app).
Go to HKEY_CURRENT_USER\Software\Space Sciences Laboratory, U.C. Berkeley\BOINC Manager
Make sure that DisplayShutdownClientDialog has value set to 1.
Make sure that RunDaemon has value set to 1.
And if not, edit these so their values are 1.

No need to restart the computer, just exit the registry.
Now start BOINC Manager in your normal way.

Quoting Bobby: "I assume it is filling the stdoutdae.txt file up because it has doubled in size now from ~30 MB to 72 MB."
Normally this file is limited to 2 MB; when it fills up, the full version is moved to stdoutdae.old and a new stdoutdae.txt is started.
If you edited the client configuration file cc_config.xml and added a line
<max_stdout_file_size>N</max_stdout_file_size>

where N is a value in bytes, then this file can grow to that size instead.

It can only double in size so quickly when it has extraneous and unnecessary debug flags on.
The log_flags are also set in cc_config.xml. file_xfer, sched_ops and task are on (1) by default; the rest are off (0) by default.
So check which flags are on (1) and, if you don't find them necessary, turn them off (0).
It's best to edit this file with the client off. Editing can be done in Notepad; no fancy XML editor is necessary. Make sure to save changes in ANSI encoding.
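
If you'd rather not eyeball the file, a rough Python sketch like the following lists which log flags are switched on. It assumes a cc_config.xml with a log_flags section laid out as described here; the function names are hypothetical, and the three flags treated as normal are the ones named above.

```python
import xml.etree.ElementTree as ET

# The three flags that are normally on, per the post above.
DEFAULT_ON = {"file_xfer", "sched_ops", "task"}


def enabled_log_flags(xml_text: str) -> list[str]:
    """Return the sorted names of all log flags whose value is 1."""
    root = ET.fromstring(xml_text)
    flags = root.find("log_flags")
    if flags is None:
        return []
    return sorted(f.tag for f in flags if (f.text or "").strip() == "1")


def extra_log_flags(xml_text: str) -> list[str]:
    """Return enabled flags other than the three that are normally on."""
    return [name for name in enabled_log_flags(xml_text) if name not in DEFAULT_ON]
```

To use it, read the file contents into a string first (for example with open(path, encoding="utf-8").read()) and pass that to extra_log_flags; anything it returns is a candidate for switching back to 0.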
ID: 93349 · Report as offensive
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5082
United Kingdom
Message 93350 - Posted: 28 Oct 2019, 10:02:00 UTC - in response to Message 93342.  

It sounds like the client is crashing every 10 seconds, then restarting. You won't be able to post the whole 72 MB here (64 KB limit), and we wouldn't want to read it! Can you search through that file for the phrase "Starting BOINC client", and then copy out everything until the next appearance of the same phrase? Choose one of the most recent occurrences towards the end of the file. If the result is still over 64 KB, you'll have to upload it to a file sharing site and post us a link.
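
Searching a 72 MB file by hand is tedious, so here is a hedged Python sketch of the same idea. The function name is hypothetical. It returns the most recent complete run, meaning the lines from one "Starting BOINC client" line up to (but not including) the next; if the log has only one such line, it returns everything from there to the end.

```python
from typing import Iterable, List, Optional


def last_startup_segment(lines: Iterable[str],
                         marker: str = "Starting BOINC client") -> List[str]:
    """Return the latest complete marker-to-marker segment of a log."""
    previous: Optional[List[str]] = None  # last segment closed by a later marker
    current: Optional[List[str]] = None   # segment currently being collected
    for line in lines:
        if marker in line:
            if current is not None:
                previous = current
            current = [line]
        elif current is not None:
            current.append(line)
    if previous is not None:
        return previous
    return current if current is not None else []
```

A file object can be passed directly, for example with open("stdoutdae.txt", encoding="utf-8", errors="replace") as fh: segment = last_startup_segment(fh). Summing len(line.encode()) over the result tells you whether it fits under the 64 KB posting limit.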

It might be worth looking at the .txt files starting 'stderr...' in the same folder. Look at the datestamps first - not worth bothering with one that hasn't changed in the last couple of days.
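
As a rough sketch of that datestamp check in Python (the function name and the two-day default are my own choices, not anything BOINC provides), this lists the stderr*.txt files in a folder that were modified recently, newest first:

```python
import time
from pathlib import Path
from typing import List, Optional


def recent_stderr_logs(data_dir: str, max_age_days: float = 2.0,
                       now: Optional[float] = None) -> List[str]:
    """Return names of stderr*.txt files modified within max_age_days, newest first."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds per day
    hits = []
    for path in Path(data_dir).glob("stderr*.txt"):
        mtime = path.stat().st_mtime
        if mtime >= cutoff:
            hits.append((mtime, path.name))
    return [name for _, name in sorted(hits, reverse=True)]
```

Files that don't show up in the result haven't changed within the window and can probably be ignored.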
ID: 93350 · Report as offensive
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5082
United Kingdom
Message 93353 - Posted: 28 Oct 2019, 10:24:32 UTC - in response to Message 93349.  

That's an alternate theory. Modern BOINCs only rotate the logs to enforce the 2 MB (or whatever) limit when the client restarts, so it may not be crashing. Maybe you have so many optional log flags that the client simply can't keep up, and doesn't have enough time to process comms with the manager at the same time.

Ideally, in that case you would need to find a way of shutting the client down cleanly, so you can edit cc_config.xml and have the new version read at startup. The best way is to navigate to BOINC's program directory (C:\Program Files\BOINC), open a command window, and issue

boinccmd --quit
ID: 93353 · Report as offensive
Profile Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15483
Netherlands
Message 93355 - Posted: 28 Oct 2019, 10:30:55 UTC - in response to Message 93353.  

LOL, I forgot about BOINCCMD. Ta. :)
ID: 93355 · Report as offensive
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5082
United Kingdom
Message 93358 - Posted: 28 Oct 2019, 11:32:54 UTC

Bobby's machine contacted the Einstein server about 20 minutes ago - Tasks for computer 12791665. It reported a completed task (still pending validation), so it looks like the client is running properly.

Completed tasks are showing a big discrepancy between elapsed time and CPU time, which correlates with a very busy CPU. Jord's probably on to something with the 'too many log flags' rendering the client incommunicado.
ID: 93358 · Report as offensive
Profile Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15483
Netherlands
Message 93360 - Posted: 28 Oct 2019, 13:00:30 UTC
Last modified: 28 Oct 2019, 13:01:03 UTC

Let me give you a default cc_config.xml file; just remove or rename the original one and put this one in place.
You can copy it into Notepad, then save it (file type All Files, ANSI encoding) as cc_config.xml in C:\ProgramData\BOINC\

<cc_config>
    <log_flags>
        <file_xfer>1</file_xfer>
        <sched_ops>1</sched_ops>
        <task>1</task>
        <app_msg_receive>0</app_msg_receive>
        <app_msg_send>0</app_msg_send>
        <async_file_debug>0</async_file_debug>
        <benchmark_debug>0</benchmark_debug>
        <checkpoint_debug>0</checkpoint_debug>
        <coproc_debug>0</coproc_debug>
        <cpu_sched>0</cpu_sched>
        <cpu_sched_debug>0</cpu_sched_debug>
        <cpu_sched_status>0</cpu_sched_status>
        <dcf_debug>0</dcf_debug>
        <disk_usage_debug>0</disk_usage_debug>
        <file_xfer_debug>0</file_xfer_debug>
        <gui_rpc_debug>0</gui_rpc_debug>
        <heartbeat_debug>0</heartbeat_debug>
        <http_debug>0</http_debug>
        <http_xfer_debug>0</http_xfer_debug>
        <idle_detection_debug>0</idle_detection_debug>
        <mem_usage_debug>0</mem_usage_debug>
        <network_status_debug>0</network_status_debug>
        <notice_debug>0</notice_debug>
        <poll_debug>0</poll_debug>
        <priority_debug>0</priority_debug>
        <proxy_debug>0</proxy_debug>
        <rr_simulation>0</rr_simulation>
        <rrsim_detail>0</rrsim_detail>
        <sched_op_debug>0</sched_op_debug>
        <scrsave_debug>0</scrsave_debug>
        <slot_debug>0</slot_debug>
        <state_debug>0</state_debug>
        <statefile_debug>0</statefile_debug>
        <suspend_debug>0</suspend_debug>
        <task_debug>0</task_debug>
        <time_debug>0</time_debug>
        <trickle_debug>0</trickle_debug>
        <unparsed_xml>0</unparsed_xml>
        <work_fetch_debug>0</work_fetch_debug>
    </log_flags>
    <options>
        <max_stdout_file_size>20119200</max_stdout_file_size>
    </options>
</cc_config>

The max_stdout_file_size value is in bytes; 20119200 sets the limit to roughly 20 MB (about 19.2 MiB).

After putting this one in place, restart the client.
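
Since max_stdout_file_size is given in bytes, a quick conversion helps when picking a different limit. This is only a small arithmetic sketch in Python; the helper names are made up.

```python
def describe_bytes(n: int) -> str:
    """Show a byte count in decimal megabytes and binary mebibytes."""
    return f"{n} bytes = {n / 1_000_000:.2f} MB = {n / 2**20:.2f} MiB"


def mib_to_bytes(mib: int) -> int:
    """Convert mebibytes to bytes (1 MiB = 1048576 bytes)."""
    return mib * 2**20


print(describe_bytes(20119200))  # 20119200 bytes = 20.12 MB = 19.19 MiB
print(mib_to_bytes(18))          # 18874368
```

So a limit of exactly 18 MiB would be written as 18874368, and the usual 2 MiB as 2097152.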
ID: 93360 · Report as offensive
Profile Bobby

Joined: 27 Oct 19
Posts: 8
United States
Message 93362 - Posted: 28 Oct 2019, 13:55:13 UTC - in response to Message 93349.  

I did as instructed. The computer is now set such that the stdoutdae.txt has been reset and the values in the registry are set to the default which is zero. I noticed that the gui_rpc_auth.cfg had been changed and the password did not match the authenticator line in account_einstein.phys.uwm.edu.xml. I don't know if that would have an effect or not, but I did change the gui_rpc_auth.cfg to the original configuration file when I set up the computer to run. I am assuming that this is the password and I have no idea how it got changed. I don't read hex so I'm not sure if this is my password or not. After all this I attempted to connect with BOINC manager again and got the same result, the password is not recognized. I am at a loss right now.

Bobby
ID: 93362 · Report as offensive
Profile Bobby

Joined: 27 Oct 19
Posts: 8
United States
Message 93363 - Posted: 28 Oct 2019, 14:03:03 UTC - in response to Message 93360.  

Thanks, Jord. I don't know how the file got corrupted, but this has got me back online. I appreciate all your help.
Bobby
ID: 93363 · Report as offensive
Profile Bobby

Joined: 27 Oct 19
Posts: 8
United States
Message 93364 - Posted: 28 Oct 2019, 14:03:59 UTC - in response to Message 93362.  

I am back on line now after copying the cc_config.xml file that was sent to me by Jord. Thanks for all your help.

Bobby
ID: 93364 · Report as offensive
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5082
United Kingdom
Message 93365 - Posted: 28 Oct 2019, 14:19:28 UTC - in response to Message 93362.  

Quoting Bobby: "I did as instructed. The computer is now set such that the stdoutdae.txt has been reset and the values in the registry are set to the default which is zero. I noticed that the gui_rpc_auth.cfg had been changed and the password did not match the authenticator line in account_einstein.phys.uwm.edu.xml. I don't know if that would have an effect or not, but I did change the gui_rpc_auth.cfg to the original configuration file when I set up the computer to run. I am assuming that this is the password and I have no idea how it got changed. I don't read hex so I'm not sure if this is my password or not. After all this I attempted to connect with BOINC manager again and got the same result, the password is not recognized. I am at a loss right now."

The gui_rpc_auth.cfg password is local only - it's not meant to have any relationship with the authenticator. I think it's random.

I think the client only reads the file at startup, and then listens for that password for the rest of the session (until the next restart, which may be days or weeks later). If you change it, the Manager will read the new one from the file and offer that instead - and it won't match the old one. Moral: restart the client after changing the password.
ID: 93365 · Report as offensive
Profile Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15483
Netherlands
Message 93366 - Posted: 28 Oct 2019, 15:33:25 UTC - in response to Message 93362.  
Last modified: 28 Oct 2019, 15:36:48 UTC

The authenticator for projects is an MD5 hash of your email address - which is your unique identifier. You shouldn't use it as the password for communications between the BOINC client and the Manager. You have a unique authenticator per project you add.

The 32 character hexadecimal password in gui_rpc_auth.cfg is made at random by the client when it starts and it finds no such file present. You can also have an empty gui_rpc_auth.cfg file, with nothing in it. Just delete the 32 char key and save the file.
As Richard surmises, the file is made and read only at a client start. The "Read config files" option in BOINC manager does not reread this file.
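
To show what a password of that shape looks like, here is a minimal Python sketch that produces a random 32-character hexadecimal string. It only illustrates the format; how the client actually generates its key is not shown here, and the function name is hypothetical.

```python
import secrets


def new_gui_rpc_password() -> str:
    """Return a random 32-character lowercase hexadecimal string."""
    # 16 random bytes are rendered as 32 hex characters.
    return secrets.token_hex(16)
```

If you put your own value into gui_rpc_auth.cfg, remember that the client has to be restarted before it will use it.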

But glad to see that the clean cc_config.xml file worked. I don't think the other one was corrupted; more likely someone tinkered a bit too much with the Event Log options menu. If you want to use some debug flags, that's fine, but read up on what they do at https://boinc.berkeley.edu/wiki/Client_configuration first, and if you're still wondering, just ask in a project forum or here.
ID: 93366 · Report as offensive
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5082
United Kingdom
Message 93426 - Posted: 30 Oct 2019, 18:25:00 UTC

Just to be on the safe side, I thought it might be wise to test that we'd diagnosed this one properly. So I set about half the debug log flags on one of my machines - all the ones in the right-hand column in v7.16.3, to be precise. That includes some biggies like <app_msg_receive> and <rrsim_detail>. And clicked 'apply', with both Manager and Event Log windows visible.

Sure enough, the first actual error message I got was "The password you have provided is incorrect, please try again.". Then, all hell broke out in both windows. Both views cleared, started to redraw, cleared again, flickered and flashed, showed partial information, etc. etc. Clearly couldn't keep up with the flow of GUI RPCs.

I crashed everything down, killed the processes for the science apps which hadn't got the 'quit' messages, and rebooted the machine for luck.

BOINC auto-restarts, of course (I hadn't stopped that), and the windows started flashing again. Another crash stop, another set of process kills, and this time I hand-edited the log flag section of cc_config.xml.

This time, BOINC started cleanly. I've been baby-sitting it for about an hour, and no sign of trouble. The first two projects have reported a batch of tasks each (including some of the tasks which I'd hand-killed) and both sets of tasks validated - no errors.

So, the core BOINC functionality is robust, and came up smelling of roses - full marks. But that "password incorrect" message is a liar: I do have a password set, but the contents of that file haven't changed since September 2013, when I set this machine up.
ID: 93426 · Report as offensive
Les Bayliss
Help desk expert

Joined: 25 Nov 05
Posts: 1654
Australia
Message 93427 - Posted: 30 Oct 2019, 20:05:29 UTC

That's scary.

Perhaps an additional message at the end of the if-else: "Sorry, we haven't a clue what's wrong."
Or at least something that indicates the problem isn't one that's been anticipated.
ID: 93427 · Report as offensive

Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.