64 bit system suddenly thinks it is 32 bit.

Charles Dennett
Joined: 8 Nov 05
Posts: 24
United States
Message 64302 - Posted: 16 Sep 2015, 14:18:55 UTC

I have three 64 bit Linux systems, all running Fedora 22 with the latest patches. Several weeks ago they started thinking they are 32 bit. They started using the 32 bit executables for the projects I crunch. I found lines like this in the sched_request files for each project:

    <platform_name>x86_64-pc-linux-gnu</platform_name>
    <alt_platform>
        <name>i686-pc-linux-gnu</name>
    </alt_platform>
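
A minimal sketch for checking what each project request advertises, in case it helps anyone comparing hosts. It assumes the request files are named sched_request_*.xml and sit in the BOINC data directory; /var/lib/boinc is only a guess for a packaged install, and the helper name advertised_platforms is made up for illustration. It uses regular expressions rather than an XML parser because only two tags matter here.

```python
import re
from pathlib import Path

# Assumed data directory for a packaged install; adjust for your system.
DATA_DIR = Path("/var/lib/boinc")


def advertised_platforms(request_text):
    """Return (primary platform, list of alternate platforms) from a scheduler request."""
    primary = re.search(r"<platform_name>\s*([^<\s]+)\s*</platform_name>", request_text)
    alt_blocks = re.findall(r"<alt_platform>(.*?)</alt_platform>", request_text, re.S)
    alternates = []
    for block in alt_blocks:
        name = re.search(r"<name>\s*([^<\s]+)\s*</name>", block)
        if name:
            alternates.append(name.group(1))
    return (primary.group(1) if primary else None, alternates)


if __name__ == "__main__":
    # Prints one line per request file found; prints nothing if there are none.
    for path in sorted(DATA_DIR.glob("sched_request_*.xml")):
        primary, alternates = advertised_platforms(path.read_text(errors="replace"))
        print(f"{path.name}: primary={primary} alternates={alternates}")
```

An empty alternates list corresponds to the case where the alt_platform stanza is missing from the request.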


This past weekend one of those systems had a failed disk drive and I had to replace it and rebuild the system from scratch. Still Fedora 22 with all the latest patches. All of a sudden it no longer believes it is 32 bit. The alt_platform stanza no longer appears in any of its sched_request files.

I'm guessing that some package or patch that got installed when this started is confusing whatever it is in the boinc client that decides if it is 64 bit capable or not. Perhaps some 64 bit library was not updated. When I reinstalled the system this past weekend, I grabbed the KDE LiveCD and used that to reinstall. Other than the boinc-client, I installed nothing extra.

I've googled around and found nothing appropriate. Next step is to see if I can grab the source and take a look, although I'm not a developer, only a UNIX/Linux admin, so I may or may not have any luck. Oh, and the boinc client and manager are installed from the packages supplied in the Fedora repository. It is version 7.2.42.

Any hints or suggestions would be appreciated. If more info is needed, let me know.
ID: 64302
Richard Haselgrove
Volunteer tester
Help desk expert
Joined: 5 Oct 06
Posts: 5077
United Kingdom
Message 64303 - Posted: 16 Sep 2015, 14:36:57 UTC - in response to Message 64302.  

Is this related to your question "64 bit systems now using 32 bit application for Linux" at FiND? Do you have the same observations for other projects? I'll attempt an explanation there, but both you and Ben may find it a little surprising.
ID: 64303
Charles Dennett
Joined: 8 Nov 05
Posts: 24
United States
Message 64304 - Posted: 16 Sep 2015, 14:41:09 UTC - in response to Message 64303.  
Last modified: 16 Sep 2015, 14:42:52 UTC

Yes, it is. The other projects I have also have the same lines in their sched_request files on the 2 systems that have been running for who knows how long. They no longer appear in the one I rebuilt this weekend. However, before that, they did.

Since it is apparently happening on other projects, I thought it might be better to ask here.

I realize it's not a critical problem, but it just bothers me and I want to figure out why.
ID: 64304
Charles Dennett
Joined: 8 Nov 05
Posts: 24
United States
Message 64305 - Posted: 16 Sep 2015, 14:47:28 UTC - in response to Message 64304.  

One other thing I just noticed. For the other project I crunch, it has the alt_platform lines in its sched_request file saying it is 32 bit but uses the 64 bit executable. That project supplies both 32 and 64 bit Linux executables.
ID: 64305
Richard Haselgrove
Volunteer tester
Help desk expert
Joined: 5 Oct 06
Posts: 5077
United Kingdom
Message 64306 - Posted: 16 Sep 2015, 15:52:30 UTC

My suggested explanation is available for anyone to read at FiNDAH message 2584. It applies to most projects, but the evidence is particularly stark at FiND.
ID: 64306
Juha
Volunteer developer
Volunteer tester
Help desk expert
Joined: 20 Nov 12
Posts: 801
Finland
Message 64308 - Posted: 16 Sep 2015, 17:27:00 UTC - in response to Message 64306.  

I think your explanation covers pretty well why BOINC server may send either 32-bit or 64-bit apps when both are runnable.

I'll cover this part:

Charles wrote: "This past weekend one of those systems had a failed disk drive and I had to replace it and rebuild the system from scratch. Still Fedora 22 with all the latest patches. All of a sudden it no longer believes it is 32 bit. The alt_platform stanza no longer appears in any of its sched_request files."


The 64-bit Linux version of the BOINC client decides whether it can run 32-bit apps by checking files in the system library directories; if it finds any 32-bit libraries, it concludes that the system supports 32-bit apps. (Or something like that, I can't remember the details.)

You probably installed some 32-bit program in August which then pulled in 32-bit libraries.
ID: 64308
Charles Dennett
Joined: 8 Nov 05
Posts: 24
United States
Message 64311 - Posted: 16 Sep 2015, 17:47:18 UTC - in response to Message 64308.  

There have always been 32 bit and 64 bit libraries on these systems. It's nothing new. I went back through the logs and see where some 32 bit libraries were upgraded (meaning newer versions replaced older versions) but I didn't see where anything was installed. I'll be trying one more thing - a complete removal and installation of boinc. If that doesn't fix it, and I'm not confident it will, I'll live with it unless I get a new idea.

Thanks for the input.

Charlie
ID: 64311
Richard Haselgrove
Volunteer tester
Help desk expert
Joined: 5 Oct 06
Posts: 5077
United Kingdom
Message 64312 - Posted: 16 Sep 2015, 18:05:58 UTC - in response to Message 64311.  

The decision of which application to use (if both are available choices for the system) is made by the project server, on the basis of the application_version history for the HostID of the computer.

If you remove and replace BOINC, but leave the rest of the computer unchanged, a BOINC server will try to recycle any available previous HostID - using your account, computer name, IP address, and probably other things I've forgotten to make the association between hardware and HostID. It does this to avoid unnecessary bloating of the Host table in the database.

So, rather than removing and replacing the BOINC client, what you probably want to do is to force a new HostID, and thus start clean with a new host_app_version table. (make sure you don't immediately fill it with zero-runtime-estimate tasks...)

The best way of doing that is by keeping the old BOINC installation, and forcing an apparent 'cheat' by tweaking the <rpc_seqno> for the project in client_state.xml downwards, so it appears that two separate computers are trying to contact the project scheduler using the same HostID. Someone may need to remind me whether it's necessary to set <allow_multiple_clients> at the same time for this to work with recent server code - or you could experiment.

Flush out all running tasks with NNT before you try this, and update the project 'dry', and inspect for a successful HostID change, before allowing new work.
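
For anyone who would rather script the <rpc_seqno> tweak than hand-edit the file, here is a rough sketch. It assumes client_state.xml wraps each project in <project>...</project> with a <master_url> inside, that the client is stopped while the file is edited (the running client rewrites it), and that you keep a backup copy first. The function name lower_rpc_seqno and the default new_value are illustrative only, and this does not touch <allow_multiple_clients>.

```python
import re


def lower_rpc_seqno(state_text, master_url, new_value=0):
    """Return client_state.xml text with <rpc_seqno> lowered for one project only."""

    def patch_project(match):
        block = match.group(0)
        # Leave every project alone except the one with the requested master URL.
        if f"<master_url>{master_url}</master_url>" not in block:
            return block
        return re.sub(
            r"<rpc_seqno>\s*\d+\s*</rpc_seqno>",
            f"<rpc_seqno>{new_value}</rpc_seqno>",
            block,
            count=1,
        )

    # Non-greedy match so each <project> block is handled separately.
    return re.sub(r"<project>.*?</project>", patch_project, state_text, flags=re.S)
```

The function only transforms text; reading the file, saving a backup, and writing the result back are left to the caller so that nothing is overwritten by accident.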
ID: 64312
Charles Dennett
Joined: 8 Nov 05
Posts: 24
United States
Message 64313 - Posted: 16 Sep 2015, 18:13:24 UTC - in response to Message 64312.  

When I rebuilt the system this past weekend and reinstalled boinc, it took over the old hostid. That's what made me think a complete removal and reinstall might work. But, before I do that I'll give what you suggest a shot and see what happens.
ID: 64313
Charles Dennett
Joined: 8 Nov 05
Posts: 24
United States
Message 64314 - Posted: 16 Sep 2015, 19:27:24 UTC - in response to Message 64313.  

No luck. I first changed the rpc_seqno value to 0. Started boinc. It reregistered the host with a new id number and used the old host entry but still used the 32 bit binary. Then I stopped everything after the few tasks had completed. I removed everything boinc and reinstalled. I attached to the project. It registered the host as new but still used the old host id and still used the 32 bit binary.

Someone suggested perhaps a 32 bit library somewhere is fooling boinc into thinking it can only run 32 bit. I've had both 32 and 64 bit libraries on this system forever so I'm not sure what is causing the confusion.

I guess I'll live with it for now.

Thanks everyone for the suggestions. I really appreciate it.

Charlie
ID: 64314
Juha
Volunteer developer
Volunteer tester
Help desk expert
Joined: 20 Nov 12
Posts: 801
Finland
Message 64316 - Posted: 16 Sep 2015, 19:47:10 UTC - in response to Message 64311.  

Ok, checked what the client does.

It scans the files in /lib, /lib32, /lib/32, /usr/lib, /usr/lib32 and /usr/lib/32. For every file in those directories (following symlinks) it runs either /usr/bin/file or /bin/file, whichever you have.

If the output from the file command contains both "ELF" and "32-bit" then the client concludes that 32-bit apps are supported.

There is <no_alt_platform> but I don't know why it would be set in your re-installed host.
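
To see which file would trip that check on a given machine, something like the following sketch could be used. It is not the client's code, only a re-creation of the description above. It assumes that only the top level of each directory is scanned and that `file -L` is an acceptable way to follow symlinks. The function names are made up for illustration.

```python
import os
import subprocess

# Directories named in the description above.
LIB_DIRS = ["/lib", "/lib32", "/lib/32", "/usr/lib", "/usr/lib32", "/usr/lib/32"]


def looks_like_elf32(file_output):
    """The test described above: the output mentions both "ELF" and "32-bit"."""
    return "ELF" in file_output and "32-bit" in file_output


def find_file_command():
    """Return /usr/bin/file or /bin/file, whichever exists, else None."""
    for candidate in ("/usr/bin/file", "/bin/file"):
        if os.path.exists(candidate):
            return candidate
    return None


def first_elf32_library(lib_dirs=LIB_DIRS):
    """Return the first file that `file -L` reports as 32-bit ELF, or None."""
    file_cmd = find_file_command()
    if file_cmd is None:
        return None
    for directory in lib_dirs:
        if not os.path.isdir(directory):
            continue
        for entry in sorted(os.listdir(directory)):
            path = os.path.join(directory, entry)
            if os.path.isdir(path):
                continue  # assumption: subdirectories are not descended into
            # -L makes `file` follow symlinks instead of describing the link itself.
            result = subprocess.run(
                [file_cmd, "-L", path],
                capture_output=True, text=True, errors="replace",
            )
            if looks_like_elf32(result.stdout):
                return path
    return None
```

Calling first_elf32_library() returns the first path that matches, or None if nothing does. A None result would be consistent with a request that carries no alt_platform stanza.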
ID: 64316
Charles Dennett
Joined: 8 Nov 05
Posts: 24
United States
Message 64317 - Posted: 16 Sep 2015, 20:04:43 UTC - in response to Message 64316.  

On a Linux system the 64 bit libraries are in /lib64 and /usr/lib64. 32 bit libraries are in /usr/lib and /lib. There is no directory named lib32 anywhere that I could find.

I'll give that no_alt_platform a shot and see what happens.

Charlie
ID: 64317
Charles Dennett
Joined: 8 Nov 05
Posts: 24
United States
Message 64318 - Posted: 16 Sep 2015, 20:18:37 UTC - in response to Message 64317.  
Last modified: 16 Sep 2015, 20:18:50 UTC

Added no_alt_paltform to my cc_config.xml file. Didn't seem to make a difference.
ID: 64318
Juha
Volunteer developer
Volunteer tester
Help desk expert
Joined: 20 Nov 12
Posts: 801
Finland
Message 64319 - Posted: 16 Sep 2015, 20:38:22 UTC - in response to Message 64318.  
Last modified: 16 Sep 2015, 20:38:58 UTC

That's weird. Do you have the file in right place? Mistyped the tag name?
ID: 64319
Richard Haselgrove
Volunteer tester
Help desk expert
Joined: 5 Oct 06
Posts: 5077
United Kingdom
Message 64320 - Posted: 16 Sep 2015, 20:41:12 UTC - in response to Message 64319.  
Last modified: 16 Sep 2015, 20:56:26 UTC

Didn't work for me either, in Windows64.

And I'm using BOINC v7.6.9, with cc_config.xml pre-populated with default tags from the GUI - just set the value in the place provided.

Didn't work with a simple 'Read config files', didn't work after a full client restart.

Edit - I did get a huge number of error messages on startup, including

16/09/2015 21:32:44 | FiND@Home | [error] App version has unsupported platform windows_intelx86; changing to windows_x86_64

and

16/09/2015 21:32:44 | LHC@home 1.0 | [error] App version has unsupported platform windows_intelx86; changing to windows_x86_64
16/09/2015 21:32:44 | LHC@home 1.0 | [error] State file error: duplicate app version: sixtrack windows_x86_64 44401 sse3

Edit 2: So now I have

<app_version>
    <app_name>vina</app_name>
    <version_num>102</version_num>
    <platform>windows_x86_64</platform>
    <avg_ncpus>1.000000</avg_ncpus>
    <max_ncpus>1.000000</max_ncpus>
    <flops>618682904.035871</flops>
    <api_version>7.5.0</api_version>
    <file_ref>
        <file_name>vina_1.2_windows_intelx86.exe</file_name>
        <main_program/>
    </file_ref>
</app_version>

- totally muddled, but it's going on fetching new work. Unfortunately, FiND doesn't distinguish the platform work is issued for by plan class, or otherwise display platform or alt_platform in the web lists of tasks issued.

Edit 3: but that <flops> value looks like the APR=0.62 from the 64-bit app_ver in Application details for host 105172. Next step is probably a 'flush all tasks' (which will take far less time than is being estimated with that grotty APR), reset project, and see what comes down the pipe next time.
ID: 64320
Juha
Volunteer developer
Volunteer tester
Help desk expert
Joined: 20 Nov 12
Posts: 801
Finland
Message 64322 - Posted: 16 Sep 2015, 21:06:27 UTC - in response to Message 64320.  

But... it did work, you wouldn't have had those errors otherwise. There doesn't seem to be any log messages though.
ID: 64322
Richard Haselgrove
Volunteer tester
Help desk expert
Joined: 5 Oct 06
Posts: 5077
United Kingdom
Message 64323 - Posted: 16 Sep 2015, 21:11:35 UTC - in response to Message 64322.  

Juha wrote: "But... it did work, you wouldn't have had those errors otherwise. There doesn't seem to be any log messages though."

My initial quick test of working was "Did it download the 64-bit version of the app at the next work request?", and the answer was "no".

It was only as I worked through the logs and other indicators that I saw that it had converted the existing tasks and application_version to run the current workload (including the 32-bit binary) as if it was the 64-bit platform.
ID: 64323
Juha
Volunteer developer
Volunteer tester
Help desk expert
Joined: 20 Nov 12
Posts: 801
Finland
Message 64324 - Posted: 16 Sep 2015, 21:27:38 UTC - in response to Message 64323.  

ok
ID: 64324
Richard Haselgrove
Volunteer tester
Help desk expert
Joined: 5 Oct 06
Posts: 5077
United Kingdom
Message 64325 - Posted: 16 Sep 2015, 21:29:29 UTC

OK, it begins to make more sense. After that batch had completed and reported, I reset the project, and fetched a new batch.

This time I got the 64-bit app, as

<app_version>
    <app_name>vina</app_name>
    <version_num>102</version_num>
    <platform>windows_x86_64</platform>
    <avg_ncpus>1.000000</avg_ncpus>
    <max_ncpus>1.000000</max_ncpus>
    <flops>661977170.668584</flops>
    <api_version>7.5.0</api_version>
    <file_ref>
        <file_name>vina_1.2_windows_x86_64.exe</file_name>
        <main_program/>
    </file_ref>
</app_version>

Note that the APR has already risen slightly. The sched_request also showed <platform_name>windows_x86_64</platform_name>, and no alternates.

But I'm also getting messages like

16/09/2015 22:20:29 | GPUGRID | Message from server: This project doesn't support computers of type windows_x86_64

so I'd better start putting this computer back together without the <no_alt_platform>, and resetting the projects which were 'adjusted'.
ID: 64325
Richard Haselgrove
Volunteer tester
Help desk expert
Joined: 5 Oct 06
Posts: 5077
United Kingdom
Message 64326 - Posted: 16 Sep 2015, 21:41:22 UTC

'Read config files' isn't enough to re-activate <alt_platform>; it needs a full client restart.

But after that, it asked with <alt_platform>, and got the 32-bit application back as before.

So the tag works, but use with care - it operates globally, across all projects, and not all of them necessarily supply work flagged as 64-bit.
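
For reference, a small sketch of toggling the tag programmatically instead of through the GUI. It assumes <no_alt_platform> sits under <options> inside a <cc_config> root, which matches my understanding of the cc_config.xml layout but should be checked against your own file. The helper name set_no_alt_platform is invented for this example. Note that ElementTree drops comments when it re-serialises the file.

```python
import xml.etree.ElementTree as ET


def set_no_alt_platform(config_xml, enabled):
    """Return cc_config.xml text with <no_alt_platform> set to 1 or 0 under <options>."""
    root = ET.fromstring(config_xml)
    if root.tag != "cc_config":
        raise ValueError("expected a <cc_config> root element")
    options = root.find("options")
    if options is None:
        # Assumed layout: create <options> if the file does not have one yet.
        options = ET.SubElement(root, "options")
    tag = options.find("no_alt_platform")
    if tag is None:
        tag = ET.SubElement(options, "no_alt_platform")
    tag.text = "1" if enabled else "0"
    return ET.tostring(root, encoding="unicode")
```

As observed above, the change in the re-activating direction only took effect after a full client restart, and the setting applies to every attached project, so setting it back to 0 is the safer default on a host that also runs projects without 64-bit work.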
ID: 64326

Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.