Problem building client with VS2017

Message boards : BOINC client : Problem building client with VS2017

Joseph Stateson
Volunteer tester
Joined: 27 Jun 08
Posts: 641
United States
Message 90002 - Posted: 11 Feb 2019, 6:32:36 UTC

I have been using VS2017 for some time and have several repositories at GitHub.

I wanted to debug a problem I am seeing with the RX 570 AMD boards, so I forked boinc and attempted to build the client.

The build requires VS2013, which I don't want to install since I am using VS2017. I changed the properties on boinc and libboinc to point to the latest SDK and retargeted both to VS2017. There were a boatload of errors. The first one was that curl.h was missing. I downloaded that and put it in one of the include paths. That fixed the compile error, but I suspect the program won't find the library if I ever get it to run. I then started on the next error: "CLIENT_ID" type redefinition. That showed up under a comment about MinGW_W64 defining this. Well, this system does have MinGW_W64, as it was needed for a project that did not use Visual Studio. I am guessing that VS2017 somehow picked up the MinGW env (include) variables??? That is suspicious.

Questions:
1. Has this been built with VS2017?

2. I have another system I can put VS2013 and that v120_xp SDK on (that's the SDK for VS2013). Will that solve all the problems, including the missing curl stuff?

3. How is the actual client for Windows built? If MinGW_W64 is how Berkeley is building the Windows version, should I switch to MinGW? Unfortunately, its debugger is nothing like MSVC's, and I suspect I will have to throw in print statements to see what is happening.

Richard Haselgrove
Volunteer tester
Help desk expert
Joined: 5 Oct 06
Posts: 5077
United Kingdom
Message 90004 - Posted: 11 Feb 2019, 8:35:04 UTC - in response to Message 90002.  

I've successfully built with VS2013, but I haven't tried VS2017.

I think you would have to update the whole vs2013 solution file - Microsoft probably has an import wizard which will break the back of that process, but you may still need to tweak some lines manually.

You would also need to assemble a Windows build dependencies tree for VS2017 (that gives you curl, openssl, wxwidgets, and zlib - there may be more). I don't know how much the format changes between versions.

Juha
Volunteer developer
Volunteer tester
Help desk expert
Joined: 20 Nov 12
Posts: 801
Finland
Message 90013 - Posted: 11 Feb 2019, 15:59:54 UTC - in response to Message 90002.  

1. Has this been built with VS2017?


Not quite there yet. I have a sneak preview on GitHub, but it's not complete yet.

The code does need some updating for VS2017 but I haven't seen the errors you have. But the sneak preview is configured for v141_xp so maybe the newer SDK you use has some stuff the old one doesn't.

2. I have another system I can put VS2013 and that v120_xp SDK on (that's the SDK for VS2013). Will that solve all the problems, including the missing curl stuff?


Right now VS2013 gets you coding faster. Besides the page about Git and dependencies that Richard linked to, see also "Compiling BOINC client software".

3. How is the actual client for windows built?


The official release is built with VS2010, and that's what David probably uses day to day. I'm not sure what Charlie uses, and everyone else uses VS2013.

Richard Haselgrove
Volunteer tester
Help desk expert
Joined: 5 Oct 06
Posts: 5077
United Kingdom
Message 90017 - Posted: 11 Feb 2019, 16:28:17 UTC - in response to Message 90013.  

... and everyone else uses VS2013.
Including the CI ('continuous integration') automated testing which takes place within minutes after every code change. If you build on VS2013, and you see that the AppVeyor check has passed on the branch you're interested in, you know you can concentrate on the logic and not worry about syntax errors.

Joseph Stateson
Volunteer tester
Joined: 27 Jun 08
Posts: 641
United States
Message 90026 - Posted: 11 Feb 2019, 19:30:24 UTC - in response to Message 90017.  
Last modified: 11 Feb 2019, 19:40:19 UTC

Made some progress in locating the problem.

No, did not get vs2017 working with boinc. Did not get VS2013 working although I did find an ISO and downloaded it and will look at its installation later.

Ageless: I got your message and will (eventually) go over to GitHub to report the problem, which I suspect is AMD driver related.

My problem: usually any system that has an RX560, RX570 or RX Vega has twice as many GPUs show up in BOINC as there really are. This locks up the VNC service, which then makes it difficult to get in and see what is happening.

Sometimes using Revo Uninstaller gets rid of all AMD drivers and things are back to normal until the next Microsoft upgrade, when a new AMD driver goes in and all hell breaks loose. Since the VNC server hangs, I had to install an OpenSSH server to get in and terminate BOINC.

I read the following in boinc\client\gpu_detect.cpp:

    // client-specific GPU code. Mostly GPU detection
    //
    // theory of operation:
    // there are two ways of detecting GPUs:
    // - vendor-specific libraries like CUDA and CAL,
    // which detect only that vendor's GPUs
    // - OpenCL, which can detect multiple types of GPUs,
    // including nvidia/amd/intel as well was new types
    // such as ARM integrated GPUs
    //
    // These libraries sometimes crash,
    // and we've been unable to trap these via signal and exception handlers.
    // So we do GPU detection in a separate process (boinc --detect_gpus)
    // This process writes an XML file "coproc_info.xml" containing
    // - lists of GPU detected via CUDA and CAL
    // - lists of nvidia/amd/intel GPUs detected via OpenCL
    // - a list of other GPUs detected via OpenCL
    //
    // When the process finishes, the client parses the info file



So that file is created and then re-read to see what was found!
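
For anyone following along, the pattern that comment describes (run the crash-prone detection in a child process, have it write a file, parse the file afterwards) can be sketched roughly as below. This is a Python illustration, not BOINC's code: the child script and the element names `coprocs`, `gpu` and `name` are made up for the example, and in BOINC the child's role is played by `boinc --detect_gpus`.

```python
import subprocess
import sys
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path

# Hypothetical stand-in for the detection step: it only writes a fixed result.
CHILD_SCRIPT = """
import sys
with open(sys.argv[1], "w") as f:
    f.write("<coprocs><gpu><name>example</name></gpu></coprocs>")
"""

def detect_in_child(info_path, script=CHILD_SCRIPT):
    """Run detection in a separate process, then parse the file it wrote."""
    result = subprocess.run([sys.executable, "-c", script, str(info_path)])
    if result.returncode != 0 or not info_path.exists():
        # The child failed or wrote nothing; the parent carries on with no GPUs.
        return []
    root = ET.parse(info_path).getroot()
    return [gpu.findtext("name", default="") for gpu in root.iter("gpu")]

def demo():
    """Run the sketch against a temporary file and return the parsed names."""
    with tempfile.TemporaryDirectory() as tmp:
        return detect_in_child(Path(tmp) / "coproc_info.xml")
```

The point of the design is that a crash inside a vendor library only takes down the child; the parent sees a non-zero exit code or a missing file and keeps running.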

Sure enough, I looked at coproc_info.xml and there were 4 GPUs. The first two had the latest OpenCL driver; the bottom two had the previous one, which was supposedly uninstalled.

I used Notepad to edit that XML and removed the two at the bottom. I then made coproc_info.xml read-only so it could not be updated. When I started BOINC I was back to my two GPUs, but BUT B.U.T.

Unfortunately, the 2nd gpu was fake. At least there are only 2 to worry about, not 4.

This is what I mean by fake: note the values on the 2nd board. The flops should be identical.



I am going to look into editing coproc_info.xml; maybe that can fix the problem. I have another system with an RX570 and an RX560 that does not have the problem (it used to). I am going to look into its coproc_info.xml for some clue. I am not letting it update its drivers. Unaccountably, I cannot restore the system with 2 RX570s to make it work again.

This is far from being fixed: VNC still locks up when accessing the system, and that fake board does not even run tasks correctly (they never finish).

Juha
Volunteer developer
Volunteer tester
Help desk expert
Joined: 20 Nov 12
Posts: 801
Finland
Message 90027 - Posted: 11 Feb 2019, 20:57:45 UTC - in response to Message 90026.  

If you haven't done it already, run clinfo, either the one from AMD driver package (if any) or the one here.

If clinfo reports more GPUs than you really have, then there is no point in looking at BOINC's code. It's really the drivers / OpenCL runtime that's confused.

Joseph Stateson
Volunteer tester
Joined: 27 Jun 08
Posts: 641
United States
Message 90028 - Posted: 11 Feb 2019, 22:56:31 UTC
Last modified: 11 Feb 2019, 23:12:10 UTC

The problem has always been the drivers. That being said, Windows and GPU-Z have never reported more GPUs than actually exist unlike BOINC.

I ran that clinfo test but leave the analysis to you. The results are in this zip file.
clinfo_nocrossfire.txt was run with crossfire disabled (more on that later)
clinfo_dif.txt is the difference between crossfire and no crossfire (FWIW).
coproc_info.xml lists 4 GPUs: the first 2 on the more recent OpenCL driver, the second two on an older driver that I CANNOT YET FIND A WAY TO UNINSTALL.

One of the driver installs, probably one that Windows did on its own, enabled crossfire. I cannot use crossfire, and it causes problems with one of my apps, "DVDFAB". I noticed it was enabled when GPU-Z showed the frequency was "0" on the second RX570. I disabled crossfire and GPU-Z is back to normal as shown. By back to normal I mean that both boards are running near 100% and the second RX570 will eventually finish a work unit, and quickly, unlike when crossfire was enabled.

However, I still have 2 fake GPUs (on driver 2671.3) and must resort to the trick of editing that XML file to ensure that BOINC sees only the first 2 GPUs.
Currently I have been using Revo Uninstaller to remove excess stuff. Revo executes the AMD uninstall package, then scans the registry and lets me clean up all references left over after the uninstall. It seems I need something better. I would like to avoid "cleaner", which I think is spyware. I will look for a better AMD cleaner. What concerns me is that Windows and GPU-Z seem to have no problem with the extraneous driver. The pulldown box for GPU-Z had only 2 boards listed. Windows Device Manager shows only 2 devices but does allow me to roll back to the previous driver, so it knows there is another.

Thanks for looking.



[EDIT] This may be a Windows thing. This is on my BOINC farm and I could have used Linux, but I noticed the motherboard had a license in the BIOS so I put in Windows for free. I can probably put in a flash drive with Ubuntu and solve this problem in an hour.

Joseph Stateson
Volunteer tester
Joined: 27 Jun 08
Posts: 641
United States
Message 90033 - Posted: 12 Feb 2019, 6:04:19 UTC - in response to Message 90028.  

Finally got it working w/o having to modify that coproc_info.xml file, using driver 17.12.2, but it probably works with later versions too.

This is what I think caused the problem, but it is just a guess: an unwanted update from Microsoft came in, causing a driver problem. I installed a driver for the RX570 from my set of drivers but probably failed to uninstall the one that came in. I am guessing that left an older (or maybe newer?) OpenCL on the system. I tried several drivers, uninstalling and reinstalling, and I assume I must have uninstalled one that happened to match the leftover OpenCL. Then, when I put in 17.12.2, there were no other OpenCL libraries.
My guess is that AMD's uninstall only removes the OpenCL that was installed with the package, i.e. no "cleanup" of OpenCL. Possibly the custom install would give an option for a clean install. I consistently used the express install.

Richard Haselgrove
Volunteer tester
Help desk expert
Joined: 5 Oct 06
Posts: 5077
United Kingdom
Message 90034 - Posted: 12 Feb 2019, 8:15:17 UTC - in response to Message 90033.  

That sounds very plausible. OpenCL is supposed to be a universal, platform independent, programming language. I've seen Windows 10 install an NVidia driver from Microsoft, which had CUDA support but not OpenCL. Later, I let the same machine download an Intel GPU driver from Microsoft, which came with OpenCL. Lo and behold, when I put the NVidia card back in the machine, that had OpenCL support too, using the Intel tools.

There are BOINC users who advocate using 'clean install' and a driver removal utility at every update: I think that's overkill, but worth having available as a standby in case problems like this crop up.

Joseph Stateson
Volunteer tester
Joined: 27 Jun 08
Posts: 641
United States
Message 90044 - Posted: 12 Feb 2019, 14:38:24 UTC

On the subject of VS2017 ….

Googling around, I read that VS2013 apps can be built on VS2017 if the v120_xp SDK is available as a retarget. The recommendation was to install VS2013 Community and then upgrade to VS2017 Community. This was because the SDK was not available, or at least a few members on Stack Overflow and other forums were unable to find an installable SDK (v120_xp).

I tried just building the boinc client (modules libboinc and boinc), but retargeted to the latest SDK and for x64 only, as that was what I was interested in. There were 96 source files that compiled correctly (an obj was built). Module boinc could not be built because of a problem in diagnostics_win.cpp (one file). libboinc could not be built because of problems in 4 files (the same diagnostics_win and 3 includes).

It certainly looks like VS2017 could build the client, but that is easier said than done. Does anyone have an installable SDK that provides the "v120_xp" target? If so, PM me where to find it.

Thanks for looking!

Juha
Volunteer developer
Volunteer tester
Help desk expert
Joined: 20 Nov 12
Posts: 801
Finland
Message 90060 - Posted: 12 Feb 2019, 17:56:55 UTC - in response to Message 90028.  

I'm counting 3 GPUs on the clinfo output but it looks like it was cut short.

The next time you get too many GPUs, check out the HKEY_LOCAL_MACHINE\SOFTWARE\Khronos\OpenCL\Vendors key. Or you can use Process Monitor to see what clinfo is really doing. Or you can build clinfo and OpenCL.dll yourself and then step through them.
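
A rough sketch of checking that key from Python's standard `winreg` module follows. It relies on the Khronos ICD loader convention that each value under the key is named after a vendor DLL and is loaded when its DWORD data is 0. It looks only at this one key in the registry view of the running Python, so drivers registered anywhere else will not show up; treat it as a starting point rather than a complete inventory.

```python
import sys

VENDORS_KEY = r"SOFTWARE\Khronos\OpenCL\Vendors"

def read_icd_entries():
    """Return (dll_name, dword) pairs from the Vendors key (Windows only)."""
    import winreg  # standard library, present only on Windows
    entries = []
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, VENDORS_KEY) as key:
        value_count = winreg.QueryInfoKey(key)[1]  # number of values in the key
        for i in range(value_count):
            name, data, _value_type = winreg.EnumValue(key, i)
            entries.append((name, data))
    return entries

def enabled_icds(entries):
    """Keep the entries whose DWORD is 0, i.e. the ones the loader will open."""
    return [name for name, data in entries if data == 0]

if __name__ == "__main__" and sys.platform == "win32":
    for dll in enabled_icds(read_icd_entries()):
        print(dll)
```

Two enabled entries from the same vendor would be a hint that an older OpenCL runtime was left behind by an uninstall.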

It's a bit odd that you are the only one having this problem, or at least I can't recall anyone else reporting the same.

btw. Oblomov's clinfo pulls out more information than the clinfo in our download directory, and he's got AppVeyor building the executables.

Juha
Volunteer developer
Volunteer tester
Help desk expert
Joined: 20 Nov 12
Posts: 801
Finland
Message 90061 - Posted: 12 Feb 2019, 18:11:55 UTC - in response to Message 90028.  

You are correct in reporting this as a driver problem, but one would think that the boinc client should give a warning or two.

1. multiple library problem

From that zip I posted earlier I copied the following sections that indicated a problem*

<opencl_driver_version>2766.5</opencl_driver_version>
<device_num>0</device_num>
<peak_flops>5095424000000.000000</peak_flops>
<opencl_available_ram>4294967296.000000</opencl_available_ram>
<opencl_device_index>0</opencl_device_index>

<opencl_driver_version>2766.5</opencl_driver_version>
<device_num>1</device_num>
<peak_flops>5095424000000.000000</peak_flops>
<opencl_available_ram>4294967296.000000</opencl_available_ram>
<opencl_device_index>1</opencl_device_index>

<opencl_driver_version>2671.3</opencl_driver_version>
<device_num>2</device_num>
<peak_flops>5095424000000.000000</peak_flops>
<opencl_available_ram>4294967296.000000</opencl_available_ram>
<opencl_device_index>0</opencl_device_index>

<opencl_driver_version>2671.3</opencl_driver_version>
<device_num>3</device_num>
<peak_flops>5095424000000.000000</peak_flops>
<opencl_available_ram>4294967296.000000</opencl_available_ram>
<opencl_device_index>1</opencl_device_index>


* Since the boinc client knows there are only 2 physical GPUs in the system, it could draw the conclusion, after parsing the above sections, that devices 2 and 3 do not exist and at a minimum issue a warning to the user.


Well, see, that's the problem. How does BOINC know how many GPUs there really are?

BOINC only uses CAL, CUDA and OpenCL to check what GPUs are available. Those are what science apps use, and it's a good thing that BOINC and science apps have the same view of the machine.
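
For what it's worth, the only signal visible in the posted excerpt is that two different OpenCL driver versions each report devices. A user-side script could flag that as a hint, along the lines of the Python sketch below. It is an assumption-laden illustration, not something BOINC does: the excerpt omits the enclosing elements, so the `devices` and `device` wrappers are hypothetical, and a machine with GPUs from two vendors would legitimately show two driver versions.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

def devices_by_driver(xml_text):
    """Group device_num values by opencl_driver_version."""
    root = ET.fromstring(xml_text)
    groups = defaultdict(list)
    for dev in root.iter("device"):  # "device" is a hypothetical wrapper element
        version = dev.findtext("opencl_driver_version")
        groups[version].append(int(dev.findtext("device_num")))
    return dict(groups)

def looks_duplicated(xml_text):
    """True if more than one OpenCL driver version reports devices."""
    return len(devices_by_driver(xml_text)) > 1

# Values copied from the excerpt above; the wrapper elements are invented.
SAMPLE = """<devices>
  <device><opencl_driver_version>2766.5</opencl_driver_version><device_num>0</device_num></device>
  <device><opencl_driver_version>2766.5</opencl_driver_version><device_num>1</device_num></device>
  <device><opencl_driver_version>2671.3</opencl_driver_version><device_num>2</device_num></device>
  <device><opencl_driver_version>2671.3</opencl_driver_version><device_num>3</device_num></device>
</devices>"""
```

On the sample, devices 0 and 1 land under 2766.5 and devices 2 and 3 under 2671.3, which is the pattern described in the quoted report.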

2. Crossfire enabled
Obviously, crossfire and SLI are used by gamers. Unfortunately, it seems that the boinc client assigns tasks to the GPU that is the slave. From my experience, tasks that take minutes to complete on the master take hours or days on the slave. Possibly OpenCL could be coded to handle this properly so that both tasks run at 100% efficiency. Some projects (or at least setiathome) push different versions of tasks onto the user and compare the results to see which ones finish earlier, to decide which versions to send. This could be misleading if one of the GPUs was the slave on a crossfire or SLI system.


My understanding is that CrossFire and SLI shouldn't matter for OpenCL. If they do, then IMHO that's a driver bug and you need to report it to AMD.

I don't know if it's possible to detect CrossFire with OpenCL. From the clinfo output, the only remotely usable difference was the other GPU's max clock frequency of 300 MHz, and that's not much.

Juha
Volunteer developer
Volunteer tester
Help desk expert
Send message
Joined: 20 Nov 12
Posts: 801
Finland
Message 90062 - Posted: 12 Feb 2019, 18:16:16 UTC - in response to Message 90044.  

On the subject of VS2017 ….


Did you happen to use the Autotools build system before Visual Studio? If you did, then wipe out everything Autotools created. It may be that the build is picking up a config.h generated for MinGW.

Otherwise, take the sledgehammer approach: rename your MinGW directory and see if the problems go away. If they do, then it's something on your system. I have MSYS2/MinGW installed but I don't have the same problems.
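
A gentler first check than renaming the directory is to look at the INCLUDE environment variable, which the MSVC compiler consults for header search paths. The Python sketch below lists entries that mention MinGW or MSYS. The marker strings are just a guess at typical directory names, and include directories set in the Visual Studio project properties are not covered by this.

```python
import os

def suspicious_include_dirs(include_value, sep=";", markers=("mingw", "msys")):
    """Return entries of an INCLUDE-style path list that mention MinGW or MSYS."""
    dirs = [d for d in include_value.split(sep) if d]
    return [d for d in dirs if any(m in d.lower() for m in markers)]

if __name__ == "__main__":
    # Run this from the same command prompt the build is started from.
    for d in suspicious_include_dirs(os.environ.get("INCLUDE", "")):
        print("possible MinGW include dir:", d)
```

If it prints nothing, the environment variable is probably not how MinGW headers are getting in, and a stray config.h is the more likely suspect.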

Joseph Stateson
Volunteer tester
Joined: 27 Jun 08
Posts: 641
United States
Message 90347 - Posted: 28 Feb 2019, 2:38:23 UTC
Last modified: 28 Feb 2019, 2:40:46 UTC

Driver update from AMD fixed a few things.

I combined two systems into one by using a pair of 1x risers: one RX560 on an x16 slot and two RX560s on 1x risers.

Device Manager showed all 3, Radeon 19.1.1 showed only 2, and BOINC only saw 2. The crossfire option was missing. GPU-Z showed 2 boards at 100% and one idling.

I then put all 3 boards on 3 x16 slots (electrically 16, 8, 8), which cannot normally be used as there is a heat problem (no fans on this open system). Anyway, this was worse. Device Manager showed 3, but Radeon showed only 1 board and BOINC showed only 1. GPU-Z showed two idling. I assume this was a crossfire problem. I looked again for the crossfire selection but it was not there.

I then upgraded to 19.2.3 and used the pair of 1x risers again. The crossfire option showed up and I was able to unselect crossfire. BOINC saw all 3 RX560s and GPU-Z showed that all 3 were under 100% load. The risers are nice because they separate the graphics boards, which then run cool w/o any fans. They all run SETI. The increase in speed over CPU is so large that it was more efficient to put all 3 boards on a single motherboard.

I have now put VS2013 on a separate system along with that required SDK and will look at a boinc build just on that system.

Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.