I assume this is a windows-7 "Bug" ...

edwardpf

Joined: 18 Feb 11
Posts: 20
United States
Message 45744 - Posted: 22 Sep 2012, 4:23:41 UTC
Last modified: 22 Sep 2012, 4:24:28 UTC

I am running BOINC v6.12.34 (I have been running benchmarks for about 6 months and don't want to change any underlying software ... yet) on a Core i7 860 with 16 GB of RAM and 2 NVIDIA cards. Windows 7 is patched to current patch levels.

I am running SETI only (3 WUs on even CPUs, 3 WUs on odd CPUs, 2 WUs on the fast GPU and 1 WU on the slow GPU). GPU work is done on CPUs 0 and 7.

I am running 4 copies of BOINC, started by the following ".bat" file:

rem
rem cd C:\BOINC_test_data_1
start /affinity 54 C:\BOINC_test_programs\boinc.exe --gui_rpc_port 31416 --dir C:\BOINC_test_data_1 --detach
rem
rem cd C:\BOINC_test_data_2
start /affinity 2A C:\BOINC_test_programs\boinc.exe --gui_rpc_port 31417 --dir C:\BOINC_test_data_2 --allow_multiple_clients --detach
rem
cd C:\BOINC_test_GPU_1
start /affinity 81 C:\BOINC_test_programs\boinc.exe --gui_rpc_port 31418 --dir C:\BOINC_test_GPU_1 --allow_multiple_clients --detach
REM
cd C:\BOINC_test_GPU_2
start /affinity 81 C:\BOINC_test_programs\boinc.exe --gui_rpc_port 31419 --dir C:\BOINC_test_GPU_2 --allow_multiple_clients --detach
REM
cd C:\BOINC_test_data_1
start C:\BOINC_test_programs\boincmgr.exe /s
rem
cd ..\users\ ...
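
For readers unfamiliar with `start /affinity`: the value is a hexadecimal bitmask in which bit N selects logical CPU N. A minimal Python sketch (the helper name `cpus_from_affinity_mask` is made up for illustration) that decodes the three masks used in the .bat file above:

```python
def cpus_from_affinity_mask(hex_mask: str) -> list[int]:
    """Return the logical CPU numbers selected by a hexadecimal affinity mask."""
    mask = int(hex_mask, 16)
    cpus = []
    bit = 0
    while mask:
        if mask & 1:          # lowest bit set -> this CPU is allowed
            cpus.append(bit)
        mask >>= 1
        bit += 1
    return cpus


for mask in ("54", "2A", "81"):
    print(mask, cpus_from_affinity_mask(mask))
# 54 [2, 4, 6]
# 2A [1, 3, 5]
# 81 [0, 7]
```

This matches the layout described in the post: one CPU client on the even CPUs 2,4,6, one on the odd CPUs 1,3,5, and both GPU clients sharing CPUs 0 and 7.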

This has, as far as I can tell, always run correctly.

But if I comment out some stuff so I can run a subset of WU's like this:

rem
rem cd C:\BOINC_test_data_1
start /affinity 54 C:\BOINC_test_programs\boinc.exe --gui_rpc_port 31416 --dir C:\BOINC_test_data_1 --detach
rem
rem cd C:\BOINC_test_data_2
rem start /affinity 2A C:\BOINC_test_programs\boinc.exe --gui_rpc_port 31417 --dir C:\BOINC_test_data_2 --allow_multiple_clients --detach
rem
rem cd C:\BOINC_test_GPU_1
rem start /affinity 81 C:\BOINC_test_programs\boinc.exe --gui_rpc_port 31418 --dir C:\BOINC_test_GPU_1 --allow_multiple_clients --detach
REM
rem cd C:\BOINC_test_GPU_2
rem start /affinity 81 C:\BOINC_test_programs\boinc.exe --gui_rpc_port 31419 --dir C:\BOINC_test_GPU_2 --allow_multiple_clients --detach
REM
cd C:\BOINC_test_data_1
start C:\BOINC_test_programs\boincmgr.exe /s
rem
cd ..\users\...

I will, on occasion (about one in 10 times), see Windows Task Manager report that the system is 13% busy and that ONLY CPU 2 is being used, at 100%. If I have Task Manager display the process affinity, the WUs are correctly set to 2,4,6, but Win-7 is only scheduling them on CPU 2.

If I manually reset the 3 WUs to 2,4,6, they will then schedule on 2,4,6.
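
The 13% figure is consistent with exactly one logical CPU being saturated. A rough sketch of the arithmetic, assuming 8 logical CPUs (the thread refers to CPUs 0 through 7) and that each WU fully loads whichever CPU it lands on; `overall_load_percent` is a made-up helper name:

```python
def overall_load_percent(busy_cpus: int, total_cpus: int = 8) -> float:
    """Overall utilisation when `busy_cpus` logical CPUs each run at 100%."""
    return 100.0 * busy_cpus / total_cpus


print(overall_load_percent(1))  # 12.5 -> what Task Manager shows as about 13%
print(overall_load_percent(3))  # 37.5 -> expected when the 3 WUs spread over CPUs 2,4,6
```

So when things work, the three WUs should show roughly 37.5% overall; seeing about 13% means all three are sharing a single CPU.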

I ASSUME this is a Win-7 bug and NOT an EPF (me) bug.

Any comments or ideas??

Ed F
ID: 45744
edwardpf

Joined: 18 Feb 11
Posts: 20
United States
Message 47297 - Posted: 15 Jan 2013, 23:42:22 UTC - in response to Message 45744.  

Well ... I see no one wanted to reply to this question ... but I'll try adding this extra info and maybe it will ring a bell with someone ...

I just joined MilkyWay@home and have been running fine there (on CPUs 1,2,5, and 7) BUT I got a WU with a remaining time estimate of 87600:00:00.

The WU will actually finish in about 48 Hrs.

In order to give the WU "more power" I let it run on all 8 CPUs. As it happens, the MT WU will only run on 4 CPUs, so I let the queue run dry and returned the MT WU to CPUs 1,3,5, and 7. The WU only consumed 25% of the computer (CPUs 1 and 7) instead of the 50% it should (an MT WU using 4 CPUs, with affinity set to 1,3,5,7). If I set affinity to all 8 CPUs, the WU would consume 50% of the computer across the 8 available CPUs.

With affinity set to 0,2,4,6, it would run only on 0,2,4.

After resetting affinity to 1,3,5,7 ... again only CPUs 1 & 7 were busy.

This sounds to me like the same problem as above, BUT with the MT threads being scheduled on a few shared CPUs (1 & 7) and NOT across the allowed 1,3,5,7.


ANY Ideas if Win-7 has an issue with the scheduler and setting affinity??

Thanks

Ed F



ID: 47297
Joe Bloggs

Joined: 6 Jan 13
Posts: 40
Hong Kong
Message 47304 - Posted: 16 Jan 2013, 9:57:53 UTC - in response to Message 45744.  

Just curious, how are you stipulating that two WUs run on the fast card and one on the slow card? All your config is way over my head but this is something I may want to do if I ever get a new card...
ID: 47304
SekeRob2

Joined: 6 Jul 10
Posts: 585
Italy
Message 47315 - Posted: 16 Jan 2013, 17:38:26 UTC - in response to Message 47304.  
Last modified: 16 Jan 2013, 17:40:10 UTC

An interesting challenge, which with the new app_config.xml of client 7.0.42 and up would be

<max_concurrent>3</max_concurrent>

To limit to 3 at most.

But where they go on your GPUs [of different capability]... random?

<app_config>
   <app>
      <name>scienceshortname</name>
      <max_concurrent>3</max_concurrent>
      <gpu_versions>
         <gpu_usage>.5</gpu_usage>
         <cpu_usage>.5</cpu_usage>
      </gpu_versions>
   </app>
</app_config>


The above limits each GPU to at most 2 tasks, with half a CPU reserved for each; you will need to determine whether that works or needs to be higher/lower. Maybe BOINC is smart enough to put 2 on the most capable card... trial and error to learn [and report back to forums so we get a little wiser too, TYVMIA]
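
Before dropping a hand-edited app_config.xml into a project directory, it may be worth checking that it is well-formed, since a single mismatched closing tag makes the whole file unparseable. A minimal sketch using Python's standard `xml.etree.ElementTree`; the function name `summarize_app_config` is made up here, and `scienceshortname` is the same placeholder as above, not a real application name:

```python
import xml.etree.ElementTree as ET

APP_CONFIG = """\
<app_config>
   <app>
      <name>scienceshortname</name>
      <max_concurrent>3</max_concurrent>
      <gpu_versions>
         <gpu_usage>.5</gpu_usage>
         <cpu_usage>.5</cpu_usage>
      </gpu_versions>
   </app>
</app_config>
"""


def summarize_app_config(xml_text: str) -> dict:
    """Parse an app_config.xml string and return the settings of its first <app>."""
    root = ET.fromstring(xml_text)  # raises ET.ParseError if the XML is malformed
    app = root.find("app")
    return {
        "name": app.findtext("name"),
        "max_concurrent": int(app.findtext("max_concurrent")),
        "gpu_usage": float(app.findtext("gpu_versions/gpu_usage")),
        "cpu_usage": float(app.findtext("gpu_versions/cpu_usage")),
    }


print(summarize_app_config(APP_CONFIG))
```

This only checks the XML syntax and reads back the values; it says nothing about whether a given BOINC client version accepts these options.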
Coelum Non Animum Mutant, Qui Trans Mare Currunt
ID: 47315
edwardpf

Joined: 18 Feb 11
Posts: 20
United States
Message 47317 - Posted: 16 Jan 2013, 18:23:13 UTC - in response to Message 47304.  
Last modified: 16 Jan 2013, 18:26:31 UTC

I have 4 BOINC processes running. Each BOINC instance is set up in its own directory (see above, i.e. ... dir C:\BOINC_test_data_1 or --dir C:\BOINC_test_GPU_2, etc.).

In each directory you have control over that BOINC instance just as if it were the only one running, via the normal configuration techniques, especially "cc_config.xml".

The fast GPU's cc_config.xml looks like this:

<cc_config>
<options>
<use_all_gpus>1</use_all_gpus>
<ignore_cuda_dev>1</ignore_cuda_dev>
<ncpus>0</ncpus>
<max_file_xfers_per_project>8</max_file_xfers_per_project>
<save_stats_days>360</save_stats_days>
<http_transfer_timeout>120</http_transfer_timeout>
</options>
</cc_config>

app_info.xml contains

<coproc>
<type>CUDA</type>
<count>0.5</count>
</coproc>


The slow GPU's cc_config.xml looks like this:

<cc_config>
<options>
<use_all_gpus>1</use_all_gpus>
<ignore_cuda_dev>0</ignore_cuda_dev>
<ncpus>0</ncpus>
<max_file_xfers_per_project>8</max_file_xfers_per_project>
<save_stats_days>360</save_stats_days>
<http_transfer_timeout>120</http_transfer_timeout>
</options>
</cc_config>

app_info.xml contains

<coproc>
<type>CUDA</type>
<count>1.0</count>
</coproc>
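
The `<count>` values are what produce the 2-on-fast, 1-on-slow split. A rough sketch of the arithmetic, assuming the client keeps starting tasks on a GPU until the summed `count` would exceed 1.0 (`tasks_per_gpu` is a made-up helper name, not a BOINC function):

```python
import math


def tasks_per_gpu(count: float) -> int:
    """How many tasks fit on one GPU if each task claims `count` of it."""
    return math.floor(1.0 / count)


print(tasks_per_gpu(0.5))  # 2 -> fast GPU client
print(tasks_per_gpu(1.0))  # 1 -> slow GPU client
```

Because each GPU has its own BOINC instance with its own app_info.xml, the two cards can use different `count` values.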

Does this help??

Ed F
ID: 47317
Joe Bloggs

Joined: 6 Jan 13
Posts: 40
Hong Kong
Message 47331 - Posted: 17 Jan 2013, 3:53:53 UTC

Has anyone posted a wish for the "count" parameter of gpus to be user-editable--so that a fast gpu can count as two to receive more jobs while a slow one can count as 0.5 to receive no jobs other than those that the user hand-edits to run on 0.5 gpus?
ID: 47331


Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.