Multi core tasks alongside single core tasks.

Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5081
United Kingdom
Message 111922 - Posted: 29 May 2023, 9:05:32 UTC - in response to Message 111921.  

Currently, I've only run Amicable in MT mode. The Einstein tasks are defined as GPU-only, but I've allocated them a full CPU core via app_config.xml because of the OpenCL overhead. I run NumberFields as a simple, lightweight, CPU-only project.
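
For readers unfamiliar with the file, here is a minimal sketch of the kind of app_config.xml described above, generated from Python so the values are easy to vary. The app name `example_gpu_app` is a placeholder, not a real Einstein@Home app name, and the usage values are illustrative rather than the poster's actual settings. Only the element names follow BOINC's app_config format.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name: str, gpu_usage: float, cpu_usage: float) -> str:
    """Return app_config.xml text that budgets cpu_usage cores per GPU task."""
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    gpu_versions = ET.SubElement(app, "gpu_versions")
    # Fraction of a GPU each task is assumed to occupy.
    ET.SubElement(gpu_versions, "gpu_usage").text = str(gpu_usage)
    # Number of CPU cores budgeted for each GPU task.
    ET.SubElement(gpu_versions, "cpu_usage").text = str(cpu_usage)
    ET.indent(root)  # pretty-print; needs Python 3.9 or later
    return ET.tostring(root, encoding="unicode")


# "example_gpu_app" is a hypothetical name used only for illustration.
xml_text = build_app_config("example_gpu_app", gpu_usage=1.0, cpu_usage=1.0)
print(xml_text)
```

With `cpu_usage=1.0` each GPU task is budgeted a whole core. A `gpu_usage` of 0.5 would instead tell the client that two such tasks may share one GPU.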

Have you seen #5257? Several bits of fine-tuning in there.
ID: 111922 · Report as offensive
Dave
Help desk expert
Joined: 28 Jun 10
Posts: 2538
United Kingdom
Message 111923 - Posted: 29 May 2023, 10:25:35 UTC - in response to Message 111922.  

> Currently, I've only run Amicable in MT mode. The Einstein tasks are defined as GPU-only, but I've allocated them a full CPU core via app_config.xml because of the OpenCL overhead. I run NumberFields as a simple, lightweight, CPU-only project.
>
> Have you seen #5257? Several bits of fine-tuning in there.

Just looked: the Einstein ones that crashed were all Gravity Wave, 4 CPU + GPU. The Gamma-ray one, which Manager shows as 1 CPU + 1 NVIDIA GPU, is now at 18%, so it is clearly not subject to whatever made the others crash about 1 minute in.
ID: 111923 · Report as offensive
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5081
United Kingdom
Message 111924 - Posted: 29 May 2023, 11:06:30 UTC - in response to Message 111923.  

> Just looked: the Einstein ones that crashed were all Gravity Wave, 4 CPU + GPU
That's odd - I didn't think they had such a beast. I've only ever run tasks designed as 1 GPU plus fractional CPU, although I've usually controlled them with app_config, often to run two per GPU.

The latest Gravity Wave GPU tasks have a high GPU memory demand, so I can't run two per card - although I can run 1 GW plus 1 Gamma-ray. They seem to be running down the supply of GW tasks at the moment, so I'm only getting resends: I expect that's in preparation to start clean with a new batch of data.
ID: 111924 · Report as offensive
Dave
Help desk expert
Joined: 28 Jun 10
Posts: 2538
United Kingdom
Message 111925 - Posted: 29 May 2023, 16:22:17 UTC
Last modified: 29 May 2023, 16:43:34 UTC

Just started testing again, with 30% of cores available. One 4-core Amicable task is running, plus one 1-CPU Einstein task and one 1-CPU + 1-GPU Einstein task. That is actually 37.5% of the cores. Even if, as I believe is the case with some GPU tasks, the load on the CPU is minimal, the other 5 cores in use would constitute 31.25% of available cores. So clearly an overcommit.
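
The percentages in this post can be checked with a few lines of Python. This is only a sketch of the arithmetic, assuming the 16-core host mentioned elsewhere in the thread; the function and variable names are made up for illustration.

```python
def share_of_cores(task_cores, total_cores):
    """Percentage of the host's cores claimed by the given tasks."""
    return 100 * sum(task_cores) / total_cores


TOTAL_CORES = 16     # assumption: the 16-core host mentioned in the thread
LIMIT_PERCENT = 30   # the CPU percentage setting used in this test

all_tasks = [4, 1, 1]  # Amicable MT task, Einstein CPU task, Einstein CPU+GPU task
cpu_only = [4, 1]      # the same, ignoring the CPU side of the GPU task

for label, tasks in (("all tasks", all_tasks), ("ignoring GPU task", cpu_only)):
    share = share_of_cores(tasks, TOTAL_CORES)
    over = share > LIMIT_PERCENT
    print(f"{label}: {share}% in use, limit {LIMIT_PERCENT}%, over limit: {over}")
```

Either way of counting puts the running tasks above the 30% setting, which is the overcommit being reported.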

Now to increase core count and see what happens.

I increased %CPUs to 55. (Not something I normally do, as on the type of tasks I typically run there is no gain from using more than 50%.) None of the three Einstein tasks waiting to run has started. Still just the four-core Amicable Numbers task and two single-core Einstein tasks, one of which also uses the GPU.

Now I know it does not only affect CPDN work waiting to run.
ID: 111925 · Report as offensive
Dave
Help desk expert
Joined: 28 Jun 10
Posts: 2538
United Kingdom
Message 111926 - Posted: 29 May 2023, 19:39:00 UTC
Last modified: 29 May 2023, 20:05:08 UTC

Not sure: are the only non-VB ATLAS tasks the ones that need cvmfs? I am currently looking at the cmake output log to try and determine where cmake fell over.

Edit: I needn't have bothered for all the sense it made. Not like Make where it tells you what was missing!
ID: 111926 · Report as offensive
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5081
United Kingdom
Message 111927 - Posted: 29 May 2023, 20:43:32 UTC - in response to Message 111926.  

Can't help, I'm afraid - I just installed whatever was needed from the CERN cerncvm repository - can't find the installation guide right now.

computezrmle (a few posts ago) is an expert in this field.
ID: 111927 · Report as offensive
Dave
Help desk expert
Joined: 28 Jun 10
Posts: 2538
United Kingdom
Message 111928 - Posted: 29 May 2023, 20:54:37 UTC - in response to Message 111927.  

> Can't help, I'm afraid - I just installed whatever was needed from the CERN cerncvm repository - can't find the installation guide right now.
>
> computezrmle (a few posts ago) is an expert in this field.


I tried following the guide for Debian doing it from scratch.
ID: 111928 · Report as offensive
Dave
Help desk expert
Joined: 28 Jun 10
Posts: 2538
United Kingdom
Message 111929 - Posted: 29 May 2023, 20:57:06 UTC
Last modified: 29 May 2023, 21:00:16 UTC

No worries. If I get any more tasks from CPDN before the East Asia ones arrive, I can always run other stuff in VB where it can just do its own thing. I would like to chase the issue down and work out what the problem is but more out of academic interest than necessity.

Edit: Just found the .deb files on the CERN site.
ID: 111929 · Report as offensive
computezrmle

Joined: 2 Feb 22
Posts: 81
Germany
Message 111931 - Posted: 30 May 2023, 4:58:26 UTC - in response to Message 111925.  

> Just started testing again. 30% of cores available. ...
> ... increased %CPUs to 55. ...

I haven't recently tested BOINC's behaviour regarding the core percentage setting.

At least in the distant past, it worked in intervals.
On Dave's 16 core computer the interval is 100/16=6.25
Hence, each setting 50 <= x < 56.25 results in the same number of cores that will be used.

=> It shouldn't make a difference whether it is set to 50% or 55% but you may see a difference if you set 57%.
ID: 111931 · Report as offensive
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5081
United Kingdom
Message 111932 - Posted: 30 May 2023, 5:52:16 UTC - in response to Message 111931.  

Yes, it's always 'next integer below'.
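
The interval behaviour discussed here can be sketched in a few lines, assuming the client simply rounds the percentage down to a whole number of cores, as described in this thread. The function name is made up for illustration and is not taken from the BOINC source.

```python
import math


def usable_cores(total_cores: int, percent: float) -> int:
    """Whole cores for a percentage setting, rounded down ('next integer below')."""
    return math.floor(total_cores * percent / 100)


# On a 16-core host each step is 100 / 16 = 6.25 percentage points wide.
for percent in (50, 55, 56.25, 57):
    print(f"{percent}% of 16 cores -> {usable_cores(16, percent)} cores")
```

Under that assumption 50% and 55% both give 8 cores, while anything from 56.25% upwards gives 9.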
ID: 111932 · Report as offensive
Dave
Help desk expert
Joined: 28 Jun 10
Posts: 2538
United Kingdom
Message 111933 - Posted: 30 May 2023, 7:02:38 UTC
Last modified: 30 May 2023, 7:03:34 UTC

> Edit: Just found the .deb files on the CERN site.


cvmfs_2.10.1~1+ubuntu22.04_amd64.deb
cvmfs-fuse3_2.10.1~1+ubuntu22.04_amd64.deb
The first one fails with

The following packages have unmet dependencies.
cvmfs: Depends:cvmfs-config-default but it is not installable or cvmfs-config but it is not installable

I probably have time to sort this before Atlas tasks become available again.
ID: 111933 · Report as offensive
computezrmle

Joined: 2 Feb 22
Posts: 81
Germany
Message 111935 - Posted: 30 May 2023, 8:38:52 UTC - in response to Message 111933.  

> The following packages have unmet dependencies.
> cvmfs: Depends:cvmfs-config-default but it is not installable or cvmfs-config but it is not installable

You also need this package:
http://ecsft.cern.ch/dist/cvmfs/cvmfs-config/cvmfs-config-default_latest_all.deb

The first thing you need to do after installation is to run (once!):
[sudo] cvmfs_config setup
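
As a rough sketch of the sequence described above, the snippet below only builds and prints the commands; it does not run them. It assumes the three .deb files named in this thread sit in the current directory and that they are installed with `apt install ./file.deb`, which is an assumption on my part and not necessarily how anyone here installed them.

```python
import shlex

# File names as they appear in this thread; adjust to whatever was downloaded.
PACKAGES = [
    "cvmfs-config-default_latest_all.deb",
    "cvmfs_2.10.1~1+ubuntu22.04_amd64.deb",
    "cvmfs-fuse3_2.10.1~1+ubuntu22.04_amd64.deb",
]


def install_commands(packages):
    """Return the commands as argument lists: install first, then one-time setup."""
    # Passing all files to one apt call lets it resolve the config dependency.
    install = ["sudo", "apt", "install"] + [f"./{name}" for name in packages]
    setup = ["sudo", "cvmfs_config", "setup"]  # run once after installation
    return [install, setup]


for command in install_commands(PACKAGES):
    print(shlex.join(command))
```

Listing the config package alongside the others is what avoids the unmet dependency error quoted above.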

Never directly modify the options in any *.conf file below /etc/cvmfs
Instead create a corresponding *.local file and write the modification to that file.
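
To illustrate the naming rule, here is a small helper that maps a *.conf path to the *.local file that should carry the modification. The helper name is made up, and the path is given only as an example of a file below /etc/cvmfs.

```python
from pathlib import PurePosixPath


def local_override_path(conf_path: str) -> PurePosixPath:
    """Return the *.local file that corresponds to a *.conf file."""
    path = PurePosixPath(conf_path)
    if path.suffix != ".conf":
        raise ValueError(f"not a .conf file: {conf_path}")
    # Only the final suffix changes; directory and base name stay the same.
    return path.with_suffix(".local")


print(local_override_path("/etc/cvmfs/default.conf"))
```

Any option you want to change then goes into that .local file, leaving the packaged .conf file untouched.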

> I probably have time to sort this before Atlas tasks become available again.

That's why I suggested CMS from LHC@home dev, since this has MT tasks available and only needs VirtualBox.
ID: 111935 · Report as offensive
Dave
Help desk expert
Joined: 28 Jun 10
Posts: 2538
United Kingdom
Message 111936 - Posted: 30 May 2023, 9:20:31 UTC - in response to Message 111935.  

> That's why I suggested CMS from LHC@home dev, since this has MT tasks available and only needs VirtualBox.


Thanks, I was planning on testing with non VB MT tasks first. Will have a think about this.

And thanks for the link to the needed .deb. I now have my machine set up for the non-VB tasks.
ID: 111936 · Report as offensive
Dave
Help desk expert
Joined: 28 Jun 10
Posts: 2538
United Kingdom
Message 111940 - Posted: 30 May 2023, 21:44:15 UTC
Last modified: 30 May 2023, 21:45:18 UTC

Now downloading a 7 core task from LHC. Might pause it in order to be awake to see what happens.

Edit: It is a VB one.
ID: 111940 · Report as offensive
Dave
Help desk expert
Joined: 28 Jun 10
Posts: 2538
United Kingdom
Message 111944 - Posted: 31 May 2023, 6:14:48 UTC - in response to Message 111940.  

Same happens as with Amicable Numbers. The 7-core task started and 2 Einstein tasks stopped. I increased the number of available cores to 8: neither of the Einstein tasks started, and LHC started to download a task for 8 cores.
ID: 111944 · Report as offensive
computezrmle

Joined: 2 Feb 22
Posts: 81
Germany
Message 111945 - Posted: 31 May 2023, 7:00:11 UTC - in response to Message 111940.  

> Now downloading a 7 core task from LHC. Might pause it in order to be awake to see what happens.
>
> Edit: It is a VB one.

Pausing tasks from LHC@home can be tricky.
CMS: If they pause for more than 2h they lose connection to WMAgent and cancel their subtask; depending on the runtime they either finish or try to get another subtask
ATLAS native: always restarts from scratch
Theory native: requires cgroups v1 and a special preparation; does not work on cgroups v2


> LHC started to download a task for 8 cores

ATLAS has a server side max core limit of 8 for Vbox and 12 for native tasks.
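
As a sketch of how such a cap plays out, assume the task is sized to the number of cores the client offers, up to the server-side limit. That sizing rule is my assumption based on the 7-core and 8-core downloads reported in this thread, and the names below are made up for illustration.

```python
# Server-side core limits quoted above.
ATLAS_MAX_CORES = {"vbox": 8, "native": 12}


def atlas_task_cores(available_cores: int, flavour: str) -> int:
    """Cores a task would get: what the client offers, capped by the server limit."""
    return min(available_cores, ATLAS_MAX_CORES[flavour])


for cores in (7, 8, 16):
    print(cores, "cores offered ->",
          atlas_task_cores(cores, "vbox"), "(vbox),",
          atlas_task_cores(cores, "native"), "(native)")
```

So on a 16-core host set to 100%, a VirtualBox task would still stop at 8 cores under this assumption.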
ID: 111945 · Report as offensive
Dave
Help desk expert
Joined: 28 Jun 10
Posts: 2538
United Kingdom
Message 111946 - Posted: 31 May 2023, 9:53:11 UTC

> Pausing tasks from LHC@home can be tricky.
I only paused it before it started, which won't cause problems. Even after increasing the core count to 100%, the Einstein task didn't restart while the 8-core task from LHC was running. So I have my answer now: it isn't just MT tasks from Amicable Numbers causing the issue.
ID: 111946 · Report as offensive
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5081
United Kingdom
Message 111948 - Posted: 31 May 2023, 10:43:00 UTC - in response to Message 111946.  

I wonder why I didn't see it - what's the difference?

I wouldn't mind looking through a complete cycle of cpu_sched_debug, if you could put one where I can see it? Compare it with the one I posted on GitHub for DA, on the different issue I saw.
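
For anyone following along, cpu_sched_debug is one of the log flags set in cc_config.xml. The sketch below generates a minimal file with just that flag switched on; it is an illustration of the file layout, not a copy of anyone's actual configuration, and the function name is made up.

```python
import xml.etree.ElementTree as ET


def build_cc_config(flags):
    """Return cc_config.xml text with each named log flag set to 1."""
    root = ET.Element("cc_config")
    log_flags = ET.SubElement(root, "log_flags")
    for flag in flags:
        ET.SubElement(log_flags, flag).text = "1"
    ET.indent(root)  # pretty-print; needs Python 3.9 or later
    return ET.tostring(root, encoding="unicode")


cc_config_text = build_cc_config(["cpu_sched_debug"])
print(cc_config_text)
```

The file lives in the BOINC data directory, and the client has to re-read its config files (or be restarted) before the extra scheduler lines show up in the event log.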
ID: 111948 · Report as offensive
Dave
Help desk expert
Joined: 28 Jun 10
Posts: 2538
United Kingdom
Message 111949 - Posted: 31 May 2023, 10:54:59 UTC - in response to Message 111948.  
Last modified: 31 May 2023, 10:55:45 UTC

> I wonder why I didn't see it - what's the difference?
>
> I wouldn't mind looking through a complete cycle of cpu_sched_debug, if you could put one where I can see it? Compare it with the one I posted on GitHub for DA, on the different issue I saw.
I will wait for the current tasks to finish, then set it up to record from the start of the single-core tasks. That should be in about a couple of hours.
ID: 111949 · Report as offensive
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5081
United Kingdom
Message 111950 - Posted: 31 May 2023, 11:14:59 UTC - in response to Message 111949.  

That would be great - I'll keep an eye on things when I get back from lunch.
ID: 111950 · Report as offensive

Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.