Ticket #879 (closed Defect: fixed)

Opened 7 months ago

Last modified 3 months ago

cuda device summary truncated/misleading

Reported by: MarkJ Assigned to: davea
Priority: Undetermined Milestone: Undetermined
Component: Undetermined Version: 6.6.20
Keywords: cuda device Cc:

Description

If a system has multiple GPU cards, the BOINC message listing the CUDA devices is truncated. An example is given below.

04/12/09 01:59:43 CUDA devices: GeForce GTX 295 (driver version 18568, CUDA version 1.3, 896MB, est. 106GFLOPS), GeForce GTX 295 (driver version 18568, CUDA version 1.3, 896MB, est. 106GFLOPS), GeForce GTX 295 (driver version 18568, CUDA version 1.3, 896MB, est. 106GFLOPS)

The user in question had two GTX 295 cards installed (each card carries two GPUs), so the message should report four devices; however, the last one is cut off.

I suggest that each CUDA device found be displayed on a separate line, showing its device number plus its details. That way, people with a mixture of devices can see which CUDA device is which, along with its details, without having to scan a long string. This also resolves the truncation issue.

Also, the CUDA version displayed is misleading. The number shown is in fact the compute capability of the card, which does not change even when the device drivers are updated. The CUDA version is determined by the device drivers/DLL files and should not be confused with the device's compute capability.
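To illustrate the distinction, here is a minimal sketch in Python. It relies on the CUDA runtime's convention that `cudaDriverGetVersion` reports the driver-supported CUDA version as a single integer (1000 × major + 10 × minor), while compute capability is a per-device major/minor pair. The function names and the sample value 2020 are hypothetical and are not taken from the BOINC source or from this ticket.

```python
def decode_cuda_version(encoded: int) -> str:
    """Decode a driver-reported CUDA version integer.

    The CUDA runtime encodes the version as 1000 * major + 10 * minor,
    so 2020 means CUDA 2.2. This value changes when drivers are updated.
    """
    major, rest = divmod(encoded, 1000)
    minor = rest // 10
    return f"{major}.{minor}"


def format_compute_capability(major: int, minor: int) -> str:
    """Format a device's compute capability, a fixed hardware property."""
    return f"{major}.{minor}"


# Hypothetical values: a GTX 295 (compute capability 1.3) with a driver
# that reports CUDA version 2020.
print("Compute Capability", format_compute_capability(1, 3))
print("CUDA version", decode_cuda_version(2020))
```

Updating the driver would change only the second line of output; the first is fixed by the hardware.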

Suggested approach:

04/12/09 01:59:43 CUDA device 0: GeForce GTX 295 (driver version 18568, Compute Capability 1.3, 896MB, est. 106GFLOPS)
04/12/09 01:59:43 CUDA device 1: GeForce GTX 295 (driver version 18568, Compute Capability 1.3, 896MB, est. 106GFLOPS)
04/12/09 01:59:43 CUDA device 2: GeForce GTX 295 (driver version 18568, Compute Capability 1.3, 896MB, est. 106GFLOPS)
04/12/09 01:59:43 CUDA device 3: GeForce GTX 295 (driver version 18568, Compute Capability 1.3, 896MB, est. 106GFLOPS)
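The suggested per-device format could be sketched as follows. This is an illustration in Python, not the BOINC client's code: `GpuInfo`, `describe`, `summary_lines`, and `joined_summary` are hypothetical names, and the 256-character `limit` is an assumed message-size cap chosen only to reproduce the kind of truncation reported here. The ticket does not state the real cause or limit.

```python
from dataclasses import dataclass


@dataclass
class GpuInfo:
    """Hypothetical record of the fields shown in the log message."""
    name: str
    driver_version: int
    cc_major: int
    cc_minor: int
    memory_mb: int
    est_gflops: int


def describe(gpu: GpuInfo) -> str:
    """Build the parenthesised detail string for one device."""
    return (
        f"{gpu.name} (driver version {gpu.driver_version}, "
        f"Compute Capability {gpu.cc_major}.{gpu.cc_minor}, "
        f"{gpu.memory_mb}MB, est. {gpu.est_gflops}GFLOPS)"
    )


def summary_lines(gpus: list) -> list:
    """One message per device, prefixed with its device number."""
    return [f"CUDA device {i}: {describe(g)}" for i, g in enumerate(gpus)]


def joined_summary(gpus: list, limit: int = 256) -> str:
    """Single combined message, cut at an assumed size limit."""
    text = "CUDA devices: " + ", ".join(describe(g) for g in gpus)
    return text[:limit]


# Two GTX 295 cards expose four devices in total.
gpus = [GpuInfo("GeForce GTX 295", 18568, 1, 3, 896, 106) for _ in range(4)]
for line in summary_lines(gpus):
    print(line)
```

Because each device gets its own short message, no single message approaches the assumed limit, whereas `joined_summary(gpus)` loses devices as the list grows.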

Change History

04/19/09 17:00:12 changed by davea

  • status changed from new to closed.
  • resolution set to fixed.

(In [17847]) - client: improve CPU sched debug messages

(say what kind of job and why we're scheduling it)

- client: log messages describing GPUs: one line per GPU; fixes #879

04/23/09 11:01:26 changed by romw

(In [17865]) - client: improve CPU sched debug messages

(say what kind of job and why we're scheduling it)

- client: log messages describing GPUs: one line per GPU; fixes #879

client/

cpu_sched.cpp

lib/

coproc.cpp,h

05/02/09 05:02:57 changed by MarkJ

  • status changed from closed to reopened.
  • resolution deleted.

Changes were made as indicated above; however, the CUDA device number is not given anywhere in 6.6.25. Debugging messages do not indicate which device is being acted upon, and CUDA detection does not report the device number.

05/02/09 05:28:12 changed by romw

  • owner set to davea.
  • status changed from reopened to new.

06/03/09 13:29:51 changed by davea

  • status changed from new to closed.
  • resolution set to fixed.

(In [18277]) - client: include device number in message describing NVIDIA GPU,

and call it "NVIDIA GPU" rather than "CUDA device" fixes #879

07/06/09 09:01:09 changed by romw

(In [18546]) - client: include device number in message describing NVIDIA GPU,

and call it "NVIDIA GPU" rather than "CUDA device" fixes #879

lib/

coproc.cpp

