GPUs on SCARF
Available GPU Hardware
- 12 nodes with dual K40 cards, restricted to the fbioctopus and fbioctopus-exclusive partitions and only usable by the CLF LSF Octopus group
- 6 nodes with four A100 cards (4 A100 devices available per node), available in the gpu partition
GPUs in SLURM
SCARF’s GPU nodes have the same base software payload as the standard SCARF nodes with a few minor differences to support the GPUs. GPU software is integrated into the module system.
All users have access to GPUs; there is no need to request special access to the GPU partitions.
Note that there is currently an interaction with the automatic setting of the constraint to intel when none is specified. As the GPU nodes are AMD systems, you must add the following to your job submission parameters:
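A minimal sketch of the relevant submission parameters; the exact constraint name (`amd` here) is an assumption and should be checked against the current SCARF configuration:

```shell
#SBATCH -p gpu
#SBATCH --constraint=amd   # hypothetical constraint name; overrides the default intel constraint
#SBATCH --gres=gpu:1
```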
GPUs are allocated via Slurm's GRES mechanism and are requested on a per-node basis. This resource allocation is enforced by Linux cgroups, meaning that you will not be able to see GPUs you have not requested. For example, to request an interactive job with 3 CPU cores and 3 GPUs, do:
salloc -p gpu -n 3 --gres=gpu:3
Running nvidia-smi shows 3 GPUs:

nvidia-smi -L
GPU 0: Tesla K80 (UUID: GPU-9fbdcd8e-91f1-5f28-c01b-11645602d26e)
GPU 1: Tesla K80 (UUID: GPU-919735d8-df0e-efef-91e6-9dae7f3c74d5)
GPU 2: Tesla K80 (UUID: GPU-340a466c-6089-429f-b6b7-e2aee435ba51)
But the system actually has 4 GPU devices; cgroups has limited access to the 3 requested:

ls /dev/nvidia?
/dev/nvidia0  /dev/nvidia1  /dev/nvidia2  /dev/nvidia3
SCARF has 6 generally accessible GPU systems, each with 4 A100 GPUs. While the examples in this document use the --gres=gpu:X format, the GPU model can also be requested with --gres=gpu:a100:X. This is somewhat moot currently: although SCARF does have different GPU models (K40 and A100), they are not available in the same partitions, as the K40 systems are restricted to specific communities.
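Putting the above together, a batch submission requesting A100s by model might look like the following sketch (the walltime and application name are placeholders):

```shell
#!/bin/bash
#SBATCH -p gpu               # generally accessible GPU partition
#SBATCH -n 4                 # 4 CPU cores
#SBATCH --gres=gpu:a100:2    # 2 A100 GPUs, requested by model
#SBATCH -t 01:00:00          # hypothetical walltime

nvidia-smi -L                # lists only the 2 allocated GPUs (cgroups hides the rest)
./my_gpu_program             # placeholder for your application
```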
MPI and GPUs
Running MPI jobs works similarly to simple interactive jobs; however, there are some important things to note. The GRES mechanism allocates per node, so a parameter such as --gres=gpu:3 requires every host allocated to the job to have 3 available GPUs. Also, MPI ranks started by the same invocation of srun or mpirun will use the same GPUs as other ranks on the same host.
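As an illustration of the per-node behaviour, a hedged sketch of an MPI submission across 2 nodes (the application name is a placeholder):

```shell
#!/bin/bash
#SBATCH -p gpu
#SBATCH -N 2                 # 2 nodes: each must have 3 free GPUs
#SBATCH --ntasks-per-node=3  # 3 MPI ranks per node
#SBATCH --gres=gpu:3         # GRES is per node, not per job

srun ./my_mpi_app            # ranks on the same host share that host's 3 GPUs
```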