GPU container access

Leveraging GPU capabilities within a Podman container is a powerful and efficient way to run GPU-accelerated workloads. The instructions below describe how to set up your operating system so that containers can use the GPU.

Prerequisites

  • NVIDIA Graphics Card (Pascal or later)
  • WSL2 (Hyper-V is not supported)

Procedure

  1. Install the most up-to-date NVIDIA GPU driver on your host machine. Current drivers support WSL 2, so you do not need to download anything else on your host machine for your NVIDIA card. (Example host-side checks for steps 1-3 are sketched after this procedure.)

  2. Verify that WSL2 was installed when installing Podman Desktop.

  3. Create your Podman Machine.

  4. Install the NVIDIA Container Toolkit onto the Podman Machine:

The Podman Machine requires the NVIDIA Container Toolkit to be installed.

You can install it by following the official NVIDIA guide or by running the steps below:

SSH into the Podman Machine:

$ podman machine ssh

Run the following commands on the Podman Machine, not the host system:

$ curl -s -L https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo | \
    sudo tee /etc/yum.repos.d/nvidia-container-toolkit.repo && \
    sudo yum install -y nvidia-container-toolkit && \
    sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml && \
    nvidia-ctk cdi list

A configuration change might occur when you create or remove Multi-Instance GPU (MIG) devices, or upgrade the Compute Unified Device Architecture (CUDA) driver. In such cases, you must generate a new Container Device Interface (CDI) specification.
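
If you want to confirm the prerequisites from steps 1 and 2 and create the machine from step 3 on the command line, the following commands, run from a Windows PowerShell prompt, are one way to do so. The user name in the prompt is only an example, and Podman Desktop can also create the machine from its UI:

PS C:\Users\admin> nvidia-smi           # the Windows driver ships nvidia-smi; any output confirms the driver is installed
PS C:\Users\admin> wsl --status         # confirms WSL is installed; the default version must be 2
PS C:\Users\admin> podman machine init  # creates the Podman Machine
PS C:\Users\admin> podman machine start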

Verification

To verify that containers can access the GPU, run nvidia-smi from within a container that has the NVIDIA drivers installed.

Run the following official NVIDIA container on your host machine:

$ podman run --rm --device nvidia.com/gpu=all nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi

Example output:

PS C:\Users\admin>  podman run --rm --device nvidia.com/gpu=all nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi
Fri Aug 16 18:58:14 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.36                 Driver Version: 546.33       CUDA Version: 12.3     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 3060        On  | 00000000:07:00.0  On |                  N/A |
|  0%   34C    P8              20W / 170W |    886MiB / 12288MiB |      1%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                             |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A        33    G   /Xwayland                                  N/A       |
+---------------------------------------------------------------------------------------+
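
Once verification succeeds, the same --device nvidia.com/gpu=all flag exposes the GPU to any container image of your choice. As an illustration, a CUDA-enabled PyTorch image can confirm that the framework sees the GPU; the image name here is only an example and is not part of this guide:

$ podman run --rm --device nvidia.com/gpu=all docker.io/pytorch/pytorch:latest python -c "import torch; print(torch.cuda.is_available())"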

Troubleshooting

Version mismatch

You might encounter the following error inside the containers:

# nvidia-smi
Failed to initialize NVML: N/A

This problem is caused by a mismatch between the generated Container Device Interface (CDI) specification and the driver version currently installed.

To fix this problem, generate a new CDI specification by running the following inside the Podman machine:

$ sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml

You might need to restart your Podman machine.
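
A minimal end-to-end sequence for this fix, assuming the default rootless machine user, looks like this:

$ podman machine ssh                                            # on the host: open a shell inside the Podman machine
$ sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml    # inside the machine: regenerate the CDI specification
$ nvidia-ctk cdi list                                           # confirm the NVIDIA devices are listed
$ exit
$ podman machine stop                                           # back on the host: restart the machine so the new specification takes effect
$ podman machine start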
