Using GPUs with Fuzzball
Fuzzball intelligently passes GPU resources from the host through to the containers that support your jobs. Once an administrator has properly added and configured resources that include GPUs, you can access them simply by specifying them in your Workflow.
Consider the following Fuzzfile:
version: v1
jobs:
  gpu-check:
    image:
      uri: docker://rockylinux:9
    command:
      - /bin/sh
      - '-c'
      - nvidia-smi
    resource:
      cpu:
        cores: 1
      memory:
        size: 1GB
      devices:
        nvidia.com/gpu: 1
Note that the gpu-check job in this workflow is based on the official Rocky Linux (v9) container on Docker Hub. This container does not have a GPU driver or any NVIDIA-related software installed. However, the Fuzzfile specifies that this job should run the program nvidia-smi to check the status of any GPUs that are present.
The resource section has a devices field that allows you to specify that the job should use GPUs. Since our administrator has configured GPUs under the name nvidia.com/gpu, we can request one by that name. Fuzzball then makes all of the driver-related software available inside the container.
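The count is not limited to one. If your job needs several GPUs, request a larger number under the same device name. The snippet below is a minimal sketch showing only the resource block, and it assumes the host actually exposes at least two GPUs under nvidia.com/gpu:

resource:
  cpu:
    cores: 4
  memory:
    size: 8GB
  devices:
    # assumes at least two GPUs are configured under this device name
    nvidia.com/gpu: 2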
If you are using the workflow editor in the web UI to create your Fuzzfile, you can open the Resources tab in the flyout menu on the right and “Add” a device. As in the example configuration above, entering the nvidia.com/gpu string adds a GPU to the job.

After submitting the workflow, note that the nvidia-smi command completes without any trouble even though the Rocky Linux container has no NVIDIA-related software installed:
Thu Aug 15 20:27:38 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 555.42.06              Driver Version: 555.42.06      CUDA Version: 12.5     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  Tesla V100-SXM2-16GB           Off |   00000000:00:1E.0 Off |                    0 |
| N/A   42C    P0              40W / 300W |        1MiB / 16384MiB |      2%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes:                                                                               |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                              |
+-----------------------------------------------------------------------------------------+
It is important to understand that installing NVIDIA drivers inside your containers is not only unnecessary for GPU workflows, it is also considered poor practice that can lead to compatibility issues. Here’s why:
When executing a GPU workload, Fuzzball automatically identifies the libraries and binaries associated with the NVIDIA driver installed on the host system and injects them into your container at runtime. This ensures that the libraries available inside the container match the kernel modules running on the host.
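You can see this injection for yourself by asking a bare container which driver libraries its programs resolve against. The job below is a minimal sketch that reuses the Rocky Linux image from the example above; the ldd output it prints, and the paths where the injected files appear, will vary with the host driver version and the runtime configuration.

version: v1
jobs:
  show-injected-libs:
    image:
      uri: docker://rockylinux:9
    command:
      - /bin/sh
      - '-c'
      # nvidia-smi and the driver libraries it links against (e.g. libnvidia-ml)
      # are not part of this image; they are injected from the host at runtime
      - command -v nvidia-smi && ldd "$(command -v nvidia-smi)"
    resource:
      cpu:
        cores: 1
      memory:
        size: 1GB
      devices:
        nvidia.com/gpu: 1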
If you manually install GPU drivers inside your container, you risk creating conflicts: the software in your container may find and use those drivers instead of the injected ones, breaking the correspondence between libraries and kernel modules and leading to hard-to-diagnose failures or performance issues.
Important distinction: There is a difference between the NVIDIA driver and CUDA. This is confusing since CUDA is often packaged with the NVIDIA driver. While you should not install GPU drivers in your container, you should install the appropriate version of CUDA to support your GPU-accelerated applications. CUDA provides the programming interface and runtime libraries your applications need, while Fuzzball handles the driver injection automatically.
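In practice this usually means basing the job on an image that already ships the CUDA runtime your application needs. The Fuzzfile below is a minimal sketch; the docker://nvidia/cuda tag shown is an assumption, so choose a tag whose CUDA version matches what your application was built against and that your host driver supports.

version: v1
jobs:
  cuda-check:
    image:
      # the CUDA runtime libraries come from the image (tag is illustrative);
      # the NVIDIA driver itself is still injected from the host by Fuzzball
      uri: docker://nvidia/cuda:12.5.0-runtime-rockylinux9
    command:
      - /bin/sh
      - '-c'
      - nvidia-smi
    resource:
      cpu:
        cores: 1
      memory:
        size: 2GB
      devices:
        nvidia.com/gpu: 1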