Enabling GPU Passthrough in OpenStack with Kolla Ansible for High-Performance Workloads

In the world of high-performance computing (HPC) and artificial intelligence (AI), GPUs are the unsung heroes, delivering the computational power needed for demanding workloads. If you're using OpenStack with Kolla Ansible and want to unleash the full potential of GPUs for your virtual machines, this guide is for you. We'll dive into the why and how of GPU passthrough, making your OpenStack environment ready for cutting-edge applications.
Why GPU Passthrough?
By default, virtual machines (VMs) in OpenStack don't have direct access to physical GPUs. This limitation can bottleneck performance for workloads like:
- AI Model Training: Deep learning frameworks like TensorFlow and PyTorch thrive on GPU acceleration.
- Scientific Simulations: High-precision simulations demand parallel processing power.
- Rendering and Visualization: Tasks like 3D modeling and video editing are GPU-intensive.
GPU passthrough allows a VM to access a physical GPU directly, ensuring native-level performance. This is particularly useful for organizations running private clouds with OpenStack, enabling them to maximize the value of their GPU investments.
Preparing Your Environment
Before diving into OpenStack configurations, let's ensure the hardware and host system are ready.
Step 1: Install the GPU(s) in the Compute Nodes
- You can have multiple compute nodes with multiple GPUs
- This even works with consumer-grade cards like the GTX / RTX series
- You won't have to pay any additional subscription fees; all you need is the official NVIDIA driver
- In this tutorial we will use two compute nodes
- compute01 - NVIDIA Tesla T4 (SR-IOV)
- compute02 - NVIDIA RTX 4090 (PCI)
- All steps below need to be carried out on the compute nodes
- We can check for the presence of NVIDIA GPU(s) like this:
sudo lspci -nn | grep NVIDIA
Example
Compute01: NVIDIA Tesla T4
[root@compute01 ~]# lspci -nn | grep NVIDIA
41:00.0 3D controller [0302]: NVIDIA Corporation TU104GL [Tesla T4] [10de:1eb8] (rev a1)
42:00.0 3D controller [0302]: NVIDIA Corporation TU104GL [Tesla T4] [10de:1eb8] (rev a1)
[root@compute01 ~]# lspci -s 41:00.0 -k
41:00.0 3D controller: NVIDIA Corporation TU104GL [Tesla T4] (rev a1)
Subsystem: NVIDIA Corporation Device 12a2
Kernel driver in use: nouveau
Kernel modules: nouveau
[root@compute01 ~]# lspci -s 42:00.0 -k
42:00.0 3D controller: NVIDIA Corporation TU104GL [Tesla T4] (rev a1)
Subsystem: NVIDIA Corporation Device 12a2
Kernel driver in use: nouveau
Kernel modules: nouveau
- Check whether the GPU supports SR-IOV. Some cards, like the Tesla T4, do, but not all of them. We will need this information later when configuring the passthrough.
sudo lspci -s 41:00.0 -vvvv | grep -i SR
sudo lspci -s 42:00.0 -vvvv | grep -i SR
Example
[root@compute01 ~]# sudo lspci -s 41:00.0 -vvvv | grep -i SR
Capabilities: [bcc v1] Single Root I/O Virtualization (SR-IOV)
[root@compute01 ~]# sudo lspci -s 42:00.0 -vvvv | grep -i SR
Capabilities: [bcc v1] Single Root I/O Virtualization (SR-IOV)
Both cards support SR-IOV.
Step 2: Verify Hardware Support
- Check BIOS Settings:
- Ensure your CPU and motherboard support IOMMU (VT-d for Intel, AMD-Vi for AMD).
- Enable IOMMU in the BIOS settings.
- Enable IOMMU on the Host OS:
- Modify the kernel parameters in /etc/default/grub:
sudo vi /etc/default/grub
...
# AMD (add to the existing GRUB_CMDLINE_LINUX_DEFAULT parameters)
GRUB_CMDLINE_LINUX_DEFAULT="amd_iommu=on"
# Intel (add to the existing GRUB_CMDLINE_LINUX_DEFAULT parameters)
GRUB_CMDLINE_LINUX_DEFAULT="intel_iommu=on"
...
sudo grub2-mkconfig -o /boot/grub2/grub.cfg
reboot
- Verify IOMMU Activation: After rebooting, confirm IOMMU is enabled:
dmesg | grep -e DMAR -e IOMMU
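If dmesg confirms IOMMU is active, you can optionally also inspect the IOMMU groups via sysfs; the GPU (together with its other functions, such as an audio device) should sit in its own group, which is what clean passthrough requires. A small sketch:
# Optional: list every device per IOMMU group
for g in /sys/kernel/iommu_groups/*; do
  echo "IOMMU group ${g##*/}:"
  for d in "$g"/devices/*; do
    echo "  $(lspci -nns "${d##*/}")"
  done
done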
Step 3: Prepare the GPU
1. Blacklist Default Drivers: Prevent the default GPU drivers from loading by adding the following to /etc/modprobe.d/blacklist-nouveau.conf:
blacklist nouveau
blacklist lbm-nouveau
options nouveau modeset=0
alias nouveau off
alias lbm-nouveau off
Regenerate the initramfs:
sudo dracut --force
2. Bind the GPU to the VFIO Driver:
- Identify the GPU vendor and product IDs:
lspci -nn | grep -i nvidia
Example 1: NVIDIA Tesla T4
[root@compute01 ~]# lspci -nn | grep NVIDIA
41:00.0 3D controller [0302]: NVIDIA Corporation TU104GL [Tesla T4] [10de:1eb8] (rev a1)
42:00.0 3D controller [0302]: NVIDIA Corporation TU104GL [Tesla T4] [10de:1eb8] (rev a1)
- We note down the vendor ID and product ID: 10de:1eb8
Example 2: NVIDIA RTX 4090
[root@compute02 ~]# lspci -nn | grep NVIDIA
21:00.0 VGA compatible controller [0300]: NVIDIA Corporation AD102 [GeForce RTX 4090] [10de:2684] (rev a1)
21:00.1 Audio device [0403]: NVIDIA Corporation AD102 High Definition Audio Controller [10de:22ba] (rev a1)
a1:00.0 VGA compatible controller [0300]: NVIDIA Corporation AD102 [GeForce RTX 4090] [10de:2684] (rev a1)
a1:00.1 Audio device [0403]: NVIDIA Corporation AD102 High Definition Audio Controller [10de:22ba] (rev a1)
c1:00.0 VGA compatible controller [0300]: NVIDIA Corporation AD102 [GeForce RTX 4090] [10de:2684] (rev a1)
c1:00.1 Audio device [0403]: NVIDIA Corporation AD102 High Definition Audio Controller [10de:22ba] (rev a1)
- We note down the IDs 10de:2684 (VGA) and 10de:22ba (Audio)
- Bind the GPU to the vfio-pci driver by adding this to /etc/modprobe.d/vfio.conf:
NVIDIA Tesla T4 (node: compute01)
cat > /etc/modprobe.d/vfio.conf << EOF
# for ids=..., specify the vendor-ID:device-ID pair(s)
## for VGA and audio functions, separate the IDs with a comma, e.g. ids=10de:2684,10de:22ba
options vfio-pci ids=10de:1eb8
EOF
echo 'vfio-pci' > /etc/modules-load.d/vfio-pci.conf
NVIDIA RTX 4090 (node: compute02) - here we need both IDs, VGA and audio
cat > /etc/modprobe.d/vfio.conf << EOF
# for ids=..., specify the vendor-ID:device-ID pair(s)
## the VGA and audio function IDs are separated by a comma
options vfio-pci ids=10de:2684,10de:22ba
EOF
echo 'vfio-pci' > /etc/modules-load.d/vfio-pci.conf
Reboot the compute node(s)
reboot
After the reboot, we can do a quick check to validate that everything is working as expected. A nice and simple way is to use virt-host-validate, which ships with the libvirt packages:
yum -y install libvirt-daemon-common
virt-host-validate
Example
[root@compute01 ~]# virt-host-validate
QEMU: Checking for hardware virtualization : PASS
QEMU: Checking if device /dev/kvm exists : PASS
QEMU: Checking if device /dev/kvm is accessible : PASS
QEMU: Checking if device /dev/vhost-net exists : PASS
QEMU: Checking if device /dev/net/tun exists : PASS
QEMU: Checking for cgroup 'cpu' controller support : PASS
QEMU: Checking for cgroup 'cpuacct' controller support : PASS
QEMU: Checking for cgroup 'cpuset' controller support : PASS
QEMU: Checking for cgroup 'memory' controller support : PASS
QEMU: Checking for cgroup 'devices' controller support : PASS
QEMU: Checking for cgroup 'blkio' controller support : PASS
QEMU: Checking for device assignment IOMMU support : PASS
QEMU: Checking if IOMMU is enabled by kernel : PASS
QEMU: Checking for secure guest support : PASS
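In addition to virt-host-validate, we can confirm that nouveau is no longer loaded and that the GPUs are now claimed by the vfio-pci driver (the PCI addresses below are the compute01 examples from earlier; adjust them for your own devices):
# should print nothing, since nouveau is blacklisted
lsmod | grep nouveau
# "Kernel driver in use" should now read vfio-pci instead of nouveau
lspci -s 41:00.0 -k
lspci -s 42:00.0 -k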
Configuring Kolla Ansible for GPU Passthrough
Step 1: Update Nova Configuration in Kolla Ansible for the compute node(s)
On your Kolla deployment node (where your /etc/kolla directory lives), create per-node Nova configuration overrides for the compute node(s).
mkdir -p /etc/kolla/config/nova/compute01
mkdir -p /etc/kolla/config/nova/compute02
compute01
Notice that we use device_type "type-PF" since the NVIDIA Tesla T4 cards support SR-IOV.
cat > /etc/kolla/config/nova/compute01/nova.conf << EOF
[pci]
device_spec = { "vendor_id": "10de", "product_id": "1eb8" }
passthrough_whitelist = { "vendor_id": "10de", "product_id": "1eb8" }
alias = { "vendor_id":"10de", "product_id":"1eb8", "device_type":"type-PF", "name":"tesla-t4" }
EOF
compute02
Notice that we use device_type "type-PCI" since the NVIDIA RTX 4090 cards are not SR-IOV, and again we list both the VGA and audio IDs.
cat > /etc/kolla/config/nova/compute02/nova.conf << EOF
[pci]
device_spec = { "vendor_id": "10de", "product_id": "2684" }
device_spec = { "vendor_id": "10de", "product_id": "22ba" }
passthrough_whitelist = { "vendor_id": "10de", "product_id": "2684" }
passthrough_whitelist = { "vendor_id": "10de", "product_id": "22ba" }
alias = { "vendor_id":"10de", "product_id":"2684", "device_type":"type-PCI", "name":"rtx-4090" }
EOF
Run the kolla-ansible reconfigure for the node(s)
kolla-ansible -i multinode reconfigure -t nova --limit compute01,compute02
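Once the reconfigure has finished, a quick sanity check is to grep the rendered config on the compute node. This assumes a default Kolla Ansible layout, where the merged nova.conf ends up at /etc/nova/nova.conf inside the nova_compute container (use podman instead of docker if that is your container engine):
# on compute01 / compute02: verify the [pci] section made it into the container config
docker exec nova_compute grep -A 4 '^\[pci\]' /etc/nova/nova.conf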
Step 2: Update Nova Configuration in Kolla Ansible for the control node(s)
Nova API
Notice that we need to define the aliases for all passthrough GPUs here, since nova-api validates the pci_passthrough:alias flavor property against them.
cat >> /etc/kolla/config/nova/nova-api.conf << EOF
[pci]
alias = { "vendor_id":"10de", "product_id":"1eb8", "device_type":"type-PF", "name":"tesla-t4" }
alias = { "vendor_id":"10de", "product_id":"2684", "device_type":"type-PCI", "name":"rtx-4090" }
EOF
Nova Scheduler
The scheduler needs the PciPassthroughFilter so that GPU instances only land on hosts with a matching free device. If your deployment already uses a custom enabled_filters list, append PciPassthroughFilter to it instead of replacing the list.
cat > /etc/kolla/config/nova/nova-scheduler.conf << EOF
[filter_scheduler]
enabled_filters = PciPassthroughFilter
available_filters = nova.scheduler.filters.all_filters
EOF
Run kolla-ansible reconfigure for all control nodes
kolla-ansible -i multinode reconfigure -t nova --limit control
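After the reconfigure completes, it is worth confirming that all Nova services came back up and are enabled before moving on:
# every nova service should report state "up"
openstack compute service list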
Creating Private GPU-Enabled Flavor(s)
Set GPU-Specific Properties:
Notice that we can specify the number of GPUs for each flavor.
- tesla-t4:1 = 1 Tesla T4 GPU
- rtx-4090:2 = 2 RTX 4090 GPUs
By creating private flavors, we ensure that not all users can create VMs with these flavors. We can assign them to individual projects.
# 1 NVIDIA Tesla T4
openstack flavor create \
--vcpus 2 \
--ram 4096 \
--disk 25 \
--property "pci_passthrough:alias"="tesla-t4:1" \
--private \
gpuflavor_teslat4-1
# 2 NVIDIA RTX 4090
openstack flavor create \
--vcpus 16 \
--ram 32768 \
--disk 100 \
--property "pci_passthrough:alias"="rtx-4090:2" \
--private \
gpuflavor_rtx4090-2
# 1 NVIDIA RTX 4090
openstack flavor create \
--vcpus 16 \
--ram 32768 \
--disk 100 \
--property "pci_passthrough:alias"="rtx-4090:1" \
--private \
gpuflavor_rtx4090-1
Assign the private flavor to a project
openstack project list
## nova flavor-access-add <FlavorName> <ProjectID>
nova flavor-access-add gpuflavor_rtx4090-2 ba1123c1104b4c9da623a836789849d0
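The legacy nova client still works, but the same assignment (plus a quick verification) can also be done with the unified openstack client, assuming a reasonably recent python-openstackclient:
# grant the project access to the private flavor
openstack flavor set --project ba1123c1104b4c9da623a836789849d0 gpuflavor_rtx4090-2
# verify the flavor; the pci_passthrough:alias property should be listed
openstack flavor show gpuflavor_rtx4090-2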
Launching a GPU-Accelerated Ubuntu VM
Launch the Instance:
openstack server create --flavor gpuflavor_rtx4090-2 --image "jammy-server-cloudimg-amd64" --network demo-net --security-group default --key-name mykey jammy-gpu-test-01
Install GPU Drivers in the VM: Once the VM is running, assign a floating IP (FIP), SSH into the VM, and install the necessary GPU drivers.
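Attaching a floating IP could look like this (the external network name "public" and the placeholder IP are assumptions; replace them with your own values):
# allocate a floating IP on the external network (network name assumed)
openstack floating ip create public
# attach it to the instance created above
openstack server add floating ip jammy-gpu-test-01 <floating-ip>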
ssh ubuntu@<floating-ip>
sudo apt update
sudo apt install -y nvidia-driver-525 nvtop
sudo reboot
Verify GPU Passthrough: We can use nvidia-smi / nvtop to confirm the GPU is accessible:
nvidia-smi
nvtop
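If nvidia-smi does not find a device, check first whether the GPU was actually passed through into the guest before debugging the driver installation:
# inside the VM: the passed-through GPU should show up as a PCI device
lspci -nn | grep -i nvidia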
Test and Benchmark GPU
curl -O https://cdn.geekbench.com/Geekbench-5.4.1-Linux.tar.gz
tar -xzvf Geekbench-5.4.1-Linux.tar.gz
cd Geekbench-5.4.1-Linux
./geekbench5 --compute CUDA
The Future is GPU-Powered
With GPU passthrough configured using Kolla Ansible on Rocky Linux, your OpenStack deployment is now capable of running high-performance, GPU-accelerated workloads. Whether you’re training machine learning models, simulating complex systems, or rendering 3D visuals, OpenStack can handle it seamlessly.
Embrace the power of GPU passthrough and propel your workloads to new heights. If you’ve implemented this or have tips to share, drop a comment below—I’d love to hear about your experience!
Sources
http://www.panticz.de/openstack/gpu-passthrough
https://medium.com/@thomasal14/gpu-passthrough-in-openstack-da2a98a16f7b
https://documentation.suse.com/soc/9/html/suse-openstack-cloud-crowbar-all/gpu-passthrough.html
https://docs.openstack.org/nova/2023.2/admin/pci-passthrough.html
https://www.server-world.info/en/note?os=Rocky_Linux_8&p=kvm&f=12
https://satishdotpatel.github.io/gpu-passthrough-for-openstack/