Unlocking GPU-Powered Kubernetes in OpenStack Magnum with Kolla-Ansible
Introduction
Kubernetes has become the go-to platform for container orchestration, and OpenStack Magnum simplifies its deployment by offering native integration with OpenStack’s infrastructure. With the rise of GPU-accelerated workloads in fields like AI, machine learning, and scientific computing, the ability to leverage NVIDIA GPUs within Kubernetes clusters has become crucial. Let’s explore how OpenStack Magnum bridges this gap, enabling seamless integration of NVIDIA GPUs in your Kubernetes environment.
What is OpenStack Magnum?
OpenStack Magnum is a service designed to manage container orchestration engines such as Kubernetes, Docker Swarm, and Mesos. It allows users to deploy and manage clusters directly within an OpenStack environment, leveraging its compute, storage, and networking capabilities.
Why NVIDIA GPUs in Kubernetes?
NVIDIA GPUs are essential for accelerating workloads that demand high parallel computation, such as:
- AI/ML Training and Inference: Speed up model training and real-time inference.
- High-Performance Computing (HPC): Solve complex scientific problems faster.
- Data Analytics: Process and analyze massive datasets efficiently.
Integrating these GPUs into a Kubernetes environment enables containerized applications to scale dynamically while benefiting from the raw computational power GPUs provide.
Integrating NVIDIA GPUs with OpenStack Magnum
OpenStack’s Nova compute service already supports GPU passthrough and virtual GPU (vGPU) capabilities. By combining Nova’s GPU features with Magnum’s Kubernetes orchestration, you can achieve GPU acceleration for your Kubernetes clusters. This guide focuses on GPU passthrough. You can either create a new K8s cluster with GPUs or add GPUs to an existing cluster.
Here’s how it works:
- Configure GPU passthrough (see the configuration sketch after this list)
- Install OpenStack Magnum
- Enable Cluster API driver (optional but highly recommended)
- The big advantage of the Cluster API driver is that we can use Ubuntu as the host OS for our K8s nodes. This simplifies the GPU driver installation, since Ubuntu is officially supported by the NVIDIA GPU Operator Helm chart.
- It also works without the Cluster API driver, using Fedora CoreOS (FCOS), but then we need to manually install the driver on the worker node(s) that run on the compute node where the GPU is installed. I used FCOS 37 and K8s 1.21.11. I have not managed to bake the driver into the FCOS image; I tried for days and didn't get it working. 😦
- Create a new K8s Cluster template with GPU flavor for worker nodes
- Deploy a new K8s Cluster with GPU(s) or add GPU(s) to an existing K8s Cluster
- Install NVIDIA GPU driver in Fedora CoreOS
- Deploy NVIDIA GPU Operator Helm chart
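The passthrough configuration itself happens on the OpenStack side, not in Magnum. As a minimal sketch with Kolla-Ansible: whitelist the GPU's PCI device in Nova, define a PCI alias, and create a flavor that requests that alias. The vendor/product IDs below are those of a Tesla T4, and the inventory path, alias name, and flavor sizing are assumptions you need to adapt to your deployment; also make sure the PciPassthroughFilter is enabled for the Nova scheduler.
# /etc/kolla/config/nova.conf (merged into all Nova services by Kolla-Ansible)
[pci]
# "device_spec" is called "passthrough_whitelist" on older releases
device_spec = { "vendor_id": "10de", "product_id": "1eb8" }
alias = { "vendor_id": "10de", "product_id": "1eb8", "name": "t4" }

# roll out the new Nova configuration
kolla-ansible -i /etc/kolla/multinode reconfigure --tags nova

# GPU flavor for the worker nodes, requesting one T4 via the alias
openstack flavor create gpuflavor_teslaT4-1 \
  --vcpus 8 --ram 32768 --disk 100 \
  --property "pci_passthrough:alias"="t4:1"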
For a new cluster:
That is simple: create a new K8s cluster template just as usual, but make sure to use a GPU flavor for the worker nodes.
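For illustration, creating such a template and a cluster from it could look roughly like the following. This is only a sketch: the template and cluster names, the master flavor, and the external network are assumptions (the image, worker flavor, and labels are the ones used later in this post); adapt them to your environment.
openstack coe cluster template create k8s-gpu-template \
  --coe kubernetes \
  --image Fedora-CoreOS-37 \
  --flavor gpuflavor_teslaT4-1 \
  --master-flavor m1.large \
  --external-network public-network \
  --network-driver calico \
  --labels kube_tag=v1.21.11-rancher1,container_runtime=containerd

openstack coe cluster create k8s-gpu-cluster \
  --cluster-template k8s-gpu-template \
  --master-count 1 \
  --node-count 1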
Update an existing cluster:
The steps are the same for Ubuntu (CAPI) and FCOS. We will do it for an FCOS 37 based K8s 1.21.11 cluster. For Ubuntu you can follow the same steps, except that you do not need to install the NVIDIA driver manually.
Unfortunately, it is not possible to change the worker node flavor in an existing cluster template from which one or more clusters have already been deployed. It is therefore also pointless to simply add another worker node, as it would use the flavor without a GPU. However, you can add an additional node group that uses a different flavor.
First we check for the existing nodegroups. For this we need the CLUSTER ID:
CLUSTER_ID=$(openstack coe cluster show k8s-fcos-calico-small-35-v1.21.11-octavia -f value -c uuid)
openstack coe nodegroup list $CLUSTER_ID -c uuid -c name -c status -c role
Example
(2023.2) [vagrant@seed 1.21.11]$ openstack coe nodegroup list $CLUSTER_ID -c uuid -c name -c status -c role
+--------------------------------------+----------------+-----------------+--------+
| uuid                                 | name           | status          | role   |
+--------------------------------------+----------------+-----------------+--------+
| c590bfc5-14f7-43da-ac21-833cd5811c46 | default-master | CREATE_COMPLETE | master |
| 179a0af4-f1ff-4cc6-b014-670e081d745b | default-worker | CREATE_COMPLETE | worker |
+--------------------------------------+----------------+-----------------+--------+
There is always a default-master and default-worker nodegroup.
Next we create a new nodegroup called "worker-gpu" for the GPU worker nodes, using e.g. the "gpuflavor_teslaT4-1" flavor.
openstack coe nodegroup create \
  --node-count 1 \
  --role worker \
  --flavor gpuflavor_teslaT4-1 \
  --image Fedora-CoreOS-37 \
  $CLUSTER_ID worker-gpu
This will trigger the creation of a new worker node, so it can take a while. Keep checking the status of the cluster and the nodegroups.
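For example, re-run the following until both the cluster and the new nodegroup report CREATE_COMPLETE:
openstack coe cluster show $CLUSTER_ID -c status -f value
openstack coe nodegroup list $CLUSTER_ID -c name -c status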
Example
(2023.2) [vagrant@seed 1.21.11]$ openstack coe nodegroup list $CLUSTER_ID -c uuid -c name -c status -c role
+--------------------------------------+----------------+-----------------+--------+
| uuid                                 | name           | status          | role   |
+--------------------------------------+----------------+-----------------+--------+
| c590bfc5-14f7-43da-ac21-833cd5811c46 | default-master | CREATE_COMPLETE | master |
| 179a0af4-f1ff-4cc6-b014-670e081d745b | default-worker | CREATE_COMPLETE | worker |
| 175a8f61-40c3-4c2d-acca-3e06679c2355 | worker-gpu     | CREATE_COMPLETE | worker |
+--------------------------------------+----------------+-----------------+--------+
After a while the new worker node has been created and joined to the cluster.
openstack coe nodegroup show --max-width 80 $CLUSTER_ID worker-gpu
Example
(2023.2) [vagrant@gedasvl101 1.21.11]$ openstack coe nodegroup show --max-width 80 $CLUSTER_ID worker-gpu
+--------------------+---------------------------------------------------------+
| Field              | Value                                                   |
+--------------------+---------------------------------------------------------+
| uuid               | fb9263f0-3e2c-488f-a1b8-5d6b0956950f                    |
| name               | worker-gpu                                              |
| cluster_id         | fe947555-a875-4c1a-b644-a33ae2acc046                    |
| project_id         | ba1178c1104b4c9da623a836789849d0                        |
| docker_volume_size | None                                                    |
| labels             | {'kube_tag': 'v1.21.11-rancher1', 'hyperkube_prefix':   |
|                    | 'docker.io/rancher/', 'container_runtime':              |
|                    | 'containerd', 'master_lb_floating_ip_enabled': 'true',  |
|                    | 'docker_volume_type': 'DEFAULT'}                        |
| labels_overridden  | {}                                                      |
| labels_skipped     | {}                                                      |
| labels_added       | {}                                                      |
| flavor_id          | gpuflavor_teslaT4-1                                     |
| image_id           | Fedora-CoreOS-37                                        |
| node_addresses     | ['10.0.0.220']                                          |
| node_count         | 1                                                       |
| role               | worker-gpu                                              |
| max_node_count     | None                                                    |
| min_node_count     | 0                                                       |
| is_default         | False                                                   |
| stack_id           | d9a95049-bb32-4610-a4e2-3b4e69a51e3c                    |
| status             | CREATE_COMPLETE                                         |
| status_reason      | Stack CREATE completed successfully                     |
+--------------------+---------------------------------------------------------+
You can verify that the new worker node has a GPU by temporarily assigning a floating IP (FIP) and connecting to it via SSH.
# check for free FIPs
openstack floating ip list
# or create new FIP
openstack floating ip create public-network
# assign the FIP to the instance
openstack server add floating ip <FIP> <instance>
# ssh to the worker node using the FIP
ssh core@<FIP>
# check for nvidia GPUs
lspci | grep -i nvidia
OPTIONAL:
Of course, you can also remove the node group in case you no longer need the GPU by running:
openstack coe nodegroup delete $CLUSTER_ID worker-gpu
Install NVIDIA GPU driver in Fedora CoreOS
As mentioned earlier, we need to manually install the NVIDIA driver when using FCOS. Everything below needs to be executed on the GPU worker node instance. The node needs a FIP for this, so that you can access it via SSH.
rpm-ostree install nvidia-container-toolkit-1.16.0-1 libnvidia-container-tools-1.16.0-1 libnvidia-container1-1.16.0-1 nvidia-container-runtime
systemctl reboot
Once the GPU worker node has rebooted, ssh to it again and run the following:
systemctl disable --now nvidia-powerd
systemctl disable --now systemd-hwdb-update
Next we can check if the GPU is correctly detected
nvidia-smi
nvidia-smi --query-gpu=driver_version --format=csv
Once we have verified that the GPU is detected, we can continue with the NVIDIA GPU Operator Helm chart installation.
Install NVIDIA GPU Operator Helm chart
The NVIDIA GPU Operator Helm chart is a really great piece of software. It automatically detects all worker nodes with GPUs, installs the NVIDIA drivers (not on FCOS), and configures containerd.
Prerequisites
kubectl create ns gpu-operator
kubectl label --overwrite ns gpu-operator pod-security.kubernetes.io/enforce=privileged
Add NVIDIA Helm repo
helm repo add nvidia https://nvidia.github.io/gpu-operator \
  && helm repo update
Fedora CoreOS
The latest NVIDIA GPU Operator is not supported on K8s 1.21.x, but older versions, e.g. v22.9.0, still work. So we need to install version v22.9.0 without the NVIDIA driver and with a different container image for the toolkit.
helm install --wait --generate-name \
  -n gpu-operator --create-namespace \
  nvidia/gpu-operator \
  --set driver.enabled=false \
  --version=v22.9.0 \
  --set toolkit.version=v1.11.0-ubi8
Ubuntu (CAPI):
Here we can simply install the latest version and benefit from the automatic NVIDIA driver installation.
helm install --wait --generate-name \
  -n gpu-operator --create-namespace \
  nvidia/gpu-operator
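In both cases it is worth watching the operator components come up before checking the node; on Ubuntu the driver daemonset additionally has to build and load the kernel modules, which can take a few minutes:
# all operator pods should eventually be Running or Completed
kubectl get pods -n gpu-operator
# the toolkit (and, on Ubuntu, the driver) daemonsets should have one pod per GPU worker node
kubectl get daemonsets -n gpu-operator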
We can then check whether the NVIDIA GPU has been correctly detected by describing the GPU worker node with kubectl:
kubectl describe node k8s-fcos-calico-small-35--worker-gp-dw775stwxobq-node-0
Example
....
Capacity:
  cpu:                12
  ephemeral-storage:  267899884Ki
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             32867712Ki
  nvidia.com/gpu:     1            <------- This is what we are looking for
  pods:               110
....
Run GPU-Accelerated Workloads
Finally we can run a very simple GPU workload.
cat > cuda-vectoradd.yaml << EOF
apiVersion: v1
kind: Pod
metadata:
  name: cuda-vectoradd
spec:
  restartPolicy: OnFailure
  containers:
    - name: cuda-vectoradd
      image: "nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda11.7.1-ubuntu20.04"
      resources:
        limits:
          nvidia.com/gpu: 1
EOF
# apply the deployment
kubectl apply -f cuda-vectoradd.yaml
# check logs to see results
kubectl logs cuda-vectoradd
Example
(2023.2) [vagrant@seed 1.21.11]$ kubectl logs cuda-vectoradd
[Vector addition of 50000 elements]
Copy input data from the host memory to the CUDA device
CUDA kernel launch with 196 blocks of 256 threads
Copy output data from the CUDA device to the host memory
Test PASSED
Done
Conclusion
OpenStack Magnum combined with NVIDIA GPUs offers a powerful platform for modern workloads that require massive computational power. By leveraging Magnum’s integration with OpenStack and Kubernetes, organizations can deploy GPU-accelerated applications with ease, scalability, and efficiency. Whether you’re training complex AI models or running HPC simulations, this setup provides the foundation for innovation at scale.