Update 27.12.2023

With the release of Kolla-Ansible OpenStack 2023.2 (Bobcat), the CAPI driver is now included in Magnum. While the Magnum docs have not been updated yet, the release notes state the following:

Adds support for copying in {{ node_custom_config }}/magnum/kubeconfig to Magnum containers for magnum-cluster-api driver.

A quick Google search led me to the following two commits:

  • kolla-ansible (Ansible deployment of the Kolla containers)
  • https://review.opendev.org/c/openstack/kolla/+/902101

I deployed a fresh OpenStack 2023.2 cluster and tested it, but the Magnum containers are still missing Helm, and without a kubeconfig file Magnum does not work at all. Whenever I tried interacting with Magnum (creating a template, deploying a cluster), it failed with the following error:

2023-12-26 22:05:19.967 16 ERROR wsme.api [None req-e40d2b6a-c5e2-4328-96e9-09b6b3238830 e759aa4e51954f68a579308a8fe6b6b1 004e61a0444a4b99aa316568b3f88191 - - default -] Server-side error: "Configuration file ~/.kube/config not found". Detail:
Traceback (most recent call last):

  File "/var/lib/kolla/venv/lib64/python3.9/site-packages/pykube/config.py", line 111, in from_env
    config = cls.from_service_account()

  File "/var/lib/kolla/venv/lib64/python3.9/site-packages/pykube/config.py", line 42, in from_service_account
    with service_account_dir.joinpath("token").open() as fp:

  File "/usr/lib64/python3.9/pathlib.py", line 1252, in open
    return io.open(self, mode, buffering, encoding, errors, newline,

  File "/usr/lib64/python3.9/pathlib.py", line 1120, in _opener
    return self._accessor.open(self, flags, mode)

FileNotFoundError: [Errno 2] No such file or directory: '/var/run/secrets/kubernetes.io/serviceaccount/token'

To me this looks like a bug introduced by the commit above. Since I wanted to test CAPI, I had to create a management cluster anyway, so I used the kubeconfig from that.

This means that my post below is still useful, even if you are deploying the latest stable OpenStack 2023.2 (Bobcat) release. I have updated the missing bits and pieces. Furthermore, I have added steps on how to use CAPI in OpenStack without Magnum at the bottom of this post.

To be honest, OpenStack Magnum has unfortunately not worked reliably over the last few releases and quickly led to great frustration. This is partly due to the very fast release cycle of Kubernetes and the way Magnum works. Fortunately, the developers at Vexxhost and StackHPC have solved the problem by creating a new Magnum driver based on a completely different approach, namely Cluster API, or CAPI for short. CAPI has been around for quite some time now and looks very promising. The idea is quite simple: you need a K8s cluster, and this cluster can deploy additional K8s clusters. Some of you may know TripleO (OpenStack on OpenStack), which follows a similar approach. If you want to learn more about CAPI, there is a very good video that was recorded at this year's OpenInfra Summit. The best thing is that this new driver works fully transparently for the end user: you can still create a K8s cluster through Magnum in the usual way.

A few weeks ago, the topic was very actively discussed on the OpenStack mailing list and the first approaches to using CAPI under Kolla-Ansible emerged. Unfortunately, the information is widely scattered across many websites, so there was a lack of clear instructions. Today I spent some time collecting all the information and have successfully deployed a K8s cluster with CAPI in my OpenStack 2023.1 Kolla-Ansible environment.

Create a K8s Management cluster

CAPI needs a K8s cluster, Octavia (Load Balancer as a Service), and it seems to work only with the Calico CNI. This cluster can live anywhere; we just need to be able to access it via kubectl and clusterctl. For example, we can deploy a K8s cluster using either Magnum or Kind (Kubernetes in Docker). Kind should not be used in production, but it can be used to bootstrap a management cluster on the selected infrastructure provider.
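
Before going further, it is worth checking that Octavia is actually available in your cloud. A quick sanity check could look like this (the second command assumes the python-octaviaclient plugin is installed):

# check that Octavia is registered in the service catalog
openstack service list | grep -iE 'octavia|load-balancer'
# and that the load balancer API responds (requires python-octaviaclient)
openstack loadbalancer list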

Magnum

This is just an example for my environment. In your case you need to adjust at least the networking part, image, keypair, etc.

Make sure to adjust the following to match your environment:

  • --image
  • --keypair (optional, can also be left out)
  • --dns-nameserver
  • --external-network
  • --fixed-network
  • --fixed-subnet
  • --flavor
  • --master-flavor
# create cluster template
openstack coe cluster template create \
  k8s-flan-small-37-v1.24.16-containerd \
  --image Fedora-CoreOS-37 --keypair mykey \
  --external-network public1 --fixed-network demo-net \
  --fixed-subnet demo-subnet --dns-nameserver 192.168.2.1 \
  --flavor m1.kubernetes.med --master-flavor m1.kubernetes.med \
  --volume-driver cinder --docker-volume-size 10 \
  --network-driver flannel --docker-storage-driver overlay2 \
  --coe kubernetes \
  --labels kube_tag=v1.24.16-rancher1,hyperkube_prefix=docker.io/rancher/,container_runtime=containerd,docker_volume_type=__DEFAULT__

# deploy cluster
openstack coe cluster create \
    --cluster-template k8s-flan-small-37-v1.24.16-containerd \
    --master-count 1 \
    --node-count 2 \
    --keypair mykey \
    k8s-flan-small-37-v1.24.16-containerd
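
Creating the cluster takes a while. A simple way to keep an eye on it (assuming the watch utility is available) is to poll the cluster status until it reaches CREATE_COMPLETE:

# poll the Magnum cluster status every 30 seconds
watch -n 30 "openstack coe cluster show k8s-flan-small-37-v1.24.16-containerd -c status -f value"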

Kind

By default, Kind only listens on localhost. We need to change this so that it listens on a local interface that is reachable from the Magnum containers. For this we create a config YAML file. Make sure to adjust "apiServerAddress".

cat > kind-config.yml << EOF
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
networking:
  # WARNING: It is _strongly_ recommended that you keep this the default
  # (127.0.0.1) for security reasons. However it is possible to change this.
  apiServerAddress: "192.168.2.140"
  # By default the API server listens on a random open port.
  # You may choose a specific port but probably don't need to in most cases.
  # Using a random port makes it easier to spin up multiple clusters.
  apiServerPort: 6443
EOF

Next we add our local user to the docker group

sudo usermod -G docker vagrant -a
su - $USER

Then we install some tools like kubectl and KIND using arkade

curl -sLS https://get.arkade.dev | sudo sh
arkade get kubectl@v1.25.12 helm
export PATH=$PATH:~/.arkade/bin
export KUBECONFIG=~/config
arkade get kind
kind version
kind create cluster --config=kind-config.yml
kubectl cluster-info

Install clusterctl

We need clusterctl to turn our K8s cluster into a management cluster.

curl -L https://github.com/kubernetes-sigs/cluster-api/releases/download/v1.7.3/clusterctl-linux-amd64 -o clusterctl
sudo install -o root -g root -m 0755 clusterctl /usr/local/bin/clusterctl
clusterctl version

Download the K8s cluster config

We will need the K8s cluster config from our Management cluster in order for kubectl and clusterctl to interact with it.

Magnum

openstack coe cluster config k8s-flan-small-37-v1.24.16-containerd

It will tell you where the config is located. We just need to export it. In my case it is:

export KUBECONFIG=/home/vagrant/config

Kind

Kind writes the kubeconfig to whatever KUBECONFIG points to (in the example above, ~/config). You can also dump the config explicitly like so:

kind get kubeconfig > config
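
Assuming you dumped it into the current directory as above, point kubectl and clusterctl at it the same way as for Magnum:

export KUBECONFIG=$PWD/config
kubectl cluster-info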

Initialize the CAPI Management cluster using clusterctl

Don't forget to enable the ClusterTopology and ClusterResourceSet features, which are required for managed topologies and ClusterClass support. They can be enabled via:

export EXP_CLUSTER_RESOURCE_SET=true
export CLUSTER_TOPOLOGY=true

I had overlooked this and when creating a cluster I got the following error messages in /var/log/kolla/magnum/magnum-conductor.log:

ERROR oslo_messaging.rpc.server pykube.exceptions.HTTPError: admission webhook "validation.kubeadmcontrolplanetemplate.controlplane.cluster.x-k8s.io" denied the request: spec: Forbidden: can be set only if the ClusterTopology feature flag is enabled
pykube.exceptions.HTTPError: admission webhook "validation.clusterresourceset.addons.cluster.x-k8s.io" denied the request: spec: Forbidden: can be set only if the ClusterResourceSet feature flag is enabled

Just in case you also forgot to set them before running clusterctl init, you can delete the provider components from the management cluster by running

clusterctl delete --all

And then just install them again by running

clusterctl init --infrastructure openstack

This will now create new namespaces and deploy certain pods like cert-manager to our K8s management cluster.

Monitor the CAPI Management cluster deployment

We can monitor the deployment in a new session using kubectl.

I install kubectl using arkade.

curl -sLS https://get.arkade.dev | sudo sh
arkade get kubectl@v1.24.16 helm
export PATH=$PATH:~/.arkade/bin
export KUBECONFIG=/home/vagrant/config
kubectl cluster-info
[vagrant@seed ~]$ kubectl get pods -A
NAMESPACE                       NAME                                                         READY   STATUS              RESTARTS   AGE
capi-kubeadm-bootstrap-system   capi-kubeadm-bootstrap-controller-manager-55fd94b768-lfgwd   0/1     ContainerCreating   0          14s
capi-system                     capi-controller-manager-7664b6648f-476tl                     0/1     Running             0          25s
cert-manager                    cert-manager-6586869bd6-pt489                                1/1     Running             0          60s
cert-manager                    cert-manager-cainjector-685cf784bc-qdrq5                     1/1     Running             0          60s
cert-manager                    cert-manager-webhook-7fccf6b85f-tsm4h                        1/1     Running             0          59s
kube-system                     coredns-7f4bcd98d7-9ncs8                                     1/1     Running             0          19m
kube-system                     coredns-7f4bcd98d7-fbvst                                     1/1     Running             0          19m
kube-system                     csi-cinder-controllerplugin-dc7889b4f-kvbr9                  6/6     Running             0          18m
kube-system                     csi-cinder-nodeplugin-95sjm                                  3/3     Running             0          8m42s
kube-system                     csi-cinder-nodeplugin-jmw52                                  3/3     Running             0          18m
kube-system                     csi-cinder-nodeplugin-wcrhg                                  3/3     Running             0          8m52s
kube-system                     dashboard-metrics-scraper-7866c78b8-bdkbf                    1/1     Running             0          19m
kube-system                     k8s-keystone-auth-jrdng                                      1/1     Running             0          19m
kube-system                     kube-dns-autoscaler-8f9cf4c99-rzcc5                          1/1     Running             0          19m
kube-system                     kube-flannel-ds-dcqp4                                        1/1     Running             0          8m52s
kube-system                     kube-flannel-ds-pjqrq                                        1/1     Running             0          19m
kube-system                     kube-flannel-ds-rqzdz                                        1/1     Running             0          8m41s
kube-system                     kubernetes-dashboard-d78dc6f78-zgk72                         1/1     Running             0          19m
kube-system                     magnum-metrics-server-564c9cdd6d-8m2lt                       1/1     Running             0          18m
kube-system                     npd-86frm                                                    1/1     Running             0          7m40s
kube-system                     npd-l6dxq                                                    1/1     Running             0          8m11s
kube-system                     openstack-cloud-controller-manager-zcfc2                     1/1     Running             0          19m
[vagrant@seed ~]$ kubectl get ns
NAME                                STATUS   AGE
capi-kubeadm-bootstrap-system       Active   95s
capi-kubeadm-control-plane-system   Active   82s
capi-system                         Active   109s
capo-system                         Active   66s
cert-manager                        Active   2m23s
default                             Active   21m
kube-node-lease                     Active   21m
kube-public                         Active   21m
kube-system                         Active   21m
[vagrant@seed ~]$ kubectl get deploy -A
NAMESPACE                           NAME                                            READY   UP-TO-DATE   AVAILABLE   AGE
capi-kubeadm-bootstrap-system       capi-kubeadm-bootstrap-controller-manager       1/1     1            1           145m
capi-kubeadm-control-plane-system   capi-kubeadm-control-plane-controller-manager   1/1     1            1           145m
capi-system                         capi-controller-manager                         1/1     1            1           145m
capo-system                         capo-controller-manager                         1/1     1            1           145m
cert-manager                        cert-manager                                    1/1     1            1           145m
cert-manager                        cert-manager-cainjector                         1/1     1            1           145m
cert-manager                        cert-manager-webhook                            1/1     1            1           145m
kube-system                         coredns                                         2/2     2            2           146m
local-path-storage                  local-path-provisioner                          1/1     1            1           146m

After a few moments you should see something like this

(2023.1) [vagrant@seed ~]$ clusterctl init --infrastructure openstack
Fetching providers
Installing cert-manager Version="v1.13.2"
Waiting for cert-manager to be available...
Installing Provider="cluster-api" Version="v1.6.0" TargetNamespace="capi-system"
Installing Provider="bootstrap-kubeadm" Version="v1.6.0" TargetNamespace="capi-kubeadm-bootstrap-system"
Installing Provider="control-plane-kubeadm" Version="v1.6.0" TargetNamespace="capi-kubeadm-control-plane-system"
Installing Provider="infrastructure-openstack" Version="v0.8.0" TargetNamespace="capo-system"

Your management cluster has been initialized successfully!

You can now create your first workload cluster by running the following:

  clusterctl generate cluster [name] --kubernetes-version [version] | kubectl apply -f -

We have now successfully created our K8s Management cluster.

Now it's your choice. You can go ahead and patch Magnum or you can just use CAPI natively and deploy workload clusters. For using CAPI without Magnum continue reading at the bottom of this post.

Patch K8s Cluster API in Kolla-Ansible

Patching K8s Cluster API into Kolla-Ansible requires us to copy the K8s management cluster config to all control nodes and install magnum-cluster-api and Helm. As mentioned in the update, OpenStack 2023.2 (Bobcat) already supports CAPI in Magnum to some extent. We only need to copy the kubeconfig file to a config directory under the kolla/config folder and install Helm in the containers. For some reason CAPI requires the Calico CNI, and this is not allowed in the default Magnum configuration. I always got the following error message when deploying a K8s cluster using CAPI:

2024-01-05 22:33:07.147 17 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/lib64/python3.9/site-packages/magnum_cluster_api/resources.py", line 163, in get_object
2024-01-05 22:33:07.147 17 ERROR oslo_messaging.rpc.server     assert CONF.cluster_template.kubernetes_allowed_network_drivers == ["calico"]

This is not documented in the official Magnum docs, and no matter how much I googled, I couldn't find any clue on how to change it. Luckily another user on the openstack-discuss mailing list managed to figure it out. You can find his great blog here.

So for all releases we need to allow Calico CNI in magnum.conf

cat > /etc/kolla/config/magnum.conf << EOF
[trust]
cluster_user_trust = True

[cluster_template]
kubernetes_allowed_network_drivers = calico
kubernetes_default_network_driver = calico
EOF

What's really strange is that even in the Magnum source code the default option is "all".
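
Once the new configuration has been rolled out with kolla-ansible (see the deploy command further down), you can double-check inside a Magnum container that the option actually landed in /etc/magnum/magnum.conf; assuming the Docker variant:

docker exec magnum_conductor grep -A 2 cluster_template /etc/magnum/magnum.conf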

OpenStack 2023.2 (Bobcat)

Kubeconfig file

mkdir -p /etc/kolla/config/magnum/
cp ~/config /etc/kolla/config/magnum/kubeconfig

If you are using Docker

cat > patch_magnum_capi_drv_bobcat.sh << EOF
curl -O https://get.helm.sh/helm-v3.13.2-linux-amd64.tar.gz
tar xvzf helm-v3.13.2-linux-amd64.tar.gz
docker cp linux-amd64/helm magnum_conductor:/usr/local/bin/
docker cp linux-amd64/helm magnum_api:/usr/local/bin/
EOF

If you are using Podman

cat > patch_magnum_capi_drv_bobcat.sh << EOF
curl -O https://get.helm.sh/helm-v3.13.2-linux-amd64.tar.gz
tar xvzf helm-v3.13.2-linux-amd64.tar.gz
podman cp linux-amd64/helm magnum_conductor:/usr/local/bin/
podman cp linux-amd64/helm magnum_api:/usr/local/bin/
EOF

With the above script we can easily install Helm in the Magnum containers on all control nodes by running:

ansible -i multinode -m script -a "patch_magnum_capi_drv_bobcat.sh" control
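
To confirm that Helm actually ended up in the containers on every control node, a quick check could look like this (Docker variant; helm version works without a cluster connection):

ansible -i multinode -m shell -a 'docker exec magnum_conductor helm version' control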

Finally we just need to run kolla-ansible deploy

kolla-ansible -i multinode deploy --tags common,horizon,magnum 

In older OpenStack releases we need to patch CAPI into the containers manually.

OpenStack 2023.1 (Antelope)

Here we need to manually copy the kubeconfig file to all control nodes. We can use Ansible for this.

ansible -i multinode -m copy -a 'src=config dest=.' control

Next, we need to copy this kubeconfig and install magnum-cluster-api and Helm in the magnum_api and magnum_conductor containers on all control nodes. We create a simple shell script for this purpose:

cat > patch_magnum_capi_drv_antelope.sh << EOF
sudo docker exec -it magnum_conductor pip install magnum-cluster-api
sudo docker exec -it magnum_api pip install magnum-cluster-api
curl -O https://get.helm.sh/helm-v3.13.2-linux-amd64.tar.gz
tar xvzf helm-v3.13.2-linux-amd64.tar.gz
sudo docker cp linux-amd64/helm magnum_conductor:/usr/local/bin/
sudo docker cp linux-amd64/helm magnum_api:/usr/local/bin/
sudo docker exec -it magnum_conductor mkdir /var/lib/magnum/.kube
sudo docker exec -it magnum_api mkdir /var/lib/magnum/.kube
sudo docker cp config magnum_api:/var/lib/magnum/.kube/config
sudo docker cp config magnum_conductor:/var/lib/magnum/.kube/config
EOF

and again execute it via ansible.

ansible -i multinode -m script -a "patch_magnum_capi_drv_antelope.sh" control

Restart the Magnum containers. Be careful, since this will disrupt the Magnum service!

ansible -i multinode -m shell -a 'docker restart magnum_conductor magnum_api' control
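
To verify that the patch landed on all control nodes, you can check that the magnum-cluster-api package is installed in the containers, for example:

ansible -i multinode -m shell -a 'docker exec magnum_conductor pip show magnum-cluster-api' control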

Create the K8s CAPI Templates

Now we can download the image and create a K8s cluster template that uses Cluster API.

It is very important to use a flavor that has at least 2 vCPUs and 20GB disk. Kubeadm requires 2 vCPUs and the Ubuntu image needs a 20GB disk.
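
If you do not have a suitable flavor yet, you can create one like this (the name just matches the one used below; the RAM size of 4 GB is my choice):

openstack flavor create m1.kubernetes.med --vcpus 2 --ram 4096 --disk 20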

Make sure to adjust the following to match your environment:

  • --dns-nameserver
  • --external-network
export OS_DISTRO=ubuntu # you can change this to "flatcar" if you want to use Flatcar
for version in v1.27.4; do \
  [[ "${OS_DISTRO}" == "ubuntu" ]] && IMAGE_NAME="ubuntu-2204-kube-${version}" || IMAGE_NAME="flatcar-kube-${version}"; \
  #curl -LO https://object-storage.public.mtl1.vexxhost.net/swift/v1/a91f106f55e64246babde7402c21b87a/magnum-capi/${IMAGE_NAME}.qcow2; \
  openstack image create ${IMAGE_NAME} --disk-format=qcow2 --container-format=bare --property os_distro=${OS_DISTRO} --file=${IMAGE_NAME}.qcow2; \
  openstack coe cluster template create \
      --image $(openstack image show ${IMAGE_NAME} -c id -f value) \
      --external-network public1 \
      --dns-nameserver 192.168.2.1 \
      --master-lb-enabled \
      --master-flavor m1.kubernetes.med \
      --flavor m1.kubernetes.med \
      --network-driver calico \
      --docker-storage-driver overlay2 \
      --coe kubernetes \
      --label kube_tag=${version} \
      k8s-${version};
done;

For multiple K8s versions you can use

export OS_DISTRO=ubuntu # you can change this to "flatcar" if you want to use Flatcar
for version in v1.24.16 v1.25.12 v1.26.7 v1.27.4; do \
  [[ "${OS_DISTRO}" == "ubuntu" ]] && IMAGE_NAME="ubuntu-2204-kube-${version}" || IMAGE_NAME="flatcar-kube-${version}"; \
  curl -LO https://object-storage.public.mtl1.vexxhost.net/swift/v1/a91f106f55e64246babde7402c21b87a/magnum-capi/${IMAGE_NAME}.qcow2; \
  openstack image create ${IMAGE_NAME} --disk-format=qcow2 --container-format=bare --property os_distro=${OS_DISTRO} --file=${IMAGE_NAME}.qcow2; \
  openstack coe cluster template create \
      --image $(openstack image show ${IMAGE_NAME} -c id -f value) \
      --external-network public1 \
      --dns-nameserver 192.168.2.1 \
      --master-lb-enabled \
      --master-flavor m1.kubernetes.med \
      --flavor m1.kubernetes.med \
      --network-driver calico \
      --docker-storage-driver overlay2 \
      --coe kubernetes \
      --label kube_tag=${version} \
      k8s-${version};
done;

To deploy a K8s cluster with one master and one worker node, we simply use

openstack coe cluster create \
    --cluster-template k8s-v1.27.4 \
    --master-count 1 \
    --node-count 1 \
    --keypair mykey \
    k8s-v1.27.4

And voilà, we have our K8s cluster deployed using CAPI.

(2023.1) [vagrant@seed ~]$ openstack coe cluster list
+--------------------------------------+-------------+---------+------------+--------------+-----------------+---------------+
| uuid                                 | name        | keypair | node_count | master_count | status          | health_status |
+--------------------------------------+-------------+---------+------------+--------------+-----------------+---------------+
| a48871c6-64cf-4232-87af-c6dcbf698db9 | k8s-v1.24.16 | mykey   |          2 |            1 | CREATE_COMPLETE | HEALTHY       |
+--------------------------------------+-------------+---------+------------+--------------+-----------------+---------------+
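
To actually work with the new cluster, we can fetch its kubeconfig the usual Magnum way (shown here for the k8s-v1.27.4 cluster created above) and point kubectl at it:

openstack coe cluster config k8s-v1.27.4 --dir . --force
export KUBECONFIG=$PWD/config
kubectl get nodes -o wide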

Big credit to Vexxhost and StackHPC for creating this new driver, and also to the OpenStack community.

Use CAPI in OpenStack without Magnum

While this is possible, it requires much more manual intervention compared to Magnum. First of all, we need the same tools:

  • kubectl
  • clusterctl
  • helm
  • K8s CAPI template

I used arkade to install kubectl and helm so I just need to update my PATH

export PATH=$PATH:~/.arkade/bin

And we need access to our Management cluster

export KUBECONFIG=~/config

We need some more tools to create the config for our workload cluster

wget https://raw.githubusercontent.com/kubernetes-sigs/cluster-api-provider-openstack/master/templates/env.rc -O /tmp/env.rc
sudo wget https://github.com/mikefarah/yq/releases/latest/download/yq_linux_amd64 -O /usr/bin/yq
sudo chmod +x /usr/bin/yq

Next we need to create a clouds.yaml file. This is needed for CAPI to talk to our OpenStack infrastructure to create the load balancer, instances, etc. I tried using the file /etc/kolla/clouds.yaml but it didn't work, so I used the script from here to create it.

. /etc/kolla/admin-openrc.sh

cat > gen-clouds-yml.sh << 'EOF'
#!/bin/bash

source /etc/kolla/admin-openrc.sh

PROJECT_ID=$(openstack project list | grep $OS_PROJECT_NAME | awk '{print $2}')

cat << EOS > clouds.yaml
clouds:
  openstack:
    auth:
      auth_url: $OS_AUTH_URL
      username: "$OS_USERNAME"
      password: "$OS_PASSWORD"
      project_name: "$OS_PROJECT_NAME"
      project_id: "$PROJECT_ID"
      user_domain_name: "$OS_USER_DOMAIN_NAME"
    region_name: "$OS_REGION_NAME"
    interface: "public"
    identity_api_version: $OS_IDENTITY_API_VERSION
EOS

EOF

sh gen-clouds-yml.sh
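
A quick way to test the generated clouds.yaml is to run an OpenStack command against it; the client picks up ./clouds.yaml from the current directory. Do this in a shell where no OS_* variables are exported, otherwise they may interfere:

openstack --os-cloud openstack token issue -c expires -f value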

The workload cluster will be deployed from a YAML file, which we need to create first. Again, make sure to adjust it to match your environment. Ideally, open a fresh console shell for this to make sure that no duplicate credentials exist in your environment.

source /tmp/env.rc clouds.yaml openstack
cat > capi_env.sh << EOF
# The list of nameservers for OpenStack Subnet being created.
# Set this value when you need create a new network/subnet while the access through DNS is required.
export OPENSTACK_DNS_NAMESERVERS=192.168.2.1
# FailureDomain is the failure domain the machine will be created in.
export OPENSTACK_FAILURE_DOMAIN=nova
# The flavor reference for the flavor for your server instance.
export OPENSTACK_CONTROL_PLANE_MACHINE_FLAVOR=m1.kubernetes.med
# The flavor reference for the flavor for your server instance.
export OPENSTACK_NODE_MACHINE_FLAVOR=m1.kubernetes.med
# The name of the image to use for your server instance. If the RootVolume is specified, this will be ignored and use rootVolume directly.
export OPENSTACK_IMAGE_NAME=ubuntu-2204-kube-v1.24.16
# The SSH key pair name
export OPENSTACK_SSH_KEY_NAME=mykey
# The external network
export OPENSTACK_EXTERNAL_NETWORK_ID=07b1e600-8116-4814-a75d-8f1f0f73e9e2
EOF
source capi_env.sh
env | grep OPENST
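
If you are not sure about the external network ID used above, you can look it up like this (assuming your external network is called public1, as elsewhere in this post):

openstack network show public1 -c id -f value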

Download the ubuntu-2204-kube-v1.24.16 image (see Create the K8s CAPI Templates) before proceeding.

Generate the workload cluster config and deploy

clusterctl generate cluster --infrastructure openstack capi-quickstart --kubernetes-version v1.24.16 --control-plane-machine-count=1 --worker-machine-count=2 > capi-quickstart.yaml
kubectl apply -f capi-quickstart.yaml
kubectl get cluster
clusterctl describe cluster capi-quickstart

After a while the cluster will be provisioned but the control plane will not be ready until we install a Container Network Interface (CNI).

kubectl get kubeadmcontrolplane

Output

[vagrant@seed ~]$ kubectl get kubeadmcontrolplane
NAME                            CLUSTER           INITIALIZED   API SERVER AVAILABLE   REPLICAS   READY   UPDATED   UNAVAILABLE   AGE   VERSION
capi-quickstart-control-plane   capi-quickstart   true                                 1                  1         1             15m   v1.24.16

To install the CNI we need to retrieve the Kubeconfig from the workload cluster

clusterctl get kubeconfig capi-quickstart > capi-quickstart.kubeconfig

We will install Calico

kubectl --kubeconfig=./capi-quickstart.kubeconfig \
  apply -f https://raw.githubusercontent.com/projectcalico/calico/v3.26.1/manifests/calico.yaml
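
Once Calico is running, the workload cluster nodes should show up (they may still carry the uninitialized taint until the cloud provider from the next step is deployed):

kubectl --kubeconfig=./capi-quickstart.kubeconfig get nodes -o wide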

But we are not done yet. The workload cluster needs to be able to interact with our OpenStack infrastructure, so we also need to deploy the out-of-tree OpenStack Cloud Provider.
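
A minimal sketch of how that could look, assuming the upstream manifests from the kubernetes/cloud-provider-openstack repository and placeholder credentials (take the real values from your clouds.yaml):

# create a minimal cloud.conf for the workload cluster (values below are placeholders)
cat > cloud.conf << 'EOC'
[Global]
auth-url=http://172.28.7.149:5000
username=admin
password=secret
tenant-name=admin
domain-name=Default
region=RegionOne
EOC

# store it as the cloud-config secret expected by the manifests
kubectl --kubeconfig=./capi-quickstart.kubeconfig -n kube-system \
  create secret generic cloud-config --from-file=cloud.conf

# deploy RBAC and the openstack-cloud-controller-manager daemonset
for f in cloud-controller-manager-roles.yaml cloud-controller-manager-role-bindings.yaml openstack-cloud-controller-manager-ds.yaml; do
  kubectl --kubeconfig=./capi-quickstart.kubeconfig apply -f \
    https://raw.githubusercontent.com/kubernetes/cloud-provider-openstack/master/manifests/controller-manager/${f}
done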

Troubleshooting

The first thing to do if your cluster deployment is not making any progress is to look up the details of the cluster

openstack coe cluster show k8s-v1.24.16

Example

The "status_reason" can give a good indication on why there is no progress. In the the case below there are no free IP adresses on the network. In this case floating IPs.

(2023.2) [vagrant@seed ~]$ openstack coe cluster show k8s-v1.24.16
+----------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Field                | Value                                                                                                                                                                                                                                                                                                                                         |
+----------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| status               | CREATE_IN_PROGRESS                                                                                                                                                                                                                                                                                                                            |
| health_status        | None                                                                                                                                                                                                                                                                                                                                          |
| cluster_template_id  | ccec9598-e756-42bd-b730-350389389294                                                                                                                                                                                                                                                                                                          |
| node_addresses       | []                                                                                                                                                                                                                                                                                                                                            |
| uuid                 | 481b8b96-ebb1-443f-b5fe-968cd6d973b9                                                                                                                                                                                                                                                                                                          |
| stack_id             | kube-5omjd                                                                                                                                                                                                                                                                                                                                    |
| status_reason        | CAPI Cluster status: TopologyUpdate: Updated "KubeadmControlPlane/kube-5omjd-rp9m5". CAPI OpenstackCluster status reason: Failedcreaterouter: Failed to create router k8s-clusterapi-cluster-magnum-system-kube-5omjd: Expected HTTP response code [201 202] when accessing [POST http://172.28.7.149:9696/v2.0/routers], but got 409 instead |
|                      | {"NeutronError": {"type": "IpAddressGenerationFailure", "message": "No more IP addresses available on network 726d35c4-eef9-43d1-a8aa-54eb9cfc8978.", "detail": ""}}                                                                                                                                                                          |
| created_at           | 2024-02-19T13:34:03+00:00                                                                                                                                                                                                                                                                                                                     |
| updated_at           | 2024-02-19T13:34:21+00:00                                                                                                                                                                                                                                                                                                                     |
| coe_version          | None                                                                                                                                                                                                                                                                                                                                          |
| labels               | {'kube_tag': 'v1.24.16', 'availability_zone': 'nova', 'auto_scaling_enabled': 'False', 'auto_healing_enabled': 'False'}                                                                                                                                                                                                                       |
| labels_overridden    | {}                                                                                                                                                                                                                                                                                                                                            |
| labels_skipped       | {}                                                                                                                                                                                                                                                                                                                                            |
| labels_added         | {'availability_zone': 'nova', 'auto_scaling_enabled': 'False', 'auto_healing_enabled': 'False'}                                                                                                                                                                                                                                               |
| fixed_network        | None                                                                                                                                                                                                                                                                                                                                          |
| fixed_subnet         | None                                                                                                                                                                                                                                                                                                                                          |
| floating_ip_enabled  | True                                                                                                                                                                                                                                                                                                                                          |
| faults               |                                                                                                                                                                                                                                                                                                                                               |
| keypair              | mykey                                                                                                                                                                                                                                                                                                                                         |
| api_address          | None                                                                                                                                                                                                                                                                                                                                          |
| master_addresses     | []                                                                                                                                                                                                                                                                                                                                            |
| master_lb_enabled    | False                                                                                                                                                                                                                                                                                                                                         |
| create_timeout       | 60                                                                                                                                                                                                                                                                                                                                            |
| node_count           | 1                                                                                                                                                                                                                                                                                                                                             |
| discovery_url        |                                                                                                                                                                                                                                                                                                                                               |
| docker_volume_size   | None                                                                                                                                                                                                                                                                                                                                          |
| master_count         | 1                                                                                                                                                                                                                                                                                                                                             |
| container_version    | None                                                                                                                                                                                                                                                                                                                                          |
| name                 | k8s-v1.24.16                                                                                                                                                                                                                                                                                                                                  |
| master_flavor_id     | m1.kubernetes.med                                                                                                                                                                                                                                                                                                                             |
| flavor_id            | m1.kubernetes.med                                                                                                                                                                                                                                                                                                                             |
| health_status_reason | {}                                                                                                                                                                                                                                                                                                                                            |
| project_id           | 70608ad5a62c4b478d0692dbe99fd6f6                                                                                                                                                                                                                                                                                                              |
+----------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

As soon as we free up some floating IPs and wait for a few moments, we will start to see progress.

(2023.2) [vagrant@seed ~]$ openstack coe cluster show k8s-v1.24.16
+----------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Field                | Value                                                                                                                                                                                                               |
+----------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| status               | CREATE_IN_PROGRESS                                                                                                                                                                                                  |
| health_status        | None                                                                                                                                                                                                                |
| cluster_template_id  | ccec9598-e756-42bd-b730-350389389294                                                                                                                                                                                |
| node_addresses       | []                                                                                                                                                                                                                  |
| uuid                 | 481b8b96-ebb1-443f-b5fe-968cd6d973b9                                                                                                                                                                                |
| stack_id             | kube-5omjd                                                                                                                                                                                                          |
| status_reason        | CAPI Cluster status: Provisioned: Cluster kube-5omjd is Provisioned. CAPI OpenstackCluster status reason: Successfulcreatefloatingip: Created floating IP 172.28.4.133 with id 1a753843-8414-4009-8037-4f415a02b814 |
| created_at           | 2024-02-19T13:34:03+00:00                                                                                                                                                                                           |
| updated_at           | 2024-02-19T13:45:33+00:00                                                                                                                                                                                           |
| coe_version          | None                                                                                                                                                                                                                |
| labels               | {'kube_tag': 'v1.24.16', 'availability_zone': 'nova', 'auto_scaling_enabled': 'False', 'auto_healing_enabled': 'False'}                                                                                             |
| labels_overridden    | {}                                                                                                                                                                                                                  |
| labels_skipped       | {}                                                                                                                                                                                                                  |
| labels_added         | {'availability_zone': 'nova', 'auto_scaling_enabled': 'False', 'auto_healing_enabled': 'False'}                                                                                                                     |
| fixed_network        | None                                                                                                                                                                                                                |
| fixed_subnet         | None                                                                                                                                                                                                                |
| floating_ip_enabled  | True                                                                                                                                                                                                                |
| faults               |                                                                                                                                                                                                                     |
| keypair              | mykey                                                                                                                                                                                                               |
| api_address          | None                                                                                                                                                                                                                |
| master_addresses     | []                                                                                                                                                                                                                  |
| master_lb_enabled    | False                                                                                                                                                                                                               |
| create_timeout       | 60                                                                                                                                                                                                                  |
| node_count           | 1                                                                                                                                                                                                                   |
| discovery_url        |                                                                                                                                                                                                                     |
| docker_volume_size   | None                                                                                                                                                                                                                |
| master_count         | 1                                                                                                                                                                                                                   |
| container_version    | None                                                                                                                                                                                                                |
| name                 | k8s-v1.24.16                                                                                                                                                                                                        |
| master_flavor_id     | m1.kubernetes.med                                                                                                                                                                                                   |
| flavor_id            | m1.kubernetes.med                                                                                                                                                                                                   |
| health_status_reason | {}                                                                                                                                                                                                                  |
| project_id           | 70608ad5a62c4b478d0692dbe99fd6f6                                                                                                                                                                                    |
+----------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

Some useful troubleshooting commands

kubectl get cluster capi-quickstart -o yaml > capi-quickstart-out.yaml
kubectl get machinedeployments -A
kubectl get crd
kubectl logs -n capi-system deploy/capi-controller-manager -f
kubectl logs -n capo-system -l control-plane=capo-controller-manager -f
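
When the cluster was created through the Magnum CAPI driver, the corresponding Cluster API objects live in the magnum-system namespace of the management cluster (assuming the driver's default namespace), so it is worth inspecting them as well:

kubectl -n magnum-system get clusters,openstackclusters,machines,openstackmachines
kubectl -n magnum-system get events --sort-by=.lastTimestamp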

Sources

Getting Started - Cluster API driver for Magnum
Quick Start - The Cluster API Book
openstack-note/Magnum Cluster API Guide.md at main · ngyenhuukhoi/openstack-note
Getting Started - Kubernetes Cluster API Provider OpenStack
Using clouds.yaml - Andreas Karis Blog