Kolla-Ansible upgrade from Yoga to Zed


Since there still seems to be no official guide on how to upgrade from the OpenStack Yoga release to Zed with Kolla-Ansible, I have put together the steps I used to upgrade successfully. I hit a few problems here and there but finally managed to get it working.

My OpenStack cloud runs on Rocky Linux 8. To upgrade to Zed, we first need to upgrade all nodes from Rocky Linux 8 to Rocky Linux 9. While there is no "official" way to do this, there are some "unofficial" guides on the internet on how to accomplish it. For a homelab I would not bother, but if you are running a production system I recommend replacing each node one by one with a fresh install of Rocky 9. To be specific, I used the just-released Rocky Linux 9.2.

So imagine you have at least three control nodes and multiple compute nodes in your environment. You would then start by following the steps outlined in the Kolla-Ansible article here to remove nodes from your environment. In my case I also have a VM, call it seed, from which I run kolla-ansible. I would start by upgrading this node.

So deploy a fresh Rocky 9 VM and follow the instructions from the Kolla-Ansible Yoga Quickstart Guide to install Kolla-Ansible. Then simply copy across the following files and directories:

  • /etc/kolla/*
  • multinode
  • all-in-one
  • any ssh private keys that you may use to connect to the OpenStack nodes
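Assuming the old seed VM is still reachable, the copy step could look like the sketch below, run from the new seed node. The hostname old-seed and the key name id_rsa are placeholders; substitute your actual host and key:

```shell
# Sketch, run on the new Rocky 9 seed node. "old-seed" and "id_rsa"
# are placeholders -- substitute your actual old seed host and key.
scp -r root@old-seed:/etc/kolla /etc/
scp root@old-seed:~/multinode root@old-seed:~/all-in-one ~/
scp root@old-seed:~/.ssh/id_rsa ~/.ssh/
chmod 600 ~/.ssh/id_rsa
```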

Optional

If possible, use the very same IP address that you used on your old Rocky Linux 8 VM, and if you had a local Docker registry running, recreate it on the new VM. First we make sure the old registry is marked as insecure (replace <ipofoldregistry> with the IP of your old local registry):

cat > /etc/docker/daemon.json << EOF
{
    "insecure-registries": [
        "<ipofoldregistry>:4000"
    ]
}
EOF
# Restart the Docker daemon so the new setting takes effect
systemctl restart docker

We can then use the simple script below, which I found on Stack Overflow, to pull all images from our old local registry:

cat > ~/pull_old_docker_reg.sh << 'EOF'
#!/bin/bash
set -x
yum -y install jq
registry="<ipofoldregistry>:4000"
repos=$(curl -s http://$registry/v2/_catalog?n=300 | jq -r '.repositories[]')
for repo in $repos; do
   docker pull --all-tags $registry/$repo
done
EOF
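Make the script executable and run it; afterwards a quick listing should show the pulled kolla images in the local cache:

```shell
chmod +x ~/pull_old_docker_reg.sh
~/pull_old_docker_reg.sh
# Sanity check: the pulled kolla images should now be in the local image cache
docker images | grep kolla | head
```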

Now we can use the script below to push them to our new local registry:

cat > ~/push_docker_img.sh << 'EOF'
#!/bin/bash
docker images | grep kolla | grep -v local | awk '{print $1,$2}' | while read -r image tag; do
        newimg=$(echo ${image} | cut -d / -f2-)
        docker tag ${image}:${tag} localhost:4000/${newimg}:${tag}
        docker push localhost:4000/${newimg}:${tag}
done
EOF
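As before, make it executable and run it, then verify that the new registry actually serves the images by querying its catalog (jq was installed by the pull script):

```shell
chmod +x ~/push_docker_img.sh
~/push_docker_img.sh
# Verify the new local registry now lists the kolla repositories
curl -s http://localhost:4000/v2/_catalog | jq .
```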

In case you have not used the same IP on your new seed node, don't forget to update it in the globals.yml file.

docker_registry: "<ip_of_new_seed_node>:4000"

To check whether the new seed node is working as expected, you could simply run:

kolla-ansible -i multinode prechecks

If this works, I would then continue with a compute node, but you can also start with a control node. After you have successfully removed a node, reinstall it with Rocky Linux 9. Once it is installed, follow the steps for adding the node back to your environment from the same article. After a couple of hours, you will finally have replaced all the Rocky Linux 8 nodes with Rocky Linux 9.
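For a compute node, the remove/re-add cycle roughly follows the pattern below. This is only a sketch of the procedure from the linked article, with compute05 as a placeholder hostname; follow the article itself for the full steps:

```shell
# Sketch only -- see the linked Kolla-Ansible article for the full procedure.
# 1) Drain the compute node (placeholder name: compute05)
openstack compute service set --disable compute05 nova-compute
# ...migrate remaining instances off the node, then stop its containers:
kolla-ansible -i multinode stop --limit compute05 --yes-i-really-really-mean-it
# 2) Reinstall the node with Rocky Linux 9, then add it back:
kolla-ansible -i multinode bootstrap-servers --limit compute05
kolla-ansible -i multinode pull --limit compute05
kolla-ansible -i multinode deploy --limit compute05
```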

Due to some mistake on my part, I ran into the following error when trying to re-add the new Rocky 9 control nodes:

TASK [mariadb : Creating shard root mysql user] ******************************************************************************************************************************************************************************************************************************************************************************************************************************************************************
skipping: [control02]
skipping: [control03]
fatal: [gedaopl07]: FAILED! => {"changed": false, "msg": "Can not parse the inner module output: b'[WARNING]: Failure using method (v2_playbook_on_play_start) in callback plugin

I had absolutely no clue what was causing this until I figured out that, for whatever reason (probably due to a faulty script I used a few weeks ago), containers from the latest OpenStack release had been pushed to my local Docker registry, and the kolla-ansible deploy command installed them instead of the ones tagged with Yoga. I simply removed the untagged ones from my local repo, pulled the images again, pushed them to the local repo, and after a new kolla-ansible deploy run the containers were successfully replaced. That is the good part; the bad part is that it took me several hours to figure this out. :) But at least it proves that kolla-ansible can be very resilient to failures and problems.

Now you can follow the usual steps from the official Kolla-Ansible Zed guide to upgrade from the Yoga release to Zed. Overall a pretty straightforward approach, but it can be very time-consuming depending on the number of nodes in your OpenStack cluster.
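For reference, the core of that upgrade boils down to something like the sketch below on the seed node. The version pin is an assumption based on Zed corresponding to the kolla-ansible 15.x series; the official guide covers the globals.yml and passwords.yml changes needed between releases:

```shell
# Sketch of the Yoga -> Zed upgrade; consult the official Zed guide for the
# required globals.yml / passwords.yml changes between releases.
pip install --upgrade 'kolla-ansible>=15,<16'   # assumption: Zed == 15.x series
kolla-ansible install-deps
kolla-ansible -i multinode prechecks
kolla-ansible -i multinode pull        # pre-fetch the Zed images
kolla-ansible -i multinode upgrade
```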
