It is very convenient to use a local docker registry when deploying Kubernetes clusters with Magnum in OpenStack, especially if, like me, you deploy multiple clusters to test different settings. It is just way faster to pull the images from the local network instead of the internet. In my homelab I have set up a local docker registry, and if you followed my article Deploy OpenStack Cluster with Kolla-Ansible on a single ESXi host using Terraform, then you know how I did it. Sometimes I create a new cluster template, and when I try to deploy a cluster from it, it just doesn't seem to make any progress. A quick ssh into the k8s master node shows this:
(yoga) [bofl@seed ~]$ ssh core@192.168.2.22
Fedora CoreOS 35.20220424.3.0
Tracker: https://github.com/coreos/fedora-coreos-tracker
Discuss: https://discussion.fedoraproject.org/tag/coreos
[systemd]
Failed Units: 1
heat-container-agent.service
[core@k8s-flannel-small-35-1-21-11-s-5tx7tjl3wirl-master-0 ~]$ systemctl --failed
UNIT LOAD ACTIVE SUB DESCRIPTION
● heat-container-agent.service loaded failed failed Run heat-container-agent
LOAD = Reflects whether the unit definition was properly loaded.
ACTIVE = The high-level unit activation state, i.e. generalization of SUB.
SUB = The low-level unit activation state, values depend on unit type.
1 loaded units listed.
[core@k8s-flannel-small-35-1-21-11-s-5tx7tjl3wirl-master-0 ~]$ systemctl status heat-container-agent.service
× heat-container-agent.service - Run heat-container-agent
Loaded: loaded (/etc/systemd/system/heat-container-agent.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Tue 2023-05-23 11:31:22 UTC; 8min ago
Process: 1404 ExecStartPre=mkdir -p /var/lib/heat-container-agent (code=exited, status=0/SUCCESS)
Process: 1405 ExecStartPre=mkdir -p /var/run/heat-config (code=exited, status=0/SUCCESS)
Process: 1407 ExecStartPre=mkdir -p /var/run/os-collect-config (code=exited, status=0/SUCCESS)
Process: 1410 ExecStartPre=mkdir -p /opt/stack/os-config-refresh (code=exited, status=0/SUCCESS)
Process: 1411 ExecStartPre=mv /var/lib/os-collect-config/local-data /var/lib/cloud/data/cfn-init-data (code=exited, status=1/FAILURE)
Process: 1412 ExecStartPre=mkdir -p /srv/magnum (code=exited, status=0/SUCCESS)
Process: 1413 ExecStartPre=/bin/podman kill heat-container-agent (code=exited, status=125)
Process: 1491 ExecStartPre=/bin/podman rm heat-container-agent (code=exited, status=1/FAILURE)
Process: 1549 ExecStartPre=/bin/podman pull 192.168.2.10:4000/heat-container-agent:wallaby-stable-1 (code=exited, status=125)
Process: 1604 ExecStart=/bin/podman run --name heat-container-agent --privileged --net=host --volume /srv/magnum:/srv/magnum --volume /opt/stack/os-config-refresh:/opt/stack/os-config-refresh --volume /run/systemd:/run/systemd --volume /etc/:/etc/ --volume /var/lib:/var/lib --volume /var/run:/var/run --volume /var/log:/var/log --volume /tmp:/tmp --volume /dev:/dev --env REQUESTS_CA_BUNDLE=/etc/>
Main PID: 1604 (code=exited, status=125)
CPU: 206ms
May 23 11:31:22 k8s-flannel-small-35-1-21-11-s-5tx7tjl3wirl-master-0 podman[1413]: 2023-05-23 11:31:22.155865026 +0000 UTC m=+0.389640534 system refresh
May 23 11:31:22 k8s-flannel-small-35-1-21-11-s-5tx7tjl3wirl-master-0 podman[1413]: Error: no container with name or ID "heat-container-agent" found: no such container
May 23 11:31:22 k8s-flannel-small-35-1-21-11-s-5tx7tjl3wirl-master-0 podman[1491]: Error: no container with name or ID "heat-container-agent" found: no such container
May 23 11:31:22 k8s-flannel-small-35-1-21-11-s-5tx7tjl3wirl-master-0 podman[1549]: Trying to pull 192.168.2.10:4000/heat-container-agent:wallaby-stable-1...
May 23 11:31:22 k8s-flannel-small-35-1-21-11-s-5tx7tjl3wirl-master-0 podman[1549]: Error: initializing source docker://192.168.2.10:4000/heat-container-agent:wallaby-stable-1: pinging container registry 192.168.2.10:4000: Get "https://192.168.2.10:4000/v2/": http: server gave HTTP response to HTTPS client
May 23 11:31:22 k8s-flannel-small-35-1-21-11-s-5tx7tjl3wirl-master-0 systemd[1]: Started Run heat-container-agent.
May 23 11:31:22 k8s-flannel-small-35-1-21-11-s-5tx7tjl3wirl-master-0 podman[1604]: Trying to pull 192.168.2.10:4000/heat-container-agent:wallaby-stable-1...
May 23 11:31:22 k8s-flannel-small-35-1-21-11-s-5tx7tjl3wirl-master-0 podman[1604]: Error: initializing source docker://192.168.2.10:4000/heat-container-agent:wallaby-stable-1: pinging container registry 192.168.2.10:4000: Get "https://192.168.2.10:4000/v2/": http: server gave HTTP response to HTTPS client
May 23 11:31:22 k8s-flannel-small-35-1-21-11-s-5tx7tjl3wirl-master-0 systemd[1]: heat-container-agent.service: Main process exited, code=exited, status=125/n/a
May 23 11:31:22 k8s-flannel-small-35-1-21-11-s-5tx7tjl3wirl-master-0 systemd[1]: heat-container-agent.service: Failed with result 'exit-code'.
The error message:
http: server gave HTTP response to HTTPS client
simply means that the registry is serving plain HTTP while the client expects HTTPS. Of course, my local docker registry is just for testing purposes and doesn't offer a valid TLS certificate, so it is considered an insecure registry.
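If you want to double-check this from any machine on the network, hitting the registry's v2 endpoint over plain HTTP should succeed while HTTPS fails (a quick sanity check using my registry's address):

curl http://192.168.2.10:4000/v2/
curl -k https://192.168.2.10:4000/v2/

The first request should return an empty JSON body with a 200 status, the second an SSL error, confirming the registry only speaks HTTP. So we need to update the cluster template. But before we can make any changes to it, we first need to delete the failed cluster: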
openstack coe cluster delete mycluster
Wait a few moments and check that the cluster has been successfully deleted:
openstack coe cluster list
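If you don't want to keep re-running the list command, a small shell loop can poll until the cluster is actually gone (a minimal sketch, assuming the cluster is named mycluster):

while openstack coe cluster show mycluster >/dev/null 2>&1; do
    echo "waiting for cluster deletion..."
    sleep 10
done

The loop exits as soon as "openstack coe cluster show" no longer finds the cluster.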
Now we simply need to remember to always include the insecure_registry parameter in the "openstack coe cluster template create" command, right? WRONG. :) Unfortunately, there doesn't seem to be a way to set it via the "openstack coe cluster template create" command. But we can set it via the "openstack coe cluster template update" command after we have created the template:
openstack coe cluster template update 84f4756d-f29f-4a79-bfc3-8eb5e3c36c82 replace insecure_registry=192.168.2.10:4000
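To verify that the change was applied, show the template again; assuming the field is named insecure_registry in the output, something like this should print the registry address:

openstack coe cluster template show 84f4756d-f29f-4a79-bfc3-8eb5e3c36c82 -f value -c insecure_registry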
or via the Horizon dashboard:
Container Infra -> Cluster templates -> Select the cluster template -> Update Cluster Template -> Node Spec -> Insecure Registry
and put in the IP of the local docker registry.
Then we can try to redeploy the cluster. This should solve at least the insecure registry issue.
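For completeness, recreating the cluster from the updated template looks something like this (the keypair name and the counts are placeholders from my lab, adjust them to your environment):

openstack coe cluster create --cluster-template 84f4756d-f29f-4a79-bfc3-8eb5e3c36c82 --master-count 1 --node-count 1 --keypair mykey mycluster

This time the heat-container-agent on the master node should be able to pull its image from 192.168.2.10:4000 over plain HTTP.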