Upgrading an OpenShift Cluster in a Disconnected Environment, with a Troubleshooting Guide
With new OpenShift versions released frequently, it is important to stay up to date with the latest version. As per the Red Hat article https://access.redhat.com/support/policy/updates/openshift#:~:text=Red%20Hat%20forecasts%20OpenShift%20for,Full%20Support%20and%20Maintenance%20Support, the table below shows that a new release version typically arrives every 3 months.
OpenShift Container Platform upgrade is supported in the following environments:
1. Disconnected environment: This environment does not have internet connectivity. It is also referred to as a restricted or air-gapped environment. Such environments require a private docker registry to which the images can be mirrored and from which they can be pulled.
2. Online environment: This environment has internet connectivity. In this case, the images are pulled directly from the online registry.
This blog lists the steps to upgrade an existing cluster in a disconnected environment. Upgrading a cluster applies the latest enhancements and bug fixes available in the newer version.
Pre-requisites:
1. OCP cluster with OCP 4.4.6: Note that the version may differ as per your requirement.
2. Virtual Machine: A Linux virtual machine with internet connectivity, which will be used as the local registry to mirror the images from Red Hat. This VM should also be able to connect to the Docker private registry.
3. Docker private registry: This registry will contain the images required for the OpenShift Container Platform upgrade. Images from the local registry will be pushed to this registry.
4. Pull-secret: Copy pull-secret to the VM which hosts mirror registry:
1. Download pull secret from https://cloud.redhat.com/openshift/install/pull-secret.
2. Generate the base64-encoded username and password or token for your mirror registry:
$ echo -n '<username>:<password>' | base64 -w0
Note: For <username> and <password>, specify the user name and password configured for your registry.
3. Make a copy of your pull secret in JSON format:
$ cat ./pull-secret.text | jq . > <path>/<pull-secret-file>
Note: Specify the path where the pull secret is to be stored and a name for the JSON file.
4. Edit the new file and add a section with your external registry (mirror registry) to it:
"auths": {
"<mirror_registry fully qualified hostname:port>": {
"auth": "<credentials>",
"email": "you@example.com"
},
Note: For <credentials>, specify the base64-encoded user name and password for the mirror registry.
The pull-secret file should now look like the following sample:
{
  "auths": {
    "<mirror registry fully qualified host name:port>": {
      "auth": "<credentials>",
      "email": "you@example.com"
    },
    "cloud.openshift.com": {
      "auth": "b3BlbnNo...",
      "email": "you@example.com"
    },
    "quay.io": {
      "auth": "b3BlbnNo...",
      "email": "you@example.com"
    },
    "registry.connect.redhat.com": {
      "auth": "NTE3Njg5Nj...",
      "email": "you@example.com"
    },
    "registry.redhat.io": {
      "auth": "NTE3Njg5Nj...",
      "email": "you@example.com"
    }
  }
}
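Instead of editing the file by hand, sub-steps 2 to 4 above can be scripted. The following is a minimal sketch, assuming jq is installed on the VM (it is already used in sub-step 3). The function name add_mirror_auth and its arguments are hypothetical names chosen for this illustration, not part of any Red Hat tooling.

```shell
#!/bin/sh
# Hypothetical helper: add a mirror-registry entry to a pull secret.
# Usage: add_mirror_auth <pull-secret-file> <registry host:port> <username> <password> <email>
# The merged JSON is printed to standard output; the input file is not modified.
add_mirror_auth() {
    secret_in=$1; registry=$2; user=$3; pass=$4; email=$5
    # Same encoding as: echo -n '<username>:<password>' | base64 -w0
    creds=$(printf '%s:%s' "$user" "$pass" | base64 -w0)
    # Add (or overwrite) the registry entry under .auths and keep all other entries.
    jq --arg reg "$registry" --arg auth "$creds" --arg email "$email" \
       '.auths[$reg] = {auth: $auth, email: $email}' "$secret_in"
}

# Example call with placeholder values:
#   add_mirror_auth ./pull-secret.text registry.example.com:443 myuser mypassword you@example.com > pullsecret.json
```

Writing the result to a new file keeps the downloaded pull secret untouched, so the merge can be repeated if the registry credentials change.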
5. Select the upgrade path for the specific version:
When upgrading a cluster from a lower version to a higher version in an offline environment, you need to follow an upgrade path that is officially supported by Red Hat.
Base cluster version: 4.4.6
Target cluster version: 4.6.4
As a next step, we need to select the upgrade path based on the base and target cluster version.
Steps to select an upgrade path:
1. Go to link https://mirror.openshift.com/pub/openshift-v4/clients/ocp/<target version>/
In this example, since the upgrade is to be done to the 4.6.4 version (as the final version):
https://mirror.openshift.com/pub/openshift-v4/clients/ocp/4.6.4/
2. Open the release.txt file:
https://mirror.openshift.com/pub/openshift-v4/clients/ocp/4.6.4/release.txt
3. Go to the Release Metadata section in the file:
Release Metadata:
Version: 4.6.4
Upgrades: 4.5.16, 4.5.17, 4.5.18, 4.5.19, 4.6.1, 4.6.2, 4.6.3
The above-listed versions are the base versions from which the cluster can be upgraded to 4.6.4. From this list, you need to choose an appropriate version. For this example, let us select 4.5.16 as the cluster is currently at 4.4.6.
Now as mentioned in the above steps, open the release.txt file for 4.5.16 and check the upgrade version:
Version: 4.5.16
Upgrades: 4.4.13, 4.4.14, 4.4.15, 4.4.16, 4.4.17, 4.4.18, 4.4.19, 4.4.20, 4.4.21, 4.4.23, 4.4.26, 4.4.27, 4.4.28, 4.4.29, 4.4.30, 4.5.2, 4.5.3, 4.5.4, 4.5.5, 4.5.6, 4.5.7, 4.5.8, 4.5.9, 4.5.11, 4.5.13, 4.5.14, 4.5.15
From the above list, we have chosen 4.4.17 (as our cluster base version is 4.4.6, and 4.4.17 can be upgraded from 4.4.6). To confirm, open the release.txt file for 4.4.17. The list below confirms that 4.4.17 can be upgraded from 4.4.6:
Version: 4.4.17
Upgrades: 4.3.29, 4.3.31, 4.4.6, 4.4.8, 4.4.9, 4.4.10, 4.4.11, 4.4.12, 4.4.13, 4.4.14, 4.4.15, 4.4.16
Therefore, as per the analysis, the upgrade path for 4.4.6 to 4.6.4 is as follows:
4.4.6 -> 4.4.17 -> 4.5.16 -> 4.6.4
- This recipe shows how to upgrade from 4.4.6 to 4.4.17. If you wish to upgrade all the way to 4.6.4, repeat the same steps for each version in your upgrade path.
- The upgrade path may differ based on your base cluster version and chosen path.
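The check performed by reading each release.txt can also be done on the command line. Below is a small sketch, assuming the file keeps the "Upgrades:" line format shown above. The function name can_upgrade_from is a hypothetical helper introduced for this example.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the version given as $1 appears on the
# "Upgrades:" line of a release.txt read from standard input.
can_upgrade_from() {
    base=$1
    # Keep only the version list, put one version per line, then look for
    # an exact, literal match (so 4.4.6 does not match 4.4.60).
    sed -n 's/^[[:space:]]*Upgrades:[[:space:]]*//p' \
        | head -n 1 \
        | tr ',' '\n' \
        | tr -d ' \t\r' \
        | grep -qxF "$base"
}

# Example (run on the VM with internet connectivity):
#   curl -s https://mirror.openshift.com/pub/openshift-v4/clients/ocp/4.4.17/release.txt \
#       | can_upgrade_from 4.4.6 && echo "4.4.6 -> 4.4.17 is listed"
```

Running the same check for every hop in the chosen path is a quick way to catch a typo in a version number before any images are mirrored.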
Steps to upgrade the cluster platform version
1. Make sure all pre-requisites are set up and available before we begin with the upgrade.
In a disconnected environment, the following flow chart depicts the flow of OpenShift Container Platform upgrade:
As mentioned in the pre-requisites section, the following ingredients should be available before we proceed with the upgrade:
1. OCP cluster
2. Upgrade path has been selected based on the base cluster and the target version.
3. Private docker registry is available.
4. Virtual machine with internet connectivity is available. The virtual machine should also be able to connect to the private docker registry.
5. The pull-secret.json file with added auth for the private docker registry.
2. Set the below parameters on the virtual machine
OCP_RELEASE=4.4.17
LOCAL_REGISTRY=registry.example.com:<port>
LOCAL_REPOSITORY=ocp4/openshift
PRODUCT_REPO=openshift-release-dev
LOCAL_SECRET_JSON=<path to pull secret>/pullsecret.json
RELEASE_NAME=ocp-release
ARCHITECTURE=x86_64
REMOVABLE_MEDIA_PATH=.
- OCP_RELEASE is the target version.
- LOCAL_REGISTRY is the fully qualified name and port of the private docker registry (for IBM Cloud Pak System, the docker registry port is 443).
- LOCAL_REPOSITORY is the repository in the private registry that stores the Red Hat OpenShift Container Platform images. ocp4/openshift is the default value.
- PRODUCT_REPO is the repository on quay.io from which the release images are pulled. openshift-release-dev is the default value.
- LOCAL_SECRET_JSON is the path to the pull secret on the virtual machine.
- RELEASE_NAME is the OCP release name. ocp-release is the default value.
- ARCHITECTURE is the platform architecture. For 64-bit x86 systems, it is x86_64.
- REMOVABLE_MEDIA_PATH is the location on your virtual machine where the mirror folder will be created and the images will be mirrored.
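Before mirroring, it can help to confirm that every parameter is set and to see which release image the values resolve to. The sketch below assumes the values from this step. The port 443 and the secret path are placeholders to adapt, and the helper names source_release_image and check_parameters are hypothetical.

```shell
#!/bin/sh
# Values from step 2; the registry port and the secret path are placeholders.
OCP_RELEASE=4.4.17
LOCAL_REGISTRY=registry.example.com:443
LOCAL_REPOSITORY=ocp4/openshift
PRODUCT_REPO=openshift-release-dev
LOCAL_SECRET_JSON=./pullsecret.json
RELEASE_NAME=ocp-release
ARCHITECTURE=x86_64
REMOVABLE_MEDIA_PATH=.

# Hypothetical helper: print the source release image used by the mirror commands.
source_release_image() {
    printf 'quay.io/%s/%s:%s-%s\n' "$PRODUCT_REPO" "$RELEASE_NAME" "$OCP_RELEASE" "$ARCHITECTURE"
}

# Hypothetical helper: fail early if any parameter is empty.
check_parameters() {
    for name in OCP_RELEASE LOCAL_REGISTRY LOCAL_REPOSITORY PRODUCT_REPO \
                LOCAL_SECRET_JSON RELEASE_NAME ARCHITECTURE REMOVABLE_MEDIA_PATH; do
        eval "value=\${$name}"
        if [ -z "$value" ]; then
            echo "parameter $name is not set" >&2
            return 1
        fi
    done
}

# Prints: quay.io/openshift-release-dev/ocp-release:4.4.17-x86_64
check_parameters && source_release_image
```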
3. Mirror the images to virtual machine — dry run
Do a dry run first to confirm that the mirroring will succeed.
oc adm release mirror -a ${LOCAL_SECRET_JSON} --to-dir=${REMOVABLE_MEDIA_PATH}/mirror quay.io/${PRODUCT_REPO}/${RELEASE_NAME}:${OCP_RELEASE}-${ARCHITECTURE} --dry-run
4. Mirror the images to the virtual machine
After the dry run is successful, mirror the images:
oc adm release mirror -a ${LOCAL_SECRET_JSON} --to-dir=${REMOVABLE_MEDIA_PATH}/mirror quay.io/${PRODUCT_REPO}/${RELEASE_NAME}:${OCP_RELEASE}-${ARCHITECTURE}
After the images are mirrored successfully, a signature file is created. Note the path of the signature file.
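If the path of the signature file was not noted, it can be looked up afterwards. This is a minimal sketch, assuming the file is written under the config folder of the mirror directory and is named signature-<xxx>.yaml, as in the path used later in this post. The function name find_signature_files is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list the signature files found in a mirror directory.
# $1 = mirror directory (defaults to ./mirror, i.e. REMOVABLE_MEDIA_PATH=. from step 2)
find_signature_files() {
    mirror_dir=${1:-./mirror}
    # Errors are silenced so that a missing directory just yields no output.
    find "$mirror_dir/config" -type f -name 'signature-*.yaml' 2>/dev/null
}

# Example:
#   find_signature_files "${REMOVABLE_MEDIA_PATH}/mirror"
```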
5. Mirror the images to private docker registry
After the images are mirrored to the virtual machine, push these to the private docker registry:
oc image mirror -a ${LOCAL_SECRET_JSON} --from-dir=${REMOVABLE_MEDIA_PATH}/mirror 'file://openshift/release:<version>*' ${LOCAL_REGISTRY}/${LOCAL_REPOSITORY}
6. Create the config file
Copy /mirror/config/signature-<xxx>.yaml from the virtual machine to the primary helper of the cluster. The primary helper is the node that has the oc CLI deployed and from which the cluster upgrade is run.
On the primary helper node, run the following command to create the config file:
oc apply -f signature-<xxx>.yaml
Note: The above is valid when upgrading from version 4.4.18 onward. For versions below 4.4.18, follow steps 7–8 to create the file manually.
7. Set parameters to run the upgrade
OCP_RELEASE_NUMBER=4.4.17
ARCHITECTURE=x86_64
DIGEST="$(oc adm release info quay.io/openshift-release-dev/ocp-release:${OCP_RELEASE_NUMBER}-${ARCHITECTURE} | sed -n 's/Pull From: .*@//p')"
DIGEST_ALGO="${DIGEST%%:*}"
DIGEST_ENCODED="${DIGEST#*:}"
SIGNATURE_BASE64=$(curl -s "https://mirror.openshift.com/pub/openshift-v4/signatures/openshift/release/${DIGEST_ALGO}=${DIGEST_ENCODED}/signature-1" | base64 -w0 && echo)
Note: Run this step on the VM with an internet connection to get the image signature file.
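The two parameter expansions in this step split the digest into its algorithm and its encoded value. The sketch below shows the split on a made-up digest. The value of DIGEST and the helper is_sha256_digest are hypothetical and only serve the illustration; the real digest comes from the oc adm release info command above.

```shell
#!/bin/sh
# Made-up digest for illustration only; a real one is much longer.
DIGEST="sha256:0123456789abcdef"

DIGEST_ALGO="${DIGEST%%:*}"     # drop everything from the first ':' onward
DIGEST_ENCODED="${DIGEST#*:}"   # drop everything up to and including the first ':'

echo "algorithm: ${DIGEST_ALGO}"
echo "encoded:   ${DIGEST_ENCODED}"
echo "signature path: ${DIGEST_ALGO}=${DIGEST_ENCODED}/signature-1"

# Hypothetical sanity check: a real sha256 digest has 64 hexadecimal characters.
# An empty or malformed DIGEST usually means the oc command above failed.
is_sha256_digest() {
    printf '%s\n' "$1" | grep -Eq '^sha256:[0-9a-f]{64}$'
}

is_sha256_digest "$DIGEST" || echo "not a full sha256 digest (expected for this made-up value)"
```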
8. Create the config file for checksum
cat >checksum-${OCP_RELEASE_NUMBER}.yaml <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: release-image-${OCP_RELEASE_NUMBER}
  namespace: openshift-config-managed
  labels:
    release.openshift.io/verification-signatures: ""
binaryData:
  ${DIGEST_ALGO}-${DIGEST_ENCODED}: ${SIGNATURE_BASE64}
EOF
After the file is created successfully, apply the checksum file to the cluster:
oc create -f checksum-${OCP_RELEASE_NUMBER}.yaml
9. Upgrade the cluster to the new version
Set the parameters as follows:
LOCAL_REGISTRY=registry.example.com
LOCAL_REPOSITORY=ocp4/openshift
Note: We need not provide a port for LOCAL_REGISTRY. LOCAL_REPOSITORY must be the same repository to which the images were pushed in step 5.
Upgrade Command:
Run the below command to initiate the cluster upgrade:
oc adm upgrade --allow-explicit-upgrade --to-image ${LOCAL_REGISTRY}/${LOCAL_REPOSITORY}@sha256:<shasum>
Note: The shasum value can be taken from the signature file copied in step 6. It is also the DIGEST value obtained in step 7.
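To avoid typing mistakes in the image reference, the value passed to --to-image can be assembled and checked first. A minimal sketch follows. The function name upgrade_image_ref is hypothetical, and the registry, repository and digest shown are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: build the value passed to --to-image.
# $1 = registry host, $2 = repository, $3 = digest in the form sha256:<shasum>
upgrade_image_ref() {
    case $3 in
        sha256:?*) printf '%s/%s@%s\n' "$1" "$2" "$3" ;;
        *) echo "digest must look like sha256:<shasum>" >&2; return 1 ;;
    esac
}

# Example with placeholder values; replace <shasum> with the real digest.
# Prints: registry.example.com/ocp4/openshift@sha256:<shasum>
upgrade_image_ref registry.example.com ocp4/openshift 'sha256:<shasum>'
```

The printed reference can then be reviewed and passed to oc adm upgrade with the --to-image option shown above.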
10. Confirm the status of the upgrade
While the upgrade is in progress, it is good to check on the upgrade status intermittently.
Run the below command to check the current progress of the upgrade:
oc get clusterversion
After the upgrade completes successfully, oc get clusterversion should show the upgraded version.
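Rather than re-running the command by hand, the check can be put in a loop. The following is only a sketch: the function name wait_for_upgrade, the 60-second interval and the limit of 90 polls are arbitrary choices made for this example. It reads the newest entry in the history of the ClusterVersion object named version.

```shell
#!/bin/sh
# Hypothetical helper: poll until the newest history entry reports the
# target version with state "Completed".
# $1 = target version, $2 = seconds between polls (default 60), $3 = max polls (default 90)
wait_for_upgrade() {
    target=$1; interval=${2:-60}; attempts=${3:-90}
    while [ "$attempts" -gt 0 ]; do
        version=$(oc get clusterversion version -o jsonpath='{.status.history[0].version}')
        state=$(oc get clusterversion version -o jsonpath='{.status.history[0].state}')
        echo "version=${version} state=${state}"
        if [ "$version" = "$target" ] && [ "$state" = "Completed" ]; then
            return 0
        fi
        attempts=$((attempts - 1))
        sleep "$interval"
    done
    # The limit was reached without seeing the target version completed.
    return 1
}

# Example:
#   wait_for_upgrade 4.4.17 && echo "upgrade finished"
```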
Troubleshooting Guide:
A cluster upgrade can run into errors while it progresses. The troubleshooting guide below helps resolve issues that are commonly faced during the upgrade:
1. Upgrade gets stuck due to a degraded cluster operator:
If the upgrade gets stuck on any cluster operator (co), delete the operator and its pods. After the co restarts, the upgrade continues on its own.
Example:
oc delete co <co name>
oc project <co name>
oc delete pods --all
Check whether the pods are restarted again in the same project. All the pods should be in Running state.
oc get pods
The co should have restarted, and its Degraded state should be False:
oc get co | grep <co name>
Note: If the issue does not get resolved, search for a relevant Bugzilla report or contact Red Hat support.
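To find out which operator is holding up the upgrade, the output of oc get co can be filtered. This is a small sketch, assuming the column order NAME, VERSION, AVAILABLE, PROGRESSING, DEGRADED, SINCE and that every column has a value. The function name list_degraded is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read `oc get co` output on standard input and print the
# names of the operators whose DEGRADED column (5th field) is True.
# Limitation: a row with an empty VERSION column shifts the fields and is missed.
list_degraded() {
    awk 'NR > 1 && $5 == "True" { print $1 }'
}

# Example:
#   oc get co | list_degraded
```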
2. If the following error message is shown during the cluster upgrade:
“The cluster operator openshift-samples has not yet successfully rolled out”
Follow the below steps to resolve the issue:
Step 1: Run the following oc command:
oc delete configs.samples cluster
Step 2: Wait for some time, and run the following command:
oc get co
The openshift-samples operator should be in the Degraded=False state:
NAME                VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE
openshift-samples   4.4.17    False       True          False      7
3. If the upgrade is halted due to a degraded api-server:
Refer to the following Bugzilla report to restart api-server operators: https://bugzilla.redhat.com/show_bug.cgi?id=1817588
4. To confirm the upgrade status, run the below commands:
oc get clusterversion
oc get clusterversion -o json|jq ".items[0].spec"
oc get clusterversion -o json|jq ".items[0].status.history"
Additional information and references:
- Based on the upgrade path, the upgrade has to be done one by one for every version in the path till the target version is reached.
- The typical upgrade period is 30–50 minutes per version upgrade.
- Red Hat reference guide: https://docs.openshift.com/container-platform/4.4/updating/updating-restricted-network-cluster.html
The above steps should help you upgrade your cluster smoothly. Stay tuned for more such interesting technical topics!