OpenShift Day 2 Operations – Part 1

“Day 2” operations refer to everything that happens after the cluster is installed, which could be a lot or a little, depending on how you plan to use the cluster.

Verifying the Health of Your OpenShift 4 Cluster

Managing an OpenShift 4 cluster effectively involves regular health checks to ensure smooth operation and reliability. An unhealthy cluster can lead to downtime, reduced performance, and compromised workloads.

Node Health

Healthy nodes are crucial for running workloads effectively.

Use this command to check the node status:

oc get nodes

Verify that all nodes show Ready in the STATUS column.
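On a cluster with many nodes, reading the STATUS column by eye gets tedious. The sketch below filters the node list down to anything whose status is not exactly Ready, which also catches cordoned nodes reported as Ready,SchedulingDisabled. The function name not_ready_nodes is a hypothetical helper, not part of oc, and the filter assumes the default column layout where STATUS is the second column.

```shell
# Print "<node> <status>" for every node whose STATUS is not exactly "Ready".
# Reads the output of `oc get nodes --no-headers` on stdin.
# not_ready_nodes is a hypothetical helper name chosen for this example.
not_ready_nodes() {
  awk '$2 != "Ready" { print $1 " " $2 }'
}

# Usage against a live cluster (empty output means every node is Ready):
#   oc get nodes --no-headers | not_ready_nodes
```

Matching the column exactly rather than grepping for "NotReady" is a deliberate choice here, so that nodes in a combined state are reported as well.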

You can use oc get nodes -o wide to get more details about each node, such as its internal IP, OS image, kernel version, and container runtime.

For more details on a specific node:

oc describe node <node-name>

Verify resources allocated to a node:

oc describe node <node-name> | grep -A 10 "Allocated resources"

Get allocated resources for all nodes:

oc describe nodes | grep -A 10 "Allocated resources"
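One drawback of the grep approach across all nodes is that the output does not say which node each block belongs to. A possible alternative, sketched below, labels each block with the node name. It assumes the usual oc describe layout, where section headers such as Name: and Allocated resources: start in the first column and their contents are indented. The function name allocated_by_node is hypothetical.

```shell
# Print the "Allocated resources" section of each node, prefixed by its name.
# Reads the output of `oc describe nodes` on stdin.
# allocated_by_node is a hypothetical helper name chosen for this example.
allocated_by_node() {
  awk '
    # A top-level "Name:" line starts a new node; print it as a label.
    /^Name:/ { print "== " $2 }
    # Any non-indented line starts a new section; show only the one we want.
    /^[^ \t]/ { show = ($0 ~ /^Allocated resources:/) }
    show { print }
  '
}

# Usage against a live cluster:
#   oc describe nodes | allocated_by_node
```

Because the filter stops at the next non-indented line, it does not depend on the section being exactly ten lines long.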

Check Cluster Operators

Cluster Operators are responsible for managing the lifecycle of key components of an OpenShift cluster. To verify their status, run the following command and check that every operator reports AVAILABLE as True, with PROGRESSING and DEGRADED both False:

oc get clusteroperators

Wrapping the previous command in watch refreshes the output every five seconds, which is very useful while you are upgrading your cluster:

watch -n5 oc get clusteroperators
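With dozens of operators in the list, it can help to show only the ones that need attention. The sketch below keeps operators that are not available, are still progressing, or are degraded. It assumes the default column order (NAME, VERSION, AVAILABLE, PROGRESSING, DEGRADED) and that every operator reports a version. The function name unhealthy_operators is hypothetical.

```shell
# Print operators that are not in the steady state
# AVAILABLE=True, PROGRESSING=False, DEGRADED=False.
# Reads the output of `oc get clusteroperators --no-headers` on stdin.
# unhealthy_operators is a hypothetical helper name chosen for this example.
unhealthy_operators() {
  awk '$3 != "True" || $4 != "False" || $5 != "False" {
    print $1 " available=" $3 " progressing=" $4 " degraded=" $5
  }'
}

# Usage against a live cluster (empty output means all operators are settled):
#   oc get clusteroperators --no-headers | unhealthy_operators
```

During an upgrade, operators are expected to report PROGRESSING as True for a while, so entries in this list are not necessarily a problem until they persist or show DEGRADED as True.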

Pod Health

Ensuring that pods are running as expected is a key part of cluster health.

List pods that are neither Running nor Completed:

oc get pods -A -o wide | grep -v -E 'Completed|Running'
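The grep filter above matches anywhere on the line and does not notice pods that are Running but have containers that are not ready. A minimal sketch of a stricter filter follows. It assumes the column layout of oc get pods -A (NAMESPACE, NAME, READY, STATUS, ...) and uses a hypothetical function name, unhealthy_pods.

```shell
# Print "<namespace>/<pod> <status> <ready>" for pods that are either
#   - in a status other than Running or Completed, or
#   - Running with fewer ready containers than total containers.
# Reads the output of `oc get pods -A --no-headers` on stdin.
# unhealthy_pods is a hypothetical helper name chosen for this example.
unhealthy_pods() {
  awk '{
    split($3, ready, "/")   # READY column looks like "1/2"
    bad_status = ($4 != "Running" && $4 != "Completed")
    not_ready  = ($4 == "Running" && ready[1] != ready[2])
    if (bad_status || not_ready) print $1 "/" $2 " " $4 " " $3
  }'
}

# Usage against a live cluster:
#   oc get pods -A --no-headers | unhealthy_pods
```

Completed pods are skipped even though they show 0/1 in the READY column, since that is their normal final state.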