“Day 2” operations cover everything that happens after the cluster is installed. That can be a little or a lot, depending on how you plan to use the cluster.
Effective troubleshooting and monitoring in OpenShift require understanding the right way to retrieve logs and manage issues. While it might be tempting to SSH directly into cluster nodes, OpenShift provides tools and workflows to handle logs more securely and efficiently.
Why You Should Avoid SSHing to Nodes
SSHing directly into cluster nodes might seem like a quick way to debug issues, but it introduces several risks and challenges:
1. Security Risks
- Inconsistent Access Control: Granting SSH access bypasses OpenShift’s centralized role-based access control (RBAC).
- Increased Attack Surface: Open SSH ports expose nodes to potential attacks.
2. Configuration Drift
- Manual changes made via SSH can lead to discrepancies between the actual state and the desired state managed by OpenShift.
- Untracked modifications can complicate troubleshooting and recovery processes.
3. Cluster Stability
- Direct changes to system files or services can inadvertently disrupt critical cluster operations.
- Node taints and labels, critical for scheduling, might be accidentally altered.
4. Unsupported Practices
- OpenShift’s design assumes that all management and troubleshooting occur through API-driven tools.
- Manual SSH access may invalidate support agreements or create unsupported states.
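When node-level access is genuinely needed, a debug pod started through the API is one alternative to an SSH session, since it goes through the same RBAC checks as other `oc` commands. The sketch below assumes `oc` is already logged in with permission to create debug pods. The function name `node_exec` is hypothetical and only used for illustration.

```shell
# Hypothetical helper: run a single command on a node via a debug pod
# instead of SSH. Assumes `oc` is logged in and allowed to use `oc debug`.
node_exec() {
  node="$1"
  shift
  # chroot /host switches into the node's root filesystem,
  # which the debug pod mounts at /host.
  oc debug "node/${node}" -- chroot /host "$@"
}
```

A call such as `node_exec <node> systemctl status kubelet` would inspect the kubelet service without opening an SSH port on the node.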
Retrieving OpenShift Cluster Logs
Logs are invaluable for understanding the state of your OpenShift cluster and diagnosing problems. OpenShift provides several ways to access these logs efficiently:
Node Logs
Display node journal:
oc adm node-logs <node>
Tail 10 lines from node journal:
oc adm node-logs --tail=10 <node>
Get kubelet journal logs only:
oc adm node-logs -u kubelet.service <node>
Grep the node journal for the word “kernel”:
oc adm node-logs --grep=kernel <node>
List the contents of /var/log (the --path value is relative to /var/log on the node):
oc adm node-logs --path=/ <node>
Get /var/log/audit/audit.log from the node:
oc adm node-logs --path=audit/audit.log <node>
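The commands above target one node at a time. As a rough sketch, they can be wrapped in a loop to save the recent kubelet journal from every node. The function name `collect_kubelet_logs`, the default output directory, and the default line count are all assumptions made for this example, and it assumes `--tail` and `-u` can be combined in one call.

```shell
# Hypothetical helper: save the last N kubelet journal lines from each node.
# Arguments (both optional): output directory, number of lines.
collect_kubelet_logs() {
  out_dir="${1:-node-logs}"
  lines="${2:-200}"
  mkdir -p "$out_dir"
  # `oc get nodes -o name` prints entries such as node/<name>.
  for node in $(oc get nodes -o name); do
    name="${node#node/}"   # strip the "node/" prefix
    oc adm node-logs --tail="$lines" -u kubelet.service "$name" \
      > "${out_dir}/${name}.log"
  done
}
```

Running `collect_kubelet_logs ./logs 50` would write one file per node under `./logs`, which is handy when attaching logs to a support case.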
Pod Logs
Pod logs provide insights into application behavior.
- Retrieve logs for a specific pod:
oc logs <pod-name> -n <namespace>
- For pods with multiple containers, specify the container name:
oc logs <pod-name> -c <container-name> -n <namespace>
- Stream logs in real-time:
oc logs -f <pod-name> -n <namespace>
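When a container has crashed and restarted, the commands above show only the new instance. A minimal sketch for looking at the instance that died uses the `--previous`, `--timestamps`, and `--tail` flags of `oc logs`. The function name `previous_logs` and the default of 100 lines are assumptions for illustration.

```shell
# Hypothetical helper: show the tail of the previous (terminated) container
# instance of a pod, with timestamps.
# Arguments: pod name, namespace, optional number of lines (default 100).
previous_logs() {
  pod="$1"
  namespace="$2"
  lines="${3:-100}"
  oc logs --previous --timestamps --tail="$lines" "$pod" -n "$namespace"
}
```

For example, `previous_logs <pod-name> <namespace> 50` prints the last 50 lines written before the most recent restart.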