MySphere Posts

When you delete a node using the CLI, the node object is deleted in Kubernetes, but the pods that exist on the node are not deleted. Any bare pods not backed by a replication controller become inaccessible to OpenShift Container Platform. Pods backed by replication controllers are rescheduled to other available nodes. You must delete local manifest pods.

  • To delete a node from a UPI installation, first mark it unschedulable (cordon) and then drain it before deleting it:

$ oc adm cordon <node_name>
$ oc adm drain <node_name> --force --delete-local-data --ignore-daemonsets
- Also make sure that no jobs or cronjobs are running or scheduled on this node, because draining does not take them into account.
- For Red Hat OpenShift Container Platform 4.7+, use the `--delete-emptydir-data` option if `--delete-local-data` does not work. The `--delete-local-data` option is deprecated in favor of `--delete-emptydir-data`.
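As a rough sketch of the two notes above, the helper below prints (without running) a pod listing for the node plus the cordon and drain commands, and picks the emptyDir flag from the OpenShift version. The function name `drain_commands`, the node name `worker-1`, and the assumption that the version string starts with `4.` are illustrative only.

```shell
# Print, but do not execute, the commands to prepare a node for deletion.
# Arguments: node name, OpenShift version such as 4.6 or 4.12.3 (assumed 4.x).
drain_commands() {
  node="$1"
  version="$2"
  minor="${version#*.}"     # strip the leading "4."
  minor="${minor%%.*}"      # strip any patch level
  if [ "$minor" -ge 7 ]; then
    flag="--delete-emptydir-data"   # 4.7 and later
  else
    flag="--delete-local-data"      # deprecated spelling for older releases
  fi
  # List pods still on the node so running jobs can be spotted by hand.
  echo "oc get pods --all-namespaces -o wide --field-selector spec.nodeName=$node"
  echo "oc adm cordon $node"
  echo "oc adm drain $node --force $flag --ignore-daemonsets"
}

drain_commands worker-1 4.12
```

Review the printed commands and run them yourself once the pod listing shows nothing that must finish first.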

Back up the node definition so that it can be re-created later:

$ oc get node <node_name> -o yaml > backupnode.yaml

Before deleting the node, make sure it is powered off:
$ oc delete node <node_name>

Although the node object is now deleted from the cluster, it can still rejoin the cluster after reboot or if the kubelet service is restarted. To permanently delete the node and all its data, you must decommission the node once it is in shutdown mode.

Once the node object is deleted, the node can be powered off. If it needs to rejoin the cluster, either restart the kubelet or re-create the node object from the backup YAML:

$ oc create -f backupnode.yaml

The node can also be brought back by restarting the kubelet on it:

$ systemctl restart kubelet

If you need to destroy all data on the worker node, including all installed software, run the following:

# nohup shred -n 25 -f -z /dev/[HDD]
This command overwrites all data on /dev/[HDD] repeatedly, to make it harder for even very expensive hardware probing to recover the data. The -n 25 option overwrites the device 25 times (change the count with -n [number]), and -z adds a final pass of zeros at the end.

Consider running this command from a rescue CD.

In order to monitor the deletion of the node, get the kubelet live logs:

$ oc adm node-logs <node-name> -u kubelet

https://access.redhat.com/solutions/4976801

Uncategorized

Applying a specific node selector to all infrastructure components will guarantee that they will be scheduled on nodes with that label. See more details on node selectors in placing pods on specific nodes using node selectors, and about node labels in understanding how to update labels on nodes.

Our node label and matching selector for infrastructure components will be node-role.kubernetes.io/infra: "".

To prevent other workloads from also being scheduled on those infrastructure nodes, we need one of two solutions:

  • Apply a taint to the infrastructure nodes and tolerations to the desired infrastructure workloads.
    OR
  • Apply a completely separate label to your other nodes and matching node selector to your other workloads such that they are mutually exclusive from infrastructure nodes.

TIP: To ensure High Availability (HA) each cluster should have three Infrastructure nodes, ideally across availability zones. See more details about rebooting nodes running critical infrastructure.

TIP: Review the infrastructure node sizing suggestions

By default all nodes except for masters will be labeled with node-role.kubernetes.io/worker: "". We will be adding node-role.kubernetes.io/infra: "" to infrastructure nodes.

However, if you want to remove the existing worker role from your infra nodes, you will need a custom MachineConfigPool (MCP) to ensure that all the nodes upgrade correctly. This is because the worker MCP is responsible for updating and upgrading the nodes, and it finds them by looking for this node-role label. If you remove the label, you must have an MCP that can find your infra nodes by the infra node-role label instead. In OCP 4.3 and earlier this was not the case, and removing the worker label could cause issues.

This infra MCP definition below will find all MachineConfigs labeled both “worker” and “infra” and it will apply them to any Machines or Nodes that have the “infra” role label. In this manner, you will ensure that your infra nodes can upgrade without the “worker” role label.

apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfigPool
metadata:
  name: infra
spec:
  machineConfigSelector:
    matchExpressions:
      - {key: machineconfiguration.openshift.io/role, operator: In, values: [worker,infra]}
  nodeSelector:
    matchLabels:
      node-role.kubernetes.io/infra: ""

If you are not using the MachineSet API to manage your nodes, labels and taints are applied manually to each node:

Label it:

oc label node <node-name> node-role.kubernetes.io/infra=
oc label node <node-name> node-role.kubernetes.io=infra

Taint it:

oc adm taint nodes -l node-role.kubernetes.io/infra node-role.kubernetes.io/infra=reserved:NoSchedule node-role.kubernetes.io/infra=reserved:NoExecute
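
Once nodes are labeled and tainted, infrastructure workloads need a matching node selector and tolerations. As one hedged example, the sketch below builds a merge patch for the default IngressController (the router). The helper name `infra_placement_patch` is made up, and the toleration value `reserved` simply mirrors the taint command above.

```shell
# Print a JSON merge patch that places the router on infra nodes and lets it
# tolerate both infra taints (NoSchedule and NoExecute, value "reserved").
infra_placement_patch() {
  cat <<'EOF'
{"spec":{"nodePlacement":{
  "nodeSelector":{"matchLabels":{"node-role.kubernetes.io/infra":""}},
  "tolerations":[
    {"key":"node-role.kubernetes.io/infra","value":"reserved","effect":"NoSchedule"},
    {"key":"node-role.kubernetes.io/infra","value":"reserved","effect":"NoExecute"}
  ]}}}
EOF
}

# Usage (needs the oc CLI and cluster-admin rights):
# oc patch ingresscontroller/default -n openshift-ingress-operator \
#   --type=merge -p "$(infra_placement_patch)"
```

Other infrastructure components (registry, monitoring, logging) are configured through their own resources, so treat this only as a pattern for the selector and tolerations.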

openshift Uncategorized

Infrastructure nodes allow customers to isolate infrastructure workloads for two primary purposes:

  1. to prevent incurring billing costs against subscription counts and
  2. to separate maintenance and management.

This solution is meant to complement the official documentation on creating Infrastructure nodes in OpenShift 4. In addition there is a great OpenShift Commons video describing this whole process: OpenShift Commons: Everything about Infra nodes

To resolve the first problem, all that is needed is a node label added to a particular node, set of nodes, or machines and machineset. Red Hat subscription vCPU counts omit any vCPU reported by a node labeled node-role.kubernetes.io/infra: "" and you will not be charged for these resources from Red Hat. Please see How to confirm infra nodes not included in subscription cost in OpenShift Cluster Manager? to confirm your vCPU reports correctly after applying the configuration changes in this article.

To resolve the second problem we need to schedule infrastructure workloads specifically to infrastructure nodes and also to prevent other workloads from being scheduled on infrastructure nodes. There are two strategies for accomplishing this that we will go into later.

You may ask why infrastructure workloads are different from those workloads running on the control plane. At a minimum, an OpenShift cluster contains 2 worker nodes in addition to 3 control plane nodes. While control plane components critical to the cluster operability are isolated on the masters, there are still some infrastructure workloads that by default run on the worker nodes – the same nodes on which cluster users deploy their applications.

Note: To know the workloads that can be executed in infrastructure nodes, check the “Red Hat OpenShift control plane and infrastructure nodes” section in OpenShift sizing and subscription guide for enterprise Kubernetes.

Planning node changes around any nodes hosting these infrastructure components should not be addressed lightly, and in general should be addressed separately from nodes specifically running normal application workloads.

openshift

It is not possible to change the domain for the API, internal or external.

Starting with OpenShift 4.8, it is possible to change the domain of the console and downloads routes after cluster installation.

Choose your domain name carefully.

For more information, see this document from Red Hat: https://access.redhat.com/solutions/4853401

openshift

rsync is generally faster than scp for copying files, especially when transferring a large amount of data or syncing directories. Here’s why:

1. Incremental Transfers

  • rsync: Only transfers the parts of files that have changed, rather than the entire file. This makes subsequent transfers much faster.
  • scp: Always transfers the entire file, even if only a small part of it has changed.

2. Compression

  • rsync: Supports compression during the transfer (using the -z option), which reduces the amount of data sent over the network.
  • scp: Also supports compression (using the -C option), but it doesn’t have the same efficiency in skipping unchanged data.

3. Resume Support

  • rsync: Can resume interrupted transfers without starting over (using the --partial flag).
  • scp: Does not natively support resuming transfers. If the transfer is interrupted, you need to restart it.

4. Efficient Directory Handling

  • rsync: Designed for syncing directories, handling file metadata, permissions, and symbolic links efficiently.
  • scp: Less efficient for syncing directories and preserving metadata.

When to Use Each Tool

  • Use rsync if:
    • You need to sync large files or directories.
    • You expect the transfer might be interrupted.
    • Only parts of files or directories have changed.
  • Use scp if:
    • You need a simple, one-time transfer of a few files.
    • You don’t need incremental syncing or advanced features.

Command Examples:

  • rsync:

rsync -avz source_file user@remote:/path/to/destination

  • scp:

scp source_file user@remote:/path/to/destination

In summary, rsync is more efficient for most use cases, particularly when dealing with large or frequently updated files.

Linux

I followed the instructions to create and share a folder on OMV, but I couldn’t access the shared folder from my Mac or Linux machines.

On the Mac I got the error: The operation can’t be completed because the original item for “folder name” can’t be found.

When I tried to mount the shared folder on a Linux machine, I got Permission Denied.

I discovered that only the first user created during the initial setup could access the shared folders.

I opened a terminal session, connected to the OMV machine over SSH, and listed the permissions of the disks.

The failing disk had the permission drwx------.

The only fix I found was to change the permission to 0775; after that, all users could mount the shared folders.
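
The permission change can be sketched as follows. The function name is made up for illustration, and the path in the usage comment is a placeholder, not the actual mount point from this setup.

```shell
# Show a directory's mode, open it up to 0775, and show the result.
fix_share_permissions() {
  dir="$1"
  ls -ld "$dir"        # before: for example drwx------ (owner only)
  chmod 0775 "$dir"    # rwx for owner and group, r-x for everyone else
  ls -ld "$dir"        # after: drwxrwxr-x
}

# Usage on the OMV host (placeholder path, run as root):
# fix_share_permissions /srv/<your-disk-mount>/<shared-folder>
```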

Linux

Yesterday I was helping a customer deploy his OpenShift 4.16.x cluster. The first step was the bastion host preparation, which includes the setup of the OCP CLI.

We did the default installation of the CLI on RHEL 8.x, but after the installation we got the following error:

$ oc version
oc: /lib64/libc.so.6: version `GLIBC_2.33' not found (required by oc)
oc: /lib64/libc.so.6: version `GLIBC_2.34' not found (required by oc)
oc: /lib64/libc.so.6: version `GLIBC_2.32' not found (required by oc)

We tried to compile a newer GLIBC, but without success.

The solution: download the CLI version compiled for RHEL 8. Link here for amd64
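
To check up front whether a bastion host can run a given oc binary, you can compare the installed glibc version with the versions named in the error above. This is a minimal sketch, assuming a glibc-based system with GNU `sort -V`; the helper name `version_ge` is made up for this example.

```shell
# True when version $1 is greater than or equal to version $2.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

# getconf prints something like "glibc 2.28"; keep only the number.
installed="$(getconf GNU_LIBC_VERSION 2>/dev/null | awk '{print $2}')"
required="2.34"   # highest version listed in the error above

if [ -z "$installed" ]; then
  echo "could not detect the glibc version"
elif version_ge "$installed" "$required"; then
  echo "glibc $installed is new enough for this oc binary"
else
  echo "glibc $installed is older than $required: use the RHEL 8 build of oc"
fi
```

RHEL 8 ships glibc 2.28, which is why the default binary fails there.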

Linux openshift

In the context of a Kubernetes cluster, CPU throttling refers to limiting the amount of CPU time a container or pod can use. It differs slightly from throttling within an individual CPU or device, because Kubernetes manages resources at the container level. Here’s a breakdown of how CPU throttling works in a Kubernetes environment:

1. CPU Resources in Kubernetes

Kubernetes allows you to specify how much CPU a container can request and how much it is allowed to consume. This is done through resource requests and limits:

  • CPU Request: The minimum CPU resource that the container is guaranteed to have.
  • CPU Limit: The maximum CPU resource the container can use.

Kubernetes uses CPU throttling to ensure that containers do not exceed their allocated CPU limits. If a container tries to use more CPU than it has been allocated (based on the CPU limit), Kubernetes will throttle the container’s CPU usage to prevent it from violating the resource limits.

2. How CPU Throttling Works in Kubernetes

  • CPU Requests: When a container is scheduled on a node, Kubernetes ensures that the requested CPU is available to the container. If the node doesn’t have enough available CPU, the pod may not be scheduled.
  • CPU Limits: If a container exceeds its CPU limit (i.e., tries to use more CPU than what is specified in the limit), Kubernetes throttles the container’s CPU usage. The system does this by applying CPU usage constraints (using mechanisms like CFS (Completely Fair Scheduler) in Linux) to ensure that the container doesn’t exceed its allocated CPU time.
  • CFS Throttling: The CFS quota system controls how much CPU a container can use. If a container tries to use more CPU than its allocated limit, the kernel uses a mechanism called CFS throttling. Essentially, the Linux kernel will temporarily stop a container from using the CPU until it is within the allowed usage range.
  • Exceeding Limits: If a container tries to use more CPU than its limit allows (e.g., 1 CPU core), Kubernetes will restrict or throttle the container, reducing its access to the CPU until it falls back within the limit.

3. Example: CPU Limits and Throttling

Suppose you define a pod in Kubernetes with the following resource configuration:

apiVersion: v1
kind: Pod
metadata:
  name: example-pod
spec:
  containers:
  - name: example-container
    image: myimage
    resources:
      requests:
        cpu: "500m"      # 0.5 CPU core requested
      limits:
        cpu: "1000m"      # 1 CPU core max

  • Request: The container is guaranteed 500 milli-CPU (or 0.5 CPU core).
  • Limit: The container can burst up to 1000 milli-CPU (or 1 CPU core).

If the container tries to use more than 1 CPU core (e.g., if the workload spikes and tries to use 1.5 CPU cores), Kubernetes will throttle it back down to 1 core.
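
The arithmetic behind these numbers can be sketched in shell. This assumes the default CFS period of 100000 microseconds (100 ms) and the cgroup v1 convention of 1024 CPU shares per core; the function names are made up for illustration.

```shell
# CPU limit in millicores -> CFS quota in microseconds per period.
cfs_quota_us() {
  millicores="$1"
  period_us="${2:-100000}"   # default CFS period: 100 ms
  echo $(( millicores * period_us / 1000 ))
}

# CPU request in millicores -> relative CPU shares (cgroup v1 scale).
cpu_shares() {
  echo $(( $1 * 1024 / 1000 ))
}

echo "limit 1000m  -> quota $(cfs_quota_us 1000) us per 100000 us period"
echo "request 500m -> shares $(cpu_shares 500)"
```

With a 1000m limit the container may use 100 ms of CPU time in every 100 ms period. A container running four busy threads can spend that budget in about 25 ms of wall-clock time and is then paused until the next period starts, which is why throttling can appear even when average usage looks low.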

4. How Throttling Happens in Practice

  • Within Node: If there are multiple containers on the same node and they exceed their CPU limits, the Linux kernel (via CFS) enforces throttling to ensure that no container exceeds its CPU limit. This can cause delays or latency in container performance, especially when several containers are competing for CPU on a node.
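  • Overcommitment: If a node is overcommitted (i.e., the sum of all container CPU limits exceeds the physical capacity of the node), containers compete for the CPU that is actually available. Under contention the kernel shares CPU time in proportion to each container’s CPU request, so a container may get less than its limit even when it has not used up its CFS quota.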

5. Monitoring CPU Throttling in Kubernetes

You can monitor CPU throttling in a Kubernetes cluster by observing certain metrics, such as:

  • container_cpu_cfs_throttled_seconds_total: This metric in Prometheus shows how much time a container has been throttled by the kernel (CFS throttling).
  • container_cpu_usage_seconds_total: This metric shows the total CPU usage by a container, which can help correlate throttling behavior with usage spikes.

You can query these metrics in Prometheus and Grafana to see if containers are being throttled and to identify performance bottlenecks.
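
For a quick check without Prometheus, the underlying kernel counters can be read from the cgroup's cpu.stat file. The sketch below computes the percentage of CFS periods in which the cgroup was throttled. The file location in the usage comment assumes cgroup v2 and may differ on your nodes, and the function name is illustrative.

```shell
# Print the percentage of CFS periods that were throttled, from a cpu.stat file.
throttled_percent() {
  awk '$1 == "nr_periods"   { periods = $2 }
       $1 == "nr_throttled" { throttled = $2 }
       END {
         if (periods > 0) printf "%.1f\n", 100 * throttled / periods
         else             print "0.0"
       }' "$1"
}

# Usage inside a container (cgroup v2 path, an assumption):
# throttled_percent /sys/fs/cgroup/cpu.stat
```

A value that stays well above zero suggests the CPU limit is too low for the workload's bursts.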

6. What Happens When Throttling Occurs?

When CPU throttling happens, a container might experience:

  • Increased Latency: Throttling limits the amount of CPU time available to the container, leading to increased response times.
  • Reduced Performance: The container may be unable to process requests as quickly, affecting application performance.
  • Delays in Processing: If the container cannot access enough CPU resources, jobs that require more compute power will queue up and take longer to complete.

7. How to Avoid CPU Throttling in Kubernetes

To avoid CPU throttling and ensure containers have the necessary resources:

  • Proper Resource Allocation: Set appropriate CPU requests and limits. The request should reflect the expected CPU usage, while the limit should provide headroom for occasional spikes.
  • Monitor Resource Usage: Use monitoring tools like Prometheus, Grafana, and Kubernetes metrics server to observe resource usage and throttling events.
  • Avoid Overcommitment: The scheduler already keeps the sum of CPU requests within a node’s capacity, so also keep the sum of the CPU limits of all containers on a node close to the total CPU capacity of that node.
  • Horizontal Scaling: If a container regularly hits its CPU limits, consider scaling the application horizontally by adding more pods to distribute the load.

8. Conclusion

In a Kubernetes cluster, CPU throttling is primarily a mechanism to enforce resource limits and ensure that each container gets its fair share of CPU time, preventing any single container from monopolizing resources. While this helps maintain system stability and prevent resource exhaustion, it can result in performance degradation if a container is constantly throttled. Proper resource allocation and monitoring are key to avoiding excessive throttling and ensuring efficient operation of your Kubernetes workloads.

Uncategorized

I have been using Time Machine to back up my Mac since 2013, and every time I needed to restore a file it worked.
Last month I set up a Raspberry Pi 3 with OpenMediaVault to create a NAS on my network. I have several computers at home that need a backup. The setup for the first MacBook Pro, running Monterey, went fine, but my other Mac, running macOS Sonoma 14.4, did not.

On Sonoma I got the error “The operation can’t be completed because the original item for “<file share>” can’t be found.” Searching the web, I found no single solution to the problem.

I found the video below, which is a compilation of several possible solutions. Solution #3 of 21 solved my problem after I logged off and logged on again.

Laptop

Since June 2022 the Red Hat OpenShift operator index images (redhat/redhat-operator-index) have been served from registry.redhat.io using Quay.io as the backend. OpenShift itself already needs access to the Quay.io registry and CDN hosts, as explained in its installation instructions, so this change required no action from customers at that time.

We are extending this to all Red Hat container images. This allows customers to benefit from the high availability of the Quay.io registry while simplifying the way Red Hat delivers container images and paving the way for future enhancements.

More information

Uncategorized