MySphere – Page 2

Published 25/03/2025

How to Fix Error 513 When Unzipping Large Files on macOS

If you’ve ever tried to unzip a large file on macOS and encountered the mysterious “Error 513,” you’re not alone. This pesky error can pop up unexpectedly, leaving you scratching your head and wondering why your file won’t extract properly. Recently, I ran into this issue myself while trying to decompress a massive archive using the built-in Archive Utility on macOS. After some trial and error, I found a reliable solution: the Keka application. Here’s a rundown of what Error 513 is, why it happens, and how Keka saved the day.

What Is Error 513 on macOS?

Error 513 typically occurs when macOS’s default Archive Utility struggles to handle certain zip files—particularly large ones or those with complex structures. The error message might not give you much detail, often just stating that the operation couldn’t be completed. From my experience, it seems to be tied to limitations in how the native tool processes files, especially if they’re compressed in a way that macOS doesn’t fully support or if the file size pushes the utility beyond its comfort zone.

While the exact cause can vary (think file corruption, incompatible compression methods, or even permission issues), the result is the same: you’re stuck with a zip file that won’t budge. For me, it was a multi-gigabyte archive I’d downloaded, and no amount of retrying or rebooting would make Archive Utility cooperate.

The Solution: Keka to the Rescue

After a bit of digging online and some failed attempts with Terminal commands (like using unzip via Homebrew), I stumbled across Keka, a free and lightweight compression tool for macOS. Unlike the built-in Archive Utility, Keka is designed to handle a wider range of file formats and sizes with ease. Here’s how I used it to solve my Error 513 problem—and how you can, too.

Step 1: Download and Install Keka

Head over to the official Keka website (kekadev.com) or grab it from the Mac App Store if you prefer.
Installation is straightforward: just drag the app to your Applications folder, or let the App Store handle it for you.

Step 2: Open Your Problematic Zip File

Launch Keka from your Applications folder.
Drag and drop the zip file causing Error 513 onto the Keka window, or use the “Open” option in the app to locate it manually.

Step 3: Extract the File

Keka will automatically start extracting the file to the same directory as the original zip (you can change the destination if you’d like).
Sit back and let it work its magic. For my large file, Keka churned through it without a hitch—no Error 513 in sight.

Within minutes, I had my files unzipped and ready to use, something macOS’s default tool couldn’t manage despite multiple attempts.

Why Keka Works When Archive Utility Doesn’t

Keka’s strength lies in its versatility and robustness. It supports a variety of compression formats (like 7z, RAR, and more) and seems better equipped to handle edge cases—like oversized zip files—that trip up Archive Utility. Plus, it’s open-source, so it’s constantly being refined by a community of developers who actually care about making it work.

Bonus Tips

Check File Integrity: Before blaming the tool, ensure your zip file isn’t corrupted. You can test it in Keka by right-clicking the file and selecting “Verify” if you suspect an issue.
Permissions: If Keka still struggles, double-check the file’s permissions in Finder (Get Info > Sharing & Permissions) to ensure you have read/write access.
Update Keka: Make sure you’re running the latest version, as updates often fix bugs and improve compatibility.
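
The integrity and permission checks above can also be done from a script before any extraction tool is blamed. The following is a minimal sketch using only Python's standard library; the function name check_archive is illustrative and not part of Keka or macOS. It assumes the archive is a regular zip file and that the destination folder already exists.

```python
import os
import zipfile


def check_archive(zip_path, dest_dir):
    """Return a list of problems found; an empty list means none were detected."""
    problems = []

    # Extraction needs write access to the destination folder.
    if not os.access(dest_dir, os.W_OK):
        problems.append(f"no write permission on {dest_dir}")

    try:
        with zipfile.ZipFile(zip_path) as archive:
            # testzip() reads every member and returns the first one whose
            # CRC or header is bad, or None when everything checks out.
            bad_member = archive.testzip()
            if bad_member is not None:
                problems.append(f"corrupt member: {bad_member}")
    except zipfile.BadZipFile as exc:
        problems.append(f"not a valid zip file: {exc}")

    return problems
```

Calling check_archive("big.zip", "/path/to/destination") with your own paths returns an empty list when neither check finds a problem. Note that testzip() reads the whole archive, so it can take a while on multi-gigabyte files.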

Final Thoughts

Error 513 might be a roadblock when unzipping large files on macOS, but it doesn’t have to be a dealbreaker. For me, switching to Keka was a game-changer—fast, free, and frustration-free. If you’re tired of wrestling with Archive Utility’s limitations, give Keka a shot. It’s a small download that delivers big results, and it’ll likely become your go-to tool for all things compression-related on macOS.

Have you run into Error 513 before? Let me know how you tackled it—or if Keka worked for you too!

MAC

Published 03/03/2025

Python script to organize photos

“I have lots of photo files. Since 2006, when I purchased my first digital camera, the number of photos has grown quickly, and after getting an iPhone, the number of photos exploded.

With the high number of photos, the number of backups grew as well.

I decided to organize all backups and create folders using the format YYYY-MM from the metadata of the photo files.”

Below is the Python script. The script runs on macOS:

import os
import shutil
import datetime
import logging
import tkinter as tk
from tkinter import filedialog
from PIL import Image, ExifTags
import pillow_heif
import piexif
import struct

# Setup logging
logging.basicConfig(level=logging.DEBUG, format="%(asctime)s - %(levelname)s - %(message)s")

ATOM_HEADER_SIZE = 8
EPOCH_ADJUSTER = 2082844800  # Difference between Unix and QuickTime epoch

def get_file_date(file_path):
    try:
        if file_path.lower().endswith(".heic") and pillow_heif.is_supported(file_path):
            heif_file = pillow_heif.open_heif(file_path, convert_hdr_to_8bit=False)
            exif_data = heif_file.info.get("exif")
            if exif_data:
                exif_dict = piexif.load(exif_data)
                date_str = exif_dict["Exif"].get(piexif.ExifIFD.DateTimeOriginal)
                if date_str:
                    return datetime.datetime.strptime(date_str.decode("utf-8"), "%Y:%m:%d %H:%M:%S")
        
        elif file_path.lower().endswith((".jpg", ".jpeg")):
            with Image.open(file_path) as img:
                exif_data = img.getexif()
                if exif_data:
                    exif_dict = {ExifTags.TAGS.get(tag, tag): value for tag, value in exif_data.items()}
                    logging.debug(f"EXIF metadata for {file_path}: {exif_dict}")
                    
                    if "DateTimeOriginal" in exif_dict:
                        date_str = exif_dict["DateTimeOriginal"]
                    elif "DateTime" in exif_dict:
                        date_str = exif_dict["DateTime"]
                    else:
                        date_str = None
                        logging.warning(f"No DateTimeOriginal or DateTime found for {file_path}")
                    
                    if date_str:
                        try:
                            logging.debug(f"Extracted date string from EXIF: {date_str}")
                            return datetime.datetime.strptime(date_str, "%Y:%m:%d %H:%M:%S")
                        except ValueError as ve:
                            logging.error(f"Error parsing date for {file_path}: {ve}")
                    else:
                        logging.warning(f"DateTime metadata missing or unreadable for {file_path}")
                else:
                    logging.warning(f"No EXIF metadata found for {file_path}")
    
    except Exception as e:
        logging.error(f"Error extracting date from {file_path}: {e}")
    
    # If no usable metadata date was found (including for .mov files), use file birth time (creation date on macOS)
    file_stats = os.stat(file_path)
    file_birth_time = file_stats.st_birthtime
    logging.debug(f"Using file birth time for {file_path}: {datetime.datetime.fromtimestamp(file_birth_time)}")
    return datetime.datetime.fromtimestamp(file_birth_time)

def move_files_to_folders(source_folder):
    for filename in os.listdir(source_folder):
        file_path = os.path.join(source_folder, filename)
        if filename.lower().endswith((".jpg", ".jpeg", ".heic", ".mov")):
            date_taken = get_file_date(file_path)
            if date_taken:
                folder_name = date_taken.strftime("%Y-%m")
            else:
                logging.warning(f"Could not determine date for {file_path}, using 'unknown' folder.")
                folder_name = "unknown"
            
            dest_folder = os.path.join(source_folder, folder_name)
            os.makedirs(dest_folder, exist_ok=True)
            
            dest_file_path = os.path.join(dest_folder, filename)
            count = 1
            while os.path.exists(dest_file_path):
                name, ext = os.path.splitext(filename)
                dest_file_path = os.path.join(dest_folder, f"{name}_{count}{ext}")
                count += 1
            
            shutil.move(file_path, dest_file_path)
            logging.info(f"Moved {filename} to {dest_folder}")

if __name__ == "__main__":
    root = tk.Tk()
    root.withdraw()
    folder_selected = filedialog.askdirectory(title="Select the folder containing files")
    if folder_selected:
        move_files_to_folders(folder_selected)
        logging.info("File organization complete.")
    else:
        logging.warning("No folder selected.")
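
The script above defines ATOM_HEADER_SIZE and EPOCH_ADJUSTER and imports struct, but never uses them, so .mov files always fall back to the file birth time. The following is a minimal sketch of how the creation date could be read from the QuickTime movie header instead. It assumes an uncompressed moov atom that contains an mvhd atom; the names get_mov_creation_date and _iter_atoms are hypothetical helpers, not part of the original script. QuickTime stores this value as seconds since 1904-01-01, usually in UTC, and the sketch converts it to a naive local datetime to match the rest of the script.

```python
import datetime
import os
import struct

QUICKTIME_EPOCH_OFFSET = 2082844800  # seconds between 1904-01-01 and 1970-01-01


def _iter_atoms(f, end):
    """Yield (type, payload_start, atom_end) for each atom from the current position up to end."""
    while f.tell() + 8 <= end:
        start = f.tell()
        size, atom_type = struct.unpack(">I4s", f.read(8))
        header = 8
        if size == 1:  # a 64-bit extended size follows the type field
            size = struct.unpack(">Q", f.read(8))[0]
            header = 16
        elif size == 0:  # atom runs to the end of its container
            size = end - start
        if size < header:
            return  # malformed atom, stop scanning
        yield atom_type, start + header, start + size
        f.seek(start + size)


def get_mov_creation_date(path):
    """Return the mvhd creation time as a local datetime, or None if it is absent."""
    with open(path, "rb") as f:
        file_end = os.fstat(f.fileno()).st_size
        for atom_type, payload, atom_end in _iter_atoms(f, file_end):
            if atom_type != b"moov":
                continue
            f.seek(payload)
            for child_type, child_payload, _ in _iter_atoms(f, atom_end):
                if child_type != b"mvhd":
                    continue
                f.seek(child_payload)
                version = f.read(4)[0]  # 1 byte version + 3 bytes flags
                if version == 1:
                    created = struct.unpack(">Q", f.read(8))[0]
                else:
                    created = struct.unpack(">I", f.read(4))[0]
                if created == 0:
                    return None  # some encoders leave the field empty
                return datetime.datetime.fromtimestamp(created - QUICKTIME_EPOCH_OFFSET)
    return None
```

One way to use it would be an extra branch in get_file_date for paths ending in ".mov" that returns this value when it is not None and otherwise falls through to the birth-time fallback.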

Uncategorized

Published 11/02/2025

New Flags in OpenShift 4.17 Must-Gather Tool

The oc adm must-gather tool is essential for troubleshooting and diagnostics in OpenShift. With the release of OpenShift 4.17, new flags have been introduced to enhance flexibility and precision in data collection. These additions enable administrators to gather logs more efficiently while reducing unnecessary data collection.

New Flags in Must-Gather

`--since`

This flag allows users to collect logs newer than a specified duration. For example:

oc adm must-gather --since=24h

This command gathers logs from the past 24 hours, making it easier to pinpoint recent issues.

`--since-time`

The --since-time flag lets users specify an exact timestamp (RFC3339 format) to collect logs from a particular point in time.

oc adm must-gather --since-time=2025-02-10T11:12:39Z

This is useful for investigating incidents that occurred at a specific time.
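
Typing an RFC3339 timestamp by hand is error-prone, so it can help to generate it. The following is a small sketch, assuming you want a UTC timestamp a given number of hours in the past; the helper name since_time_argument is illustrative and not part of the oc client.

```python
import datetime


def since_time_argument(hours_ago, now=None):
    """Return an RFC3339 UTC timestamp (e.g. 2025-02-10T11:12:39Z) for hours_ago hours before now."""
    if now is None:
        now = datetime.datetime.now(datetime.timezone.utc)
    start = now - datetime.timedelta(hours=hours_ago)
    # Normalise to UTC so the trailing "Z" is accurate.
    return start.astimezone(datetime.timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# Print a ready-to-paste command covering the last 3 hours.
print(f"oc adm must-gather --since-time={since_time_argument(3)}")
```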

Existing Flags for Enhanced Customization

Along with the new additions, several existing flags provide more control over the data collection process:

--all-images: Uses the default image for all operators annotated with operators.openshift.io/must-gather-image.
--dest-dir: Specifies a local directory to store gathered data.
--host-network: Runs must-gather pods with hostNetwork: true for capturing host-level data.
--image: Allows specifying a must-gather plugin image to run.
--node-name: Targets a specific node for data collection.
--node-selector: Selects nodes based on a node selector.
--run-namespace: Runs must-gather pods within an existing privileged namespace.
--source-dir: Defines the directory from which data is copied.
--timeout: Sets a time limit for data gathering.
--volume-percentage: Adjusts the maximum storage percentage for gathered data.

Conclusion

The introduction of --since and --since-time in OpenShift 4.17 significantly improves must-gather’s efficiency by enabling targeted log collection. By leveraging these and other available flags, administrators can streamline troubleshooting and optimize diagnostics.

For a deeper dive into must-gather and its latest enhancements, check out the official OpenShift documentation.

openshift

Published 07/02/2025

Configure HAProxy as a load balancer for OpenShift 4.16

I set up an OpenShift 4.16 cluster using UPI on top of VMware. The cluster has 3 Masters, 3 Worker Nodes, and 3 InfraNodes. The infra nodes were necessary to install IBM Storage Fusion.

After the setup, I needed to create a load balancer in front of the OpenShift cluster. There are several options, and one of them is HAProxy.

I installed a RHEL 9 server, added three IP addresses to the network card, and set up HAProxy.

Prerequisites

A system running RHEL 9
Root or sudo privileges
A basic understanding of networking and load balancing

Step 1: Install HAProxy

First, update your system packages:

sudo dnf update -y

Then, install HAProxy using the package manager:

sudo dnf install haproxy -y

Verify the installation:

haproxy -v

Step 2: Configure HAProxy

The main configuration file for HAProxy is located at /etc/haproxy/haproxy.cfg. Open the file in a text editor:

sudo nano /etc/haproxy/haproxy.cfg

The configuration below was used for my cluster. Change the IP addresses to match your environment.

#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
    # to have these messages end up in /var/log/haproxy.log you will
    # need to:
    #
    # 1) configure syslog to accept network log events.  This is done
    #    by adding the '-r' option to the SYSLOGD_OPTIONS in
    #    /etc/sysconfig/syslog
    #
    # 2) configure local2 events to go to the /var/log/haproxy.log
    #   file. A line like the following can be added to
    #   /etc/sysconfig/syslog
    #
    #    local2.*                       /var/log/haproxy.log
    #
    log         127.0.0.1 local2

    chroot      /var/lib/haproxy
    pidfile     /var/run/haproxy.pid
    maxconn     4000
    user        haproxy
    group       haproxy
    daemon

    # turn on stats unix socket
    stats socket /var/lib/haproxy/stats

    # utilize system-wide crypto-policies
    #ssl-default-bind-ciphers PROFILE=SYSTEM
    #ssl-default-server-ciphers PROFILE=SYSTEM

#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
    mode                    tcp
    log                     global
    option                  httplog
    option                  dontlognull
    option http-server-close
    option forwardfor       except 127.0.0.0/8
    option                  redispatch
    retries                 3
    timeout http-request    10s
    timeout queue           1m
    timeout connect         10s
    timeout client          1m
    timeout server          1m
    timeout http-keep-alive 10s
    timeout check           10s
    maxconn                 3000

#---------------------------------------------------------------------
# main frontend which proxys to the backends
#---------------------------------------------------------------------

frontend api
    bind 192.168.252.171:6443
    default_backend controlplaneapi

frontend apiinternal
    bind 192.168.252.171:22623
    bind 192.168.252.171:22624
    default_backend controlplaneapiinternal

frontend secure
    bind 192.168.252.170:443
    default_backend secure

frontend insecure
    bind 192.168.252.170:80
    default_backend insecure

#---------------------------------------------------------------------
# static backend
#---------------------------------------------------------------------

backend controlplaneapi
    balance source
    server master-01  192.168.252.5:6443 check
    server master-02  192.168.252.6:6443 check
    server master-03  192.168.252.7:6443 check


backend controlplaneapiinternal
    balance source
    server master-01  192.168.252.5:22623 check
    server master-02  192.168.252.6:22623 check
    server master-03  192.168.252.7:22623 check
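    # server names must be unique within a backend, so the 22624 entries get their own names
    server master-01-22624  192.168.252.5:22624 check
    server master-02-22624  192.168.252.6:22624 check
    server master-03-22624  192.168.252.7:22624 check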

backend secure
    balance source
    server worker-01  192.168.252.8:443 check
    server worker-02  192.168.252.9:443 check
    server worker-03  192.168.252.10:443 check
    server worker-04  192.168.252.11:443 check
    server worker-05  192.168.252.12:443 check
    server worker-06  192.168.252.13:443 check

backend insecure
    balance roundrobin
    server worker-01  192.168.252.8:80 check
    server worker-02  192.168.252.9:80 check
    server worker-03  192.168.252.10:80 check
    server worker-04  192.168.252.11:80 check
    server worker-05  192.168.252.12:80 check
    server worker-06  192.168.252.13:80 check
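
Once HAProxy has been restarted with this configuration, a quick way to confirm that the frontends are listening is a plain TCP connection test. The following is a minimal sketch using Python's standard library; the FRONTENDS table simply repeats the bind addresses from the configuration above, and the function names are illustrative. It only proves that a port accepts connections, not that the OpenShift API or router behind it is healthy.

```python
import socket

# Bind addresses copied from the frontend sections of the configuration.
FRONTENDS = {
    "api": ("192.168.252.171", 6443),
    "apiinternal": ("192.168.252.171", 22623),
    "secure": ("192.168.252.170", 443),
    "insecure": ("192.168.252.170", 80),
}


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_frontends(frontends):
    """Return a dict mapping each frontend name to True (reachable) or False."""
    return {name: port_is_open(host, port) for name, (host, port) in frontends.items()}
```

Running check_frontends(FRONTENDS) from a machine on the same network returns one True or False per frontend, which makes a misbound IP address or a closed firewall port easy to spot.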

Uncategorized

Published 06/02/2025

How to Install watch on macOS Sequoia

The watch command is a useful utility in Unix-like systems that allows you to execute a command periodically and display its output. However, macOS does not come with watch pre-installed. If you’re running macOS Sequoia and want to use watch, follow the steps below to install it.

Recently I switched to a new MacBook Pro M2 and tried to use the command to watch some OpenShift logs, but the command was not available.

To install it, just use Homebrew:

brew install watch

Using `watch` on macOS

Now that watch is installed, you can start using it. The basic syntax is:

watch -n <seconds> <command>

For example, to monitor the disk usage of your system every two seconds, you can run:

watch -n 2 df -h

Additional Options

-d: Highlights the differences between updates.
-t: Turns off the title/header display.
-b: Beeps if the command exits with a non-zero status.

Alternative: Using a `while` Loop

If you prefer not to install watch, you can achieve similar functionality using a while loop in the terminal:

while true; do <command>; sleep <seconds>; done

For example:

while true; do df -h; sleep 2; done

This method works in any macOS version without requiring additional installations.
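
The same polling idea can be written in Python, which is handy when the output needs further processing. The following is a minimal sketch using only the standard library; the poll function and its count parameter are illustrative and not part of watch.

```python
import subprocess
import time


def poll(command, interval, count=None):
    """Yield the command's standard output every interval seconds.

    Runs forever when count is None, otherwise stops after count runs.
    """
    runs = 0
    while count is None or runs < count:
        result = subprocess.run(command, capture_output=True, text=True)
        yield result.stdout
        runs += 1
        # Skip the final sleep when a fixed number of runs was requested.
        if count is None or runs < count:
            time.sleep(interval)
```

For example, a loop such as "for output in poll(["df", "-h"], 2): print(output)" behaves roughly like watch -n 2 df -h, without clearing the screen between runs.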

Linux MAC

Published 04/02/2025

Manage OpenShift virtual machines with GitOps

Managing virtual machines in an Infrastructure as Code (IaC) environment requires efficiency and reliability. One of the central ideas for this is having a single source of truth (SSoT) in order to ensure consistency in resources, improve automation, and leverage processes such as version control. In this type of secluded environment, we can track and test changes and increase our scalability with ease.

This learning path will showcase how to use Red Hat OpenShift GitOps with a Git repository as a single source of truth for our infrastructure, thereby enhancing automation, consistency, and efficiency for VMs in Red Hat OpenShift Virtualization.

https://developers.redhat.com/learn/manage-openshift-virtual-machines-gitops?sc_cid=RHCTG0250000438530

openshift

Published 04/02/2025

Free Machine Learning Book

Understanding Machine Learning, by Shai Shalev-Shwartz and Shai Ben-David. Published 2014 by Cambridge University Press.

PDF of manuscript posted by permission of Cambridge University Press. Users may download a copy for personal use only.

Not for distribution.

https://www.cs.huji.ac.il/w~shais/UnderstandingMachineLearning/understanding-machine-learning-theory-algorithms.pdf

Machine Learning

Published 28/01/2025

How to remove a worker node from Red Hat OpenShift Container Platform 4 UPI?

When you delete a node using the CLI, the node object is deleted in Kubernetes, but the pods that exist on the node are not deleted. Any bare pods not backed by a replication controller become inaccessible to OpenShift Container Platform. Pods backed by replication controllers are rescheduled to other available nodes. You must delete local manifest pods.

To delete the node from the UPI installation, the node must first be marked unschedulable (cordoned) and then drained prior to deleting it:

$ oc adm cordon <node_name>
$ oc adm drain <node_name> --force --delete-local-data --ignore-daemonsets
- Also ensure that no jobs or cronjobs are currently running or scheduled on this node, because draining does not take them into account.
- For Red Hat OpenShift Container Platform 4.7+, utilize the option `--delete-emptydir-data` in case `--delete-local-data` doesn't work. The `--delete-local-data` option is deprecated in favor of `--delete-emptydir-data`.

$ oc get node <node_name> -o yaml > backupnode.yaml

Before proceeding with deletion of the node, it needs to be under "power off" status:
$ oc delete node <node_name>

Although the node object is now deleted from the cluster, it can still rejoin the cluster after reboot or if the kubelet service is restarted. To permanently delete the node and all its data, you must decommission the node once it is in shutdown mode.
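
The cordon, drain, backup, and delete sequence above can be collected in a small script so the steps always run in the same order. The following is a minimal sketch that wraps the oc commands shown in this post; the function names are illustrative, it uses --delete-emptydir-data as recommended for 4.7+, and it defaults to a dry run that only prints the commands. It assumes oc is installed and already logged in to the cluster.

```python
import subprocess


def removal_commands(node_name):
    """Return the oc commands from this post, in the order they should run."""
    return [
        ["oc", "adm", "cordon", node_name],
        ["oc", "adm", "drain", node_name, "--force",
         "--delete-emptydir-data", "--ignore-daemonsets"],
        ["oc", "get", "node", node_name, "-o", "yaml"],
        ["oc", "delete", "node", node_name],
    ]


def remove_node(node_name, backup_path="backupnode.yaml", dry_run=True):
    """Print the commands (dry run) or execute them, saving the node YAML as a backup."""
    for command in removal_commands(node_name):
        if dry_run:
            print(" ".join(command))
            continue
        # check=True stops the sequence as soon as one step fails.
        result = subprocess.run(command, check=True, capture_output=True, text=True)
        if command[1] == "get":
            with open(backup_path, "w") as backup:
                backup.write(result.stdout)
```

Calling remove_node("<node_name>") with your node's name prints the four commands for review; passing dry_run=False executes them. The node still has to be powered off before the final delete step, as described above.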

Once the node is deleted, it can be powered off. If it needs to rejoin the cluster instead, either restart the kubelet or recreate the node object from the backup YAML:

$ oc create -f backupnode.yaml

The node can also be brought back by restarting the kubelet on the node itself:

$ systemctl restart kubelet

If all data on the worker node must be destroyed, including all installed software, execute the following:

# nohup shred -n 25 -f -z /dev/[HDD]
This command overwrites all data on /dev/[HDD] repeatedly, making it harder for even very expensive hardware probing to recover the data. The -n 25 parameter sets the number of overwrite passes to 25 (change the number to use more or fewer passes), and -z adds a final pass of zeros at the end of the cycle.

One should consider running this command from RescueCD.

In order to monitor the deletion of the node, get the kubelet live logs:

$ oc adm node-logs <node-name> -u kubelet

https://access.redhat.com/solutions/4976801

Uncategorized

Published 24/01/2025

Isolating Infrastructure Nodes

Applying a specific node selector to all infrastructure components will guarantee that they will be scheduled on nodes with that label. See more details on node selectors in placing pods on specific nodes using node selectors, and about node labels in understanding how to update labels on nodes.

Our node label and matching selector for infrastructure components will be node-role.kubernetes.io/infra: "".

To prevent other workloads from also being scheduled on those infrastructure nodes, we need one of two solutions:

Apply a taint to the infrastructure nodes and tolerations to the desired infrastructure workloads.
OR
Apply a completely separate label to your other nodes and matching node selector to your other workloads such that they are mutually exclusive from infrastructure nodes.

TIP: To ensure High Availability (HA) each cluster should have three Infrastructure nodes, ideally across availability zones. See more details about rebooting nodes running critical infrastructure.

TIP: Review the infrastructure node sizing suggestions

By default all nodes except for masters will be labeled with node-role.kubernetes.io/worker: "". We will be adding node-role.kubernetes.io/infra: "" to infrastructure nodes.

However, if you want to remove the existing worker role from your infra nodes, you will need an MCP to ensure that all the nodes upgrade correctly. This is because the worker MCP is responsible for updating and upgrading the nodes, and it finds them by looking for this node-role label. If you remove the label, you must have a MachineConfigPool that can find your infra nodes by the infra node-role label instead. Previously this was not the case and removing the worker label could have caused issues in OCP <= 4.3.

This infra MCP definition below will find all MachineConfigs labeled both “worker” and “infra” and it will apply them to any Machines or Nodes that have the “infra” role label. In this manner, you will ensure that your infra nodes can upgrade without the “worker” role label.

apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfigPool
metadata:
  name: infra
spec:
  machineConfigSelector:
    matchExpressions:
      - {key: machineconfiguration.openshift.io/role, operator: In, values: [worker,infra]}
  nodeSelector:
    matchLabels:
      node-role.kubernetes.io/infra: ""

If you are not using the MachineSet API to manage your nodes, labels and taints are applied manually to each node:

Label it:

oc label node <node-name> node-role.kubernetes.io/infra=
oc label node <node-name> node-role.kubernetes.io=infra

Taint it:

oc adm taint nodes -l node-role.kubernetes.io/infra node-role.kubernetes.io/infra=reserved:NoSchedule node-role.kubernetes.io/infra=reserved:NoExecute
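
With the taints above in place, an infrastructure workload needs both the infra node selector and matching tolerations before it can run on these nodes. The following is a minimal sketch that builds such a placement as JSON; the helper name infra_placement_patch is illustrative, and the nodePlacement layout assumes the target is the default IngressController, which is only one example of an infrastructure component.

```python
import json

INFRA_LABEL = "node-role.kubernetes.io/infra"


def infra_placement_patch(taint_value="reserved"):
    """Build a merge patch with the infra node selector and tolerations for both taints."""
    tolerations = [
        {"key": INFRA_LABEL, "operator": "Equal", "value": taint_value, "effect": effect}
        for effect in ("NoSchedule", "NoExecute")
    ]
    return {
        "spec": {
            "nodePlacement": {
                "nodeSelector": {"matchLabels": {INFRA_LABEL: ""}},
                "tolerations": tolerations,
            }
        }
    }


# Print the JSON so it can be reviewed or passed to oc patch.
print(json.dumps(infra_placement_patch(), indent=2))
```

The printed JSON could then be applied with oc patch ingresscontroller/default -n openshift-ingress-operator --type=merge -p '<json>'. Other components take their node selector and tolerations in different fields, so check each operator's documentation before reusing the structure.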

openshift Uncategorized

Published 23/01/2025

Infrastructure Nodes in OpenShift 4

Infrastructure nodes allow customers to isolate infrastructure workloads for two primary purposes:

to prevent incurring billing costs against subscription counts and
to separate maintenance and management.

This solution is meant to complement the official documentation on creating Infrastructure nodes in OpenShift 4. In addition there is a great OpenShift Commons video describing this whole process: OpenShift Commons: Everything about Infra nodes

To resolve the first problem, all that is needed is a node label added to a particular node, set of nodes, or machines and machineset. Red Hat subscription vCPU counts omit any vCPU reported by a node labeled node-role.kubernetes.io/infra: "" and you will not be charged for these resources from Red Hat. Please see How to confirm infra nodes not included in subscription cost in OpenShift Cluster Manager? to confirm your vCPU reports correctly after applying the configuration changes in this article.

To resolve the second problem we need to schedule infrastructure workloads specifically to infrastructure nodes and also to prevent other workloads from being scheduled on infrastructure nodes. There are two strategies for accomplishing this that we will go into later.

You may ask why infrastructure workloads are different from those workloads running on the control plane. At a minimum, an OpenShift cluster contains 2 worker nodes in addition to 3 control plane nodes. While control plane components critical to the cluster operability are isolated on the masters, there are still some infrastructure workloads that by default run on the worker nodes – the same nodes on which cluster users deploy their applications.

Note: To know the workloads that can be executed in infrastructure nodes, check the “Red Hat OpenShift control plane and infrastructure nodes” section in OpenShift sizing and subscription guide for enterprise Kubernetes.

Planning node changes around any nodes hosting these infrastructure components should not be addressed lightly, and in general should be addressed separately from nodes specifically running normal application workloads.

openshift
