Replacing Hardware in a Cluster

If you need to replace a node in your LogScale cluster, for whatever reason, you have a number of different options.

Cluster Node Identity

A cluster node is identified in the cluster by its UUID. The UUID is automatically generated the first time a node is started, and stored in $HUMIO_DATA_DIR/cluster_membership.uuid. When moving or replacing a node, you can use this file to ensure a node rejoins the cluster with the same identity.
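
For example, the identity file can be inspected and backed up with ordinary shell tools. A minimal sketch, where the temporary directories merely stand in for your real $HUMIO_DATA_DIR and a backup location, and the UUID value is a made-up placeholder:

```shell
# Illustrative only: the temp dir simulates $HUMIO_DATA_DIR and the UUID
# value is a placeholder for the one your node generated on first start.
HUMIO_DATA_DIR=$(mktemp -d)
echo "example-uuid" > "$HUMIO_DATA_DIR/cluster_membership.uuid"

# Inspect the node's identity:
cat "$HUMIO_DATA_DIR/cluster_membership.uuid"

# Back it up outside the data directory before touching the hardware:
BACKUP_DIR=$(mktemp -d)
cp "$HUMIO_DATA_DIR/cluster_membership.uuid" "$BACKUP_DIR/cluster_membership.uuid.bak"
```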

If the node will continue to run on the same storage, meaning it keeps its data directory, all you need to do is ensure that the node is not a Digest Node before shutting it down:

  1. Navigate to the Cluster nodes page. Find the node that you need to remove from the list, select it and then click the Mark for eviction action at the bottom of the page.

  2. Shut down the LogScale process on the node.

    At this point the node will show as unavailable in the Cluster Management UI.

  3. Replace the hardware components.

  4. Start the LogScale process.

    Your node should rejoin the cluster after a short time, and you will see it become available in the Cluster Management UI.

If the node fails to come back, remove the node from the cluster completely using the Remove node action.

New Storage Target — Slow Recovery

You are moving a node to a different machine, or installing a new disk or SSD.

There are two requirements that must be fulfilled:

  • Check that your cluster has multiple replicas of data (Replication Factor >= 2) and that it is acceptable for the cluster to run with lower replication while the new hardware is being provisioned.

  • Make sure that the node does not contain any data for which it is the sole owner (this can occur if you have archive divergence).

    You can check this in the Cluster Management UI, indicated by red numbers in the Size column.

    If these requirements are met, the cluster can self-heal once the node reappears. The cluster will discover that the node is missing data it was expected to have and will start re-sending it.

  1. Make a copy of the Node UUID file.

    While you won't have to copy all the data on the node, you must make a backup of the Node UUID file.

    It is located in $HUMIO_DATA_DIR/cluster_membership.uuid; you will be copying it to the new data folder on the new storage target.

  2. Make a copy of the global snapshot file, so that you have a backup in case the copy in S3 is corrupt.

    It is located in $HUMIO_DATA_DIR/global-data-snapshot.json; you will be copying it to the new data folder on the new storage target.

  3. Shut down the LogScale process.

  4. Copy the Node UUID file from step 1 (and the global snapshot file from step 2) into the node's new data folder.

  5. Start the LogScale process using the new storage.

    Your node should rejoin the cluster after a short time, and you will see it become available in the Cluster Management UI.

    The other nodes will start re-sending the data that is missing, and the Too Low segment of the replication status in the header will initially be high, but will begin dropping as data is replicated.
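
The file copies in steps 1, 2 and 4 can be sketched as follows. The temporary directories only simulate the old and new data directories so that the sketch is runnable anywhere; the two file names are the ones given above, while the file contents are placeholders:

```shell
# Simulate the old and new data directories (illustrative only):
OLD_DATA=$(mktemp -d)    # stands in for $HUMIO_DATA_DIR on the old storage
NEW_DATA=$(mktemp -d)    # stands in for $HUMIO_DATA_DIR on the new storage
BACKUP=$(mktemp -d)      # somewhere safe to keep the backups

# Simulate the two files the node already has (placeholder contents):
echo "example-uuid" > "$OLD_DATA/cluster_membership.uuid"
echo "{}" > "$OLD_DATA/global-data-snapshot.json"

# Steps 1-2: back up the node identity and the global snapshot
# before shutting the process down.
cp "$OLD_DATA/cluster_membership.uuid" "$BACKUP/"
cp "$OLD_DATA/global-data-snapshot.json" "$BACKUP/"

# Step 4: after shutdown, restore the identity file into the new data
# folder so the node rejoins the cluster with the same UUID.
cp "$BACKUP/cluster_membership.uuid" "$NEW_DATA/"
```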

New Storage Target — Quick Recovery

If you are moving the node to a new storage target and have hard replication requirements, or your cluster is only storing data in one replica, you cannot use the procedure in New Storage Target — Slow Recovery.

To limit the downtime of your node, copy the node's data directory before shutting down the original node. This ensures you only have to copy the most recent data once the node is taken offline.

  1. Use rsync or similar to copy the data directory to the new storage (this includes the UUID File).

  2. Assign another node to any Digest Rules where this node is assigned.

    This can be done using LogScale's Cluster Management UI. You can read more about un-assigning digest rules in the section about removing a node.

  3. Shut down the LogScale process.

  4. Rerun rsync or similar to copy the most recent data to the new storage.

  5. Start the LogScale process. Your node should rejoin the cluster after a short time, and you will see the node become available in the Cluster Management UI.

Storage Malfunctions

If the configured storage malfunctions, and there is no replication or the node held data not found on other nodes, a different solution is required to remove the node. There are two options:

  1. Restore the node from backup if you have that enabled. See Bucket Storage.

  2. Forcibly remove the node from the cluster. Any data that was not stored in multiple replicas will be lost. See Forcing Removal.