Node Identifiers

Hosts in LogScale are tracked via unique integral identifiers assigned to each physical node in the cluster. These identifiers are known as vhosts.

For example, a cluster with 3 physical hosts may identify its hosts as vhosts 1, 2, and 3.

Vhosts are used in a number of places within LogScale when referring to specific hosts. Examples of uses include:

  • Tracking segment replication

  • Deciding host partition assignment for processing log data from the ingest queue

  • Assigning subsets of hosts to run particular tasks

  • Identifying the logging node when analyzing LogScale debug logs

Since vhosts are supposed to identify hosts uniquely, each host must have exactly one, and no two hosts may share one. Because vhosts are repeated in many places, they are kept short, which is why UUIDs are not used directly.

A LogScale node booting for the first time will generate a host UUID, which is written to the cluster_membership.uuid file.

The UUID is associated with a vhost via global, which then uniquely identifies the node.

Both the vhost and UUID are written to global so LogScale can detect if multiple hosts try to use the same vhost.

As long as the UUID doesn't change, the vhost assigned to a host will not change either, unless other hosts are manually configured to use the same vhost.

Cluster administrators can control vhost assignment directly in two ways:

  • Setting
    BOOTSTRAP_HOST_ID=5

    will make the node use vhost 5. This is useful if the administrator can easily enumerate the nodes in the cluster and wishes to assign vhosts to nodes manually.

  • Setting
    BOOTSTRAP_HOST_UUID_COOKIE=xyz

    will make the node use the UUID xyz. This can be useful if the administrator can assign fixed IDs to each node in the cluster, but can't easily generate numeric identifiers for them.

Important

For both options described above, configuration values must be unique across the cluster. Assigning vhost 5 to two hosts, for example, will likely cause crashes until the configuration is corrected.
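A pre-deployment check can catch such duplicates before any node starts. The following is a minimal sketch, assuming the administrator keeps an inventory that maps node names to their configured values; the inventory, the node names, and the function name find_duplicates are hypothetical and not part of LogScale.

```python
from collections import defaultdict


def find_duplicates(assignments):
    """Return {value: [node names]} for every value configured on more than one node."""
    nodes_by_value = defaultdict(list)
    for node, value in assignments.items():
        nodes_by_value[value].append(node)
    return {
        value: sorted(nodes)
        for value, nodes in nodes_by_value.items()
        if len(nodes) > 1
    }


# Hypothetical inventory: node name -> BOOTSTRAP_HOST_ID value.
host_ids = {"node-a": 5, "node-b": 6, "node-c": 5}

duplicates = find_duplicates(host_ids)
if duplicates:
    print("Conflicting values:", duplicates)
```

The same function works for BOOTSTRAP_HOST_UUID_COOKIE values, since it only compares the values for equality.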

Administrators may also opt to let LogScale assign vhosts automatically. The assignment logic has the following properties:

  • A host with an empty disk will get a fresh vhost number, which is unlikely to have been used by other hosts recently.

  • A host that is currently a cluster member will always regain its old vhost number, as long as it still has its cluster_membership.uuid file.

  • A host that is not currently a cluster member but is rejoining can regain its old vhost number, as long as it kept its disk contents.
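The three properties above can be illustrated with a toy model. This is only a sketch under stated assumptions, not LogScale's implementation: the Membership class, its two dictionaries, and the simple counter for fresh numbers are all hypothetical, and the stored_uuid argument stands in for the contents of the cluster_membership.uuid file.

```python
import uuid


class Membership:
    """Toy model of the UUID-to-vhost association (hypothetical, for illustration)."""

    def __init__(self):
        self.registered = {}  # host UUID -> vhost, for current cluster members
        self.remembered = {}  # host UUID -> vhost, for unregistered hosts
        self.next_vhost = 1   # simple counter standing in for "fresh" numbers

    def join(self, stored_uuid=None):
        """Join with the UUID found on disk, or None if the disk is empty."""
        host_uuid = stored_uuid or str(uuid.uuid4())
        if host_uuid in self.registered:
            # Current member with its UUID file: always regains its vhost.
            return host_uuid, self.registered[host_uuid]
        old = self.remembered.pop(host_uuid, None)
        if old is not None and old not in self.registered.values():
            # Rejoining host that kept its disk: regains its old vhost.
            vhost = old
        else:
            # Empty disk or unknown UUID: gets a fresh vhost.
            vhost = self.next_vhost
            self.next_vhost += 1
        self.registered[host_uuid] = vhost
        return host_uuid, vhost

    def unregister(self, host_uuid):
        """Remove a host from the cluster but remember which vhost it had."""
        self.remembered[host_uuid] = self.registered.pop(host_uuid)


cluster = Membership()
first_uuid, first_vhost = cluster.join()     # empty disk, fresh vhost
cluster.unregister(first_uuid)
_, regained = cluster.join(first_uuid)       # kept its disk, same vhost
print(first_vhost, regained)
```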

The mechanism described above works well for clusters where nodes can keep their disks, since the cluster_membership.uuid file is retained over time.

In order to also support running LogScale on systems like Kubernetes, where disks may occasionally be wiped, we have automated some routine cleanup that must happen when nodes join and leave the cluster over time.

Nodes that have ephemeral disks should be configured with USING_EPHEMERAL_DISKS set to true, or use a NODE_ROLES setting that cannot store segments. This will cause LogScale to consider the node ephemeral, and therefore eligible for automatic removal from the cluster if it goes offline for too long.

If an ephemeral node is offline for too long, a periodic task will unregister it from the cluster, and clean up any references to the associated vhost.

The job logs when hosts are removed. These log entries can be found using the query:

class=*DeadEphemeralHostsDeletionJob*

in the humio repository.

The delay before removing hosts can be adjusted via the GracePeriodBeforeDeletingDeadEphemeralHostsMs dynamic configuration in the GraphQL API; it controls how long an ephemeral node is allowed to be offline before some other node might unregister it from the cluster.

Note

Do not reduce the delay specified in GracePeriodBeforeDeletingDeadEphemeralHostsMs below the default setting of 2 hours. Removing a host that holds many segments can be very expensive, so a shorter delay is not recommended unless strictly necessary.
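The selection rule behind this cleanup can be sketched as follows. This is an illustrative model only, assuming each host record carries an ephemeral flag and a last-seen timestamp; the Host dataclass and the function hosts_to_unregister are hypothetical names, and only the 2 hour default comes from the text above.

```python
from dataclasses import dataclass

# Default grace period of 2 hours, expressed in milliseconds.
DEFAULT_GRACE_PERIOD_MS = 2 * 60 * 60 * 1000


@dataclass
class Host:
    vhost: int
    ephemeral: bool
    last_seen_ms: int  # timestamp of the last time the host was seen online


def hosts_to_unregister(hosts, now_ms, grace_period_ms=DEFAULT_GRACE_PERIOD_MS):
    """Return the vhosts of ephemeral hosts that have been offline longer than the grace period."""
    return [
        host.vhost
        for host in hosts
        if host.ephemeral and now_ms - host.last_seen_ms > grace_period_ms
    ]


hour_ms = 60 * 60 * 1000
hosts = [
    Host(vhost=1, ephemeral=True, last_seen_ms=0),            # offline for 3 hours
    Host(vhost=2, ephemeral=False, last_seen_ms=0),           # not ephemeral, never removed
    Host(vhost=3, ephemeral=True, last_seen_ms=2 * hour_ms),  # offline for 1 hour
]
print(hosts_to_unregister(hosts, now_ms=3 * hour_ms))
```

Only the first host qualifies here: it is ephemeral and has been offline for longer than the grace period.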

If a host is unregistered from the cluster but retains its UUID and local disk, it can rejoin the cluster later and reacquire the vhost it had previously.

Since a host may get a new vhost when a disk is wiped, administrators of clusters that have nodes with USING_EPHEMERAL_DISKS set to true must ensure that the storage and digest partitioning tables are kept up to date as hosts join and leave the cluster.

Updating the tables is handled automatically if using the LogScale Kubernetes operator, but for clusters that do not use this operator, cluster administrators should create and run scripts periodically to keep the storage and digest tables up to date.

The cluster GraphQL query can provide updated tables (the suggestedIngestPartitions and suggestedStoragePartitions fields contain these), which can then be applied via the updateIngestPartitionScheme and updateStoragePartitionScheme GraphQL mutations.

In order to ensure a vhost refers uniquely to one host, and to allow unregistered nodes to rejoin easily, vhosts are not reused frequently, even if the previous owner of a vhost is no longer registered in the cluster.

Automatic vhost assignment assigns within the range 1-10000, starting at 1 and using each number only once. Over time this pool may be exhausted as hosts join and leave. When this happens, assignment will again start at 1, and vhosts that are no longer used by registered hosts will be open for reuse. This should make it very unlikely that a given vhost is reused within a short timeframe.
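The wrap-around behaviour can be modelled with a cursor that moves through the range and skips numbers still held by registered hosts. This is a simplified sketch of the behaviour described above, not LogScale's actual code; the VhostAllocator class and its in_use argument are hypothetical.

```python
class VhostAllocator:
    """Hands out vhosts sequentially and wraps around once the range is exhausted."""

    def __init__(self, low=1, high=10000):
        self.low = low
        self.high = high
        self.next_candidate = low  # cursor into the range

    def allocate(self, in_use):
        """Return the next vhost not held by a registered host.

        in_use is the set of vhosts currently held by registered hosts.
        """
        range_size = self.high - self.low + 1
        for _ in range(range_size):
            candidate = self.next_candidate
            # Advance the cursor, wrapping back to the start of the range.
            if candidate == self.high:
                self.next_candidate = self.low
            else:
                self.next_candidate = candidate + 1
            if candidate not in in_use:
                return candidate
        raise RuntimeError("every vhost in the range is held by a registered host")


# Small range to make the wrap-around visible.
allocator = VhostAllocator(low=1, high=3)
in_use = set()
for _ in range(3):
    in_use.add(allocator.allocate(in_use))  # hands out 1, 2, 3
in_use.discard(2)                           # the host holding vhost 2 is unregistered
print(allocator.allocate(in_use))           # range exhausted: wraps and reuses 2
```

Because the cursor only moves forward, a freed vhost is not handed out again until the rest of the range has been passed, which is what keeps reuse within a short timeframe unlikely.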