Health Check API
This feature is in development and as such will continue to change. Please check the documentation and release notes for updates.
The overall health of a LogScale system is determined by a set of individual health checks. For more information about the health checks, including reference documentation for each individual check, see the Health Checks page.
Status API
This is a publicly available endpoint meant for external monitoring systems and load balancers. It returns a status value (OK, WARN, or DOWN) and the current version of the LogScale node. The version is useful when automating deployments, to check that the new version is actually running. The version is also shown in the UI, at the bottom of the front page.
GET /api/v1/status
The endpoint will return HTTP status code 200 if LogScale is running, and code 503 Service Unavailable when the status is DOWN.
The response contains JSON like this:
{
  "status": status,
  "version": version-string
}
Example
$ curl 'http://humio-host:8080/api/v1/status'
{"status":"OK","version":"1.9.0--build-3034--sha-34fd501fe"}
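The deployment check mentioned above can be scripted. The following is a minimal Python sketch, not an official client. The function names (`parse_status`, `is_expected_version`, `fetch_status`) are hypothetical, and the code assumes the response body carries the `status` and `version` fields shown above. Whether a 503 response also carries a JSON body is not stated on this page, so treat that as an assumption.

```python
import json
import urllib.error
import urllib.request
from typing import Tuple


def parse_status(body: str) -> Tuple[str, str]:
    """Extract (status, version) from a /api/v1/status response body."""
    data = json.loads(body)
    return data["status"], data["version"]


def is_expected_version(body: str, expected_version: str) -> bool:
    """Return True if the node is not DOWN and reports the expected version."""
    status, version = parse_status(body)
    return status != "DOWN" and version == expected_version


def fetch_status(base_url: str, timeout: float = 5.0) -> Tuple[int, str]:
    """Call GET /api/v1/status and return (http_code, body).

    urllib raises HTTPError for a 503 response, so that case is caught
    and returned as a normal result instead of an exception.
    """
    url = base_url.rstrip("/") + "/api/v1/status"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        return err.code, err.read().decode("utf-8")
```

A deployment script could call `fetch_status("http://humio-host:8080")` in a retry loop and pass the returned body to `is_expected_version` until it returns True. The host name here is the placeholder used in the example above.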
Health API
This endpoint is intended for manual use with cURL or similar command-line tools. It gives a quick overview of the overall health of the system and of the individual checks. Only root users can access the endpoint.
GET /api/v1/health
Returns HTTP status code "200 OK" when the health state is either OK or WARN, and "503 Service Unavailable" when one or more health checks are in the DOWN state.
Example
$ curl -u user:user-token https://humio-host:8080/api/v1/health
OK - everything is working
Uptime: 5m58s (358s)
Humio version: 1.9.0--build-3034--sha-34fd501fe
Health Check Documentation: https://docs.humio.com/cluster-management/health-checks
Individual checks (DOWN=0 WARN=0 OK=5):
backup-disk-usage [OK]: used=61.17% total=234792128512b free=91180826624b path=/mnt/backup/humio
event-latency-p99 [OK]: p99=10.268s min=0.004s p50=0.939s p95=5.896s max=10.364s size=103
not-alive-count [OK]: not-alive=0 number-of-nodes=1
primary-disk-usage [OK]: used=61.17% total=234792128512b free=91180826624b path=/var/local/humio
secondary-disk-usage [OK - not enabled]:
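For callers that prefer a script over cURL, the same request can be built in Python. This is a sketch under assumptions: `curl -u` sends HTTP basic authentication by default, so the code builds the equivalent `Authorization` header by hand. The helper names `build_health_request` and `is_healthy` are hypothetical and not part of LogScale.

```python
import base64
import urllib.request


def build_health_request(base_url: str, user: str, token: str) -> urllib.request.Request:
    """Build a GET request for /api/v1/health with HTTP basic auth.

    Mirrors `curl -u user:user-token <url>`; the request is only
    constructed here, not sent.
    """
    credentials = base64.b64encode(f"{user}:{token}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(base_url.rstrip("/") + "/api/v1/health")
    request.add_header("Authorization", f"Basic {credentials}")
    return request


def is_healthy(http_code: int) -> bool:
    """Map the documented status codes to a boolean.

    200 means the overall state is OK or WARN; 503 means at least one
    check is DOWN. Any other code is treated as unhealthy.
    """
    return http_code == 200
```

The request can be sent with `urllib.request.urlopen(request)`. Note that `urlopen` raises `urllib.error.HTTPError` for a 503 response, so the DOWN case has to be handled in an `except` branch.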
Health API (JSON)
This API provides the same information as the Health API, but as JSON data. This is meant for machine use, scripting, or automation.
GET /api/v1/health-json
The JSON output is loosely defined as follows:
status := {"status": status,
           "statusMessage": "some status description",
           "uptime": "5m58s (358s)",
           "version": "1.9.0--build-3034--sha-34fd501fe",
           "oks": [check, ...],
           "warnings": [check, ...],
           "downs": [check, ...]
          }
check := {"name": "some name",
          "status": status,
          "statusMessage": "some status description",
          "fields": {"field": "value", ...}
         }
status := "OK" | "WARN" | "DOWN"
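The loose definition above maps naturally onto two small record types. The sketch below is one possible way to model it in Python. The class names `Check` and `Health` are hypothetical, status values are kept as plain strings, and any field other than `status` (and `name` for a check) is treated as optional because the definition is only loosely specified.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Check:
    """One individual health check from the health-json output."""
    name: str
    status: str
    status_message: str
    fields: Dict[str, str] = field(default_factory=dict)

    @classmethod
    def from_dict(cls, raw: dict) -> "Check":
        return cls(
            name=raw["name"],
            status=raw["status"],
            status_message=raw.get("statusMessage", ""),
            fields=dict(raw.get("fields", {})),
        )


@dataclass
class Health:
    """The top-level object: overall status plus checks grouped by state."""
    status: str
    status_message: str
    uptime: str
    version: str
    oks: List[Check]
    warnings: List[Check]
    downs: List[Check]

    @classmethod
    def from_dict(cls, raw: dict) -> "Health":
        def checks(key: str) -> List[Check]:
            # A missing array is treated as empty.
            return [Check.from_dict(item) for item in raw.get(key, [])]

        return cls(
            status=raw["status"],
            status_message=raw.get("statusMessage", ""),
            uptime=raw.get("uptime", ""),
            version=raw.get("version", ""),
            oks=checks("oks"),
            warnings=checks("warnings"),
            downs=checks("downs"),
        )
```

With this in place, `Health.from_dict(json.loads(body))` turns a response body into typed objects after `import json`.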
Example
Here's an example:
$ curl -u user:user-token https://humio-host:8080/api/v1/health-json
Returns:
{
  "downs": [],
  "oks": [
    {
      "fields": {
        "free": "90692845568b",
        "path": "/mnt/backup/humio",
        "total": "234792128512b",
        "used": "61.37%"
      },
      "name": "backup-disk-usage",
      "status": "OK",
      "statusMessage": ""
    },
    {
      "fields": {
        "max": "0.167s",
        "min": "0.001s",
        "p50": "0.075s",
        "p95": "0.126s",
        "p99": "0.153s",
        "size": "1341"
      },
      "name": "event-latency-p99",
      "status": "OK",
      "statusMessage": ""
    },
    {
      "fields": {
        "not-alive": "0",
        "number-of-nodes": "1"
      },
      "name": "not-alive-count",
      "status": "OK",
      "statusMessage": ""
    },
    {
      "fields": {
        "free": "90692845568b",
        "path": "/var/local/humio",
        "total": "234792128512b",
        "used": "61.37%"
      },
      "name": "primary-disk-usage",
      "status": "OK",
      "statusMessage": ""
    },
    {
      "fields": {},
      "name": "secondary-disk-usage",
      "status": "OK",
      "statusMessage": "not enabled"
    }
  ],
  "status": "OK",
  "statusMessage": "everything is working",
  "uptime": "4m19s (259s)",
  "version": "0.0.0-DEV",
  "warnings": []
}
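For scripting or alerting, a response like the one above can be reduced to a short summary. The following Python sketch assumes the `oks`, `warnings`, and `downs` arrays shown in the example. The function name `summarize_health` and the shape of the returned dictionary are illustrative choices, not part of the API.

```python
import json


def summarize_health(body: str) -> dict:
    """Summarize a /api/v1/health-json response body.

    Returns the overall status, the number of checks in each state,
    and the names of checks that are not OK (DOWN first, then WARN).
    """
    data = json.loads(body)
    warnings = data.get("warnings", [])
    downs = data.get("downs", [])
    return {
        "status": data["status"],
        "counts": {
            "OK": len(data.get("oks", [])),
            "WARN": len(warnings),
            "DOWN": len(downs),
        },
        "failing": [check["name"] for check in downs + warnings],
    }
```

An automation job could alert whenever the `failing` list is non-empty, and include the check names in the alert text.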