Troubleshooting
Start at Step 1 and work through each step in order. Most issues are resolved by Steps 2 to 4.
Step 1 - DNS Resolution
Check DNS resolution from the client and from within the cluster:
dig <global-logscale-hostname>.<zone> @<route53-nameserver>
nslookup <global-logscale-hostname>.<zone>
Step 2 - External Connectivity
Verify TCP connectivity to the ALB and TLS certificate validity:
# Check TCP connectivity (no TLS verification)
nc -zv <alb-hostname> 443
# Or using bash /dev/tcp (if nc unavailable)
timeout 5 bash -c "</dev/tcp/<alb-hostname>/443" && echo "Connected" || echo "Failed"
# Inspect the TLS certificate and negotiated cipher (-k skips certificate verification)
curl -kv https://<alb-hostname>/ 2>&1 | grep -E "subject=|issuer=|SSL"
Step 3 - ALB Health
Check target health and ALB configuration:
aws elbv2 describe-target-health --target-group-arn <target-group-arn> --region us-west-2
Step 4 - Route53 Health Check Status
Verify Route53 health checks are passing:
aws route53 get-health-check-status --health-check-id <health-check-id> --region us-east-1
Step 5 - Kubernetes Components
Verify pods, services, and ingress configuration:
kubectl get pods -n logging --context <cluster-context>
kubectl get svc -n logging --context <cluster-context>
kubectl get endpoints -n logging --context <cluster-context>
kubectl get ingress -n logging --context <cluster-context> -o yaml
Step 6 - Lambda Not Invoked
Check CloudWatch alarm state:
aws cloudwatch describe-alarms --alarm-names "<secondary-cluster>-dr-failover-primary-unhealthy" --region us-east-1
Check SNS topic subscriptions:
aws sns list-subscriptions-by-topic --topic-arn arn:aws:sns:us-east-1:<account-id>:<secondary-cluster>-dr-failover-sns --region us-east-1
Check Lambda logs:
aws logs tail /aws/lambda/<secondary-cluster>-dr-failover-handler --region us-east-2
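The alarm check in this step can be reduced to a single state value, which indicates which of the remaining checks is worth running. The following is a minimal sketch, not part of the original tooling: `alarm_state` and `explain_alarm_state` are hypothetical helper names, and the suggested next steps assume the alarm publishes to the SNS topic that invokes the Lambda.

```shell
# Hypothetical helpers (not part of the original runbook).

# Print the alarm's StateValue: OK, ALARM, or INSUFFICIENT_DATA.
# The AWS CLI prints "None" in text mode when no alarm matches the name.
alarm_state() {
  aws cloudwatch describe-alarms \
    --alarm-names "$1" \
    --region "${2:-us-east-1}" \
    --query 'MetricAlarms[0].StateValue' \
    --output text
}

# Map a state to a suggested next check.
# Assumption: the alarm publishes to the SNS topic, which invokes the Lambda.
explain_alarm_state() {
  case "$1" in
    ALARM)             echo "alarm fired: check SNS subscriptions, then Lambda logs" ;;
    OK)                echo "alarm has not fired: no invocation expected" ;;
    INSUFFICIENT_DATA) echo "alarm has no data: check the underlying metric" ;;
    None|"")           echo "alarm not found: check the name and region" ;;
    *)                 echo "unexpected state: $1" ;;
  esac
}
```

For example: `explain_alarm_state "$(alarm_state "<secondary-cluster>-dr-failover-primary-unhealthy")"`.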
Step 7 - Operator Not Scaling
Verify EKS access entry:
aws eks list-access-entries --cluster-name <secondary-cluster> --region us-east-2
Check Lambda IAM role permissions:
aws iam get-role-policy --role-name <secondary-cluster>-dr-failover-lambda --policy-name <secondary-cluster>-dr-failover-lambda-access
Verify HumioCluster name:
kubectl get humiocluster -n logging --context <secondary-cluster>
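The access entry list can be long, so it may be easier to test for one principal directly. The sketch below is a rough illustration rather than the documented procedure: `has_access_entry` is a hypothetical helper name, and it assumes the principal of interest is the Lambda's IAM role.

```shell
# Hypothetical helper (not part of the original runbook).
# Succeeds (exit 0) if the given principal ARN has an access entry on the cluster.
# Usage: has_access_entry <cluster-name> <principal-arn> [region]
has_access_entry() {
  # --output text prints the ARNs separated by tabs; put one per line,
  # then look for an exact, whole-line match.
  aws eks list-access-entries \
    --cluster-name "$1" \
    --region "${3:-us-east-2}" \
    --query 'accessEntries' \
    --output text | tr '\t' '\n' | grep -Fxq "$2"
}
```

For example: `has_access_entry <secondary-cluster> arn:aws:iam::<account-id>:role/<secondary-cluster>-dr-failover-lambda || echo "no access entry for the Lambda role"`. The ARN shown assumes the role has no IAM path; adjust it if yours does.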
Step 8 - TLS Certificate Errors
Check if TLS secret exists:
kubectl get secret -n logging --context <secondary-cluster> | grep -v token
Verify CA keypair:
kubectl get secret <cluster-name>-ca-keypair -n logging --context <secondary-cluster> -o yaml
Check cert-manager logs:
kubectl logs -n cert-manager -l app=cert-manager --context <secondary-cluster>
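If the secret exists but TLS still fails, it can help to confirm that the secret carries the expected data keys. This is a minimal sketch, assuming the secret follows the standard `kubernetes.io/tls` layout with `tls.crt` and `tls.key` (the steps above do not state the secret type); `secret_keys` and `missing_tls_keys` are hypothetical helper names.

```shell
# Hypothetical helpers (not part of the original runbook).

# Print the data keys of a secret in the logging namespace, one per line.
# Usage: secret_keys <secret-name> <context>
secret_keys() {
  kubectl get secret "$1" -n logging --context "$2" \
    -o go-template='{{range $k, $v := .data}}{{$k}}{{"\n"}}{{end}}'
}

# Report which of tls.crt / tls.key are absent from the secret.
# Returns 0 if both are present, 1 if any is missing, 2 if the lookup failed.
missing_tls_keys() {
  local keys key rc=0
  keys="$(secret_keys "$1" "$2")" || return 2
  for key in tls.crt tls.key; do
    if ! printf '%s\n' "$keys" | grep -Fxq "$key"; then
      echo "missing: $key"
      rc=1
    fi
  done
  return "$rc"
}
```

For example: `missing_tls_keys <cluster-name>-ca-keypair <secondary-cluster> && echo "both TLS keys present"`.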