Troubleshooting Checklist
Note
Troubleshooting workflow: Start at Step 1 (Identify LB Type) and work through each step sequentially. Each step builds on the previous one.
Most issues are resolved by Steps 3-4 (External Connectivity/Traffic Management) or Step 8 (Security Checks).
Step 1: Identify LB Type
First confirm what OCI is actually provisioning from the Service annotations:
kubectl --context oci-primary -n logging-ingress get svc -o wide
kubectl --context oci-primary -n logging-ingress get svc <service-name> -o yaml | rg -n "load-balancer-type|oci-load-balancer"
Classic LB:
oci.oraclecloud.com/load-balancer-type: lb
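The annotation-to-product mapping can be made explicit with a small helper. This is a sketch, not part of the runbook: the nlb value (OCI Network Load Balancer) and the jsonpath extraction are assumptions about your Service annotations.

```shell
# Hypothetical helper: map the oci.oraclecloud.com/load-balancer-type
# annotation value to the OCI product actually provisioned.
# Assumption: "lb" (or no annotation) -> Classic LB, "nlb" -> Network LB.
classify_lb_type() {
  case "$1" in
    nlb)   echo "Network Load Balancer (NLB)" ;;
    lb|"") echo "Classic Load Balancer (LB)" ;;
    *)     echo "Unknown annotation value: $1" ;;
  esac
}

# Usage (service name is a placeholder):
# ann="$(kubectl --context oci-primary -n logging-ingress get svc <service-name> \
#   -o jsonpath='{.metadata.annotations.oci\.oraclecloud\.com/load-balancer-type}')"
# classify_lb_type "$ann"
```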
Step 2: DNS Resolution
dig logscale.<zone_name>
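If the A record looks wrong, comparing answers from two resolvers helps distinguish a stale cache from split-horizon DNS. A minimal sketch; the choice of 8.8.8.8 as the second resolver is an assumption:

```shell
# Order-insensitive compare of two whitespace-separated IP lists.
# A mismatch usually means stale cache or split-horizon DNS.
records_match() {
  a="$(printf '%s\n' $1 | sort | tr '\n' ' ')"
  b="$(printf '%s\n' $2 | sort | tr '\n' ' ')"
  [ "$a" = "$b" ]
}

# local_ans="$(dig +short logscale.<zone_name> A)"
# public_ans="$(dig +short @8.8.8.8 logscale.<zone_name> A)"
# records_match "$local_ans" "$public_ans" || echo "Resolver mismatch: stale or split-horizon DNS"
```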
nslookup logscale.<zone_name>
Step 3: External Connectivity (TCP vs TLS)
IP="$(dig +short logscale.<zone_name> A | head -n1)"
nc -zv "$IP" 443
curl -vk --connect-timeout 8 "https://logscale.<zone_name>/"
If nc fails: check ingress security rules or public routing/DNS.
If nc succeeds but TLS hangs: the LB is accepting TCP, but something breaks on the dataplane path to the backends (or on the return path).
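The two decision branches above can be encoded as a small triage helper; the exact diagnosis strings are illustrative, not canonical:

```shell
# Hypothetical triage helper encoding the TCP-vs-TLS decision table:
# $1 = "y"/"n" for TCP reachability (nc), $2 = "y"/"n" for TLS completion (curl).
triage_lb() {
  if [ "$1" = n ]; then
    echo "Check ingress security rules / public routing / DNS"
  elif [ "$2" = n ]; then
    echo "TCP accepted but TLS fails: suspect the dataplane path to backends"
  else
    echo "Connectivity OK end-to-end"
  fi
}

# Usage: feed it the exit codes of the nc and curl probes, e.g.
# nc -zv "$IP" 443 && tcp=y || tcp=n
# curl -skf --connect-timeout 8 "https://logscale.<zone_name>/" >/dev/null && tls=y || tls=n
# triage_lb "$tcp" "$tls"
```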
MTU/fragmentation sanity check (rules out "SYN works but payload doesn't"). Note that openssl's -mtu flag only applies to DTLS, so use a don't-fragment ping instead:
ping -M do -s 1400 "$IP"
openssl s_client -connect "$IP:443" -servername logscale.<zone_name>
Step 4: OCI Health Check Status (DNS Traffic Management)
oci health-checks http-monitor-result list --monitor-id <monitor-id> --profile <profile>
Step 5: OCI Load Balancer Status
Classic LB
oci lb load-balancer list --compartment-id <compartment-id> --profile <profile>
oci lb backend-health get --backend-set-name <name> --backend-name <name> --load-balancer-id <lb-id> --profile <profile>
Step 6: Kubernetes Components (Ingress + ingress-nginx)
export KUBECONFIG="$(pwd)/kubeconfig-dr.yaml"
kubectl --context oci-primary -n logging-ingress get pods
kubectl --context oci-primary -n logging-ingress get svc -o wide
kubectl --context oci-primary -n logging-ingress describe svc <nginx-service-name>
kubectl --context oci-primary -n logging-ingress get endpoints <nginx-service-name> -o wide
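An empty endpoints list is a frequent reason the LB marks every backend unhealthy, so it is worth flagging explicitly. A sketch, assuming you extract the ready-address IPs with a jsonpath query like the commented one below:

```shell
# Hypothetical check: true if the Service has no ready endpoint addresses.
# $1 is the (possibly empty) whitespace-separated IP list from kubectl.
endpoints_empty() {
  [ -z "$(echo "$1" | tr -d '[:space:]')" ]
}

# ips="$(kubectl --context oci-primary -n logging-ingress get endpoints <nginx-service-name> \
#   -o jsonpath='{.subsets[*].addresses[*].ip}')"
# endpoints_empty "$ips" && echo "No ready endpoints behind <nginx-service-name>"
```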
kubectl --context oci-primary -n logging get ingress
kubectl --context oci-primary -n logging get pods
Step 7: NodePort Sanity (From Inside the VCN)
From a bastion host (or any VM inside the VCN), validate that the worker NodePorts respond:
nc -zv <worker-node-ip> <nodeport>
If the NodePort is unreachable from inside the VCN, fix Kubernetes/kube-proxy/service endpoints first (before blaming the OCI dataplane).
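Checking one node at a time is tedious; a sweep over all workers shows immediately whether a failure is cluster-wide or node-local. This is a sketch: the probe command is injectable (defaulting to nc), and the node-IP extraction in the usage comment is an assumption about your cluster:

```shell
# Probe command used against each node; override PROBE to test the logic.
: "${PROBE:=nc -z -w 3}"

# Sweep every given node IP on one NodePort and report the failures.
sweep_nodeports() {
  port=$1; shift
  fails=""
  for ip in "$@"; do
    $PROBE "$ip" "$port" || fails="$fails $ip"
  done
  [ -z "$fails" ] && echo "all nodes reachable on $port" || echo "unreachable:$fails"
}

# Usage (placeholders; run from a bastion inside the VCN):
# nodes="$(kubectl --context oci-primary get nodes \
#   -o jsonpath='{.items[*].status.addresses[?(@.type=="InternalIP")].address}')"
# sweep_nodeports <nodeport> $nodes
```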
Step 8: Security Checks
Verify your public IP is present in public_lb_cidrs (for inbound 80/443 to the public LB).
Verify worker nodes allow the inbound NodePort range (30000-32767) from:
the LB subnet CIDR (VCN CIDR), and/or
the LB NSG (NSG-to-NSG rules).
If is-preserve-source=true, worker nodes must also allow the original client IP CIDRs on the NodePort range.
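The "is my IP in the allowed CIDR?" question can be answered locally before digging into security lists. A pure-shell sketch; the what's-my-IP service in the usage comment is an assumption:

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  IFS=. read -r a b c d <<EOF
$1
EOF
  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# True if IPv4 address $1 falls inside CIDR $2 (e.g. 10.0.2.0/24).
ip_in_cidr() {
  ip=$(ip_to_int "$1")
  net=$(ip_to_int "${2%/*}")
  bits=${2#*/}
  mask=$(( bits == 0 ? 0 : (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  [ $(( ip & mask )) -eq $(( net & mask )) ]
}

# myip="$(curl -s https://checkip.amazonaws.com)"   # any what's-my-IP service
# ip_in_cidr "$myip" "<public_lb_cidr>" || echo "Your IP is NOT in public_lb_cidrs"
```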
Step 9: Packet Proof: Where Do Packets Stop?
When nc connects but TLS times out, stop guessing and capture evidence.
A) tcpdump on a backend node (during an external curl)
On a backend node (one of the LB backend IPs):
sudo tcpdump -ni any '(tcp port <nodeport>)' -nn -vv
In parallel, from your PC or laptop:
curl -vk --connect-timeout 8 https://logscale.<zone_name>/
Decision points:
No packets hit the node NodePorts: LB is not forwarding, or traffic is blocked before the node (subnet SL/NSG/route or OCI dataplane).
SYNs arrive but no SYN-ACK: node is not responding (kube-proxy/service endpoints issue) or ingress is blocked at node SL/NSG.
Full 3-way handshake but no payload (or no reply): look at conntrack/iptables/kube-proxy and pod-level capture.
B) VCN Flow Logs
Enable Flow Logs on:
LB subnet (10.0.2.0/24 in this design)
Worker subnets (10.0.160.0/20, 10.0.176.0/20, 10.0.192.0/20)
Then reproduce the curl and confirm whether flows show:
Client -> LB VIP:443 (should be ACCEPT)
LB private IP -> nodeIP:NodePort (must be ACCEPT)
If there is no LB -> node flow while the client connects, open an OCI Support Request with:
LB OCID + region + compartment
timestamps of curl attempts
flow log excerpts showing missing/one-way flows
tcpdump showing absence/presence of packets at backends
Bastion Tunnel
# Check if tunnel is running
lsof -i :16443
# Establish tunnel
LOCAL_PORT=16443 ./scripts/setup-bastion-tunnel-v3.sh --workspace primary kubectl