Network Security Configuration

The OCI deployment uses security lists and Network Security Groups (NSGs) to control network access. This section covers the complete network architecture for DR deployments.

VCN Network Architecture

Each DR cluster (primary and secondary) has its own VCN with the following architecture:

(Diagram: VCN network architecture)
Subnet Configuration
| Subnet | CIDR | Type | Route Table | Purpose |
|---|---|---|---|---|
| ${cluster_name}-lb-subnet | 10.0.2.0/24 | Public | public-rt (IGW) | OCI Classic Load Balancer |
| ${cluster_name}-bastion-subnet | 10.0.250.0/24 | Public | bastion-enhanced-rt (IGW) | OCI Bastion Service |
| ${cluster_name}-cluster-endpoint-subnet | 10.0.1.0/28 | Private | private-rt (NAT+SGW) | Kubernetes API endpoint |
| ${cluster_name}-node-pool-ad1 | 10.0.160.0/20 | Private | private-rt (NAT+SGW) | Worker nodes AD1 |
| ${cluster_name}-node-pool-ad2 | 10.0.176.0/20 | Private | private-rt (NAT+SGW) | Worker nodes AD2 |
| ${cluster_name}-node-pool-ad3 | 10.0.192.0/20 | Private | private-rt (NAT+SGW) | Worker nodes AD3 |
| ${cluster_name}-pod-subnet | 10.0.64.0/18 | Private | private-rt (NAT+SGW) | VCN-native pod IPs |
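As a sketch, one of the private node-pool subnets above could be declared as follows (resource and variable names are illustrative, not the module's actual identifiers):

terraform
resource "oci_core_subnet" "node_pool_ad1" {
  compartment_id             = var.compartment_id   # assumed variable
  vcn_id                     = oci_core_vcn.this.id # assumed VCN resource name
  cidr_block                 = "10.0.160.0/20"
  display_name               = "${var.cluster_name}-node-pool-ad1"
  prohibit_public_ip_on_vnic = true                 # makes the subnet private
  route_table_id             = oci_core_route_table.private.id
}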
Route Tables

Public Route Table (${cluster_name}-public-rt):

| Destination | Target | Purpose |
|---|---|---|
| 0.0.0.0/0 | Internet Gateway | Public internet access |

Private Route Table (${cluster_name}-private-rt):

| Destination | Target | Purpose |
|---|---|---|
| 0.0.0.0/0 | NAT Gateway | Outbound internet via NAT |
| All Oracle Services | Service Gateway | OCI Services access |

Bastion Enhanced Route Table (${cluster_name}-bastion-enhanced-rt):

| Destination | Target | Purpose |
|---|---|---|
| 0.0.0.0/0 | Internet Gateway | Bastion service connectivity |
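The private route table above could be expressed in Terraform roughly as follows (gateway resource names and the services data source are assumptions, not the module's actual identifiers):

terraform
resource "oci_core_route_table" "private" {
  compartment_id = var.compartment_id
  vcn_id         = oci_core_vcn.this.id
  display_name   = "${var.cluster_name}-private-rt"

  # Default route: outbound internet via the NAT Gateway
  route_rules {
    destination       = "0.0.0.0/0"
    destination_type  = "CIDR_BLOCK"
    network_entity_id = oci_core_nat_gateway.this.id
  }

  # All Oracle Services via the Service Gateway
  route_rules {
    destination       = data.oci_core_services.all.services[0].cidr_block
    destination_type  = "SERVICE_CIDR_BLOCK"
    network_entity_id = oci_core_service_gateway.this.id
  }
}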
Network Security Groups (NSGs)

Four NSGs control traffic flow between the internet, load balancer, worker nodes, and Kubernetes API.

The most common networking issue is TLS timeouts when accessing the load balancer from the internet. This happens when the Load Balancer NSG has an egress rule to send traffic to worker nodes, but the Worker NSG is missing the corresponding ingress rule to accept that traffic.

Both rules are required because OCI NSG rules are unidirectional.

Quick fix for TLS timeout: Ensure the Worker NSG has an ingress rule from the LB NSG on ports 30000-32767 (see the worker_ingress_from_lb_nsg rule below).

Worker NSG (${cluster_name}-worker-nsg)

Ingress Rules:

| Protocol | Source | Port(s) | Description |
|---|---|---|---|
| ALL | worker-nsg | All | Worker-to-worker communication |
| ALL | bastion-nsg | All | Bastion access to workers |
| TCP | OCI Services | 10250 | OKE control plane to kubelet |
| TCP | OCI Services | 10255 | OKE control plane extended |
| TCP | OCI Services | 12250 | OKE control plane communication |
| TCP | OCI Services | 22 | OKE installation |
| TCP | VCN CIDR | 22, 80, 443, 6443, 10250, 10255 | Internal cluster traffic |
| TCP | VCN CIDR | 30000-32767 | NodePort range for LB health checks |
| ICMP | VCN CIDR | Type 3, Code 4 | Path discovery |

Egress Rules:

| Protocol | Destination | Port(s) | Description |
|---|---|---|---|
| ALL | 0.0.0.0/0 | All | General internet access (via NAT) |
| TCP | OCI Services | 443, 12250 | OKE services communication |
| TCP | 0.0.0.0/0 | 6443 | Kubernetes API |
| UDP | 0.0.0.0/0 | 53 | DNS resolution |
| UDP | 0.0.0.0/0 | 123 | NTP time sync |

Public Load Balancer NSG (${cluster_name}-public-lb-nsg)

Ingress Rules:

| Protocol | Source | Port(s) | Description |
|---|---|---|---|
| TCP | public_lb_cidrs | 443 | HTTPS from allowed CIDRs |
| TCP | public_lb_cidrs | 80 | HTTP from allowed CIDRs |

Egress Rules:

| Protocol | Destination | Port(s) | Description |
|---|---|---|---|
| TCP | worker-nsg | 30000-32767 | LB to worker NodePort services |

Important

NSG Rules Are Unidirectional

OCI NSG rules are unidirectional: an egress rule on NSG-A toward NSG-B does not by itself allow the traffic. NSG-B must also have a corresponding ingress rule from NSG-A.

Required Rule Pair for LB→Worker Traffic:

| NSG | Direction | Source/Dest | Ports | Rule Name |
|---|---|---|---|---|
| public-lb-nsg | EGRESS | worker-nsg | 30000-32767 | public_lb_egress_nodeport |
| worker-nsg | INGRESS | public-lb-nsg | 30000-32767 | worker_ingress_from_lb_nsg |

Without worker_ingress_from_lb_nsg: the TLS handshake times out because the LB can send traffic, but the worker NSG blocks it at the receiver.

Terraform code (modules/oci/core/main.tf):

terraform
resource "oci_core_network_security_group_security_rule" "worker_ingress_from_lb_nsg" {
  network_security_group_id = oci_core_network_security_group.worker.id
  direction                 = "INGRESS"
  protocol                  = "6" # TCP
  source                    = oci_core_network_security_group.public_lb.id
  source_type               = "NETWORK_SECURITY_GROUP"
  description               = "Allow NodePort traffic from public LB NSG"
  tcp_options {
    destination_port_range {
      min = 30000
      max = 32767
    }
  }
}
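For completeness, the matching egress half of the rule pair (public_lb_egress_nodeport) would look roughly like this; it mirrors the ingress rule above with the direction and NSG reference swapped (a sketch, assuming the same resource names):

terraform
resource "oci_core_network_security_group_security_rule" "public_lb_egress_nodeport" {
  network_security_group_id = oci_core_network_security_group.public_lb.id
  direction                 = "EGRESS"
  protocol                  = "6" # TCP
  destination               = oci_core_network_security_group.worker.id
  destination_type          = "NETWORK_SECURITY_GROUP"
  description               = "Allow NodePort traffic to worker NSG"
  tcp_options {
    destination_port_range {
      min = 30000
      max = 32767
    }
  }
}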

API Endpoint NSG (${cluster_name}-api-endpoint-nsg)

Ingress Rules:

| Protocol | Source | Port(s) | Description |
|---|---|---|---|
| TCP | VCN CIDR | 6443, 22 | Internal cluster access |
| TCP | control_plane_allowed_cidrs | 6443 | External K8s API access |

Egress Rules:

| Protocol | Destination | Port(s) | Description |
|---|---|---|---|
| ALL | 0.0.0.0/0 | All | All outbound traffic |

Bastion NSG (${cluster_name}-bastion-nsg)

Ingress Rules:

| Protocol | Source | Port(s) | Description |
|---|---|---|---|
| TCP | bastion_client_allow_list CIDRs | 22 | SSH from allowed IPs |

Egress Rules:

| Protocol | Destination | Port(s) | Description |
|---|---|---|---|
| ALL | 0.0.0.0/0 | All | All outbound traffic |
| TCP | VCN CIDR | 22 | SSH to VCN resources |
| ICMP | VCN CIDR | All | ICMP to VCN |

Load Balancer Access Control (public_lb_cidrs)

The public_lb_cidrs variable controls which IP ranges can access the load balancer over HTTPS (port 443). This is a security measure to restrict access to known IP ranges (e.g., corporate networks, VPNs).

HCL
# Example: Restrict LB access to specific IP ranges
public_lb_cidrs = [
  "10.110.192.0/22",    # Internal network
  "130.41.55.146/32",   # Specific external IP
  "208.127.86.195/32",  # Another external IP
]
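One way such a list typically feeds into NSG rules is a for_each over the CIDRs; this is a sketch (resource names assumed), not the module's actual code:

terraform
resource "oci_core_network_security_group_security_rule" "lb_ingress_https" {
  for_each                  = toset(var.public_lb_cidrs)
  network_security_group_id = oci_core_network_security_group.public_lb.id
  direction                 = "INGRESS"
  protocol                  = "6" # TCP
  source                    = each.value # one rule per allowed CIDR
  source_type               = "CIDR_BLOCK"
  description               = "HTTPS from ${each.value}"
  tcp_options {
    destination_port_range {
      min = 443
      max = 443
    }
  }
}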

Impact on certificate validation:

  • When public_lb_cidrs restricts access, Let's Encrypt HTTP-01 validation fails because Let's Encrypt cannot reach port 80.

  • Solution: use DNS-01 validation instead (see DNS-01 Certificate Issuance in the DR Operations Guide).

Bastion Service Access Control (bastion_client_allow_list)

The bastion_client_allow_list variable controls which IP ranges can connect to the OCI Bastion Service for cluster access. This is separate from the load balancer security list.

Bastion is enabled by default (provision_bastion = true) for maximum security. To disable bastion and use public endpoint access instead, set provision_bastion = false and endpoint_public_access = true.

Required when provision_bastion=true (default). The list cannot include 0.0.0.0/0.

HCL
# Example: Restrict bastion access to specific IPs
bastion_client_allow_list = [
  "203.176.185.196/32",  # Your IP
  "147.161.213.22/32",   # VPN endpoint
]
Kubernetes API Access Control (control_plane_allowed_cidrs)

The control_plane_allowed_cidrs variable controls which IP ranges can access the Kubernetes API (port 6443) via the API endpoint NSG.

Required when endpoint_public_access=true. The list cannot include 0.0.0.0/0.

HCL
# Example: Restrict K8s API access to specific IPs
control_plane_allowed_cidrs = [
  "10.0.0.0/16",         # VCN CIDR (always included)
  "203.176.185.196/32",  # Your IP for direct API access
]
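The "cannot include 0.0.0.0/0" requirement on both allow lists can be enforced at plan time with a Terraform variable validation block. A minimal sketch (the description text is illustrative):

terraform
variable "control_plane_allowed_cidrs" {
  type        = list(string)
  description = "CIDRs allowed to reach the Kubernetes API (port 6443)"

  # Reject the open-internet CIDR at plan time
  validation {
    condition     = !contains(var.control_plane_allowed_cidrs, "0.0.0.0/0")
    error_message = "control_plane_allowed_cidrs must not include 0.0.0.0/0."
  }
}

The same pattern applies to bastion_client_allow_list.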
Kubernetes API Access Modes

OKE clusters support two access modes for the Kubernetes API endpoint:

| Access Mode | provision_bastion | endpoint_public_access | Security Level |
|---|---|---|---|
| Bastion Tunnel | true | false | Higher (no public exposure) |
| Public Endpoint | false | true | Medium (IP allowlist) |

Bastion Tunnel Mode (Recommended for Production):

  • API endpoint is private (VCN only)

  • Access via OCI Bastion Service SSH tunnel

  • Requires bastion_client_allow_list configuration

  • Terraform commands need: -var="kubernetes_api_host=https://127.0.0.1:<port>"

Public Endpoint Mode (Development/Testing):

  • API endpoint is publicly accessible

  • Protected by control_plane_allowed_cidrs IP allowlist

  • No SSH tunnel required

  • Terraform auto-discovers endpoint from kubeconfig
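The kubernetes_api_host distinction between the two modes could be modeled with a variable like this (a sketch; the variable name follows the -var flag shown above, and the empty-default behavior is an assumption):

terraform
variable "kubernetes_api_host" {
  type        = string
  default     = "" # empty: auto-discover the endpoint from kubeconfig (public endpoint mode)
  description = "K8s API host override, e.g. https://127.0.0.1:<port> when tunnelling via the bastion"
}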

Security architecture:

| Access Path | Control Mechanism | Opens to internet? |
|---|---|---|
| Cluster API (kubectl) | OCI Bastion Service + bastion_client_allow_list | No |
| Cluster API (direct) | API Endpoint NSG + control_plane_allowed_cidrs | Optional |
| Load Balancer (HTTPS) | Security List + public_lb_cidrs | No; restricted to specific IPs |
| SSH to nodes | Not exposed in public security list | No |

Note

The OCI Bastion Service manages its own access control. SSH (port 22) is not opened in the public security list; the bastion service handles access directly.

OCI Load Balancer Backend Node Selection (ingress-nginx)

By default, the OCI Cloud Controller Manager (CCM) adds all cluster nodes as backends to a Service of type LoadBalancer. This causes health check and traffic failures on nodes that don't run the target pods (for example, nodes without ingress-nginx).

To restrict backends to only the nodes running ingress-nginx pods, this implementation uses the OCI Classic Flexible Load Balancer with node label selector and externalTrafficPolicy: Local:

yaml
# Applied to ingress-nginx Service via nginx_ingress_sets in main.tf
annotations:
  oci.oraclecloud.com/load-balancer-type: lb
  service.beta.kubernetes.io/oci-load-balancer-subnet1: <lb_subnet_id>
  service.beta.kubernetes.io/oci-load-balancer-security-list-management-mode: "All"
  oci.oraclecloud.com/oci-network-security-groups: <lb_nsg_id>
  service.beta.kubernetes.io/oci-load-balancer-shape: flexible
  service.beta.kubernetes.io/oci-load-balancer-shape-flex-min: "10"
  service.beta.kubernetes.io/oci-load-balancer-shape-flex-max: "100"
  service.beta.kubernetes.io/oci-load-balancer-backend-protocol: TCP
  oci.oraclecloud.com/node-label-selector: k8s-app=logscale-ingress
spec:
  externalTrafficPolicy: Local

Key configuration options:

| Annotation/Setting | Value | Purpose |
|---|---|---|
| oci.oraclecloud.com/load-balancer-type | lb | Use Classic Load Balancer (not NLB) |
| service.beta.kubernetes.io/oci-load-balancer-security-list-management-mode | All | CCM manages security list rules |
| oci.oraclecloud.com/oci-network-security-groups | NSG OCID | Attach LB NSG for ingress rules |
| service.beta.kubernetes.io/oci-load-balancer-shape | flexible | Flexible bandwidth (10-100 Mbps) |
| service.beta.kubernetes.io/oci-load-balancer-backend-protocol | TCP | TLS passthrough to nginx-ingress |
| oci.oraclecloud.com/node-label-selector | k8s-app=logscale-ingress | Only ingress nodes as backends |
| externalTrafficPolicy | Local | Traffic only to nodes with pods |

Configuration location: main.tf (root module)

terraform
nginx_ingress_sets = [
  {
    name  = "controller.service.annotations.oci\\.oraclecloud\\.com/load-balancer-type"
    value = "lb"
  },
  {
    name  = "controller.service.annotations.service\\.beta\\.kubernetes\\.io/oci-load-balancer-subnet1"
    value = module.core.lb_subnet_id
  },
  {
    name  = "controller.service.annotations.service\\.beta\\.kubernetes\\.io/oci-load-balancer-security-list-management-mode"
    value = "All"
  },
  {
    name  = "controller.service.annotations.oci\\.oraclecloud\\.com/oci-network-security-groups"
    value = module.core.lb_nsg_id
  },
  {
    name  = "controller.service.annotations.service\\.beta\\.kubernetes\\.io/oci-load-balancer-shape"
    value = "flexible"
  },
  {
    name  = "controller.service.annotations.service\\.beta\\.kubernetes\\.io/oci-load-balancer-backend-protocol"
    value = "TCP"
  },
  {
    name  = "controller.service.annotations.oci\\.oraclecloud\\.com/node-label-selector"
    value = "k8s-app=logscale-ingress"
  },
  {
    name  = "controller.service.externalTrafficPolicy"
    value = "Local"
  }
]

Troubleshooting load balancer health checks:

| Symptom | Cause | Solution |
|---|---|---|
| Most backends unhealthy / failing health checks | Missing node label selector or wrong externalTrafficPolicy | Add oci.oraclecloud.com/node-label-selector and set externalTrafficPolicy: Local |
| LB created in wrong subnet | Missing subnet annotation | Ensure service.beta.kubernetes.io/oci-load-balancer-subnet1 is set |
| TLS timeout (TCP connects but TLS fails) | LB NSG not attached or Worker NSG missing ingress from LB NSG | Verify oci.oraclecloud.com/oci-network-security-groups and Worker NSG rules |

Verify that the annotation is applied:

shell
kubectl --context oci-primary get svc dr-primary-nginx-ingress-nginx-controller \
  -n logging-ingress -o jsonpath='{.metadata.annotations}' | jq .

Request Flow (Internet → LogScale)

The following diagram illustrates the complete network traffic flow from an external client to LogScale pods, showing each network boundary crossing and the security controls at each layer.

(Diagram: request flow, Internet → LogScale)

Traffic Flow Details:

| Step | Component | Protocol/Port | Security Control | Description |
|---|---|---|---|---|
| 1 | DNS Query | UDP 53 | Public DNS | Client queries OCI DNS for logscale-dr.oci-dr.humio.net |
| 2 | DNS Response | UDP 53 | Steering Policy | Returns healthy cluster IP (primary preferred, TTL 30s) |
| 3 | HTTPS Request | TCP 443 | public_lb_cidrs | Client connects to Load Balancer public IP |
| 4 | IGW Routing | TCP 443 | Route Table | Internet Gateway routes to LB subnet |
| 5 | LB → NodePort | TCP 30000-32767 | Security List egress | LB forwards to nginx-ingress NodePort on worker nodes |
| 6 | NodePort → Pod | TCP (internal) | Worker NSG | Traffic routed to nginx-ingress pod |
| 7 | Ingress Routing | HTTP 8080 | Ingress rules | nginx terminates TLS, routes based on Host header |
| 8 | Service → Pod | TCP 8080 | NetworkPolicy (if configured) | kube-proxy load balances to LogScale pods |
| 9 | Storage Access | HTTPS 443 | IAM Policy | LogScale reads/writes segments to Object Storage |
| 10 | Response | TCP 443 | Stateful | Response follows reverse path to client |

Key Security Boundaries:

  • Internet → VCN: Controlled by public_lb_cidrs in Security List

  • LB → Workers: NodePort range (30000-32767) allowed from VCN CIDR in Worker NSG

  • Pod-to-Pod: Controlled by Kubernetes NetworkPolicy (if configured)

  • Pod → Storage: Controlled by OCI IAM policies for Object Storage access

Detailed LB Traffic Flow with Preserve-Source

When is-preserve-source: true is set on the OCI Classic Load Balancer, the client's original IP address is preserved through the entire path. This requires special security rules.

(Diagram: preserve-source traffic flow)

Source IP Comparison:

| Mode | Source IP at Worker | Use Case |
|---|---|---|
| is-preserve-source=TRUE | Client's original IP (<client-ip>) | Client IP logging, geo-routing |
| is-preserve-source=FALSE | LB's private IP (10.0.2.x) | Simpler security rules |

Key Points: is-preserve-source=TRUE

When is-preserve-source=TRUE is enabled on the OCI LB:

  • Client IP is preserved: Traffic arrives at worker nodes with the ORIGINAL client IP address, not the LB's private IP.

  • Security rules must allow client IPs: Both the worker NSG and security list must have rules allowing the public_lb_cidrs to access the NodePort range (30000-32767).

  • Health checks use LB's IP: Health check traffic still uses the LB's private IP as source.

  • Bidirectional rules required:

    • LB NSG → Worker NSG (egress rules)

    • Worker NSG ← LB NSG (ingress rules)

    • Worker NSG ← public_lb_cidrs (ingress rules for preserve-source)

    • Worker Security List ← public_lb_cidrs (ingress rules for preserve-source)
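The extra preserve-source ingress (Worker NSG accepting the client CIDRs directly) could be generated conditionally, for example as follows (a sketch; var.preserve_source and the resource names are assumptions):

terraform
resource "oci_core_network_security_group_security_rule" "worker_ingress_client_nodeport" {
  # Only create these rules when preserve-source is enabled
  for_each = var.preserve_source ? toset(var.public_lb_cidrs) : toset([])

  network_security_group_id = oci_core_network_security_group.worker.id
  direction                 = "INGRESS"
  protocol                  = "6" # TCP
  source                    = each.value
  source_type               = "CIDR_BLOCK"
  description               = "NodePort traffic from client CIDR ${each.value} (preserve-source)"
  tcp_options {
    destination_port_range {
      min = 30000
      max = 32767
    }
  }
}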

DR Failover Scenario

The diagram below illustrates the network path changes during a DR failover event. When the primary cluster becomes unavailable, the OCI DNS steering policy redirects traffic to the secondary cluster. For the full failover procedure and automation details, see the OCI Disaster Recovery (DR) Operations Guide.

(Diagram: DR failover network path)
Bastion Access Flow (kubectl)

The diagram below shows how kubectl commands reach the private Kubernetes API endpoint through the OCI Bastion Service SSH tunnel. This is the recommended access pattern for production clusters where provision_bastion=true (default).

(Diagram: kubectl commands via the bastion tunnel)
DR Network Considerations

Note

For the full DR failover procedure including DNS steering policy configuration and automated failover via OCI Functions, see the OCI Disaster Recovery (DR) Operations Guide.

For DR failover to work correctly, ensure:

  • Both clusters have identical VCN structures - Same subnet CIDRs relative to VCN

  • Cross-region Object Storage access - IAM policies allow secondary to read primary bucket

  • Health check accessibility - Health check endpoints can reach both cluster load balancers

  • DNS propagation - Low TTL (30s) on steering policy for fast failover
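The cross-region Object Storage requirement could be met with an IAM policy along these lines (a sketch; the dynamic group, compartment, and bucket names are illustrative, not taken from the actual deployment):

terraform
resource "oci_identity_policy" "dr_cross_region_read" {
  compartment_id = var.tenancy_ocid # policies for dynamic groups commonly live at tenancy level
  name           = "dr-secondary-read-primary-bucket"
  description    = "Allow secondary cluster nodes to read the primary LogScale bucket"

  statements = [
    "Allow dynamic-group dr-secondary-nodes to read objects in compartment dr where target.bucket.name = 'logscale-primary-segments'",
  ]
}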