Maglev Load Balancing Infrastructure

High-performance, highly available load balancing infrastructure using BGP ECMP, IPVS Maglev hashing, and HAProxy.

Architecture Overview

Network Separation:

  • Control Plane (BGP): Runs on mgmt network (10.1.7.x / 2a0a:7180:81:7::x)
  • Data Plane (Traffic): Flows through ingress0 network (10.1.10.x / 2a0a:7180:81:a::x)

This separation allows BGP sessions to remain stable on the management network while data traffic flows through dedicated high-performance interfaces.

%% Enhanced Maglev / Gateway topology diagram (with .x.y addressing)
graph TD
    %% --- Internet and Router ---
    Internet(["🌐 Internet Traffic"])
    MikroTik(["🛜 Mikrotik Router<br/>int-gw03m<br/>.7.2"])

    Internet --> MikroTik

    %% --- Gateways ---
    subgraph Gateways["💠 Gateway Layer"]
        MGW1(["mgw01<br/>Control: .7.4<br/>Data: .10.4<br/><b>FRR Route Reflector</b>"])
        MGW2(["mgw02<br/>Control: .7.5<br/>Data: .10.5<br/><b>FRR Route Reflector</b>"])
    end

    MikroTik -->|iBGP Active<br/>Control: .7.4| MGW1
    MikroTik -.->|iBGP Standby<br/>Control: .7.5| MGW2

    %% --- Maglevs ---
    subgraph Maglevs["⚙️ Maglev Layer"]
        MAG1(["maglev01<br/>Control: .7.17<br/>Data: .10.17<br/><b>ExaBGP + IPVS</b>"])
        MAG2(["maglev02<br/>Control: .7.18<br/>Data: .10.18<br/><b>ExaBGP + IPVS</b>"])
    end

    MGW1 -->|ECMP 50%<br/>5-tuple hash| MAG1
    MGW1 -->|ECMP 50%<br/>5-tuple hash| MAG2
    MGW2 -->|ECMP 50%| MAG1
    MGW2 -->|ECMP 50%| MAG2

    MGW1 -.->|BFD &lt;1s failover| MAG1
    MGW1 -.->|BFD &lt;1s failover| MAG2
    MGW2 -.->|BFD &lt;1s failover| MAG1
    MGW2 -.->|BFD &lt;1s failover| MAG2

    %% --- IPVS Routing Layer ---
    IPVS{{"⚡ IPVS Routing<br/>Fwmark + mh-hash"}}

    MAG1 --> IPVS
    MAG2 --> IPVS

    %% --- Load Balancers ---
    subgraph MLBs["🧭 L7 Backends"]
        subgraph TraditionalHA["Traditional HAProxy"]
            MLB1(["mlb01<br/>Control: .7.33<br/>Data: .10.33"])
            MLB2(["mlb02<br/>Control: .7.34<br/>Data: .10.34"])
            MLB3(["mlb03<br/>Control: .7.35<br/>Data: .10.35"])
        end
        subgraph K8sHA["Kubernetes + HAProxy Ingress"]
            K8S1(["k0s01<br/>Control: .7.61<br/>Data: .10.61"])
            K8S2(["k0s02<br/>Control: .7.62<br/>Data: .10.62"])
            K8S3(["k0s03<br/>Control: .7.63<br/>Data: .10.63"])
        end
    end

    IPVS -.-> MLB1
    IPVS -.-> MLB2
    IPVS -.-> MLB3
    IPVS -.-> K8S1
    IPVS -.-> K8S2
    IPVS -.-> K8S3

    %% --- Application Servers ---
    subgraph Apps["🧩 Application Layer"]
        APP1(["App Server 1<br/>:80/:443"])
        APP2(["App Server 2<br/>:80/:443"])
        APP3(["App Server N<br/>:80/:443"])
    end

    MLB1 --> APP1
    MLB1 --> APP2
    MLB1 --> APP3
    MLB2 --> APP1
    MLB2 --> APP2
    MLB2 --> APP3
    MLB3 --> APP1
    MLB3 --> APP2
    MLB3 --> APP3

    MLBs -.DSR (Direct Server Return).-> Internet

    %% --- Styling ---
    classDef router fill:#fef7e0,stroke:#d6b600,stroke-width:1px,color:#333;
    classDef gateway fill:#e1f5ff,stroke:#0288d1,stroke-width:1px,color:#000;
    classDef maglev fill:#fff4e1,stroke:#f57f17,stroke-width:1px,color:#000;
    classDef mlb fill:#e8f5e9,stroke:#388e3c,stroke-width:1px,color:#000;
    classDef k8s fill:#e3f2fd,stroke:#1976d2,stroke-width:1px,color:#000;
    classDef app fill:#f3e5f5,stroke:#7b1fa2,stroke-width:1px,color:#000;

    class MikroTik router;
    class MGW1,MGW2 gateway;
    class MAG1,MAG2 maglev;
    class MLB1,MLB2,MLB3 mlb;
    class K8S1,K8S2,K8S3 k8s;
    class APP1,APP2,APP3 app;

VIP (Virtual IP)

  • IPv4: 10.1.7.16/32
  • IPv6: 2a0a:7180:81:7::10/128

Announced by both maglev nodes via BGP, distributed via ECMP at gateway level.

Simplified VIP Management

VIPs are now configured on a dedicated dummy interface (vip) on both maglev and HAProxy nodes.

Benefits:

  • Add new VIPs in seconds - just edit netplan, no service restarts needed
  • FWMark-based IPVS - traffic routed by packet mark, not destination IP
  • Zero-config BGP - ExaBGP auto-discovers VIPs from dummy interface
  • Cleaner separation - VIPs isolated from loopback and other interfaces

Adding a new VIP:

# 1. Add to /etc/netplan/53-vip.yaml on all nodes
# 2. Run: sudo netplan apply
# 3. Done! Traffic flows immediately (for ports 80/443)

No keepalived restart, no ExaBGP restart, no haproxy restart required!
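
A minimal sketch of what /etc/netplan/53-vip.yaml can look like, assuming a netplan release new enough to support dummy-devices; the Ansible-managed file is authoritative and the addresses shown are the ones used in this document:

# /etc/netplan/53-vip.yaml (illustrative sketch, not the deployed file)
network:
  version: 2
  dummy-devices:
    vip:
      addresses:
        - 10.1.7.16/32
        - "2a0a:7180:81:7::10/128"

After sudo netplan apply, ip -o addr show dev vip should list the new address on every node.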

BGP Topology

int-gw03m (10.1.7.2)
    ↓ iBGP (active/standby)
    ├─→ mgw01 (10.1.7.4) - Route Reflector [ACTIVE]
    └─→ mgw02 (10.1.7.5) - Route Reflector [STANDBY]
            ↓ iBGP (RR Client)
            ├─→ maglev01 (10.1.7.17)
            └─→ maglev02 (10.1.7.18)

Current setup:

  • Mikrotik peers with both gateways (mgw01 and mgw02) via BGP
  • Active/Standby routing: Traffic goes to mgw01, fails over to mgw02 if needed
  • Why not ECMP? Current Mikrotik hardware/config limitation (not datacenter-grade)
  • Both gateways act as route reflectors for maglev01/02
  • Blocks inbound routes from upstream (route-map NO-IN)
  • Accepts VIP announcements from maglev nodes
  • Each gateway distributes traffic via ECMP with maximum-paths 2

Potential enhancement (datacenter setup):

  • Mikrotik ECMP: Configure BGP multipath on Mikrotik
  • Distribute traffic 50/50 across both gateways
  • Requires datacenter-grade router with proper ECMP support

Network Plane Separation

The infrastructure separates control plane (BGP) from data plane (traffic forwarding) for improved stability and performance:

Control Plane Network (mgmt)

  • Purpose: BGP sessions, BFD, management access
  • IPv4: 10.1.7.x/24
  • IPv6: 2a0a:7180:81:7::x/64
  • Components: FRR BGP, ExaBGP, BFD, SSH management

Benefits:

  • BGP sessions remain stable regardless of data plane load
  • Management access independent of user traffic
  • Easier troubleshooting and monitoring
  • Security isolation

Data Plane Network (ingress0)

  • Purpose: Actual user traffic forwarding
  • IPv4: 10.1.10.x/24
  • IPv6: 2a0a:7180:81:a::x/64
  • Components: IPVS forwarding, HAProxy traffic

Benefits:

  • Dedicated bandwidth for user traffic
  • No BGP overhead on data interfaces
  • Can use higher-performance NICs/VLANs for data
  • Better QoS and traffic engineering

Next-Hop Rewriting

Gateways use route-maps to rewrite BGP next-hops to the data plane network:

route-map NH-INGRESS permit 10
 set ip next-hop 10.1.10.4              # mgw01 ingress0 IPv4
 set ipv6 next-hop global 2a0a:7180:81:a::4  # mgw01 ingress0 IPv6

Flow:

  1. Maglev nodes announce VIPs via BGP on control plane (10.1.7.17/18)
  2. Gateways receive BGP updates on control plane
  3. Gateways rewrite next-hop to ingress0 addresses (10.1.10.17/18) before sending to upstream
  4. Data traffic flows through ingress0 network
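
To spot-check the rewrite on a gateway, standard FRR show commands can be used; the peer address below is the upstream Mikrotik from this document (10.1.7.2):

# Route as learned from a maglev RR client (control-plane next-hop expected)
sudo vtysh -c "show bgp ipv4 unicast 10.1.7.16/32"

# Route as advertised to the upstream, after the NH-INGRESS route-map is applied
# (next-hop should now be an ingress0 address, e.g. 10.1.10.17)
sudo vtysh -c "show bgp ipv4 unicast neighbors 10.1.7.2 advertised-routes"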

Dynamic VIP Discovery and Announcement

The ExaBGP announcement script (l4/exabgp/announce-active-active.sh) automatically manages VIP announcements:

Key features:

  • Reads VIPs from dummy interface: Discovers all VIPs directly from vip interface
  • Aggregated health per family: Announces ALL IPv4 VIPs if any IPv4 backend is healthy
  • Simpler logic: No per-VIP granularity - family-wide health checks
  • Auto-detected next-hop: Uses ingress0 interface IP addresses automatically
  • Zero-configuration: Add VIPs to netplan without touching the script

How it works:

# Discovers VIPs from vip interface
get_vips_v4() {
    ip -4 -o addr show dev vip | awk '{split($4,a,"/"); print a[1]}'
}

get_vips_v6() {
    ip -6 -o addr show dev vip | awk '{split($4,a,"/"); print a[1]}'
}

# Auto-detects next-hop from ingress0 interface
NEXTHOP_V4=$(ip -4 addr show dev ingress0 | ...)
NEXTHOP_V6=$(ip -6 addr show dev ingress0 | ...)

# Aggregate health check per family
all_ipv4_services_healthy()  # Check ALL IPv4 IPVS services
all_ipv6_services_healthy()  # Check ALL IPv6 IPVS services

# Announce/withdraw based on aggregate health
if all_ipv4_services_healthy; then
    for vip in $(get_vips_v4); do
        announce route ${vip}/32 next-hop ${NEXTHOP_V4}
    done
fi

Benefits:

  • Add new VIPs by editing netplan only - zero config changes
  • Works with fwmark-based IPVS (ports 80/443)
  • Automatically adapts to network configuration
  • Simpler deployment and maintenance

Health logic:

  • Old approach: Per-VIP checks (if VIP X has backends → announce VIP X)
  • New approach: Aggregated per family (if ANY IPv4 service has backends → announce ALL IPv4 VIPs)
  • Works well with fwmark-based IPVS where multiple VIPs share backend pools

Components

Layer 3/4: Gateways (mgw01/mgw02)

Location: gw/

Purpose: BGP route reflector and ECMP distribution

Technology:

  • FRR (Free Range Routing)
  • BGP route reflection
  • BFD (Bidirectional Forwarding Detection)
  • ECMP with L3+4 hashing

Key Features:

  • Routes traffic from upstream (Mikrotik) to maglev cluster
  • Dual gateways for redundancy (currently active/standby at Mikrotik level)
  • Each gateway distributes traffic 50/50 via ECMP to maglev nodes (5-tuple hash)
  • Subsecond failure detection with BFD (< 1s)
  • Route-reflector with outbound policy support for next-hop rewriting

BGP Route Reflector Configuration:

The gateways act as BGP route reflectors with a critical feature enabled:

router bgp 64516
 bgp route-reflector allow-outbound-policy

Why this is required:

By default, FRR route reflectors do NOT apply outbound policies (like route-maps) to reflected routes. This is per BGP standard behavior to preserve routing information integrity. However, we need to rewrite next-hops from control plane to data plane addresses.

Without allow-outbound-policy:

Maglev announces: 10.1.7.16/32 via 10.1.7.17 (control plane)
Gateway reflects: 10.1.7.16/32 via 10.1.7.17 (unchanged!)
Upstream receives: 10.1.7.16/32 via 10.1.7.17 (WRONG - control plane IP)

With allow-outbound-policy:

Maglev announces: 10.1.7.16/32 via 10.1.7.17 (control plane)
Gateway applies NH-INGRESS route-map
Gateway reflects: 10.1.7.16/32 via 10.1.10.17 (data plane)
Upstream receives: 10.1.7.16/32 via 10.1.10.17 (CORRECT - data plane IP)

The NH-INGRESS route-map rewrites next-hops to ingress0 addresses:

route-map NH-INGRESS permit 10
 set ip next-hop 10.1.10.4              # Data plane IPv4
 set ipv6 next-hop global 2a0a:7180:81:a::4  # Data plane IPv6

This ensures traffic flows through the data plane network (ingress0) while BGP control stays on the management network.
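
Putting the two pieces together, the relevant FRR fragment looks roughly like the following sketch; the neighbor address and exact statement placement are illustrative, and the configuration under gw/ is authoritative:

router bgp 64516
 bgp route-reflector allow-outbound-policy
 address-family ipv4 unicast
  neighbor 10.1.7.2 route-map NH-INGRESS out
 exit-address-family
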

Hosts:

Hostname          Control IPv4   Control IPv6        Data IPv4    Data IPv6           Role
mgmt02-vm-mgw01   10.1.7.4       2a0a:7180:81:7::4   10.1.10.4    2a0a:7180:81:a::4   BGP RR + ECMP (Primary)
mgmt02-vm-mgw02   10.1.7.5       2a0a:7180:81:7::5   10.1.10.5    2a0a:7180:81:a::5   BGP RR + ECMP (Secondary)

Layer 4: Maglev Nodes (maglev01/02)

Location: l4/

Purpose: Active-active IPVS load balancing with Maglev consistent hashing

Technology:

  • ExaBGP (BGP route announcements)
  • keepalived (IPVS management with fwmark-based services, no VRRP)
  • nftables (packet marking based on interface + port)
  • IPVS with Maglev (mh) scheduler
  • BFD for fast failure detection
  • Direct Routing (DR) mode

Key Features:

  • Both nodes active simultaneously (no failover)
  • FWMark-based IPVS - traffic routed by packet mark, enabling flexible VIP:port→backend-group mappings
  • Simplified VIP management - add VIPs without keepalived config changes
  • Multiple backend groups - different VIP:port combinations route to different L7 groups (mlb, k0s, etc.)
  • Announces VIPs to gateway via ExaBGP (auto-discovered from vip interface)
  • Health-based route announcement (aggregated per IP family)
  • Maglev hashing with mh-port for per-connection distribution
  • DR mode for optimal performance (no NAT overhead)
  • nftables marks incoming packets based on interface+VIP+port (see CONFIGURATION.md and the sketch below)
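
A minimal sketch of the fwmark plumbing referenced above, using mark 100 for 10.1.7.16:80 as in the traffic-flow example; table/chain names and the mark value are illustrative, the real rules are generated from the fwmark mappings in CONFIGURATION.md, and the IPVS services are managed by keepalived:

# nftables: mark packets arriving on ingress0 for a given VIP:port (illustrative values)
sudo nft add table ip mangle
sudo nft 'add chain ip mangle prerouting { type filter hook prerouting priority -150; }'
sudo nft 'add rule ip mangle prerouting iifname "ingress0" ip daddr 10.1.7.16 tcp dport 80 meta mark set 100'

# IPVS: fwmark-based virtual service with Maglev hashing and DR real servers (mlb group)
sudo ipvsadm -A -f 100 -s mh -b mh-port
sudo ipvsadm -a -f 100 -r 10.1.10.33 -g -w 1
sudo ipvsadm -a -f 100 -r 10.1.10.34 -g -w 1
sudo ipvsadm -a -f 100 -r 10.1.10.35 -g -w 1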

Hosts:

Hostname             Control IPv4   Control IPv6         Data IPv4    Data IPv6            Role
mgmt02-vm-maglev01   10.1.7.17      2a0a:7180:81:7::11   10.1.10.17   2a0a:7180:81:a::11   IPVS + ExaBGP
mgmt02-vm-maglev02   10.1.7.18      2a0a:7180:81:7::12   10.1.10.18   2a0a:7180:81:a::12   IPVS + ExaBGP

Layer 7: L7 Backends

The cluster supports multiple types of L7 backends for different workloads:

dnsdist + PowerDNS (DNS Load Balancer)

Location: l7/dnsdist/

Purpose: L7 DNS load balancing with PowerDNS authoritative servers

Technology:

  • dnsdist (DNS load balancer)
  • PowerDNS Authoritative Server (secondary mode)
  • LightningStream (zone replication via S3)
  • PROXY protocol for client IP preservation
  • Shadow primary (dns00) with PowerAdmin for zone management

Key Features:

  • L7 DNS load balancing with rate limiting
  • PROXY protocol to preserve client source IPs
  • Pool-based backend grouping
  • Active-active deployment (both dnsdist nodes always active)
  • PowerDNS secondary cluster with LightningStream replication
  • Shadow primary (dns00) for zone management via PowerAdmin

Hosts:

Hostname              Control IPv4   Control IPv6         Data IPv4    Data IPv6            Role
mgmt02-vm-dns00       10.1.7.40      2a0a:7180:81:7::28   N/A          N/A                  Shadow Primary (hidden)
mgmt02-vm-dnsdist01   10.1.7.41      2a0a:7180:81:7::29   10.1.10.41   2a0a:7180:81:a::29   dnsdist LB
mgmt02-vm-dnsdist02   10.1.7.42      2a0a:7180:81:7::2a   10.1.10.42   2a0a:7180:81:a::2a   dnsdist LB
mgmt02-vm-dns01       10.1.7.43      2a0a:7180:81:7::2b   10.1.10.43   2a0a:7180:81:a::2b   PowerDNS
mgmt02-vm-dns02       10.1.7.44      2a0a:7180:81:7::2c   10.1.10.44   2a0a:7180:81:a::2c   PowerDNS
mgmt02-vm-dns03       10.1.7.45      2a0a:7180:81:7::2d   10.1.10.45   2a0a:7180:81:a::2d   PowerDNS

Documentation: See l7/dnsdist/README.md for detailed DNS architecture including shadow primary, PowerAdmin, PROXY protocol, and LightningStream replication.

Traditional HAProxy (mlb01/02/03)

Location: l7/haproxy/

Purpose: HTTP/HTTPS load balancing and SSL termination

Technology:

  • HAProxy (TCP mode for L4 forwarding)
  • VIP on dummy interface (DR mode)
  • External health check API on port 9300
  • node_exporter on port 9100 for Prometheus monitoring

Key Features:

  • Receives traffic from IPVS via Direct Routing
  • VIP bound on dummy interface (vip) with ARP suppression
  • Frontends bind directly to VIPs (not the ingress0 interface) - see the sketch below
  • Health check endpoint at /health?frontend= on port 9300
  • Identical haproxy.cfg across all hosts
  • Dual-stack (IPv4 + IPv6)
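
A sketch of what VIP-bound TCP frontends can look like in haproxy.cfg; the frontend/backend names are placeholders, the backend address reuses the mock server from the performance test, and the deployed l7/haproxy/ configuration is authoritative:

frontend fe_http
    mode tcp
    bind 10.1.7.16:80
    bind [2a0a:7180:81:7::10]:80
    default_backend be_app

backend be_app
    mode tcp
    server app1 10.1.7.12:8888 check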

Hosts:

Hostname          Control IPv4   Control IPv6         Data IPv4    Data IPv6            Role
mgmt02-vm-mlb01   10.1.7.33      2a0a:7180:81:7::21   10.1.10.33   2a0a:7180:81:a::21   HAProxy
mgmt02-vm-mlb02   10.1.7.34      2a0a:7180:81:7::22   10.1.10.34   2a0a:7180:81:a::22   HAProxy
mgmt02-vm-mlb03   10.1.7.35      2a0a:7180:81:7::23   10.1.10.35   2a0a:7180:81:a::23   HAProxy

Kubernetes + HAProxy Ingress (k0s01/02/03)

Location: l7/k8s-haproxy/

Purpose: Kubernetes-native HTTP/HTTPS load balancing with HAProxy Ingress Controller

Technology:

  • k0s Kubernetes distribution
  • HAProxy Ingress Controller (DaemonSet with hostNetwork mode)
  • Calico CNI (BIRD mode)
  • VIP on loopback interface (for Calico default IP autodetection - interface=ingress0)
  • IPVS kube-proxy
  • Dual-NIC architecture - control and data plane separation

Key Features:

  • HAProxy runs as Kubernetes Ingress Controller
  • VIP bound on loopback (lo) - allows Calico default IP autodetection to use interface=ingress0
  • hostNetwork mode - HAProxy listens directly on host ports 80/443
  • Dual-NIC setup - control plane (mgmt) for SSH/k8s API, data plane (ingress0) for traffic
  • Ingress resources define routing rules
  • Dual-stack Kubernetes networking (IPv4 + IPv6)
  • Application pods deployed via Kubernetes (e.g., httpbin test app)
  • Firewall rules (nftables) restrict ports to appropriate interfaces

Hosts:

Hostname          Control IPv4   Control IPv6         Data IPv4    Data IPv6            Role
mgmt02-vm-k0s01   10.1.7.61      2a0a:7180:81:7::3d   10.1.10.61   2a0a:7180:81:a::3d   k0s worker
mgmt02-vm-k0s02   10.1.7.62      2a0a:7180:81:7::3e   10.1.10.62   2a0a:7180:81:a::3e   k0s worker
mgmt02-vm-k0s03   10.1.7.63      2a0a:7180:81:7::3f   10.1.10.63   2a0a:7180:81:a::3f   k0s worker

Note: k0s controller (k0s00 - 10.1.7.60) is separate and not used for L7 traffic forwarding.

Traffic Routing:

Different VIP:port combinations can be routed to different L7 backend groups using fwmark mappings:

  • VIP 10.1.7.15:80 → Traditional HAProxy (mlb group)
  • VIP 10.1.7.16:80/443 → Kubernetes HAProxy Ingress (k0s group)
  • VIP [2a0a:7180:81:7::10]:80/443 → Kubernetes HAProxy Ingress (k0s group)

See CONFIGURATION.md for details on fwmark-based routing configuration.

Traffic Flow

Inbound (Client → Application)

1. Client sends request to VIP (e.g., 10.1.7.16:80)
   ↓
2. Mikrotik routes to mgw01 (10.1.7.4)
   ↓
3. mgw01 ECMP selects maglev01 OR maglev02 based on 5-tuple hash
   (src_ip, src_port, dst_ip, dst_port, protocol)
   ↓
4. maglev node:
   - nftables marks packet based on interface+VIP+port (e.g., fwmark 100 for 10.1.7.16:80)
   - IPVS routes by fwmark to appropriate backend group (mlb OR k0s) using Maglev hash + mh-port
   ↓
5. L7 backend receives packet (Direct Routing - no NAT):
   - Traditional HAProxy (mlb01/02/03) - forwards to application backends
   - Kubernetes HAProxy Ingress (k0s01/02/03) - routes via Ingress to pods
   ↓
6. Application responds (e.g., httpbin pod, mock server)

Note: Fwmark-based routing allows different VIP:port combinations to route to different backend groups. Example: 10.1.7.15:80 → mlb group, 10.1.7.16:80/443 → k0s group.

Outbound (Application → Client)

1. Application responds to HAProxy
   ↓
2. HAProxy responds from VIP (10.1.7.16) directly to client
   (Direct Server Return - bypasses maglev and gateway!)
   ↓
3. Client receives response

⚡ DSR: Return traffic doesn't go through maglev or gateway!
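
One way to observe the asymmetry, using standard tcpdump filters with the interface and VIP from this document:

# On a maglev node: only client->VIP packets should appear (return traffic bypasses IPVS)
sudo tcpdump -ni ingress0 'host 10.1.7.16 and tcp port 80'

# On an mlb/k0s backend: both directions of the same flow are visible
sudo tcpdump -ni ingress0 'host 10.1.7.16 and tcp port 80'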

Why DSR?

  • Inbound: 12K req/s × 1KB = 12 MB/s
  • Outbound: 12K req/s × 10KB = 120 MB/s
  • DSR saves 120 MB/s on maglev/gateway!

Key Technologies Explained

BGP ECMP (Equal-Cost Multi-Path)

What: Multiple equal-cost paths to the same destination

Where it's used:

  1. Gateways → Maglev nodes (Active): Each gateway sees VIP announced by both maglev nodes

    10.1.7.16/32 via 10.1.7.17 (maglev01)
    10.1.7.16/32 via 10.1.7.18 (maglev02)
    
  2. Mikrotik → Gateways (Potential): Requires datacenter-grade router

    • Current: Active/Standby (Mikrotik limitation)
    • Potential: ECMP across both gateways with capable upstream router

Distribution: 5-tuple hash (src IP + src port + dst IP + dst port + protocol)

Config: net.ipv4.fib_multipath_hash_policy=1 on gateway

Maglev Consistent Hashing

What: Google's Maglev hashing algorithm for consistent load distribution

Why: Minimal disruption when backends change (consistent hashing property)

How: IPVS mh scheduler with mh-port flag

Key: mh-port includes source port in hash for better distribution

  • Without: All connections from same client → same backend
  • With: Different source ports → distributed across backends

Config: ipvsadm -A -t 10.1.7.16:80 -s mh -b mh-port

BFD (Bidirectional Forwarding Detection)

What: Fast failure detection protocol

Why: BGP takes 90-180s to detect failure, BFD takes < 1s

How: Lightweight packets every 300ms, 3 missed = failure

Performance:

  • Detection time: < 1 second
  • Bandwidth: ~3 Kbps (negligible)
  • CPU: < 0.3%

Result: During node failure, only 85 errors out of 1.4M requests (0.0058%)
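
A sketch of the FRR BFD settings these numbers correspond to (300 ms intervals, detect multiplier 3), shown for one maglev control-plane peer; the actual configuration lives under gw/ and l4/bfd/:

bfd
 peer 10.1.7.17
  receive-interval 300
  transmit-interval 300
  detect-multiplier 3
  no shutdown
!
router bgp 64516
 neighbor 10.1.7.17 bfd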

Direct Routing (DR) Mode

What: Real servers respond directly to client, bypassing load balancer

Why:

  • Eliminates return path bottleneck
  • Load balancer only handles inbound (typically 10% of traffic)
  • Real servers handle outbound (typically 90% of traffic)

Requirements:

  • VIP configured locally with a host prefix (/32 or /128) on the dummy vip interface or loopback lo
  • ARP suppression (arp_ignore=1, arp_announce=2)
  • Real server must respond from VIP
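
On a traditional HAProxy node these requirements boil down to roughly the following commands (Ansible applies the persistent equivalents; addresses are the ones from this document):

# ARP suppression so backends do not answer ARP for the VIP
sudo sysctl -w net.ipv4.conf.all.arp_ignore=1
sudo sysctl -w net.ipv4.conf.all.arp_announce=2

# Hold the VIP locally on the dummy interface so the server can reply from it
sudo ip link add vip type dummy 2>/dev/null || true
sudo ip addr add 10.1.7.16/32 dev vip
sudo ip addr add 2a0a:7180:81:7::10/128 dev vip
sudo ip link set vip up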

Stateless Operation

What: The cluster operates without connection tracking at both gateway and maglev layers

Why:

  • Performance: No state table overhead, pure packet forwarding
  • Scalability: No memory limits from connection tracking tables
  • Reliability: No state synchronization issues across ECMP paths
  • Active-Active: Both maglev nodes operate independently without shared state
  • Simplicity: Easier to reason about and debug

Implementation:

  • IPVS conntrack disabled via sysctls: net.ipv4.vs.conntrack=0 and net.ipv6.vs.conntrack=0
  • No state maintained across packets from the same connection
  • Connection tracking modules (nf_conntrack) may still be loaded as kernel dependencies, but tracking is disabled

Trade-off: Features requiring connection state (NAT helpers, connection counting) are unavailable, but not needed for this architecture with Direct Server Return.

Verification:

# Check IPVS conntrack disabled (maglev nodes)
sysctl net.ipv4.vs.conntrack net.ipv6.vs.conntrack
# Should return: net.ipv4.vs.conntrack = 0
#                net.ipv6.vs.conntrack = 0

# Check no connections are being tracked
sudo conntrack -L 2>/dev/null | wc -l
# Should return: 0

Performance Characteristics

Measured Performance

Test: wrk with 50 threads, 100 connections, 120 seconds

Application backend: Mock HTTP server (10.1.7.12:8888) returning minimal responses. Production application backend with scaled app servers is planned for future testing.

Results:

  • Requests/sec: 12,146
  • Total requests: 1,458,849
  • Avg latency: 9.11ms
  • Transfer rate: 1.83 MB/s

Failover test (maglev02 shutdown during test):

  • Failed requests: 85 out of 1,458,849
  • Error rate: 0.0058%
  • Success rate: 99.9942%
  • Failover time: < 1 second (BFD detected)

Note: Performance numbers reflect infrastructure overhead only. Actual application performance will depend on backend application servers.

Capacity

Current configuration:

Layer   Component                   Count   Utilization
L3/4    Gateway                     2       Active/Standby
L4      Maglev                      2       Active-Active ECMP
L7      Traditional HAProxy (mlb)   3       Active-Active
L7      Kubernetes HAProxy (k0s)    3       Active-Active

Redundancy:

  • Dual gateways (mgw01 + mgw02) - BGP failover (currently not load balanced)
  • Dual maglev nodes - Active-active ECMP
  • Multiple L7 backends - Auto-failover per backend group
  • Flexible routing - Different VIP:port combinations route to different backend groups

Performance: Tested at 12K req/s with 99.9942% availability during node failure

Scale out:

  • Add more L7 backends to existing groups → keepalived auto-adds to IPVS pool
  • Add new L7 backend groups → configure via fwmark mappings
  • Add more maglev nodes → update gateway maximum-paths
  • Add third gateway → Additional redundancy (requires ECMP-capable upstream router)
  • Datacenter enhancement: Use router with BGP ECMP to fully utilize both gateways

Quick Start

Prerequisites

All nodes need:

  • Ubuntu 22.04+ or Debian 12+
  • Network connectivity
  • Passwordless sudo access
  • Python 3.x

Deployment Method

Recommended: Ansible Automation

The entire cluster can be deployed using Ansible automation:

cd ansible/

# Install dependencies
just sync

# Verify connectivity
just ping

# Deploy entire cluster (dry run first)
just dry-run

# Deploy for real
just deploy

📖 See ansible/README.md for detailed deployment instructions

Manual Setup (Alternative)

For manual component-by-component setup or troubleshooting:

1. Gateway Setup

cd gw/
# Follow README.md for:
# - FRR installation
# - BGP configuration
# - ECMP sysctl settings
# - BFD setup (optional but recommended)

📖 See gw/README.md for detailed instructions

2. Maglev Node Setup

cd l4/

# keepalived (IPVS management)
cd keepalived/
# Follow README.md

# ExaBGP (BGP announcements)
cd ../exabgp/
# Follow README.md

# BFD (optional, for fast failover)
cd ../bfd/
# Follow README.md

📖 See l4/keepalived/README.md and l4/exabgp/README.md

3. L7 Backend Setup

Choose one or both L7 backend types:

Option A: Traditional HAProxy
cd l7/haproxy/
# Follow README.md for:
# - HAProxy installation
# - VIPs on dummy vip interface
# - ARP suppression
# - Network configuration

📖 See l7/haproxy/README.md for detailed instructions

Option B: Kubernetes + HAProxy Ingress
cd l7/k8s-haproxy/
# Follow README.md for:
# - k0s cluster deployment
# - HAProxy Ingress Controller installation
# - VIPs on loopback (for Calico default IP autodetection)
# - Test application deployment

📖 See l7/k8s-haproxy/README.md for detailed instructions

Note: You can run both backend types simultaneously! Use fwmark mappings to route different VIP:port combinations to different backend groups. See CONFIGURATION.md.

Operational Procedures

Planned Maintenance

Taking Down a Maglev Node (Zero Downtime)

Example: Maintenance on maglev02

# Step 1: Stop BGP announcements (gateway stops sending traffic)
ssh maglev02 "sudo systemctl stop exabgp"

# Step 2: Wait for BFD detection and BGP convergence
sleep 2

# Step 3: Verify traffic shifted to maglev01
ssh mgw01 "sudo vtysh -c 'show ip route 10.1.7.16'"
# Should only show: nexthop via 10.1.7.17

# Step 4: Verify no active IPVS connections on maglev02
ssh maglev02 "sudo ipvsadm -Ln --stats"
# ActiveConn should be 0 for all backends

# Step 5: Stop keepalived (safe now, no traffic)
ssh maglev02 "sudo systemctl stop keepalived"

# Step 6: Perform maintenance
ssh maglev02 "sudo apt update && sudo apt upgrade -y"

# Step 7: Restart services
ssh maglev02 "sudo systemctl start keepalived"
sleep 2
ssh maglev02 "sudo systemctl start exabgp"

# Step 8: Verify back in service
ssh mgw01 "sudo vtysh -c 'show ip route 10.1.7.16'"
# Should show both: via 10.1.7.17 and via 10.1.7.18

# Step 9: Verify ECMP is working
ssh mgw01 "ip route show 10.1.7.16"
# Should show: nexthop via 10.1.7.17 dev eth0 weight 1
#              nexthop via 10.1.7.18 dev eth0 weight 1

Expected impact: Zero errors (graceful drain)

Taking Down a HAProxy Backend (Automatic)

Using hactl (recommended):

For managing HAProxy backends across all nodes in an LB group, use the hactl tool:

# Navigate to hactl directory
cd l7/haproxy/hactl

# List all backends and server status
./hactl list

# Put a server in maintenance mode (graceful)
./hactl maint lb_name/server_name

# Bring server back up
./hactl enable lb_name/server_name

See l7/haproxy/hactl/README.md for full documentation including:

  • Multi-node management (controls all HAProxy nodes simultaneously)
  • Automatic inconsistency detection
  • Health check visibility
  • Clickable stats URLs

Manual Example: Maintenance on mlb02

# Step 1: Stop HAProxy (keepalived will auto-detect)
ssh mlb02 "sudo systemctl stop haproxy"

# Step 2: Verify removed from IPVS pool (both maglev nodes)
ssh maglev01 "sudo ipvsadm -Ln | grep 10.1.10.34"
ssh maglev02 "sudo ipvsadm -Ln | grep 10.1.10.34"
# Should show nothing (backend removed)

# Step 3: Perform maintenance
ssh mlb02 "sudo apt update && sudo apt upgrade -y"

# Step 4: Start HAProxy
ssh mlb02 "sudo systemctl start haproxy"

# Step 5: Verify re-added to IPVS pool
ssh maglev01 "sudo ipvsadm -Ln | grep 10.1.10.34"
# Should show: -> 10.1.10.34:80 Route Weight 1

Expected impact: Minimal (automatic failover to other backends)

Health check frequency: 6 seconds (keepalived)
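
For reference, the keepalived shape behind this automatic behavior looks roughly like the sketch below; the fwmark, real-server address, and check details are illustrative, and l4/keepalived/ is authoritative:

virtual_server fwmark 100 {
    delay_loop 6            # matches the 6-second health check interval
    lb_algo mh              # Maglev hashing
    lb_kind DR              # Direct Routing to the backends
    real_server 10.1.10.34 80 {
        weight 1
        HTTP_GET {
            url {
                path /haproxy-health
                status_code 200
            }
            connect_timeout 3
        }
    }
}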

Taking Down a Gateway (Zero Downtime)

Current setup: Dual gateways (mgw01 and mgw02) - Active/Standby failover

Example: Maintenance on mgw02 (standby)

# Step 1: Verify mgw02 is standby
ssh mikrotik "/routing/bgp/session print where remote-address=10.1.7.5"
# Should show: Established

ssh mikrotik "/ip/route print where dst-address=10.1.7.16/32"
# Should show traffic going via 10.1.7.4 (mgw01 is active)

# Step 2: Stop BGP on mgw02 (no traffic impact - it's standby)
ssh mgw02 "sudo systemctl stop frr"

# Step 3: Perform maintenance on mgw02
ssh mgw02 "sudo apt update && sudo apt upgrade -y"

# Step 4: Restart FRR on mgw02
ssh mgw02 "sudo systemctl start frr"

# Step 5: Verify mgw02 back in service
ssh mikrotik "/routing/bgp/session print where remote-address=10.1.7.5"
# Should show: Established

Expected impact: Zero (mgw02 is standby, no active traffic)

Maintenance on mgw01 (active):

  • Traffic will automatically failover to mgw02 via BGP
  • Failover time: 90-180s (BGP hold time) or < 1s with BFD
  • During failover, traffic continues with minimal impact

Unplanned Failure (Automatic)

Maglev Node Failure

Detection: BFD (< 1s) or BGP (90-180s if no BFD)

Action: Automatic

1. BFD detects failure (300-900ms)
2. Gateway tears down BGP session
3. Gateway removes from ECMP pool
4. All traffic goes to remaining maglev node

Impact: 0.0058% errors (85 out of 1.4M requests in test)

HAProxy Backend Failure

Detection: keepalived health check (6 seconds)

Action: Automatic

1. keepalived HTTP_GET to /haproxy-health fails
2. keepalived removes from IPVS pool on both maglev nodes
3. Traffic distributed to remaining backends

Impact: Minimal (automatic failover within 6-18 seconds)

Gateway Failure

Current setup: Dual gateways provide automatic failover (active/standby)

Detection: BGP (90-180s hold time), faster with BFD if configured on Mikrotik

Action: Automatic

1. mgw01 (active) failure detected by BGP
2. Mikrotik removes failed gateway from routing table
3. Mikrotik routes all traffic via mgw02
4. Traffic continues with no manual intervention

Impact:

  • With BFD on Mikrotik: < 1 second failover
  • Without BFD (BGP only): 90-180 second failover
  • Traffic continues via remaining gateway once failover complete

Note: Currently mgw02 is standby, so its failure has no traffic impact.

Monitoring

Health Checks

Gateway (mgw01/mgw02):

# BGP status
sudo vtysh -c "show bgp summary"
# Should show: 3 neighbors Established (upstream + 2 maglev nodes)

# BFD status
sudo vtysh -c "show bfd peers"
# Should show: 2 peers up (maglev01 and maglev02)

# ECMP routes
ip route show 10.1.7.16
# Should show: 2 nexthops (to maglev nodes)

Maglev nodes:

# ExaBGP status
sudo systemctl status exabgp

# IPVS status
sudo ipvsadm -Ln
# Should show the fwmark virtual services with their backends (e.g., 10.1.10.33/34/45 for the mlb group: 10.1.10.33/34/35)

# keepalived status
sudo systemctl status keepalived

HAProxy backends:

# HAProxy status
sudo systemctl status haproxy

# Health endpoint
curl http://10.1.7.33/haproxy-health
# Should return: 200 OK

Automated Diagnostics with Horizon

Horizon is a comprehensive diagnostic tool for automated cluster health validation using testinfra.

Location: horizon/

Purpose: Automated testing and validation of all cluster components

Key Features:

  • Automatic prerequisite validation - SSH and sudo checks before running tests
  • SSH-based remote testing of all nodes
  • Pretty console output with checkmarks and warnings
  • Per-component test organization
  • Comprehensive coverage of critical cluster components
  • MTU validation
  • Interface error statistics
  • Connection tracking verification (stateless operation)
  • Cluster validation mode (no SSH needed)

Quick Start:

cd horizon
uv sync
horizon                        # Run all diagnostics (auto-checks prerequisites)
horizon --host maglev          # Check only maglev nodes
horizon --cluster-only         # Fast cluster validation (no SSH)
horizon --force                # Skip prerequisite checks

Example Output:

╭────────────────────────────╮
│ Maglev Cluster Diagnostics │
╰────────────────────────────╯

Maglev Nodes
  maglev01
    ✓ ExaBGP Running
    ✓ Keepalived Running
    ✓ Ingress0 Interface Up
    ✓ Ingress0 MTU
    ✓ Ingress0 No Errors
    ✓ VIP Interface Exists
    ✓ IPVS Configuration
    ✓ IPVS Backends Healthy
    ✓ Connection Tracking Disabled

      Summary
╭──────────┬───────╮
│ Status   │ Count │
├──────────┼───────┤
│ ✓ Passed │    15 │
╰──────────┴───────╯

✓ All checks passed!

Documentation: See horizon/README.md for detailed usage and test coverage.

Monitoring Endpoints

HAProxy stats UI (per host):

Metrics to monitor:

Metric                    Command                        Alert Threshold
BGP neighbors (gateway)   vtysh -c "show bgp summary"    < 3 neighbors
BFD peers (gateway)       vtysh -c "show bfd peers"      Any peer down
ECMP paths (gateway)      ip route show 10.1.7.16        < 2 nexthops
IPVS backends (maglev)    ipvsadm -Ln                    < 3 backends
HAProxy health            curl /haproxy-health           != 200

Monitoring Script Example

Create /usr/local/bin/check-lb-health.sh:

#!/bin/bash
# Simple health check for load balancing infrastructure

ERRORS=0

# Check on gateway (mgw01/mgw02)
if hostname | grep -q mgw; then
    # Check BGP neighbors (should be 3: upstream + 2 maglev nodes)
    NEIGHBORS=$(sudo vtysh -c "show bgp summary" | grep -c Established)
    if [ "$NEIGHBORS" -lt 3 ]; then
        echo "ERROR: Only $NEIGHBORS BGP neighbors established (expected 3)"
        ERRORS=$((ERRORS + 1))
    fi

    # Check BFD peers
    BFD_UP=$(sudo vtysh -c "show bfd peers brief" | grep -c " up ")
    if [ "$BFD_UP" -lt 2 ]; then
        echo "ERROR: Only $BFD_UP BFD peers up (expected 2)"
        ERRORS=$((ERRORS + 1))
    fi

    # Check ECMP
    NEXTHOPS=$(ip route show 10.1.7.16 | grep -c "nexthop via")
    if [ "$NEXTHOPS" -lt 2 ]; then
        echo "ERROR: Only $NEXTHOPS ECMP nexthops (expected 2)"
        ERRORS=$((ERRORS + 1))
    fi
fi

# Check on maglev nodes
if hostname | grep -q maglev; then
    # Check ExaBGP
    if ! systemctl is-active --quiet exabgp; then
        echo "ERROR: ExaBGP not running"
        ERRORS=$((ERRORS + 1))
    fi

    # Check keepalived
    if ! systemctl is-active --quiet keepalived; then
        echo "ERROR: keepalived not running"
        ERRORS=$((ERRORS + 1))
    fi

    # Check IPVS backends
    BACKENDS=$(sudo ipvsadm -Ln | grep -c "Route")
    if [ "$BACKENDS" -lt 3 ]; then
        echo "WARNING: Only $BACKENDS IPVS backends (expected 3)"
    fi
fi

# Check on HAProxy backends
if hostname | grep -q mlb; then
    # Check HAProxy
    if ! systemctl is-active --quiet haproxy; then
        echo "ERROR: HAProxy not running"
        ERRORS=$((ERRORS + 1))
    fi

    # Check health endpoint
    MYIP=$(hostname -I | awk '{print $1}')
    if ! curl -sf http://${MYIP}/haproxy-health > /dev/null; then
        echo "ERROR: HAProxy health check failed"
        ERRORS=$((ERRORS + 1))
    fi
fi

if [ $ERRORS -eq 0 ]; then
    echo "OK: All health checks passed"
    exit 0
else
    echo "CRITICAL: $ERRORS errors found"
    exit 2
fi

Install:

sudo chmod +x /usr/local/bin/check-lb-health.sh

# Run from cron
echo "*/5 * * * * /usr/local/bin/check-lb-health.sh" | sudo tee -a /etc/crontab

Troubleshooting

Traffic Not Reaching VIP

# On gateway
sudo vtysh -c "show ip route 10.1.7.16"
# Should show routes to maglev nodes

# Check BGP
sudo vtysh -c "show bgp ipv4 unicast 10.1.7.16/32"
# Should show paths from both maglev nodes

# On maglev node
sudo journalctl -u exabgp | tail
# Should show announce lines such as: "announce route 10.1.7.16/32 next-hop 10.1.10.17"

# Check IPVS
sudo ipvsadm -Ln
# Should show virtual server and backends

Traffic Not Distributed Evenly

# On gateway - check ECMP hash policy
sysctl net.ipv4.fib_multipath_hash_policy
# Should be: 1 (L3+4 hashing)

# On maglev - check mh-port
sudo ipvsadm -Ln --sort
# Should show: Scheduler: mh (NOT rr, lc, etc)

# Verify mh-port is enabled (kernel logs)
sudo journalctl -k | grep mh-port

High Error Rate

# Check if BFD is working
sudo vtysh -c "show bfd peers"
# All peers should be "up"

# Check BGP session flapping
sudo journalctl -u frr | grep -i established | tail -20

# Check for network issues
mtr -r -c 100 10.1.7.17
# Should have < 1% packet loss

Advanced Topics

Scaling

Add more HAProxy backends:

  1. Deploy HAProxy on new host
  2. Configure VIPs on appropriate interface (dummy vip or loopback)
  3. keepalived auto-detects via health checks
  4. Automatically added to IPVS pool

Add more maglev nodes:

  1. Deploy maglev03 (10.1.7.19)
  2. Configure ExaBGP, keepalived, BFD
  3. Update gateway: maximum-paths 3
  4. Automatically joins ECMP pool

Add third gateway (requires ECMP-capable router):

  1. Deploy mgw03 with same config
  2. Add to upstream router BGP peers
  3. Enable BGP multipath/ECMP on upstream router
  4. Increases redundancy and capacity

IPv6

All components support dual-stack (IPv4 + IPv6):

  • VIP IPv6: 2a0a:7180:81:7::10/128
  • Gateway: 2a0a:7180:81:7::4 (mgw01), ::5 (mgw02)
  • Maglev: 2a0a:7180:81:7::11, ::12
  • HAProxy: 2a0a:7180:81:7::21, ::22, ::23

Same configuration applies, just enable ipv6 unicast in BGP.
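
As a sketch, enabling the IPv6 address-family for the maglev peers on a gateway looks like this (addresses and AS are the ones from this document; the exact neighbor statements in gw/ are authoritative):

router bgp 64516
 address-family ipv6 unicast
  neighbor 2a0a:7180:81:7::11 activate
  neighbor 2a0a:7180:81:7::12 activate
  maximum-paths 2
 exit-address-family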

Security Considerations

Network segmentation:

  • Management network: SSH access only
  • Ingress network: Load balancer traffic only
  • Backend network: Application traffic only

Firewall rules:

  • Gateway: Allow BGP (179), BFD (3784)
  • Maglev: Allow BGP (179), BFD (3784), IPVS traffic
  • HAProxy: Allow HTTP/HTTPS (80/443), health checks

BGP security:

  • route-map NO-IN blocks upstream routes (prevents route injection)
  • Route reflector prevents horizontal peering (reduces attack surface)
  • TTL security (optional): neighbor 10.1.7.17 ttl-security hops 1

Limitations (current non-datacenter setup):

  • Mikrotik router: Active/Standby gateway usage (no ECMP)
  • Single upstream router (no redundancy at internet edge)
  • Can be upgraded with datacenter-grade equipment for full ECMP

Performance Tuning

Kernel Parameters

Gateway nodes:

# IP forwarding
net.ipv4.ip_forward = 1
net.ipv6.conf.all.forwarding = 1

# ECMP multipath hashing (L3+4)
net.ipv4.fib_multipath_hash_policy = 1
net.ipv6.fib_multipath_hash_policy = 1

Maglev nodes:

# IP forwarding
net.ipv4.ip_forward = 1
net.ipv6.conf.all.forwarding = 1

# Stateless operation (NO connection tracking)
net.ipv4.vs.conntrack = 0
net.ipv6.vs.conntrack = 0

HAProxy nodes (traditional and k8s):

# Disable rp_filter for Direct Server Return
net.ipv4.conf.all.rp_filter = 0
net.ipv4.conf.default.rp_filter = 0

# Allow binding to non-local addresses (VIPs)
net.ipv4.ip_nonlocal_bind = 1
net.ipv6.ip_nonlocal_bind = 1

Note: These are automatically deployed by Ansible (ansible/playbooks/deploy_sysctl.yml)

HAProxy Tuning

global
    maxconn 100000        # Increase for high traffic
    nbthread 4            # Match CPU cores

IPVS Tuning

Note: This cluster runs IPVS in stateless mode (net.ipv4.vs.conntrack=0) for optimal performance and scalability. Connection tracking-related tunables are not applicable.

For stateless IPVS operation:

  • No connection table overhead
  • No timeout tuning needed
  • No state synchronization required
  • Works perfectly with Direct Server Return (DSR)

References

Documentation

Deployment:

  • Ansible Automation: ansible/README.md - Automated cluster deployment (recommended)
  • Configuration Management: CONFIGURATION.md - Inventory structure, fwmark mappings, node types
  • Horizon Diagnostics: horizon/README.md - Automated cluster health validation

Manual Setup (Component-Specific):

  • Gateway: gw/README.md - FRR, BGP, ECMP
  • BFD: gw/bfd.md - Fast failure detection
  • Maglev IPVS: l4/keepalived/README.md - IPVS configuration
  • ExaBGP: l4/exabgp/README.md - BGP announcements
  • dnsdist + PowerDNS: l7/dnsdist/README.md - DNS load balancing with dnsdist, PowerDNS, LightningStream, and shadow primary
  • HAProxy (Traditional): l7/haproxy/README.md - L7 load balancing with standalone HAProxy
  • HAProxy (Kubernetes): l7/k8s-haproxy/README.md - L7 load balancing with k0s and HAProxy Ingress

External Resources

RFCs

License

This configuration is provided as-is for educational and operational purposes.

DNS Records

; gateways
mgmt02-vm-mgw01                 IN    A         10.1.7.4
mgmt02-vm-mgw02                 IN    A         10.1.7.5
mgmt02-vm-mgw01                 IN    AAAA      2a0a:7180:81:7::4
mgmt02-vm-mgw02                 IN    AAAA      2a0a:7180:81:7::5
ingress0.mgmt02-vm-mgw01        IN    A         10.1.10.4
ingress0.mgmt02-vm-mgw02        IN    A         10.1.10.5
ingress0.mgmt02-vm-mgw01        IN    AAAA      2a0a:7180:81:a::4
ingress0.mgmt02-vm-mgw02        IN    AAAA      2a0a:7180:81:a::5

; maglev
mgmt02-vm-maglev01              IN    A         10.1.7.17
mgmt02-vm-maglev02              IN    A         10.1.7.18
mgmt02-vm-maglev01              IN    AAAA      2a0a:7180:81:7::11
mgmt02-vm-maglev02              IN    AAAA      2a0a:7180:81:7::12
ingress0.mgmt02-vm-maglev01     IN    A         10.1.10.17
ingress0.mgmt02-vm-maglev02     IN    A         10.1.10.18
ingress0.mgmt02-vm-maglev01     IN    AAAA      2a0a:7180:81:a::11
ingress0.mgmt02-vm-maglev02     IN    AAAA      2a0a:7180:81:a::12

;
; mlb
;

; mlb mgmt
mgmt02-vm-mlb01                 IN    A         10.1.7.33
mgmt02-vm-mlb02                 IN    A         10.1.7.34
mgmt02-vm-mlb03                 IN    A         10.1.7.35
mgmt02-vm-mlb01                 IN    AAAA      2a0a:7180:81:7::21
mgmt02-vm-mlb02                 IN    AAAA      2a0a:7180:81:7::22
mgmt02-vm-mlb03                 IN    AAAA      2a0a:7180:81:7::23
; mlb ingress
ingress0.mgmt02-vm-mlb01        IN    A         10.1.10.33
ingress0.mgmt02-vm-mlb02        IN    A         10.1.10.34
ingress0.mgmt02-vm-mlb03        IN    A         10.1.10.35
ingress0.mgmt02-vm-mlb01        IN    AAAA      2a0a:7180:81:a::21
ingress0.mgmt02-vm-mlb02        IN    AAAA      2a0a:7180:81:a::22
ingress0.mgmt02-vm-mlb03        IN    AAAA      2a0a:7180:81:a::23

;
; k0s
;

; k0s cp
mgmt02-vm-k0s00                 IN    A         10.1.7.60
mgmt02-vm-k0s00                 IN    AAAA      2a0a:7180:81:7::3c
; edge k0s workers mgmt
mgmt02-vm-k0s01                 IN    A         10.1.7.61
mgmt02-vm-k0s02                 IN    A         10.1.7.62
mgmt02-vm-k0s03                 IN    A         10.1.7.63
mgmt02-vm-k0s01                 IN    AAAA      2a0a:7180:81:7::3d
mgmt02-vm-k0s02                 IN    AAAA      2a0a:7180:81:7::3e
mgmt02-vm-k0s03                 IN    AAAA      2a0a:7180:81:7::3f
; edge k0s workers ingress
ingress0.mgmt02-vm-k0s01        IN    A         10.1.10.61
ingress0.mgmt02-vm-k0s02        IN    A         10.1.10.62
ingress0.mgmt02-vm-k0s03        IN    A         10.1.10.63
ingress0.mgmt02-vm-k0s01        IN    AAAA      2a0a:7180:81:a::3d
ingress0.mgmt02-vm-k0s02        IN    AAAA      2a0a:7180:81:a::3e
ingress0.mgmt02-vm-k0s03        IN    AAAA      2a0a:7180:81:a::3f
; other k0s workers mgmt
mgmt02-vm-k0s04                 IN    A         10.1.7.64
mgmt02-vm-k0s05                 IN    A         10.1.7.65
mgmt02-vm-k0s06                 IN    A         10.1.7.66
mgmt02-vm-k0s04                 IN    AAAA      2a0a:7180:81:7::40
mgmt02-vm-k0s05                 IN    AAAA      2a0a:7180:81:7::41
mgmt02-vm-k0s06                 IN    AAAA      2a0a:7180:81:7::42
; other k0s workers ingress
ingress0.mgmt02-vm-k0s04        IN    A         10.1.10.64
ingress0.mgmt02-vm-k0s05        IN    A         10.1.10.65
ingress0.mgmt02-vm-k0s06        IN    A         10.1.10.66
ingress0.mgmt02-vm-k0s04        IN    AAAA      2a0a:7180:81:a::40
ingress0.mgmt02-vm-k0s05        IN    AAAA      2a0a:7180:81:a::41
ingress0.mgmt02-vm-k0s06        IN    AAAA      2a0a:7180:81:a::42

; dns
; shadow primary
mgmt02-vm-dns00                 IN    A         10.1.7.40
mgmt02-vm-dns00                 IN    AAAA      2a0a:7180:81:7::28

; dnsdist
; dnsdist mgmt
mgmt02-vm-dnsdist01             IN    A         10.1.7.41
mgmt02-vm-dnsdist02             IN    A         10.1.7.42
mgmt02-vm-dnsdist01             IN    AAAA      2a0a:7180:81:7::29
mgmt02-vm-dnsdist02             IN    AAAA      2a0a:7180:81:7::2a
; dnsdist ingress
ingress0.mgmt02-vm-dnsdist01    IN    A         10.1.10.41
ingress0.mgmt02-vm-dnsdist02    IN    A         10.1.10.42
ingress0.mgmt02-vm-dnsdist01    IN    AAAA      2a0a:7180:81:a::29
ingress0.mgmt02-vm-dnsdist02    IN    AAAA      2a0a:7180:81:a::2a

; dns
; dns mgmt
mgmt02-vm-dns01                 IN    A         10.1.7.43
mgmt02-vm-dns02                 IN    A         10.1.7.44
mgmt02-vm-dns03                 IN    A         10.1.7.45
mgmt02-vm-dns01                 IN    AAAA      2a0a:7180:81:7::2b
mgmt02-vm-dns02                 IN    AAAA      2a0a:7180:81:7::2c
mgmt02-vm-dns03                 IN    AAAA      2a0a:7180:81:7::2d
; dns ingress
ingress0.mgmt02-vm-dns01        IN    A         10.1.10.43
ingress0.mgmt02-vm-dns02        IN    A         10.1.10.44
ingress0.mgmt02-vm-dns03        IN    A         10.1.10.45
ingress0.mgmt02-vm-dns01        IN    AAAA      2a0a:7180:81:a::2b
ingress0.mgmt02-vm-dns02        IN    AAAA      2a0a:7180:81:a::2c
ingress0.mgmt02-vm-dns03        IN    AAAA      2a0a:7180:81:a::2d