Chapter 0: Welcome & How to Use This Guide

Welcome to the Data Plane Microservice onboarding guide. This document will take you from zero knowledge of this codebase to being a productive contributor.

Who is this for? Developers joining the data-plane team. This guide assumes you have a telecom background (e.g., AMF or system testing) and basic C knowledge, but are new to this specific user-plane codebase.

What is the Data Plane?

The data-plane (also called eric-pc-up-data-plane) is the packet-forwarding engine inside Ericsson's Packet Core Gateway (PCG) product. It processes millions of packets per second (GTP tunneling, NAT, DPI, QoS enforcement, service chaining), all in a cloud-native Kubernetes pod using DPDK for high-speed I/O.

How to Navigate

Repository Structure at a Glance

data-plane/
├── main/           # Main application (dp.c, main.c, config parsers)
├── pktio/          # Packet I/O subsystem (DPDK, AF_XDP, TAP backends)
├── vrf/            # Virtual Routing & Forwarding, GRE, FIB
├── sc/             # Service Chaining engine
├── cgnat/          # Carrier-Grade NAT
├── ipfix/          # IPFIX flow export (NAT logging)
├── upf/            # UPF (User Plane Function) session handling
├── rib-client/     # RIB (Routing Information Base) client
├── punt/           # Punt path (slow-path packets to controller)
├── contest/        # Container integration tests
├── up-common/      # Shared libraries (EVL, mbox, RCU, timers, etc.)
├── cdpi-main/      # DPI (Deep Packet Inspection) integration
├── scripts/        # CI helper scripts
├── dpi-packages/   # DPI heuristics packages
├── Makefile        # Developer shortcuts
├── CMakeLists.txt  # Top-level CMake (in main/)
├── Jenkinsfile*    # CI/CD pipeline definitions
├── ruleset2.0.yaml # Bob build rules
└── common-properties.yaml  # Docker images, repos, versions

Chapter 1: 3GPP User Plane Fundamentals

The Split Architecture

In modern 3GPP networks (4G EPC and 5G Core), the control plane and user plane are separated (CUPS, Control and User Plane Separation):

┌──────────────────────────────────────────────────┐
│                  CONTROL PLANE                   │
│     ┌─────────┐   ┌─────────┐   ┌─────────┐      │
│     │   AMF   │   │   SMF   │   │   PCF   │      │
│     └─────────┘   └────┬────┘   └─────────┘      │
└────────────────────────┼─────────────────────────┘
                         │ PFCP (N4)
┌────────────────────────┼─────────────────────────┐
│                  USER PLANE                      │
│                        ▼                         │
│  ┌─────────┐  N3  ┌──────────┐  N6  ┌─────────┐  │
│  │   gNB   │─────►│   UPF    │─────►│   DN    │  │
│  └─────────┘      │  (data-  │      └─────────┘  │
│                   │  plane)  │    (Internet/     │
│                   └────┬─────┘     Corporate)    │
│                        │                         │
│                        └── This is what          │
│                            data-plane implements!│
└──────────────────────────────────────────────────┘

Key Concepts You Already Know (from AMF)

What the UPF Does

| Function | Description |
|---|---|
| GTP-U tunneling | Encap/decap GTP tunnels on N3 (from gNB) and N9 (between UPFs) |
| Packet Detection | Match packets to PDRs (Packet Detection Rules) from SMF |
| Forwarding | Apply FARs (Forwarding Action Rules): forward, drop, buffer |
| QoS Enforcement | Apply QERs (QoS Enforcement Rules): rate limiting, marking |
| Usage Reporting | Apply URRs (Usage Reporting Rules): volume/time measurement |
| NAT/CGNAT | Carrier-Grade NAT for IPv4 address sharing |
| DPI | Deep Packet Inspection for traffic classification |
| Service Chaining | Steer traffic through service functions (firewall, etc.) |

4G vs 5G Terminology

| 4G (EPC) | 5G (5GC) | Our Code |
|---|---|---|
| SGW-U | UPF (I-UPF) | data-plane |
| PGW-U | UPF (PSA) | data-plane |
| S1-U, S5/S8 | N3, N9 | GRE/GTP tunnels in pktio |
| APN | DNN | Network Instance |
| Bearer | QoS Flow | PDR/FAR/QER |
💡 Key insight: The data-plane supports both 4G and 5G simultaneously. In the code you'll see references to both SGW-U/PGW-U (EPG mode) and UPF (PCG/VPN-GW mode). The same binary handles both.

Chapter 2: PFCP Protocol & Session Management

What is PFCP?

PFCP (Packet Forwarding Control Protocol, 3GPP TS 29.244) is the protocol between the SMF (control plane) and UPF (user plane). Think of it as "the boss telling the worker what to do with packets."

PFCP Session Lifecycle

SMF                                     UPF (data-plane)
 │                                         │
 │── PFCP Session Establishment ──────────►│  Create session with rules
 │◄─ Response ─────────────────────────────│
 │                                         │
 │── PFCP Session Modification ───────────►│  Update rules (handover, QoS change)
 │◄─ Response ─────────────────────────────│
 │                                         │
 │◄─ PFCP Session Report ──────────────────│  Usage report, DL data notification
 │── Response ────────────────────────────►│
 │                                         │
 │── PFCP Session Deletion ───────────────►│  Tear down session
 │◄─ Response ─────────────────────────────│

The Rules Model (PDR → FAR/QER/URR)

Each PFCP session contains rules that tell the UPF how to handle packets:

In the code: PFCP handling lives primarily in the upf/ directory. The session data structures and PDR matching logic are in the service chaining engine (sc/). Configuration from PFCP arrives via the CM Mediator path (JSON over REST) in PCG mode, or via PIAF/ICD in EPG (IPOS) mode.
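To make the model concrete, here is a hedged C sketch of the PDR-to-FAR relationship; all types and functions below are illustrative only, and the real structures in sc/ and upf/ are more elaborate:

#include <stdint.h>
#include <stddef.h>

typedef enum { FAR_FORWARD, FAR_DROP, FAR_BUFFER } far_action_t;

typedef struct {
    uint32_t     far_id;
    far_action_t action;        // forward, drop, or buffer
} far_t;

typedef struct {
    uint32_t precedence;        // lower value = matched first
    uint32_t teid;              // GTP tunnel endpoint ID to match
    uint32_t ue_ip;             // UE IPv4 address to match
    far_t*   far;               // action to apply on a match
} pdr_t;

// Walk the session's PDRs (assumed sorted by precedence) and return
// the FAR of the first rule that matches the packet's keys.
static far_t* session_match_pdr(pdr_t* pdrs, size_t n,
                                uint32_t teid, uint32_t ue_ip)
{
    for (size_t i = 0; i < n; i++) {
        if (pdrs[i].teid == teid || pdrs[i].ue_ip == ue_ip)
            return pdrs[i].far;
    }
    return NULL;  // no PDR matched: drop or punt
}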

How Config Reaches data-plane

The data-plane does NOT speak PFCP directly in PCG mode. Instead:

  1. SMF sends PFCP to the UP Control (UPC) microservice
  2. UPC translates PFCP rules into a JSON configuration model
  3. UPC pushes config to data-plane via CM Mediator (REST/JSON)
  4. data-plane parses JSON and installs forwarding rules
// In main/dp_cm_mediator.c - handles config from CM Mediator
// In main/dp_config_json_parser.c - parses the JSON config model
// In main/dp_config_json_validator.c - validates before applying
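A hedged sketch of how those three files fit together; every function and type below is hypothetical, only the file names in the comments above are real:

typedef struct dataplane dataplane_t;
typedef struct dp_config dp_config_t;   // hypothetical parsed-config handle

// Hypothetical signatures for the parse/validate/apply steps:
dp_config_t* dp_config_json_parse(const char* json);
int          dp_config_json_validate(const dp_config_t* cfg);
void         dp_config_free(dp_config_t* cfg);
void         dataplane_apply_config(dataplane_t* dp, dp_config_t* cfg);

static void on_cm_mediator_config(dataplane_t* dp, const char* json)
{
    dp_config_t* cfg = dp_config_json_parse(json);
    if (cfg == NULL)
        return;                         // malformed JSON

    if (dp_config_json_validate(cfg) != 0) {
        dp_config_free(cfg);            // reject invalid config early
        return;
    }

    dataplane_apply_config(dp, cfg);    // install forwarding rules
}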

Chapter 3: Where Data Plane Fits in Ericsson PCG

The PCG Product

PCG (Packet Core Gateway) is Ericsson's cloud-native 5G UPF product. It runs on Kubernetes and consists of multiple microservices:

┌──────────────────────────────────────────────────────────────┐
│                    PCG Kubernetes Cluster                    │
│                                                              │
│  ┌──────────────┐   ┌──────────────┐   ┌──────────────┐      │
│  │  UP Control  │   │   Routing    │   │ CM Mediator  │      │
│  │    (UPC)     │   │    Agent     │   │              │      │
│  └──────┬───────┘   └──────┬───────┘   └──────┬───────┘      │
│         │ JSON/REST        │ gRPC             │ REST         │
│         ▼                  ▼                  ▼              │
│  ┌────────────────────────────────────────────────────┐      │
│  │              DATA PLANE (this repo!)               │      │
│  │   ┌────────┐  ┌────────┐  ┌────────┐  ┌────────┐   │      │
│  │   │ PKTIO  │  │  VRF   │  │   SC   │  │ CGNAT  │   │      │
│  │   │ (DPDK) │  │  GRE   │  │  DPI   │  │ IPFIX  │   │      │
│  │   └────────┘  └────────┘  └────────┘  └────────┘   │      │
│  └────────────────────────────────────────────────────┘      │
│         ▲ SR-IOV/DPDK                ▲ veth                  │
│  ┌──────┴───────┐             ┌──────┴───────┐               │
│  │  N3/N9 NIC   │             │    N6 NIC    │               │
│  └──────────────┘             └──────────────┘               │
└──────────────────────────────────────────────────────────────┘

Key Neighboring Microservices

| Service | Role | Interface to DP |
|---|---|---|
| UP Control (UPC) | PFCP termination, session management | JSON config via CM Mediator |
| Routing Agent | Route management, BGP | gRPC (route announcements) |
| CM Mediator | Configuration distribution | REST (JSON patches) |
| Log Transformer | Log collection | stdout/syslog |
| PM Server | Metrics collection | Prometheus scrape |
| KV DB (Redis) | Session state, geo-redundancy | Redis protocol |
| Kafka | Event streaming (IPFIX, etc.) | Kafka producer |

EPG vs PCG vs VPN-GW

The same data-plane binary is used in multiple products: EPG (the classic IPOS-based gateway), PCG (the cloud-native 5G UPF), and VPN-GW.

⚠️ Build flags: You'll see WITH_IPOS_SDK, EPG_BUILD throughout the code. For PCG development (what you'll mostly do), these are OFF. Code inside #if defined(WITH_IPOS_SDK) is EPG-specific.

πŸ“ Quiz 1 β€” Context & Fundamentals

Q1: What protocol does the SMF use to communicate with the UPF?

Q2: In PCG mode, how does configuration reach the data-plane?

Q3: What does a PDR (Packet Detection Rule) do?

Q4: What does the compile flag WITH_IPOS_SDK indicate?

Q5: What is the 5G equivalent of the 4G term "APN"?

Chapter 4: High-Level Architecture Overview

Single-Process, Multi-Threaded

The data-plane is a single process with many threads, each pinned to a specific CPU core. This is the classic DPDK run-to-completion pattern: each packet is processed start to finish on one core, avoiding context switches, avoiding locks, and maximizing cache locality.

┌────────────────────────────────────────────────────────────┐
│                     data-plane process                     │
│                                                            │
│ ┌──────────┐ ┌───────────┐ ┌──────────┐ ┌──────────┐       │
│ │Controller│ │  Ingress  │ │  Input   │ │  Output  │       │
│ │ (CPU 0)  │ │ (CPU 1-N) │ │ (CPU M)  │ │ (CPU K)  │       │
│ │          │ │           │ │          │ │          │       │
│ │ • Config │ │ • RX pkts │ │ • Punt   │ │ • TX to  │       │
│ │ • REST   │ │ • Classify│ │ • ARP    │ │   network│       │
│ │ • Timers │ │ • Forward │ │ • BFD    │ │ • Encap  │       │
│ │ • OAM    │ │ • NAT     │ │ • ICMP   │ │          │       │
│ └──────────┘ │ • DPI     │ └──────────┘ └──────────┘       │
│              │ • SC      │                                 │
│              └───────────┘                                 │
└────────────────────────────────────────────────────────────┘

Key Architectural Principles

The Global g_dp Structure

The entire data-plane state is rooted in a single global: dataplane_t* g_dp. This is the "god object" that holds references to all subsystems:

// From main/dp.h
typedef struct dataplane dataplane_t;
extern dataplane_t* g_dp;

dataplane_t*
dataplane_create(
    cb_timer_framework_cpu_t* cb_timer_fws,
    dp_options_t* options,
    random_t* random,
    evl_t* evl,           // Event loop for controller
    evl_t* evl_dbproxy,   // Event loop for DB proxy
    ...
    mbox_t* mbox,         // Mailbox for inter-CPU msgs
    rcu_t* rcu,           // Read-Copy-Update
    pktio_t* pktio,       // Packet I/O subsystem
    ...
);

Chapter 5: CPU Roles

CPU Role Assignment

Each CPU core in the data-plane pod is assigned a specific role. The assignment is dynamic based on available cores:

| Role | Count | Responsibility |
|---|---|---|
| Controller | 1 | Configuration, REST API, timers, OAM, session management |
| Ingress | 1+ | Receive packets from NIC, classify, spray to workers |
| Worker (Forwarding) | Many | Full packet processing pipeline (the "fast path") |
| Input | 1 | Handle punted packets (ARP, BFD, ICMP, control protocols) |
| Output | 1 | Transmit packets to NIC after processing |
Dynamic CPU roles: See main/ipos/assign_dynamic_cpu_roles.c and main/linux/src/assign_dynamic_cpu_roles.c. The number of worker cores scales with the pod's CPU allocation.

Controller Thread Details

The controller runs an EVL event loop and handles configuration, the REST API, timers, OAM, and session management (see the table above).

Worker Thread Details

Workers run a tight poll loop (no sleeping!) that:

  1. Polls the NIC RX queue (via DPDK/eventdev)
  2. Classifies the packet (GTP? IP? ARP?)
  3. Looks up the session/PDR
  4. Applies service chain (DPI, NAT, QoS, firewall)
  5. Encapsulates if needed (GTP, GRE)
  6. Transmits on the TX queue
// Simplified worker loop concept (from pktio/src/pktio.c)
struct rte_mbuf* pkts[BURST_SIZE];
uint16_t nb_pkts, i;

while (running) {
    nb_pkts = rte_eth_rx_burst(port, queue, pkts, BURST_SIZE);
    for (i = 0; i < nb_pkts; i++) {
        classify_packet(pkts[i]);
        apply_service_chain(pkts[i]);
        transmit_packet(pkts[i]);
    }
    mbox_poll(mbox);  // Check for config updates
}

Chapter 6: Packet Processing Pipeline

The Fast Path

The "fast path" is the optimized packet processing pipeline that handles the vast majority of traffic without involving the controller:

Packet arrives on NIC (N3/N9/N6)
        │
        ▼
  1. RX Burst        DPDK polls NIC, gets a batch of packets
        │
        ▼
  2. Classify        Parse headers, identify tunnel, look up NWID
        │
        ▼
  3. PDR Match       Match packet to Packet Detection Rule
        │
        ▼
  4. Service Chain   Execute service chain: DPI → NAT → QoS → Firewall → ...
        │
        ▼
  5. FAR Apply       Forward/Drop/Buffer per Forwarding Action Rule
        │
        ▼
  6. Encap           Add GTP/GRE/IP headers for egress tunnel
        │
        ▼
  7. TX Burst        Batch transmit to NIC
        │
        ▼
Packet leaves on NIC

Slow Path (Punt)

Some packets can't be handled on the fast path and are "punted" to the Input/Controller CPU: ARP, BFD, ICMP, and other control-protocol packets (see the Input role in Chapter 5).

Network Instance (NWID)

A Network Instance (NI, also called NWID in code) is the data-plane's equivalent of a VRF. Each DNN/APN maps to a network instance with its own FIB, GRE tunnels, and interfaces.
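A minimal sketch of what resolving a packet's NWID could look like, assuming a flat lookup table; every name here is hypothetical:

#include <stdint.h>
#include <stddef.h>

#define MAX_NWID 4096                   // illustrative bound

typedef struct fib fib_t;               // opaque: per-instance routing table
typedef struct gre_tunnel gre_tunnel_t; // opaque: per-instance tunnel

typedef struct {
    fib_t*        fib;                  // this instance's FIB
    gre_tunnel_t* tunnels;              // this instance's GRE tunnels
} network_instance_t;

static network_instance_t* nwid_table[MAX_NWID];

// Resolve the network-instance ID carried with a packet to its
// isolated routing context (NULL if the NWID is unknown).
static network_instance_t* nwid_lookup(uint16_t nwid)
{
    return (nwid < MAX_NWID) ? nwid_table[nwid] : NULL;
}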

Chapter 7: Traffic Spraying

Why Spraying?

With multiple worker CPUs, incoming traffic must be distributed. Spraying is how packets are assigned to worker cores. The goal: even load distribution while keeping related packets on the same core (for stateful processing).

Spraying Modes

| Mode | Hash Key | Use Case |
|---|---|---|
| Session | GTP TEID | All packets of one session → same core |
| Flow | 5-tuple (src/dst IP, ports, proto) | Per-flow affinity |
| GRE | GRE key | GRE tunnel-based distribution |
| Packet | Round-robin or RSS | Maximum parallelism (stateless) |
In the code: See pktio/src/ingress_spraying.c and pktio/src/ingress_spraying.h. The spraying logic uses the five-tuple hash (pktio/src/five_tuple.c) and flow tables (pktio/src/flow_table.c).
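For flavor, a hedged sketch of flow-mode spraying; the real hash in pktio/src/five_tuple.c is certainly different, and all names here are illustrative:

#include <stdint.h>

typedef struct {
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;
    uint8_t  proto;
} five_tuple_t;

// Mix the five-tuple into one hash value. Real code would use a
// stronger hash (CRC32, jhash, ...); this only shows the idea.
static uint32_t five_tuple_hash(const five_tuple_t* t)
{
    uint32_t h = t->src_ip ^ t->dst_ip;
    h ^= ((uint32_t)t->src_port << 16) | t->dst_port;
    h ^= t->proto;
    return h * 2654435761u;             // multiplicative mixing step
}

// Same flow -> same hash -> same worker core, which is what keeps
// stateful processing (NAT, DPI) lock-free.
static int pick_worker(const five_tuple_t* t, int n_workers)
{
    return (int)(five_tuple_hash(t) % (uint32_t)n_workers);
}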

Eventdev (Hardware-Assisted Spraying)

On supported NICs, DPDK's eventdev framework can do spraying in hardware/firmware, offloading the ingress CPU. See pktio/src/eventdev.c.

// From pktio/src/eventdev.h - eventdev configuration
// Eventdev distributes packets to worker cores based on flow_id
// This avoids software-based spraying overhead

πŸ“ Quiz 2 β€” Architecture

Q1: What is the "run-to-completion" model?

Q2: Which CPU role handles configuration and REST APIs?

Q3: What mechanism is used for lock-free config updates while forwarding continues?

Q4: Which spraying mode uses the GTP TEID as hash key?

Q5: What kind of packets get "punted" to the slow path?

Chapter 8: Packet I/O (PKTIO)

Overview

The pktio/ directory is the packet I/O subsystem: the interface between the data-plane and the network. It abstracts multiple backends behind a common API.

PKTIO Backends

| Backend | File | Use Case |
|---|---|---|
| libpio | pktio_libpio.c | Production: DPDK-based high-performance I/O |
| linux | pktio_linux.c | AF_XDP / raw sockets for non-DPDK environments |
| pktsock | pktio_pktsock.c | Testing: packet sockets (used in SFT/contest) |
| tap | pktio_tap.c | Testing: TAP devices for local testing |
| native | pktio_native.c | Native Linux networking |

Key PKTIO Concepts

// The main PKTIO public API (pktio/include/pktio/pktio.h)
// ~50K lines - this is the largest header in the project
// Key functions:
pktio_t* pktio_create(...);
void pktio_start(pktio_t* pktio);
void pktio_stop(pktio_t* pktio);
int pktio_rx_burst(pktio_t* pktio, pktio_packet_t* pkts, int max);
int pktio_tx_burst(pktio_t* pktio, pktio_packet_t* pkts, int n);

FIB (Forwarding Information Base)

The pktio/fib/ subdirectory contains the FIB, the routing lookup table used during packet forwarding. It's a longest-prefix-match (LPM) structure optimized for fast lookups.
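A hedged, purely illustrative sketch of LPM semantics (the real FIB uses an optimized data structure, not a linear scan):

#include <stdint.h>
#include <stddef.h>

typedef struct {
    uint32_t prefix;                    // IPv4 prefix (host byte order here)
    uint8_t  len;                       // prefix length in bits (0..32)
    uint32_t next_hop;
} fib_entry_t;

// Among all prefixes covering dst, the longest one wins.
static const fib_entry_t* fib_lookup(const fib_entry_t* fib, size_t n,
                                     uint32_t dst)
{
    const fib_entry_t* best = NULL;
    for (size_t i = 0; i < n; i++) {
        uint32_t mask = fib[i].len ? ~0u << (32 - fib[i].len) : 0;
        if ((dst & mask) == (fib[i].prefix & mask) &&
            (best == NULL || fib[i].len > best->len))
            best = &fib[i];
    }
    return best;                        // NULL: no route, punt or drop
}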

Chapter 9: UPF Engine & DPI

UPF Module (upf/)

The upf/ directory handles UPF-specific session logic: PDR installation, FAR application, URR counting. It works closely with the service chaining engine.

DPI (Deep Packet Inspection)

DPI is handled by the cdpi-main/ directory and the external dpisf library. It classifies application-layer traffic (YouTube, Netflix, WhatsApp, etc.) using signature and heuristics databases.

DPI packages: The dpi-packages/ directory and scripts/download_heuristics_package.sh handle downloading DPI signature databases. These are versioned separately from the main binary.

DPI Thread Controller

DPI can run on dedicated threads to avoid impacting forwarding latency. See main/dp_config_dpi_thread.c and main/dp_config_dpi_thread_controller.c.

Chapter 10: VRF, GRE & FIB

VRF Module (vrf/)

The VRF (Virtual Routing and Forwarding) module manages network instances: isolated routing domains, each with their own FIB, GRE tunnels, and interfaces.

// VRF source files (from vrf/CMakeLists.txt)
src/config.c                // VRF configuration handling
src/config_delta_builder.c  // Incremental config changes
src/controller.c            // VRF controller (control plane side)
src/engine.c                // VRF engine (data plane side)
src/routes.c                // Route management
src/mac_learning_mgr.c      // MAC address learning
src/cre_route.c             // CRE (Cloud Routing Engine) routes

GRE Tunnels

GRE (Generic Routing Encapsulation) tunnels connect the data-plane to the transport network. Each network instance can have multiple GRE tunnels. The GRE subsystem lives in vrf/gre/.

FIB Controller (vrf/fib-ctrl/)

The FIB controller manages route installation and removal. Routes come from several sources, including static configuration and the Routing Agent.

Route Announcements

The data-plane announces its routes to the Routing Agent so that external routers know how to reach UE addresses. See main/dp_config_route_announcements.c.

Chapter 11: CGNAT, Service Chaining & Protocols

CGNAT (cgnat/)

Carrier-Grade NAT translates private IPv4 addresses to shared public IPs. Key components include the public IP pool, the per-subscriber translation table, and a garbage collector for expired translations.
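One common CGNAT implementation technique, shown here as a hedged sketch and not necessarily what cgnat/ does, is to allocate each subscriber a contiguous block of ports on a public IP:

#include <stdint.h>
#include <stdbool.h>

#define FIRST_PORT      1024
#define PORTS_PER_BLOCK 64
#define NUM_BLOCKS      ((65536 - FIRST_PORT) / PORTS_PER_BLOCK)

typedef struct {
    uint32_t public_ip;                 // the shared public IPv4 address
    bool     block_used[NUM_BLOCKS];    // one flag per port block
} cgnat_public_ip_t;

// Allocate a block of PORTS_PER_BLOCK consecutive ports for one
// subscriber; returns the first port of the block, or -1 if exhausted.
static int cgnat_alloc_port_block(cgnat_public_ip_t* ip)
{
    for (int b = 0; b < NUM_BLOCKS; b++) {
        if (!ip->block_used[b]) {
            ip->block_used[b] = true;
            return FIRST_PORT + b * PORTS_PER_BLOCK;
        }
    }
    return -1;                          // this public IP is fully used
}

Port blocks can also keep NAT logging tractable: one record may cover a whole block rather than one per connection.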

Service Chaining (sc/)

The Service Chaining engine is the heart of packet processing. It executes an ordered list of "service functions" on each packet:

Packet → [PDR Match] → [DPI] → [QoS] → [NAT] → [Firewall] → [Forward]
                       └──── Service Chain (configured per session) ─┘

Key files in sc/:

IPFIX (ipfix/)

IPFIX (IP Flow Information Export) generates NAT logging records. Required for lawful intercept: when CGNAT is used, operators must log which subscriber had which public IP:port at what time.
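A hedged sketch of the information such a record must carry; the field names are hypothetical and the real IPFIX templates in ipfix/ will differ:

#include <stdint.h>
#include <time.h>

typedef struct {
    uint32_t subscriber_ip;             // private (pre-NAT) IPv4 address
    uint32_t public_ip;                 // shared public IPv4 address
    uint16_t public_port;               // allocated public port
    uint8_t  proto;                     // TCP or UDP
    time_t   start_time;                // translation created
    time_t   end_time;                  // translation released
} nat_log_record_t;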

Protocol Handling

The data-plane handles several control protocols on the slow path:

| Protocol | Purpose |
|---|---|
| ARP | Address resolution on N6 interfaces |
| BFD | Fast failure detection on GRE tunnels |
| GTP-U | User plane tunneling (N3/N9) |
| LACP | Link aggregation (bonded interfaces) |
| ICMP/ICMPv6 | Ping, unreachable, PMTUD |
| TCP | NAT state tracking, RST generation |

πŸ“ Quiz 3 β€” Subsystems

Q1: Which PKTIO backend is used in production with DPDK?

Q2: What does the NWID table map?

Q3: What is the primary purpose of CGNAT in the data-plane?

Q4: Why is IPFIX logging required when CGNAT is used?

Q5: What does "tromboning" mean in the service chaining context?

Chapter 12: Module Lifecycle API

The Module Pattern

Every subsystem in data-plane follows a strict module lifecycle pattern defined in MODULE_GUIDELINES.md. This ensures controlled startup and shutdown.

State Machine

┌──────┐  create()  ┌─────────┐  start()  ┌──────────┐
│ NULL │───────────►│ CREATED │──────────►│ STARTING │
└──────┘            └─────────┘           └────┬─────┘
                                               │ on_started_cb()
                                               ▼
┌──────┐  delete()  ┌─────────┐   stop()  ┌──────────┐
│ NULL │◄───────────│ STOPPED │◄──────────│ STARTED  │
└──────┘            └─────────┘           └──────────┘
                         ▲ on_stopped_cb()
                           (safe to delete)

API Contract

// Every module exposes these functions:

// Constructor: allocate and initialize
my_module_t* my_module_create(dependencies...);

// Start: begin async operations, call on_started_cb when ready
void my_module_start(my_module_t* m, on_started_cb, cb_ctx);

// Stop: gracefully shut down, call on_stopped_cb when done
void my_module_stop(my_module_t* m, on_stopped_cb, cb_ctx);

// Destructor: free all resources (only after stopped)
void my_module_delete(my_module_t* m);

Key Rules

⚠️ Common mistake: Forgetting to cancel timers or deferred work before calling on_stopped_cb. This leads to use-after-free crashes during shutdown.
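A hedged sketch of a well-behaved stop() under these rules; the module contents are hypothetical and the types simplified:

typedef struct cb_timer cb_timer_t;               // from up-common cbtimer
void cb_timer_stop(cb_timer_t* timer);

typedef void (*on_stopped_cb_t)(void* ctx);

typedef struct {
    cb_timer_t*     retry_timer;                  // created in _create()
    on_stopped_cb_t on_stopped;
    void*           on_stopped_ctx;
} my_module_t;

void my_module_stop(my_module_t* m, on_stopped_cb_t cb, void* ctx)
{
    m->on_stopped     = cb;
    m->on_stopped_ctx = ctx;

    // Cancel timers FIRST so no callback can fire into freed state.
    cb_timer_stop(m->retry_timer);

    // This simple module has no async work in flight, so it can report
    // stopped immediately; a real module would do this from the last
    // completion callback instead.
    m->on_stopped(m->on_stopped_ctx);
}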

Chapter 13: Control-Engine Pattern & Mailbox

Control-Engine Split

Many modules have two parts: a controller (control-plane side, runs on the controller CPU) and an engine (data-path side, runs on the workers).

Example: VRF has vrf/src/controller.c (control) and vrf/src/engine.c (data path).

Mailbox (mbox)

The mailbox is the inter-CPU communication mechanism. It's a lock-free SPSC (single-producer, single-consumer) or MPSC queue.

// Sending a message from controller to worker
mbox_msg_t msg = {
    .type = MBOX_MSG_CONFIG_UPDATE,
    .data = new_config_ptr
};
mbox_send(worker_mbox, &msg);

// Worker polls mailbox in its main loop
while (mbox_recv(my_mbox, &msg)) {
    handle_message(&msg);
}

Mbox Dispatcher

The mbox_dispatcher provides a higher-level API for synchronous and asynchronous cross-thread tasks:

// Execute a function on a specific worker thread
mbox_dispatcher_dispatch(dispatcher, target_cpu, my_function, arg);

// Used for operations that need specific thread context
// (e.g., timer start/stop must happen on the timer's owning thread)

Thread Context Rules

💡 Important: Memory allocation can happen from any thread. But timer start/stop, deferred work scheduling, and certain data structure operations MUST happen on the correct thread. Use mbox_dispatcher when you need to execute something on a specific CPU.

Chapter 14: RCU, Async Config & EVL Batch Iterator

RCU (Read-Copy-Update)

RCU allows the controller to update configuration while workers continue reading the old version without locks:

Controller thread:                     Worker threads:
1. Create new config                   Reading old config (no lock!)
2. rcu_assign_pointer(new)                     │
3. rcu_synchronize()   ◄── grace period ──────►│
4. Free old config                     Now reading new config
// Writer (controller thread)
new_config = build_new_config(...);
rcu_assign_pointer(global_config, new_config);
rcu_synchronize(rcu);  // Wait for all readers to finish with old
free(old_config);

// Reader (worker thread) - no locks needed!
rcu_read_lock(rcu);
config = rcu_dereference(global_config);
use_config(config);
rcu_read_unlock(rcu);

Async Configuration Model

From CONFIG_GUIDELINES.md: Configuration is applied asynchronously using a two-step model:

  1. Step 1 (Build config): Parent module builds a module-specific config object
  2. Step 2 (Apply config): Child module applies it via deferred jobs
// Every module exposes:
void module_set_config(
    module_t* module,
    module_config_t* config,
    on_done_cb_t on_done,      // Called when ALL deferred work completes
    void* on_done_ctx,
    void* on_done_arg
);

// Key rules:
// - All deferred work MUST be started inside set_config
// - on_done MUST be called exactly once, after all work completes
// - on_done may NOT be interrupted or cancelled

EVL Batch Iterator

For "large number objects" (e.g., thousands of network instances), you must NOT iterate in a blocking loop. Use the EVL Batch Iterator:

// BAD - blocks the event loop for too long
for (i = 0; i < 10000; i++) {
    configure_network_instance(ni[i]);  // ❌ Blocks!
}

// GOOD - process one at a time with low-priority defers
evl_batch_iterator_config_t cfg = {
    .defer_priority = EVL_DEFER_PRIORITY_LOW,
    .batch_size = 1,
    .on_item = configure_one_ni,
    .on_done = all_ni_configured_cb
};
evl_batch_iterator_start(&cfg, ni_list, count);
⚠️ Why this matters: The controller thread also handles health checks, timers, and OAM. If config blocks it for seconds, Kubernetes thinks the pod is dead and kills it!

πŸ“ Quiz 4 β€” Code Patterns

Q1: What happens if a module's start function encounters missing resources?

Q2: When is it safe to call module_delete()?

Q3: What is the purpose of RCU in the data-plane?

Q4: Why must large number objects be processed with low-priority defers?

Q5: What mechanism is used for cross-CPU communication?

Chapter 15: up-common & EVL (Event Loop)

What is up-common?

The up-common/ directory is a git submodule containing shared libraries used across all UP (User Plane) microservices. Think of it as the "standard library" for the data-plane ecosystem.

Key Libraries in up-common

| Library | Purpose |
|---|---|
| evl | Event loop (epoll-based async I/O) |
| mbox | Lock-free inter-thread mailbox |
| rcu | Read-Copy-Update for lock-free reads |
| smp | SMP utilities (CPU pinning, barriers) |
| cbtimer | Callback-based timer framework |
| tw-client | Timer wheel client (efficient bulk timers) |
| dstrace | Distributed tracing (OpenTelemetry) |
| container | Data structures (hash maps, lists, etc.) |
| net | Network utilities (inet_addr, etc.) |
| http | HTTP client/server |
| kafka | Kafka producer/consumer |
| msgbus | Message bus abstraction (NATS) |
| evlsock | EVL-integrated socket library |
| tls | TLS configuration and management |
| logging | Structured logging |
| metrics | Prometheus metrics |
| string | String utilities |
| memory | Memory allocation wrappers |
| db-proxy | Database proxy (Redis) |
| db-tracker | Database connection tracking |

EVL (Event Loop)

EVL is the async I/O framework used by the controller thread. It wraps Linux epoll and provides:

// Create an event loop
evl_t* evl = evl_create();

// Register a file descriptor for read events
evl_add_fd(evl, fd, EVL_READ, my_callback, my_arg);

// Schedule a deferred task (runs on next iteration)
evl_defer(evl, my_deferred_fn, arg);

// Schedule a low-priority deferred task
evl_defer_low(evl, my_low_prio_fn, arg);

// Run the event loop (blocks until stopped)
evl_run(evl);
Defer priorities: EVL supports multiple priority levels. High-priority defers (health checks, timers) run before low-priority ones (config application). This is how the async config model avoids blocking critical tasks.

Chapter 16: Timers, Defers & Clients

Timer Framework (cbtimer)

The cbtimer library provides callback-based timers. Each CPU has its own timer framework instance (no cross-CPU timer operations without mbox).

// Create a timer
cb_timer_t* timer = cb_timer_create(fw, my_timeout_cb, arg);

// Start with 5 second timeout
cb_timer_start(timer, duration_from_seconds(5));

// Cancel
cb_timer_stop(timer);

// Destroy
cb_timer_delete(timer);

Timer Wheel (tw-client)

For scenarios with thousands of timers (e.g., session timeouts), the timer wheel is more efficient than individual cbtimers. It batches timer expirations.

Deferred Work

Defers are "do this later" tasks scheduled on the event loop, as the example below shows.
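A small example using the evl_defer API from Chapter 15; the session helpers are hypothetical:

typedef struct session_table session_table_t;     // hypothetical
void session_table_gc(session_table_t* table);    // hypothetical cleanup

static void gc_deferred_cb(void* arg)
{
    // Runs on a later event-loop iteration, off the critical path.
    session_table_gc((session_table_t*)arg);
}

// Instead of garbage-collecting inline and blocking the loop:
evl_defer(evl, gc_deferred_cb, table);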

REST Client & HTTP

The http library provides both server (for health endpoints) and client (for CM Mediator communication). The rest_client is a higher-level wrapper for making REST API calls.

tw_client (Timer Wheel Client)

Used for bulk session timers. The data-plane can have millions of active sessions, each with timeout timers. The timer wheel handles this efficiently with O(1) start/stop operations.
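Why O(1)? Timers expiring in the same tick share a bucket, so start and stop are plain list operations. A much-simplified sketch (the real tw-client differs, e.g., it must handle wrap-around hierarchically):

#include <stdint.h>
#include <stddef.h>

#define WHEEL_SLOTS 4096

typedef struct tw_timer {
    struct tw_timer *next, *prev;       // linked within a slot's bucket
    uint32_t slot;                      // which bucket we live in
    void (*cb)(void* arg);              // expiry callback
    void* arg;
} tw_timer_t;

typedef struct {
    tw_timer_t* slots[WHEEL_SLOTS];
    uint32_t    now;                    // current tick
} timer_wheel_t;

// O(1) start: push onto the bucket that expires `ticks` from now.
static void tw_start(timer_wheel_t* tw, tw_timer_t* t, uint32_t ticks)
{
    t->slot = (tw->now + ticks) % WHEEL_SLOTS;
    t->prev = NULL;
    t->next = tw->slots[t->slot];
    if (t->next)
        t->next->prev = t;
    tw->slots[t->slot] = t;
}

// O(1) stop: unlink from the bucket, no scan needed.
static void tw_stop(timer_wheel_t* tw, tw_timer_t* t)
{
    if (t->prev)
        t->prev->next = t->next;
    else
        tw->slots[t->slot] = t->next;
    if (t->next)
        t->next->prev = t->prev;
}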

Chapter 17: ET Test Framework & dstrace

ET (Ericsson Test) Framework

All unit tests and SFTs use the ET framework, a C test framework from up-common. It provides test and suite macros (ET_TEST, ET_TEST_SUITE, ET_RUN_TEST), assertions (ET_ASSERT, ET_ASSERT_EQ), and per-suite setup/teardown hooks (ET_SETUP, ET_TEARDOWN):

// Test file structure
#include "et/et.h"

ET_TEST(my_module, test_create_and_delete) {
    my_module_t* m = my_module_create(deps);
    ET_ASSERT(m != NULL);
    my_module_delete(m);
}

ET_TEST(my_module, test_start_stop) {
    my_module_t* m = my_module_create(deps);
    my_module_start(m, on_started, ctx);
    // ... drive event loop until started ...
    my_module_stop(m, on_stopped, ctx);
    // ... drive event loop until stopped ...
    my_module_delete(m);
}

ET_TEST_SUITE(my_module) {
    ET_RUN_TEST(my_module, test_create_and_delete);
    ET_RUN_TEST(my_module, test_start_stop);
}

SFT (Signal Flow Tests)

SFTs test the interaction between modules. They use stub/mock versions of dependencies (note the -sft library variants in CMakeLists.txt):

dstrace (Distributed Tracing)

The dstrace library integrates with OpenTelemetry for distributed tracing across microservices. It allows you to trace a request from SMF → UPC → data-plane.

// Create a span for a config operation
dstrace_span_t* span = dstrace_span_start(tracer, "apply_config");
// ... do work ...
dstrace_span_end(span);

πŸ“ Quiz 5 β€” Libraries

Q1: What is up-common?

Q2: What Linux system call does EVL wrap?

Q3: Why use a timer wheel instead of individual cbtimers for sessions?

Q4: What compile defines are set for SFT test builds?

Q5: What does evl_defer_low() do differently from evl_defer()?

Chapter 18: CMake Structure & 3PP Management

CMake Organization

The project uses CMake as its build system. Each subdirectory has its own CMakeLists.txt:

data-plane/
├── main/CMakeLists.txt      # Main executable + libraries
├── pktio/CMakeLists.txt     # Packet I/O library
├── vrf/CMakeLists.txt       # VRF library
├── cgnat/CMakeLists.txt     # CGNAT library
├── sc/CMakeLists.txt        # Service Chaining library
├── ipfix/CMakeLists.txt     # IPFIX library
├── contest/CMakeLists.txt   # Integration tests
└── up-common/               # Submodule with its own CMake

Library Naming Convention

Libraries follow the pattern up-dp-{module}, each with an -sft variant for signal flow tests (e.g., up-dp-gwu and up-dp-gwu-sft).

Build Targets

# The main executable
add_executable(data-plane main.c)

# Links against everything
target_link_libraries(data-plane PRIVATE
    up-dp-gwu        # Main library
    up-dp-vrf        # VRF
    up-dp-cgnat      # CGNAT
    up-dp-sc         # Service Chaining
    up-dp-ipfix      # IPFIX
    up-dp-pktio      # Packet I/O
    up-evl           # Event loop
    up-mbox          # Mailbox
    up-rcu           # RCU
    ...
)

3PP (Third-Party Packages)

Third-party dependencies are managed via the staging directory. Running bob generate:3pp downloads and unpacks them:

# Key 3PP dependencies:
- DPDK          # Data Plane Development Kit
- jansson       # JSON parsing
- hiredis       # Redis client
- flatbuffers   # Serialization (for Kafka messages)
- grpc/protobuf # gRPC (routing agent communication)
- openssl       # TLS
- abseil (absl) # Google's C++ utilities
- yyjson        # Fast JSON parser

Chapter 19: Bob Tool & Rulesets

What is Bob?

Bob is Ericsson's ADP (Application Development Platform) build orchestration tool. It reads ruleset2.0.yaml files and executes build rules in Docker containers for reproducibility.

Key Bob Commands

# Initialize bob (downloads bob binary)
$ bob/bob init-dev

# Generate 3PP staging (download dependencies)
$ bob/bob generate:3pp

# Generate CMake build system
$ bob/bob -p build-dir=builds/san generate:cmake

# Build C++ targets
$ bob/bob -p build-dir=builds/san build:cpp

# Run tests
$ bob/bob -p build-dir=builds/san test:cpp

# Build Docker image
$ bob/bob -p build-dir=builds/release build image package

Ruleset Structure (ruleset2.0.yaml)

The ruleset defines the build rules, Docker images, properties, and task sequences that the commands above invoke.

Bob Parameters (-p)

| Parameter | Effect |
|---|---|
| build-dir=X | Output directory for build artifacts |
| cc=clang | Use Clang compiler |
| cxx=clang++ | Use Clang++ for C++ |
| asan=on | Enable AddressSanitizer |
| ubsan=on | Enable UndefinedBehaviorSanitizer |
| assert=on | Enable assertions |
| coverage=on | Enable code coverage |
| lto=off | Disable Link-Time Optimization (faster builds) |
| shared-libs=on | Build shared libraries (faster linking) |
| cpp-target=X | Build only specific target |

Chapter 20: Makefile Developer Shortcuts

The Developer Makefile

The top-level Makefile wraps bob commands into convenient shortcuts. Run make help to see all targets:

# Quick test (assert build, fast)
$ make test

# Test with sanitizers (catches memory bugs)
$ make testsan

# Test a specific suite
$ make test t=my_module_SUITE

# Build a Docker image for system test
$ make image

# Run contest (container integration tests)
$ make contest

# Get test coverage report
$ make testcov

# Run linter (clang-tidy on your commit)
$ make lint

# Generate compile_commands.json for IDE
$ make lsp

# Clean build artifacts
$ make clean

# Nuclear option β€” delete everything
$ make realclean

Build Variants

| Variant | Directory | Purpose |
|---|---|---|
| assert | builds/assert | Fast build with assertions. Default for make test |
| san | builds/san | ASan + UBSan. Catches memory bugs. Used for make testsan |
| san2 | builds/san2 | Alternative sanitizer config (GCC instead of Clang) |
| debug | builds/debug | No optimization (-O0). Best for GDB debugging |
| cov | builds/cov | Coverage instrumentation. For make testcov |
| release | builds/release | Production build (LTO, optimized). For make image |
💡 Daily workflow: Most developers use make test for quick iteration and make testsan before pushing. The CI runs both plus more.

ccache

The Makefile auto-detects ccache directories (/local/scratch/ccache or /local/persistent_docker/ccache) to speed up rebuilds. If you have ccache configured, subsequent builds are much faster.

Chapter 21: Unit Tests & SFT

Test Organization

Tests live in test/ subdirectories within each module:

cgnat/test/ut_cgnat_*.c         # CGNAT unit tests
pktio/test/test_pktio*.c        # PKTIO tests
sc/test/utest_*.c               # Service Chaining tests
ipfix/test/ut_ipfix_*.c         # IPFIX tests
vrf/test/                       # VRF tests
main/linux/test/test_linux.c    # Linux-specific tests

Running Tests

# Run all tests
$ make test

# Run specific test suite
$ make test t=ut_cgnat_ip_pool_SUITE

# Run with sanitizers
$ make testsan t=ut_cgnat_ip_pool_SUITE

# Under the hood, bob uses CTest:
$ cd builds/san && ctest -R ut_cgnat

Writing a Unit Test

#include "et/et.h"
#include "cgnat/cgnat.h"

// Setup/teardown per test (test_deps and expected_first_ip below are
// placeholders for your module's fixtures)
static cgnat_t* cgnat;

static void setup(void) {
    cgnat = cgnat_create(test_deps);
}

static void teardown(void) {
    cgnat_delete(cgnat);
}

ET_TEST(cgnat_pool, allocate_returns_valid_ip) {
    ip_addr_t addr = cgnat_pool_allocate(cgnat->pool);
    ET_ASSERT(addr.s_addr != 0);
    ET_ASSERT_EQ(addr.s_addr, expected_first_ip);
}

ET_TEST_SUITE(cgnat_pool) {
    ET_SETUP(setup);
    ET_TEARDOWN(teardown);
    ET_RUN_TEST(cgnat_pool, allocate_returns_valid_ip);
}

SFT vs UT

| Aspect | Unit Test (UT) | Signal Flow Test (SFT) |
|---|---|---|
| Scope | Single function/module | Multiple modules interacting |
| Dependencies | Fully mocked | Real modules with stub externals |
| Libraries | Standard libs | *-sft variants (e.g., up-dp-gwu-sft) |
| Speed | Very fast | Slower (more setup) |
| Defines | ENV_UT | ENV_UT + UP_SFT |

Chapter 22: Contest & VETO

Contest (Container Integration Tests)

Contest runs the actual data-plane binary in a Docker container with simulated network interfaces. It tests end-to-end packet forwarding.

# Run contest locally
$ make contest

# What happens:
# 1. Builds data-plane with sanitizers
# 2. Builds Docker image
# 3. Starts docker-compose with:
#    - data-plane container (SUT)
#    - test driver container
#    - mock services (Redis, CM Mediator, etc.)
# 4. Runs test scenarios

Contest Structure

contest/
├── contest.yaml          # Docker-compose definition
├── run.sh                # Test runner script
├── main.c                # Test driver entry point
├── framework/            # Test utilities
│   ├── contest.c         # Framework core
│   ├── pfcp.c            # PFCP message builder
│   ├── config.c          # Config injection
│   ├── session.c         # Session management
│   ├── io.c              # Packet send/receive
│   └── sut.c             # System Under Test control
├── tests/                # Test cases
│   ├── test_pgwu.c       # PGW-U scenarios
│   ├── test_colocated.c  # Co-located SGW-U + PGW-U
│   ├── test_vpngw.c      # VPN-GW scenarios
│   └── test_redis.c      # Redis/geo-redundancy tests
└── simulator/            # Mock external services

VETO (System Test)

VETO is the full system test that runs on real hardware or cloud infrastructure with the complete PCG deployment. It's triggered by the CI pipeline (JenkinsfileVeto).

evrtd: Contest can also run on shared test infrastructure via evrtd (test channel allocation). See contest/evrtd/ for SSH tunnel scripts.

Chapter 23: Fuzz Testing, ISSU & Benchmarks

Fuzz Testing

The project includes fuzz tests that feed random or malformed input to the parsers (for example, the JSON config parser).

ISSU (In-Service Software Upgrade)

ISSU tests verify that the data-plane can be upgraded without dropping traffic. The new version takes over sessions from the old version via shared state in Redis.

Benchmarks

Performance benchmarks measure packets-per-second throughput and latency.

Stress Tests

pktio/test/pktio_stress.c is a long-running stress test that exercises the packet path under load to find race conditions and memory leaks.

πŸ“ Quiz 6 β€” Testing

Q1: What is the difference between UT and SFT?

Q2: What does contest test?

Q3: Which command runs tests with AddressSanitizer?

Q4: What does ISSU testing verify?

Q5: What tool is used for fuzz testing?

Chapter 24: PreCodeReview Pipeline

What Triggers It

Every time you push a patch to Gerrit, the JenkinsfilePreCodeReview pipeline runs automatically. It's the gatekeeper: your patch won't get merged if this fails.

Pipeline Stages

Push to Gerrit
      │
      ▼
┌──────────────────────────────────────────────────┐
│              PreCodeReview Pipeline              │
│                                                  │
│   1. Init         → bob init-dev, generate:3pp   │
│   2. Generate     → CMake generation             │
│   3. Build        → Compile (assert + san)       │
│   4. Test         → Run UT + SFT (both builds)   │
│   5. Lint         → clang-tidy on changed files  │
│   6. Commit Check → Commit message format        │
│   7. Contest      → Container integration tests  │
│   8. Image Build  → Build Docker image           │
│   9. Helm Lint    → Validate Helm chart          │
│  10. Report       → Publish results to Gerrit    │
└──────────────────────────────────────────────────┘
      │
      ▼
+1/-1 on Gerrit (Verified label)

What Gets Checked

💡 Before pushing: Run make test and make lint locally. This catches most issues before CI runs (saves 30+ minutes of waiting).

Chapter 25: Drop Pipeline

What is a "Drop"?

A drop is a versioned release candidate. The Drop pipeline (JenkinsfileDrop) runs on the main branch after patches are merged. It produces artifacts that can be deployed.

Drop Pipeline Stages

  1. Build: Full release build (LTO, optimized)
  2. Test: Complete test suite (UT + SFT + contest)
  3. Image: Build production Docker image
  4. Helm Package: Package Helm chart
  5. Publish: Push image to Docker registry, chart to Helm repo
  6. VA Scan: Vulnerability scanning (Trivy, Grype)
  7. Design Rule Check: ADP compliance checks
  8. Version: Tag with version number

Artifact Repositories

| Artifact | Repository |
|---|---|
| Docker image | serodocker.sero.gic.ericsson.se/proj-pc-dev/ |
| Helm chart (dev) | proj-pc-dev-helm-local |
| Helm chart (drop) | proj-pc-drop-1-helm-local |
| Documentation | proj-pc-marketplace-docs-dev-generic-local |

Chapter 26: PRA (Release) Pipeline

PRA = Product Release Approval

The PRA pipeline produces the official released version that goes to customers. It runs additional quality gates beyond the Drop pipeline.

Version Scheme

# Version format: PREFIX + auto-increment
$ cat VERSION_PREFIX
CXU1012345

# Full version example: CXU1012345_1_R2A (drop 2, revision A)

Chapter 27: VA & SoC Pipelines

VA 2.0 (Vulnerability Assessment)

The VA pipeline scans for security vulnerabilities using multiple tools:

| Tool | What it Scans |
|---|---|
| Trivy | Container image CVEs (OS packages, libraries) |
| Grype | Additional CVE scanning (different database) |
| Hadolint | Dockerfile best practices |
| Kubesec | Kubernetes security configuration |
| Kubeaudit | Kubernetes security audit |
| CIS-CAT | CIS benchmark compliance |
| Xray | Artifactory-based vulnerability scan |
| CAPPA | ADP-specific security checks |

Daily Pipeline

The JenkinsfileDaily runs nightly and catches issues that don't block individual patches.

Readiness Check

JenkinsfileReadinessCheck verifies that the service meets ADP readiness criteria before release.

Chapter 28: Docker Image

Image Structure

The data-plane Docker image is built from a minimal base (ADP Base OS) and contains the data-plane binary together with its runtime dependencies.

Building an Image for ST

# Build a release image
$ make image

# Output tells you the tag to use in ST:
# helm.flags=eric-pc-up-data-plane.imageCredentials.repoPath=proj-pc-dev,
#            eric-pc-up-data-plane.images.dataPlane.tag=1.2.3.unstripped

Image Variants

The build produces both a stripped production image and an unstripped variant that keeps debug symbols (note the .unstripped tag above); use the unstripped one for ST and core-dump analysis.

Contest Dockerfile

Contest uses a separate contest/Dockerfile that includes the test driver alongside the SUT (System Under Test).

Chapter 29: Helm Chart & Kubernetes

Helm Chart: eric-pc-up-data-plane

The data-plane is deployed as a Kubernetes DaemonSet or Deployment via Helm. The key values you typically override are shown below.

Key Helm Values

# Typical values override for ST:
eric-pc-up-data-plane:
  replicaCount: 2
  resources:
    dataPlane:
      requests:
        cpu: "8"
        memory: "16Gi"
        hugepages-1Gi: "4Gi"
      limits:
        cpu: "8"
        memory: "16Gi"
        hugepages-1Gi: "4Gi"
  imageCredentials:
    repoPath: proj-pc-dev
  images:
    dataPlane:
      tag: "your-version"

Pod Resource Requirements

As the values above show, a data-plane pod requests dedicated CPUs, regular memory, and 1 GiB hugepages, with requests equal to limits so the pod gets the Guaranteed QoS class that exclusive CPU pinning requires.

Chapter 30: SR-IOV, DPDK & Cloud-Native I/O

Why DPDK?

Standard Linux networking (kernel TCP/IP stack) adds too much overhead for millions of packets/second. DPDK bypasses the kernel entirely:

Traditional:                        DPDK:
App ←→ Kernel ←→ NIC                App ←→ NIC (direct!)
(syscalls, interrupts, copies)      (userspace driver, poll mode, zero-copy)
~1M pps                             ~20M+ pps per core

SR-IOV (Single Root I/O Virtualization)

SR-IOV allows a physical NIC to present multiple Virtual Functions (VFs) that appear as independent NICs to the pod. Each data-plane pod gets its own VF with direct hardware access.

Physical NIC (PF)
├── VF0 → data-plane pod 1
├── VF1 → data-plane pod 2
├── VF2 → data-plane pod 3
└── ...

Memory: Hugepages

DPDK requires hugepages (1 GB or 2 MB pages) for its packet buffer pools and rings.

Without hugepages, TLB misses would destroy performance: a 4 GiB pool fits in just four 1 GiB TLB entries, versus roughly a million 4 KiB pages.

CPU Isolation

Data-plane pods use CPU Manager (Kubernetes) to get dedicated cores. The kernel's scheduler never runs other workloads on these cores, ensuring consistent latency.

In practice: A typical PCG deployment has data-plane pods with 8-16 dedicated cores, 4GB hugepages, and 2-4 SR-IOV VFs (N3, N6, N9 interfaces).

Chapter 31: Linting & CodeChecker

Clang-Tidy

The project uses clang-tidy for static analysis. Configuration is in cdpi-main/.clang-tidy (29K lines!) which enables/disables specific checks.

# Run clang-tidy on your changes
$ make lint

# What it checks:
# - Bugprone patterns (use-after-move, suspicious comparisons)
# - Modernize suggestions (C++17 features)
# - Performance issues (unnecessary copies)
# - Readability (naming conventions)
# - Custom Ericsson checks

CodeChecker

For deeper analysis, CodeChecker runs Clang Static Analyzer:

# Run CodeChecker (slower but more thorough)
$ make codechecker

# Generates HTML report at:
# builds/assert/code-checker-report/
# View with: python3 -m http.server -d ./builds/assert/code-checker-report

Commit Linting

The CI also checks commit message format:

# Good commit message:
PCPB-12345: Add timeout handling to CGNAT garbage collector

Implement periodic cleanup of expired NAT translations.
The garbage collector now runs every 30 seconds and removes
translations that have been idle beyond the configured timeout.

# Bad (will be rejected):
fix stuff

Chapter 32: Design Rule Checks

ADP Design Rules

Ericsson's ADP (Application Development Platform) defines mandatory design rules for cloud-native services. The CI checks compliance automatically.

Helm DR (Design Rules)

Checked by the adp-helm-dr-check Docker image, which validates the chart against ADP's mandatory Helm design rules.

Image DR (Design Rules)

Checked by adp-image-dr-check, which validates the image against ADP's mandatory image design rules.

PLM DR (Product Lifecycle Management)

Ensures proper product registration in Ericsson's systems (Munin, EVMS).

Chapter 33: VA Scans, FOSSA & EVMS

Vulnerability Assessment Flow

Docker Image Built
      │
      ├──► Trivy Scan  ──► CVE Report
      ├──► Grype Scan  ──► CVE Report
      ├──► Hadolint    ──► Dockerfile Issues
      ├──► CIS-CAT     ──► Compliance Report
      └──► Xray Scan   ──► Artifactory CVEs
              │
              ▼
      VA Report (aggregated)
              │
              ▼
      EVMS Registration (track known vulnerabilities)

FOSSA (License Compliance)

FOSSA scans all dependencies for license compliance. It ensures every third-party license is identified and acceptable for the product.

EVMS (Ericsson Vulnerability Management System)

All known CVEs in the product are registered in EVMS. The team must keep this registration up to date and track each CVE until it is assessed and remediated.

Munin / PLMS

Product registration system. Each release is registered with its product number (the CXU prefix from VERSION_PREFIX) and revision.

πŸ“ Quiz 7 β€” Quality & Compliance

Q1: What does `make lint` run?

Q2: Why is FOSSA scanning important?

Q3: What does the Image DR check verify?

Q4: What is EVMS used for?

Q5: Which tools scan the Docker image for CVEs?

Chapter 34: Gerrit Workflow & Making a Test Image

Gerrit Patch Lifecycle

1. Create branch   → git checkout -b my-feature
2. Make changes    → edit files
3. Commit          → git commit (with JIRA ticket in message)
4. Push to Gerrit  → git push origin HEAD:refs/for/master
5. CI runs         → PreCodeReview pipeline
6. Code review     → Team reviews, +1/+2
7. CI passes       → Verified +1
8. Submit          → Merged to master
9. Drop pipeline   → Builds release artifacts

Commit Message Format

PCPB-XXXXX: Short description (max ~72 chars)

Longer description explaining WHY the change is needed.
What problem does it solve? What approach was taken?

- Bullet points for multiple changes
- Reference related patches if needed

Change-Id: I1234567890abcdef  (auto-generated by git hook)

Making a Test Image for ST

# Step 1: Build the release image
$ make image

# Step 2: Note the output β€” it gives you the Helm override:
# helm.flags=eric-pc-up-data-plane.imageCredentials.repoPath=proj-pc-dev,
#            eric-pc-up-data-plane.images.dataPlane.tag=X.Y.Z.unstripped

# Step 3: In ST (Beets), paste that into "Custom Arguments"

# Alternative: Push to your personal repo
$ docker tag data-plane:latest serodocker.sero.gic.ericsson.se/proj-pc-dev/data-plane:my-test
$ docker push serodocker.sero.gic.ericsson.se/proj-pc-dev/data-plane:my-test

Amending a Patch

# After review feedback, amend your commit:
$ git add -p                    # Stage fixes
$ git commit --amend            # Amend (keeps Change-Id)
$ git push origin HEAD:refs/for/master  # Push new patchset

Rebasing on Latest Master

$ git fetch origin
$ git rebase origin/master
# Fix conflicts if any
$ git push origin HEAD:refs/for/master

Chapter 35: Debugging, Common Pitfalls & Useful Links

Debugging with GDB

# Build debug variant
$ make builds/debug

# Run under GDB (for unit tests)
$ cd builds/debug
$ gdb ./my_test_binary
(gdb) break my_function
(gdb) run

# Attach to running container (for contest/ST)
$ kubectl exec -it data-plane-pod -- /bin/sh
$ gdb -p $(pidof data-plane)

Reading Core Dumps

# With unstripped image:
$ gdb ./data-plane core.12345
(gdb) bt          # Backtrace
(gdb) info threads # All threads
(gdb) thread 3    # Switch to thread 3
(gdb) bt          # Backtrace of that thread

Common Pitfalls

| Pitfall | Symptom | Fix |
|---|---|---|
| Forgetting rcu_synchronize | Use-after-free on config update | Always synchronize before freeing old data |
| Timer not cancelled in stop | Crash during shutdown | Cancel all timers before calling on_stopped_cb |
| Blocking in controller thread | Pod killed (liveness probe fails) | Use defers for long operations |
| Cross-CPU timer operation | Corruption/crash | Use mbox_dispatcher to run on correct CPU |
| Missing EXCLUDE_FROM_ALL | Library built even when not needed | Add EXCLUDE_FROM_ALL to add_library() |
| Large config loop | Health check timeout | Use EVL batch iterator with low priority |
| Sanitizer suppression needed | False positive in 3PP | Add to lsan_suppr.txt or asan suppression file |

Useful Make Targets Cheat Sheet

make test              # Quick test (assert build)
make testsan           # Test with sanitizers
make test t=FOO_SUITE  # Run specific suite
make contest           # Container integration test
make image             # Build Docker image for ST
make lint              # Clang-tidy on your commit
make lsp               # Generate compile_commands.json
make testcov           # Coverage report
make codechecker       # Deep static analysis
make clean             # Clean builds (keep configs)
make realclean         # Delete everything

Useful Links

💡 Final tip: When in doubt, read the code. The data-plane is well-structured; each module follows the same patterns. Once you understand one module (start with a small one like punt/), you can navigate any of them.

🎓 Congratulations!

You've completed the Data Plane Microservice Learning Guide.

35 chapters • 7 quizzes • From 3GPP fundamentals to daily debugging

Now go break something in the codebase and fix it. That's how you really learn. 🚀