PCPB-25616: Smart Internal Batching to Improve Signaling Latency
A study guide for understanding the data-plane and pfcp-endpoint repositories in the context of the smart batching feature.
Chapter 1: What Are These Services?
You're looking at two C microservices that together form the User Plane (UP) of a 3GPP Packet Core (PCG/PCC):
| Service | Role | Analogy |
|---|---|---|
| data-plane | Forwards user traffic (payload) at wire speed using DPDK | The "fast engine" — touches every packet |
| pfcp-endpoint (PEP) | Handles PFCP signaling from the Control Plane (SMF/CP) | The "brain" — manages sessions, associations, paths |
Repo sizes at a glance
| Repo | Language | Approx LOC (src) | Key binary |
|---|---|---|---|
| data-plane | C (some C++) | ~2M+ (huge, 90+ modules) | data-plane |
| pfcp-endpoint | C (some C++) | ~300K (src/) | pfcp-endpoint |
Chapter 2: Where They Fit in PCG/PCC
Key interfaces
- N4 (PFCP): Between SMF and PEP — session establishment/modification/deletion
- Punter interface: DP → PEP (incoming PFCP from network, forwarded via UDP)
- Relay interface: PEP → DP (outgoing PFCP responses, sent back via UDP)
- Session provisioning: PEP → DP (FlatBuffers over internal protocol to install PDRs/FARs)
- N3/N9: GTP-U tunnels (user traffic in/out of DP)
- N6: Towards data network (internet)
Chapter 3: The PFCP Protocol
PFCP (Packet Forwarding Control Protocol, 3GPP TS 29.244) is the protocol between the Control Plane and User Plane. As a former system tester, you've likely seen PFCP messages in traces. Here's the developer perspective:
Message types PEP handles
| Direction | Messages |
|---|---|
| CP → UP (received) | Association Setup/Update/Release, Heartbeat, Session Establishment/Modification/Deletion |
| UP → CP (sent) | Association Update, Heartbeat, Session Report |
Session = collection of rules
- PDR (Packet Detection Rule): Which packets match this rule?
- FAR (Forwarding Action Rule): What to do with matched packets?
- QER (QoS Enforcement Rule): Rate limiting, gating
- URR (Usage Reporting Rule): Counting bytes/packets
Chapter 4: The Feature — Smart Internal Batching (PCPB-25616)
The feature study PowerPoint is at:
/lab/epg_st_sandbox/etahris/PCPB/PCPB-25616/FS/
Problem statement
When the PEP receives a burst of PFCP session messages, it processes them and sends provisioning requests to the DP. Currently, each message may trigger individual internal operations (DB writes, mbox messages, etc.) that could be batched together to reduce overhead and latency.
Why it matters
- Lower signaling latency → faster session setup for subscribers
- Better CPU utilization under load
- Improved throughput for session operations
Where to look in code
| Component | Repo | Relevance |
|---|---|---|
| Work Manager | pfcp-endpoint | Queues and prioritizes incoming PFCP work |
| Session Engine | pfcp-endpoint | Processes session establishment/modification |
| ext_adapter | pfcp-endpoint | Receives messages from DP, feeds work manager |
| UPF session ctrl | data-plane | Receives provisioned sessions from PEP |
| Mailbox (mbox) | data-plane | Inter-CPU message passing — potential batching point |
| Session Queue | data-plane | Queues session operations per-session |
Chapter 5: Data-Plane Architecture
The data-plane runs on multiple vCPUs, each assigned specific roles:
Role descriptions
| Role | Type | Responsibility |
|---|---|---|
| Input | Foreground | Poll NIC, parse packets, calculate flow hash, prioritize |
| Ingress | Foreground | Intra-instance load balancer, flow lookup, traffic steering |
| Egress | Foreground | Main business logic: PFCP-based forwarding (PDR matching, FAR application) |
| Output | Foreground | Send packets to NIC |
| Controller | Background | Session provisioning, config, metrics, OAM |
Key concepts
- Fast-path: Once a flow's PDR is determined, it's cached. Subsequent packets skip DPI and go directly to the assigned egress CPU.
- Flow spraying: Distributing packets across egress CPUs based on 5-tuple hash.
- RCU: Read-Copy-Update for lock-free data structure access across CPUs.
- Eventdev: DPDK event device library for passing events between foreground roles.
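Flow spraying can be sketched in a few lines of C. This is an illustrative stand-in, not the DP's actual hash (production code typically uses NIC RSS or a CRC-based hash); the `five_tuple_t` and function names are invented. The point is the invariant: the same 5-tuple always lands on the same egress CPU.

```c
#include <stdint.h>
#include <stddef.h>

/* Illustrative 5-tuple; the real DP parses this from packet headers. */
typedef struct five_tuple {
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;
    uint8_t  proto;
} five_tuple_t;

/* FNV-1a as a stand-in for the real flow hash. The spraying principle is
 * the same regardless of the hash function used. */
static uint32_t flow_hash(const five_tuple_t* t)
{
    const uint8_t* p = (const uint8_t*)t;
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < sizeof *t; i++) {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

/* Every packet of one flow maps to the same egress CPU, so per-flow state
 * (the cached fast-path PDR) never needs cross-CPU locking. */
static unsigned egress_cpu_for_flow(const five_tuple_t* t, unsigned n_egress)
{
    return flow_hash(t) % n_egress;
}
```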
Source layout (data-plane/)
data-plane/
├── main/ # main.c, dp.c — the application entry point & orchestration
├── upf/ # UPF module: sessions, PDRs, FARs, QERs, URRs, DPI
├── pktio/ # Packet I/O: NIC abstraction, backends (DPDK, Linux, TAP)
├── mbox/ # Mailbox: inter-CPU message passing
├── core-loop/ # Core loop: the run-loop for each CPU role
├── protocol/ # Protocol handlers: GTP, IP/UDP, ARP, BFD, etc.
├── vrf/ # VRF (Virtual Routing & Forwarding)
├── cgnat/ # Carrier-Grade NAT
├── firewall/ # Firewall/ACL
├── itc/ # Internal Traffic Capture
├── up-common/ # Shared libraries (evl, logging, net, tls, etc.)
├── CMakeLists.txt # Top-level build
├── Makefile # Developer convenience targets
└── ARCHITECTURE.md # The architecture doc you should read first!
Chapter 6: PFCP-Endpoint Architecture
PEP is a single-threaded event-loop application (with a few helper threads). It's much simpler than the DP in terms of threading.
Key source files (pfcp-endpoint/src/)
| File | Size | Purpose |
|---|---|---|
| pep_main.c | 94K | Application entry, TLS setup, thread creation |
| pep.c | 110K | Core orchestration, module creation/wiring |
| pep_session_engine.c | 662K | Session establishment/modification/deletion logic |
| pep_association_engine.c | 392K | PFCP association handling |
| pfcp.c | 185K | PFCP message encoding/decoding |
| gtp_path_supervisor.c | 230K | GTP path management and heartbeats |
| pep_ctrl.c | 88K | Control logic, start/stop |
| ext_adapter.c | 57K | External adapter — receives from DP |
pep_session_engine.c is 662K — that's ~15,000+ lines. Don't try to read it top-to-bottom. Use the function index and search for specific flows.
Chapter 7: How DP and PEP Communicate
Message flow: CP → UP session establishment
The dp-ctrl header
Messages between DP and PEP are wrapped in a private "dp-control" header containing:
- Source/destination IP and port
- Network instance ID
- Timestamp (when DP received the message from network)
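A struct matching the fields listed above might look like the sketch below. This layout is purely illustrative (the real dp-control header is private to the repos, and every field name here is invented); it just makes concrete what PEP needs in order to reply via the relay interface.

```c
#include <stdint.h>

/* Hypothetical dp-control header layout. Field names and widths are
 * illustrative only; the real header is defined privately in the repos. */
typedef struct dp_ctrl_hdr {
    uint8_t  src_ip[16];          /* source address (IPv4-mapped or IPv6) */
    uint8_t  dst_ip[16];          /* destination address */
    uint16_t src_port;            /* UDP source port of the PFCP peer */
    uint16_t dst_port;            /* UDP destination port (PFCP is 8805) */
    uint32_t network_instance_id; /* which network instance received it */
    uint64_t rx_timestamp_ns;     /* when DP received the message */
} dp_ctrl_hdr_t;
```

Keeping the original addressing and receive timestamp in the wrapper is what lets PEP send the response back through the DP's relay interface to the correct peer.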
PCG vs EPG steering
- EPG: DP uses partition table to steer to specific PEP instance (same board)
- PCG: DP forwards to PEP's service/cluster IP (Kubernetes). Node messages via TCP, session messages via UDP.
Chapter 8: Threading & Event Loops
PEP threading model
- Main thread (evl): Runs the event loop, processes all business logic
- ext_adapter thread: Receives UDP from DP, puts work into work_manager queues
- Monitor thread: Health checks, metrics
evl_t is the core event loop abstraction from up-common. It handles timers, deferred work, and I/O events. Think of it like libuv or epoll wrapped in a nice C API. Almost everything in PEP runs on the main EVL thread.
DP threading model
- Each vCPU runs a core loop with its assigned role(s)
- The controller CPU runs an EVL for background tasks
- Foreground CPUs are poll-mode (no sleeping, no EVL)
- Communication between CPUs: mailbox (lock-free queues)
📝 Quiz 1: Context & Architecture
1. What is the primary role of the pfcp-endpoint (PEP)?
2. Which DP role applies the main PFCP-based forwarding logic (PDR matching)?
3. What does "fast-path" mean in the data-plane?
4. How does PEP receive PFCP messages from the network?
5. What is the overload protection priority order in the DP?
Chapter 9: UPF Module (data-plane)
The upf/ directory is the heart of session handling in the data-plane. It's where PFCP sessions live after PEP provisions them.
Key files
| File | Purpose |
|---|---|
| upf_session_ctrl.c (1MB!) | Session controller — receives provisioning from PEP, manages session lifecycle |
| upf_session_engine.c | Engine side — runs on egress CPUs, applies session rules to packets |
| upf_session_queue.h | Per-session message queue (serializes operations on same session) |
| sx_session.c/.h | The session data structure (PDRs, FARs, QERs, URRs) |
| sx_session_transaction.c | Transaction handling for session modifications |
| upf_pfcp_punter.c | Receives PFCP from network, forwards to PEP |
| pep_adapter.c | Adapter between PEP's provisioning and DP's session ctrl |
| upf_engine.c (623K) | The main packet processing engine on egress |
Session message types (from upf_session_queue.h)
UPF_SESSION_MSG_TYPES:
PFCP // Generic PFCP operation
PFCP_ESTABLISHMENT // New session
REPORT // Usage report
PAYLOAD // Packet triggered re-evaluation
INVALIDATE // Session invalidation
TERMINATE // Session deletion
GEO_ESTABLISHMENT // Geo-redundancy
INTERNAL_MODIFICATION // Internal config change
...
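The per-session queue is one natural batching point: since it already serializes operations on a session, a drain loop could take a whole run of same-type messages in one pass instead of one dequeue/apply cycle each. The sketch below is an invented illustration of that idea (the enum is a subset with made-up names, not the real `UPF_SESSION_MSG_TYPES`).

```c
#include <stddef.h>

/* Illustrative subset of the session message classes above (not the real
 * enum from upf_session_queue.h). */
typedef enum {
    SMSG_PFCP,
    SMSG_PFCP_ESTABLISHMENT,
    SMSG_REPORT,
    SMSG_TERMINATE,
} smsg_type_t;

/* Return how many messages at the head of a session queue could be applied
 * in one coalesced pass. A TERMINATE is never merged with anything, since
 * nothing after the deletion should be applied. */
static size_t coalescible_run(const smsg_type_t* q, size_t len)
{
    if (len == 0)
        return 0;
    if (q[0] == SMSG_TERMINATE)
        return 1;
    size_t n = 1;
    while (n < len && q[n] == q[0])
        n++;
    return n;
}
```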
Chapter 10: Mailbox (mbox)
The mailbox is the primary mechanism for passing messages between CPUs in the data-plane.
How it works
// mbox/include/mbox/mbox.h
typedef enum mbox_priority {
MBOX_PRIORITY_CRITICAL, // Highest
MBOX_PRIORITY_HIGH,
MBOX_PRIORITY_MID,
MBOX_PRIORITY_LOW, // Lowest
} mbox_priority_t;
typedef struct mbox_msg {
struct {
uint32_t u32_1;
uint32_t u32_2;
void* ptr;
uint64_t u64;
} data;
} mbox_msg_t;
The mbox uses lock-free MPSC queues (Multiple Producer, Single Consumer) — multiple CPUs can send to one CPU without locks.
Relevance to batching
When the controller provisions a session, it sends mbox messages to egress CPUs. If many sessions are being provisioned simultaneously, batching these mbox messages could reduce overhead (fewer cache-line bounces, fewer wakeups).
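One way to batch, using the `mbox_msg_t` shape shown above, is to point a single message at an array of session handles and carry the count in `u32_1`: one mbox send and one egress wakeup then installs N sessions. The helper names below are invented for illustration; only the message layout comes from the header.

```c
#include <stdint.h>

/* Same shape as the mbox_msg_t shown above. */
typedef struct mbox_msg {
    struct {
        uint32_t u32_1;
        uint32_t u32_2;
        void*    ptr;
        uint64_t u64;
    } data;
} mbox_msg_t;

/* Illustrative batching idea: one message carries a whole array of session
 * handles instead of one message per session. The receiver unpacks the
 * array and installs all sessions before returning to its poll loop. */
static mbox_msg_t pack_session_batch(void** sessions, uint32_t count)
{
    mbox_msg_t m = {0};
    m.data.ptr   = sessions;   /* array of session handles */
    m.data.u32_1 = count;      /* number of entries in the array */
    return m;
}

static uint32_t unpack_session_batch(const mbox_msg_t* m, void*** sessions_out)
{
    *sessions_out = (void**)m->data.ptr;
    return m->data.u32_1;
}
```

Ownership of the array has to be settled in the design (e.g. the receiver frees it after installing), since the MPSC queue only passes the pointer.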
Chapter 11: PKTIO & Roles
PKTIO (Packet I/O) is the bottom layer abstracting NIC access. It has a frontend and multiple backends:
| Backend | Interface | Use |
|---|---|---|
| LIBPIO | carrier + pool | Production: DPDK-based I/O |
| LINUX | carrier + pool | Development: AF_PACKET |
| TAP | carrier | Testing |
| NATIVE | pool | Dynamic packet buffers |
The frontend provides hooks for:
- ACL matching
- FIB lookup
- Traffic capture (ITC)
- Traffic steering (partition lookup)
- MTU enforcement
PFCP punting itself (forwarding incoming PFCP from the network to PEP) is handled by upf_pfcp_punter.c.
Chapter 12: Session Engine (PEP)
The session engine (pep_session_engine.c, 662K) is the largest file in PEP. It handles:
- PFCP Session Establishment Request → allocate SEID, build session, provision to DP
- PFCP Session Modification Request → update PDRs/FARs/QERs/URRs
- PFCP Session Deletion Request → clean up session
- Session Report Request → send usage reports to CP
Key flow: Session Establishment
// Simplified flow in pep_session_engine.c:
1. Receive PFCP Session Establishment Request (from work_manager)
2. Decode PFCP IEs (PDRs, FARs, QERs, URRs)
3. Allocate UP SEID (Session Endpoint Identifier)
4. Build internal session representation
5. Encode session as FlatBuffer
6. Send to DP via session_client (provisioning)
7. Wait for DP acknowledgment
8. Build PFCP Session Establishment Response
9. Send response via relay interface back through DP to CP
Related files
- pep_internal_session_engine.c (220K) — internal session operations
- pep_session_info.c (59K) — session information management
- session_commands.c (81K) — command handling
- pfcp_resource_encoder.c (65K) — encoding PFCP resources for DP
Chapter 13: Work Manager (PEP)
The work manager is PEP's internal scheduler. It's critical for understanding where batching can be applied.
Priority queues (highest to lowest)
| Priority | Work Type | Dropped during OLP? |
|---|---|---|
| 1 | Ongoing work (continuations) | No |
| 2 | Node messages (association, heartbeat) | No |
| 3 | Session report responses | No |
| 4 | Session report requests | No |
| 5 | Session deletion requests | Yes |
| 6 | Session modification requests | Yes |
| 7 | Session establishment requests | Yes |
| 8 | Background work (droppable) | Yes |
| 9 | Background work | No |
Overload protection
When queues grow (ext_adapter adds faster than main thread processes), the work manager checks max queue time. If exceeded, lower-priority work is dropped. Establishments drop first, then modifications, then deletions.
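The drop order described above can be captured in a small decision function. This is a sketch with invented names and thresholds (the real work manager checks max queue time against its own configuration); it only illustrates the principle that the deeper the backlog, the more work types are shed, establishments first and deletions last.

```c
#include <stdint.h>
#include <stdbool.h>

/* Illustrative droppable work types, ordered by who is shed first. */
typedef enum {
    WORK_SESSION_ESTABLISH,  /* dropped first under overload */
    WORK_SESSION_MODIFY,     /* dropped next */
    WORK_SESSION_DELETE,     /* dropped last */
} droppable_work_t;

/* Decide whether to drop work of a given type, based on how long the
 * oldest queued item has waited. Thresholds are invented for illustration:
 * the further past max_queue_time the backlog is, the more types drop. */
static bool should_drop(droppable_work_t w, uint64_t oldest_wait_ms,
                        uint64_t max_queue_time_ms)
{
    if (oldest_wait_ms <= max_queue_time_ms)
        return false;  /* not overloaded: nothing is dropped */
    uint64_t over = oldest_wait_ms - max_queue_time_ms;
    switch (w) {
    case WORK_SESSION_ESTABLISH: return true;
    case WORK_SESSION_MODIFY:    return over > max_queue_time_ms;
    case WORK_SESSION_DELETE:    return over > 2 * max_queue_time_ms;
    }
    return false;
}
```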
📝 Quiz 2: Key Subsystems
1. What type of queue does the mailbox (mbox) use?
2. Which file contains the session establishment/modification/deletion logic in PEP?
3. In PEP's work manager, which work type has the LOWEST priority?
4. What does pep_adapter.c in the data-plane do?
5. What serialization format does PEP use to provision sessions to DP?
Chapter 14: Module Lifecycle
Both repos follow a strict module lifecycle pattern (documented in data-plane/MODULE_GUIDELINES.md):
Rules
- Start must not fail — if resources are missing, retry until available
- Start is called once — no support for multiple start calls
- Stop must complete all ongoing work before calling on_stopped_cb
- Stop must be callable before start completes (for interrupted startups)
- Dependencies are injected in create() or start(), never exposed via getters
API pattern
// Every module exposes:
module_t* module_create(dependencies...);
void module_start(module_t*, on_started_cb, cb_arg);
void module_stop(module_t*, on_stopped_cb, cb_arg);
void module_destroy(module_t**);
Chapter 15: Control & Engine Pattern
Components that span both background (controller) and foreground (egress) CPUs follow the Control & Engine pattern:
| Part | Runs on | Responsibility |
|---|---|---|
| *_ctrl.c | Controller CPU | Configuration, lifecycle, provisioning, metrics collection |
| *_engine.c | Egress CPU(s) | Per-packet processing, fast-path logic |
Communication: Control → Engine
// Via mailbox:
mbox_send(mbox, egress_cpuid, MBOX_PRIORITY_HIGH, &msg);
// The engine polls its mbox in the core loop and processes messages
Examples in the codebase
- upf_session_ctrl.c (control) ↔ upf_session_engine.c (engine)
- upf_ctrl.c (control) ↔ upf_engine.c (engine)
- li_session_ctrl.c (control) ↔ li_session_engine.c (engine)
Chapter 16: RCU in Data-Plane
Read-Copy-Update is used extensively to allow lock-free reads on shared data structures:
// Pattern:
// 1. Reader (egress, hot path):
rcu_read_lock();
element_t* elem = hash_table_lookup(table, key);
// use elem... (pointer valid only within rcu_read_lock/unlock)
rcu_read_unlock();
// 2. Writer (controller):
element_t* old = hash_table_lookup(table, key);
element_t* new = copy_and_modify(old);
hash_table_replace(table, key, new);
call_rcu(old, free_element); // deferred free after all readers done
Chapter 17: Async Config (EVL Batch Iterator)
From CONFIG_GUIDELINES.md — the DP is moving from synchronous to asynchronous configuration application:
Rules for large-number objects
- Never iterate over large-number objects in a single blocking loop
- Each object handled by its own low-priority deferred job
- Use EVL Batch Iterator with defer_priority = EVL_DEFER_PRIORITY_LOW and batch_size = 1
Two-step config model
// Step 1: Parent builds module-specific config
module_config_t* cfg = build_module_config(raw_config);
// Step 2: Child applies it
module_set_config(module, cfg, on_done, on_done_ctx, on_done_arg);
// All deferred work started inside set_config
// on_done called exactly once when all work complete
Chapter 18: Build System
Tools
- CMake: Primary build system for both repos
- Bob: CI/CD build orchestrator (wraps Docker + CMake)
- Makefile: Developer convenience (data-plane only)
Data-plane Makefile targets
make test # Build and run tests (fast, clang + sanitizers)
make testsan # Build with slow sanitizers (ASAN+UBSAN)
make testcov # Get UT/SFT coverage
make lsp # Generate compile_commands.json for clangd
make image # Build Docker image for system test
make lint # clang-tidy on head commit only
make builds/san # Sanitizer build
make builds/debug # Debug build (no optimization, good for GDB)
Building pfcp-endpoint standalone
# Using bob:
./bob/bob init-dev
./bob/bob generate:3pp
./bob/bob generate:cmake
./bob/bob build
# Or manually with CMake:
mkdir build && cd build
cmake .. -DPLATFORM=Linux_elc
make -j$(nproc)
Key CMake options
| Option | Default | Purpose |
|---|---|---|
| WITH_IPOS_SDK | OFF | Enable EPG/IPOS-specific code paths |
| BUILD_TESTING | ON | Build unit tests and SFTs |
| USE_ASAN | OFF | Address Sanitizer |
| USE_TSAN | OFF | Thread Sanitizer |
Chapter 19: Testing Layers
| Layer | Location | What it tests | Speed |
|---|---|---|---|
| Unit Tests (UT) | tests/ut/ | Individual functions/modules with mocks | Fast (seconds) |
| SFT (Software Function Test) | tests/sft/ | Full binary with simulated peers | Medium (minutes) |
| TOADS | tests/toads/ | Integration tests in containers | Slow (10+ min) |
| Veto | tests/veto/ | System-level tests in K8s | Slowest (hours) |
PEP SFT architecture
// tests/fixture/pep_sft_fix.c — the test fixture
// Simulates:
// - Control Plane (sends PFCP messages)
// - Data Plane (receives provisioning, sends responses)
// - Config (NWCMA simulator)
// - LEP (Local Endpoint)
// - UEIP Allocator
// tests/sft/pep_sft_sessions.c — session test cases
// tests/sft/pep_sft_associations.c — association test cases
Chapter 20: CI/CD Pipeline
Pipeline stages (both repos)
| Pipeline | Trigger | What it does |
|---|---|---|
| PreCodeReview | Push to Gerrit | Build, lint, UT, SFT, helm chart check |
| Drop | Merge to master | Full build, publish Docker image + Helm chart |
| Pra | Release | PRA (Product Release Approval) pipeline |
| VA2.0 | Scheduled | Vulnerability Assessment scans |
| SoC | Scheduled | Structure of Code analysis |
Gerrit workflow
# 1. Create branch
git checkout -b my-feature
# 2. Make changes, commit
git add -A
git commit # Include Change-Id from commit-msg hook
# 3. Push for review
git push origin HEAD:refs/for/master
# 4. Wait for PreCodeReview (+1/-1)
# 5. Address review comments, amend
git commit --amend
git push origin HEAD:refs/for/master
# 6. Get Code-Review +2, Submit
📝 Quiz 3: Patterns, Build & Test
1. In the module lifecycle, what happens if start() can't get required resources?
2. How do Control and Engine parts communicate in the data-plane?
3. What does `make lsp` do in the data-plane repo?
4. What is the SFT test level?
5. Which CI pipeline runs when you push a commit to Gerrit?
Chapter 21: The Signaling Latency Problem
Let's trace the full signaling path and identify where latency accumulates:
Where batching helps
When many PFCP messages arrive in a burst (e.g., during mass attach), the current model processes them one-by-one sequentially. Smart batching can:
- Batch DB operations: Multiple sessions writing to Redis can be pipelined
- Batch mbox messages: Send one batch notification to egress instead of N individual messages
- Batch provisioning: Send multiple sessions to DP in one request
- Reduce context switches: Process a batch before yielding to the event loop
Chapter 22: Batching Concept
Smart batching strategy
Key design questions for the feature
- Batch trigger: When to flush a batch? (queue depth? timer? both?)
- Batch size: Maximum messages per batch?
- Scope: Which operations can be batched together?
- Error handling: If one message in a batch fails, what happens to others?
- Ordering: Must messages for the same session be ordered?
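A common answer to the batch-trigger question is a hybrid policy: flush when the batch is full, or when the oldest buffered message has waited past a small latency budget, whichever comes first. The sketch below is illustrative (all names and limits are invented, not from the feature design); it shows why the time trigger matters, since a size-only trigger would stall a half-full batch indefinitely.

```c
#include <stdint.h>
#include <stdbool.h>

/* Illustrative hybrid batcher state: flush on size OR age. */
typedef struct batcher {
    uint32_t count;         /* messages currently buffered */
    uint64_t first_enq_ns;  /* arrival time of the oldest buffered message */
    uint32_t max_batch;     /* size trigger */
    uint64_t max_delay_ns;  /* time trigger: the latency budget */
} batcher_t;

static bool batcher_should_flush(const batcher_t* b, uint64_t now_ns)
{
    if (b->count == 0)
        return false;                 /* nothing buffered */
    if (b->count >= b->max_batch)
        return true;                  /* full: flush immediately */
    /* Partially full: flush once the oldest message hits the budget,
     * so batching never adds more than max_delay_ns of latency. */
    return now_ns - b->first_enq_ns >= b->max_delay_ns;
}
```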
Existing batching in the codebase
// pfcp-endpoint/src/pep_options.h:
pep_options_kvdb_request_batch_interval(const pep_options_t* opt);
// Already has a concept of batching DB requests!
// data-plane/upf/src/upf_session_queue.h:
// Per-session queue already serializes operations
// Multiple sessions can be processed in parallel
Chapter 23: Code Reading Plan
Here's your recommended reading order to understand the feature context:
Phase 1: Understand the architecture (Week 1)
| # | File | Why |
|---|---|---|
| 1 | data-plane/ARCHITECTURE.md | Overall DP architecture, roles, concepts |
| 2 | data-plane/MODULE_GUIDELINES.md | Module lifecycle pattern |
| 3 | data-plane/CONFIG_GUIDELINES.md | Async config pattern (relevant to batching) |
| 4 | pfcp-endpoint/docs/mad/pfcp-endpoint.md | PEP architecture, work manager, steering |
| 5 | pfcp-endpoint/CONTRIBUTING.md | How to build and contribute |
Phase 2: Understand the signaling path (Week 2)
| # | File | What to look for |
|---|---|---|
| 6 | pfcp-endpoint/src/pep.h | Main PEP struct and create params — see all dependencies |
| 7 | pfcp-endpoint/src/ext_adapter.c | How messages arrive from DP |
| 8 | pfcp-endpoint/src/pep_session_engine.h | Session engine interface (start with .h, not .c!) |
| 9 | data-plane/upf/src/pep_adapter.c/.h | DP side of the PEP↔DP interface |
| 10 | data-plane/upf/src/upf_session_queue.h | Session queue message types |
| 11 | data-plane/mbox/include/mbox/mbox.h | Mailbox API |
Phase 3: Understand batching points (Week 3)
| # | File | What to look for |
|---|---|---|
| 12 | pfcp-endpoint/src/pep_options.h | Search for "batch" — existing batch config |
| 13 | pfcp-endpoint/src/pep_session_engine.c | Search for provisioning flow (how sessions are sent to DP) |
| 14 | data-plane/upf/src/upf_session_ctrl.c | How DP receives and installs sessions (search for "pep_adapter") |
| 15 | data-plane/mbox/src/mbox.c | Mbox implementation — understand send/receive |
- Always start with the .h file — it shows the public API
- Use grep -n "function_name" to find implementations
- Use make lsp then open in VS Code with clangd for navigation
- Focus on the flow, not every line. Trace one session establishment end-to-end.
Chapter 24: Day-to-Day Workflow
Setting up your environment
# 1. Generate compile_commands.json for IDE
cd /workspace/git/etahris/data-plane
make lsp
# 2. For pfcp-endpoint:
cd /workspace/git/etahris/pfcp-endpoint
./bob/bob init-dev
./bob/bob generate:3pp
./bob/bob generate:cmake
# compile_commands.json will be in the build dir
Running tests locally
# Data-plane: fast test cycle
cd /workspace/git/etahris/data-plane
make test # All tests
make test t=upf_session_SUITE # Specific suite
# PEP: using bob
cd /workspace/git/etahris/pfcp-endpoint
./bob/bob build
./bob/bob test
Debugging tips
- GDB: Use make builds/debug for unoptimized builds
- ASAN: Use make builds/san to catch memory bugs
- Logs: Both services use structured JSON logging. Use upi_tools/frog_rs or format_json_logs to read them
- Core dumps: You already have some in your pfcp-endpoint dir — use gdb pfcp-endpoint core.X
Commit message format
Short summary (max 50 chars)
Longer description of what and why (not how).
Wrap at 72 characters.
Change-Id: I1234567890abcdef (auto-generated by hook)
Key contacts
- Team Infinity: Owns pfcp-endpoint (PDLGCTEAMI@pdl.internal.ericsson.com)
- Team Fiji: Owns async config work in data-plane
📝 Quiz 4: Feature & Workflow
1. What is the main goal of "smart internal batching"?
2. What makes the batching "smart" vs naive?
3. Which file should you read FIRST when studying a new module?
4. What command generates compile_commands.json for the data-plane?
5. In the PEP work manager, during overload which messages are dropped FIRST?