Telecom Control and Data Plane Convergence

Choosing the right multicore software architecture for
high performance control and data plane applications
Magnus Karlsson
Systems Architect
Telecom Market Conditions
• IP-based services driving exponential data traffic growth
• High focus on rich user experience and service platforms
• Declining Average Revenue Per User (ARPU)
• Focus on network OPEX and CAPEX cost
[Figure: expected traffic volume, expected revenue and desired network cost/bit plotted over time, moving from a voice-dominated to a data-dominated period]
Trends in Telecom
• Telecom is going “ALL IP”
- Massive IP packet processing applications
- Power consumption is a critical design factor
- QoS is becoming increasingly important – telecom reliability in the datacom world
• Industry response:
- Application specific processors, multicore and HW acceleration engines
- CPU and DSP use cases converging
- Integration of control plane and data plane into the same multicore processors
Fundamental Differences between Control Plane and Data Plane Applications
• Control Plane Characteristics:
- CPU-bound processing
- Operations and Maintenance (O&M) functions
- Typically terminates IP traffic
• Data Plane Characteristics:
- IO-bound processing
- Highly optimized to use as few CPU cycles as possible
- Typically does not terminate IP traffic
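To make "as few CPU cycles as possible" concrete, the following sketch estimates the cycle budget per packet for an IO-bound data plane core. The link speed, core clock and frame sizes are illustrative assumptions and do not come from the slides; the 20 bytes of per-frame overhead are the standard Ethernet preamble and inter-frame gap.

```python
# Illustrative cycle-budget arithmetic for a data plane core.
# All numeric inputs below (10 Gbit/s link, 1.5 GHz core) are assumptions
# chosen for the example, not figures from the presentation.

PREAMBLE_AND_GAP_BYTES = 20  # 8-byte preamble/SFD + 12-byte inter-frame gap


def packets_per_second(link_bps: float, frame_bytes: int) -> float:
    """Maximum frame rate on an Ethernet link for a given frame size."""
    wire_bits = (frame_bytes + PREAMBLE_AND_GAP_BYTES) * 8
    return link_bps / wire_bits


def cycles_per_packet(core_hz: float, cores: int,
                      link_bps: float, frame_bytes: int) -> float:
    """CPU cycles available per packet if `cores` cores share the load evenly."""
    return core_hz * cores / packets_per_second(link_bps, frame_bytes)


if __name__ == "__main__":
    for size in (64, 512, 1518):
        budget = cycles_per_packet(1.5e9, 1, 10e9, size)
        print(f"{size:5d}-byte frames: {budget:8.1f} cycles/packet")
```

Under these assumptions a single core has on the order of a hundred cycles for each minimum-size frame, which is why data plane code avoids per-packet OS overhead.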
Multicore in Networking Applications
Examples
Application Type => Software Architecture

Parallel, symmetric processing (“run-to-completion”)
[Figure: parallel cores each run stages A, B and C between ingress and egress]
• All tasks in a flow can be handled by a thread
• I/O-bound processing – low CPU budget
• Load balancing through hardware support
• Scaling depends on advanced HW support
• Popular for data plane

Parallel, asymmetric processing (functional pipelining)
[Figure: stages A, B and C run on separate cores, fed from ingress]
• Different cores work on different stages
• Complex protocols - CPU-bound processing
• Load balancing through run-time rebalancing
• Scaling capability depends on OS support for state migration/sharing
• Popular for control plane and O&M
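The difference between the two architectures can be sketched as a small model that records which core executes each stage of each packet. This is a minimal illustration, not an implementation of any product: the function names, the `flow_id % num_cores` rule (standing in for hardware load balancing) and the stage-to-core map are all assumptions made for the example.

```python
# Minimal model of the two multicore software architectures.
# Names and the flow-hash rule are hypothetical, for illustration only.

STAGES = ("A", "B", "C")


def run_to_completion(packets, num_cores):
    """Symmetric model: each flow is pinned to one core that runs every stage."""
    trace = []
    for flow_id, payload in packets:
        core = flow_id % num_cores  # stand-in for hardware flow distribution
        for stage in STAGES:
            trace.append((payload, stage, core))
    return trace


def functional_pipeline(packets, stage_to_core):
    """Asymmetric model: each stage is owned by a core; packets hop between cores."""
    trace = []
    for _flow_id, payload in packets:
        for stage in STAGES:
            trace.append((payload, stage, stage_to_core[stage]))
    return trace


def cores_used_by(trace, payload):
    """Set of cores that touched a given packet."""
    return {core for p, _stage, core in trace if p == payload}


if __name__ == "__main__":
    packets = [(0, "p0"), (1, "p1"), (2, "p2"), (3, "p3")]
    rtc = run_to_completion(packets, num_cores=2)
    pipe = functional_pipeline(packets, {"A": 0, "B": 1, "C": 2})
    print("run-to-completion, cores touching p1:", cores_used_by(rtc, "p1"))
    print("pipeline, cores touching p1:", cores_used_by(pipe, "p1"))
```

In the run-to-completion model a packet never leaves its core, so no state is shared between cores; in the pipeline model every packet crosses cores, which is where OS support for state sharing becomes the limiting factor.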
Another view of Multi-core Use Cases
[Figure: IP packet processing use cases plotted by cycles/byte against level of parallelism or number of cores. Use cases shown: termination, control signaling, transcoding, deep packet inspection, intrusion prevention, other, IP forwarding. The CPU-bound control plane region is labeled “SMP domain: Linux or RTOS”; the IO-bound data plane region is labeled “AMP domain: Linux, RTOS or bare-metal”.]
Multiple Use Cases Demand Multiple OS Solutions
One “size” doesn’t fit all: both Linux and an RTOS are needed
Requirements on Multicore Operating Systems
• Future for data plane applications: bare-metal or AMP
• Direction for control plane apps: SMP
Challenge:
Find an operating environment for multicore processors that satisfies both data
plane and control plane applications, despite their fundamental differences.
Solution:
Use a modular and flexible system that combines the best characteristics of
SMP, AMP and bare-metal.
Incumbent Configuration for Integrated Control and Data Plane
SMP Operating System + Bare-Metal
[Figure: a control plane application runs on an SMP OS with shared OS resources on Core 0 and Core 1; Core 2 … Core n each run a data plane app on an executive environment]
Advantages:
• Control plane application can use high-level SMP RTOS or Linux
• Can fully utilize processor vendor’s Executive Environment, if any
• Raw bare-metal performance for data plane processing
Disadvantages:
• Bare-metal cores become silos, hence only suitable for run-to-completion applications
• Poor debugging, profiling and run-time management capabilities on bare-metal cores
• No platform services such as IPC, file systems and networking stacks available on bare-metal cores
Introducing “XMP” – A Better Way
XMP provides both SMP and AMP Characteristics
XMP Hybrid AMP/SMP Multiprocessing
[Figure: common shared OS resources above one scheduler per core (Core 1, Core 2, … Core n), connected by a kernel event backplane]
SMP Characteristics:
• Easy to use
• Simple configuration
• Load balancing/process migration
AMP Characteristics:
• Deterministic
• Very good scalability
• Suitable for IO intensive applications
XMP – Combining the Best of AMP and SMP in One RTOS
Enea OSE Multicore Edition – an XMP Solution
• Linear scalability of performance on multicore devices
- Asymmetric kernel design that has a scheduler instance on each core
- Avoids use of global or shared locks in kernel calls or interrupt processing
• Maintain single core performance on each core
- Enhanced driver execution model allows HW vendors’ bare-metal SDKs (“Executive Environments”) to run inside an OSE process without additional overhead
• Management and debug
- Shared OS services as in an SMP OS
- Seamless runtime debugging, CPU and memory profiling on all cores
- User-defined load balancing based on open API
- Booting, loading, starting and stopping applications
- Fault management
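As an illustration of what a user-defined load-balancing policy could look like, the sketch below picks one process to migrate from the busiest core to the idlest core. It does not use the OSE API, which the slides do not describe; the function name, the load dictionaries and the threshold are hypothetical.

```python
# Hypothetical load-balancing policy; not the OSE load-balancing API.
# Loads are fractions of one core (0.0 to 1.0) and are assumed to be
# supplied by some per-core and per-process measurement facility.

def pick_migration(core_loads, process_loads, threshold=0.10):
    """Return (process, source_core, target_core), or None if no move helps.

    core_loads:    {core_id: total load on that core}
    process_loads: {core_id: {process_name: load of that process}}
    """
    src = max(core_loads, key=core_loads.get)
    dst = min(core_loads, key=core_loads.get)
    gap = core_loads[src] - core_loads[dst]
    candidates = process_loads.get(src, {})
    if gap <= threshold or not candidates:
        return None
    # Moving a process with load l changes the gap to |gap - 2*l|;
    # choose the process that leaves the two cores closest to balance.
    proc = min(candidates, key=lambda p: abs(gap - 2 * candidates[p]))
    if not 0 < candidates[proc] < gap:
        return None  # the best candidate would not reduce the imbalance
    return proc, src, dst


if __name__ == "__main__":
    loads = {0: 0.9, 1: 0.3, 2: 0.5}
    procs = {0: {"a": 0.5, "b": 0.3, "c": 0.1}}
    print(pick_migration(loads, procs))
```

A real policy would also weigh migration cost and cache affinity; the point here is only that per-core measurement plus a migration primitive is enough to express the policy in user code.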
OSE 5 / Multicore Edition
Fully featured RTOS for distributed and fault-tolerant systems
• Highly scalable RTOS
• Designed for distributed systems
• Support for memory protection and dynamic software updates
• Comprehensive IP networking support
• Optima tools integrated with CodeWarrior
Multicore Edition:
• Hybrid SMP/AMP microkernel
• SMP ease-of-use/configuration
• AMP scalability and performance
• Ability to run bare-metal applications on a core
Reaching Bare-Metal Performance
• Supervisor threads + polling busy loop to achieve bare-metal performance
• The rest of the OS functionality can be used as needed
- No OS overhead when not in use
• Provides management, debugging and profiling of software on all cores
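The polling busy loop mentioned above can be sketched as follows. This is a language-neutral illustration written in Python, not OSE code: the receive ring is modelled with a `deque`, and `poll_loop`, `handle`, `should_stop` and `housekeeping` are hypothetical names. The loop never blocks, and it only calls into "the rest of the OS" (modelled by `housekeeping`) at a configurable interval.

```python
# Sketch of a run-to-completion polling loop. All names are hypothetical;
# a real data plane core would poll a hardware RX queue instead of a deque.
from collections import deque


def poll_loop(rx_ring, handle, should_stop,
              housekeeping=None, housekeeping_every=1024):
    """Busy-poll rx_ring without blocking; return the number of packets handled."""
    processed = 0
    spins = 0
    while not should_stop():
        spins += 1
        if rx_ring:                      # non-blocking check for new work
            handle(rx_ring.popleft())    # process the packet to completion
            processed += 1
        if housekeeping is not None and spins % housekeeping_every == 0:
            housekeeping()               # optional OS services, used only as needed
    return processed


if __name__ == "__main__":
    ring = deque(f"pkt{i}" for i in range(5))
    seen = []
    count = poll_loop(ring, seen.append, should_stop=lambda: not ring)
    print(count, seen)
```

Because the loop owns the core and never sleeps, no scheduler or interrupt latency is added to the packet path; the cost is that the core appears fully loaded even when idle.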
Bare-Metal Performance and Linear Scalability in OSE Multicore Edition
• “Bare Metal” Light-Weight Executive - packet polling loop
• Enea OSE Multicore Edition version 5.4.1
• Two benchmark scenarios:
- Packet processing
  - Simple packet routing in a bare-metal environment
  - Simple packet routing in an OSE Multicore Edition process, with full access to all services in OSE
- Scalability
  - Instantiation of a “silo” application over many cores
Data Plane Processing Performance
[Figure: throughput (Mbyte/s) versus frame size for Bare-Metal and OSE ME]
• Performance nearly identical to raw bare-metal speed
Scalability over many Cores
Scalability OSE Multicore Edition vs Linux
[Figure: total number of transactions (normalized) versus number of cores, 1 to 16; series shown: OSE and Ideal]
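A scalability chart of this kind normalizes the measured transaction count to the single-core result and compares it with the ideal line, where N cores deliver N times the single-core figure. The sketch below shows that calculation; the function names are hypothetical and the sample numbers are made up for the example, they are not the measurements behind the slide.

```python
# Normalized throughput and scaling efficiency.
# The input numbers in the demo are synthetic, not data from the presentation.

def normalize(transactions_by_cores):
    """Divide each measurement by the single-core measurement."""
    base = transactions_by_cores[1]
    return {n: t / base for n, t in sorted(transactions_by_cores.items())}


def scaling_efficiency(transactions_by_cores):
    """Fraction of ideal linear scaling achieved at each core count."""
    return {n: s / n for n, s in normalize(transactions_by_cores).items()}


if __name__ == "__main__":
    synthetic = {1: 100.0, 2: 200.0, 4: 380.0}  # made-up sample values
    for n, eff in scaling_efficiency(synthetic).items():
        print(f"{n} cores: {eff:.2f} of ideal")
```

An efficiency that stays close to 1.0 as cores are added is what "linear scalability" means in practice; shared locks in the kernel typically show up as efficiency falling with core count.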
What about Linux for Control Plane and OSE Multicore Edition for Data Plane?
• Who owns the boot and configuration policies?
• How to partition shared resources like memory?
• How to share devices and services at runtime?
• How to profile and debug all parts of the system?
• How to dynamically balance the load?
Challenge:
How do we create hardware abstraction for all those OSes? A classic request, but one that applications have historically placed on the OS!
Solution:
Heterogeneous execution environments on multicore devices need a new “OS” software layer, a so-called hypervisor!
Enea Hypervisor
Example: Linux, RTOS & EE Applications
[Figure: a multicore processor (CPU 0, CPU 1, CPU 2) running the Enea Hypervisor, which hosts Linux with a Linux app, an RTOS/app and an EE app, together with a tools and management interface]
• Based on OSE Multicore Edition technology: a flexible, lightweight, extensible framework for execution and management
• Enables multiple operating environments to coexist on the same multicore in different configurations
• Enables boot, remote management and system configuration control
• Enables tool support for debugging and profiling of the whole system
Enea Hypervisor Features
• Provides support to:
- Dynamically load and remove native applications or guest domains
- Measure CPU load per core or individual application at runtime
- Define scheduling policy between guest domains
- Perform system-wide load regulation
- Communicate between guest domains using Enea LINX
- Share services like file systems across guest domains
[Figure: a multicore processor (CPU 0, CPU 1, CPU 2) running the Enea Hypervisor, which hosts Linux with a Linux app, an RTOS/app and a bare-metal domain]
Optimized for co-existence between Linux and OSE
– LINX communication channel over shared memory
– Shared file system using LINX
– Shared Ethernet device(s)
– Shared console
Low Entry Alternative
A lightweight OS model for Control/Data Plane
OSEck for Multicore CPUs – AMP Model
OSEck – Enea “lightweight” kernel executive
• LINX shared pool connection manager
• Implements support for EE applications
• Optima support
• Bare-metal performance
• Easy migration - simple to port user applications
• Adds observability on a per-core level
• Linux management core
[Figure: worker cores X and Y each run OSEck with a background user app/system process and a foreground user app on an executive environment, with Ethernet for the data plane; the workers connect through LINX over shared pools to a Linux management core, which also uses LINX over Ethernet]
OSEck – Two Scheduling Models
[Figure: two per-CPU process stacks. Pre-emptive model (CPU x, interrupt-driven) and run-to-completion model (CPU y, with a run-to-completion processing loop). Both show: Optima monitor, dSPEED processes, IDLE, LINX RLNH, shell, timeout server, timer and LINX shmem RX. Level labels shown: PRI 0-31 on both; INT 0-31, INT 1-31 and INT0.]
Scalable, Uniform IPC – Enea LINX
Core-to-core, device-to-device, board-to-board level message based IPC
• Intra-core communication (OSE and OSEck)
- Intra-core message passing is by reference (zero-copy)
• Inter-core communication (LINX)
- LINX can transport messages over shared memory, DMA, hardware queues etc.
• Inter-device communication (LINX)
- Accomplished using LINX communicating over sRIO, Ethernet or PCIe
• Open source for Linux, superior in performance to TIPC
• Proprietary for OSE, OSEck and other OS
[Figure: a message (fields: Message ID, Sender, Receiver, Owner, Data) passes between Process 1, Process 2 and Process 3 using LINX over Shared Memory / sRIO / Ethernet / …. The example system shows OSEck for MSC8155/56 on six SC3850 cores, OSE for P4080 on e500 cores (one control, two workers), and a PC (GPP).]
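Message passing by reference with an explicit owner field can be sketched as below. This is a conceptual model only and does not use the LINX or OSE signal API: the `Message` fields mirror the labels on the slide (Message ID, Sender, Receiver, Owner, Data), while the `Process` class and its `alloc`, `send` and `receive` methods are hypothetical names chosen for the example.

```python
# Conceptual model of zero-copy, ownership-transferring message passing.
# Not the LINX API; class and method names are hypothetical.
from collections import deque
from dataclasses import dataclass


@dataclass
class Message:
    message_id: int
    sender: str
    receiver: str
    owner: str
    data: bytearray


class Process:
    def __init__(self, name):
        self.name = name
        self.inbox = deque()

    def alloc(self, message_id, size):
        """Allocate a message buffer owned by this process."""
        return Message(message_id, sender=self.name, receiver="",
                       owner=self.name, data=bytearray(size))

    def send(self, msg, dest):
        """Hand the message to dest by reference; the payload is not copied."""
        if msg.owner != self.name:
            raise PermissionError(f"{self.name} does not own message {msg.message_id}")
        msg.sender = self.name
        msg.receiver = dest.name
        msg.owner = dest.name        # ownership moves with the message
        dest.inbox.append(msg)       # same object, no copy of msg.data

    def receive(self):
        """Return the next message, or None if the inbox is empty."""
        return self.inbox.popleft() if self.inbox else None


if __name__ == "__main__":
    p1, p2 = Process("p1"), Process("p2")
    msg = p1.alloc(message_id=7, size=4)
    msg.data[:] = b"ping"
    p1.send(msg, p2)
    got = p2.receive()
    print(got is msg, got.owner, bytes(got.data))
```

Transferring ownership on send is what makes the zero-copy path safe: after sending, the sender may no longer touch the buffer, so no lock is needed around the payload. Between cores or devices the same interface can be kept while the transport copies or DMA-transfers the data.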
Enea Multicore Solutions
Target customers:
• Network Equipment Providers
• Adjacent markets
Target applications:
• Packet forwarding/processing
• LTE L1/L2 processing
• Connectivity layer processing
• Combined control/data plane
[Figure: cycles/byte versus level of parallelism for transcoding, termination, other packet processing, control signaling and IP forwarding]
Enea offering:
• OSE ME - Fully featured multicore RTOS with SMP ease-of-use and AMP performance/scalability
• Enea Hypervisor – for heterogeneous/guest OS support, like Linux + OSE
• OSEck - Compact kernel “AMP” RTOS for signal processing/packet processing on multicore CPUs/DSPs
• LINX - Inter Process Communication (IPC) framework: OS, processor, interconnect independent
• Optima – Eclipse development tools for debugging and system profiling/analysis
True Convergence of Control and Data Plane
A single implementation that supports SMP and AMP, or hybrid models, for all use cases:
• Control Plane
• Control + Data Plane (with bare metal)
• Data Plane
Technologies and features
• Heterogeneous systems - support for both Linux and RTOS
• Hypervisor/Virtualization
• Performance
• Load balancing
• Fast Path IP
• System wide IPC
• High Availability – Fault Localization
• Integration with application environments
• Eclipse based integrated tools
Summary
No single processing model meets every use case for multicore in telecom. An understanding of specific use cases is crucial in determining the best solution:
• Control plane and data plane applications require different software architectures, but can co-exist on multicore processors
• A flexible OS platform that can combine properties of bare-metal, AMP, and SMP is the best fit
Enea provides the most flexible OS framework that addresses most use cases:
• OSE Multicore Edition, with its hybrid kernel technology, can support both IO-bound and CPU-bound processing in one homogeneous configuration
• For control plane applications on Linux, OSE Multicore Edition can be extended with Hypervisor support to incorporate guest domains in a heterogeneous configuration
• Or, a lightweight, small-footprint AMP model (OSEck) for special low-end or entry use cases
Questions?
magnus.karlsson@enea.com