PARALLELIZING NETWORK PACKET PROCESSING IN SOFTWARE ROUTER
Team Members: Amriteshwar Singh, Bibin Thomas, Divya Sasidharan, Hars Vardhan, and Pritesh Baviskar
Parallel Systems and Architecture CS6399 Term Project

Overview
• Introduction
• Project Description
• Software Router vs. Hardware Router
• Parallelizing Across Servers
• Parallelizing Within Server
• Project Methodology
• Experiment Results
• Conclusion
• References

Introduction
• What is a software router?
• How to go about developing one.
• Across-server parallelism.
• Within-server parallelism.
• RouteBricks architecture.

Project Description
Objective
• Understanding and evaluating the clustered software router architecture.
• Exploiting parallelism.
Challenges
• High performance
• Re-programmability

Software Router vs. Hardware Router
[Figure: Traditional Router Architecture (external lines at R bps joined by an internal switching fabric) and Software Router Architecture (single-server software router; cluster router of servers 1 to N joined by an inter-server connection)]

Cluster Router Architecture
Key features
• Each server processes packets at a rate proportional to R (cR, where c is a constant independent of N).
• Decentralization: load-balanced interconnects.
• Extensible: the number of ports grows in proportion to N as new servers are added.

Parallelizing Across Servers
Valiant Load Balancing (VLB) Routing
• Full mesh interconnection.
• Two-phase routing.
• Node roles: source, intermediate, destination.
  • Phase I: the source sends the input packet to a random intermediate node.
  • Phase II: the intermediate node relays the packet to the destination node.
Drawbacks
• Additional cost of forwarding packets (3R as opposed to 2R, still far less than NR).

Direct VLB
• Observation: traffic is often uniform across nodes.
• Avoids the Phase I overhead that VLB incurs to randomize the inflow.
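The two-phase routing described on the VLB slide can be illustrated with a small simulation. This is a minimal sketch, not the project's MPI code: the function names (`vlb_path`, `simulate_load`) are hypothetical, destinations are assumed to be uniformly random, and each role a node plays for a packet (source, intermediate, destination) is counted as one unit of processing. The direct case is also simplified: it assumes traffic is already uniform, so every packet skips Phase I.

```python
import random


def vlb_path(src, dst, n_nodes, rng):
    """Return (source, intermediate, destination) for one packet under two-phase VLB."""
    # Phase I: the source picks a uniformly random intermediate node.
    intermediate = rng.randrange(n_nodes)
    # Phase II: the intermediate node relays the packet to the destination.
    return (src, intermediate, dst)


def simulate_load(n_nodes, packets_per_node, direct=False, seed=0):
    """Count processing units per node; each role played for a packet costs one unit."""
    rng = random.Random(seed)
    load = [0] * n_nodes
    for src in range(n_nodes):
        for _ in range(packets_per_node):
            dst = rng.randrange(n_nodes)  # assumption: uniform traffic
            _, mid, _ = vlb_path(src, dst, n_nodes, rng)
            load[src] += 1      # source role
            if not direct:
                load[mid] += 1  # intermediate role (skipped when Phase I is avoided)
            load[dst] += 1      # destination role
    return load


if __name__ == "__main__":
    n, p = 8, 10_000
    for label, direct in (("VLB", False), ("Direct VLB", True)):
        load = simulate_load(n, p, direct=direct)
        # Average work per node, in multiples of the per-node input rate R.
        print(label, "average load per node:", sum(load) / (n * p), "x R")
```

Because the model counts roles rather than distinct nodes, the average works out to three units per packet for classic VLB and two when Phase I is skipped, which lines up with the 3R and 2R figures on the slide. The per-node values in `load` show how evenly the random intermediate choice spreads the work.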
Direct VLB (continued)
• Avoids the additional per-server processing cost of VLB (max = 2R).

Valiant Load Balanced Mesh
[Figure: Logical view of an 8-port VLB mesh. Labels: external lines, nodes 1 to 8 (each drawn as n, n', n''), internal lines at R/8 bps]

Topology: k-ary n-fly
• k = per-server fanout; n = log_k N

Configurations (Feasibility Study)
• Current servers: 1 router port and 5 NICs per server (max N = 32 ports)
• Additional NICs: 1 router port and 20 NICs per server (max N = 128 ports)
• Faster servers: 2 router ports and 20 NICs per server (max N = 2048 ports)

Parallelism Within Server
Exploring the design considerations
• Server architecture
• Workload balancing across cores
• Single queue or multiple queues
• Batch processing

Server Architecture (Traditional)

Server Architecture (NUMA)

Load Balancing Across Cores
• Accessing the queue to transmit/receive the packet
  Rule: each network queue must be accessed by a single core.
• Processing of the packets
  Rule: a parallel approach must be adopted.
• Queuing technique
  Single queue or multiple queues.

Parallelism Within Server
Exploring the design considerations
• Server architecture
• Division of work among cores
• Single queue or multiple queues
• Batch processing

Experimental Results
• para1 – para8.utdallas.edu machines used
• Mesh interconnect up to 16 nodes
• Valiant Load Balancing
• MPI with multi-threading
• Threads on each node:
  • Packet generator (data-in)
  • Packet receiver within system
  • Table lookup and sender (data-out)
• MPI source code
• Results

Experiment Results
[Figure: Number of nodes vs. rate. x-axis: Number of Nodes (0 to 20); y-axis: Processing Rate (Mbps) (0 to 6000)]

Conclusion
• Performance: competitive performance when compared to a hardware router.
• Cost: 20-port 10 Gbps ~ 13K
• Programmability: use existing MPI library for implementation
• Relaxed performance guarantee

References
• Mihai Dobrescu, Norbert Egi, Katerina Argyraki, Byung-Gon Chun, Kevin Fall, Gianluca Iannaccone, Allan Knies, Maziar Manesh, Sylvia Ratnasamy, "RouteBricks: Exploiting Parallelism to Scale Software Routers," 22nd ACM Symposium on Operating Systems Principles (SOSP), October 2009.
• Katerina Argyraki, Salman Baset, Byung-Gon Chun, Kevin Fall, Gianluca Iannaccone, Allan Knies, Eddie Kohler, Maziar Manesh, Sergiu Nedevschi, Sylvia Ratnasamy, "Can Software Routers Scale?," ACM SIGCOMM Workshop PRESTO, August 2008.
• "Next Generation Intel Microarchitecture (Nehalem)," White Paper, Intel Corp., 2008.
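Supplementary sketch for the "Load balancing across cores" slide: the rule that each network queue must be accessed by a single core, combined with multiple queues and batch processing, can be illustrated as follows. This is a single-threaded Python model under stated assumptions, not the project's implementation. The names `assign_queue` and `drain_in_batches` and the flow-ID modulo mapping are hypothetical stand-ins (real NICs typically hash packet header fields to select a queue), and each "core" is modeled as a loop over the one queue it owns.

```python
from collections import deque


def assign_queue(flow_id, n_queues):
    """Map a flow to one queue so its packets stay on one core (hypothetical mapping)."""
    return flow_id % n_queues


def drain_in_batches(queue, batch_size, handle):
    """Model one core draining the single queue it owns; returns the number of batches."""
    batches = 0
    while queue:
        # Pull up to batch_size packets at once to amortize per-packet overhead.
        batch = [queue.popleft() for _ in range(min(batch_size, len(queue)))]
        for packet in batch:
            handle(packet)
        batches += 1
    return batches


if __name__ == "__main__":
    n_queues, batch_size = 4, 16
    queues = [deque() for _ in range(n_queues)]  # one queue per core, never shared
    for seq in range(100):
        flow_id = seq % 10  # assumed toy traffic: 10 flows, round-robin
        queues[assign_queue(flow_id, n_queues)].append((flow_id, seq))

    processed = []
    for core, q in enumerate(queues):
        n_batches = drain_in_batches(q, batch_size, processed.append)
        print("core", core, "batches:", n_batches)
    print("packets processed:", len(processed))
```

Since no queue is ever touched by more than one core in this model, no locking is needed on the queues, which is the point of the single-core rule. With a single shared queue, the cores would have to synchronize on every dequeue.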