ovstack : A Protocol Stack of Common Data Plane for Overlay Networks
Ryo Nakamura†, Kouji Okada‡, Yuji Sekiya†, Hiroshi Esaki†
†University of Tokyo
‡Lepidum
IEEE/IFIP NOMS, SDNMO 2014
Various Applications and Requirements for Networks
• Various applications communicate via IP networks
  – Content delivery, voice and video streaming
  – Carrier services: L2/L3 VPN, IaaS cloud, MVNO
• Networks should be constructed from these requirements
  – e.g., Content Delivery Networks, SureRoute (Akamai)
[Figure: networks built for each set of application and network/operational requirements, on top of the Internet or multiple xSP networks]
IEEE NOMS, SDNMO Workshop
Limits of the Existing IP Architecture
• IP provides a global-scale network and end-to-end communication
  – IP networks cannot be constructed according to application and user requirements.
• Issues for new routing systems identified by the IRTF Routing Research Group, RFC 6227: Design Goals for Scalable Internet Routing
  – Scalability – Traffic engineering – Multi-homing – Locator/ID separation – Mobility – Simplified renumbering – Modularity – Routing quality – Routing security – Deployability
[Figure: the IP "hourglass" — applications (Mail, Firefox, Skype, Zabbix), application protocols (SMTP, TELNET, HTTP), transports (TCP, UDP, SCTP), IPv4/IPv6 at the waist, and link layers (Ethernet, ATM, 802.11, 1000BASE-T, 10GBASE-SR, 100GBASE-LR4). Lack of flexibility!]
Overlay Network
• IP encapsulation with a header including a new ID space
  – NVO3: Network Virtualization Overlays
  – It reduces operational costs such as hop-by-hop configurations.
  – It provides functionalities that cannot be achieved by IP alone
    (e.g., prefix mobility, multi-homing, multipath, multi-tenancy, and routing).
• Point-to-point tunneling – L2TP, EtherIP, GRE
• IP-over-IP overlays – LISP
• Ethernet-over-IP overlays – VXLAN, NVGRE, GENEVE(?)
• Research – Resilient Overlay Networks, Scribe (ScatterCast, multicast middleware, and more)
[Figure: an overlay with new features on top of the existing IP network]
Problems of Existing Overlay Networks
• Problems of existing overlay technologies
  – Point-to-point tunneling (standardized technologies)
  – Interdependence of control plane and data plane (and applications)
• Performance degradation
• Increased development cost and complicated operations
Tunneling vs Routing Overlays
• Point-to-point tunneling
  – The basic forwarding principle of existing standardized overlays is "point-to-point tunneling" (full mesh).
  – But, to meet the requirements, "overlay routing" has to be achieved.
  – A routing table keyed by node ID on the overlay is required.
[Figure: point-to-point overlay (Node A reaches Node B over a direct 50 ms path) vs multi-hop routing overlay (Node A relays via Node C to Node B, 10 ms + 20 ms = 30 ms)]
Dependence of Control Plane and Data Plane (and Upper-Layer Applications)
• Existing overlays are completely isolated.
  – The functionalities of each overlay have no transparency.
• The control plane and data plane depend on each other and are specific to upper-layer applications (e.g., Ethernet and IP datagrams).
  – This increases development cost and complicates operations.
[Figure: VXLAN (Ethernet), LISP (IP), RON (IP), and Scribe (multicast) each bundle their own data plane and control plane over their own IP transport]
Approach
• Introduce a new abstraction layer for overlays
• Isolate the data plane from the control plane
  – The data plane does not depend on any control-plane system.
  – The control plane just constructs a routing table, like IP and OSPF do.
  – The data plane provides APIs to modify the routing table and overlay IDs.
[Figure: today each overlay (#1, #2) bundles its own application, control plane, data plane, and IP transport; the proposed design shares one common data plane over IP transport among overlays, each keeping its own control plane and application]
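The separation above implies a narrow API between any control plane and the common data plane. A minimal sketch in Python — all class and method names here are illustrative assumptions, not the actual ovstack API:

```python
class CommonDataPlane:
    """Illustrative sketch: a data plane that only stores state.

    Any control plane (OSPF-like, DHT-based, or static) may call these
    methods; the data plane itself never speaks a control protocol.
    """

    def __init__(self):
        self.node_id = None
        self.routes = {}    # app_id -> {dest node ID: next-hop node ID}
        self.locators = {}  # node ID -> IP locator address

    # --- API exposed to control planes (hypothetical names) ---
    def set_node_id(self, node_id):
        self.node_id = node_id

    def add_route(self, app_id, dest, next_hop):
        self.routes.setdefault(app_id, {})[dest] = next_hop

    def del_route(self, app_id, dest):
        self.routes.get(app_id, {}).pop(dest, None)

    def add_locator(self, node_id, ip_addr):
        self.locators[node_id] = ip_addr

    # --- used by the forwarding path ---
    def next_hop_locator(self, app_id, dest):
        # resolve dest node ID -> next-hop node ID -> locator IP
        nh = self.routes.get(app_id, {}).get(dest)
        return self.locators.get(nh) if nh is not None else None
```

Two control planes can then install routes under different application IDs into the same data plane without knowing about each other, which is the point of the isolation.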
Design of a Common Overlay Data Plane
• ovstack: a protocol stack for a common overlay data plane
  – IP is utilized as the locator and the node-to-node transport.
    This helps deployability and modularity.
  – The overlay network meets the requirements of applications.
• Control-plane independent
  – Multiple control planes can exist in the same overlay routing space.
[Figure: overlay routing! The ovstack routing stack adds an "Overlay" layer between Network and Application. A packet carrying app id 1 is forwarded hop by hop across Overlay/Network/Data link/Physical stacks using the routing table for App #1; separate routing tables for Apps #2, #3, and #4 are maintained by control-plane protocols #A, #B, and #C]
Node Identifier
• An identifier for each node on the ovstack overlay
• Fully flat ID
  – The address-block allocation scheme of the Internet Protocol
    • achieves prefix aggregation in the Default Free Zone, which helps scalability.
    • causes locator/ID binding problems.
  – ovstack uses a fully flat ID as the overlay node identifier.
    • Locator/ID binding problems do not occur.
    • DHT-based routing algorithms can be utilized.
[Figure: hierarchical IPv4 allocation — IANA → RIR → NIR → xSP, 0.0.0.0/0 carved into /8s]
Locator/ID binding problems:
1. Address renumbering cost.
2. Address mobility is not supported.
3. Multi-homing requires an eBGP-based connection on the Internet.
Overlay Routing
• Overlay routing using existing tunneling overlays
  – Relay nodes have to understand the encapsulated applications.
  – Re-encapsulation overhead occurs at relay nodes.
[Figure: existing tunneling overlays (e.g., GRE, VXLAN) must decapsulate to Ethernet and switch at every relay node, while ovstack routes at the overlay layer without touching the inner Ethernet frame]
• Overlay hop-by-hop routing
  – Reduces encap/decap overhead.
  – IP is utilized as the node-to-node transport only.
Multiple Overlay Routing Tables
• Overlay routes are constructed from application and operational requirements
  – Delay-aware overlay routes differ from bandwidth-aware ones, even if they encapsulate the same frames (e.g., Ethernet frames).
• Overlay routing tables are isolated per application
  – IP is utilized as a locator and as an end-to-end transport between nodes.
  – A Routing Information Base (RIB) and a Locator Information Base (LIB) are constructed for each application (application multi-tenancy).
[Figure: one ovstack router holds separate RIBs for a delay-based overlay, a bandwidth-based overlay, and a multicast overlay]
ovstack Header Format
• Encapsulation: Ethernet – IP – UDP – ovstack – application packet
• Application independent
  – The Application field indicates the upper-layer application.
    • The routing table to be used is selected according to this field.
    • The routes in that table are decided by the corresponding control plane.
  – The Hash field identifies a "flow" that must be protected from packet reordering under locator-based load balancing.
    • This flow can be decided only by the upper-layer application.
ovstack header format:

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |    Version    |  Application  |      TTL      |     Flags     |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |               Virtual Network Identifier      |      RSV      |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                             Hash                              |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                      Destination Node ID                      |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                        Source Node ID                         |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
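The 20-byte header maps onto five 32-bit words. A sketch of encoding and decoding it in Python, with field widths taken from the diagram (this illustrates the layout only; it is not the kernel implementation):

```python
import struct

def pack_ovstack(version, app, ttl, flags, vni, flow_hash, dst_id, src_id):
    """Encode the 20-byte ovstack header in network byte order.
    The 24-bit VNI occupies the upper bits of the second word; RSV is 0."""
    assert vni < (1 << 24)
    return struct.pack("!BBBBIIII",
                       version, app, ttl, flags,
                       vni << 8,            # 24-bit VNI + 8-bit RSV
                       flow_hash, dst_id, src_id)

def unpack_ovstack(data):
    """Decode the header fields from the first 20 bytes of a packet."""
    version, app, ttl, flags, word2, flow_hash, dst_id, src_id = \
        struct.unpack("!BBBBIIII", data[:20])
    return {"version": version, "app": app, "ttl": ttl, "flags": flags,
            "vni": word2 >> 8, "hash": flow_hash,
            "dst": dst_id, "src": src_id}
```

Because node IDs are 32-bit flat identifiers, they pack as plain unsigned integers with no internal structure.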
Implementation on Linux
• ovstack: an overlay routing stack inside the Linux network stack
  – Overlay routing table and forwarding
  – ovstack is implemented as a kernel module in the Linux network stack.
  – Encapsulated packets are received through a UDP socket. ovstack delivers packets to each protocol driver according to the packet's application number.
  – Multicast support (multiple next hops for one destination, and RPF check)
• oveth: a virtual NIC driver for Ethernet frames over ovstack
  – oveth provides oveth-type virtual NIC interfaces. Pseudo interfaces are created per VNI.
  – Forwarding Database (FDB)
    • A <MAC, NodeID> forwarding table, with add/del/lookup functions.
    • The FDB is built by source MAC / source ID snooping (like VXLAN).
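The source MAC / source ID snooping described above can be sketched as follows — an assumed structure mirroring the slide's description, not the kernel code:

```python
class Fdb:
    """Per-VNI forwarding database: destination MAC -> overlay node ID.

    Entries are learned by snooping the source MAC and source node ID
    of decapsulated frames, like VXLAN's flood-and-learn mode.
    """

    def __init__(self):
        self.table = {}

    def learn(self, src_mac, src_node_id):
        # called on receive: remember which node this MAC lives behind
        self.table[src_mac] = src_node_id

    def lookup(self, dst_mac):
        # None means an unknown destination (which would trigger flooding)
        return self.table.get(dst_mac)

    def delete(self, mac):
        self.table.pop(mac, None)
```

A real implementation would also age out stale entries; that is omitted here for brevity.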
[Figure: ovstack structure in the Linux kernel. oveth interfaces (oveth0/vni0, oveth1/vni1, oveth2/vni2) each hold a per-VNI FDB (DstMac → NodeID, e.g., MacA → 0.0.0.1, MacB → 0.0.0.2); ovstack holds the routing table (Dest → Next-hop, e.g., 0.0.0.1 → 0.0.0.3) and the LIB (NodeID → locator address) above the IP stack and a UDP socket. Userland controls it through the Linux netlink API: FDB add/delete, virtual NIC create/delete, route add/delete, node add/delete, locator add/del, and set node id]
Contribution
• Introducing a new abstraction layer for overlays
  – It provides a common data plane for overlays.
  – IP is utilized as the node-to-node transport.
• Isolation of control/data planes and applications
  – Operations become simple!
  – Developers need only design a control plane for a new overlay technology.
[Figure: the common data plane over IP transport accommodates Overlay #1 and Overlay #2, each with its own control plane and application; every node's stack gains an "Overlay" layer above Network/Data link/Physical]
Data Plane Evaluation
• Packet forwarding performance
  – Performance improvement by overlay routing
  – Degradation due to encapsulation
• Test environment
  – Linux kernel 3.8
  – Intel i7-3770K CPU 3.50 GHz, 32 GB memory, X520 NIC
  – Open vSwitch version 1.9.0 (for Ethernet switching)
  – Tester software implemented using netmap (a framework for high-speed packet I/O)
[Figure: test setup — the tester injects test traffic for throughput and latency tests]
Test cases
1. Overlay routing (ovstack)
2. Overlay routing (using a tunneling overlay: VXLAN)
3. IP routing
4. Ethernet switching
[Figure: layer stacks for each case — the overlay routing cases carry Ethernet over the Overlay layer, while IP routing and Ethernet switching terminate at the Network and Data link layers respectively]
Throughput
[Figure: routing throughput in packets per second vs packet size (0–9000 bytes) for ovstack, VXLAN, Ethernet switching, and IP routing; y-axis 0 to 1.8 Mpps. Encapsulation causes some performance degradation for ovstack, but ovstack's performance is greater than VXLAN + OVS!]
Latency
[Figure: routing delay for 128-byte packets, in microseconds (0–120), for ovstack, VXLAN, Ethernet switching, and IP routing]
Conclusion
• Introducing a new abstraction layer for overlays
• A common overlay data plane implementation
  – Multiple application-specific networks can be accommodated on one overlay routing layer.
  – Developers need only write control-plane code to build new overlay routing systems.
• Future work
  – Designing control-plane protocols
  – Considering the node ID scheme
Thank you for your attention
• The ovstack code is published on GitHub.
  – https://github.com/upa/ovstack
  – Linux kernel modules
    • ovstack
    • oveth
  – iproute2 extensions including
    • the ip ov command
    • the ip oveth command
backup slides
System Requirements
1. An identifier for each node on the overlay network
2. Routing and forwarding according to node identifiers
3. Overlay network multi-tenancy for upper applications
4. An application-independent data plane
5. Isolation of control-plane and data-plane systems

Requirements 1, 2: functionalities to realize a data plane.
Requirement 3: many applications each have their own requirements for networks.
Requirements 4, 5: the data plane has to be usable by each application and control plane transparently.
[Figure: common data plane over IP transport, shared by Overlay #1 and Overlay #2 with their own control planes and applications]
12/06/2014
Host stack
[Figure: packet flow through the host stack. An Ethernet frame to 0:0:1:1:1:1 enters the oveth application driver, which looks up the destination MAC in the FDB for VNI 0 (0:0:1:1:1:1 → node X.X.X.X) and encapsulates the frame with an ovstack header (Dst: X.X.X.X, Src: A.B.C.D). The ovstack routing layer looks up the next hop in the RIB for app 1 (dest X.X.X.X → next hop A.A.A.A), resolves the next-hop node ID to its locator IP in the LIB (A.A.A.A → 10.0.0.1), adds the IP + UDP headers (Dst: 10.0.0.1, Src: my IP), and hands the packet to ordinary IP routing and output]
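The lookup chain in the figure (FDB → RIB → LIB) can be traced with a few dictionaries, using the slide's own placeholder node IDs and addresses:

```python
# FDB for VNI 0: destination MAC -> destination node ID (oveth)
fdb = {"0:0:1:1:1:1": "X.X.X.X",
       "0:0:1:1:2:2": "Y.Y.Y.Y",
       "0:0:1:1:3:3": "Y.Y.Y.Y"}

# RIB for app 1: destination node ID -> next-hop node ID (ovstack)
rib = {"X.X.X.X": "A.A.A.A",
       "Y.Y.Y.Y": "A.A.A.A",
       "Z.Z.Z.Z": "B.B.B.B"}

# LIB for app 1: node ID -> locator IP address (ovstack)
lib = {"A.A.A.A": "10.0.0.1",
       "B.B.B.B": "10.10.0.2",
       "C.C.C.C": "172.16.1.3"}

def transmit(dst_mac):
    """Resolve the outer IP destination for an Ethernet frame:
    oveth consults the FDB, then ovstack consults the RIB and LIB."""
    dst_node = fdb[dst_mac]    # oveth: MAC -> destination node ID
    next_hop = rib[dst_node]   # ovstack: dest node -> next-hop node ID
    return lib[next_hop]       # ovstack: next-hop node -> locator IP
```

Here transmit("0:0:1:1:1:1") returns "10.0.0.1"; the encapsulated packet is then handed to ordinary IP routing toward that locator.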
Overlay networks interconnect
• In a data center, the overlay network should be constructed on a bandwidth-aware topology.
• An overlay backbone over the Internet should be constructed on a latency-aware topology.
• Encapsulated Ethernet frames are transported without en/decapsulation between interconnected overlay networks.
[Figure: Data Centers 1, 2, and 3 each run a bandwidth-aware overlay network over their DC network, interconnected by a latency-aware overlay network over the Internet]
iproute2 extensions
• ovstack
  % ip ov help
  Usage: ip ov [ { add | del } ] [ locator | node ]
               [ app APPID ] [ id NODEID ] [ addr ADDRESS ] [ weight WEIGHT ]
         ip ov set { id | locator | node } [ app APPID ] [ id NODEID ]
               [ addr ADDRESS ] [ weight WEIGHT ]
         ip ov route { show | add | del } [ app APPID ] [ to NODEID ] [ via NODEID ]
         ip ov show { app | id | locator | node } [ app APPID ] [ id NODEID ] [ addr ADDRESS ]
• oveth
  % ip link add name oveth0 type oveth vni 0
  % ip oveth help
  Usage: ip oveth fdb { add | del } [ vni VNI ] [ to MACADDR ] [ via NODEID ]
         ip oveth show { fdb }
• Operational commands are implemented using the Generic Netlink extension.
  – Other common operations through struct net_device_ops and struct rtnl_link_ops are also implemented.