© 2016 Mesosphere, Inc. All Rights Reserved.
LYING, CHEATING, AND WINNING WITH CONTAINERS IN NETWORKING
Sargun Dhillon, 2016
WHO AM I?
WHO DO I WORK FOR?
DC/OS
NETWORKING: A HISTORY
IT, IN THE YEAR 2000
• Applications mostly client / server
• Mostly desktops, controlled by central IT
• Local storage, on-site
• Low bandwidth, high latency
The Transformation
OLD WORLD
NEW WORLD
BYOD: BRING YOUR OWN DEVICE
• Real gains from letting employees use their own devices
  • Productivity
  • Morale
  • Cost
• End node security problems
OLD WORLD
RISE OF SOFTWARE AS A SERVICE
• Third-party provided apps:
  • Hosted applications
  • Hosted email
  • Hosted file storage
• Elastic services for an elastic workforce
• Amortization of cost over time
NEED
• More applications need to be hosted
• Better utilization of computer hardware
• Greater need for faster reactivity
• Greater need for “reliability”
• Greater need for elasticity
• Greater need for scalability
ENABLING TECHNOLOGIES
• Hardware-assisted virtualization
  • Guest oblivious
• Paravirtualization
  • Guest assisted
• Proprietary Storage Hardware
• Proprietary Networking Hardware
The Private “Cloud”
Networking’s Answer?
“Software Defined Networking”
OpenFlow
Never Panned Out
Virtualization Kept Becoming More Common
And then something happened…
EXPLOSION OF THE SOFTWARE DEFINED NETWORKS
• Offerings
  • Cisco Nexus
  • Juniper Contrail
  • PLUMgrid
  • Nuage
  • Calico
• Also replicated state
And after a couple years…
(Or more generically, “containers”)
NETWORKING SOLUTIONS COMPARED
Containers:
• Calico
• PLUMgrid
• Cisco Contiv
• Contrail
• Weave
OpenStack:
• Calico
• PLUMgrid
• Cisco Contiv
• Contrail
• Cisco Nexus 1000V
Containers are all about abstraction
Containers are UX
WHERE THE RUBBER MEETS THE ROAD
• Namespaces for abstraction
  • Mount
  • PID
  • User
  • UTS
  • Network
• cgroups for resource isolation
  • Memory
  • Blkio
  • CPU
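On Linux, the namespaces a process belongs to are exposed as symlinks under /proc/<pid>/ns. A minimal Python sketch, assuming a Linux host with /proc mounted (the helper name list_namespaces is ours, for illustration only), lists them for the current process:

```python
import os

def list_namespaces(pid="self"):
    """Return {namespace_name: identifier} for a process, or {} if unavailable.

    Assumes a Linux host with /proc mounted; elsewhere the directory does
    not exist and an empty dict is returned.
    """
    ns_dir = f"/proc/{pid}/ns"
    namespaces = {}
    try:
        entries = os.listdir(ns_dir)
    except OSError:
        return namespaces
    for name in sorted(entries):
        try:
            # Each entry is a symlink whose target looks like "net:[<inode>]".
            namespaces[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            continue  # entry vanished or is not readable
    return namespaces

if __name__ == "__main__":
    for name, ident in list_namespaces().items():
        print(f"{name:10s} {ident}")
```

Two processes that report the same identifier for "net" share a network namespace.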
MOUNT NAMESPACE
PID NAMESPACE
UTS NAMESPACE
NETWORK NAMESPACE
Somehow we brought the blight with us
Why do we need this abstraction?
Why do we need network namespaces at all?
Peering inside
OK, so what’s a little memory?
Performance
REDIS PERFORMANCE
MySQL Performance with Containers
[Bar chart: transactions/sec, axis 0 to 300,000; bars for Container-free, Host Mode, Bridged, Overlay]
ENTER: CHECMATE
iptables for syscalls
First try: LD_PRELOAD
How does connect() work?
How does connect() work on LD_PRELOAD?
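LD_PRELOAD interposition puts a wrapper in front of libc's connect() that can inspect or rewrite the destination before calling the real function. The same idea can be sketched in Python as an analogy, not as the Checmate implementation: a socket subclass whose connect() consults a rewrite table first. The names RewritingSocket and REWRITES are hypothetical, and 192.0.2.10 is a documentation address standing in for a virtual IP.

```python
import socket

# Hypothetical rewrite table: (virtual_ip, port) -> (real_ip, port).
REWRITES = {}

class RewritingSocket(socket.socket):
    """Socket whose connect() rewrites the destination, like a preloaded shim."""

    def connect(self, address):
        # Look up the requested destination; pass it through unchanged if absent.
        real = REWRITES.get(address, address)
        return super().connect(real)

def demo():
    # A throwaway listener on loopback plays the role of the real backend.
    backend = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    backend.bind(("127.0.0.1", 0))
    backend.listen(1)
    REWRITES[("192.0.2.10", 80)] = backend.getsockname()

    client = RewritingSocket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # The application asks for the virtual address...
        client.connect(("192.0.2.10", 80))
        # ...and ends up connected to the real backend.
        return client.getpeername(), backend.getsockname()
    finally:
        client.close()
        backend.close()
        REWRITES.clear()

if __name__ == "__main__":
    peer, real = demo()
    print("connected to", peer, "real backend", real)
```

The analogy also shares the weakness of the original: code that uses plain socket.socket never passes through the wrapper, much as a statically linked binary never passes through a preloaded library.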
…But no
Static Linking
What else is there?
This seems familiar
Something new
Something new(-ish)
EBPF: EXTENDED BERKELEY PACKET FILTER
• Stems from BPF (does “port 80 and protocol tcp” look familiar?)
• JIT’d to x86-64 code
  • Or custom hardware (see Mellanox, Netronome)
• Safe
  • No jumping backwards
  • No unsafe access
• Programmed in C
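The "no jumping backwards" rule is what makes termination easy to prove: if every jump goes forward, the program cannot loop. A toy checker in Python, assuming a made-up instruction format (Insn, JUMP_OPS, and verify are illustrative names and this is not the kernel verifier), shows the shape of that check:

```python
from typing import List, NamedTuple

class Insn(NamedTuple):
    op: str          # illustrative opcode name: "ld", "jeq", "ja", "ret"
    offset: int = 0  # for jumps: distance relative to the next instruction

JUMP_OPS = {"ja", "jeq", "jgt"}

def verify(prog: List[Insn]) -> bool:
    """Accept only programs whose jumps all go forward and stay in bounds."""
    if not prog or prog[-1].op != "ret":
        return False  # execution must not fall off the end
    for pc, insn in enumerate(prog):
        if insn.op in JUMP_OPS:
            target = pc + 1 + insn.offset
            if insn.offset < 0 or target >= len(prog):
                return False  # backward jump (possible loop) or out of range
    return True

if __name__ == "__main__":
    forward_only = [Insn("ld"), Insn("jeq", 1), Insn("ret"), Insn("ret")]
    with_loop = [Insn("ld"), Insn("ja", -2), Insn("ret")]
    print(verify(forward_only), verify(with_loop))
```

A real verifier also tracks register and memory state to enforce the "no unsafe access" rule; this sketch covers only control flow.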
Advanced Use Cases
How do I prevent my containers from exhausting ephemeral ports?
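On Linux the ephemeral range is published in /proc/sys/net/ipv4/ip_local_port_range, and every container that shares the host's network stack draws from it. A sketch in Python, assuming that file's two-integer format; the PortBudget class and its quota policy are hypothetical, meant only to show the kind of per-container decision a connect()-time hook could make:

```python
from collections import Counter

RANGE_FILE = "/proc/sys/net/ipv4/ip_local_port_range"

def parse_port_range(text):
    """Parse 'low high' (whitespace separated) into a (low, high) tuple."""
    low, high = (int(field) for field in text.split())
    return low, high

def read_port_range(path=RANGE_FILE):
    """Return the host's ephemeral range, or None where the file is absent."""
    try:
        with open(path) as f:
            return parse_port_range(f.read())
    except (OSError, ValueError):
        return None

class PortBudget:
    """Hypothetical policy: cap outbound connections per container."""

    def __init__(self, quota):
        self.quota = quota
        self.in_use = Counter()

    def allow_connect(self, container_id):
        if self.in_use[container_id] >= self.quota:
            return False  # over budget: the hook would refuse this connect()
        self.in_use[container_id] += 1
        return True

    def release(self, container_id):
        # Called when a connection closes, freeing one slot of the budget.
        if self.in_use[container_id] > 0:
            self.in_use[container_id] -= 1
```

With a budget per container, one misbehaving container runs into its own limit instead of draining the range for everyone else on the host.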
IPTables?
Load Balancing
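If destinations can be rewritten when connect() is called, load balancing becomes a per-connection choice of backend rather than per-packet address translation. A minimal round-robin sketch in Python; the ConnectTimeBalancer name and the addresses are placeholders, not taken from the talk:

```python
import itertools

class ConnectTimeBalancer:
    """Map a virtual (ip, port) to one of several backends, round robin."""

    def __init__(self):
        self._pools = {}

    def add_service(self, vip, backends):
        if not backends:
            raise ValueError("a service needs at least one backend")
        # cycle() yields the backends in order, forever.
        self._pools[vip] = itertools.cycle(list(backends))

    def resolve(self, address):
        """Return the address to really connect to."""
        pool = self._pools.get(address)
        return next(pool) if pool is not None else address

if __name__ == "__main__":
    lb = ConnectTimeBalancer()
    lb.add_service(("192.0.2.10", 80), [("10.0.0.1", 8080), ("10.0.0.2", 8080)])
    for _ in range(3):
        print(lb.resolve(("192.0.2.10", 80)))
```

Addresses without a registered service pass through unchanged, so ordinary connections are unaffected.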
What about performance?
Redis Operations / Second
[Bar chart: ops/sec, axis 0 to 5,000; Bridged vs. Checmate; more is better]
Redis Latency
[Bar chart: milliseconds, axis 0 to 2.2; Bridged vs. Checmate; less is better]
What about debugging?
STANDARD BSD API
• getpeername() works
• Makes “ALGs” less relevant
• Makes connectionless protocols more sane
  • recvfrom()
  • sendto()
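The point of these bullets is that the standard calls keep reporting addresses the application can trust. A loopback sketch in Python with plain sockets (no Checmate involved; the function name udp_round_trip is ours) shows sendto() and recvfrom() carrying the peer address on every datagram:

```python
import socket

def udp_round_trip(payload=b"ping"):
    """Send one datagram over loopback; return (data, sender, client_addr)."""
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        server.bind(("127.0.0.1", 0))   # port 0: let the kernel pick one
        client.bind(("127.0.0.1", 0))
        server.settimeout(2.0)
        # sendto() names the destination explicitly on every call.
        client.sendto(payload, server.getsockname())
        # recvfrom() returns the payload together with the sender's address.
        data, sender = server.recvfrom(4096)
        return data, sender, client.getsockname()
    finally:
        server.close()
        client.close()

if __name__ == "__main__":
    data, sender, client_addr = udp_round_trip()
    print(data, "from", sender, "client bound to", client_addr)
```

When the address seen by recvfrom() is the sender's real address, a reply can go straight back with sendto() and nothing in the middle has to track the flow.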
How long until this ends up in your living room datacentre?
Kernel Patches in Development
Interest in Developing a Higher-Level Language
Functional Programming?
More Natural
Need help with use cases, and testers
Development of a Control Plane
Kernel Upgrades
What did we learn?
We’re probably doing it wrong (today)
The future looks bright
With programmable filtering, what would you do?