Lying, Cheating, and Winning with Containers in Networking

© 2016 Mesosphere, Inc. All Rights Reserved.

LYING, CHEATING, AND WINNING WITH CONTAINERS IN NETWORKING

Sargun Dhillon, 2016

WHO AM I?

WHO DO I WORK FOR?

DC/OS

NETWORKING

A HISTORY

IT, IN THE YEAR 2000

• Applications mostly client / server

• Mostly desktops, controlled by central IT

• Local storage, on-site

• Low bandwidth, high latency

The Transformation

OLD WORLD

NEW WORLD

BYOD: BRING YOUR OWN DEVICE

• Real gains by allowing employees to use their own device

  • Productivity

  • Morale

  • Cost

• End node security problems

OLD WORLD

RISE OF SOFTWARE AS A SERVICE

• 3rd Party provided apps:

  • Hosted applications

  • Hosted email

  • Hosted file storage

• Elastic services for an elastic work force

• Amortization of cost over time

NEED

• More applications need to be hosted

• Better utilization of computer hardware

• Greater need for faster reactivity

• Greater need for “reliability”

• Greater need for elasticity

• Greater need for scalability

ENABLING TECHNOLOGIES

• Hardware-assisted virtualization

  • Guest oblivious

• Paravirtualization

  • Guest assisted

• Proprietary Storage Hardware

• Proprietary Networking Hardware

The Private “Cloud”

Networking’s Answer?

“Software Defined Networking”

OpenFlow

Never Panned Out

Virtualization Kept Becoming More Common

And then something happened…

EXPLOSION OF THE SOFTWARE DEFINED NETWORKS

• Offerings

  • Cisco Nexus

  • Juniper Contrail

  • PLUMgrid

  • Nuage

  • Calico

• Also replicated state

And after a couple years…

(Or more generically, “containers”)

NETWORKING SOLUTIONS COMPARED

Containers:

• Calico

• PLUMgrid

• Cisco Contiv

• Contrail

• Weave

OpenStack:

• Calico

• PLUMgrid

• Cisco Contiv

• Contrail

• Cisco Nexus 1000V

Containers are all about abstraction

Containers are UX

WHERE THE RUBBER MEETS THE ROAD

• Namespaces for abstraction

  • Mount

  • PID

  • User

  • UTS

  • Network

• CGroups for resource isolation

  • Memory

  • Blkio

  • CPU
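To make the namespace list above concrete, here is a minimal sketch that prints which namespaces the current process belongs to. It assumes a Linux host with procfs mounted; the helper name `list_namespaces` is made up for illustration and is not from the talk.

```python
import os

def list_namespaces(pid="self"):
    """Return {namespace_name: identifier} for a process, read from /proc.

    On Linux each entry in /proc/<pid>/ns is a symlink such as
    'net:[4026531992]'; two processes share a namespace exactly when
    their identifiers match. Without procfs this returns an empty dict.
    """
    ns_dir = f"/proc/{pid}/ns"
    namespaces = {}
    try:
        entries = sorted(os.listdir(ns_dir))
    except OSError:
        return namespaces  # no procfs (non-Linux), no such pid, or no permission
    for name in entries:
        try:
            namespaces[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            continue  # entry not readable; skip it
    return namespaces

if __name__ == "__main__":
    for name, ident in list_namespaces().items():
        print(f"{name:10s} {ident}")
```

Running it on the host and again inside a container should show different identifiers for `mnt`, `pid`, `uts`, and `net` when the container runtime gives the container its own namespaces.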

MOUNT NAMESPACE

PID NAMESPACE

UTS NAMESPACE

NETWORK NAMESPACE

Somehow we brought the blight with us

Why do we need this abstraction?

Why do we need network namespaces at all?

Peering inside

Ok, so well, what’s some memory?

Performance

REDIS PERFORMANCE

MySQL Performance with Containers

[Chart: transactions per second (axis 0 to 300,000) for Container-free, Host Mode, Bridged, and Overlay networking]

ENTER: CHECMATE

IPTables for sys calls

First try: LD_PRELOAD

How does connect() work?

How does connect() work on LD_PRELOAD?
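The interposition idea can be sketched in user space. A real LD_PRELOAD shim is a C shared library that overrides libc's `connect` symbol; the Python analogy below overrides `socket.connect` instead, so it illustrates the concept rather than the actual mechanism. `REWRITE_RULES`, `RewritingSocket`, and the virtual address are hypothetical names chosen for this example.

```python
import socket

# Hypothetical rule table: virtual (ip, port) -> real backend (ip, port).
REWRITE_RULES = {}

class RewritingSocket(socket.socket):
    """Socket whose connect() consults REWRITE_RULES before the real call."""

    def connect(self, address):
        # Look up the requested destination; pass it through if no rule matches.
        target = REWRITE_RULES.get(address, address)
        return super().connect(target)

def demo():
    # A throwaway listener on loopback stands in for the real backend.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen(1)
    backend = server.getsockname()

    virtual = ("198.51.100.10", 80)  # documentation-range address used as a VIP
    REWRITE_RULES[virtual] = backend

    client = RewritingSocket(socket.AF_INET, socket.SOCK_STREAM)
    client.connect(virtual)      # the application asks for the virtual address
    peer = client.getpeername()  # the kernel reports the real backend
    conn, _ = server.accept()
    conn.close()
    client.close()
    server.close()
    return backend, peer

if __name__ == "__main__":
    print(demo())
```

Because the destination is rewritten before the kernel ever sees it, no NAT state is needed and `getpeername()` returns the address that was actually dialed. The wrapper only works for code that goes through it, which is the same limitation a preloaded library has.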

…But no

Static Linking

What else is there?

This seems familiar

Something new

Something new(-ish)

EBPF: EXTENDED BERKELEY PACKET FILTER

• Stems from BPF (“port 80 and protocol tcp” look familiar?)

• JIT’d to x86-64 code

  • Or custom hardware (see Mellanox, Netronome)

• Safe

  • No jumping backwards

  • No unsafe access

• Programmed in C
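The “no jumping backwards” rule is what guarantees that a program terminates: if every jump goes forward, execution can take at most as many steps as there are instructions. The toy check below illustrates that one property. It is not the kernel verifier, and the `(opcode, offset)` instruction format is an assumption made up for this sketch.

```python
from typing import List, Tuple

# Hypothetical toy instruction: (opcode, jump_offset). Offsets are relative
# to the next instruction; only "jmp" instructions use them.
Instruction = Tuple[str, int]

def check_program(program: List[Instruction]) -> bool:
    """Accept only programs whose jumps go forward and stay in bounds."""
    for index, (opcode, offset) in enumerate(program):
        if opcode != "jmp":
            continue
        if offset < 0:
            return False  # backward jump: could form a loop
        target = index + 1 + offset
        if target >= len(program):
            return False  # jump lands outside the program
    # The last instruction must be an exit so execution cannot run off the end.
    return bool(program) and program[-1][0] == "exit"

if __name__ == "__main__":
    straight = [("load", 0), ("jmp", 1), ("load", 0), ("exit", 0)]
    looping = [("load", 0), ("jmp", -2), ("exit", 0)]
    print(check_program(straight), check_program(looping))
```

A real verifier also tracks register and memory state to enforce the “no unsafe access” rule, which this sketch does not attempt.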

Advanced Use Cases

How do I prevent my containers from exhausting ephemeral ports?

IPTables?

Load Balancing

What about performance?

Redis Operations / Second

[Chart: operations per second (axis 0 to 5,000), Bridged vs. Checmate; more is better]

Redis Latency

[Chart: latency in milliseconds (axis 0 to 2.2), Bridged vs. Checmate; less is better]

What about debugging?

STANDARD BSD API

• getpeername() works

• Makes “ALGs” less relevant

• Makes connection-less protocols more sane

  • recvfrom()

  • sendto()
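One way to read the connection-less point: if addresses are translated at the `sendto()` and `recvfrom()` calls themselves, no per-flow tracking is needed to make UDP replies line up. The sketch below is an interpretation of that idea in user space, not the talk's implementation. `RewritingUDPSocket` and its rule table are hypothetical, and the override handles only the two-argument form of `sendto()`.

```python
import socket

class RewritingUDPSocket(socket.socket):
    """UDP socket that translates addresses at the sendto()/recvfrom() calls."""

    def __init__(self, rules):
        super().__init__(socket.AF_INET, socket.SOCK_DGRAM)
        self._forward = dict(rules)  # virtual -> real
        self._reverse = {real: virt for virt, real in self._forward.items()}

    def sendto(self, data, address):
        # Rewrite the destination on the way out.
        return super().sendto(data, self._forward.get(address, address))

    def recvfrom(self, bufsize):
        # Rewrite the source on the way in, so the caller sees the virtual address.
        data, source = super().recvfrom(bufsize)
        return data, self._reverse.get(source, source)

def demo_udp():
    # A loopback UDP socket stands in for the real backend.
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))
    server.settimeout(2)
    backend = server.getsockname()

    virtual = ("198.51.100.10", 53)  # documentation-range address used as a VIP
    client = RewritingUDPSocket({virtual: backend})
    client.settimeout(2)

    client.sendto(b"ping", virtual)
    _, client_addr = server.recvfrom(64)
    server.sendto(b"pong", client_addr)
    data, source = client.recvfrom(64)

    client.close()
    server.close()
    return data, source, virtual

if __name__ == "__main__":
    print(demo_udp())
```

The application sends to and hears back from the same virtual address, while the backend sees the client's own source address.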

How long until this ends up in your living room datacentre?

Kernel Patches in Development

Interest in Developing a Higher-Level Language

Functional Programming?

More Natural

Need help with use cases, and testers

Development of Control Plane

Kernel Upgrades

What did we learn?

We’re probably doing it wrong (today)

The future looks bright

With programmable filtering, what would you do?