© 2008 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.

Achieving 10 Gb/s Using Xen Para-virtualized Network Drivers

Kaushik Kumar Ram*, J. Renato Santos+, Yoshio Turner+, Alan L. Cox*, Scott Rixner*

+HP Labs  *Rice University

XS Oracle 2009 Networking 10gig

DESCRIPTION

Jose Renato Santos: Achieving 10 Gb/s Using Xen Para-virtualized Network Drivers

Page 1: XS Oracle 2009 Networking 10gig

© 2008 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.

Achieving 10 Gb/s Using Xen Para-virtualized Network Drivers

Kaushik Kumar Ram*, J. Renato Santos+, Yoshio Turner+, Alan L. Cox*, Scott Rixner*

+HP Labs  *Rice University

Page 2: XS Oracle 2009 Networking 10gig

Xen Summit, Feb 2009 (02/25/2009)

Xen PV Driver on 10 Gig Networks

• Focus of this talk: RX

[Figure: Throughput on a single TCP connection (netperf). Rate (Gb/s, axis 0 to 10) for RX, TX, and TX sendfile; series: Xen and Linux.]

Page 3: XS Oracle 2009 Networking 10gig

Network Packet Reception in Xen

[Diagram: receive path through the Driver Domain (Physical Driver, Bridge, Backend Driver), Xen, and the Guest Domain (Frontend Driver), with the NIC in hardware and an I/O channel between backend and frontend. Labeled steps 1 to 7: post grant (gr) on I/O channel (1); incoming packet, DMA, IRQ; demux at the bridge; grant copy; event; push into the network stack (7).]

Mechanisms to reduce driver domain cost:

• Use of multi-queue NIC
  − Avoid data copy
  − Packet demultiplex in hardware

• Grant reuse mechanism
  − Reduce cost of grant operations

Page 4: XS Oracle 2009 Networking 10gig

Using Multi-Queue NICs

[Diagram: receive path with a multi-queue (MQ) NIC between the Driver Domain (Physical Driver, Backend Driver), Xen, and the Guest Domain (Frontend Driver), connected by I/O channels. The NIC has one RX queue per guest and demultiplexes on the guest MAC address. Labeled steps: (1) post grant (gr) on I/O channel; (2) map buffer; (3) post buffer on device queue; (4) demux of incoming packet; (5) DMA; (6) IRQ; (7) unmap buffer; (8) event; (9) push into the network stack.]

• Advantage of multi-queue
  • Avoid data copy
  • Avoid software bridge
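The hardware demultiplexing step on this slide can be illustrated with a small toy model. This is only a sketch under stated assumptions: the class `MultiQueueNic`, its method names, the MAC addresses, and the buffer labels are all hypothetical and do not correspond to any real NIC driver or Xen interface. It shows the one idea from the slide: the device picks the RX queue from the destination MAC address, so the payload lands directly in a buffer posted by the owning guest, with no software bridge and no extra copy.

```python
from collections import defaultdict, deque


class MultiQueueNic:
    """Toy model: one RX queue per guest, selected by destination MAC."""

    def __init__(self):
        self.queue_for_mac = {}           # guest MAC address -> queue id
        self.posted = defaultdict(deque)  # queue id -> guest buffers posted for DMA
        self.delivered = []               # (queue id, buffer, payload) after "DMA"

    def assign_queue(self, mac, queue_id):
        self.queue_for_mac[mac] = queue_id

    def post_buffer(self, queue_id, buffer_id):
        # Corresponds to "post buf on dev queue" on the slide.
        self.posted[queue_id].append(buffer_id)

    def receive(self, dst_mac, payload):
        """Demultiplex on destination MAC and fill that guest's next buffer."""
        queue_id = self.queue_for_mac.get(dst_mac)
        if queue_id is None or not self.posted[queue_id]:
            return None  # unknown MAC or no buffer posted: dropped in this model
        buffer_id = self.posted[queue_id].popleft()
        self.delivered.append((queue_id, buffer_id, payload))
        return queue_id, buffer_id


nic = MultiQueueNic()
nic.assign_queue("00:16:3e:00:00:01", 1)
nic.assign_queue("00:16:3e:00:00:02", 2)
nic.post_buffer(1, "guest1-page-A")
nic.post_buffer(2, "guest2-page-A")
result = nic.receive("00:16:3e:00:00:02", b"hello")
unknown = nic.receive("00:16:3e:00:00:99", b"stray")
```

In this model the packet for the second guest is placed in `guest2-page-A` without the driver domain ever inspecting the packet header.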

Page 5: XS Oracle 2009 Networking 10gig

Performance Impact of Multi-queue

• Savings due to multiqueue
  • grant copy
  • bridge

• Most of remaining cost
  • grant hypercalls (grant + xen functions)

[Figure: Driver Domain CPU Cost. Cycles/packet (axis 0 to 10000) for Current Xen and Multi-queue. Legend: Xen, Xen grant, Linux other, User copy, Linux grant, mm, mem*, bridge, network, netback, driver.]
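To relate a cycles/packet figure to a data rate, a rough upper bound can be computed as in the sketch below. The function name `max_rate_gbps`, the 3 GHz clock, and the 6000 cycles/packet input are assumptions chosen for illustration; they are not values read from the chart, and the calculation assumes a single fully used core and full-size 1500-byte packets.

```python
def max_rate_gbps(cpu_hz, cycles_per_packet, packet_bytes=1500):
    """Upper bound on throughput if one core spends all its cycles on packets."""
    packets_per_second = cpu_hz / cycles_per_packet
    return packets_per_second * packet_bytes * 8 / 1e9


# Illustrative inputs only: a 3 GHz core (assumed) at 6000 cycles/packet.
rate = max_rate_gbps(3.0e9, 6000)
```

Under these assumptions the bound is 6 Gb/s, which shows why per-packet cycle counts of this order matter on a 10 Gb/s link.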

Page 6: XS Oracle 2009 Networking 10gig

Using Grants with Multi-queue NIC

[Diagram: Driver Domain (Backend Driver, Physical Driver), Guest Domain (Frontend Driver), Xen, and NIC. Labeled steps on a granted page (gr): (1) map grant hypercall; (2) use page for I/O; (3) unmap grant hypercall.]

• Multi-queue replaces one grant hypercall (copy) with two hypercalls (map/unmap)

• Grant hypercalls are expensive
  • Map/unmap calls for every I/O operation

Page 7: XS Oracle 2009 Networking 10gig

Reducing Grant Cost

• Grant Reuse
  − Do not revoke grant after I/O is completed
  − Keep buffer page on a pool of unused I/O pages
  − Reuse already granted pages available on buffer pool for future I/O operations
    − Avoids map/unmap on every I/O
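The effect of grant reuse on the number of map/unmap operations can be sketched with a simple counting model. Everything here is hypothetical (the names `GrantCounter`, `receive_without_reuse`, `receive_with_reuse`, and the ring size of 256); it only counts operations and does not model real grant tables or hypercalls.

```python
class GrantCounter:
    """Counts the expensive operations so the two policies can be compared."""

    def __init__(self):
        self.maps = 0
        self.unmaps = 0


def receive_without_reuse(num_packets, counter):
    # Every I/O maps the grant before use and unmaps it afterwards.
    for _ in range(num_packets):
        counter.maps += 1
        counter.unmaps += 1


def receive_with_reuse(num_packets, ring_size, counter):
    """Keep granted pages in a pool; map only when the pool is empty."""
    pool = []        # already granted, currently unused buffer pages
    next_page = 0
    remaining = num_packets
    while remaining > 0:
        batch = min(ring_size, remaining)
        in_use = []
        for _ in range(batch):
            if pool:
                page = pool.pop()      # reuse buffer and grant from the pool
            else:
                page = next_page       # first use: the page must be granted
                next_page += 1
                counter.maps += 1
            in_use.append(page)
        pool.extend(in_use)            # return buffers to pool, keep grants
        remaining -= batch
    return len(pool)


baseline = GrantCounter()
receive_without_reuse(10_000, baseline)

reuse = GrantCounter()
pooled_pages = receive_with_reuse(10_000, ring_size=256, counter=reuse)
```

In this model the baseline performs 10,000 maps and 10,000 unmaps, while the pooled variant performs 256 maps and no unmaps for the same number of packets; the cost of the pool is that 256 pages stay granted until the guest reclaims them.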

Page 8: XS Oracle 2009 Networking 10gig

Revoking a Grant when the Page is Mapped in Driver Domain

• Guest may need to reclaim I/O page for other use (e.g. memory pressure on guest)

• Need to unmap page at driver domain before using it in guest kernel
  • To preserve memory isolation (e.g. protect from driver bugs)

• Need handshake between frontend and backend to revoke grant
  • This may be slow, especially if the driver domain is not running

Page 9: XS Oracle 2009 Networking 10gig

Approach to Avoid Handshake when Revoking Grants

• Observation: No need to map guest page into driver domain with multi-queue NIC
  • Software does not need to look at packet header, since demux is performed in the device
  • Just need page address for DMA operation

• Approach: Replace grant map hypercall with a shared memory interface to the hypervisor
  • Shared memory table provides translation of guest grant to page address
  • No need to unmap page when guest needs to revoke grant (no handshake)

Page 10: XS Oracle 2009 Networking 10gig

Software I/O Translation Table

[Diagram: Driver Domain (Backend Driver, Physical Driver), Guest Domain (Frontend Driver), Xen holding the SIOTT, and NIC. Each SIOTT entry has a “use” field and a “pg” field. Labeled steps: (1) create a grant (gr) for buffer page; (2) set hypercall; (3) validate, pin and update SIOTT; (4) send grant over I/O channel; (5) set use; (6) get page; (7) use page for I/O (DMA); (8) reset use; (9) clear hypercall; (10) check use and revoke.]

• SIOTT: software I/O translation table
  − Indexed by grant reference
  − “pg” field: guest page address & permission
  − “use” field indicates if grant is in use by driver domain

• set/clear hypercalls
  − Invoked by guest
  − Set validates grant, pins page, and writes page address to SIOTT
  − Clear requires that “use” = 0
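The table semantics listed above can be sketched as a small in-memory model. This is an illustration only: the class `Siott` and the method names `set_hypercall`, `clear_hypercall`, `get_page`, and `reset_use` are hypothetical, validation and pinning are reduced to a duplicate check, and the atomic shared-memory accesses a real implementation would need between the driver domain and the hypervisor are not modeled.

```python
class Siott:
    """Toy software I/O translation table, indexed by grant reference."""

    def __init__(self):
        # grant reference -> {"pg": (page address, writable), "use": 0 or 1}
        self.entries = {}

    # Hypercalls invoked by the guest.
    def set_hypercall(self, grant_ref, page_addr, writable=True):
        # Stand-in for "validate grant, pin page, write page address".
        if grant_ref in self.entries:
            raise ValueError("grant reference already set")
        self.entries[grant_ref] = {"pg": (page_addr, writable), "use": 0}

    def clear_hypercall(self, grant_ref):
        entry = self.entries[grant_ref]
        if entry["use"] != 0:
            return False  # still in use by the driver domain: revoke refused
        del self.entries[grant_ref]
        return True

    # Shared-memory accesses by the driver domain (no hypercall, no mapping).
    def get_page(self, grant_ref):
        entry = self.entries[grant_ref]
        entry["use"] = 1            # "set use" before reading the address
        return entry["pg"][0]       # page address handed to the device for DMA

    def reset_use(self, grant_ref):
        self.entries[grant_ref]["use"] = 0


siott = Siott()
siott.set_hypercall(grant_ref=7, page_addr=0x1000)
addr = siott.get_page(7)            # driver domain starts an I/O
blocked = siott.clear_hypercall(7)  # guest tries to revoke during the I/O
siott.reset_use(7)                  # I/O completed
revoked = siott.clear_hypercall(7)  # revoke now succeeds without a handshake
```

The point of the design, as the slide describes it, is that revocation only has to check the “use” field: the guest never waits for the driver domain to unmap anything, because the page was never mapped there.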

Page 11: XS Oracle 2009 Networking 10gig

Grant Reuse: Avoid pin/unpin hypercall on every I/O

[Diagram: Driver Domain (Backend Driver, Physical Driver), Guest Domain (Frontend Driver with an I/O Buffer Pool), Xen holding the SIOTT (“use” and “pg” fields), and NIC. Labeled steps: (1) create grant (gr); (2) set hypercall; (3) validate, pin and update SIOTT; (4) return buffer to pool & keep grant; (5) reuse buffer & grant from pool; (6) return buffer to pool & keep grant; (7) kernel memory pressure; (8) clear hypercall; (9) clear SIOTT; (10) revoke grant; (11) return page to kernel. Between pool operations the page is used for I/O.]

Page 12: XS Oracle 2009 Networking 10gig

Performance Impact of Grant Reuse w/ Software I/O Translation Table

[Figure: Driver Domain CPU Cost. Cycles/packet (axis 0 to 10000) for Current Xen, Multi-queue, and SIOTT w/ grant reuse. Legend: Xen, Xen grant, Linux other, User copy, Linux grant, mm, mem*, bridge, network, netback, driver. Annotation: cost saving: grant hypercall.]

Page 13: XS Oracle 2009 Networking 10gig

Impact of optimizations on throughput

[Figure, two panels. Data rate: rate (Gb/s, axis 0 to 10) for current Xen, multi-queue w/ grant reuse, and Linux. CPU utilization: CPU utilization (%, axis 0 to 120) for the same three configurations, split into driver domain, guest, and linux.]

• Multi-queue w/ grant reuse significantly reduces driver domain cost

• Bottleneck shifts from driver domain to guest

• Higher cost in guest than in Linux still limits throughput in Xen

Page 14: XS Oracle 2009 Networking 10gig

Additional optimizations at guest frontend driver

• LRO (Large Receive Offload) support at frontend
  − Consecutive packets on same connection combined into one large packet
  − Reduces cost of processing packet in network stack

• Software prefetch
  − Prefetch next packet and socket buffer struct into CPU cache while processing current packet
  − Reduces cache misses at frontend

• Avoid full page buffers
  − Use half-page (2 KB) buffers (max pkt size is 1500 bytes)
  − Reduces TLB working set and thus TLB misses
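The LRO idea from the first bullet, combining consecutive packets of the same connection into one large packet, can be sketched as follows. This is a simplified illustration, not the frontend's actual implementation: `Segment`, its fields, and `coalesce` are hypothetical names, and the checks a real LRO path performs (TCP flags, options, checksums, a maximum aggregate size) are omitted.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    flow: tuple      # hypothetical flow key: (src, dst, src port, dst port)
    seq: int         # sequence number of the first payload byte
    payload: bytes


def coalesce(segments):
    """Merge consecutive, in-order segments of the same flow."""
    merged = []
    for seg in segments:
        last = merged[-1] if merged else None
        contiguous = (
            last is not None
            and last.flow == seg.flow
            and last.seq + len(last.payload) == seg.seq
        )
        if contiguous:
            last.payload += seg.payload   # grow the aggregate packet
        else:
            merged.append(Segment(seg.flow, seg.seq, seg.payload))
    return merged


flow_a = ("10.0.0.1", "10.0.0.2", 5000, 80)
flow_b = ("10.0.0.3", "10.0.0.2", 6000, 80)
incoming = [
    Segment(flow_a, 0, b"a" * 1448),
    Segment(flow_a, 1448, b"b" * 1448),
    Segment(flow_b, 0, b"c" * 100),
    Segment(flow_a, 2896, b"d" * 1448),
]
merged = coalesce(incoming)
```

Here four received segments reach the network stack as three packets, because only the first two are consecutive on the same flow; on a bulk transfer with long in-order runs the reduction in per-packet stack processing is correspondingly larger.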

Page 15: XS Oracle 2009 Networking 10gig

Performance impact of guest frontend optimizations

• Optimizations bring CPU cost in guest close to native Linux

• Remaining cost difference
  − Higher cost in netfront than in physical driver
  − Xen functions to send and deliver events

[Figure: Guest Domain CPU Cost. Cycles/packet (axis 0 to 5000) for multiqueue w/ grant reuse, frontend w/ LRO, frontend w/ prefetch, reduced buffer size (2 KB), and Linux. Legend: Xen, Xen grant, Linux other, User copy, Linux-grant, mm, mem, network, driver.]

Page 16: XS Oracle 2009 Networking 10gig

Impact of all optimizations on throughput

• Multiqueue with software optimizations achieves the same throughput as direct I/O (~8 Gb/s)

• 2 or more guests are able to saturate 10 gigabit link

[Figure: rate (Gb/s, axis 0 to 10) for current PV driver, optimized PV driver (1 guest), optimized PV driver (2 guests), Direct I/O (1 guest), and Linux.]

Page 17: XS Oracle 2009 Networking 10gig

Conclusion

• Use of multi-queue support in modern NICs enables high performance networking with Xen PV drivers
  − Attractive alternative to Direct I/O
    • Same throughput, although with some additional CPU cycles at driver domain
    • Avoids hardware dependence in the guests
  − Light driver domain enables scalability for multiple guests
    • Driver domain can now handle 10 Gb/s data rates
    • Multiple guests can leverage multiple CPU cores and saturate 10 gigabit link

Page 18: XS Oracle 2009 Networking 10gig

Status

• Status
  − Performance results obtained on a modified netfront/netback implementation using the original Netchannel1 protocol
  − Currently porting mechanisms to Netchannel2
    • Basic multi-queue already available on public netchannel2 tree
    • Additional software optimizations still in discussion with community and should be included in netchannel2 sometime soon.

• Thanks to
  − Mitch Williams and John Ronciak from Intel for providing samples of Intel NICs and for adding multi-queue support on their driver
  − Ian Pratt, Steven Smith and Keir Fraser for helpful discussions
