
    Programming Techniques

    S. L. Graham, R. L. Rivest, Editors

    Implementing Quicksort Programs

    Robert Sedgewick

    Brown University

    This paper is a practical study of how to implement the Quicksort sorting algorithm and its best variants on real computers, including how to apply various code optimization techniques. A detailed implementation combining the most effective improvements to Quicksort is given, along with a discussion of how to implement it in assembly language. Analytic results describing the performance of the programs are summarized. A variety of special situations are considered from a practical standpoint to illustrate Quicksort's wide applicability as an internal sorting method which requires negligible extra storage.

    Key Words and Phrases: Quicksort, analysis of algorithms, code optimization, sorting

    CR Categories: 4.0, 4.6, 5.25, 5.31, 5.5

    Introduction

    One of the most widely studied practical problems in computer science is sorting: the use of a computer to put files in order. A person wishing to use a computer to sort is faced with the problem of determining which of the many available algorithms is best suited for his purpose. This task is becoming less difficult than it once was for three reasons. First, sorting is an area in which the mathematical analysis of algorithms has been particularly successful: we can predict the performance of many sorting methods and compare them intelligently. Second, we have a great deal of experience using sorting algorithms, and we can learn from that experience to separate good algorithms from bad ones. Third, if the file fits into the memory of the computer, there is one algorithm, called Quicksort, which has been shown to perform well in a variety of situations. Not only is this algorithm simpler than many other sorting algorithms, but empirical [2, 11, 13, 21] and analytic [9] studies show that Quicksort can be expected to be up to twice as fast as its nearest competitors. The method is simple enough to be learned by programmers who have no previous experience with sorting, and those who do know other sorting methods should also find it profitable to learn about Quicksort.

    Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission.

    This work was supported in part by the Fannie and John Hertz Foundation and in part by NSF Grants No. GJ-28074 and MCS75-23738.

    Author's address: Division of Applied Mathematics and Computer Science Program, Brown University, Providence, RI 02912.

    © 1978 ACM 0001-0782/78/1000-0847 $00.75

    Because of its prominence, it is appropriate to study how Quicksort might be improved. This subject has received considerable attention (see, for example, [1, 4, 11, 13, 14, 18, 20]), but few real improvements have been suggested beyond those described by C. A. R. Hoare, the inventor of Quicksort, in his original papers [5, 6]. Hoare also showed how to analyze Quicksort and predict its running time. The analysis has since been extended to the improvements that he suggested, and used to indicate how they may best be implemented [9, 15, 17]. The subject of the careful implementation of Quicksort has not been studied as widely as global improvements to the algorithm, but the savings to be realized are as significant. The history of Quicksort is quite complex, and [15] contains a full survey of the many variants which have been proposed.

    The purpose of this paper is to describe in detail how Quicksort can best be implemented to handle actual applications on real computers. A general description of the algorithm is followed by descriptions of the most effective improvements that have been proposed (as demonstrated in [15]). Next, an implementation of Quicksort in a typical high level language is presented, and assembly language implementation issues are considered. This discussion should easily translate to real languages on real machines. Finally, a number of special issues are considered which may be of importance in particular sorting applications.

    This paper is intended to be a self-contained overview of the properties of Quicksort for use by those who need to actually implement and use the algorithm. A companion paper [17] provides the analytical results which support much of the discussion presented here.

    The Algorithm

    Quicksort is a recursive method for sorting an array A[1], A[2], ..., A[N] by first "partitioning" it so that the following conditions hold:

    (i) Some key v is in its final position in the array. (If it is the jth smallest, it is in position A[j].)

    (ii) All elements to the left of A[j] are less than or equal to it. (These elements A[1], A[2], ..., A[j - 1] are called the "left subfile.")

    Communications of the ACM, October 1978, Volume 21, Number 10


    (iii) All elements to the right of A[j] are greater than or equal to it. (These elements A[j + 1], ..., A[N] are called the "right subfile.")

    After partitioning, the original problem of sorting the entire array is reduced to the problem of sorting the left and right subfiles independently. The following program is a recursive implementation of this method, with the partitioning process spelled out explicitly.

    Program 1

        procedure quicksort(integer value l, r);
            comment Sort A[l : r] where A[r + 1] ≥ A[l], ..., A[r];
            if r > l then
                i := l; j := r + 1; v := A[l];
                loop:
                    loop: i := i + 1; while A[i] < v repeat;
                    loop: j := j - 1; while A[j] > v repeat;
                until j < i:
                    A[i] :=: A[j];
                repeat;
                A[l] :=: A[j];
                quicksort(l, j - 1);
                quicksort(i, r);
            endif;

    (This program uses an exchange (or swap) operator :=:, and the control constructs loop ... repeat and if ... endif, which are like those described by D. E. Knuth in [10]. Statements between loop and repeat are iterated: when the while condition fails (or the until condition is satisfied) the loop is exited immediately. The keyword repeat may be thought of as meaning "execute the code starting at loop again," and, for example, "until j < i" may be read as "if j < i then leave the loop".)

    The partitioning process may be most easily understood by first assuming that the keys A[1], ..., A[N] are distinct. The program starts by taking the leftmost element as the partitioning element. Then the rest of the array is divided by scanning from the left to find an element > v, scanning from the right to find an element < v, exchanging them, and continuing the process until the pointers cross. The loop terminates with j + 1 = i, at which point it is known that A[l + 1], ..., A[j] are < v and A[j + 1], ..., A[r] are > v, so that the exchange A[l] :=: A[j] completes the job of partitioning A[l], ..., A[r].

    The condition that A[r + 1] must be greater than or equal to all of the keys A[l], ..., A[r] is included to stop the i pointer in the case that v is the largest of the keys. The procedure call quicksort(1, N) will therefore sort A[1], ..., A[N] if A[N + 1] is initialized to some value at least as large as the other keys. (This is normally specified by the notation A[N + 1] := ∞.)

    If equal keys are present among A[1], ..., A[N], then Program 1 still operates properly and efficiently, but not exactly as described above. If some key equal to v is already in position in the file, then the pointer scans could both stop with i = j, so that, after one more time through the loop, it terminates with j + 2 = i. But at this point it is known not only that A[l + 1], ..., A[j] are ≤ v and A[j + 2], ..., A[r] are ≥ v but also that A[j + 1] = v. After the exchange A[l] :=: A[j], we have two elements in their final place in the array (A[j] and A[j + 1]), and the subfiles are recursively sorted.
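
As a rough illustration, Program 1 can be transliterated into Python along the following lines. This is only a sketch under stated assumptions: it uses 0-based indexing instead of the paper's A[1], ..., A[N], it assumes numeric keys so that `math.inf` can stand in for the sentinel A[N + 1] = ∞, and the names `quicksort` and `sort_with_sentinel` are chosen here for illustration.

```python
import math


def quicksort(a, l, r):
    """Sort a[l..r] in place; assumes a[r + 1] >= every key in a[l..r]."""
    if r > l:
        i, j, v = l, r + 1, a[l]
        while True:
            # Scan from the left for a key >= v (stops on equal keys).
            i += 1
            while a[i] < v:
                i += 1
            # Scan from the right for a key <= v (stops on equal keys).
            j -= 1
            while a[j] > v:
                j -= 1
            if j < i:          # pointers have crossed
                break
            a[i], a[j] = a[j], a[i]
        a[l], a[j] = a[j], a[l]    # put the partitioning element in place
        quicksort(a, l, j - 1)     # a[j] == v serves as sentinel for the left subfile
        quicksort(a, i, r)


def sort_with_sentinel(keys):
    """Return a sorted copy of keys, appending an infinite sentinel first."""
    a = list(keys) + [math.inf]
    quicksort(a, 0, len(keys) - 1)
    return a[:-1]


# The first 16 digits of pi, as in the paper's figures.
print(sort_with_sentinel([3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7, 9, 3]))
```

Note that the recursive call on the left subfile needs no extra sentinel: the partitioning element, now at position j, is at least as large as every key to its left.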

    Figures 1 and 2 show the operation of Program 1 on the first 16 digits of π. In Figure 1, elements marked by arrows are those pointed to by i and j, and each line is the result of a pointer increment or an exchange. In Figure 2, each line is the result of one "partitioning stage," and boldface elements are those put into position by partitioning.

    The differences between the implementation of partitioning given in Program 1 and the many other partitioning methods which have been proposed are subtle, but they can have a significant effect on the performance of Quicksort. The issues involved are treated fully in [15]. By using this particular method, we have already begun to "optimize" Quicksort, for it has three main advantages over alternative methods.

    First, as we shall see in much more detail later, the inner loops are efficiently coded. Most of the running time of the program is spent executing the statements

        loop: i := i + 1; while A[i] < v repeat;
        loop: j := j - 1; while A[j] > v repeat;

    each of which can be implemented in machine language with a pointer increment, a compare, and a conditional branch. More naive implementations of partitioning include other tests, for the pointers crossing or exceeding the array bounds, within these loops. For example, rather than using the "sentinel" A[N + 1] = ∞ we could use

        loop: i := i + 1; while i ≤ N and A[i] < v repeat;

    for the i pointer increment, but this would be far less efficient.

    Second, when equal keys are present, there is the question of how keys equal to the partitioning element should be treated. It might seem better to scan over such keys (by using the conditions A[i] ≤ v and A[j] ≥ v in the scanning loops), but careful analysis shows that it is always better to stop the scanning pointers on keys equal to the partitioning element, as in Program 1. (This idea was suggested in 1969 by R. C. Singleton [18].) In this paper, we will adopt this strategy for all of our programs, but in the analysis we will assume that all of the keys being sorted are distinct. Justification for doing so may be found in [16], where the subject of Quicksort with equal keys is studied in considerable detail.
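
One way to see why stopping on equal keys helps is to run a single partitioning stage on a file whose keys are all equal and look at where the partitioning element lands. The sketch below is illustrative only: `first_partition_index` is a hypothetical helper, it uses 0-based indexing, assumes a non-empty list of numeric keys, and uses `math.inf` as the sentinel.

```python
import math


def first_partition_index(keys):
    """Run one partitioning stage in the style of Program 1.

    Returns the 0-based position where the first key ends up.
    Assumes keys is non-empty and numeric.
    """
    a = list(keys) + [math.inf]      # sentinel stops the i scan
    l, r = 0, len(keys) - 1
    i, j, v = l, r + 1, a[l]
    while True:
        i += 1
        while a[i] < v:              # strict test: stop on keys equal to v
            i += 1
        j -= 1
        while a[j] > v:              # strict test: stop on keys equal to v
            j -= 1
        if j < i:
            break
        a[i], a[j] = a[j], a[i]
    a[l], a[j] = a[j], a[l]
    return j


print(first_partition_index([7] * 9))   # 4, the middle of positions 0..8
```

With nine equal keys the pointers meet in the middle, so the two subfiles are balanced even though every exchange is, in a sense, unnecessary. For an even-sized file of equal keys the same code ends with i = j + 2, the case described earlier in which two elements are left in their final places.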

    Third, the partitioning method used in Program 1 does not impose a bias upon the subfiles. That is, if we start with a random arrangement of A[1], ..., A[N], then, after partitioning, the left subfile is a random arrangement of its elements and the right subfile is a random permutation of its elements. This fact is crucial to the analysis of the program, and it also seems to be a requirement for efficient operation. It is conceivable that a method could be devised which imparts a favorable bias to the subfiles, but the creation of nonrandom subfiles is usually done inadvertently. No method which


    Fig. 1. Partitioning π (Program 1).


        if A[i] > A[i + 1] then
            v := A[i]; j := i + 1;
            loop: A[j - 1] := A[j]; j := j + 1; while A[j] < v repeat;
            A[j - 1] := v;
        endif;

    (Just as there are many different implementations of Quicksort, so there are a variety of ways to implement Insertion sort. This subject is treated in detail in [9] and [15].) Now, the obvious way to improve Program 1 is to change the first if statement to

        if r - l ≤ M then insertionsort(l, r) else ...

    where M is some threshold value above which Quicksort is faster than Insertion sort.

    It is shown in [15] that there is an even better way to proceed. Suppose that small subfiles are simply ignored during partitioning, e.g. by changing the first if statement in Program 1 to "if r - l > M then ...." Then, after the entire file has been partitioned, it has all the elements which were used as partitioning elements in place, with unsorted subfiles of length M or less between them. A single Insertion sort of the entire file will quite efficiently complete the job of sorting the file.

    Analysis shows that it takes Insertion sort only slightly longer to sort the whole file than it would to sort all of the subfiles, but all of the overhead of invoking Insertion sort during partitioning is eliminated. For example, subfiles with M or fewer elements never need be put on the stack, since they are ignored during partitioning. It turns out that this eliminates 2/3 of the stack pushes used, on the average. This makes the method preferable to the scheme of sorting the small subfiles during partitioning (even in an "optimal" manner).
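
The single cleanup pass can be sketched in Python as follows. This is a hedged illustration, not the paper's code: `insertion_sort_with_sentinel` is a name invented here, indexing is 0-based, numeric keys are assumed so that `math.inf` can act as A[N + 1] = ∞, and the move counter is added only to show how little work remains when every key is already close to its final position.

```python
import math


def insertion_sort_with_sentinel(keys):
    """Right-to-left insertion sort with a sentinel after the last key.

    Returns (sorted_list, moves), where moves counts the shifts
    a[j - 1] = a[j] performed.
    """
    n = len(keys)
    a = list(keys) + [math.inf]          # sentinel ends every inner loop
    moves = 0
    for i in range(n - 2, -1, -1):       # a[i + 1 .. n - 1] is already sorted
        if a[i] > a[i + 1]:
            v = a[i]
            j = i + 1
            while True:                  # shift smaller keys one place left
                a[j - 1] = a[j]
                moves += 1
                j += 1
                if not a[j] < v:
                    break
            a[j - 1] = v
    return a[:-1], moves


# Three unsorted blocks of three keys; the blocks themselves are in order,
# as they would be after partitioning with small subfiles ignored.
result, moves = insertion_sort_with_sentinel([2, 1, 3, 5, 4, 6, 9, 8, 7])
print(result, moves)
```

Because no key has to travel outside its own small block, the number of moves stays proportional to the file size rather than to its square.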

    For most implementations, the best value of M is about 9 or 10, though the exact value is not highly critical: any value between 6 and 15 would do about as well. Figure 3 shows the total running time on the machine in [17] for N = 10,000 for various values of M. The best value is M = 9, and the total running time for this value is about 11.6667 N ln N - 1.743N time units. Figure 4 is a graph of the function 14.055N/(11.6667 N ln N + 12.312N), which shows the percentage improvement for this optimum choice M = 9 over the naive choice M = 1 (Program 1).
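
The numerator 14.055N in that function is simply the difference between the two running-time estimates, since 12.312 + 1.743 = 14.055. A small Python sketch makes the arithmetic concrete; the function names are hypothetical, and the constants are the ones quoted above for the machine in [17].

```python
import math


def time_m1(n):
    """Estimated running time for M = 1 (Program 1), in time units."""
    return 11.6667 * n * math.log(n) + 12.312 * n


def time_m9(n):
    """Estimated running time for the cutoff M = 9, in time units."""
    return 11.6667 * n * math.log(n) - 1.743 * n


def improvement(n):
    """Fraction of the M = 1 running time saved by choosing M = 9."""
    return (time_m1(n) - time_m9(n)) / time_m1(n)


for n in (1000, 10000, 100000):
    print(n, round(100 * improvement(n), 1), "percent")
```

Under these estimates the saving is a little under 12 percent at N = 10,000, and it shrinks slowly as N grows because the N ln N term dominates.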

    Worst Case

    A third main flaw of Program 1 is that there are some files which are likely to occur in practice for which it will perform badly. For example, suppose that the numbers A[1], A[2], ..., A[N] are in order already when Program 1 is invoked. Then A[1] will be the first partitioning element, and the first partition will produce an empty left subfile and a right subfile consisting of A[2], ..., A[N]. Then the same thing will happen to that subfile, and so on. The program has to deal with files of size N, N - 1, N - 2, ... and its total running time is obviously proportional to N². The same problem arises with a file in reverse order. This O(N²) worst case is inherent in

    Fig. 3. Total running time of Quicksort for N = 10,000 (horizontal axis: cutoff for small subfiles, M).

    Fig. 4. Improvement due to sorting small subfiles on a separate pass.

    Median-of-Three Modification

    The method is based on the observation that Quicksort performs best when the partitioning element turns out to be near the center of the file. Therefore choosing a good partitioning element is akin to estimating the median of the file. The statistically sound method for doing this is to choose a sample from the file, find the median, and use that value as the estimate for the median of the whole file. This idea was suggested by Hoare in his original paper, but he didn't pursue it because he found it "very difficult to estimate the saving." It turns out that most of the savings to be had from this idea come when samples of size three are used at each partitioning stage. Larger sample sizes give better estimates of the median, of course, but they do not improve the running time significantly. Primarily, sampling provides insurance that the partitioning elements don't consistently fall near the ends of the subfiles, and three elements are sufficient for this purpose. (See [15] and [17] for analytic results confirming these conclusions.) The average performance would be improved if we used any three elements for the sample, but to make the worst case unlikely we shall use the first, middle, and last elements as the sample, and the median of those three as the partitioning element. The use of these three particular elements was suggested by Singleton in 1969 [18]. Again, care must be taken not to disturb the partitioning process. The method can be implemented by inserting the statements

        A[(l + r) ÷ 2] :=: A[l + 1];
        if A[l + 1] > A[r] then A[l + 1] :=: A[r] endif;
        if A[l] > A[r] then A[l] :=: A[r] endif;
        if A[l + 1] > A[l] then A[l + 1] :=: A[l] endif;

    before partitioning (after "if r > l then" in Program 1). This change makes A[l] the median of the three elements originally at A[l], A[(l + r) ÷ 2], and A[r] before partitioning. Furthermore, it makes A[l + 1] ≤ A[l] (and the second comparison leaves A[l] ≤ A[r]), so the pointer initializations can be changed to "i := l + 1; j := r". This method preserves randomness in the subfiles.
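
The four statements translate almost directly into Python. The following is a minimal sketch, assuming 0-based indexing and a subfile of at least three elements (r - l ≥ 2); `median_of_three_setup` is a name chosen here for illustration.

```python
def median_of_three_setup(a, l, r):
    """Rearrange a[l], a[l + 1], a[r] so that a[l + 1] <= a[l] <= a[r].

    Afterwards a[l] holds the median of the keys originally at
    a[l], a[(l + r) // 2] and a[r]. Assumes r - l >= 2.
    """
    m = (l + r) // 2
    a[m], a[l + 1] = a[l + 1], a[m]          # bring the middle key next to a[l]
    if a[l + 1] > a[r]:
        a[l + 1], a[r] = a[r], a[l + 1]
    if a[l] > a[r]:
        a[l], a[r] = a[r], a[l]
    if a[l + 1] > a[l]:
        a[l + 1], a[l] = a[l], a[l + 1]


keys = [9, 1, 5, 3, 7]       # first, middle, last are 9, 5, 7
median_of_three_setup(keys, 0, len(keys) - 1)
print(keys)                  # [7, 5, 1, 3, 9]
```

After the call, a[l + 1] and a[r] are already on the correct sides of the partitioning element, which is why the scans can start one position further in, and a[r] takes over the role of the sentinel for the i pointer.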

    Median-of-three partitioning reduces the number of comparisons by about 14 percent, but it increases the number of exchanges slightly and requires the added overhead of finding the median at each stage. The total expected running time for the machine in [17] (with the optimum value M = 9) is about 10.6286 N ln N + 2.116N time units, and Figure 5 shows the percentage savings.

    Implementation

    Combining all of the improvements described above, we have Program 2, which has no recursion, which ignores small subfiles during partitioning, and which partitions according to the median-of-three modification. For clarity, the details of stack manipulation and selecting the smaller of the two subfiles are omitted. Also, since recursion is no longer involved, we will deal with an in-line program to sort A[1], ..., A[N].

    Fig. 5. Improvement due to median-of-three partitioning (horizontal axis: file size, N).

    Program 2

        integer l, r, i, j;
        integer array stack[1 : 2 × f(N)];
        boolean done;
        arbmode array A[1 : N + 1];
        arbmode v;
        l := 1; r := N; done := N ≤ M;
        loop until done:
            A[(l + r) ÷ 2] :=: A[l + 1];
            if A[l + 1] > A[r] then A[l + 1] :=: A[r] endif;
            if A[l] > A[r] then A[l] :=: A[r] endif;
            if A[l + 1] > A[l] then A[l + 1] :=: A[l] endif;
            i := l + 1; j := r; v := A[l];
            loop:
                loop: i := i + 1; while A[i] < v repeat;
                loop: j := j - 1; while A[j] > v repeat;
            until j < i:
                A[i] :=: A[j];
            repeat;
            A[l] :=: A[j];
            if max(j - l, r - i + 1) ≤ M
                then if stack empty
                    then done := true
                    else (l, r) := popstack
                endif;
                else if min(j - l, r - i + 1) ≤ M
                    then (l, r) := large subfile;
                    else pushstack(large subfile);
                         (l, r) := small subfile
                endif;
            endif;
        repeat;
        A[N + 1] := ∞;
        loop for N - 1 ≥ i ≥ 1:
            if A[i] > A[i + 1] then
                v := A[i]; j := i + 1;
                loop: A[j - 1] := A[j]; j := j + 1; while A[j] < v repeat;
                A[j - 1] := v;
            endif;
        repeat;

    In the logic for manipulating the stack after partitioning, (l, j - 1) is the "large subfile" and (i, r) is the "small subfile" if max(j - l, r - i + 1) = j - l, and vice versa if r - i + 1 > j - l. This may be implemented simply and efficiently by making one copy of the code for each of the two outcomes of comparing j - l with r - i + 1.
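
For readers who want something executable, Program 2 might be rendered in Python roughly as below. This is a sketch under several assumptions that are not in the paper: 0-based indexing, numeric keys with `math.inf` as the sentinel, a Python list used as the stack instead of a fixed array of size 2 × f(N), and a single code path that picks the large and small subfiles with a comparison rather than two copies of the code. The name `quicksort_program2` is hypothetical.

```python
import math


def quicksort_program2(keys, M=9):
    """Return a sorted copy of keys; a sketch of Program 2 (requires M >= 2)."""
    if M < 2:
        raise ValueError("M must be at least 2")
    n = len(keys)
    a = list(keys) + [math.inf]      # a[n] is the sentinel for the final pass
    stack = []                       # pending (l, r) subfiles longer than M
    l, r = 0, n - 1
    done = n <= M
    while not done:
        # Median-of-three: leaves a[l + 1] <= a[l] <= a[r].
        m = (l + r) // 2
        a[m], a[l + 1] = a[l + 1], a[m]
        if a[l + 1] > a[r]:
            a[l + 1], a[r] = a[r], a[l + 1]
        if a[l] > a[r]:
            a[l], a[r] = a[r], a[l]
        if a[l + 1] > a[l]:
            a[l + 1], a[l] = a[l], a[l + 1]
        # Partition a[l..r] around v; a[r] and a[l + 1] bound the scans.
        i, j, v = l + 1, r, a[l]
        while True:
            i += 1
            while a[i] < v:
                i += 1
            j -= 1
            while a[j] > v:
                j -= 1
            if j < i:
                break
            a[i], a[j] = a[j], a[i]
        a[l], a[j] = a[j], a[l]
        # Decide what to do with the subfiles (l, j - 1) and (i, r).
        left_size, right_size = j - l, r - i + 1
        if left_size >= right_size:
            large, small = (l, j - 1), (i, r)
        else:
            large, small = (i, r), (l, j - 1)
        if max(left_size, right_size) <= M:      # both small: ignore them
            if stack:
                l, r = stack.pop()
            else:
                done = True
        elif min(left_size, right_size) <= M:    # one small: continue with the large
            l, r = large
        else:                                    # both large: defer the larger one
            stack.append(large)
            l, r = small
    # Single insertion sort pass over the whole file.
    for i in range(n - 2, -1, -1):
        if a[i] > a[i + 1]:
            v, j = a[i], i + 1
            while True:
                a[j - 1] = a[j]
                j += 1
                if not a[j] < v:
                    break
            a[j - 1] = v
    return a[:-1]


print(quicksort_program2([3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7, 9, 3], M=4))
```

Pushing the larger subfile and continuing with the smaller one is what keeps the stack shallow; the paper's suggestion of duplicating the code for the two outcomes avoids building the `large` and `small` tuples on every stage.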

    Note that the condition A[N + 1] = ∞ is now only needed for the insertion sort. This could be eliminated, if desired, at only slight loss by changing the conditional in the inner loop of Insertionsort to "while A[j] < v and j ≤ N".

    Left unspecified in Program 2 are the values of M, the threshold for small subfiles, and f(N), the maximum stack depth. These are implementation parameters which should be specified as constants at compile time. As mentioned above, the best value of M for most implementations is 9 or 10, although any value from 6 to 15 will do nearly as well. (Of course, we must have M ≥ 2, since the partitioning method needs at least three elements to find the median of.) The maximum stack depth turns out to be always less than log₂((N + 1)/(M + 2)), so (for M = 9) a stack with f(N) = 20 will handle files of up to about ten million elements. (See the analysis in [11, 15, 17].)
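
The "about ten million" figure can be checked by inverting the bound: log₂((N + 1)/(M + 2)) ≤ 20 holds exactly when N ≤ (M + 2) · 2²⁰ - 1. The short Python sketch below does this arithmetic; the function names are invented for illustration.

```python
import math


def stack_depth_bound(n, m=9):
    """The bound log2((N + 1)/(M + 2)) on the maximum stack depth."""
    return math.log2((n + 1) / (m + 2))


def largest_file_for_depth(depth, m=9):
    """Largest N for which the bound does not exceed the given depth."""
    return (m + 2) * 2 ** depth - 1


print(largest_file_for_depth(20))          # 11534335
print(stack_depth_bound(10_000_000) < 20)  # True
```

So with M = 9 a stack of depth 20 covers files of somewhat more than eleven million elements, consistent with the rounder figure quoted in the text.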

    Figure 6 diagrams the operation of Program 2 upon the digits of π. Note that after partitioning all that is left for the insertion sort is the subfile 5 5 5 4, and the insertion sort simply scans over the other keys.

    The total average running time of a program is determined by first finding analytically the average frequency of execution of each of the instructions, then multiplying by the time per instruction and summing over all instructions. It turns out that the total expected running time of Program 2 can be determined from the six quantities:

    A s t h e n u m b e r o f p a r t i ti o n i n g s ta g es ,

    B u t h e n u m b e r o f e x c h a n g e s d u r i n g p a r ti t io n i n g ,

    CN

    t h e n u m b e r o f c o m p a r i s o n s d u r i n g p a r t i t i o n in g ,

    SN

    t h e n u m b e r o f s ta c k p u s h e s ( a n d p o p s) ,

    D N t h e n u m b e r o f i ns e r ti o n s, a n d

    E2v t h e n u m b e r o f k e y s m o v e d d u r i n g in s e r t i o n .

    Fig. 6. Quicksorting π: improved method (Program 2, M = 4).

    In Program 2, $C_N$ is the number of times i := i + 1 is executed plus
    the number of times j := j - 1 is executed within the scanning loops;
    $B_N$ is the number of times A[i] :=: A[j] is executed in the
    partitioning loop; $A_N$ is the number of times the main loop is
    iterated; $D_N$ is the number of times v is changed in the insertion
    sort; and $E_N$ is the number of times A[j - 1] := A[j] is executed in
    the insertion sort. Each instruction in an assembly language
    implementation can be labeled with its frequency in terms of these
    quantities and N. (There may be a few other quantities involved: if
    they do not relate simply to the main quantities or cancel out when the
    total running time is computed, then they generally can be analyzed in
    the same way as the other quantities [17].) The analysis in [17] yields
    exact values for these quantities, from which the total running time
    can be computed and the best value of M chosen. For M = 9 it turns out
    that

    $C_N \approx 1.714\,N \ln N - 3.74N$,
    $B_N \approx 0.343\,N \ln N - 0.84N$,
    $E_N \approx 1.14N$,
    $D_N \approx 0.60N$,
    $A_N \approx 0.16N$,  $S_N \approx 0.05N$.

    From these equations, the total running time of any particular
    implementation of Program 2 (with M = 9) can easily be estimated. For
    the model in [9, 15, 17], the total expected running time is
    $53A_N + 11B_N + 4C_N + 3D_N + 8E_N + 9S_N + 7N$,
    which leads to the equation $10.6286\,N \ln N + 2.116N$ given above.
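    The substitution described above can be carried out mechanically. The
    sketch below plugs the rounded M = 9 approximations into the weighted
    sum; the function names are illustrative. Because the coefficients are
    rounded to two or three figures, the N ln N term is reproduced closely
    (about 10.629), while the linear term comes out near 2.65N rather than
    2.116N, presumably a rounding effect, so the result should be read as
    an estimate only.

```python
import math

def quantities(n):
    """Rounded approximations for M = 9, as quoted in the text."""
    ln = math.log(n)
    return {
        "A": 0.16 * n,                       # partitioning stages
        "B": 0.343 * n * ln - 0.84 * n,      # exchanges
        "C": 1.714 * n * ln - 3.74 * n,      # comparisons
        "S": 0.05 * n,                       # stack pushes
        "D": 0.60 * n,                       # insertions
        "E": 1.14 * n,                       # keys moved during insertion
    }

def running_time(n):
    """Total expected time 53A + 11B + 4C + 3D + 8E + 9S + 7N."""
    q = quantities(n)
    return (53 * q["A"] + 11 * q["B"] + 4 * q["C"]
            + 3 * q["D"] + 8 * q["E"] + 9 * q["S"] + 7 * n)

def closed_form(n):
    """The closed form 10.6286 N ln N + 2.116 N quoted in the text."""
    return 10.6286 * n * math.log(n) + 2.116 * n

n = 10_000
print(running_time(n), closed_form(n))
```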

    Assembly Language

    Program 2 is an extremely efficient sorting method, but it will not
    run efficiently on any particular computer unless it is translated into
    an efficient program in that computer's machine language. If large
    files are to be sorted or if the program is to be used often, this task
    should not be entrusted to any compiler. We shall now turn from methods
    of improving the algorithm to methods of coding the program for a
    machine.

    Of most interest is the "inner loop" of the program, those statements
    whose execution frequencies are proportional to N ln N. We shall
    therefore concern ourselves with the translation of the statements

        loop:
            loop: i := i + 1; while A[i] < v repeat;
            loop: j := j - 1; while A[j] > v repeat;
        until j < i:
            A[i] :=: A[j];
        repeat;
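    As a point of reference, the partitioning loop above can be
    transcribed into a high-level language. The following Python sketch
    assumes zero-based lists, a sentinel a[r + 1] that is no smaller than
    any key (playing the role of A[N + 1] = ∞), and a partitioning element
    taken from a[l] as in Program 1. The function name partition is chosen
    here for illustration.

```python
import math

def partition(a, l, r):
    """Partition a[l..r] around v = a[l]; requires a[r + 1] >= v.
    Returns the final position of the partitioning element."""
    v = a[l]
    i, j = l, r + 1
    while True:
        i += 1
        while a[i] < v:          # scan from the left; the sentinel stops it
            i += 1
        j -= 1
        while a[j] > v:          # scan from the right; a[l] == v stops it
            j -= 1
        if j < i:                # pointers have crossed
            break
        a[i], a[j] = a[j], a[i]  # exchange the two out-of-place keys
    a[l], a[j] = a[j], a[l]      # put the partitioning element in place
    return j

keys = [3, 1, 4, 1, 5, 9, 2, 6, math.inf]   # last entry is the sentinel
p = partition(keys, 0, 7)
print(p, keys[:8])
```

    Neither scanning loop tests an array bound or compares i with j, which
    is what keeps the translated inner loops down to three instructions
    each.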

    Assembly-language implementations of the rest of the programs may be
    found in [9] or [15]. Rather than use any particular assembly language
    or deal with any particular machine, we shall use a mythical set of
    instructions similar to those in Knuth's MIX [7]. Only simple
    machine-language capabilities are used, and the programs and results
    that we shall derive may easily be translated to apply to most real
    machines.

    To begin, a direct translation of the inner loop of Programs 1 and 2
    is given below. The comments on each line explain what the instructions
    are intended to do. The mnemonics I, V, J, X, and Y are symbolic
    register names, and the notation A(I) means the contents of the memory
    location whose address is A plus the contents of index register I, or
    A[i]. Readers unfamiliar with assembly language programming should
    consult [7].

    Communications of the ACM, October 1978, Volume 21, Number 10


    LOOP  INC I, 1        Increment register I by 1.
          CMP V, A(I)     Compare v with A[i].
          JG  *-2         Go back two instructions if v > A[i].
          DEC J, 1        Decrement register J by 1.
          CMP V, A(J)     Compare v with A[j].
          JL  *-2         Go back two instructions if v < A[j].
          CMP J, I        Compare J with I.
          JL  OUT         Leave loop if j < i.
          LD  X, A(I)     Load A[i] into register X.
          LD  Y, A(J)     Load A[j] into register Y.
          ST  X, A(J)     Store register X into A[j].
          ST  Y, A(I)     Store register Y into A[i].
          JMP LOOP        Unconditional jump to LOOP.
    OUT

    This direct translation of the inner loop of Programs 1 and 2 is much
    more efficient than the code that most compilers would produce, and
    there is still room for improvement.

    First, no inner loop should ever end with an unconditional jump. Any
    such loop must contain a conditional jump somewhere, and it can always
    be "rotated" to end with the conditional jump, as follows:

          JMP INTO
    LOOP  LD  X, A(I)
          LD  Y, A(J)
          ST  X, A(J)
          ST  Y, A(I)
    INTO  INC I, 1
          CMP V, A(I)
          JG  *-2
          DEC J, 1
          CMP V, A(J)
          JL  *-2
          CMP J, I
          JGE LOOP
    OUT

    This sequence contains exactly the same number of instructions as the
    above, and they are identical when executed; but the unconditional jump
    has been moved out of the inner loop. (If the initialization of I were
    changed, a further savings could be achieved by moving INTO down one
    instruction.) This simple change reduces the running time of the
    program by about 3 percent.

    The coefficients 11 and 4 for $B_N$ and $C_N$ in the expression given
    above for the total running time can be verified by counting two time
    units for instructions which reference memory and one time unit for
    those which do not. It is this low amount of overhead that makes
    Quicksort stand out among sorting algorithms. In fact, the true "inner
    loop" is even tighter, because we have two loops within the inner loop
    here: the pointer scanning instructions

          INC I, 1          DEC J, 1
          CMP V, A(I)       CMP V, A(J)
          JG  *-2           JL  *-2

    are executed, on the average, three times more often than the others
    for Program 1. (The factor is 2 for Program 2.) It is hard to imagine a
    simpler sequence on which to base an algorithm: pointer increment,
    compare, and conditional jump. The fact that these loops are so small
    makes the proper implementation and translation of Quicksort critical.
    If we had a translation of loop: i := i + 1; while A[i] < v repeat
    which used only three superfluous instructions, or if we had checked
    for the pointers crossing or going outside the array bounds within the
    loops, then the running time of the whole program could be doubled.
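    The coefficients 11 and 4 discussed above can be checked by a simple
    tally under the stated cost model (two units for an instruction that
    references memory, one unit otherwise). The grouping of instructions
    into an "exchange path" and a "scan step" below is an interpretation
    of the direct (unrotated) translation, and the names are illustrative.

```python
# (mnemonic, references_memory) for the instructions charged to each exchange
EXCHANGE_PATH = [
    ("CMP J, I", False),
    ("JL OUT", False),
    ("LD X, A(I)", True),
    ("LD Y, A(J)", True),
    ("ST X, A(J)", True),
    ("ST Y, A(I)", True),
    ("JMP LOOP", False),
]

# Instructions executed once per comparison in either scanning loop
SCAN_STEP = [
    ("INC I, 1", False),
    ("CMP V, A(I)", True),
    ("JG *-2", False),
]

def time_units(instructions):
    """Two units per memory-referencing instruction, one for the others."""
    return sum(2 if uses_memory else 1 for _, uses_memory in instructions)

print(time_units(EXCHANGE_PATH), time_units(SCAN_STEP))
```

    In the rotated version of the loop the JMP LOOP instruction is no
    longer executed on each exchange, which is where the saving of about 3
    percent comes from.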

    Loop Unwrapping

    On the other hand, with our attention focused on these two pairs of
    three instructions, we can further improve the efficiency of the
    programs. The only real overhead within these inner loops is the
    pointer arithmetic, INC I, 1 and DEC J, 1. We shall use a technique
    called "loop unwrapping" (or "loop unrolling"; see [3]) which uses the
    addressing hardware to reduce this overhead. The idea is to make two
    copies of the loop, one for A[i] and one for A[i + 1], then increment
    the pointer once by 2 each time through. Of course, the code coming
    into and going out of the loop has to be appropriately modified.
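    The idea of unwrapping the scanning loop once can be illustrated in a
    high-level language, although the benefit itself only appears at the
    machine level, where A[i + 1] is addressed without first changing the
    index register. The sketch below assumes a sentinel key no smaller than
    v at the end of the list; both function names are illustrative.

```python
import math

def scan_simple(a, i, v):
    """loop: i := i + 1; while A[i] < v repeat;"""
    i += 1
    while a[i] < v:
        i += 1
    return i

def scan_unwrapped(a, i, v):
    """Same result, but the pointer is advanced by 2 per pass through the loop."""
    while True:
        if not a[i + 1] < v:     # first copy of the test: looks at A[i + 1]
            return i + 1         # exit code repairs the pointer by 1
        i += 2                   # one pointer update covers two keys
        if not a[i] < v:         # second copy of the test: looks at A[i]
            return i

keys = [1, 2, 0, 3, 7, 1, 9, math.inf]
print(scan_simple(keys, -1, 5), scan_unwrapped(keys, -1, 5))
```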

    Loop unwrapping is a well-known technique, but it is not well
    understood, and it will be instructive to examine its application to
    Quicksort in detail. The straightforward way to proceed would be to
    replace the instructions

          INC I, 1
          CMP V, A(I)
          JG  *-2

    by one of the equivalent code sequences

          JMP INTO                LOOP  CMP V, A+1(I)
    LOOP  INC I, 2                      JLE OUT1
          CMP V, A(I)                   INC I, 2
          JLE OUT                       CMP V, A(I)
    INTO  CMP V, A+1(I)                 JG  LOOP
          JG  LOOP                      JMP OUT
          INC I, 1                OUT1  INC I, 1
    OUT                           OUT

    We can measure the relative efficiency of these alternatives by
    considering how many memory references they involve, assuming that the
    loop iterates s times. The original code uses 4s memory references
    (three for instructions, one for data). For the unwrapped program on
    the left above, the number of memory references taken for
    s = 1, 2, 3, 4, 5, ... is 5, 8, 12, 15, 19, ..., and a general formula
    for the number of references saved is $\lfloor (s - 2)/2 \rfloor$. For
    the program on the right, the values are 4, 8, 11, 15, 18, ... and the
    savings are $\lfloor (s - 1)/2 \rfloor$. In both cases about
    $\frac{1}{2}s$ increments are saved, but the program on the right is
    slightly better.

    However, both sequences contain unnecessary unconditional jumps, and
    both can be removed, although with quite different techniques. In the
    second program, the code at OUT could be duplicated and a copy
    substituted for JMP OUT. This technique is cumbersome if this code
    contains branches, and for Quicksort it even contains another loop to
    be unwrapped. Despite such complications, this will increase the
    savings to $\lfloor s/2 \rfloor$ when the loop is iterated s times.
    Fortunately, this same efficiency can be achieved by repairing the jump
    into the loop in the program on the left. The code is exactly
    equivalent to

          CMP V, A+1(I)
          JLE OUT1
    LOOP  INC I, 2
          CMP V, A(I)
          JLE OUT
          CMP V, A+1(I)
          JG  LOOP
    OUT1  INC I, 1
    OUT

    and this code saves $\lfloor s/2 \rfloor$ memory references over the
    original when the loop is iterated s times. The j loop can obviously be
    unwrapped in the same way, and these transformations give us a more
    efficient program in which the I and J pointers are altered much less
    often.
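    The memory-reference counts quoted for the unwrapped loops can be
    reproduced by stepping through each instruction sequence. The sketch
    below assumes the cost model used in the text (one reference per
    instruction fetched, plus one for each CMP that reads A), assumes that
    the unwrapped loops advance the pointer with INC I, 2, and models a
    scan that stops after s pointer increments. All function names are
    illustrative.

```python
def refs_original(s):
    """INC, CMP, JG per iteration: three instruction fetches and one data fetch."""
    return 4 * s

def refs_left(s):
    """First unwrapped version (entered through JMP INTO)."""
    refs, i = 1, 0                       # JMP INTO
    while True:
        refs += 3                        # INTO: CMP V, A+1(I); JG LOOP
        if not i + 1 < s:
            return refs + 1, i + 1       # INC I, 1; fall through to OUT
        i += 2
        refs += 1                        # LOOP: INC I, 2
        refs += 3                        # CMP V, A(I); JLE OUT
        if not i < s:
            return refs, i

def refs_right(s):
    """Second unwrapped version (leaves through JMP OUT)."""
    refs, i = 0, 0
    while True:
        refs += 3                        # LOOP: CMP V, A+1(I); JLE OUT1
        if not i + 1 < s:
            return refs + 1, i + 1       # OUT1: INC I, 1
        i += 2
        refs += 1                        # INC I, 2
        refs += 3                        # CMP V, A(I); JG LOOP
        if not i < s:
            return refs + 1, i           # JMP OUT

def refs_final(s):
    """Final version with both unconditional jumps removed."""
    refs, i = 3, 0                       # CMP V, A+1(I); JLE OUT1
    if not 1 < s:
        return refs + 1, 1               # OUT1: INC I, 1
    while True:
        i += 2
        refs += 1                        # LOOP: INC I, 2
        refs += 3                        # CMP V, A(I); JLE OUT
        if not i < s:
            return refs, i
        refs += 3                        # CMP V, A+1(I); JG LOOP
        if not i + 1 < s:
            return refs + 1, i + 1       # OUT1: INC I, 1

for s in range(1, 6):
    print(s, refs_original(s), refs_left(s)[0], refs_right(s)[0], refs_final(s)[0])
```

    Under these assumptions the savings over the original loop come out as
    ⌊(s - 2)/2⌋, ⌊(s - 1)/2⌋ and ⌊s/2⌋ respectively, and the first
    version is indeed slower than the original when s = 1.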

    Note that since the inner loops of Quicksort are iterated only a few
    times on the average, it is very important that loop unwrapping be
    carefully implemented. The first implementation above is slower than
    the original loop if it is iterated just once, and actually increases
    the total running time of the program.

    The analysis of the effect of loop unwrapping turns out to be much
    more difficult than the other variants that we have seen. The results
    in [17] show that unwrapping the loops of Program 2 once reduces its
    running time to about $10.0038\,N \ln N + 3.530N$ time units, and that
    it is not worthwhile to unwrap further. Figure 7 shows the percentage
    improvement when this technique is applied to Program 2.

    Perspective

    By describing algorithms to sort randomly ordered and distinct
    single-word keys in a high level language, and using performance
    statistics from low level implementations on a mythical machine, we
    have avoided a number of complicated practical issues. For example, a
    real application might involve writing a program in a high level
    language to sort a large file of multiword keys on a virtual memory
    system. While other sorting methods may be appropriate for some
    particular applications, Quicksort is a very flexible algorithm, and
    the programs described above can be adapted to run efficiently in many
    special situations that arise in practice. We shall examine, in turn,
    ramifications of the analysis, special characteristics of applications,
    software considerations, and hardware considerations.

    Analysis

    In a practical situation, we might not expect to have randomly ordered
    files of distinct keys, so the relevance of the analytic results might
    be questioned. Fortunately, we know that the standard deviation is low
    (for Program 1 the standard deviation has been shown to be about
    0.648N [11, 17]), so we can expect the average running time to be
    reasonably close to the formulas given (for example, we can be 99
    percent sure that the formula for Program 1 is accurate to within 2N).
    It is shown in [16] that the assumption that the keys are distinct is
    justified and that Program 2 performs well when equal keys are present.
    Furthermore the technique of partitioning on the median of the first,
    middle, and last elements of the file ensures that Program 2 will work
    well on files that are almost in order, which do occur in practice. If
    other biases are suspected, the use of a random element for
    partitioning will lead to acceptable performance.

    Fig. 7. Improvement due to loop unwrapping. (Horizontal axis: File
    Size (N).)
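    The two ways of choosing a partitioning element mentioned above can be
    sketched as small index-selection helpers. This is an illustration
    only: Program 2 arranges the three sampled elements in place rather
    than merely selecting an index, and both function names here are
    hypothetical.

```python
import random

def median_of_three_index(a, l, r):
    """Index of the median of the first, middle, and last keys of a[l..r]."""
    m = (l + r) // 2
    return sorted([l, m, r], key=lambda k: a[k])[1]

def random_index(l, r, rng=random):
    """Index of a uniformly random key of a[l..r]."""
    return rng.randint(l, r)

in_order = list(range(10))
print(median_of_three_index(in_order, 0, 9))   # 4, the middle of an ordered file
```

    On a file that is already in order the median of three is the middle
    element, so the file is split evenly instead of degenerating as it
    does when the first element is used.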

    All of the Quicksort programs do have an $O(N^2)$ worst case. One can
    always "work backwards" to find a file which will require time
    proportional to $N^2$ to sort. This fact often dissuades people from
    using Quicksort, but it should not. The low standard deviation says
    that the worst case is extremely unlikely to occur in a probabilistic
    sense. This provides little consolation if it does occur in a practical
    file, and this is possible for Program 1 since files already in order
    and other simple files will lead to the worst case. This does not seem
    to be the case for Program 2. Hoare's technique of using a random
    partitioning element makes it extremely unlikely that the running time
    will be far from the predicted averages. (The analysis is entirely
    valid in this case, no matter what the input is.) However, this is more
    expensive than the method of Program 2, which appears to offer
    sufficient protection against the worst case.

    Applications

    We have implicitly assumed throughout that all of the records to be
    sorted fit into memory: Quicksort is an "internal" sorting method. The
    issues involved in sorting very, very large files in external storage
    are very different. Most "external" sorting methods for doing so are
    based on sorting small subfiles on one pass through the data, then
    merging these on subsequent passes. The time taken by such methods is
    dependent on physical device characteristics and hardware
    configurations. Such methods have been studied extensively, but they
    are not comparable to internal methods like Quicksort because they are
    solving a different problem.


    It is common in practical situations to have multiword keys and even
    larger records in the fields to be sorted. If records are more than a
    few words long, it is best to keep a table of pointers and refer to the
    records indirectly, so only one-word pointers need be exchanged, not
    long records. The records can be rearranged after the pointers have
    been "sorted." This is called a "pointer" or "address table" sort (see
    [11]). The main effect of multiword keys to be considered is that there
    is more overhead associated with each comparison and exchange. The
    results given above and in [17] make it possible to compare various
    alternatives and determine the total expected running time for
    particular applications. For large records, the improvement due to loop
    unwrapping becomes unimportant. If the keys are very long, it may pay
    to save extra information on the stack indicating how many words in the
    keys are known to be equal (see [6]). Our conclusions comparing methods
    are still valid, because the extra overhead associated with large keys
    and records is present in all the sorting methods.
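    The pointer (address table) sort described above can be sketched as
    follows. To keep the example short, the table of pointers is ordered
    with Python's built-in sort rather than with Quicksort; the point
    illustrated is only that the records themselves are not moved until
    the pointers are in order. The function name and the record layout are
    hypothetical.

```python
def pointer_sort(records, key):
    """Sort a table of indices by the keys they refer to, then rearrange."""
    pointers = list(range(len(records)))             # one small pointer per record
    pointers.sort(key=lambda p: key(records[p]))     # only pointers are moved
    return pointers, [records[p] for p in pointers]  # rearrange once at the end

records = [("pear", "long payload 1"), ("apple", "long payload 2"),
           ("fig", "long payload 3")]
order, rearranged = pointer_sort(records, key=lambda rec: rec[0])
print(order)                           # [1, 2, 0]
print([rec[0] for rec in rearranged])  # ['apple', 'fig', 'pear']
```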

    When we say that Quicksort is a good "general purpose" method, one
    implication is that not much information is available on the keys to
    be sorted or their distribution. If such information is available, then
    more efficient methods can be devised. For example, if the keys are the
    numbers 1, 2, ..., N, and an extra table of size N is available for
    output, they can be sorted by scanning through the file sequentially,
    putting the key with value i into the ith position in the table. (This
    kind of sorting, called "address calculation," can be extended to
    handle more general situations.) As another example, suppose that the N
    elements to be sorted have only $2^t + 1$ distinct values, all of which
    are known. Then we can partition the array on the median value, and
    apply the same procedure to the subfiles, in total time proportional to
    $(t + 1)(N + 1)$. It is shown in [16] that Program 1 will take on the
    order of $(2 \ln 2)\,tN$ comparisons on such files, so Quicksort does
    not perform badly. Other special-purpose methods can be adapted to
    other special situations, but Program 2 can be recommended as a general
    purpose sorting method because it handles many of these situations
    adequately.
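    The first special case above, keys that are exactly the numbers
    1, 2, ..., N, is simple enough to show in full. The sketch assumes the
    input really is a permutation of 1..N; the function name is
    illustrative.

```python
def address_calculation_sort(keys):
    """Place the key with value i into position i of an output table.
    Assumes keys is a permutation of 1, 2, ..., N."""
    table = [None] * len(keys)
    for k in keys:            # one sequential scan, no comparisons between keys
        table[k - 1] = k      # position i, stored zero-based
    return table

print(address_calculation_sort([3, 1, 4, 2]))   # [1, 2, 3, 4]
```

    The scan takes time proportional to N, which no comparison-based
    method such as Quicksort can match; the price is the extra table and
    the strong assumption about the keys.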

    Software

    Modern compilers have not progressed to the point where they can
    produce the best possible (or even very good) assembly-language
    translations of high level programs, so we have dealt with "ideal"
    assembly-language implementations. Standard compilers produce code for
    Quicksort that is 300-400 percent slower than the assembly-language
    implementation (see [15]). It is not unreasonable to expect that
    compilers may someday produce programs close to the ideal, since some
    of the improvements that we made could be done mechanically and are
    used in so-called "optimizing" compilers. Quicksort's partitioning
    loop, because of its structure, is actually a good test case for
    optimizing compilers: one well-known compiler actually makes the inner
    loop longer when its optimizing feature is invoked [15].

    If a sorting program must run efficiently, it should be implemented in
    assembly language, and we have shown a good way to do so. It is
    interesting to note that on many computers an implementation of
    Quicksort in Fortran (for example) will require about as many source
    statements as an assembly-language implementation (see [15]), but it
    will of course produce a much less efficient program.

    If one is willing to pay for the extra overhead of implementing his
    sorting program in a high level language, then Quicksort should still
    be used because it will incur relatively less overhead than other
    methods. Program 2 can be used as it stands, although any effort spent
    trying to "optimize" it (such as choosing the very best value of M)
    would be better spent simply implementing it in assembly language. If a
    sorting program is to be used only a few times on files which are not
    large, then Program 1 (possibly with "A[l] :=: A[(l + r) ÷ 2]" inserted
    before partitioning to make the worst case unlikely) will do quite
    nicely. The only danger is that the stack for recursion might consume
    excessive space, but this is very unlikely (it will require less than
    30 entries, on the average, for files of 10,000 elements [15]) and it
    provides a convenient "alarm" that the worst case is happening. Program
    1 is a simple program whose average running time is low: it will sort
    thousands of elements in only a few seconds on most modern computer
    systems.
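    A minimal recursive sketch in the spirit of Program 1, with the
    exchange A[l] :=: A[(l + r) ÷ 2] inserted before partitioning, is
    given below. It is not the paper's program text: it assumes zero-based
    lists of numbers, appends math.inf as the sentinel in place of
    A[N + 1] = ∞, and uses the hypothetical names quicksort and
    _sort_range.

```python
import math

def quicksort(a):
    """Sort a list of numbers in place; recursive, with a middle-element swap."""
    a.append(math.inf)                 # sentinel playing the role of A[N + 1]
    _sort_range(a, 0, len(a) - 2)
    a.pop()
    return a

def _sort_range(a, l, r):
    if r <= l:
        return
    m = (l + r) // 2
    a[l], a[m] = a[m], a[l]            # A[l] :=: A[(l + r) / 2]
    v = a[l]
    i, j = l, r + 1
    while True:
        i += 1
        while a[i] < v:                # stops at a[r + 1] at the latest
            i += 1
        j -= 1
        while a[j] > v:                # stops at a[l] at the latest
            j -= 1
        if j < i:
            break
        a[i], a[j] = a[j], a[i]
    a[l], a[j] = a[j], a[l]
    _sort_range(a, l, j - 1)           # a[j] now bounds the left subfile
    _sort_range(a, j + 1, r)           # a[r + 1] still bounds the right subfile

print(quicksort([3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]))
```

    With the middle-element exchange, a file that is already in order is
    split evenly at every stage, so the recursion depth stays close to
    log2 N; a runaway recursion depth serves as the "alarm" mentioned in
    the text.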

    Hardware

    Particular characteristics of particular real computers might allow
    for further improvements to Quicksort. For example, some computers have
    "compare and skip" and "increment and test" instructions which allow
    the inner loops to be implemented in two instructions, thus eliminating
    the need for loop unwrapping. Similar "local" improvements may be
    possible in other parts of the programs.

    The hardware feature on modern computers that has the most drastic
    effect on the performance of algorithms is paging. Quicksort
    actually does not perform badly in a virtual memory situation (see
    [2]) because it has two slowly changing "localities" around the
    scanning pointers. In some situations, it will be wise to minimize
    page faults by performing the extra processing necessary to split
    the array into many partitions (instead of only two) on the first
    partitioning stage. Of course, the programs should be changed so
    that small subfiles are "insertion sorted" as they are encountered,
    because otherwise the last scan over the whole file will involve
    unnecessary page faults. Many internal sorting methods do not work
    well at all under paging, but Quicksort can be adapted to run quite
    efficiently.
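The idea of insertion sorting small subfiles as they are encountered, rather than in one final scan over the whole file, might be sketched as follows. This is a minimal Python illustration under stated assumptions: the names `quicksort_local`, `partition`, and `insertion_sort`, the default `cutoff` of 9, and the simple first-element partitioning are all choices made for the example, not the paper's programs.

```python
def insertion_sort(a, l, r):
    """Sort a[l..r] in place by straight insertion."""
    for i in range(l + 1, r + 1):
        v = a[i]
        j = i
        while j > l and a[j - 1] > v:
            a[j] = a[j - 1]
            j -= 1
        a[j] = v


def partition(a, l, r):
    """Partition a[l..r] around a[l]; return the final index of that key."""
    v = a[l]
    i, j = l, r + 1
    while True:
        i += 1
        while i <= r and a[i] < v:
            i += 1
        j -= 1
        while a[j] > v:
            j -= 1
        if j < i:
            break
        a[i], a[j] = a[j], a[i]
    a[l], a[j] = a[j], a[l]
    return j


def quicksort_local(a, cutoff=9):
    """Sort a in place, finishing each small subfile as soon as it appears."""
    stack = [(0, len(a) - 1)]            # explicit stack of pending subfiles
    while stack:
        l, r = stack.pop()
        while r - l + 1 > cutoff:
            j = partition(a, l, r)
            # Defer the larger part and keep working on the smaller one.
            if j - l < r - j:
                stack.append((j + 1, r))
                r = j - 1
            else:
                stack.append((l, j - 1))
                l = j + 1
        # The subfile is small: sort it now, while its pages are still
        # likely to be resident, instead of in a later pass over the file.
        insertion_sort(a, l, r)
```

The trade-off is one call to the small-file routine per small subfile instead of a single pass at the end; whether that pays off depends on the paging behavior of the machine in question.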

    Another hardware feature of interest is parallelism. Quicksort does
    not take good advantage of the parallelism in large scientific
    computers, and there are methods which should do better if parallel
    computations are involved. However, Quicksort has been shown to
    perform quite well on one such computer [19]. Of course, if true
    parallelism is available then subfiles can be sorted independently
    by different processors.
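The last remark, that subfiles can be sorted independently once a partitioning stage has separated them, can be pictured with the sketch below. It is an assumption-laden illustration: two Python worker threads stand in for separate processors (under CPython's global interpreter lock they give no real speedup), and the names `parallel_quicksort`, `sort_range`, and `partition` are hypothetical. The point is only that the two subfiles occupy disjoint index ranges, so the workers never touch the same element.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(a, l, r):
    """Partition a[l..r] around a[l]; return the final index of that key."""
    v = a[l]
    i, j = l, r + 1
    while True:
        i += 1
        while i <= r and a[i] < v:
            i += 1
        j -= 1
        while a[j] > v:
            j -= 1
        if j < i:
            break
        a[i], a[j] = a[j], a[i]
    a[l], a[j] = a[j], a[l]
    return j


def sort_range(a, l, r):
    """Sequential in-place Quicksort of a[l..r]."""
    while l < r:
        j = partition(a, l, r)
        # Recurse on the smaller part and loop on the larger one,
        # which keeps the recursion depth logarithmic.
        if j - l < r - j:
            sort_range(a, l, j - 1)
            l = j + 1
        else:
            sort_range(a, j + 1, r)
            r = j - 1


def parallel_quicksort(a):
    """Partition once, then sort the two subfiles in separate workers."""
    if len(a) < 2:
        return
    j = partition(a, 0, len(a) - 1)
    with ThreadPoolExecutor(max_workers=2) as pool:
        left = pool.submit(sort_range, a, 0, j - 1)
        right = pool.submit(sort_range, a, j + 1, len(a) - 1)
        left.result()    # re-raise any exception from the worker
        right.result()
```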

    Many modern computers have hardware features such as instruction
    stacks, pipelined execution, caches, and interleaved storage which
    can improve performance greatly. Knuth [9] concludes that radix
    sorting might be preferred on number-crunching computers with
    pipelining. Loop unwrapping could be disastrous on computers with
    small instruction stacks, and the other features mentioned above
    will very often hide the time used for pointer arithmetic behind the
    time used for other instructions. The analysis of the effect of such
    hardware features can be very difficult, but again Quicksort makes a
    good test case for such studies because its inner loop is so small
    and its analysis is so well understood (see the analysis of loop
    unwrapping in [17]). However, there will probably always remain a
    role for empirical testing of alternatives in superoptimized
    implementations on advanced machines.

    It is often the case that advanced hardware features allow the
    implementation of very fast routines for sorting small files. Using
    such a routine instead of Insertionsort can lead to substantial
    improvements for Quicksort on some computers. To develop a good
    implementation of Quicksort on a new computer, one should first pay
    careful attention to the partitioning loops, then deal with the
    problem of sorting small subfiles efficiently.

    Conclusion

    Our goal in this paper has been to illustrate methods by which a
    typical computer can be made to sort a file as quickly and
    conveniently as possible. The algorithm, improvements, and
    implementation techniques described here should make it possible for
    readers to implement useful, efficient programs to solve specific
    sorting problems.

    Economic issues surrounding modern computer systems are very
    complex, and it is necessary always to be sure that it will be
    worthwhile to implement projected improvements to programs. Many
    simple applications can be handled perfectly adequately with simple
    programs such as Program 1. However, sorting is a task which is
    performed frequently enough that most computer installations have
    utility programs for the purpose. Such programs should use the best
    techniques available, so something on the order of an
    assembly-language implementation of Program 2 is called for. Sorting
    small subfiles on a separate pass, partitioning on the median of
    three elements, and unwrapping the inner loops reduce the expected
    running time on a typical computer from about
    11.6667 N ln N + 12.312 N to about 10.0038 N ln N + 3.530 N time
    units. Figure 8 shows the total percentage improvement for these
    improvements together.
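To make the size of this reduction concrete, the two running-time expressions quoted above can be evaluated directly. The short Python sketch below does only that arithmetic; the function names `time_before`, `time_after`, and `percent_improvement` are assumptions introduced for the example, and "ln" is taken to be the natural logarithm.

```python
import math


def time_before(n):
    """Expected time units before the three improvements (formula from the text)."""
    return 11.6667 * n * math.log(n) + 12.312 * n


def time_after(n):
    """Expected time units after the three improvements (formula from the text)."""
    return 10.0038 * n * math.log(n) + 3.530 * n


def percent_improvement(n):
    """Relative saving of the improved expression, in percent."""
    return 100.0 * (time_before(n) - time_after(n)) / time_before(n)


for n in (1000, 5000, 10000):
    print(n, round(percent_improvement(n), 1))
```

Over the range 1,000 to 10,000 elements the saving computed this way is roughly 20 to 22 percent, and it shrinks slowly as N grows because the linear terms matter less relative to the N ln N terms.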

    Many of the issues raised above relating to other sorting programs
    are treated fully in [9], and the issues specific to Quicksort are
    also dealt with in [15]. We have not described here the countless
    other variants of Quicksort which have been proposed to improve the
    algorithm or to deal with the various problems outlined above [1, 4,
    13, 14, 20]. Many of these turn out not to be improvements at all:
    see [15] for complete descriptions. For example, nearly every
    published implementation of Quicksort uses a different partitioning
    method. The various methods seem to differ only slightly, but
    actually their performance characteristics can differ greatly.
    Caution should be exercised before a partitioning method which
    differs from those above is used.

    Program 2 is the method of choice in many practical sorting
    situations and will be very quick if properly implemented. Quicksort
    is an interesting algorithm which combines utility, elegance, and
    efficiency.

    Received May 1976; revised February 1978.

    Fig. 8. Cumulative improvement due to sorting small subfiles on a
    separate pass, median-of-three partitioning, and loop unwrapping.
    [Plot not reproduced. Vertical axis ticks: 15, 20, 25; horizontal
    axis: File Size (N), 1000 to 10000.]

    References

    1. Boothroyd, J. Sort of a section of the elements of an array by
    determining the rank of each element: Algorithm 25; and Ordering
    the subscripts of an array section according to the magnitudes of
    the elements: Algorithm 26. Comptr. J. 10 (Nov. 1967), 308-310.
    (See notes by R.S. Scowen in Comptr. J. 12 (Nov. 1969), 408-409,
    and by A.D. Woodall in Comptr. J. 13 (Aug. 1970).)
    2. Brawn, B.S., Gustavson, F.G., and Mankin, E. Sorting in a paging
    environment. Comm. ACM 13, 8 (Aug. 1970), 483-494.
    3. Cocke, J., and Schwartz, J.T. Programming languages and their
    compilers. Preliminary Notes. Courant Inst. of Math. Sciences, New
    York U., New York, 1970.
    4. Frazer, W.D., and McKellar, A.C. Samplesort: A sampling approach
    to minimal storage tree sorting. J. ACM 17, 3 (July 1970), 496-507.
    5. Hoare, C.A.R. Partition: Algorithm 63; Quicksort: Algorithm 64;
    and Find: Algorithm 65. Comm. ACM 4, 7 (July 1961), 321-322. (See
    also certification by J.S. Hillmore in Comm. ACM 5, 8 (Aug. 1962),
    439, and B. Randell and L.J. Russell in Comm. ACM 6, 8 (Aug. 1963),
    446.)
    6. Hoare, C.A.R. Quicksort. Computer J. 5, 4 (April 1962), 10-15.

    7. Knuth, D.E. The Art of Computer Programming, Vol. 1: Fundamental
    Algorithms. Addison-Wesley, Mass., 1968.

    8. Knuth, D.E. The Art of Computer Programming, Vol. 2:
    Seminumerical Algorithms. Addison-Wesley, Mass., 1969.
    9. Knuth, D.E. The Art of Computer Programming, Vol. 3: Sorting and
    Searching. Addison-Wesley, Mass., 1972.
    10. Knuth, D.E. Structured programming with go to statements.
    Computing Surveys 6, 4 (Dec. 1974), 261-301.
    11. Loeser, R. Some performance tests of "quicksort" and
    descendants. Comm. ACM 17, 3 (March 1974), 143-152.
    12. Morris, R. Some theorems on sorting. SIAM J. Appl. Math. 17, 1
    (Jan. 1969), 1-6.
    13. Rich, R.P. Internal Sorting Methods Illustrated with PL/I
    Programs. Prentice-Hall, Englewood Cliffs, N.J., 1972.
    14. Scowen, R.S. Quickersort: Algorithm 271. Comm. ACM 8, 11 (Nov.
    1965), 669-670. (See also certification by C.R. Blair in Comm. ACM
    9, 5 (May 1966), 354.)
    15. Sedgewick, R. Quicksort. Ph.D. Th. Stanford Comptr. Sci. Rep.
    STAN-CS-75-492, Stanford U., Stanford, Calif., May 1975.
    16. Sedgewick, R. Quicksort with equal keys. SIAM J. Comput. 6, 2
    (June 1977), 240-287.
    17. Sedgewick, R. The analysis of Quicksort programs. Acta
    Informatica 7 (1977), 327-355.
    18. Singleton, R.C. An efficient algorithm for sorting with minimal
    storage: Algorithm 347. Comm. ACM 12, 3 (March 1969), 185-187. (See
    also remarks by R. Griffin and K.A. Redish in Comm. ACM 13, 1 (Jan.
    1970), 54 and by R. Peto, Comm. ACM 13, 10 (Oct. 1970), 624.)
    19. Stone, H.S. Sorting on STAR. IEEE Trans. Software Eng. SE-4, 2
    (Mar. 1978), 138-146.
    20. van Emden, M.N. Increasing the efficiency of quicksort:
    Algorithm 402. Comm. ACM 13, 11 (Nov. 1970), 693-694. (See also the
    article by the same name in Comm. ACM 13, 9 (Sept. 1970), 563-567.)
    21. Wirth, N. Algorithms + Data Structures = Programs.
    Prentice-Hall, Englewood Cliffs, N.J., 1976.

    Programming Techniques
    S.L. Graham, R.L. Rivest
    Editors

    Packed Scatter Tables

    Gordon Lyon
    National Bureau of Standards

    Scatter tables for open addressing benefit from recursive entry
    displacements, cutoffs for unsuccessful searches, and auxiliary cost
    functions. Compared with conventional methods, the new techniques
    provide substantially improved tables that resemble exact-solution
    optimal packings. The displacements are depth-limited approximations
    to an enumerative (exhaustive) optimization, although packing costs
    remain linear, O(n), with table size n. The techniques are primarily
    suited for important fixed (but possibly quite large) tables for
    which reference frequencies may be known: op-code tables, spelling
    dictionaries, access arrays. Introduction of frequency weights
    further improves retrievals, but the enhancement may degrade
    cutoffs.

    Key Words and Phrases: assignment problem, backtrack programming,
    hashing, open addressing, recursion, scatter table rearrangements

    CR Categories: 3.74, 4.0

    Permission to copy without fee all or part of this material is
    granted provided that the copies are not made or distributed for
    direct commercial advantage, the ACM copyright notice and the title
    of the publication and its date appear, and notice is given that
    copying is by permission of the Association for Computing Machinery.
    To copy otherwise, or to republish, requires a fee and/or specific
    permission.

    Author's address: U.S. Department of Commerce, National Bureau of
    Standards, Computer Science Section, A367-Tech, Washington, D.C.
    20234.

    1978 ACM 0001-0782/78/1000-0857 $00.75
