C1 - Insertion Sort

Insertion Sort. Ing. Juan Ignacio Zamora M., MSc. | Universidad Latinoamericana de Ciencia y Tecnología

The Legend of Gauss

- Once upon a time there was a German boy named Carl Friedrich Gauss. In 1787, when he was ten years old, his schoolteacher, angry because the pupils were misbehaving, set a math problem for young Carl and his classmates.

- The children had to add up all the numbers from 1 to 100; that is, 1 + 2 = 3, 3 + 3 = 6, 6 + 4 = 10, 10 + 5 = 15, 15 + 6 = 21, and so on until all 100 numbers were added.

- The teacher sat down in his chair to read the newspaper, confident that he would have hours before the children finished adding all the numbers...

- Gauss solved it in 5 minutes...

How did he do it?

- Let S be the sum of the progression a, b, c, ..., l, m, u, which has n terms and common difference r.

- Then S = a + b + c + ... + l + m + u

- Also, S = u + m + l + ... + c + b + a

- Adding the two lines term by term, 2S = (a + u) + (b + m) + (c + l) + ... + (l + c) + (m + b) + (u + a).

- All of the binomials above are equal to (a + u), because each step away from a adds r while each step away from u subtracts r. Recall that a is the first term and u is the last.

- This means that 2S is (a + u) taken n times, that is, (a + u)n. Dividing by 2, since every term has been counted twice, gives:

$$S = \frac{(a + u)\,n}{2}$$

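Let's Test the Theorem

$$
\begin{aligned}
S &= 1 + 2 + 3 + \cdots + 98 + 99 + 100 \\
S &= 100 + 99 + 98 + \cdots + 3 + 2 + 1 \\
2S &= 101 + 101 + 101 + \cdots + 101 + 101 + 101
\end{aligned}
$$

$$S = \frac{(a + u)\,n}{2} = \frac{(1 + 100) \cdot 100}{2} = \frac{101 \cdot 100}{2} = 5050$$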
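As a quick sanity check, the closed form S = (a + u)n/2 can be compared against adding the terms one at a time. The following is only a minimal sketch; the function name `arithmetic_sum` and its parameter names are illustrative choices, not part of the original slides.

```python
def arithmetic_sum(first, last, count):
    """Sum of an arithmetic progression using S = (a + u) * n / 2."""
    # For a progression of integers, (first + last) * count equals 2S,
    # so it is always even and the integer division below is exact.
    return (first + last) * count // 2


# Compare the closed form with a term-by-term sum for Gauss's problem.
closed_form = arithmetic_sum(1, 100, 100)
brute_force = sum(range(1, 101))
print(closed_form, brute_force)  # 5050 5050
```

The closed form needs a constant number of arithmetic operations, while the term-by-term sum needs one addition per term.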

Arithmetic Progressions

- An arithmetic progression is any sequence in which each term after the first is obtained by adding a constant quantity to the previous term.

- S = 1, 3, 5, 7, ..., where the common difference (written r, or d) is 2, since 3 - 1 = 2. In general, the common difference r is any term minus the term before it.

Deriving the Formula for the nth Term

- Let the progression be S = a, b, c, d, ..., u, where u is the nth term and the common difference is r.

- Then we have:
  - b = a + r
  - c = b + r = (a + r) + r = a + 2r
  - d = c + r = (a + 2r) + r = a + 3r

- So each term equals the first term of the progression plus the common difference taken as many times as there are terms preceding it.

- The first term is a, and the nth term is preceded by (n - 1) terms, each one adding r, so we can conclude that

$$u = a + (n - 1)\,r$$

Deriving the Formula for the nth Term

- Going back to the young Gauss example, we have S = 5050, the first term a is 1, the common difference r is 1 (the numbers go up one at a time), and the number of terms n is 100.

$$u = a + (n - 1)\,r = 1 + (100 - 1) \cdot 1 = 100$$

- The last term is indeed 100, which supports Gauss's theorem.

- Try it yourself:
  - Find the 15th term of the sequence 4, 7, 10, ...
  - The 15th term does not represent the sum of the terms; it only represents the value of that term.
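The nth-term formula u = a + (n - 1)r translates directly into code. This is a small sketch under the slide's notation; the function name `nth_term` is an assumption made for illustration.

```python
def nth_term(a, r, n):
    """n-th term of an arithmetic progression: u = a + (n - 1) * r."""
    return a + (n - 1) * r


# Gauss's example: first term 1, common difference 1, 100 terms.
print(nth_term(1, 1, 100))  # 100

# The odd numbers 1, 3, 5, 7, ...: the 4th term is 7.
print(nth_term(1, 2, 4))  # 7
```

The same function can be used to check an answer to the exercise above by calling it with a = 4 and r = 3.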

Algorithm 1: Insertion Sort

Figure 2.1 Sorting a hand of cards using insertion sort.

What separates pseudocode from "real" code is that in pseudocode, we employ whatever expressive method is most clear and concise to specify a given algorithm. Sometimes, the clearest method is English, so do not be surprised if you come across an English phrase or sentence embedded within a section of "real" code. Another difference between pseudocode and real code is that pseudocode is not typically concerned with issues of software engineering. Issues of data abstraction, modularity, and error handling are often ignored in order to convey the essence of the algorithm more concisely.

We start with insertion sort, which is an efficient algorithm for sorting a small number of elements. Insertion sort works the way many people sort a hand of playing cards. We start with an empty left hand and the cards face down on the table. We then remove one card at a time from the table and insert it into the correct position in the left hand. To find the correct position for a card, we compare it with each of the cards already in the hand, from right to left, as illustrated in Figure 2.1. At all times, the cards held in the left hand are sorted, and these cards were originally the top cards of the pile on the table.

We present our pseudocode for insertion sort as a procedure called INSERTION-SORT, which takes as a parameter an array A[1..n] containing a sequence of length n that is to be sorted. (In the code, the number n of elements in A is denoted by A.length.) The algorithm sorts the input numbers in place: it rearranges the numbers within the array A, with at most a constant number of them stored outside the array at any time. The input array A contains the sorted output sequence when the INSERTION-SORT procedure is finished.

Pseudocode: InsertionSort

      1 2 3 4 5 6
(a)   5 2 4 6 1 3
(b)   2 5 4 6 1 3
(c)   2 4 5 6 1 3
(d)   2 4 5 6 1 3
(e)   1 2 4 5 6 3
(f)   1 2 3 4 5 6

Figure 2.2 The operation of INSERTION-SORT on the array A = ⟨5, 2, 4, 6, 1, 3⟩. Array indices appear above the rectangles, and values stored in the array positions appear within the rectangles. (a)–(e) The iterations of the for loop of lines 1–8. In each iteration, the black rectangle holds the key taken from A[j], which is compared with the values in shaded rectangles to its left in the test of line 5. Shaded arrows show array values moved one position to the right in line 6, and black arrows indicate where the key moves to in line 8. (f) The final sorted array.

INSERTION-SORT(A)

1  for j = 2 to A.length
2      key = A[j]
3      // Insert A[j] into the sorted sequence A[1..j-1].
4      i = j - 1
5      while i > 0 and A[i] > key
6          A[i + 1] = A[i]
7          i = i - 1
8      A[i + 1] = key
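The pseudocode above can be sketched in Python roughly as follows. Python lists are 0-indexed, so the loop bounds shift by one relative to the 1-indexed pseudocode; the function name `insertion_sort` is an illustrative choice.

```python
def insertion_sort(a):
    """Sort the list `a` in place in nondecreasing order and return it."""
    for j in range(1, len(a)):        # pseudocode: for j = 2 to A.length
        key = a[j]
        # Insert a[j] into the sorted prefix a[0..j-1].
        i = j - 1
        while i >= 0 and a[i] > key:  # pseudocode: i > 0 with 1-based indices
            a[i + 1] = a[i]           # shift the larger element one slot right
            i -= 1
        a[i + 1] = key                # drop the key into the gap
    return a


print(insertion_sort([5, 2, 4, 6, 1, 3]))  # [1, 2, 3, 4, 5, 6]
```

Because the comparison is strict (`a[i] > key`), equal elements are never moved past each other, so this version keeps equal keys in their original relative order.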

Loop invariants and the correctness of insertion sort

Figure 2.2 shows how this algorithm works for A = ⟨5, 2, 4, 6, 1, 3⟩. The index j indicates the "current card" being inserted into the hand. At the beginning of each iteration of the for loop, which is indexed by j, the subarray consisting of elements A[1..j-1] constitutes the currently sorted hand, and the remaining subarray A[j+1..n] corresponds to the pile of cards still on the table. In fact, elements A[1..j-1] are the elements originally in positions 1 through j-1, but now in sorted order. We state these properties of A[1..j-1] formally as a loop invariant:

At the start of each iteration of the for loop of lines 1–8, the subarray A[1..j-1] consists of the elements originally in A[1..j-1], but in sorted order.
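One way to make the invariant concrete is to assert it while the algorithm runs. The sketch below is an illustration rather than part of the original text: it uses 0-based Python indices, so the prefix `a[:j]` plays the role of A[1..j-1], and the name `insertion_sort_checked` is hypothetical.

```python
def insertion_sort_checked(a):
    """Insertion sort that asserts the loop invariant on every iteration."""
    original = list(a)  # snapshot of the input, used only by the checks
    for j in range(1, len(a)):
        # Invariant: the prefix holds the elements originally in that
        # prefix, but in sorted order.
        assert a[:j] == sorted(original[:j])
        key = a[j]
        i = j - 1
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key
    # After the loop, the invariant covers the whole array.
    assert a == sorted(original)
    return a


print(insertion_sort_checked([5, 2, 4, 6, 1, 3]))  # [1, 2, 3, 4, 5, 6]
```

The assertions re-sort the prefix on every iteration, so this version is only meant for study and testing, not for measuring running time.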

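We use loop invariants to help us understand why an algorithm is correct. We must show three things about a loop invariant: initialization (it is true prior to the first iteration of the loop), maintenance (if it is true before an iteration of the loop, it remains true before the next iteration), and termination (when the loop terminates, the invariant gives us a useful property that helps show that the algorithm is correct).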



MIT Chapter 2 – p. 18

Running Time of InsertionSort

INSERTION-SORT(A)                                            cost   times
1  for j = 2 to A.length                                     c1     n
2      key = A[j]                                            c2     n - 1
3      // Insert A[j] into the sorted sequence A[1..j-1].    0      n - 1
4      i = j - 1                                             c4     n - 1
5      while i > 0 and A[i] > key                            c5     $\sum_{j=2}^{n} t_j$
6          A[i + 1] = A[i]                                   c6     $\sum_{j=2}^{n} (t_j - 1)$
7          i = i - 1                                         c7     $\sum_{j=2}^{n} (t_j - 1)$
8      A[i + 1] = key                                        c8     n - 1

Here $t_j$ denotes the number of times the while loop test in line 5 is executed for that value of j.

The running time of the algorithm is the sum of running times for each statement executed; a statement that takes $c_i$ steps to execute and executes n times will contribute $c_i n$ to the total running time (see footnote 6). To compute T(n), the running time of INSERTION-SORT on an input of n values, we sum the products of the cost and times columns, obtaining

$$T(n) = c_1 n + c_2 (n - 1) + c_4 (n - 1) + c_5 \sum_{j=2}^{n} t_j + c_6 \sum_{j=2}^{n} (t_j - 1) + c_7 \sum_{j=2}^{n} (t_j - 1) + c_8 (n - 1).$$

Even for inputs of a given size, an algorithm's running time may depend on which input of that size is given. For example, in INSERTION-SORT, the best case occurs if the array is already sorted. For each j = 2, 3, ..., n, we then find that A[i] ≤ key in line 5 when i has its initial value of j - 1. Thus $t_j = 1$ for j = 2, 3, ..., n, and the best-case running time is

$$
\begin{aligned}
T(n) &= c_1 n + c_2 (n - 1) + c_4 (n - 1) + c_5 (n - 1) + c_8 (n - 1) \\
     &= (c_1 + c_2 + c_4 + c_5 + c_8)\, n - (c_2 + c_4 + c_5 + c_8).
\end{aligned}
$$

We can express this running time as an + b for constants a and b that depend on the statement costs $c_i$; it is thus a linear function of n.

If the array is in reverse sorted order (that is, in decreasing order), the worst case results. We must compare each element A[j] with each element in the entire sorted subarray A[1..j-1], and so $t_j = j$ for j = 2, 3, ..., n.

Footnote 6: This characteristic does not necessarily hold for a resource such as memory. A statement that references m words of memory and is executed n times does not necessarily reference mn distinct words of memory.










The running time T for an input of size n is given by the expression for T(n) derived above.

Running Time of InsertionSort

- Best-case running time

$$T(n) = an + b$$

- In short, this expression behaves as a linear function of n, which gives the lower bound $T(n) = \Omega(n)$.

Running Time of InsertionSort

- Worst-case running time

- This function grows quadratically in n.

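Noting that

$$\sum_{j=2}^{n} j = \frac{n(n+1)}{2} - 1$$

and

$$\sum_{j=2}^{n} (j - 1) = \frac{n(n-1)}{2}$$

(see Appendix A for a review of how to solve these summations), we find that in the worst case, the running time of INSERTION-SORT is

$$
\begin{aligned}
T(n) &= c_1 n + c_2 (n - 1) + c_4 (n - 1) + c_5 \left( \frac{n(n+1)}{2} - 1 \right) + c_6 \left( \frac{n(n-1)}{2} \right) + c_7 \left( \frac{n(n-1)}{2} \right) + c_8 (n - 1) \\
     &= \left( \frac{c_5}{2} + \frac{c_6}{2} + \frac{c_7}{2} \right) n^2 + \left( c_1 + c_2 + c_4 + \frac{c_5}{2} - \frac{c_6}{2} - \frac{c_7}{2} + c_8 \right) n - (c_2 + c_4 + c_5 + c_8).
\end{aligned}
$$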

We can express this worst-case running time as $an^2 + bn + c$ for constants a, b, and c that again depend on the statement costs $c_i$; it is thus a quadratic function of n.
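The best-case and worst-case counts can be observed directly by instrumenting the inner loop. In the sketch below, each list entry is the number of times the while test runs for one value of j, which is the quantity t_j used in the analysis; the function name `while_test_counts` is an illustrative assumption.

```python
def while_test_counts(values):
    """Run insertion sort on a copy of `values` and return [t_2, ..., t_n],
    where t_j is the number of times the while test is evaluated for j."""
    a = list(values)
    counts = []
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        tests = 1                     # the first evaluation of the while test
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
            tests += 1                # the test runs again after each shift
        a[i + 1] = key
        counts.append(tests)
    return counts


n = 6
best = while_test_counts(range(1, n + 1))   # already sorted input
worst = while_test_counts(range(n, 0, -1))  # reverse sorted input
print(best, sum(best))    # [1, 1, 1, 1, 1] 5
print(worst, sum(worst))  # [2, 3, 4, 5, 6] 20
```

For n = 6 the sorted input gives a total of n - 1 = 5 tests, while the reversed input gives n(n + 1)/2 - 1 = 20, matching the two summations used in the best-case and worst-case formulas.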

Typically, as in insertion sort, the running time of an algorithm is fixed for a given input, although in later chapters we shall see some interesting "randomized" algorithms whose behavior can vary even for a fixed input.

Worst-case and average-case analysis

In our analysis of insertion sort, we looked at both the best case, in which the input array was already sorted, and the worst case, in which the input array was reverse sorted. For the remainder of this book, though, we shall usually concentrate on finding only the worst-case running time, that is, the longest running time for any input of size n. We give three reasons for this orientation.

- The worst-case running time of an algorithm gives us an upper bound on the running time for any input. Knowing it provides a guarantee that the algorithm will never take any longer. We need not make some educated guess about the running time and hope that it never gets much worse.

- For some algorithms, the worst case occurs fairly often. For example, in searching a database for a particular piece of information, the searching algorithm's worst case will often occur when the information is not present in the database. In some applications, searches for absent information may be frequent.
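The search example can be illustrated with a simple linear scan that counts comparisons. The text does not say which search algorithm the database uses, so linear search is an assumption made here purely for illustration, as are the names `linear_search` and `records`.

```python
def linear_search(items, target):
    """Return (index, comparisons); index is None when target is absent."""
    comparisons = 0
    for index, item in enumerate(items):
        comparisons += 1
        if item == target:
            return index, comparisons
    return None, comparisons


records = [5, 2, 4, 6, 1, 3]
print(linear_search(records, 5))  # (0, 1)    found at once
print(linear_search(records, 7))  # (None, 6) absent: every record is checked
```

A search for an absent value always inspects all n records, so an application that often looks up missing information hits this worst case often.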


$$T(n) = an^2 + bn + c, \qquad T(n) = O(n^2)$$

Best vs. Worst
