
Lecture 7: Sorting Techniques

Prakash Gautam · https://prakashgautam.com.np/dipit02/ · info@prakashgautam.com.np

26 April, 2018

Agenda

➔ Introduction to Sorting
➔ Different Sorting Techniques
◆ Bubble Sort
◆ Selection Sort
◆ Insertion Sort
◆ Merge Sort
◆ Quick Sort
◆ Shell Sort


Sorting
➔ An operation that segregates items into groups according to a specified criterion
➔ Input: A = { 3 1 6 2 1 3 4 5 9 0 }
➔ Output: A = { 0 1 1 2 3 3 4 5 6 9 }
➔ Sorting: ordering
➔ Sorted: ordered in a particular way
➔ Examples
◆ Words in a dictionary


➔ Arranging the elements of a list or collection in increasing or decreasing order of some property
➔ The list may hold any data type
◆ Strings or words: lexicographical order
◆ Integers: increasing order of value
➔ 2, 3, 9, 4, 6
◆ 2, 3, 4, 6, 9 [increasing order of value]
◆ 9, 6, 4, 3, 2 [decreasing order of value]
◆ 2, 3, 9, 4, 6 [increasing order of # of factors]


➔ Sorted data are useful not only for representation & retrieval of data
◆ Sorting also significantly speeds up later computation, e.g. searching
◆ Unsorted: Linear Search, O(n)
◆ Sorted: Binary Search, O(log n)
➔ Goal: study, analyze & compare the various sorting algorithms
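The search speed-up above can be made concrete with a short sketch. This binary search in C is mine, not from the slides (the name `binary_search` and the test array are assumptions):

```c
#include <assert.h>

/* Binary search on a sorted array: O(log n) probes
   instead of the O(n) scan a linear search needs. */
int binary_search(const int a[], int n, int key) {
    int lo = 0, hi = n - 1;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;   /* avoids overflow of (lo + hi) / 2 */
        if (a[mid] == key)      return mid;
        else if (a[mid] < key)  lo = mid + 1;
        else                    hi = mid - 1;
    }
    return -1;                          /* not found */
}
```

Each comparison halves the remaining range, which is why sorted data pays off for repeated lookups.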


Sorting Algorithms
➔ Bubble Sort
➔ Selection Sort
➔ Insertion Sort
➔ Merge Sort
➔ Quick Sort
➔ Shell Sort
➔ Radix Sort
➔ Swap Sort
➔ Heap Sort

Classification of Sorting Algorithms
➔ Time Complexity
◆ Rate of growth of time taken by an algorithm with respect to input size, n
➔ Space Complexity
◆ In-place algorithm or not…?
➔ Stability
◆ Does it preserve the relative order of equal key values?
➔ Internal or External Sort (RAM or disk)
➔ Recursive or Non-Recursive


Bubble Sort
➔ Bubble sort is also called sinking sort
➔ It is a simple sorting algorithm that works by repeatedly stepping through the list to be sorted, comparing each pair of adjacent items and swapping them if they are in the wrong order
➔ The pass through the list is repeated until no swaps are needed, which indicates that the list is sorted


[Figures: worked example of Bubble Sort, Pass 1 through Pass 4]

➔ The algorithm gets its name from the way smaller elements "bubble" to the top of the list
➔ As it only uses comparisons to operate on elements, it is a comparison sort
➔ Notice that at least one element reaches its correct position in each pass
➔ Although the algorithm is simple, it is too slow for practical use


Bubble Sort: Algorithm

Bubble_Sort(A, n)
  for k = 1 to n-1
    for i = 0 to n-2
      if (A[i] > A[i+1])
        Swap(A[i], A[i+1])

Swap(A[i], A[i+1])
  temp = A[i]
  A[i] = A[i+1]
  A[i+1] = temp

Bubble Sort: Improvement-I

Bubble_Sort(A, n)
  for k = 1 to n-1
    for i = 0 to n-k-1
      if (A[i] > A[i+1])
        Swap(A[i], A[i+1])

Bubble Sort: Improvement-II

Bubble_Sort(A, n)
  for k = 1 to n-1
    flag = 0
    for i = 0 to n-k-1
      if (A[i] > A[i+1])
        Swap(A[i], A[i+1])
        flag = 1
    if (flag == 0) break
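Translated to C, the version with both improvements (shrinking inner loop and early-exit flag) might look like this; the function name `bubble_sort` is mine:

```c
#include <assert.h>

/* Bubble sort with both improvements from the slides:
   the inner loop ignores the k already-settled elements,
   and a flag stops early when a pass makes no swap. */
void bubble_sort(int a[], int n) {
    for (int k = 1; k <= n - 1; k++) {
        int flag = 0;                   /* reset once per pass */
        for (int i = 0; i <= n - k - 1; i++) {
            if (a[i] > a[i + 1]) {
                int tmp = a[i];         /* Swap(A[i], A[i+1]) */
                a[i] = a[i + 1];
                a[i + 1] = tmp;
                flag = 1;
            }
        }
        if (flag == 0) break;           /* no swap: already sorted */
    }
}
```

On an already-sorted input the flag makes the first pass the only one, giving the O(n) best case.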

Bubble Sort: Complexity Analysis
➔ Best Case: O(n)
➔ Worst Case: O(n²)
➔ Average Case: O(n²)


Selection Sort
➔ The array is imaginarily divided into two parts - a sorted one & an unsorted one
➔ At the beginning, the sorted part is empty, while the unsorted one contains the whole array
➔ At every step, the algorithm finds the minimal element in the unsorted part and adds it to the end of the sorted one
➔ When the unsorted part becomes empty, the algorithm stops


Selection Sort: Algorithm

Selection_Sort(A, n)
  for i = 0 to n-1
    iMin = i
    for j = i+1 to n-1
      if (A[j] < A[iMin])
        iMin = j
    Swap(A[i], A[iMin])
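A C rendering of the pseudocode above might be (a sketch; the name `selection_sort` is mine):

```c
#include <assert.h>

/* Selection sort: grow the sorted prefix by repeatedly
   selecting the minimum of the unsorted suffix. */
void selection_sort(int a[], int n) {
    for (int i = 0; i < n - 1; i++) {
        int iMin = i;                   /* index of smallest seen so far */
        for (int j = i + 1; j < n; j++)
            if (a[j] < a[iMin])
                iMin = j;
        int tmp = a[i];                 /* at most one swap per pass */
        a[i] = a[iMin];
        a[iMin] = tmp;
    }
}
```

Note the single swap per outer iteration, which is why selection sort minimizes the number of swaps.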

Selection Sort: Complexity Analysis
➔ O(n²)
➔ It minimizes the # of swaps


Insertion Sort
➔ The array is imaginarily divided into two parts - a sorted one & an unsorted one
➔ At the beginning, the sorted part is empty, while the unsorted one contains the whole array
➔ It keeps a prefix of the array sorted
➔ This prefix is grown by inserting the next value into it at the correct place
➔ Eventually, the prefix is the entire array, which is therefore sorted

3 7 4 9 5 2 6 1

3 7 4 9 5 2 6 1

3 7 4 9 5 2 6 1

3 4 7 9 5 2 6 1

3 4 7 9 5 2 6 1

3 4 5 7 9 2 6 1

2 3 4 5 7 9 6 1

2 3 4 5 6 7 9 1

1 2 3 4 5 6 7 9

1 2 3 4 5 6 7 9


Insertion Sort: Algorithm

Insertion_Sort(A, n)
  for i = 1 to n-1
    Value = A[i]; hole = i
    while (hole > 0 && A[hole-1] > Value)
      A[hole] = A[hole-1]
      hole = hole - 1
    A[hole] = Value
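The same "hole" technique in C could be sketched as follows (the name `insertion_sort` is mine):

```c
#include <assert.h>

/* Insertion sort: shift larger elements right to open a
   "hole", then drop the current value into it. */
void insertion_sort(int a[], int n) {
    for (int i = 1; i < n; i++) {
        int value = a[i];               /* element to insert */
        int hole = i;
        while (hole > 0 && a[hole - 1] > value) {
            a[hole] = a[hole - 1];      /* shift right by one */
            hole--;
        }
        a[hole] = value;                /* insert at its place */
    }
}
```

Shifting instead of swapping is what makes insertion sort do fewer element moves than bubble sort in practice.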

Insertion Sort: Complexity Analysis
➔ Best Case: O(n)
➔ Worst Case: O(n²)
➔ Average Case: O(n²)
➔ It minimizes the # of swaps
➔ In practice, comparisons & swaps are much fewer than in Bubble & Selection Sort


Merge Sort
➔ Divide & Conquer
◆ DIVIDE: Partition the n-element sequence to be sorted into two subsequences of n/2 elements each
◆ CONQUER: Sort the two subsequences recursively using merge sort
◆ COMBINE: Merge the two sorted subsequences of size n/2 each to produce the sorted sequence
➔ Note that recursion "bottoms out" when the sequence to be sorted is of unit length


➔ Since every sequence of length 1 is in sorted order, no further recursive call is necessary
➔ The key operation of the merge sort algorithm is the merging of the two sorted subsequences in the "combine" step


Merge Sort: Algorithm

Merge_Sort(A)
  n = length(A)
  if (n < 2) return        // already sorted
  mid = n/2
  left = Array of size (mid)
  right = Array of size (n-mid)
  for i = 0 to mid-1
    left[i] = A[i]
  for i = mid to n-1
    right[i-mid] = A[i]
  Merge_Sort(left); Merge_Sort(right)
  Merge(left, right, A)


Merge(L, R, A)
  nL = length(L); nR = length(R); i = j = k = 0
  while (i < nL && j < nR)
    if (L[i] <= R[j])
      A[k] = L[i]; i++
    else
      A[k] = R[j]; j++
    k++
  while (i < nL)
    A[k] = L[i]; i++; k++
  while (j < nR)
    A[k] = R[j]; j++; k++
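The two routines above can be combined into a runnable C sketch (function names `merge` and `merge_sort` are mine; the extra memory is freed in each call, keeping space at O(n)):

```c
#include <assert.h>
#include <stdlib.h>

/* Merge two sorted halves L and R back into A. */
static void merge(int L[], int nL, int R[], int nR, int A[]) {
    int i = 0, j = 0, k = 0;
    while (i < nL && j < nR)            /* take the smaller head each time */
        A[k++] = (L[i] <= R[j]) ? L[i++] : R[j++];
    while (i < nL) A[k++] = L[i++];     /* copy leftovers from L */
    while (j < nR) A[k++] = R[j++];     /* copy leftovers from R */
}

void merge_sort(int A[], int n) {
    if (n < 2) return;                  /* length 0 or 1: already sorted */
    int mid = n / 2;
    int *left  = malloc(mid * sizeof *left);
    int *right = malloc((n - mid) * sizeof *right);
    for (int i = 0; i < mid; i++) left[i] = A[i];
    for (int i = mid; i < n; i++) right[i - mid] = A[i];
    merge_sort(left, mid);
    merge_sort(right, n - mid);
    merge(left, mid, right, n - mid, A);
    free(left);                         /* freeing here keeps space O(n) */
    free(right);
}
```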

Merge Sort: Analysis
➔ Time Complexity: O(n log n)
➔ Space Complexity
◆ Not an in-place algorithm … WHY?
◆ If we don't free the extra memory for left & right: O(n log n)
◆ If we free the extra memory in each call: O(n)


Quick Sort
➔ It is among the fastest known sorting algorithms in practice and is often the best practical choice for sorting
➔ Pick an element, called a pivot, from the array
➔ Reorder the array so that all elements with values less than the pivot come before the pivot, while all elements with values greater than the pivot come after it (equal values can go either way). After this partitioning, the pivot is in its final position. This is called the partition operation


➔ Recursively apply the above steps to the sub-array of elements with smaller values and separately to the sub-array of elements with greater values
➔ Divide & Conquer
◆ Divide: Partition T[i..j] into T[i..k-1] & T[k+1..j] such that each element of T[i..k-1] <= Pvt & each element of T[k+1..j] > Pvt
◆ Conquer: Sort the two sub-arrays T[i..k-1] & T[k+1..j] by recursive calls to quicksort
◆ Combine: Since the sub-arrays are sorted in place, no work is needed to combine them: the entire array T[i..j] is now sorted


Quick Sort: Algorithm

Quick_Sort(A, Start, End)
  if (Start < End)
    Pindex = Partition(A, Start, End)
    Quick_Sort(A, Start, Pindex-1)
    Quick_Sort(A, Pindex+1, End)


Partition(A, Start, End)
  pivot = A[End]; Pindex = Start
  for i = Start to End-1
    if (A[i] <= pivot)
      Swap(A[i], A[Pindex])
      Pindex = Pindex + 1
  Swap(A[Pindex], A[End])
  return Pindex
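In C, the partition-then-recurse scheme might be sketched like this (function names `quick_sort`, `partition`, and the helper `swap_int` are mine):

```c
#include <assert.h>

static void swap_int(int *x, int *y) { int t = *x; *x = *y; *y = t; }

/* Lomuto partition: A[end] is the pivot; returns its final index. */
static int partition(int A[], int start, int end) {
    int pivot = A[end];
    int pindex = start;
    for (int i = start; i < end; i++)
        if (A[i] <= pivot)
            swap_int(&A[i], &A[pindex++]);
    swap_int(&A[pindex], &A[end]);      /* put the pivot in its final spot */
    return pindex;
}

void quick_sort(int A[], int start, int end) {
    if (start < end) {
        int p = partition(A, start, end);
        quick_sort(A, start, p - 1);    /* elements <= pivot */
        quick_sort(A, p + 1, end);      /* elements > pivot */
    }
}
```

Sorting happens entirely inside the array, which is why no combine step is needed afterwards.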

Quick Sort: Analysis
➔ Time Complexity
◆ Best Case (balanced): O(n log n)
◆ Worst Case (already sorted / unbalanced): O(n²)
● Solution: Randomized Partition
◆ Average Case: O(n log n)
➔ Space Complexity
◆ An in-place algorithm
◆ Worst Case: O(n) (recursion stack)


Randomized_Partition(A, Start, End)
  pivotIndex = Random(Start, End)
  Swap(A[pivotIndex], A[End])
  return Partition(A, Start, End)
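A C sketch of the randomized variant, assuming a Lomuto partition as on the previous slides (helper names `swap_int`, `partition`, and `randomized_partition` are mine, and `rand()` stands in for `Random(Start, End)`):

```c
#include <assert.h>
#include <stdlib.h>

static void swap_int(int *x, int *y) { int t = *x; *x = *y; *y = t; }

/* Lomuto partition as in the slides: A[end] is the pivot. */
static int partition(int A[], int start, int end) {
    int pivot = A[end], pindex = start;
    for (int i = start; i < end; i++)
        if (A[i] <= pivot)
            swap_int(&A[i], &A[pindex++]);
    swap_int(&A[pindex], &A[end]);
    return pindex;
}

/* Pick a random pivot, move it to the end, then partition as usual. */
int randomized_partition(int A[], int start, int end) {
    int pivotIndex = start + rand() % (end - start + 1);
    swap_int(&A[pivotIndex], &A[end]);
    return partition(A, start, end);
}
```

Because the pivot is now random, no fixed input (such as an already-sorted array) can force the O(n²) worst case deterministically.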

Shell Sort
➔ Donald L. Shell (1959)
➔ A generalization of Insertion Sort
➔ We compare elements that are far apart rather than adjacent
➔ Comparison of elements: if there are N elements, then we start with a value gap < N
➔ In each pass, we keep reducing the value of gap till we reach the last pass, when gap is 1


➔ In the last pass: Shell Sort = Insertion Sort

[14 18 19 37 23 40 29 30 11] - A[ ]
  0  1  2  3  4  5  6  7  8  - Index

➔ Total Elements (N) = 9
➔ "gap" must be less than N
➔ gap = Floor[N/2]


➔ Here gap = 4 { Floor[9/2] }
➔ So, Pass = 1 & gap = 4
◆ First element at Index 0
◆ Second at Index 0+4 = 4
◆ Third at Index 4+4 = 8

[14 18 19 37 23 40 29 30 11] - A[ ]
  0  1  2  3  4  5  6  7  8  - Index


[14 18 19 37 23 40 29 30 11] - A[ ]
  0  1  2  3  4  5  6  7  8  - Index

➔ Is 14 > 23…? Is 18 > 40…? Is 19 > 29…? Is 37 > 30…? (Now Swap) Is 23 > 11…? (Now Swap) Is 14 > 11…? (Now Swap)

[11 18 19 30 14 40 29 37 23] - A[ ]


➔ Pass = 2 & gap = 2
◆ gap = Floor[4/2] = 2

[11 18 19 30 14 40 29 37 23] - A[ ]
  0  1  2  3  4  5  6  7  8  - Index

[11 18 14 30 19 37 23 40 29] - A[ ]
  0  1  2  3  4  5  6  7  8  - Index

➔ Pass = 3 & gap = 1
◆ gap = Floor[2/2] = 1
◆ Equivalent to Insertion Sort when gap = 1

[11 18 14 30 19 37 23 40 29] - A[ ]
  0  1  2  3  4  5  6  7  8  - Index

FINALLY SORTED:

[11 14 18 19 23 29 30 37 40] - A[ ]
  0  1  2  3  4  5  6  7  8  - Index

Shell Sort: Algorithm


Shell_Sort(A, Size)
  gap = Size/2
  while (gap > 0)
    i = gap
    while (i < Size)
      temp = A[i]
      for (j = i; j >= gap && A[j-gap] > temp; j = j-gap)
        A[j] = A[j-gap]
      A[j] = temp
      i = i+1
    gap = gap/2

Shell Sort: Algorithm


1. Calculate gap
2. While gap > 0
   FOR each element in the list, gap apart
     Extract the current item
     Locate the position to insert
     Insert the item at that position
   END FOR
3. Calculate gap
4. END While
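The pseudocode and steps above translate to a compact C sketch (the name `shell_sort` is mine; it uses the same gap = gap/2 sequence as the slides):

```c
#include <assert.h>

/* Shell sort: gapped insertion sort with a shrinking gap.
   The final gap = 1 pass is an ordinary insertion sort. */
void shell_sort(int a[], int n) {
    for (int gap = n / 2; gap > 0; gap /= 2) {
        for (int i = gap; i < n; i++) {
            int temp = a[i];
            int j;
            /* shift larger gap-apart elements right */
            for (j = i; j >= gap && a[j - gap] > temp; j -= gap)
                a[j] = a[j - gap];
            a[j] = temp;
        }
    }
}
```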

Shell Sort: Analysis
➔ Time Complexity
◆ Average Case: O(n^(5/4)) to O(n^(3/2))
◆ Worst Case: O(n²), as for Insertion Sort
◆ The exact complexity of this algorithm is still being debated
➔ Space Complexity
◆ In place
➔ Stable Sorting…?
◆ No, it doesn't preserve the relative order of duplicates
➔ Experience: not better than O(n log n)

Heap Sort


Heap Data Structure
➔ A special tree-based data structure
➔ It must be a complete binary tree


Heapify


Heapify(A, n, i)
  Largest = index of the largest among A[i], A[2i+1], A[2i+2]
  if (i != Largest)
    Swap(A[i], A[Largest])
    Heapify(A, n, Largest)


heapify(int arr[], int n, int i)
{
    int largest = i;
    int l = 2*i + 1;
    int r = 2*i + 2;
    if (l < n && arr[l] > arr[largest]) largest = l;
    if (r < n && arr[r] > arr[largest]) largest = r;
    if (largest != i) {
        swap(arr[i], arr[largest]);
        heapify(arr, n, largest);
    }
}


Heap Sort
➔ Max-Heap: the largest item is stored at the root node
➔ Remove the root node & put it at the end of the array
➔ Reduce the size of the heap by 1 and heapify the root element again so that we have the highest element at the root
➔ The process is repeated until all the items of the list are sorted


Heap Sort: Algorithm

// Build the max-heap first
for (int i = n/2 - 1; i >= 0; i--)
    heapify(arr, n, i);
// Repeatedly move the maximum to the end of the array
for (int i = n-1; i > 0; i--) {
    swap(arr[0], arr[i]);
    heapify(arr, i, 0);   // call max heapify on the reduced heap
}

Heap Sort: Analysis
➔ Time Complexity: O(n log n)
◆ The height of a complete binary tree containing n elements is log(n)
➔ Space Complexity
◆ In place
➔ Stable Sorting…?
◆ No, it doesn't preserve the relative order of duplicates
➔ Recursive


➔ Radix Sort
➔ Counting Sort
➔ Topological Sort
➔ Bucket Sort
➔ Comb Sort
➔ Cycle Sort
➔ Cocktail Sort
➔ Bitonic Sort
➔ Gnome Sort
➔ Sleep Sort

Thank You…!


…?
