34
Lecture3: Algorithm Analysis Bohyung Han CSE, POSTECH [email protected] CSED233: Data Structures (2014F)

Lecture3: Algorithm Analysis Bohyung Han CSE, POSTECH [email protected] CSED233: Data Structures (2014F)

Embed Size (px)

Citation preview

Page 1: Lecture3: Algorithm Analysis Bohyung Han CSE, POSTECH bhhan@postech.ac.kr CSED233: Data Structures (2014F)

Lecture3: Algorithm Analysis

Bohyung HanCSE, POSTECH

[email protected]

CSED233: Data Structures (2014F)

Page 2: Lecture3: Algorithm Analysis Bohyung Han CSE, POSTECH bhhan@postech.ac.kr CSED233: Data Structures (2014F)

2 CSED233: Data Structuresby Prof. Bohyung Han, Fall 2014

Algorithm

• A step-by-step procedure to solve a problem Start from an initial state and input Proceed through a finite number of successive states Stop when reaching a final state and producing output

Lamp doesn’t work.

Lamp plugged in?

Bulb burned

out?

Repair lamp.

Replace bulb.

Plug in lamp.

Page 3: Lecture3: Algorithm Analysis Bohyung Han CSE, POSTECH bhhan@postech.ac.kr CSED233: Data Structures (2014F)

3 CSED233: Data Structuresby Prof. Bohyung Han, Fall 2014

Algorithm Performance

• Algorithm performance is measured by the amount of com-puter memory and running time required to run a algorithm.

• Performance measurement Empirical analysis

• Compute memory space in use and running time• Results are not general—valid only for tested inputs.• Same machine environment must be used to compare two algorithms.

Theoretical analysis• Compute asymptotic bound of space and running time• Sometimes, it is difficult to measure average cases; worst case analysis are

often used.• Machine independent

Page 4: Lecture3: Algorithm Analysis Bohyung Han CSE, POSTECH bhhan@postech.ac.kr CSED233: Data Structures (2014F)

4 CSED233: Data Structuresby Prof. Bohyung Han, Fall 2014

Empirical Analysis

• Measurement criteria Actual memory space in use Running time of algorithms

• Example

Algorithm1 and 2 took 0.3 and 0.5 milliseconds for Input1, respec-tively.

Algorithm1 may be faster than Algorithm2.

Algorithm1 and 2 took 1.0 and 0.8 milliseconds for Input2, respec-tively.

How about now?

Algorithm1

Input Output

Algorithm2

Page 5: Lecture3: Algorithm Analysis Bohyung Han CSE, POSTECH bhhan@postech.ac.kr CSED233: Data Structures (2014F)

5 CSED233: Data Structuresby Prof. Bohyung Han, Fall 2014

Theoretical Analysis

• Measurement criteria Space: amount of memory in bytes that algorithm occupies Time

• Running time• Typically measured by the number of primitive operations• However, different primitive operations may take different running time!

• Algorithm analysis Use a pseudocode of the algorithm Need to take all possible inputs into account Evaluate the algorithm speed independent of the machine environ-

ment

Page 6: Lecture3: Algorithm Analysis Bohyung Han CSE, POSTECH bhhan@postech.ac.kr CSED233: Data Structures (2014F)

6 CSED233: Data Structuresby Prof. Bohyung Han, Fall 2014

Pseudocode

• High-level description of an algorithm Independent of any programming language More structured than English proses Less detailed than program codes Hiding program design issues Easy to understand

Algorithm arrayMax(A, n) Input array A of n integers Output maximum element of A

currentMax <- A[0] for i <- 1 to n - 1 do if A[i] >= currentMax then currentMax <- A[i] return currentMax

Page 7: Lecture3: Algorithm Analysis Bohyung Han CSE, POSTECH bhhan@postech.ac.kr CSED233: Data Structures (2014F)

7 CSED233: Data Structuresby Prof. Bohyung Han, Fall 2014

Space Complexity

• Amount of necessary memory for an algorithm The space complexity may define an upper bound on the data that the

algorithm uses.

• Why do we care about space complexity? We may not have sufficient memory space in our computer. When we solve a large-scale problem, memory space is often a critical

bottleneck.

Page 8: Lecture3: Algorithm Analysis Bohyung Han CSE, POSTECH bhhan@postech.ac.kr CSED233: Data Structures (2014F)

8 CSED233: Data Structuresby Prof. Bohyung Han, Fall 2014

Space Complexity

• Data space depends on Computer architecture and compiler Implementation

Algorithm: e.g., dense matrix vs. sparse matrix

double array1[100];int array2[100];

// The size of array1 is twice bigger than that of array2.

4 8 1

(index, value) {(2,4), (5,8), (9,1)}

Dense matrix:

Sparse matrix:

10 integers

6 integers

Page 9: Lecture3: Algorithm Analysis Bohyung Han CSE, POSTECH bhhan@postech.ac.kr CSED233: Data Structures (2014F)

9 CSED233: Data Structuresby Prof. Bohyung Han, Fall 2014

Time Complexity

• Amount of time required to run an algorithm

• Why do we care about time complexity? Some applications require real-time responses. If there are many solutions for a problem, we typically prefer the

fastest one.

• How do we measure? Count a particular operation (operation counts)

e.g., comparisons in sorting Count the number of steps (step counts) Asymptotic complexity

Page 10: Lecture3: Algorithm Analysis Bohyung Han CSE, POSTECH bhhan@postech.ac.kr CSED233: Data Structures (2014F)

10 CSED233: Data Structuresby Prof. Bohyung Han, Fall 2014

Example: Insertion Sort

• How many comparisons are made?

3 1 22

n=3

i=1

1 3 22

i=2

1 2 23

j=0

Algorithm insertionSort(A, n) Input array A of n integers Output the sorted array A

for i <- 1 to n - 1 do t <- A[i] for j <- i-1 to 0 do if t < A[j] then A[j+1] <- A[j] else break A[j] <- t return A

Comparison happens here.

Page 11: Lecture3: Algorithm Analysis Bohyung Han CSE, POSTECH bhhan@postech.ac.kr CSED233: Data Structures (2014F)

11 CSED233: Data Structuresby Prof. Bohyung Han, Fall 2014

Worst Case Analysis

• Why using the worst case? Average case is sometimes difficult to analyze. The time complexity of the worst case is often very important.

• An example: insertion sort. All elements are reversely sorted. Total number of comparisons:

Algorithm insertionSort(A, n) Input array A of n integers Output the sorted array A

for i <- 1 to n - 1 do t <- A[i] for j <- i-1 to 0 do if t < A[j] then A[j+1] <- A[j] else break A[j] <- t return A

1+2+…+(𝑛−1 )=𝑛 (𝑛−1 )2

Page 12: Lecture3: Algorithm Analysis Bohyung Han CSE, POSTECH bhhan@postech.ac.kr CSED233: Data Structures (2014F)

12 CSED233: Data Structuresby Prof. Bohyung Han, Fall 2014

Primitive Operations

• Basic computations performed by an algorithm Constant time assumed for execution Identifiable in pseudocode

• Largely independent from the programming language• Examples

Assign a value, a <- b Comparison, a < b Returning from a method

Algorithm insertionSort(A, n) Input array A of n integers Output the sorted array A

for i <- 1 to n - 1 do t <- A[i] for j <- i-1 to 0 do if t < A[j] then A[j+1] <- A[j] else break A[j] <- t return A

Page 13: Lecture3: Algorithm Analysis Bohyung Han CSE, POSTECH bhhan@postech.ac.kr CSED233: Data Structures (2014F)

13 CSED233: Data Structuresby Prof. Bohyung Han, Fall 2014

Primitive Operation Counts

• By inspecting the pseudocode, we can determine the maxi-mum number of primitive operations executed by an algo-rithm, as a function of the input size.

Algorithm insertionSort(A, n) Input array A of n integers Output the sorted array A

for i <- 1 to n - 1 do t <- A[i] for j <- i-1 to 0 do if t < A[j] then A[j+1] <- A[j] else break A[j] <- t return A

# operations

0102011

total # operations

0n-10

(n-1)n0

n-11

Total

Page 14: Lecture3: Algorithm Analysis Bohyung Han CSE, POSTECH bhhan@postech.ac.kr CSED233: Data Structures (2014F)

14 CSED233: Data Structuresby Prof. Bohyung Han, Fall 2014

Running Time Analysis

• Running time, Computational complexity of an algorithm in a theoretical model Example

• The total number of operations of insertion sort is .• The theoretical running time may be bounded by other functions:

where and are times taken by the fastest and slowest primitive opera-tions.

• We are interested in the growth rate of the running time.

Page 15: Lecture3: Algorithm Analysis Bohyung Han CSE, POSTECH bhhan@postech.ac.kr CSED233: Data Structures (2014F)

15 CSED233: Data Structuresby Prof. Bohyung Han, Fall 2014

Growth Functions

• Constant Growth is independent of the input size,

e.g., accessing array element at known location, assignment

• Linear Growth is directly proportional to

e.g., finding a particular array element (linear search)

• Logarithmic Growth increases slowly compared to

e.g., finding particular array element (binary search)

Page 16: Lecture3: Algorithm Analysis Bohyung Han CSE, POSTECH bhhan@postech.ac.kr CSED233: Data Structures (2014F)

16 CSED233: Data Structuresby Prof. Bohyung Han, Fall 2014

Growth Functions

• Quadratic

e.g., typical in nested loops (bubble sort)

• Cubic

e.g., matrix multiplication

• Exponential Growth is extremely rapid and possibly impractical

e.g., Towers of Hanoi (), Fibonacci number ()

Page 17: Lecture3: Algorithm Analysis Bohyung Han CSE, POSTECH bhhan@postech.ac.kr CSED233: Data Structures (2014F)

17 CSED233: Data Structuresby Prof. Bohyung Han, Fall 2014

Growth Functions

• Other polynomial

e.g., typical in more nested loops

• Others

e.g., sorting using divide and conquer approach

Page 18: Lecture3: Algorithm Analysis Bohyung Han CSE, POSTECH bhhan@postech.ac.kr CSED233: Data Structures (2014F)

18 CSED233: Data Structuresby Prof. Bohyung Han, Fall 2014

A Table of Growth Functions

• Overly complex algorithm may be impractical in some system

012345

1 2 4 8

16 32

0 2 8

24 64

160

1 4

16 64

256 1024

1 8

64 512

4096 32768

24

16256

655364294967296

Page 19: Lecture3: Algorithm Analysis Bohyung Han CSE, POSTECH bhhan@postech.ac.kr CSED233: Data Structures (2014F)

19 CSED233: Data Structuresby Prof. Bohyung Han, Fall 2014

A Graph of Growth Functions

Page 20: Lecture3: Algorithm Analysis Bohyung Han CSE, POSTECH bhhan@postech.ac.kr CSED233: Data Structures (2014F)

20 CSED233: Data Structuresby Prof. Bohyung Han, Fall 2014

Asymptotic Complexity

• Asymptotic Definition: approaching a given value as an expression containing a

variable tends to infinity

• Asymptotic complexity Describe the growth rate of the time or space complexity with respect

to input size Is not directly related to the exact running time

• Three notations Big O (O) notation provides an upper bound for the growth rate of the

given function. Big Omega (Ω) notation provides a lower bound. Big Theta (Q) notation is used when an algorithm is bounded both

above and below by the same kind of function.

Page 21: Lecture3: Algorithm Analysis Bohyung Han CSE, POSTECH bhhan@postech.ac.kr CSED233: Data Structures (2014F)

21 CSED233: Data Structuresby Prof. Bohyung Han, Fall 2014

0 1 2 3 4 5 60

20

40

60

80

100

120

3*n^2-5

n^2

Big O

Note: and may not unique.

• Example .

𝑐=4∧𝑛0=33𝑛2−5≤ 4𝑛2

Page 22: Lecture3: Algorithm Analysis Bohyung Han CSE, POSTECH bhhan@postech.ac.kr CSED233: Data Structures (2014F)

22 CSED233: Data Structuresby Prof. Bohyung Han, Fall 2014

More Big O Examples

for

for

for

Page 23: Lecture3: Algorithm Analysis Bohyung Han CSE, POSTECH bhhan@postech.ac.kr CSED233: Data Structures (2014F)

23 CSED233: Data Structuresby Prof. Bohyung Han, Fall 2014

0 1 2 3 4 5 60

10

20

30

40

50

60

0.5n^2-1

n^2

Big Omega

Note: and may not unique.

• Example

𝑐=0.4∧𝑛0=00 .5𝑛2−1≥0.4𝑛2

Page 24: Lecture3: Algorithm Analysis Bohyung Han CSE, POSTECH bhhan@postech.ac.kr CSED233: Data Structures (2014F)

24 CSED233: Data Structuresby Prof. Bohyung Han, Fall 2014

Big Theta

for Note: and may not unique.

• The total number of primitive operations of insertion sort is . The upper bound is The lower bound is Therefore,

Page 25: Lecture3: Algorithm Analysis Bohyung Han CSE, POSTECH bhhan@postech.ac.kr CSED233: Data Structures (2014F)

25 CSED233: Data Structuresby Prof. Bohyung Han, Fall 2014

Caveat

• The relative behavior of two function is compared only asymptotically, for large .

• The asymptotic complexity may make no sense for small .

• Example: two algorithms with the same complexity, Algorithm A executes primitive operations in a loop. Algorithm B executes primitive operations in a loop. Then the algorithm A is faster than the algorithm B although their as-

ymptotic complexities are same.

• Example: two algorithms with and With a small number of inputs, algorithm with quadratic complexity

may be faster than algorithm with linear complexity. Think about the definition of Big O notation.

Page 26: Lecture3: Algorithm Analysis Bohyung Han CSE, POSTECH bhhan@postech.ac.kr CSED233: Data Structures (2014F)

26 CSED233: Data Structuresby Prof. Bohyung Han, Fall 2014

Example

int Example1(int[][] a){ int sum = 0;

for (int i=0; i<a.length; i++) { for (int j=0; j<i; j++) { sum = sum + a[i][j]; } }

return sum;}

Number of operations

0+1+⋯+(𝑛−1 )=𝑛(𝑛−1)2

: number of rows

Page 27: Lecture3: Algorithm Analysis Bohyung Han CSE, POSTECH bhhan@postech.ac.kr CSED233: Data Structures (2014F)

27 CSED233: Data Structuresby Prof. Bohyung Han, Fall 2014

Example

int Example2(int[] a){ int sum = 0;

for (int i=a.length; i>0; i/=2) { sum = sum + a[i]; }

return sum;}

1+1+⋯+1⟹𝑂 (log𝑛)

⌊ log2𝑛 ⌋

: number of elements

Number of operations

Page 28: Lecture3: Algorithm Analysis Bohyung Han CSE, POSTECH bhhan@postech.ac.kr CSED233: Data Structures (2014F)

28 CSED233: Data Structuresby Prof. Bohyung Han, Fall 2014

Example

Number of operations

log𝑛+ log𝑛+⋯+ log𝑛

𝑛

: number of rows

int Example3(int[] a){ int sum = 0;

for (int i=0; i<a.length; i++) { for (int j=a.length; j>0; j/=2) { sum = sum + a[i] + a[j]; } }

return sum;}

Page 29: Lecture3: Algorithm Analysis Bohyung Han CSE, POSTECH bhhan@postech.ac.kr CSED233: Data Structures (2014F)

29 CSED233: Data Structuresby Prof. Bohyung Han, Fall 2014

Example

int Example4(int[][] a){ int sum = 0;

for (int i=0; i<a.length; i++) { for (int j=0; j<100; j++) { sum = sum + a[i][j]; } }

return sum;}

Number of operations

100+100+⋯+100=100𝑛

𝑛

: number of rows

Page 30: Lecture3: Algorithm Analysis Bohyung Han CSE, POSTECH bhhan@postech.ac.kr CSED233: Data Structures (2014F)

30 CSED233: Data Structuresby Prof. Bohyung Han, Fall 2014

Example

int Example5(int n){ int sum1 = 0; int sum2 = 0;

for (int i=0; i<n; i++) { sum1 += i; for (int j=10*n; j<100*n; j++) { sum2 += sum1; } }

return sum;}

Number of operations

90𝑛+90𝑛+⋯+90𝑛=90𝑛2

𝑛

Page 31: Lecture3: Algorithm Analysis Bohyung Han CSE, POSTECH bhhan@postech.ac.kr CSED233: Data Structures (2014F)

31 CSED233: Data Structuresby Prof. Bohyung Han, Fall 2014

Example

int[][] multiplySquareMatrices(int[][] a, int[][] b){ int[][] c = new int[a.length][a.length]; int n = a.length;

for (int i=0; i<n; i++) { for (int j=0; j<n; j++) { for (int k=0; k<n; k++) { c[i][j] = c[i][j] + a[i][k] * b[k][j]; } } }

return c;}

Number of operations

𝑛2+𝑛2+⋯+𝑛2=𝑛3⟹𝑂 (𝑛3)

𝑛

Page 32: Lecture3: Algorithm Analysis Bohyung Han CSE, POSTECH bhhan@postech.ac.kr CSED233: Data Structures (2014F)

32 CSED233: Data Structuresby Prof. Bohyung Han, Fall 2014

Trade-off

• Space complexity and time complexity may not be indepen-dent. There is a trade-off between the two complexities. Algorithm A may be faster than Algorithm B, but consume memory

space more than Algorithm B.

Space complexity Time complexity of A[i]

Dense matrix 4*10 bytes O(1)

Sparse matrix 4*6 bytes O(n)

Dense matrix A:

Sparse matrix A: (index, value) {(2,4), (5,8), (9,1)}

4 8 1

Page 33: Lecture3: Algorithm Analysis Bohyung Han CSE, POSTECH bhhan@postech.ac.kr CSED233: Data Structures (2014F)

33 CSED233: Data Structuresby Prof. Bohyung Han, Fall 2014

Summary

• Complexity analysis mostly involves examining loops.• Attempt to characterize growth of functions.

This will be important when we examine searching and sorting

• Big O notation helps us simplify complexity definitions – worst case It ignores constant and lower order terms. Big Omega (Ω) examines lower bounds. Big Theta (Θ) examines specific case.

• Computer scientists often use proofs to defend complexityclaims.

Page 34: Lecture3: Algorithm Analysis Bohyung Han CSE, POSTECH bhhan@postech.ac.kr CSED233: Data Structures (2014F)

34