Albert Chan, http://www.scs.carleton.ca/~achan

School of Computer Science, Carleton University

COMP 2002/2402 Introduction to Data Structures and Data Types

Version 03.s

Analysis Tools

• Experimental Studies

• Pseudo-Code

• Mathematical Review

• Analysis of Algorithms

• Asymptotic Analysis

Analysis Goals

• The goals of analyzing data structures and algorithms are to study:

– The running time

– The resource requirements

• We want our data structures and algorithms to run as fast as possible and to use as few resources as possible.

• But we need analysis to confirm that our data structures and algorithms are “good”.

Experimental Studies

• Method: implement the algorithms and observe speed.
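A minimal sketch of such an experiment in Java, under stated assumptions: the class name TimingExperiment, the helper timeOnce, and the choice of System.nanoTime() as the clock are illustrative and not part of the course material. The measured times vary with hardware, operating system, and JVM, so no particular output is implied.

```java
import java.util.Random;

class TimingExperiment {
    // Keeps the result alive so the call below is not optimized away (illustrative precaution).
    static int sink;

    // The algorithm under test: scan the array and keep the largest value seen.
    static int arrayMax(int[] a, int n) {
        int currentMax = a[0];
        for (int i = 1; i < n; i++) {
            if (currentMax < a[i]) {
                currentMax = a[i];
            }
        }
        return currentMax;
    }

    // Returns the elapsed wall-clock time, in nanoseconds, of one call on n random integers.
    static long timeOnce(int n, long seed) {
        int[] data = new Random(seed).ints(n).toArray();
        long start = System.nanoTime();
        sink = arrayMax(data, n);
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // Observe how the running time changes as the input size grows.
        for (int n = 1_000; n <= 1_000_000; n *= 10) {
            System.out.println("n = " + n + ": " + timeOnce(n, 42L) + " ns");
        }
    }
}
```

A single timing like this is noisy; repeating each measurement several times and comparing runs on the same machine gives a steadier picture.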

• In general, running time increases with growing input size.

• Running time depends on hardware.

• Running time also depends on the operating system (including different versions of the same operating system), the compiler, etc.

• Experiments can only be done on a limited number of test cases.

• It is often difficult to compare two algorithms due to:

– Different hardware

– Different operating systems

– Different compilers

• It also requires an implementation before experiments can be done.

Looking for a Better Method

• We want a methodology for analyzing the running time of an algorithm that:

– Takes into account all possible inputs.

– Allows us to evaluate the relative efficiency of any two algorithms in a way that is independent of the hardware and software environment.

– Can be performed by studying a high-level description of the algorithm without actually implementing it.

• This introduces the concept of pseudo-code.

Pseudo-Code

• Pseudo-code is a mixture of natural language and high-level programming constructs.

• There is no precise definition of pseudo-code.

• The following slides show an example of pseudo-code and the corresponding Java code.

• This example computes the maximum value in an array A of n integers.

Pseudo-Code Example

Algorithm arrayMax(A, n):
    Input: An array A storing n integers.
    Output: The maximum element in A.

    let currentMax ← A[0]
    for i ← 1 to n-1 do
        if currentMax < A[i] then
            let currentMax ← A[i]
    return currentMax

Java Code Example

public class ArrayMaxProgram {
    // Test program for an algorithm that finds the maximum element in an array.

    // Finds the maximum element in array A of n integers by scanning the cells
    // of A while keeping track of the maximum element encountered.
    static int arrayMax(int[] A, int n) {
        int currentMax = A[0];            // executed once
        for (int i = 1; i < n; i++) {     // executed once, n times, n-1 times, resp.
            if (currentMax < A[i]) {      // executed n-1 times
                currentMax = A[i];        // executed at most n-1 times
            }
        }
        return currentMax;
    }

    // Testing method called when the program is executed.
    public static void main(String[] args) {
        int[] num = {10, 15, 3, 5, 56, 107, 22, 16, 85};
        int n = num.length;
        System.out.print("Array:");
        for (int i = 0; i < n; i++) {
            System.out.print(" " + num[i]);  // prints one element of the array
        }
        System.out.println(".");
        System.out.println("The maximum element is " + arrayMax(num, n) + ".");
    }
}

Rules for Pseudo-Code

• Expressions: We use standard mathematical symbols to write expressions. We use the left arrow sign (←) as the assignment operator (equivalent to the Java = operator), and we use the equal sign (=) as the equality relation in boolean expressions (equivalent to the == relation in Java).

• Method declarations: Algorithm name (param1, param2,...) declares a new method “name” and its parameters.

• Decision structures: if condition then true-actions [else false-actions]. We use indentation to indicate which actions belong to the true-actions and which to the false-actions.

• While-loops: while condition do actions.

• Repeat-loops: repeat actions until condition.

• For-loops: for variable-increment-definition do actions.

• We use indentation to indicate which actions belong to the body of a loop.

• Array indexing: A[i] represents the ith cell in the array A. The cells of an n-cell array A are indexed from A[0] to A[n-1]. This is consistent with Java.

• Method calls: object.method(args). “object.” is optional if it is understood.

• Method returns: return value. This operation returns the specified value to the method that called this one; the value is optional.

Mathematical Review

• Before we continue to discuss how to analyze an algorithm, we need a quick review of some mathematical rules.

• These rules will be used in our analysis.

Logarithms and Exponents

• $\log_b a = c \iff a = b^c$

• $\log_b (a/c) = \log_b a - \log_b c$

• $\log_b a^c = c \log_b a$

• $\log_b a = (\log_c a)/(\log_c b)$

• $b^{\log a} = a^{\log b}$

• $(b^a)^c = b^{ac}$

• $b^a b^c = b^{a+c}$

• $b^a / b^c = b^{a-c}$

• When the base is omitted, it is assumed to be 2: $\log n$ means $\log_2 n$.

Examples

• $\log(2n\log n) = 1 + \log n + \log\log n$

• $\log(n/2) = \log n - \log 2 = \log n - 1$

• $\log\sqrt{n} = \log(n^{1/2}) = (\log n)/2$

• $\log\log\sqrt{n} = \log((\log n)/2) = \log\log n - 1$

• $\log_4 n = (\log n)/\log 4 = (\log n)/2$

• $\log 2^n = n$

• $2^{\log n} = n$

• $2^{2\log n} = (2^{\log n})^2 = n^2$

• $4^n = (2^2)^n = 2^{2n}$

• $n^2 2^{3\log n} = n^2 n^3 = n^5$

• $4^n/2^n = 2^{2n}/2^n = 2^{2n-n} = 2^n$
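Several of these identities can be spot-checked numerically. The sketch below is only a sanity check at one sample value, not a proof; the class name LogIdentities, the helper close, and the tolerance 1e-9 are assumptions made for illustration.

```java
class LogIdentities {
    // Base-2 logarithm, matching the convention that "log" with no base means log base 2.
    static double log2(double x) {
        return Math.log(x) / Math.log(2.0);
    }

    // True when the two values agree up to a small relative floating-point tolerance.
    static boolean close(double x, double y) {
        double scale = Math.max(1.0, Math.max(Math.abs(x), Math.abs(y)));
        return Math.abs(x - y) <= 1e-9 * scale;
    }

    public static void main(String[] args) {
        double n = 1024.0;  // sample value chosen for illustration; any n > 1 works here
        // log(2n log n) = 1 + log n + log log n
        System.out.println(close(log2(2 * n * log2(n)), 1 + log2(n) + log2(log2(n))));
        // log sqrt(n) = (log n)/2
        System.out.println(close(log2(Math.sqrt(n)), log2(n) / 2));
        // log_4 n = (log n)/2
        System.out.println(close(log2(n) / log2(4), log2(n) / 2));
        // 2^(2 log n) = n^2
        System.out.println(close(Math.pow(2, 2 * log2(n)), n * n));
        // n^2 * 2^(3 log n) = n^5
        System.out.println(close(n * n * Math.pow(2, 3 * log2(n)), Math.pow(n, 5)));
    }
}
```

Each line should print true. Comparing with a tolerance rather than == is deliberate, since floating-point logarithms are rarely exact.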

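Summations

• Definition: $\sum_{i=s}^{t} f(i) = f(s) + f(s+1) + f(s+2) + \dots + f(t)$

• Arithmetic Series: $\sum_{i=1}^{n} i = 1 + 2 + 3 + \dots + (n-1) + n = \frac{n(n+1)}{2}$

• Geometric Series, given $a > 0$ and $a \ne 1$: $\sum_{i=0}^{n} a^i = 1 + a + a^2 + \dots + a^n = \frac{1 - a^{n+1}}{1 - a}$

• Example: if $a = 2$, then $\sum_{i=0}^{n} 2^i = 1 + 2 + 4 + 8 + \dots + 2^n = \frac{1 - 2^{n+1}}{1 - 2} = 2^{n+1} - 1$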
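The closed forms for the arithmetic and geometric series can be compared against term-by-term sums. This is a small sketch for checking, with hypothetical names (SummationCheck, arithmeticSum, geometricSum) and an arbitrary test range of n from 0 to 20.

```java
class SummationCheck {
    // 1 + 2 + ... + n, computed term by term.
    static long arithmeticSum(int n) {
        long sum = 0;
        for (int i = 1; i <= n; i++) {
            sum += i;
        }
        return sum;
    }

    // a^0 + a^1 + ... + a^n, computed term by term.
    static long geometricSum(long a, int n) {
        long sum = 0;
        long term = 1;  // a^0
        for (int i = 0; i <= n; i++) {
            sum += term;
            term *= a;
        }
        return sum;
    }

    public static void main(String[] args) {
        boolean allMatch = true;
        for (int n = 0; n <= 20; n++) {
            // Arithmetic series: n(n+1)/2
            allMatch &= arithmeticSum(n) == (long) n * (n + 1) / 2;
            // Geometric series with a = 2: 2^(n+1) - 1
            allMatch &= geometricSum(2, n) == (1L << (n + 1)) - 1;
        }
        System.out.println(allMatch);
    }
}
```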

Floor and Ceiling

• Floor: $\lfloor x \rfloor$ = largest integer ≤ x

• Ceiling: $\lceil x \rceil$ = smallest integer ≥ x

• Example:

– $\lfloor 3.6 \rfloor = 3$

– $\lceil 3.6 \rceil = 4$

Analysis of Algorithms

• Principle: count primitive operations in the pseudo-code.

• Assumption: all primitive operations take approximately the same time to execute.

• Primitive operations include:

– Assigning a value to a variable

– Calling a method

– Arithmetic operations (e.g. “+”, “-”, “*”, “/”, etc.)

– Comparing two numbers

– Indexing into an array

– Following an object reference

– Returning from a method

Counting Primitive Operations

• Primitive operations are similar to basic machine-level instructions.

• Running times of primitive operations are fairly similar.

• Therefore, counting the number of primitive operations gives an estimate of the running time that is independent of the machine architecture.

Algorithm Complexity

• The “time complexity” of an algorithm refers to the number of primitive operations it executes, which is proportional to its running time.

• Similarly, the “space complexity” of an algorithm is proportional to the maximum memory it uses (in bytes, kilobytes, or megabytes).

Example

• We use the arrayMax algorithm (from the Pseudo-Code Example slide, or page 101 of the textbook) as an example.

• Initializing variable currentMax to A[0] corresponds to two primitive operations (indexing into an array and assigning a value to a variable) and is executed only once, at the beginning of the algorithm. Thus, it contributes two units to the count.

• Total primitive operations so far: 2 + ...

• At the beginning of the for loop, counter i is initialized to 1. This action corresponds to executing one primitive operation (assigning a value to a variable).

• Total primitive operations so far: 2 + 1 + ...

• Before entering the body of the for loop, the condition i < n is verified. This action corresponds to executing one primitive operation (comparing two numbers). Since counter i starts at 1 and is incremented by 1 at the end of each iteration of the loop, the comparison i < n succeeds n-1 times. Thus, it contributes (n-1) units to the count.

• Total primitive operations so far: 2 + 1 + (n-1) + ...

• The body of the for loop is executed n-1 times (for values 1, 2, ..., n-1 of the counter). In each iteration, A[i] is compared with currentMax (two primitive operations, indexing and comparing), A[i] is possibly assigned to currentMax (two primitive operations, indexing and assigning), and the counter i is incremented (two primitive operations, summing and assigning). Hence, at each iteration of the loop, either four or six primitive operations are performed, depending on whether A[i] ≤ currentMax or A[i] > currentMax. Therefore, the body of the loop contributes between 4(n-1) and 6(n-1) units to the count.

• Total primitive operations so far:

– At least 2 + 1 + (n-1) + 4(n-1) + ...

– At most 2 + 1 + (n-1) + 6(n-1) + ...

• When i = n, the comparison i < n fails and the loop finishes. This contributes 1 unit to the count (comparing two numbers), and it is executed only once.

• Returning the value of variable currentMax corresponds to one primitive operation, and it is executed only once.

• Total primitive operations:

– At least 2 + 1 + (n-1) + 4(n-1) + 1 + 1

– At most 2 + 1 + (n-1) + 6(n-1) + 1 + 1

Conclusion of Example

• Therefore, the number of primitive operations t(n) executed by algorithm arrayMax is:

– At least 2 + 1 + (n-1) + 4(n-1) + 1 + 1 = 5n

– At most 2 + 1 + (n-1) + 6(n-1) + 1 + 1 = 7n - 2

• Does this mean the average number of primitive operations is 6n - 1?
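The counting argument can be replayed mechanically. The sketch below instruments arrayMax with a counter that follows the same accounting as the walkthrough (2 for the initialization, 1 for setting i, 1 per comparison i < n, 2 for the test against currentMax, 2 for a possible update, 2 for the increment, 1 for the return). The class name ArrayMaxOpCount and the method countOps are hypothetical.

```java
class ArrayMaxOpCount {
    // Runs arrayMax on a[0..n-1] and returns the number of primitive operations counted.
    static long countOps(int[] a, int n) {
        long ops = 0;
        int currentMax = a[0];
        ops += 2;                        // index A[0], assign currentMax
        int i = 1;
        ops += 1;                        // assign i
        while (true) {
            ops += 1;                    // compare i < n (the final, failing test is counted too)
            if (!(i < n)) {
                break;
            }
            ops += 2;                    // index A[i], compare with currentMax
            if (currentMax < a[i]) {
                currentMax = a[i];
                ops += 2;                // index A[i], assign currentMax
            }
            i++;
            ops += 2;                    // add 1 to i, assign i
        }
        ops += 1;                        // return currentMax
        return ops;
    }

    public static void main(String[] args) {
        int[] descending = {5, 4, 3, 2, 1};  // maximum comes first: currentMax is never updated
        int[] ascending = {1, 2, 3, 4, 5};   // every element is a new maximum
        System.out.println(countOps(descending, 5));  // 5n = 25
        System.out.println(countOps(ascending, 5));   // 7n - 2 = 33
    }
}
```

The two inputs show which arrays hit the bounds: the lower bound 5n occurs when the first element is already the maximum, and the upper bound 7n - 2 occurs when the array is strictly increasing.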

Average-Case and Worst Case Analysis

[Figure: bar chart of running time (1 ms to 5 ms) for inputs A through G, marking the worst-case time and the best-case time, and asking where the average-case time lies.]

• An algorithm may run faster on some inputs than on others of the same size.

• Over all possible inputs of the same size:

– Average-case time is the expected T(n) based on a given input distribution.

– Worst-case time is the worst possible T(n).

• Unless otherwise stated, when we say running time analysis in this course, we always refer to worst-case analysis.

Asymptotic Analysis

• By counting the number of primitive operations, we can estimate how fast an algorithm runs.

• But some questions remain:

– Is this level of detail really needed?

– How important is it to figure out the exact number of primitive operations?

– How carefully must we define the primitive operations? For example, how many operations are there in the statement “A[k] ← A[k] + (a*x)”? 3 or 5? Why?

Simplifying The Analysis

• We will introduce a “big-picture” approach.

• We will focus only on the growth rate of the running time as a function of n (the size of the input).

• That is, we are interested only in how the running time grows when the input size grows.

The “Big-Oh” Notation

• Definition: f(n) = O(g(n)) if there exist a real constant c > 0 and an integer constant $n_0 > 0$ such that f(n) ≤ c·g(n) for all integers $n \ge n_0$.

• You can think of f(n) = O(g(n)) as meaning that f(n) is less than or equal to g(n) up to some fixed constant factor c (and for $n \ge n_0$).

• If f(n) = O(g(n)), we say f(n) is at most the order of g(n).

“Big-Oh” Example

Some Principles

• Fixed constant factors don’t really matter (as long as they are not too HUGE) because of different possible hardware platforms, operating systems and compilers.

• Small values of n are not that important. We are only interested in the case $n \ge n_0$ (as long as $n_0$ is not unreasonably large).

Exercises

• Find the “Big-Oh” notation for the following functions:

– $5n - 1$

– $7n - 3$

– $20n^3 + 10n\log n + 5$

– $a_k n^k + a_{k-1} n^{k-1} + a_{k-2} n^{k-2} + \dots + a_1 n + a_0$

– $3\log n + \log\log n$

– $5/n$

Answers

• $5n - 1 = O(n)$ and $7n - 3 = O(n)$

• $20n^3 + 10n\log n + 5 = O(n^3)$

– Because $20n^3 + 10n\log n + 5 \le 35n^3$ for $n \ge 1$

• $a_k n^k + a_{k-1} n^{k-1} + a_{k-2} n^{k-2} + \dots + a_1 n + a_0 = O(n^k)$

– Because $a_k n^k + a_{k-1} n^{k-1} + \dots + a_1 n + a_0 \le (a_k + a_{k-1} + \dots + a_1 + a_0) n^k$ for $n \ge 1$ (taking every $a_i \ge 0$; in general, use $|a_i|$ on the right-hand side)

• $3\log n + \log\log n = O(\log n)$

– Because $3\log n + \log\log n \le 4\log n$ for $n \ge 2$

• $5/n = O(1/n)$

– Because $5/n \le 5(1/n)$ for $n \ge 1$
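The constants in these answers are witnesses (c, n0) for the Big-Oh definition, and a proposed witness can be tested over a finite range. The sketch below is an illustration under stated assumptions: the class BigOhWitness, the method holdsUpTo, and the upper limit 100,000 are all hypothetical choices. A finite check can refute a bad witness but cannot prove the bound for all n.

```java
import java.util.function.DoubleUnaryOperator;

class BigOhWitness {
    // Base-2 logarithm.
    static double log2(double x) {
        return Math.log(x) / Math.log(2.0);
    }

    // Returns true if f(n) <= c * g(n) for every integer n with n0 <= n <= limit.
    static boolean holdsUpTo(DoubleUnaryOperator f, DoubleUnaryOperator g,
                             double c, int n0, int limit) {
        for (int n = n0; n <= limit; n++) {
            if (f.applyAsDouble(n) > c * g.applyAsDouble(n)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        DoubleUnaryOperator f = n -> 20 * n * n * n + 10 * n * log2(n) + 5;
        DoubleUnaryOperator cubic = n -> n * n * n;
        System.out.println(holdsUpTo(f, cubic, 35, 1, 100_000));  // true: c = 35, n0 = 1
        System.out.println(holdsUpTo(f, cubic, 20, 1, 100_000));  // false: f(1) = 25 > 20

        DoubleUnaryOperator h = n -> 3 * log2(n) + log2(log2(n));
        System.out.println(holdsUpTo(h, BigOhWitness::log2, 4, 2, 100_000));  // true: c = 4, n0 = 2
    }
}
```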

Some Rules ...

• f(n) is O(a·f(n)) for any constant a > 0.

• If f(n) ≤ g(n) and g(n) is O(h(n)), then f(n) is O(h(n)).

• If f(n) is O(g(n)) and g(n) is O(h(n)), then f(n) is O(h(n)).

• f(n) + g(n) is O(max(f(n), g(n))).

• If g(n) is O(h(n)), then f(n) + g(n) is O(f(n) + h(n)).

• If g(n) is O(h(n)), then f(n)g(n) is O(f(n)h(n)).

• If f(n) is a polynomial of degree d (i.e. $f(n) = a_0 + a_1 n + \dots + a_d n^d$), then f(n) is $O(n^d)$.

• $n^x$ is $O(a^n)$ for any fixed x > 0 and a > 1.

• $\log n^x$ is $O(\log n)$ for any fixed x > 0.

• $\log^x n$ is $O(n^y)$ for any fixed constants x > 0 and y > 0.

Best Possible Upper Bound

• It is important that we always find the best possible upper bound.

• For example: $f(n) = 3n^3 + 3n^{3/4} + 7$

– We could say $f(n) = O(n^5)$ or $f(n) = O(n^4\log n)$

– But it is more accurate to say $f(n) = O(n^3)$.

Related Notations

• f(n) = Ω(g(n)) if there exist a real constant c’ > 0 and an integer constant $n_0' > 0$ such that f(n) ≥ c’·g(n) for all integers $n \ge n_0'$.

• If f(n) = Ω(g(n)) then g(n) = O(f(n)).

• f(n) = Θ(g(n)) if:

– f(n) = O(g(n)); and

– f(n) = Ω(g(n)).

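• f(n) = o(g(n)) if for every real constant c > 0 there exists an integer constant $n_0 > 0$ such that f(n) ≤ c·g(n) for all integers $n \ge n_0$.

• If f(n) = o(g(n)) then g(n) = ω(f(n)).

• Unlike Θ, these two strict notations cannot be combined: no positive function f(n) is both o(g(n)) and ω(g(n)), so there is no “little-theta” counterpart.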

Some Typical Running Times

• From best to worst:

– $O(\log n)$: Logarithmic, Good

– $O(n)$: Linear, Fair

– $O(n\log n)$: OK

– $O(n^2)$: Quadratic, Not too bad

– $O(n^k)$, k > 2: Polynomial, Bad

– $O(a^n)$, a > 1: Exponential, Terrible

Running Time Examples

n       log n   √n     n log n   n^2         n^3             2^n
2       1       1.4    2         4           8               4
4       2       2      8         16          64              16
8       3       2.8    24        64          512             256
16      4       4      64        256         4,096           65,536
32      5       5.7    160       1,024       32,768          4,294,967,296
64      6       8      384       4,096       262,144         1.84×10^19
128     7       11     896       16,384      2,097,152       3.40×10^38
256     8       16     2,048     65,536      16,777,216      1.15×10^77
512     9       23     4,608     262,144     134,217,728     1.34×10^154
1,024   10      32     10,240    1,048,576   1,073,741,824   1.79×10^308

More Examples

• Given 5 algorithms:

– Algorithm A: $400n = O(n)$

– Algorithm B: $20n\log n = O(n\log n)$

– Algorithm C: $2n^2 = O(n^2)$

– Algorithm D: $n^4 = O(n^4)$

– Algorithm E: $2^n = O(2^n)$

• Assume a machine that can execute 1,000,000 instructions per second.

Maximum Problem Sizes

• The maximum problem size (m) for a given running time is shown in the following table:

Running Time   Maximum Problem Size (m)
               1 second   1 minute   1 hour
400n           2,500      150,000    9,000,000
20n log n      4,096      166,666    7,826,087
2n^2           707        5,477      42,426
n^4            31         88         244
2^n            19         25         31
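The table entries can be recomputed by searching for the largest n whose operation count fits in the time budget. The sketch below assumes the stated machine speed of 1,000,000 instructions per second and an increasing cost function; the class MaxProblemSize and the method maxSize are hypothetical names. It covers the rows with exact costs; the n log n entries in the table look like rounded estimates, so an exact search may return somewhat different values for that row.

```java
import java.util.function.LongToDoubleFunction;

class MaxProblemSize {
    // Largest n >= 1 with cost(n) <= budget, assuming cost is increasing and cost(1) <= budget.
    static long maxSize(LongToDoubleFunction cost, double budget) {
        long lo = 1;
        long hi = 2;
        // Double hi until it no longer fits in the budget.
        while (cost.applyAsDouble(hi) <= budget) {
            lo = hi;
            hi *= 2;
        }
        // Binary search with the invariant cost(lo) <= budget < cost(hi).
        while (hi - lo > 1) {
            long mid = lo + (hi - lo) / 2;
            if (cost.applyAsDouble(mid) <= budget) {
                lo = mid;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    public static void main(String[] args) {
        double opsPerSecond = 1_000_000.0;
        double[] seconds = {1, 60, 3600};
        for (double s : seconds) {
            double budget = opsPerSecond * s;
            System.out.println(s + " s: "
                + "400n -> " + maxSize(n -> 400.0 * n, budget)
                + ", 2n^2 -> " + maxSize(n -> 2.0 * n * n, budget)
                + ", n^4 -> " + maxSize(n -> Math.pow(n, 4), budget)
                + ", 2^n -> " + maxSize(n -> Math.pow(2, n), budget));
        }
    }
}
```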

A Faster Machine?

• What if we upgrade to a machine that is 256 times faster?

Running Time   New Maximum Problem Size
400n           256m
20n log n      approximately 256m((log m)/(7 + log m))
2n^2           16m
n^4            4m
2^n            m + 8

• Caution: Beware of “astronomical” constants.
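The exact rows of this table follow from solving f(m') = 256·f(m), where m is the old maximum problem size and m' (notation introduced here only for this derivation) is the new one, treating the time budget as used up exactly. The n log n row has no simple closed form and is left out of this sketch.

```latex
\begin{align*}
400\,m' &= 256 \cdot 400\,m      &&\Rightarrow\quad m' = 256\,m \\
2\,(m')^2 &= 256 \cdot 2\,m^2    &&\Rightarrow\quad m' = \sqrt{256}\,m = 16\,m \\
(m')^4 &= 256 \cdot m^4          &&\Rightarrow\quad m' = 256^{1/4}\,m = 4\,m \\
2^{m'} &= 256 \cdot 2^{m} = 2^{m+8} &&\Rightarrow\quad m' = m + 8
\end{align*}
```

The last line is the striking one: for an exponential-time algorithm, a machine 256 times faster only adds 8 to the largest solvable problem size.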

Example: Prefix Average

• Given: An array x[0…n-1] of n integers.

• Compute: An array A[0…n-1] where $A[i] = \dfrac{\sum_{j=0}^{i} x[j]}{i+1}$

• That is:

– A[0] = x[0]

– A[1] = (x[0] + x[1]) / 2

– A[2] = (x[0] + x[1] + x[2]) / 3

– …

Prefix Average

• The Prefix Average problem arises in many applications.

• One example is mutual fund evaluation, in which x[i] is the return in year i, and A[i] is the average annual return up to and including year i.

Quadratic-Time Implementation

Algorithm prefixAverage1(X, n):
    Input: An n-element array X of numbers.
    Output: An n-element array A of numbers such that A[i] is the
            average of elements X[0], …, X[i].

    let A be an array of n numbers
    for i ← 0 to n-1 do
        a ← 0
        for j ← 0 to i do
            a ← a + X[j]
        A[i] ← a / (i+1)
    return array A

Analysis

• Initializing array A at the beginning and returning array A at the end can be done with a constant number of primitive operations per element and takes O(n) time.

• There are two nested for loops, controlled by counters i and j, respectively. The body of the outer loop, controlled by counter i, is executed n times, for i = 0, …, n-1. Thus, the statements a ← 0 and A[i] ← a / (i+1) are executed n times each. This implies that these two statements, plus the incrementing and testing of counter i, contribute a number of primitive operations proportional to n, that is, O(n) time.

• The body of the inner loop, controlled by counter j, is executed i+1 times, depending on the current value of the outer loop counter i. Thus, the statement a ← a + X[j] in the inner loop is executed 1+2+3+…+n times. Since 1+2+3+…+n = n(n+1)/2, the statement in the inner loop contributes $O(n^2)$ time. A similar argument applies to the primitive operations associated with incrementing and testing counter j, which also take $O(n^2)$ time.

• Therefore the total running time is $T(n) = O(n) + O(n^2) = O(n^2)$.

Linear-Time Implementation

Algorithm prefixAverage2(X, n):
    Input: An n-element array X of numbers.
    Output: An n-element array A of numbers such that A[i] is the
            average of elements X[0], …, X[i].

    let A be an array of n numbers
    let s ← 0
    for i ← 0 to n-1 do
        s ← s + X[i]
        A[i] ← s / (i+1)
    return array A

Analysis

• Initializing array A at the beginning and returning array A at the end can be done with a constant number of primitive operations per element and takes O(n) time.

• Initializing variable s at the beginning takes O(1) time.

• There is a single for loop, controlled by counter i. The body of the loop is executed n times, for i = 0, …, n-1. Therefore, the statements s ← s + X[i] and A[i] ← s / (i+1) are executed n times each. This implies that these two statements, plus the incrementing and testing of counter i, contribute a number of primitive operations proportional to n, that is, O(n) time.

• Therefore the total running time is T(n) = O(n) + O(1) + O(n) = O(n).
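For comparison, both prefix-average algorithms can be written directly in Java. This is a sketch under stated assumptions: the class name PrefixAverages is hypothetical, the arrays hold double values so that averages are not truncated, and the array length is used in place of the explicit parameter n.

```java
import java.util.Arrays;

class PrefixAverages {
    // Quadratic time: recompute each prefix sum from scratch (prefixAverage1).
    static double[] prefixAverage1(double[] x) {
        int n = x.length;
        double[] a = new double[n];
        for (int i = 0; i < n; i++) {
            double sum = 0;
            for (int j = 0; j <= i; j++) {
                sum += x[j];
            }
            a[i] = sum / (i + 1);
        }
        return a;
    }

    // Linear time: carry the running sum s from one iteration to the next (prefixAverage2).
    static double[] prefixAverage2(double[] x) {
        int n = x.length;
        double[] a = new double[n];
        double s = 0;
        for (int i = 0; i < n; i++) {
            s += x[i];
            a[i] = s / (i + 1);
        }
        return a;
    }

    public static void main(String[] args) {
        double[] x = {10, 20, 30, 40};
        System.out.println(Arrays.toString(prefixAverage1(x)));  // [10.0, 15.0, 20.0, 25.0]
        System.out.println(Arrays.toString(prefixAverage2(x)));  // [10.0, 15.0, 20.0, 25.0]
    }
}
```

The only difference is that the second version reuses the sum of X[0], …, X[i-1] instead of recomputing it, which is what removes the inner loop.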

Justification Techniques

• We also need to justify our claims about the correctness and the running time of our algorithms.

• Common techniques:

– By Example

– The Contra Attack: Contrapositive, Contradiction

– Mathematical Induction

– Loop Invariants

By Example

• Give a counterexample to prove that a claim is incorrect.

• Example: if someone claims that all integers of the form $2^i - 1$ are prime numbers, we can show that this statement is incorrect by giving the counterexample i = 4, since $2^4 - 1 = 15$ is not a prime number.
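A counterexample like this can also be found by brute force. The sketch below uses hypothetical names (MersenneCounterexample, firstCounterexample) and starts the search at i = 2, an assumption made because 2^1 - 1 = 1 is not prime by convention and would be a trivial answer.

```java
class MersenneCounterexample {
    // Trial-division primality test; adequate for the small values used here.
    static boolean isPrime(long m) {
        if (m < 2) {
            return false;
        }
        for (long d = 2; d * d <= m; d++) {
            if (m % d == 0) {
                return false;
            }
        }
        return true;
    }

    // Smallest i in [from, to] for which 2^i - 1 is not prime, or -1 if there is none.
    // Requires to <= 62 so that 2^i fits in a long.
    static int firstCounterexample(int from, int to) {
        for (int i = from; i <= to; i++) {
            if (!isPrime((1L << i) - 1)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(firstCounterexample(2, 30));  // 4, since 2^4 - 1 = 15 = 3 * 5
    }
}
```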

Contrapositive

• To justify the statement “if p is true, then q is true”, we instead establish that “if q is not true, then p is not true.” These two statements are logically equivalent.

• The second statement (“if q is not true, then p is not true”) is called the contrapositive of the first statement (“if p is true, then q is true”).

• That is: (p → q) ⇔ (¬q → ¬p)

Contrapositive Example

• Example: if ab is odd, then either a is odd or b is even.

• Justification: To justify the claim, consider the contrapositive, “if a is even and b is odd, then ab is even.” So suppose a = 2k for some integer k. Then ab = (2k)b = 2(kb); hence ab is even.

• Since the contrapositive holds, the original claim holds as well.

Contradiction

• Assume the statement we want to justify is false. Then we show that this assumption leads to a contradiction.

• So the original statement must be true.

Contradiction Example

• Example: if ab is odd, then a is odd or b is even.

• Justification: Suppose the statement is false, that is, ab is odd but a is even and b is odd. Since a is even, we have a = 2k for some integer k. Hence ab = (2k)b = 2(kb), that is, ab is even. But this is a contradiction: ab cannot simultaneously be odd and even. Therefore either a is odd or b is even.

Mathematical Induction

• A technique to prove that a statement P(n) is true for all positive integers n ≥ 1.

• It can be generalized to prove that a statement P(n) is true for all integers $n \ge n_0$ (we’ll use $n_0$ in the following slides).

• Mathematical induction includes three steps …

• Base Case: show that P(n) is true for $n = n_0$.

• Induction Step: show that if P(n) is true for $n = n_0, \dots, k$, then P(n) is also true for n = k+1.

• Conclusion Step: combining the above two steps, we can conclude that P(n) is true for all integers $n \ge n_0$.

Mathematical Induction Example

• Definition: Fibonacci numbers:

– F(0) = 0

– F(1) = 1

– F(n) = F(n-1) + F(n-2) for n ≥ 2

• Theorem: $F(n) < 2^n$ for all integers n ≥ 0.

Proof

• Our statement P(n): $F(n) < 2^n$.

• Base Cases:

– n = 0: $F(0) = 0 < 1 = 2^0$. P(0) is true.

– n = 1: $F(1) = 1 < 2 = 2^1$. P(1) is true.

• Induction:

– Assume P(0), P(1), …, P(k) are all true for some k ≥ 1; we want to show that P(k+1) is also true.

– $F(k+1) = F(k) + F(k-1) < 2^k + 2^{k-1} < 2^k + 2^k = 2 \cdot 2^k = 2^{k+1}$.

– Therefore P(k+1) is true.

• Combining the base cases and the induction step, we can conclude that P(n): $F(n) < 2^n$ is true for all integers n ≥ 0.
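The induction proof covers every n, but a quick numeric check for small n is a useful sanity test of the statement itself. The sketch below uses hypothetical names (FibonacciBound, boundHoldsUpTo) and is limited to n ≤ 62 so that 2^n fits in a Java long; it illustrates the theorem and is not a substitute for the proof.

```java
class FibonacciBound {
    // Returns true if F(n) < 2^n for every n with 0 <= n <= maxN (requires maxN <= 62).
    static boolean boundHoldsUpTo(int maxN) {
        long prev = 0;  // F(0)
        long curr = 1;  // F(1)
        for (int n = 0; n <= maxN; n++) {
            // At this point prev == F(n) and curr == F(n+1).
            if (prev >= (1L << n)) {
                return false;
            }
            long next = prev + curr;
            prev = curr;
            curr = next;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(boundHoldsUpTo(62));  // true
    }
}
```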

Loop Invariants

• This technique is usually used to prove the correctness of an algorithm, especially one that uses loops (for-loops, while-loops).

• We have to establish a statement related to the loop (the loop invariant) and prove that the statement is true at the beginning and/or the end of each iteration of the loop.

• The technique we use for loop invariants is very similar to mathematical induction.

• After we establish the statement, we show that it is true before we enter the loop.

• Then we assume the statement is true at the beginning and/or end of the kth iteration, and we show that either it is also true at the beginning and/or end of the (k+1)th iteration, or there is no next iteration (because the loop ends).

• As a result, we conclude that the loop invariant is true and the algorithm is correct.

Loop Invariants Example

Algorithm arrayFind(x, A):
    Input: An element x and an n-element array A of numbers.
    Output: The index i such that x = A[i], or -1 if no element of A
            is equal to x.

    let i ← 0
    while i < n do
        if x = A[i] then
            return i
        else
            i ← i + 1
    return -1

• Our statement:

– $S_i$: x is not equal to any of the first i elements of A.

• Base case: the statement is true before the first iteration of the loop:

– $S_0$: x is not equal to any of the first 0 elements of A (true because there are no such elements).

• Assume the statement is true up to $S_k$ at the beginning of the (k+1)th iteration:

– $S_k$: x is not equal to any of the first k elements of A.

• During the (k+1)th iteration, two things can happen:

– If x = A[k], then we return k, so there is no $S_{k+1}$ to establish.

– If x ≠ A[k], then we continue to the next iteration. In this case, we know from $S_k$ that x is not equal to any of the first k elements of A, and we also have that x is not equal to the (k+1)th element of A. Therefore $S_{k+1}$ is also true:

– $S_{k+1}$: x is not equal to any of the first k+1 elements of A.

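• Conclusion: $S_i$ is always true at the beginning of the (i+1)th iteration:

– $S_i$: x is not equal to any of the first i elements of A.

• If the loop ends without returning, then i = n and $S_n$ says that x is not equal to any element of A, so returning -1 is correct.

• As a result of the proof, we can conclude that the algorithm is correct.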
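The invariant can also be checked at run time. The sketch below is a Java version of arrayFind that re-verifies the invariant at the top of every iteration; the class name ArrayFindInvariant and the choice to throw IllegalStateException on a violation are assumptions for illustration. The extra check makes the search quadratic, so it is meant for demonstration only.

```java
class ArrayFindInvariant {
    // Linear search that re-checks the loop invariant S_i before each iteration.
    static int arrayFind(int x, int[] a) {
        int n = a.length;
        int i = 0;
        while (i < n) {
            // S_i: x is not equal to any of the first i elements of a.
            for (int j = 0; j < i; j++) {
                if (a[j] == x) {
                    throw new IllegalStateException("invariant S_" + i + " violated");
                }
            }
            if (x == a[i]) {
                return i;
            }
            i = i + 1;
        }
        // Here i == n, so S_n holds: x does not occur anywhere in a.
        return -1;
    }

    public static void main(String[] args) {
        int[] a = {10, 15, 3, 5, 56};
        System.out.println(arrayFind(5, a));  // 3
        System.out.println(arrayFind(7, a));  // -1
    }
}
```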
