Albert Chan, http://www.scs.carleton.ca/~achan
School of Computer Science, Carleton University
COMP 2002/2402 Introduction to Data Structures and Data Types
Version 03.s3-1
Analysis Tools
• Experimental Studies
• Pseudo-Code
• Mathematical Review
• Analysis of Algorithms
• Asymptotic Analysis
Analysis Goals
• The goals of analyzing data structures and algorithms are to study:
– The running time
– The resource requirements
• We want our data structures and algorithms to run as fast as possible and to use as few resources as possible.
• But we need analysis to confirm that our data structures and algorithms are “good”.
Experimental Studies
• Method: implement the algorithms and observe speed.
Experimental Studies
• In general, running time increases with growing input size.
• Running time depends on hardware.
• Running time also depends on the operating system (including different versions of the same operating system), the compiler, etc.
Experimental Studies
• Experiments can only be done on a limited number of test cases.
• It is often difficult to compare two algorithms due to:
– Different hardware
– Different operating systems
– Different compilers
• Experiments also require an implementation before they can be run.
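The experimental method can be sketched in Java as follows. This is only an illustration under stated assumptions: the class name TimingExperiment, the helper timeOnce, the fixed random seed, and the chosen input sizes are hypothetical and not part of the original slides. The times it prints depend on the machine, operating system, and Java runtime, which is exactly the limitation listed above.

```java
import java.util.Random;

class TimingExperiment {
    // Same idea as arrayMax: scan the array while tracking the largest value seen.
    static int arrayMax(int[] a, int n) {
        int currentMax = a[0];
        for (int i = 1; i < n; i++) {
            if (currentMax < a[i]) {
                currentMax = a[i];
            }
        }
        return currentMax;
    }

    // Elapsed wall-clock nanoseconds for one call on a random array of size n.
    // The seed is an arbitrary illustrative choice that makes the input repeatable.
    static long timeOnce(int n, long seed) {
        int[] a = new Random(seed).ints(n, 0, 1_000_000).toArray();
        long start = System.nanoTime();
        int max = arrayMax(a, n);
        long elapsed = System.nanoTime() - start;
        if (max < 0) {
            throw new IllegalStateException("unexpected maximum"); // uses the result
        }
        return elapsed;
    }

    public static void main(String[] args) {
        // Running time generally grows with the input size.
        for (int n = 1_000; n <= 1_000_000; n *= 10) {
            System.out.println("n = " + n + ": " + timeOnce(n, 42L) + " ns");
        }
    }
}
```

A single measurement like this is noisy; repeating each size several times and averaging would give steadier numbers, but it would still cover only the inputs that were actually tried.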
Looking for a Better Method
• We want a methodology for analyzing the running time of an algorithm that:
– Takes into account all possible inputs.
– Allows us to evaluate the relative efficiency of any two algorithms in a way that is independent of the hardware and software environment.
– Can be performed by studying a high-level description of the algorithm without actually implementing it.
• This motivates the use of pseudo-code.
Pseudo-Code
• Pseudo-code is a mixture of natural language and high-level programming constructs.
• There is no precise definition of pseudo-code.
• The following slides show an example of pseudo-code and the corresponding Java code.
• The example computes the maximum value in an array A of n integers.
Pseudo-Code Example
Algorithm arrayMax (A, n):
    Input: An array A storing n integers.
    Output: The maximum element in A.
    let currentMax ← A[0].
    for i ← 1 to n-1 do
        if currentMax < A[i] then
            let currentMax ← A[i].
    return currentMax.
Java Code Example
public class ArrayMaxProgram {
    // test program for an algorithm that finds the maximum element in an array

    static int arrayMax(int[] A, int n) {
        // find the maximum element in array A of n integers by scanning
        // the cells of A while keeping track of the maximum element
        // encountered.
        int currentMax = A[0];           // executed once
        for (int i = 1; i < n; i++)      // executed once, n times, n-1 times, resp.
            if (currentMax < A[i])       // executed n-1 times
                currentMax = A[i];       // executed at most n-1 times
        return currentMax;
    }

    public static void main(String[] args) {
        // testing method called when the program is executed
        int[] num = { 10, 15, 3, 5, 56, 107, 22, 16, 85 };
        int n = num.length;
        System.out.print("Array:");
        for (int i = 0; i < n; i++)
            System.out.print(" " + num[i]);  // prints one element of the array
        System.out.println(".");
        System.out.println("The maximum element is " + arrayMax(num, n) + ".");
    }
}
Rules for Pseudo-Code
• Expressions: We use standard mathematical symbols to write expressions. We use the left arrow sign (←) as the assignment operator (equivalent to the Java = operator), and we use the equal sign (=) as the equality relation in boolean expressions (equivalent to the == relation in Java).
Rules for Pseudo-Code
• Method declarations: Algorithm name (param1, param2, ...) declares a new method “name” and its parameters.
• Decision structures: if condition then true-actions [else false-actions]. We use indentation to indicate which actions belong to the true-actions and the false-actions.
Rules for Pseudo-Code
• While-loops: while condition do actions.
• Repeat-loops: repeat actions until condition.
• For-loops: for variable-increment-definition do actions.
• We use indentation to indicate which actions belong to the body of each loop.
Rules for Pseudo-Code
• Array indexing: A[i] represents the ith cell in the array A. The cells of an n-cell array A are indexed from A[0] to A[n-1]. This is consistent with Java.
• Method calls: object.method (args). “object.” is optional if it is understood.
• Method returns: return value. This operation returns the specified value to the method that called this one. The value is optional.
Mathematical Review
• Before we discuss how to analyze an algorithm, we need to quickly review some mathematical rules.
• These rules will be used in our analysis.
Logarithms and Exponents
• $\log_b a = c \Leftrightarrow a = b^c$
• $\log_b (a/c) = \log_b a - \log_b c$
• $\log_b a^c = c \log_b a$
• $\log_b a = (\log_c a)/(\log_c b)$
• $b^{\log a} = a^{\log b}$
• $(b^a)^c = b^{ac}$
• $b^a b^c = b^{a+c}$
• $b^a / b^c = b^{a-c}$
• When the base is omitted, it is assumed to be 2: $\log n = \log_2 n$.
Examples
• $\log(2n \log n) = 1 + \log n + \log \log n$
• $\log(n/2) = \log n - \log 2 = \log n - 1$
• $\log \sqrt{n} = \log(n^{1/2}) = (\log n)/2$
• $\log \log \sqrt{n} = \log((\log n)/2) = \log \log n - 1$
• $\log_4 n = (\log n)/\log 4 = (\log n)/2$
• $\log 2^n = n$
• $2^{\log n} = n$
• $2^{2 \log n} = (2^{\log n})^2 = n^2$
• $4^n = (2^2)^n = 2^{2n}$
• $n^2 2^{3 \log n} = n^2 n^3 = n^5$
• $4^n / 2^n = 2^{2n} / 2^n = 2^{2n-n} = 2^n$
Summations
• Definition: $\sum_{i=s}^{t} f(i) = f(s) + f(s+1) + f(s+2) + \dots + f(t)$
• Arithmetic series: $\sum_{i=1}^{n} i = 1 + 2 + 3 + \dots + (n-1) + n = \frac{n(n+1)}{2}$
• Geometric series, given a > 0 and a ≠ 1: $\sum_{i=0}^{n} a^i = 1 + a + a^2 + \dots + a^n = \frac{1 - a^{n+1}}{1 - a}$
• Example: if a = 2, $\sum_{i=0}^{n} 2^i = 1 + 2 + 4 + 8 + \dots + 2^n = \frac{1 - 2^{n+1}}{1 - 2} = 2^{n+1} - 1$
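The closed forms for the arithmetic and geometric series can be spot-checked by summing term by term. The sketch below assumes Java; the class name SummationCheck and its method names are illustrative, and a numeric check for a few values of n is not a proof of the formulas.

```java
class SummationCheck {
    // 1 + 2 + ... + n, computed term by term.
    static long arithmeticSum(int n) {
        long sum = 0;
        for (int i = 1; i <= n; i++) {
            sum += i;
        }
        return sum;
    }

    // a^0 + a^1 + ... + a^n, computed term by term (small integer a and n only,
    // so that the powers fit in a long).
    static long geometricSum(long a, int n) {
        long sum = 0;
        long power = 1;
        for (int i = 0; i <= n; i++) {
            sum += power;
            power *= a;
        }
        return sum;
    }

    public static void main(String[] args) {
        int n = 10;
        // Left: term-by-term sum. Right: closed form n(n+1)/2.
        System.out.println(arithmeticSum(n) + " vs " + ((long) n * (n + 1) / 2)); // prints 55 vs 55
        // Left: term-by-term sum with a = 2. Right: closed form 2^(n+1) - 1.
        System.out.println(geometricSum(2, n) + " vs " + ((1L << (n + 1)) - 1));  // prints 2047 vs 2047
    }
}
```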
Floor and Ceiling
• Floor: ⌊x⌋ = largest integer ≤ x
• Ceiling: ⌈x⌉ = smallest integer ≥ x
• Example:
– ⌊3.6⌋ = 3
– ⌈3.6⌉ = 4
Analysis of Algorithms
• Principle: count primitive operations in the pseudo-code.
• Assumption: all primitive operations take approximately the same time to execute.
• Primitive operations include:
– Assigning a value to a variable
– Calling a method
– Arithmetic operations (e.g. “+”, “-”, “*”, “/”, etc.)
– Comparing two numbers
– Indexing into an array
– Following an object reference
– Returning from a method
Counting Primitive Operations
• Primitive operations are similar to basic machine-level instructions.
• Running times of primitive operations are fairly similar.
• Therefore, counting the number of primitive operations gives an estimate of the running time that is independent of the machine architecture.
Algorithm Complexity
• The “time complexity” of an algorithm is the number of primitive operations it executes, which is proportional to its running time.
• Similarly, the “space complexity” of an algorithm is proportional to the maximum memory it uses (in bytes, kilobytes, or megabytes).
Example
• We use the arrayMax algorithm (slide 8, “Pseudo-Code Example”, or page 101 of the textbook) as an example.
• Initializing variable currentMax to A[0] corresponds to two primitive operations (indexing into an array and assigning a value to a variable) and is executed only once at the beginning of the algorithm. Thus, it contributes two units to the count.
• Total primitive operations so far: 2 + ...
Example
• At the beginning of the for loop, counter i is initialized to 1. This action corresponds to executing one primitive operation (assigning a value to a variable).
• Total primitive operations so far: 2 + 1 + ...
Example
• Before entering the body of the for loop, condition i < n is verified. This action corresponds to executing one primitive operation (comparing two numbers). Since counter i starts at 1 and is incremented by 1 at the end of each iteration of the loop, the comparison i < n succeeds n-1 times, once before each iteration. Thus, it contributes (n-1) units to the count.
• Total primitive operations so far: 2 + 1 + (n-1) + ...
Example
• The body of the for loop is executed n-1 times (for values 1, 2, ..., n-1 of the counter). In each iteration, A[i] is compared with currentMax (two primitive operations, indexing and comparing), A[i] is possibly assigned to currentMax (two primitive operations, indexing and assigning), and the counter i is incremented (two primitive operations, summing and assigning). Hence, at each iteration of the loop, either four or six primitive operations are performed, depending on whether A[i] ≤ currentMax or A[i] > currentMax. Therefore, the body of the loop contributes between 4(n-1) and 6(n-1) units to the count.
• Total primitive operations so far:
– At least 2 + 1 + (n-1) + 4(n-1) + ...
– At most 2 + 1 + (n-1) + 6(n-1) + ...
Example
• When i = n, the comparison i < n fails and the loop finishes. This failing comparison contributes 1 unit to the count (comparing two numbers) and is executed only once.
• Returning the value of variable currentMax corresponds to one primitive operation and is executed only once.
• Total primitive operations so far:
– At least 2 + 1 + (n-1) + 4(n-1) + 1 + 1
– At most 2 + 1 + (n-1) + 6(n-1) + 1 + 1
Conclusion of Example
• Therefore, the number of primitive operations t(n) executed by algorithm arrayMax is:
– At least 2 + 1 + (n-1) + 4(n-1) + 1 + 1 = 5n
– At most 2 + 1 + (n-1) + 6(n-1) + 1 + 1 = 7n - 2
• Does this mean the average number of primitive operations is 6n - 1?
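The best-case and worst-case counts can be reproduced by instrumenting the algorithm with a counter that follows the same accounting as the walkthrough. This is a sketch under that assumption; the class name ArrayMaxCount and the method countOps are hypothetical names introduced for illustration.

```java
class ArrayMaxCount {
    // Number of primitive operations arrayMax performs on a (a.length >= 1),
    // charged exactly as in the walkthrough: 2 for the initialization,
    // 1 for i = 1, 1 per loop test, 4 or 6 per iteration, 1 for the return.
    static long countOps(int[] a) {
        int n = a.length;
        long ops = 0;
        int currentMax = a[0];
        ops += 2;                 // index A[0], assign currentMax
        int i = 1;
        ops += 1;                 // assign i
        while (true) {
            ops += 1;             // compare i < n (n-1 successes, then 1 failure)
            if (!(i < n)) {
                break;
            }
            ops += 2;             // index A[i], compare with currentMax
            if (currentMax < a[i]) {
                currentMax = a[i];
                ops += 2;         // index A[i], assign currentMax
            }
            i++;
            ops += 2;             // add 1 to i, assign i
        }
        ops += 1;                 // return currentMax
        return ops;
    }

    public static void main(String[] args) {
        int[] descending = { 5, 4, 3, 2, 1 }; // maximum never changes: best case
        int[] ascending = { 1, 2, 3, 4, 5 };  // maximum changes every time: worst case
        System.out.println(countOps(descending)); // prints 25, which is 5n for n = 5
        System.out.println(countOps(ascending));  // prints 33, which is 7n - 2 for n = 5
    }
}
```

Inputs between these two extremes give counts between 5n and 7n - 2, depending on how often a new maximum is found.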
Average-Case and Worst Case Analysis
[Figure: bar chart of running time (1 ms to 5 ms) for inputs A through G, with the worst-case, best-case, and average-case times marked.]
Average-Case and Worst Case Analysis
• An algorithm may run faster on some inputs than on others of the same size.
• For all possible inputs of the same size:
– Average-case time is the expected T(n) based on a given input distribution.
– Worst-case time is the largest possible T(n).
• Unless otherwise stated, “running time analysis” in this course always refers to worst-case analysis.
Asymptotic Analysis
• By counting the number of primitive operations, we can estimate how fast an algorithm runs.
• But some questions remain:
– Is this level of detail really needed?
– How important is it to figure out the exact number of primitive operations?
– How carefully must we define the primitive operations?
• For example: How many operations are there in the statement “A[k] ← A[k] + (a*x)”? 3 or 5? Why?
Simplifying The Analysis
• We will introduce a “big-picture” approach.
• We will focus only on the growth rate of the running time as a function of n (the size of the input).
• That is, we are interested only in how the running time grows when the input size grows.
The “Big-Oh” Notation
• Definition: f(n) = O(g(n)) if there exist constants c > 0 and $n_0$ > 0 such that f(n) ≤ c*g(n) for all n ≥ $n_0$. Note that c is a real number, while n and $n_0$ are integers.
• You can think of f(n) = O(g(n)) as meaning that f(n) is less than or equal to g(n) up to some fixed constant c (and for n ≥ $n_0$).
• If f(n) = O(g(n)), we say f(n) is at most the order of g(n).
“Big-Oh” Example
Some Principles
• Fixed constant factors don’t really matter (as long as they are not too HUGE) because of different possible hardware platforms, operating systems and compilers.
• Small values of n are not that important. We are only interested in the case n ≥ $n_0$ (as long as $n_0$ is not unreasonably large).
Exercises
• Find the “Big-Oh” notation for each of the following functions:
– $5n - 1$
– $7n - 3$
– $20n^3 + 10n \log n + 5$
– $a_k n^k + a_{k-1} n^{k-1} + a_{k-2} n^{k-2} + \dots + a_1 n + a_0$
– $3 \log n + \log \log n$
– $5/n$
Answers
• $5n - 1 = O(n)$ and $7n - 3 = O(n)$
• $20n^3 + 10n \log n + 5 = O(n^3)$
– Because $20n^3 + 10n \log n + 5 \le 35n^3$ for n ≥ 1
• $a_k n^k + a_{k-1} n^{k-1} + a_{k-2} n^{k-2} + \dots + a_1 n + a_0 = O(n^k)$
– Because $a_k n^k + a_{k-1} n^{k-1} + a_{k-2} n^{k-2} + \dots + a_1 n + a_0 \le (a_k + a_{k-1} + a_{k-2} + \dots + a_1 + a_0) n^k$ for n ≥ 1, assuming the coefficients are non-negative
• $3 \log n + \log \log n = O(\log n)$
– Because $3 \log n + \log \log n \le 4 \log n$ for n ≥ 2
• $5/n = O(1/n)$
– Because $5/n \le 5(1/n)$ for n ≥ 1
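The constants used in these justifications (c and the threshold for n) can be spot-checked numerically. The sketch below tests the second answer with c = 35 and threshold 1. The class name BigOhWitness, the method holds, and the upper limit of the scan are assumptions made for illustration, and checking a finite range only supports the inequality; it does not prove it for all n.

```java
class BigOhWitness {
    static double log2(double x) {
        return Math.log(x) / Math.log(2.0);
    }

    // f(n) = 20n^3 + 10 n log n + 5, the function from the exercise.
    static double f(double n) {
        return 20 * n * n * n + 10 * n * log2(n) + 5;
    }

    // g(n) = n^3, the claimed bound.
    static double g(double n) {
        return n * n * n;
    }

    // True if f(n) <= c * g(n) for every integer n with n0 <= n <= limit.
    static boolean holds(double c, int n0, int limit) {
        for (int n = n0; n <= limit; n++) {
            if (f(n) > c * g(n)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(holds(35, 1, 100_000)); // prints true
        System.out.println(holds(20, 1, 100_000)); // prints false: c = 20 already fails at n = 1
    }
}
```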
Some Rules ...
• f(n) is O(a f(n)) for any constant a > 0.
• If f(n) ≤ g(n) and g(n) is O(h(n)), then f(n) is O(h(n)).
• If f(n) is O(g(n)) and g(n) is O(h(n)), then f(n) is O(h(n)).
• f(n) + g(n) is O(max(f(n), g(n))).
• If g(n) is O(h(n)), then f(n) + g(n) is O(f(n) + h(n)).
• If g(n) is O(h(n)), then f(n)g(n) is O(f(n)h(n)).
• If f(n) is a polynomial of degree d (i.e. $f(n) = a_0 + a_1 n + \dots + a_d n^d$), then f(n) is $O(n^d)$.
• $n^x$ is $O(a^n)$ for any fixed x > 0 and a > 1.
• $\log n^x$ is $O(\log n)$ for any fixed x > 0.
• $\log^x n$ is $O(n^y)$ for any fixed constants x > 0 and y > 0.
Best Possible Upper Bound
• It is important that we always find the best possible upper bound.
• For example: $f(n) = 3n^3 + 3n^{3/4} + 7$
– We could say $f(n) = O(n^5)$ or $f(n) = O(n^4 \log n)$
– But it is more accurate to say $f(n) = O(n^3)$.
Related Notations
• f(n) = Ω(g(n)) if there exist constants c’ > 0 and $n_0$’ > 0 such that f(n) ≥ c’*g(n) for all n ≥ $n_0$’. Note that c’ is a real number, while n and $n_0$’ are integers.
• If f(n) = Ω(g(n)) then g(n) = O(f(n)).
• f(n) = Θ(g(n)) if:
– f(n) = O(g(n)); and
– f(n) = Ω(g(n)).
Related Notations
• f(n) = o(g(n)) if for every constant c > 0 there exists $n_0$ > 0 such that f(n) ≤ c*g(n) for all n ≥ $n_0$. Note that c is a real number, while n and $n_0$ are integers.
• If f(n) = o(g(n)) then g(n) = ω(f(n)).
• Unlike Θ, there is no useful notation that combines o and ω: for positive g(n), no f(n) can be both o(g(n)) and ω(g(n)).
Some Typical Running Times
• From better to worse:
– $O(\log n)$: Logarithmic, Good
– $O(n)$: Linear, Fair
– $O(n \log n)$: OK
– $O(n^2)$: Quadratic, Not too bad
– $O(n^k)$, k > 2: Polynomial, Bad
– $O(a^n)$, a > 1: Exponential, Terrible
Running Time Examples
n       $\log n$   $\sqrt{n}$   $n \log n$   $n^2$       $n^3$           $2^n$
2       1          1.4          2            4           8               4
4       2          2            8            16          64              16
8       3          2.8          24           64          512             256
16      4          4            64           256         4,096           65,536
32      5          5.7          160          1,024       32,768          4,294,967,296
64      6          8            384          4,096       262,144         1.84×10^19
128     7          11           896          16,384      2,097,152       3.40×10^38
256     8          16           2,048        65,536      16,777,216      1.15×10^77
512     9          23           4,608        262,144     134,217,728     1.34×10^154
1,024   10         32           10,240       1,048,576   1,073,741,824   1.79×10^308
More Examples
• Given 5 algorithms:
– Algorithm A: $400n = O(n)$
– Algorithm B: $20n \log n = O(n \log n)$
– Algorithm C: $2n^2 = O(n^2)$
– Algorithm D: $n^4 = O(n^4)$
– Algorithm E: $2^n = O(2^n)$
• Assume a machine that can execute 1,000,000 instructions per second.
Maximum Problem Sizes
• The maximum problem size (m) that can be solved in a given running time is shown in the following table:
Running Time    Maximum Problem Size (m)
                1 second    1 minute    1 hour
$400n$          2,500       150,000     9,000,000
$20n \log n$    4,096       166,666     7,826,087
$2n^2$          707         5,477       42,426
$n^4$           31          88          244
$2^n$           19          25          31
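Entries of this kind can be recomputed by searching for the largest n whose operation count fits in the budget, which is 1,000,000 operations for one second on the assumed machine. The sketch below is one way to do that; the class name MaxProblemSize and the method maxSize are hypothetical, and the search assumes the cost function is increasing and that a problem of size 1 fits in the budget. An exact search returns a somewhat larger value for the $20n \log n$ row than the table shows, which suggests that row was rounded in the source.

```java
import java.util.function.DoubleUnaryOperator;

class MaxProblemSize {
    static double log2(double x) {
        return Math.log(x) / Math.log(2.0);
    }

    // Largest integer n >= 1 with cost(n) <= budget, assuming cost is increasing
    // and cost(1) <= budget.
    static long maxSize(DoubleUnaryOperator cost, double budget) {
        long lo = 1;   // invariant: cost(lo) <= budget
        long hi = 2;
        while (cost.applyAsDouble(hi) <= budget) {
            hi *= 2;   // double until cost(hi) exceeds the budget
        }
        while (hi - lo > 1) {
            long mid = lo + (hi - lo) / 2;
            if (cost.applyAsDouble(mid) <= budget) {
                lo = mid;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    public static void main(String[] args) {
        double oneSecond = 1_000_000; // operations available in one second
        System.out.println(maxSize(n -> 400 * n, oneSecond));        // prints 2500
        System.out.println(maxSize(n -> 2 * n * n, oneSecond));      // prints 707
        System.out.println(maxSize(n -> n * n * n * n, oneSecond));  // prints 31
        System.out.println(maxSize(n -> Math.pow(2, n), oneSecond)); // prints 19
        System.out.println(maxSize(n -> 20 * n * log2(n), oneSecond));
    }
}
```

Multiplying the budget by 60 or 3,600 gives the one-minute and one-hour columns.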
A Faster Machine?
• What if we upgrade to a machine that is 256 times faster?
Running Time    New Maximum Problem Size
$400n$          256m
$20n \log n$    Approximately 256m((log m)/(7 + log m))
$2n^2$          16m
$n^4$           4m
$2^n$           m + 8
• Caution: Beware of “astronomical” constants.
Example: Prefix Average
• Given: Array x[0…n-1] of n integers.
• Compute: Array A[0…n-1] where $A[i] = \frac{\sum_{j=0}^{i} x[j]}{i+1}$
• That is:
– A[0] = x[0]
– A[1] = (x[0] + x[1]) / 2
– A[2] = (x[0] + x[1] + x[2]) / 3
– …
Prefix Average
• The Prefix Average problem arises in many applications.
• One example is mutual fund evaluation, in which x[i] is the return in year i and A[i] is the average annual return over years 0 through i.
Quadratic-Time Implementation
Algorithm prefixAverage1 (X, n):
    Input: An n-element array X of numbers.
    Output: An n-element array A of numbers such that A[i] is the average of elements X[0], …, X[i].
    let A be an array of n numbers
    for i ← 0 to n-1 do
        a ← 0
        for j ← 0 to i do
            a ← a + X[j]
        A[i] ← a / (i+1)
    return array A.
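A direct Java rendering of this pseudo-code might look like the sketch below. The class name PrefixAverage1 and the use of double arrays are illustrative assumptions; the pseudo-code itself only says “numbers”.

```java
import java.util.Arrays;

class PrefixAverage1 {
    // Quadratic-time prefix averages: every prefix sum is recomputed from scratch.
    static double[] prefixAverage1(double[] x) {
        int n = x.length;
        double[] a = new double[n];
        for (int i = 0; i < n; i++) {
            double sum = 0;
            for (int j = 0; j <= i; j++) { // runs i+1 times for each i
                sum += x[j];
            }
            a[i] = sum / (i + 1);
        }
        return a;
    }

    public static void main(String[] args) {
        double[] x = { 1, 2, 3, 4 };
        System.out.println(Arrays.toString(prefixAverage1(x))); // prints [1.0, 1.5, 2.0, 2.5]
    }
}
```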
Analysis
• Initializing array A at the beginning and returning array A at the end can be done with a constant number of primitive operations per element and takes O(n) time.
• There are two nested for loops, controlled by counters i and j, respectively. The body of the outer loop, controlled by counter i, is executed n times, for i = 0, …, n-1. Thus, statements a ← 0 and A[i] ← a / (i+1) are executed n times each. This implies that these two statements, plus the incrementing and testing of counter i, contribute a number of primitive operations proportional to n, that is, O(n) time.
Analysis
• The body of the inner loop, controlled by counter j, is executed i+1 times, depending on the current value of the outer loop counter i. Thus, statement a ← a + X[j] in the inner loop is executed 1+2+3+…+n times. Since 1+2+3+…+n = n(n+1)/2, this implies that the statement in the inner loop contributes $O(n^2)$ time. A similar argument can be made for the primitive operations associated with incrementing and testing counter j, which also take $O(n^2)$ time.
• Therefore the total running time is T(n) = O(n) + $O(n^2)$ = $O(n^2)$.
Linear-Time Implementation
Algorithm prefixAverage2 (X, n):
    Input: An n-element array X of numbers.
    Output: An n-element array A of numbers such that A[i] is the average of elements X[0], …, X[i].
    let A be an array of n numbers
    let s ← 0
    for i ← 0 to n-1 do
        s ← s + X[i]
        A[i] ← s / (i+1)
    return array A.
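The linear-time version can be sketched in Java in the same way. As before, the class name PrefixAverage2 and the choice of double arrays are assumptions made for illustration.

```java
import java.util.Arrays;

class PrefixAverage2 {
    // Linear-time prefix averages: keep a running sum instead of re-adding the prefix.
    static double[] prefixAverage2(double[] x) {
        int n = x.length;
        double[] a = new double[n];
        double s = 0; // sum of x[0..i] after the i-th iteration
        for (int i = 0; i < n; i++) {
            s += x[i];
            a[i] = s / (i + 1);
        }
        return a;
    }

    public static void main(String[] args) {
        double[] x = { 1, 2, 3, 4 };
        System.out.println(Arrays.toString(prefixAverage2(x))); // prints [1.0, 1.5, 2.0, 2.5]
    }
}
```

The single loop does a constant amount of work per element, which is where the O(n) bound comes from.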
Analysis
• Initializing array A at the beginning and returning array A at the end can be done with a constant number of primitive operations per element and takes O(n) time.
• Initializing variable s at the beginning takes O(1) time.
Analysis
• There is a single for loop, controlled by counter i. The body of the loop is executed n times, for i = 0, …, n-1. Therefore, the statements s ← s + X[i] and A[i] ← s / (i+1) are executed n times each. This implies that these two statements, plus the incrementing and testing of counter i, contribute a number of primitive operations proportional to n, that is, O(n) time.
• Therefore the total running time is T(n) = O(1) + O(n) = O(n).
Justification Techniques
• We also need to justify our claims about the correctness and the running time of our algorithms.
• Common techniques:
– By Example
– The “Contra” Attack
    • Contrapositive
    • Contradiction
– Mathematical Induction
– Loop Invariants
By Example
• Give a counterexample to prove that a claim is incorrect.
• Example: if someone claims that all integers of the form $2^i - 1$ are prime numbers, we can show that this statement is incorrect by giving the counterexample i = 4, since $2^4 - 1 = 15$ is not a prime number.
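A counterexample like this one can also be found by a short search. The sketch below assumes Java; the class name CounterExample, its method names, and the search range starting at i = 2 are illustrative choices rather than part of the slides.

```java
class CounterExample {
    // Trial-division primality test, adequate for the small values used here.
    static boolean isPrime(long m) {
        if (m < 2) {
            return false;
        }
        for (long d = 2; d * d <= m; d++) {
            if (m % d == 0) {
                return false;
            }
        }
        return true;
    }

    // Smallest i in [from, to] for which 2^i - 1 is not prime, or -1 if there is none.
    // Requires 0 <= from and to <= 62 so that 2^i fits in a long.
    static int firstCounterExample(int from, int to) {
        for (int i = from; i <= to; i++) {
            if (!isPrime((1L << i) - 1)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(firstCounterExample(2, 20)); // prints 4, since 2^4 - 1 = 15 = 3 * 5
    }
}
```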
Contrapositive
• To justify the statement “if p is true, then q is true”, we instead establish that “if q is not true, then p is not true.” These two statements are logically equivalent.
• The second statement (“if q is not true, then p is not true”) is called the contrapositive of the first statement (“if p is true, then q is true”).
• That is: (p → q) ⇔ (¬q → ¬p)
Contrapositive Example
• Example: if ab is odd, then a is odd or b is even.
• Justification: consider the contrapositive, “if a is even and b is odd, then ab is even.” Suppose a = 2k for some integer k. Then ab = (2k)b = 2(kb); hence ab is even.
• Therefore, the original claim holds.
Contradiction
• Assume the statement we want to justify is false. Then we show that this assumption leads to a contradiction.
• So the original statement must be true.
Contradiction Example
• Example: if ab is odd, then a is odd or b is even.
• Justification: Assume the statement is false, that is, ab is odd while a is even and b is odd. Since a is even, we have a = 2k for some integer k. Hence, ab = (2k)b = 2(kb), that is, ab is even. But this is a contradiction: ab cannot simultaneously be odd and even. Therefore a is odd or b is even.
Mathematical Induction
• A technique to prove that a statement P(n) is true for all integers n ≥ 1.
• Can be generalized to prove that a statement P(n) is true for all integers n ≥ $n_0$ (we’ll use $n_0$ in the following slides).
• Mathematical induction includes three steps …
Mathematical Induction
• Base Case: show that P(n) is true for n = $n_0$.
• Induction Step: show that if P(n) is true for n = $n_0$, …, k, then P(n) is also true for n = k+1.
• Conclusion Step: combining the above two steps, we can conclude that P(n) is true for all integers n ≥ $n_0$.
Mathematical Induction Example
• Definition: Fibonacci numbers:
– F(0) = 0
– F(1) = 1
– F(n) = F(n-1) + F(n-2) for n ≥ 2
• Theorem: $F(n) < 2^n$ for all integers n ≥ 0.
Proof
• Our statement P(n): $F(n) < 2^n$.
• Base Cases:
– n = 0: $F(n) = F(0) = 0 < 1 = 2^0 = 2^n$. P(0) is true.
– n = 1: $F(n) = F(1) = 1 < 2 = 2^1 = 2^n$. P(1) is true.
Proof
• Induction:
– Assume P(0), P(1), …, P(k) are all true for some k ≥ 1. We want to show that P(k+1) is also true.
– $F(k+1) = F(k) + F(k-1) < 2^k + 2^{k-1} < 2^k + 2^k = 2 \cdot 2^k = 2^{k+1}$.
– Therefore P(k+1) is true.
Proof
• Combining the base cases and the induction step, we can conclude that P(n): $F(n) < 2^n$ is true for all integers n ≥ 0.
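The theorem can be sanity-checked for small n with a few lines of Java. This is a spot check only and does not replace the induction proof; the class name FibonacciBound and the range limit of 62 (chosen so that the powers of two fit in a long) are assumptions of this sketch.

```java
class FibonacciBound {
    // Iterative Fibonacci: F(0) = 0, F(1) = 1, F(n) = F(n-1) + F(n-2).
    static long fib(int n) {
        long a = 0; // F(i)
        long b = 1; // F(i+1)
        for (int i = 0; i < n; i++) {
            long next = a + b;
            a = b;
            b = next;
        }
        return a;
    }

    // True if F(n) < 2^n for every n in [0, maxN]; requires maxN <= 62.
    static boolean boundHolds(int maxN) {
        for (int n = 0; n <= maxN; n++) {
            if (fib(n) >= (1L << n)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(fib(10));        // prints 55
        System.out.println(boundHolds(62)); // prints true
    }
}
```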
Loop Invariants
• This technique is usually used to prove the correctness of an algorithm, especially one that uses loops (for-loops, while-loops).
• We have to establish a statement related to the loop (the loop invariant) and prove that the statement is true at the beginning and/or the end of each iteration of the loop.
Loop Invariants
• The technique we use for loop invariants is very similar to mathematical induction.
• After we establish the statement, we show that it is true before we enter the loop.
• Then, assuming the statement is true at the beginning and/or end of the kth iteration, we show that either it is also true at the beginning and/or end of the (k+1)th iteration, or there is no next iteration (because the loop ends).
• As a result, we conclude that the loop invariant is true and the algorithm is correct.
Loop Invariants Example
Algorithm arrayFind (x, A):
    Input: An element x and an n-element array A of numbers.
    Output: The index i such that x = A[i] or -1 if no element of A is equal to x.
    let i ← 0
    while i < n do
        if x = A[i] then
            return i
        else
            i ← i + 1
    return -1.
Loop Invariants Example
• Our statement:
– $S_i$: x is not equal to any of the first i elements of A.
• Base case: the statement is true at the beginning of the loop:
– $S_0$: x is not equal to any of the first 0 elements of A.
Loop Invariants Example
• Assume the statement is true up to $S_k$ at the beginning of the (k+1)th iteration:
– $S_k$: x is not equal to any of the first k elements of A.
• During the (k+1)th iteration, two things can happen:
– If x = A[k], then we return k, thus there will be no $S_{k+1}$.
– If x ≠ A[k], then we continue to the next iteration. In this case, we know from $S_k$ that x is not equal to any of the first k elements of A, but we also have that x is not equal to the (k+1)th element of A. Therefore we know $S_{k+1}$ is also true:
    • $S_{k+1}$: x is not equal to any of the first k+1 elements of A.
Loop Invariants Example
• Conclusion: $S_i$ is always true at the beginning of the (i+1)th iteration:
– $S_i$: x is not equal to any of the first i elements of A.
• If the loop ends without returning, then i = n and $S_n$ says that x is not equal to any element of A, so returning -1 is correct.
• As a result of the proof, we can conclude that the algorithm is correct.
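The invariant argument can be made concrete by checking the statement at the top of every iteration. The sketch below assumes Java and int arrays; the class name ArrayFind and the helper invariant are hypothetical, and the explicit check is there only to illustrate the proof (it makes the search quadratic and would not belong in production code).

```java
class ArrayFind {
    // The statement S_i: x differs from each of the first i elements of a.
    static boolean invariant(int x, int[] a, int i) {
        for (int j = 0; j < i; j++) {
            if (a[j] == x) {
                return false;
            }
        }
        return true;
    }

    // Returns an index i with x == a[i], or -1 if x does not occur in a.
    static int arrayFind(int x, int[] a) {
        int i = 0;
        while (i < a.length) {
            // S_i must hold at the beginning of every iteration.
            if (!invariant(x, a, i)) {
                throw new IllegalStateException("S_" + i + " violated");
            }
            if (x == a[i]) {
                return i;
            }
            i = i + 1;
        }
        // The loop ended with i = n, so S_n holds: x is nowhere in a.
        return -1;
    }

    public static void main(String[] args) {
        int[] a = { 3, 5, 7 };
        System.out.println(arrayFind(5, a)); // prints 1
        System.out.println(arrayFind(4, a)); // prints -1
    }
}
```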