Algorithm & data structure lec2

Algorithm & Data Structure CSC-112 Fall 2011

Syed Muhammad Raza

Algorithm Requirements

Requirements for an algorithm:

Input
Output
Unambiguous
Generality
Correctness
Finite
Efficiency

Algorithm Representation

Pseudo-code

Flow chart

Pseudo-Code

Pseudo-code is a semi-formal, English-like language with a limited vocabulary that can be used to design and describe algorithms.

The main purpose of pseudo-code is to define the procedural logic of an algorithm in a simple, easy-to-understand manner for its readers, who may or may not be proficient in computer programming.

Pseudo-Code

Used in designing algorithms.

Used in communicating to users.

Used in implementing algorithms as programs.

Used in debugging logic errors in programs.

Used in documenting programs for future maintenance and

expansion purposes.

Pseudo-Code

Must have a limited vocabulary.

Must be easy to learn.

Must produce simple, English-like narrative notation.

Must be capable of describing all algorithms, regardless of

their complexity.

Control Structures

Sequence

Selection

Repetition

Sequence

Series of steps or statements that are executed in the order

they are written.

Example:

Read taxable income

Read filing status

Compute income tax

Print income tax

Selection

Defines one or two courses of action depending on the

evaluation of a condition.

A condition is an expression that is either true or false.

Example

if condition (is true)
    then-part
else
    else-part
end_if

Nested Selection

if status is equal to 1
    print “Single”
else
    if status is equal to 2
        print “Married filing jointly”
    else
        if status is equal to 3
            print “Married filing separately”
        end_if
    end_if
end_if
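A nested selection of this kind could be sketched in C++ as follows. The function name `filingStatusLabel`, the status codes 1, 2 and 3, and the fallback label are assumptions made for illustration only.

```cpp
#include <string>

// Maps a filing-status code to a label using nested selection.
// The codes 1, 2 and 3 and the fallback text are assumed for this sketch.
std::string filingStatusLabel(int status) {
    if (status == 1) {
        return "Single";
    } else {
        if (status == 2) {
            return "Married filing jointly";
        } else {
            if (status == 3) {
                return "Married filing separately";
            }
        }
    }
    return "Unknown status";  // reached only when no branch matched
}
```

Each inner `if` sits in the else-part of the one before it, which mirrors the three nested end_if keywords in the pseudo-code.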

Repetition

Specifies a block of one or more statements that are repeatedly executed as long as a condition remains true.

Example:

while condition
    loop-body
end_while

Conventions

Each pseudo-code statement consists of keywords that

describe operations and some appropriate, English-like

description of operands.

Each statement should be written on a separate line.

Continuation lines should be indented

Conventions II

Sequence statements should begin with unambiguous words

(compute, set, initialize).

Selection statements then-part and else-part should be

indented.

Selection statements end with the keyword end_if.

Conventions III

Repetition statements end with end_while.

Loop-bodies are indented.

All words in a pseudo-code statement must be chosen to be

unambiguous, and as easy as possible to understand by

non-programmers.

Enclose comments between /* and */

Example 1

If student's grade is greater than or equal to 60
    Print "passed"
else
    Print "failed"
End_if

Example 2

Set total to zero
Set grade counter to one
While grade counter is less than or equal to ten
    Input the next grade
    Add the grade into the total
    Add one to the grade counter
End_while
Set the class average to the total divided by ten
Print the class average
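The counter-controlled loop of Example 2 might look like this in C++. Passing the ten grades in a vector instead of reading them from input is an assumption made so the sketch can be checked easily; the name `classAverage` is hypothetical. The counter must advance on every pass, otherwise the loop never ends.

```cpp
#include <vector>

// Averages exactly ten grades with a counter-controlled while loop.
// Assumes the caller supplies at least ten grades.
double classAverage(const std::vector<int>& grades) {
    int total = 0;
    int gradeCounter = 1;
    while (gradeCounter <= 10) {
        total += grades[gradeCounter - 1];  // "input the next grade"
        ++gradeCounter;                     // advance the counter
    }
    return total / 10.0;
}
```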

Example 3

initialize passes to zero
initialize failures to zero
initialize student counter to one
while student counter is less than or equal to ten
    input the next exam result
    if the student passed
        add one to passes
    else
        add one to failures
    End_if
    add one to student counter
End_while
print the number of passes
print the number of failures
if eight or more students passed
    print “good result”
End_if

Set asterisk counter to one
Set outer counter to one
While outer counter is less than or equal to 4
    set spaces counter to 1
    while spaces counter is less than 4
        print “ ”
        increment spaces counter by one
    end_while
    set asterisk counter to one
    while asterisk counter is less than or equal to (2 × outer counter − 1)
        print “*”
        increment asterisk counter by one
    end_while
    print new line
    increment outer counter by one
End_while

Basic Symbols

Rounded box - use it to represent an event which occurs automatically. Such an event will trigger a subsequent action, for example ‘receive telephone call’, or describe a new state of affairs.

Rectangle or box - use it to represent an event which is controlled within the process. Typically this will be a step or action which is taken. In most flowcharts this will be the most frequently used symbol.

Diamond - use it to represent a decision point in the process. Typically, the statement in the symbol will require a ‘yes’ or ‘no’ response and branch to different parts of the flowchart accordingly.

Circle - use it to represent a point at which the flowchart connects with another process. The name or reference for the other process should appear within the symbol.

Flowchart

A flowchart is a diagrammatic representation that illustrates

the sequence of operations to be performed to get the

solution of a problem

Guidelines

Flowcharts are usually drawn using some standard symbols; however, some special symbols can also be developed when required.

The standard symbols represent:

Start or end of the program
Computational steps or processing function of a program
Input or output operation
Decision making and branching
Connector or joining of two parts of a program
Magnetic tape
Off-page connector
Flow line
Annotation
Display

Example

Draw a flowchart to find the largest of three numbers A,B,

and C.

Example

Draw a flowchart to find the sum of first 50 natural

numbers.
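The logic that these two flowcharts describe can be sketched in C++ as below. The function names `largestOfThree` and `sumOfFirstNaturals` are hypothetical; the decision diamonds become `if` statements and the loop-back arrow becomes a `for` loop.

```cpp
// Largest of three numbers: two decisions, each keeping the larger value.
int largestOfThree(int a, int b, int c) {
    int largest = a;
    if (b > largest) largest = b;
    if (c > largest) largest = c;
    return largest;
}

// Sum of the first n natural numbers: 1 + 2 + ... + n.
int sumOfFirstNaturals(int n) {
    int sum = 0;
    for (int i = 1; i <= n; ++i) {
        sum += i;
    }
    return sum;
}
```

For the second example, `sumOfFirstNaturals(50)` returns 1275.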

Imagination is more important than knowledge.

Knowledge is limited.

Imagination encircles the world.

Einstein

Efficiency of an Algorithm

The Amount of Computer Time and Space that an Algorithm Requires

We are more interested in time complexity. Why?

Efficiency of algorithms helps compare different methods of solution for the same problem

Algorithm Analysis Should be Independent of

Specific Implementation languages,

Computers and

Data

Efficiency is a concern for large problems (large data size)

Algorithm Growth Rate

Algorithm growth rate is a measure of its efficiency

Growth Rate: Measure an Algorithm’s Time Requirement as a

Function of the Problem Size (N)

How quickly the Algorithm’s Time Requirement Grows as a Function of

the Problem Size

The Way to Measure a Problem’s Size Depends on the Application

Number of Nodes in a Linked List

The Size of an Array

Number of Items in the Stack

Etc.

Example

Suppose we have two algorithms A and B for solving the same

problem. Which one do we use?

Suppose

Algorithm A Requires Time Proportional to N^2 (expressed as O(N^2); this is Big O Notation)

Algorithm B Requires Time Proportional to N, i.e. B is O(N)

For large N, B is the better choice, because its time requirement grows much more slowly

Big O Notation

If Algorithm A Requires Time Proportional to f(N), Algorithm A is Said

to be Order f(N), Which is Denoted as O(f(N));

f(N) is Called Algorithm’s Growth-Rate Function
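As a rough illustration of growth-rate functions, the sketch below counts the basic steps performed by a single loop and by two nested loops. The function names are hypothetical, and counting loop iterations is only a stand-in for real running time.

```cpp
#include <cstdint>

// One loop over N items: the step count is proportional to N, i.e. O(N).
std::uint64_t linearSteps(std::uint64_t n) {
    std::uint64_t steps = 0;
    for (std::uint64_t i = 0; i < n; ++i) {
        ++steps;
    }
    return steps;
}

// Two nested loops: the step count is proportional to N^2, i.e. O(N^2).
std::uint64_t quadraticSteps(std::uint64_t n) {
    std::uint64_t steps = 0;
    for (std::uint64_t i = 0; i < n; ++i) {
        for (std::uint64_t j = 0; j < n; ++j) {
            ++steps;
        }
    }
    return steps;
}
```

Doubling N doubles the result of `linearSteps` but quadruples the result of `quadraticSteps`, which is what the growth-rate function captures.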

Growth Rates Comparison

Growth Rates Comparison (Tabular)

Common Growth Rates

Constant : O(1)

Time Requirement is Constant

Independent of Problem Size

Example: Accessing an array element

Logarithmic : O(log N)

Time Requirement Increases Slowly for a Logarithmic Function

Typical algorithm: Solves a Problem by Solving a Smaller Constant Fraction of

the Problem

Example: Binary search

Linear : O(N)

Time Requirement Increases Directly with the size of the Problem

Example: Retrieving an element from a linked list

Common Growth Rates

O(N log N)

Time Requirement Increases More Rapidly Than a Linear Algorithm

Typical algorithms : Divide a Problem into Smaller Problems That are Each

Solved Separately

Quadratic : O(N^2)

Increases Rapidly with the Size of the Problem

Typical Algorithms : Algorithms that Use Two Nested Loops

Practical for Only Small Problems

Cubic : O(N^3)

Increases Rapidly with the Size of the Problem

Typical Algorithms : Algorithms that Use Three Nested Loops

Common Growth Rates

Exponential time: O(2^N)

Very, very, very bad, even for small N

Usually Increases too Rapidly to be Practical

Sorting Algorithms

Sorting is the process of rearranging your data elements/items in ascending or descending order

Unsorted Data: 4 3 2 7 1 6 5 8 9

Sorted Data (Ascending): 1 2 3 4 5 6 7 8 9

Sorted Data (Descending): 9 8 7 6 5 4 3 2 1

Sorting Algorithms

There are many:

Bubble Sort
Selection Sort
Insertion Sort
Shell Sort
Comb Sort
Merge Sort
Heap Sort
Quick Sort
Counting Sort
Bucket Sort
Radix Sort
Distribution Sort
Timsort

Source: Wikipedia

Bubble Sort

Compares Adjacent Items and Exchanges Them if They are Out of

Order

When You Order Successive Pairs of Elements, the Largest Element

Bubbles to the Top(end) of the Array

Bubble Sort (Usually) Requires Several Passes Through the Array

Bubble Sort

Pass 1
29 10 14 37 13  (initial array)
10 29 14 37 13
10 14 29 37 13
10 14 29 37 13
10 14 29 13 37

Pass 2
10 14 29 13 37
10 14 29 13 37
10 14 29 13 37
10 14 13 29 37
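A minimal C++ sketch of bubble sort follows. The early exit when a pass makes no swap is a common refinement and an assumption of this sketch, not something the slides state.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Bubble sort: each pass compares adjacent items and swaps them if they
// are out of order, so the largest remaining item ends up at the end.
void bubbleSort(std::vector<int>& a) {
    const std::size_t n = a.size();
    for (std::size_t pass = 1; pass < n; ++pass) {
        bool swapped = false;
        // After (pass - 1) passes, the last (pass - 1) items are in place.
        for (std::size_t i = 0; i + pass < n; ++i) {
            if (a[i] > a[i + 1]) {
                std::swap(a[i], a[i + 1]);
                swapped = true;
            }
        }
        if (!swapped) break;  // assumed refinement: already sorted
    }
}
```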

Selection Sort

To sort an array into ascending order, first search it for the largest element.

Because you want the largest element to be in the last position of the array, you swap the last item with the largest item, even if these items appear to be identical.

Now, ignoring the last (largest) item of the array, search the rest of the array for its largest item and swap it with its last item, which is the next-to-last item in the original array.

You continue until you have selected and swapped N-1 of the N items in the array.

The remaining item, which is now in the first position of the array, is in its proper order.

Selection Sort

Initial Array:   29 10 14 37 13
After 1st Swap:  29 10 14 13 37
After 2nd Swap:  13 10 14 29 37
After 3rd Swap:  13 10 14 29 37
After 4th Swap:  10 13 14 29 37
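The selection sort variant described here, which moves the largest item to the end of the unsorted part, can be sketched in C++ like this. The function name is an assumption.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Selection sort: repeatedly find the largest item among a[0..last-1]
// and swap it into position last-1, then shrink the unsorted part.
void selectionSort(std::vector<int>& a) {
    for (std::size_t last = a.size(); last > 1; --last) {
        std::size_t largest = 0;
        for (std::size_t i = 1; i < last; ++i) {
            if (a[i] > a[largest]) largest = i;
        }
        // Swap even when largest is already last-1 (a swap with itself).
        std::swap(a[largest], a[last - 1]);
    }
}
```

For an array of N items this performs N-1 swaps, one per pass.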

Insertion Sort

Divide Array Into Sorted Section At Front (Initially Empty), Unsorted

Section At End

Step Iteratively Through Array, Moving Each Element To Proper

Place In Sorted Section

[Figure: after i iterations, array positions 0 through i-1 form the sorted section and positions i through N-1 form the unsorted section.]

Insertion Sort

29 10 14 37 13   Initial Array; Copy 10
29 29 14 37 13   Shift 29
10 29 14 37 13   Insert 10; Copy 14
10 29 29 37 13   Shift 29
10 14 29 37 13   Insert 14; Copy 37; Insert 37 on Itself
10 14 29 37 13   Copy 13
10 14 14 29 37   Shift 14, 29, 37
10 13 14 29 37   Insert 13; Sorted Array
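The copy, shift and insert steps can be sketched in C++ as follows. The variable names are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Insertion sort: a[0..unsorted-1] is the sorted section; each iteration
// moves the first unsorted item into its proper place in that section.
void insertionSort(std::vector<int>& a) {
    for (std::size_t unsorted = 1; unsorted < a.size(); ++unsorted) {
        const int next = a[unsorted];      // copy
        std::size_t loc = unsorted;
        while (loc > 0 && a[loc - 1] > next) {
            a[loc] = a[loc - 1];           // shift right
            --loc;
        }
        a[loc] = next;                     // insert
    }
}
```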

Shell Sort

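Improved and more efficient version of insertion sort

It iterates over the data elements/items like insertion sort, but it compares items that are a gap apart and, instead of shifting, swaps them when they are out of order; the gap is reduced until it reaches 1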

Shell Sort

3 5 1 2 4
1 5 3 2 4
1 2 3 5 4
1 2 3 4 5
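A swap-based Shell sort might be sketched in C++ as below. The gap sequence (start at half the size, halve each round) is an assumption of this sketch; other gap sequences are also in use.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Shell sort: a gapped insertion sort that swaps items `gap` positions
// apart, then repeats with smaller gaps until the gap is 1.
void shellSort(std::vector<int>& a) {
    for (std::size_t gap = a.size() / 2; gap > 0; gap /= 2) {
        for (std::size_t i = gap; i < a.size(); ++i) {
            // Move a[i] back in steps of `gap` while it is out of order.
            for (std::size_t j = i; j >= gap && a[j - gap] > a[j]; j -= gap) {
                std::swap(a[j - gap], a[j]);
            }
        }
    }
}
```

With the input 3 5 1 2 4 this sketch passes through the states 1 5 3 2 4 and 1 2 3 5 4 before reaching 1 2 3 4 5.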

Merge Sort

Divide and Conquer Algorithm
Recursively split the input in half
Then recursively merge pairs of pieces

Recursive Steps:
Divide the Array in Half
Sort the Halves
Merge the Halves inside a Temporary Array
Copy Temporary Array to the appropriate locations in original array

Merge Sort

The Recursive calls Continue to divide the Array into Pieces Until

Each Piece Contains Only One Item

An Array of One Item is Obviously Sorted

The Algorithm Then Merges These small Pieces Until One Sorted

Array Results
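The recursive steps can be sketched in C++ as follows. The helper name `mergeHalves` and the index-based interface are assumptions made for this sketch.

```cpp
#include <cstddef>
#include <vector>

// Merges the sorted ranges a[first..mid] and a[mid+1..last] through a
// temporary array, then copies the result back into a.
void mergeHalves(std::vector<int>& a, std::size_t first, std::size_t mid,
                 std::size_t last) {
    std::vector<int> temp;
    temp.reserve(last - first + 1);
    std::size_t left = first;
    std::size_t right = mid + 1;
    while (left <= mid && right <= last) {
        if (a[left] <= a[right]) temp.push_back(a[left++]);
        else temp.push_back(a[right++]);
    }
    while (left <= mid) temp.push_back(a[left++]);
    while (right <= last) temp.push_back(a[right++]);
    for (std::size_t i = 0; i < temp.size(); ++i) a[first + i] = temp[i];
}

// Sorts a[first..last]; a piece with one item is already sorted.
void mergeSort(std::vector<int>& a, std::size_t first, std::size_t last) {
    if (first >= last) return;
    const std::size_t mid = first + (last - first) / 2;
    mergeSort(a, first, mid);
    mergeSort(a, mid + 1, last);
    mergeHalves(a, first, mid, last);
}

// Convenience overload for the whole array.
void mergeSort(std::vector<int>& a) {
    if (!a.empty()) mergeSort(a, 0, a.size() - 1);
}
```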

Merge Sort

[Figure: the recursive calls to mergesort and the subsequent merge steps for an array containing 38, 16, 27, 39, 12, 27.]

Quick Sort

Divide and Conquer algorithm

Quicksort works by partitioning the array into two pieces separated by a single element that is greater than all the elements in the left part and less than or equal to all the elements in the right part

This guarantees that the single element, called the pivot element, is in its correct position

Then the Algorithm Proceeds, Applying the Same Method to the Two

parts Separately

Quick Sort

Partition (Divide)

Choose a pivot
Find the position for the pivot so that
all elements to the left are less than the pivot
all elements to the right are greater than or equal to the pivot

[ < pivot | pivot | >= pivot ]

Quick Sort

Conquer

Apply the same algorithm to each part

[ < pivot | pivot | >= pivot ]

[ < p’ | p’ | >= p’ | pivot | < p” | p” | >= p” ]

Partitioning Method

Must arrange items into two regions: S1, the set of items less than the pivot, and S2, the set of items greater than or equal to the pivot

Different algorithms exist for the choice of a pivot

Retain the pivot in position A[F]

The items that await placement are in another region, called the Unknown region

At each step of the partition algorithm you examine one item from the Unknown region and place it in S1 or S2

[Figure: array layout from index F to L: the pivot P at A[F]; S1 (items < P) ending at LastS1; S2 (items >= P); the Unknown region (?) from FirstUnknown to L.]

To Move A[FirstUnknown] into S1

Swap A[FirstUnknown] with the first item of S2, which is A[LastS1+1], and then increment LastS1 by 1

Thus the item that was in A[FirstUnknown] will be at the rightmost position of S1

The item of S2 that was moved to A[FirstUnknown]: if you increment FirstUnknown by 1, that item becomes the rightmost member of S2

Thus, statements for the following steps are required

Swap A[FirstUnknown] with A[LastS1+1]
Increment LastS1
Increment FirstUnknown

[Figure: the item at FirstUnknown (< P) is swapped with the item at LastS1+1 (>= P), the first item of S2.]

To Move A[FirstUnknown] into S2

Region S2 and Unknown are Adjacent

Simply Increment FirstUnknown by 1, S2 Expands to the Right

To Move Pivot Between S1 and S2

Interchange A[LastS1], the Rightmost Item in S1 with Pivot

Thus, Pivot Would be in its Actual Location

[Figure: array layout from index F to L: pivot P at A[F], S1 (< P) ending at LastS1, S2 (>= P), Unknown (?) from FirstUnknown to L.]

Quick Sort Example (only one step)

Choose pivot, keep it in A[F]:
    Pivot: 27 | Unknown: 38 12 39 27 16
FirstUnknown = 1 (points to 38); 38 belongs in S2:
    Pivot: 27 | S2: 38 | Unknown: 12 39 27 16
S1 is empty; 12 belongs in S1, swap 38 & 12:
    Pivot: 27 | S1: 12 | S2: 38 | Unknown: 39 27 16
39 belongs in S2:
    Pivot: 27 | S1: 12 | S2: 38 39 | Unknown: 27 16
27 belongs in S2:
    Pivot: 27 | S1: 12 | S2: 38 39 27 | Unknown: 16
16 belongs in S1, swap 38 & 16; no more Unknown:
    Pivot: 27 | S1: 12 16 | S2: 39 27 38
Place pivot between S1 and S2:
    S1: 16 12 | Pivot: 27 | S2: 39 27 38
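The partitioning method with the S1, S2 and Unknown regions can be sketched in C++ as below. Taking the first item as the pivot and the function names `partitionRange` and `quickSort` are assumptions of this sketch.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Partitions a[first..last] around the pivot kept in a[first].
// S1 = a[first+1..lastS1] (< pivot), S2 = a[lastS1+1..firstUnknown-1]
// (>= pivot). Returns the final index of the pivot.
std::size_t partitionRange(std::vector<int>& a, std::size_t first,
                           std::size_t last) {
    const int pivot = a[first];
    std::size_t lastS1 = first;  // S1 starts empty
    for (std::size_t firstUnknown = first + 1; firstUnknown <= last;
         ++firstUnknown) {
        if (a[firstUnknown] < pivot) {
            // Item belongs in S1: swap it with the first item of S2.
            ++lastS1;
            std::swap(a[firstUnknown], a[lastS1]);
        }
        // Otherwise it belongs in S2, which simply grows to the right.
    }
    std::swap(a[first], a[lastS1]);  // place pivot between S1 and S2
    return lastS1;
}

// Sorts a[first..last] by partitioning and recursing on both parts.
void quickSort(std::vector<int>& a, std::size_t first, std::size_t last) {
    if (first >= last) return;
    const std::size_t pivotIndex = partitionRange(a, first, last);
    if (pivotIndex > first) quickSort(a, first, pivotIndex - 1);
    quickSort(a, pivotIndex + 1, last);
}
```

Calling `partitionRange` on 27 38 12 39 27 16 with first = 0 and last = 5 leaves the array as 16 12 27 39 27 38 and returns index 2.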

Radix Sort

Original:                  329 457 657 839 436 720 355
Sorted on ones digit:      720 355 436 457 657 329 839
Sorted on tens digit:      720 329 436 839 355 457 657
Sorted on hundreds digit:  329 355 436 457 657 720 839
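A least-significant-digit radix sort for non-negative decimal integers could be sketched in C++ like this. The bucket-based approach, the `digits` parameter and the function name are assumptions of this sketch.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// LSD radix sort: distribute the values into ten buckets by one decimal
// digit at a time, starting with the ones digit, and collect them in
// bucket order. Each distribution pass is stable, which keeps the order
// established by earlier digits. Assumes non-negative values with at
// most `digits` decimal digits.
void radixSort(std::vector<int>& a, int digits) {
    int divisor = 1;
    for (int d = 0; d < digits; ++d) {
        std::array<std::vector<int>, 10> buckets;
        for (int value : a) {
            buckets[(value / divisor) % 10].push_back(value);
        }
        std::size_t k = 0;
        for (const auto& bucket : buckets) {
            for (int value : bucket) a[k++] = value;
        }
        divisor *= 10;
    }
}
```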

Comparison of Sorting Algorithms

Algorithm        Worst Case
Selection Sort   N^2
Bubble Sort      N^2
Insertion Sort   N^2
Mergesort        N * log N
Quicksort        N^2
Radix Sort       N

Searching Algorithms

Searching is the process of determining whether or not a given value exists in a data structure or on a storage medium.

We will study two searching algorithms

Linear Search

Binary Search

Linear Search: O(n)

The linear (or sequential) search algorithm on an array is:

Start from the beginning of the array/list and continue until the item is found or the entire array/list has been searched.

Sequentially scan the array, comparing each array item with the searched value.

If a match is found, return the index of the matched element; otherwise return –1.

Note: linear search can be applied to both sorted and unsorted

arrays.

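Linear Search

// Returns true if item occurs among the first n elements of x.
bool LinSearch(double x[ ], int n, double item)
{
    for(int i=0;i<n;i++)
    {
        if(x[i]==item)
        {
            return true;
        }
    }
    // Report failure only after every element has been compared.
    return false;
}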

Linear Search Tradeoffs

Benefits
Easy to understand
Array can be in any order

Disadvantages
Inefficient for an array of N elements: examines N/2 elements on average for a value in the array, N elements for a value not in the array

Binary Search: O(log2 n)

Binary search looks for an item in an array/list using a divide and conquer strategy

Binary Search

The binary search algorithm assumes that the items in the array being searched are sorted

The algorithm begins at the middle of the array

If the item for which we are searching is less than the item in the middle, we know that the item won’t be in the second half of the array, so we continue with the first half; if it is greater, we continue with the second half

Once again we examine the “middle” element, this time of the remaining half

The process continues with each comparison cutting in half the portion of the array where the item might be

Binary Search

Binary search uses a recursive method to search an array to find a

specified value

The array must be a sorted array:

a[0]≤a[1]≤a[2]≤. . . ≤ a[finalIndex]

If the value is found, its index is returned

If the value is not found, -1 is returned

Note: Each execution of the recursive method reduces the search

space by about a half

Pseudocode for Binary Search
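A recursive binary search along the lines described in this lecture might look like the following C++ sketch. The names `first`, `last`, `mid` and `key` follow the slides; the function name `binarySearch` and the use of `std::vector` are assumptions.

```cpp
#include <vector>

// Recursive binary search over the sorted range a[first..last].
// Returns the index of key, or -1 if key is not in that range.
int binarySearch(const std::vector<int>& a, int first, int last, int key) {
    if (first > last) {
        return -1;  // stopping case: empty segment, key not found
    }
    const int mid = first + (last - first) / 2;
    if (key == a[mid]) {
        return mid;  // stopping case: key found
    }
    if (key < a[mid]) {
        return binarySearch(a, first, mid - 1, key);  // search left part
    }
    return binarySearch(a, mid + 1, last, key);       // search right part
}
```

A search of the whole array starts with first = 0 and last = size - 1.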

Execution of Binary Search

Execution of Binary Search

Key Points in Binary Search

1. There is no infinite recursion

• On each recursive call, the value of first is increased, or the value

of last is decreased

• If the chain of recursive calls does not end in some other way, then

eventually the method will be called with first larger than last

2. Each stopping case performs the correct action for that case

• If first > last, there are no array elements between a[first] and

a[last], so key is not in this segment of the array, and result is

correctly set to -1

• If key == a[mid], result is correctly set to mid


3. For each of the cases that involve recursion, if all recursive calls

perform their actions correctly, then the entire case performs

correctly

• If key < a[mid], then key must be one of the elements a[first]

through a[mid-1], or it is not in the array

• The method should then search only those elements, which it

does

• The recursive call is correct, therefore the entire action is

correct


• If key > a[mid], then key must be one of the elements

a[mid+1] through a[last], or it is not in the array

• The method should then search only those elements, which it

does

• The recursive call is correct, therefore the entire action is

correct

The method search passes all three tests; therefore, it is a good recursive method definition

Efficiency of Binary Search

The binary search algorithm is extremely fast compared to an

algorithm that tries all array elements in order

About half the array is eliminated from consideration right at the

start

Then a quarter of the array, then an eighth of the array, and so

forth

Efficiency of Binary Search

Given an array with 1,000 elements, the binary search will only need

to compare about 10 array elements to the key value, as compared

to an average of 500 for a serial search algorithm

The binary search algorithm has a worst-case running time that is

logarithmic: O(log n)

A serial search algorithm is linear: O(n)

If desired, the recursive version of the method search can be

converted to an iterative version that will run more efficiently
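One possible iterative version is sketched below in C++. The function name `binarySearchIterative` is hypothetical; the loop replaces the recursive calls by narrowing `first` and `last` in place.

```cpp
#include <vector>

// Iterative binary search over a sorted vector.
// Returns the index of key, or -1 if key is not present.
int binarySearchIterative(const std::vector<int>& a, int key) {
    int first = 0;
    int last = static_cast<int>(a.size()) - 1;
    while (first <= last) {
        const int mid = first + (last - first) / 2;
        if (key == a[mid]) {
            return mid;
        }
        if (key < a[mid]) {
            last = mid - 1;   // continue in the left part
        } else {
            first = mid + 1;  // continue in the right part
        }
    }
    return -1;  // first > last: the segment is empty
}
```

Each pass through the loop halves the segment that may still contain the key, so the worst case remains O(log n), without the overhead of recursive calls.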

Binary Search

IT IS ALL ABOUT DOING THE “RIGHT” THING

AT THE

“RIGHT” TIME
