UNIT 5
Prof. Sushant S Sundikar
• The related activities of sorting, searching and merging are central to many computer applications.
• Sorting and merging provide us with a means of organizing information. Searching methods take advantage of that organization and thereby reduce the amount of effort needed either to locate a particular item or to establish that it is not present in a data set.
• Sorting algorithms arrange items in a set according to a predefined ordering relation.
• The two most common types of data are string information and numerical information.
• The ordering relation for numeric data simply involves arranging items in sequence from smallest to largest (or vice versa) such that each item is less than or equal to its immediate successor.
• This ordering is referred to as non-descending order.
• Sorted string information is generally arranged in standard lexicographical or dictionary order.
• Sorting algorithms usually fall into one of two classes:
  • The simpler and less sophisticated algorithms require of the order of n^2 comparisons (i.e. O(n^2)) to sort n items.
  • The advanced sorting algorithms take of the order of n log2 n (i.e. O(n log2 n)) comparisons to sort n items of data. Algorithms within this set come close to the optimum possible performance for sorting random data.
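To get a feel for the gap between the two classes, the short Python sketch below tabulates n^2 against n log2 n for a few sizes. The function name comparison_estimates and the chosen sizes are illustrative assumptions, and the figures are order-of-magnitude estimates, not exact comparison counts for any particular algorithm.

```python
import math


def comparison_estimates(n):
    """Return the rough comparison counts (n^2, n * log2(n)) for n items."""
    return n * n, n * math.log2(n)


# Sizes chosen only for illustration.
for n in (10, 1_000, 100_000):
    quadratic, loglinear = comparison_estimates(n)
    print(f"n = {n:>7}: n^2 = {quadratic:>12}   n log2 n = {loglinear:>10.0f}")
```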
• Problem:
  • Merge two arrays of integers, both with their elements in ascending order, into a single ordered array.
• Algorithm Development
• Merging two or more sets of data is a task that is frequently performed in computing.
• It is simpler than sorting because it is possible to take advantage of the partial order in the data.
• Examination of two ordered arrays should help us to discover the essentials of a suitable merging procedure.
• Consider the two arrays:
• A little thought reveals that the merged result should be as indicated below:
• The origins are written above each element in the c array.
• What we see here is that c is longer than both a and b.
• In fact, c must contain a number of elements equal to the sum of the numbers of elements in a and b (i.e. m + n, where a has m elements and b has n).
• To see how this might be done, let us consider the smallest merging problem: two arrays of one element each.
• To merge the two one-element arrays, all we need to do is select the smaller of the a and b elements and place it in c.
• The larger element is then placed in c.
• 8 is less than 15, so 8 takes the c[1] place and 15 takes the c[2] place.
• In the same way we can start merging arrays of lengths m and n.
• The comparison between a[1] and b[1] allows us to set c[1].
• After placing 8 in c[1] we need a way of deciding which element must be placed next in the c array.
• In the general case, the next element to be placed into c is always the smaller of the first elements in the unmerged parts of arrays a and b.
• To keep track of the “yet to be merged” parts of both the a and b arrays, two index pointers i and j will be needed.
• As an element is selected from either a or b, the appropriate pointer must be incremented by 1.
• Overall the entire process would be:
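As a hedged illustration of the two-pointer process described above, here is a minimal Python sketch. It uses 0-based indexing and builds a new list, unlike the 1-based arrays in the slides, and it is a simplified form of the idea, not the mergecopy/shortmerge decomposition listed in the algorithms section. The sample values other than 8 and 15 are invented.

```python
def merge(a, b):
    """Merge two ascending lists a and b into a new ascending list c."""
    c = []
    i = 0  # index of the first not-yet-merged element of a
    j = 0  # index of the first not-yet-merged element of b
    while i < len(a) and j < len(b):
        # The next element of c is the smaller of the two candidates.
        if a[i] <= b[j]:
            c.append(a[i])
            i += 1
        else:
            c.append(b[j])
            j += 1
    # One input is exhausted; copy whatever remains of the other.
    c.extend(a[i:])
    c.extend(b[j:])
    return c


print(merge([8], [15]))                  # the smallest merging problem
print(merge([2, 5, 9], [1, 3, 10, 12]))  # invented sample data
```

The result always has len(a) + len(b) elements, matching the m + n observation above.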
• Algorithm
• Applications
  • Sorting
  • Tape sorting
  • Data processing
• Problem
  • Given a randomly ordered set of n numbers, sort them into non-descending order using the exchange method.
• Almost all sorting methods rely on exchanging data to achieve the desired ordering.
• The method we will now consider relies heavily on an exchange mechanism.
• Suppose we start out with the following random data set:
• We notice that the first two elements are “out of order”.
• If 30 and 12 are exchanged we will have the following configuration:
• After seeing the above result, we see that the order in the data can be increased further by now comparing and swapping the second and third elements.
• With this new change we get the configuration:
• The investigation we have made suggests that the order in the array can be increased using the following steps:
  • For all adjacent pairs in the array do:
    • If the current pair of elements is not in non-descending order then exchange the two elements.
• After applying this idea to all adjacent pairs in our current data set we get the configuration below:
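• Each pass through the adjacent pairs moves the largest remaining element to its final position, so the next pass can be one element shorter.
• Since there are n elements in the data, this implies that at most (n-1) passes (of decreasing length) must be made through the array to complete the sort.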
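The passes described above can be sketched in Python as follows. This is an illustrative version with 0-based indexing that returns a sorted copy; the early exit when a pass makes no exchange is the same idea as the sorted flag in the pseudocode listing. The sample values other than 30 and 12 are invented.

```python
def bubble_sort(data):
    """Return a sorted copy of data using repeated adjacent exchanges."""
    a = list(data)
    n = len(a)
    for i in range(1, n):            # at most n-1 passes
        exchanged = False
        for j in range(n - i):       # pass i inspects pairs (j, j+1), j = 0 .. n-i-1
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
                exchanged = True
        if not exchanged:            # no exchange means the array is already sorted
            break
    return a


print(bubble_sort([30, 12, 18, 8, 14]))
```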
• Algorithm Description
• Algorithm
• Applications
  • Only for sorting data in which a small percentage of elements are out of order.
• Problem
  • Given a randomly ordered set of n numbers, sort them into non-descending order using an insertion method.
• This is a simple sorting algorithm that builds the final sorted array (or list) one item at a time.
• Insertion sort iterates, consuming one input element each repetition, and growing a sorted output list.
• In each iteration, insertion sort removes one element from the input data, finds the location where it belongs within the sorted list, and inserts it there.
• It repeats until no input elements remain.
• Sorting is typically done in place, by iterating up the array, growing the sorted list behind it.
• At each array position, it checks the value there against the largest value in the sorted list (which happens to be next to it, in the previous array position checked).
• If larger, it leaves the element in place and moves to the next.
• If smaller, it finds the correct position within the sorted list, shifts all the larger values up to make a space, and inserts into that correct position.
• After k iterations, the resulting array has the property that the first k + 1 entries are sorted ("+1" because the first entry is skipped).
• In each iteration the first remaining entry of the input is removed and inserted into the result at the correct position, thus extending the result:
• becomes
• with each element greater than x copied to the right as it is compared against x.
• To understand this sorting algorithm, let's take up an example.
• Our complete algorithm can now be described as:
• To perform an insertion sort, begin at the left-most element of the array and invoke Insert to insert each element encountered into its correct position.
• The ordered sequence into which the element is inserted is stored at the beginning of the array in the set of indices already examined.
• Each insertion overwrites a single value: the value being inserted.
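A minimal Python sketch of this description is given below. It is an assumption-laden illustration: it uses 0-based indexing, works on a copy, and guards the inner loop with j > 0 in place of the sentinel technique used in the pseudocode listing. The sample data is invented.

```python
def insertion_sort(data):
    """Return a sorted copy of data, growing a sorted prefix one item at a time."""
    a = list(data)
    for i in range(1, len(a)):
        x = a[i]                      # element to insert into the sorted prefix a[0..i-1]
        j = i
        # Shift every larger element one place to the right.
        while j > 0 and a[j - 1] > x:
            a[j] = a[j - 1]
            j -= 1
        a[j] = x                      # the single value written by this insertion
    return a


print(insertion_sort([20, 35, 18, 8, 14, 41, 3, 39]))
```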
• Algorithm description
• Algorithm
• Applications
  • Where there are relatively small data sets.
  • It is sometimes used for this purpose within the more advanced quicksort algorithm.
• Problem
  • Given a randomly ordered set of n numbers, sort them into non-descending order using Shell’s diminishing increment insertion method.
• Algorithm development
• A comparison of random and sorted data sets indicates that, for an array of size n, elements need to travel on average a distance of about n/3 places.
• This observation suggests that progress towards the final sorted order will be quicker if elements are compared and moved initially over longer rather than shorter distances.
• This strategy has the effect (on average) of placing each element closer to its final position earlier in the sort.
• A strategy that moves elements over long distances is to take an array of size n and start comparing elements over a distance of n/2, and then successively over the distances n/4, n/8, n/16, …, 1.
• Consider what happens when the n/2 idea is applied to the data set below:
• After comparisons and exchanges over the distance n/2 we have n/2 chains of length two that are sorted (here n = 8, so there are four chains).
• The next step is to compare elements over a distance n/4 and thus produce two sorted chains of length 4.
• Notice that after the n/4 sort the “amount of disorder” in the array is relatively small.
• The final step is to form a single sorted chain of length 8 by comparing and sorting elements distance 1 apart.
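To make the notion of chains concrete, the Python sketch below extracts and sorts the interleaved chains for a given distance. The helper names chains and sort_chains, and the eight sample values, are assumptions made for illustration; the built-in sorted stands in for whatever method is used to sort each chain.

```python
def chains(a, gap):
    """Return the interleaved chains a[j], a[j+gap], a[j+2*gap], ... for j = 0 .. gap-1."""
    return [a[j::gap] for j in range(gap)]


def sort_chains(a, gap):
    """Return a copy of a in which every chain at distance gap is sorted."""
    result = list(a)
    for j in range(gap):
        result[j::gap] = sorted(result[j::gap])
    return result


data = [20, 35, 18, 8, 14, 41, 3, 39]   # n = 8, invented values
step = data
for gap in (4, 2, 1):                   # n/2, n/4, n/8
    step = sort_chains(step, gap)
    print(f"after {gap}-sorting: {step}")
```

With this data, the array after the distance-2 pass already has most elements close to their final positions, which is the small "amount of disorder" noted above.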
• Since the relative disorder in the array is small towards the end of the sort (i.e. when we are n/8-sorting in this case), we should choose as our method for sorting the chains an algorithm that is efficient for sorting partially ordered data.
• The insertion sort should be better because it does not rely so heavily on exchanges.
• The next and most important consideration is to apply insertion sorts over the following distances: n/2, n/4, n/8, …, 1.
• We can implement this as follows:
• The next steps in the development are to establish how many chains are to be sorted for each increment gap, and then to work out how to access the individual chains for insertion sorting.
• We can therefore expand our algorithm to:
• Now comes the most crucial stage of the insertion sort.
• In the standard implementation, the first element that we try to insert is the second element in the array.
• Here, for each chain to be sorted, it will need to be the second element of that chain.
• The position of k can be given by: k <- j + inc
• Successive members of each chain beginning with j can be found using: k <- k + inc
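Putting these pieces together, a Python sketch of the diminishing increment sort might look like the following. It follows the chain-by-chain structure of the development (j selects a chain, k walks along it), but with 0-based indexing and on a copy of the input, so it is an illustration and not a transcription of the pseudocode listing. The name inc is taken from that listing; the sample data is invented.

```python
def shell_sort(data):
    """Return a sorted copy of data using insertion sorts over distances n/2, n/4, ..., 1."""
    a = list(data)
    n = len(a)
    inc = n // 2
    while inc >= 1:
        for j in range(inc):          # j: index of the first element of each chain
            k = j + inc               # second element of the chain is the first to insert
            while k < n:
                x = a[k]
                current = k
                # Insertion sort step restricted to the chain j, j+inc, j+2*inc, ...
                while current - inc >= j and x < a[current - inc]:
                    a[current] = a[current - inc]
                    current -= inc
                a[current] = x
                k += inc              # next member of the same chain
        inc //= 2
    return a


print(shell_sort([20, 35, 18, 8, 14, 41, 3, 39]))
```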
• Algorithm description
• Algorithm
  • Shellsort.txt
• Applications
  • Works well for sorting large data sets, but there are more advanced methods with better performance.
• Problem
  • Given a randomly ordered set of n numbers, sort them into non-descending order using Hoare’s partitioning method.
• Algorithm development
• Take a guess and select an element that might allow us to distinguish between the big and the small elements.
• After the first pass we have all big elements in the right-hand part of the array and all small elements in the left-hand part of the array.
• To achieve this do the following:
  • Extend the two partitions inwards until a wrongly partitioned pair is encountered.
  • While the two partitions have not crossed:
    • Exchange the wrongly partitioned pair;
    • Extend the two partitions inwards again until another wrongly partitioned pair is encountered.
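The partitioning step can be sketched in Python as below. This is a common textbook formulation of the inward-scanning idea, offered as an assumption-labelled illustration: the function name partition, the 0-based inclusive bounds, the choice of the middle element as the guess, and the sample data are all choices made here, and the loop structure differs slightly from the pseudocode listing.

```python
def partition(a, left, right):
    """Partition a[left..right] in place around a guessed middle value.

    Returns (i, j) with j < i: every element of a[left..j] is <= the guess
    and every element of a[i..right] is >= the guess.
    """
    guess = a[(left + right) // 2]
    i, j = left, right
    while i <= j:
        while a[i] < guess:     # extend the left partition inwards
            i += 1
        while a[j] > guess:     # extend the right partition inwards
            j -= 1
        if i <= j:              # a wrongly partitioned pair: exchange it
            a[i], a[j] = a[j], a[i]
            i += 1
            j -= 1
    return i, j


data = [20, 35, 18, 8, 14, 41, 3, 39]
i, j = partition(data, 0, len(data) - 1)
print(data[: j + 1], data[i:])
```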
• Applying these ideas to the sample data set:
• Element 18 is selected as the pivot element.
• This step has given us two independent sets of elements which can be sorted independently.
• The basic mechanism for sorting the partitions is:
  • While all partitions are not reduced to size one do:
    • Choose the next partition to be processed;
    • Select a new partitioning value from the current partition;
    • Partition the current partition into two smaller partially ordered sets.
• To keep the number of saved partitions small, the mechanism is refined as follows. While all partitions are not reduced to size one do:
  a. Choose the smaller partition to be processed next;
  b. Select the element in the middle of the partition as the partitioning value;
  c. Partition the current partition into two partially ordered sets;
  d. Save the larger of the partitions from step c for later processing.
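Steps a to d can be sketched in Python with an explicit stack of saved partitions. This is a hedged illustration under stated assumptions: 0-based inclusive bounds, a Python list used as the stack, a sorted copy returned, and a partitioning loop in a common textbook form that is not identical to the pseudocode listing. The sample data is invented.

```python
def quicksort(data):
    """Return a sorted copy of data using partitioning and an explicit stack."""
    a = list(data)
    stack = [(0, len(a) - 1)]              # partitions saved for later processing
    while stack:
        left, right = stack.pop()
        while left < right:                # current partition has more than one element
            guess = a[(left + right) // 2]  # step b: middle element as partitioning value
            i, j = left, right
            while i <= j:                  # step c: partition a[left..right]
                while a[i] < guess:
                    i += 1
                while a[j] > guess:
                    j -= 1
                if i <= j:
                    a[i], a[j] = a[j], a[i]
                    i += 1
                    j -= 1
            # The two partitions are now a[left..j] and a[i..right].
            if j - left < right - i:
                stack.append((i, right))   # step d: save the larger partition
                right = j                  # step a: continue with the smaller one
            else:
                stack.append((left, j))
                left = i
    return a


print(quicksort([20, 35, 18, 8, 14, 41, 3, 39]))
```

Processing the smaller partition first means the saved partition is always at least as large as the one being worked on, which keeps the stack shallow.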
• Algorithm description
• Algorithm
• Applications
  • Internal sorting of large data sets.
• Problem
  • Given an element x and a set of data that is in strictly ascending numerical order, find whether or not x is present in the set.
• Algorithm Development:
• Let us now consider an example in order to find the details of the algorithm needed.
• Suppose we are required to search an array of 15 ordered elements to find whether x = 44 is present. If it is present, then return the position in the array that contains 44.
• We start by examining the middle value in the array.
• To get the middle index of an array of size n, with limits lower = 1 and upper = n, we can try middle <- (lower + upper) / 2 (integer division).
• For the above problem the middle index is (1 + 15) / 2 = 8.
• This gives a[middle] = a[8] = 39.
• Since the value we are seeking is greater than 39, it must be somewhere in the range a[9] … a[15].
• That is, 9 becomes the lower limit and 15 remains the upper limit.
  • lower <- middle + 1
• We then have:
• To calculate the middle index: (9 + 15) / 2 = 12.
• a[12] = 49 > 44, so search in a[9] … a[11] (upper <- middle - 1).
• Using the same procedure, calculate the middle position again: (9 + 11) / 2 = 10.
• The middle position is 10 and a[10] contains 44, which matches the key we are looking for.
• Hence return the position 10.
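The halving procedure traced above can be sketched in Python as follows. Note the assumptions: Python lists are 0-based, so position 10 in the slides corresponds to index 9 here, and the sample array is invented except for the three values the text states (39, 44 and 49 at positions 8, 10 and 12).

```python
def binary_search(a, x):
    """Return the 0-based index of x in the ascending list a, or None if absent."""
    lower, upper = 0, len(a) - 1
    while lower <= upper:
        middle = (lower + upper) // 2
        if a[middle] == x:
            return middle
        if a[middle] < x:
            lower = middle + 1     # x can only be in the upper part
        else:
            upper = middle - 1     # x can only be in the lower part
    return None


# 15 ordered elements; only 39, 44 and 49 come from the text, the rest are made up.
data = [10, 12, 20, 23, 27, 30, 31, 39, 42, 44, 45, 49, 57, 63, 70]
print(binary_search(data, 44))     # index 9, i.e. position 10 when counting from 1
```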
• Algorithm Description
• Problem
  • Design and implement a hash searching algorithm.
• Algorithm Description
• Algorithm
  • Hashsearch.txt
• Applications
  • Fast retrieval from both small and large tables.
Unit 5 Algorithms
1. Two Way Merge
ALGORITHM merge(a,b,c,m,n)
//PROBLEM STATEMENT: Merge two arrays of integers, both with their
//  elements in ascending order, into a single ordered array.
//INPUT: a : integer array with m elements and size m as integer
//       b : integer array with n elements and size n as integer
//OUTPUT: sorted array c with m+n elements.
{
    // pass the array with the smaller last element first,
    // so that the first array is always the one exhausted first
    if (a[m] <= b[n]) then
        mergecopy(a,b,c,m,n);
    else
        mergecopy(b,a,c,n,m);
}
ALGORITHM mergecopy(a,b,c,m,n)
{
    // i : current position in a array
    // j : current position in b array
    // k : current position in merged array - initially 1
    // i, j and k are updated by copy and shortmerge (passed by reference)
    i <-- 1;
    j <-- 1;
    k <-- 1;
    if (a[m] <= b[1]) then
    {
        // every element of a precedes every element of b
        copy(a,c,i,m,k);
        copy(b,c,j,n,k);
    }
    else
    {
        shortmerge(a,b,c,m,i,j,k);
        copy(b,c,j,n,k);
    }
}
ALGORITHM copy(b,c,j,n,k)
{
    // copy b[j..n] into c starting at c[k]
    for i <-- j to n do
    {
        c[k] <-- b[i];
        k <-- k+1;
    }
}
ALGORITHM shortmerge(a,b,c,m,i,j,k)
{
    // merge until a is exhausted; since a[m] <= b[n], j never passes n
    while i <= m do
    {
        if a[i] <= b[j] then
        {
            c[k] <-- a[i];
            i <-- i + 1;
        }
        else
        {
            c[k] <-- b[j];
            j <-- j + 1;
        }
        k <-- k+1;
    }
}
2. Sort by Exchange
ALGORITHM bubblesort(a,n)
//PROBLEM STATEMENT: Given a randomly ordered set of n numbers sort
//  them into non-descending order using exchange method.
//INPUT: a : integer array with n elements and size n as integer
//OUTPUT: sorted array a.
{
    i <- 0;
    sorted <- false;
    while (i < n) AND (NOT sorted) do
    {
        sorted <- true;      // assume sorted until an exchange is made
        i <- i + 1;
        for j <- 1 to n-i do
        {
            if a[j] > a[j+1] then
            {
                t <- a[j];
                a[j] <- a[j+1];
                a[j+1] <- t;
                sorted <- false;
            }
        }
    }
    return a;
}
3. Sorting By Insertion
ALGORITHM insertionsort(a,n)
//PROBLEM STATEMENT: Given a randomly ordered set of n numbers sort
//  them into non-descending order using an insertion method.
//INPUT: a - array of unsorted elements
//       n - size of array
//VARIABLES: i - increasing index of number of elements ordered in
//             each stage
//       j - decreasing index used for searching insertion position
//       first - smallest element in array
//       p - original position of smallest element
//       x - current element to be inserted
//OUTPUT: Sorted array a.
{
    // FIND MINIMUM TO ACT AS SENTINEL
    first <- a[1];
    p <- 1;
    for i <- 2 to n do
    {
        if a[i] < first then
        {
            first <- a[i];
            p <- i;
        }
    }
    // exchange the minimum into a[1] once the scan is complete
    a[p] <- a[1];
    a[1] <- first;
    // inserting ith element - note a[1] is a sentinel
    for i <- 3 to n do
    {
        x <- a[i];
        j <- i;
        while x < a[j-1] do
        {
            a[j] <- a[j-1];
            j <- j - 1;
        }
        a[j] <- x;
    }
    return a;
}
4. Sorting by diminishing increment
ALGORITHM shellsort(a, n)
//PROBLEM STATEMENT: Given a randomly ordered set of n numbers sort
//  them into non-descending order using Shell's diminishing
//  increment insertion method.
//INPUT: a - integer array of size n
//OUTPUT: Sorted array a.
{
    //variable description
    // inc - step size at which elements are to be sorted.
    // current - position in chain where x is finally inserted.
    // previous - index of element currently being compared with x
    // j - index for lowest element in current chain being sorted.
    // k - index of current element being inserted
    // x - current value to be inserted
    // inserted - is true when insertion can be made
    inc <- n;
    while inc > 1 do
    {
        inc <- inc / 2;      // integer division
        for j <- 1 to inc do
        {
            k <- j + inc;
            while k <= n do
            {
                inserted <- false;
                x <- a[k];
                current <- k;
                previous <- current - inc;
                while (previous >= j) and (not inserted) do
                {
                    //locate the position and perform insertion of x
                    if x < a[previous] then
                    {
                        a[current] <- a[previous];
                        current <- previous;
                        previous <- previous - inc;
                    }
                    else
                    {
                        inserted <- true;
                    }
                }
                a[current] <- x;
                k <- k + inc;
            }
        }
    }
    return a;
}
5. Sorting By Partitioning
ALGORITHM quicksort(a, n, stacksize)
//INPUT: a - an integer array of size n
//       stacksize - size of the stack of saved partition limits
//OUTPUT: a sorted array
{
    //variables used in the algorithm
    //left - lower limit of current partition
    //right - upper limit of current partition
    //newleft - upper limit of extended left partition
    //newright - lower limit of extended right partition
    //middle - middle index of current partition
    //mguess - current guess at median
    //temp - temporary variable used for exchange
    //stacktop - current top of stack
    //stack - array[1..stacksize] of integers
    stacktop <- 2;
    stack[1] <- 1;
    stack[2] <- n;
    while stacktop > 0 do
    {
        // take the next saved partition off the stack
        right <- stack[stacktop];
        left <- stack[stacktop - 1];
        stacktop <- stacktop - 2;
        while (left < right) do
        {
            newleft <- left;
            newright <- right;
            middle <- (left + right) / 2;      // integer division
            mguess <- a[middle];
            while a[newleft] < mguess do newleft <- newleft + 1;
            while a[newright] > mguess do newright <- newright - 1;
            while newleft < newright - 1 do
            {
                temp <- a[newleft];
                a[newleft] <- a[newright];
                a[newright] <- temp;
                newleft <- newleft + 1;
                newright <- newright - 1;
                while a[newleft] < mguess do newleft <- newleft + 1;
                while a[newright] > mguess do newright <- newright - 1;
            }
            if newleft <= newright then
            {
                if newleft < newright then
                {
                    temp <- a[newleft];
                    a[newleft] <- a[newright];
                    a[newright] <- temp;
                }
                newleft <- newleft + 1;
                newright <- newright - 1;
            }
            if newright < middle then
            {
                // left partition is smaller: save the right one
                stack[stacktop+1] <- newleft;
                stacktop <- stacktop + 2;
                stack[stacktop] <- right;
                right <- newright;
            }
            else
            {
                // right partition is smaller: save the left one
                stack[stacktop+1] <- left;
                stacktop <- stacktop + 2;
                stack[stacktop] <- newright;
                left <- newleft;
            }
        }
    }
    return a;
}
6. Binary Search
7. Hash Searching
ALGORITHM hashsearch(table,position,found,tablesize,empty,key)
//PROBLEM STATEMENT: Design and implement a hash searching
//  algorithm.
//INPUT: table - hash table to be searched, indexed 0 .. tablesize-1
//       tablesize - an integer to set the size of the table
//       key - key to be searched
//       empty - an integer for empty value
//VARIABLES: temp - temporary storage for value at position start
//       start - hash value index to table
//       active - if true continue search of table
//OUTPUT: found - boolean value set to true if the key is found
//       position - position of the key element.
{
    active <-- true;
    found <-- false;
    start <-- key mod tablesize;
    position <-- start;
    if table[start] = key then
    {
        active <-- false;
        found <-- true;
        temp <-- table[start];
    }
    else
    {
        // store key at start as a sentinel so the search always stops
        temp <-- table[start];
        table[start] <-- key;
    }
    while active do
    {
        position <-- (position + 1) mod tablesize;
        if table[position] = key then
        {
            active <-- false;
            // reaching the sentinel at start means key is absent
            if position <> start then
                found <-- true;
        }
        else
        {
            if table[position] = empty then
            {
                active <-- false;
            }
        }
    }
    // restore the value overwritten by the sentinel
    table[start] <-- temp;
}
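For comparison, a Python sketch of the same linear probing search is given below. It is an illustration under stated assumptions and not a transcription of the listing: it bounds the probe count by the table size where the listing uses a sentinel, uses None as the empty marker, and adds a hypothetical helper hash_insert (not part of the original) only so that a small table can be built for the example.

```python
EMPTY = None  # assumed marker for an unused slot


def hash_insert(table, key):
    """Hypothetical helper: place key using the hash key mod size with linear probing."""
    size = len(table)
    position = key % size
    for _ in range(size):
        if table[position] is EMPTY or table[position] == key:
            table[position] = key
            return position
        position = (position + 1) % size
    raise ValueError("table is full")


def hash_search(table, key):
    """Return the position of key in table, or None if it is not present."""
    size = len(table)
    position = key % size                  # start at the hashed position
    for _ in range(size):                  # at most one full sweep of the table
        if table[position] == key:
            return position
        if table[position] is EMPTY:       # an empty slot ends the probe sequence
            return None
        position = (position + 1) % size   # wrap around at the end of the table
    return None


table = [EMPTY] * 7
for k in (10, 17, 24):                     # all three keys hash to slot 3
    hash_insert(table, k)
print(hash_search(table, 24), hash_search(table, 31))
```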