gervais-evans
1
Algorithms
Starring: Binary Search
Co-Starring: Big-O
2
Purpose:
The ability to effectively process a large volume of data is a critical element in systems design.
If we had to maintain information on 10 licensed drivers, we could code it almost any way we wished.
3
Purpose:
However, if that list grew from 10 to 10 million, the WAY we store, order & retrieve this data would become critically important.
4
Resources:
Java Essentials Chapter 18 p.703
Java Essentials Study Guide Chapter 15 p.235
5
Intro:
Knowing all of the rules of English grammar and spelling will not help you give directions from place A to place B if you do not know how to get there.
In systems, an analyst can describe a method in more abstract terms to a programmer without knowing the exact syntax of the programming language.
Programs are typically based on one or more algorithms.
6
An algorithm is an abstract and formal step-by-step recipe that tells how to perform a certain task or solve a certain problem on a computer.
Pseudocode expresses a solution in a loose style that resembles the actual code (here, Java) but without the strict syntax. This is the shorthand that developers use to flesh out a solution.
7
When handling large volumes of data, it makes sense to form an acceptable algorithm that will work effectively with the data. Before you actually implement this algorithm, you need to scope it out and analyze its potential efficiency.
8
Therefore, two algorithms are imperative: one that efficiently orders (sorts) a large volume of data, and another that efficiently searches that data for a specific element, such as a specific driver’s information looked up by SSN.
9
We will cover the following topics:
A Gentle Introduction to Big-O
Sequential Search Algorithm
A Need for Order
Bubble Sort (a Review)
Selection Sort
Binary Search
10
A Gentle Introduction to Big-O:
When we begin to discuss algorithms we MUST be able to evaluate their effectiveness in some way
One way would be to evaluate their execution or pure clock time
This method leaves a tremendous dependency on the power of a specific CPU
11
Also, if the algorithm is inefficient, a powerful CPU can mask the problem only up to a point
We need a more abstract, standard, mechanism for evaluating efficiency
12
We use a more logical Order of Growth methodology, Big-O, to evaluate theoretical efficiency
This method obviates the relative strengths of the system(s) on which a given algorithm executes
The Big-O Growth Rate can be summed up with the following chart:
13
O(1) < O(log n) < O(n) < O(n log n) < O(n^2) < O(n^3) < O(a^n)
[Chart: time t plotted against input size n, with curves labeled Constant O(1), Linear, N log n, Quadratic, and Exponential]
14
As you can see, a constant growth rate is optimum whereas an exponential rate is a problem
What do you think the “N” stands for ?
15
Here is a little comparison chart that illustrates the concept:
N          N^2               N Log(N)
100        10,000            664
300        90,000            2,468
1,000      1,000,000         9,965
100,000    10,000,000,000    1,660,964
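The chart values can be recomputed with a few lines of Java. This is only a sketch: it assumes the logarithms are base 2 and that fractional results are rounded down, which matches the figures shown. The class and method names are made up for illustration.

```java
// Sketch: recompute the comparison chart (logarithms assumed to be base 2).
class GrowthChart {
    // log base 2 of n, via the change-of-base rule
    static double log2(double n) {
        return Math.log(n) / Math.log(2);
    }

    // N * log2(N), rounded down to a whole number of "steps"
    static long nLogN(long n) {
        return (long) Math.floor(n * log2(n));
    }

    public static void main(String[] args) {
        long[] sizes = {100, 300, 1_000, 100_000};
        System.out.printf("%-10s %-16s %-12s%n", "N", "N^2", "N Log(N)");
        for (long n : sizes) {
            System.out.printf("%-10d %-16d %-12d%n", n, n * n, nLogN(n));
        }
    }
}
```

Notice how quickly the N^2 column pulls away from the N Log(N) column as N grows.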
16
We will examine these in depth in our lecture on Big-O
For now, understand that an algorithm that has a Logarithmic efficiency is preferable to a Quadratic algorithm
17
Sequential Search Algorithm:
Consider an example where we have a “database” consisting of only 10 licensed drivers
We can create a “Driver” class and then an array of instances of that class
Order really does not matter since we have only 10 drivers to search
18
Adding drivers to the array would be efficient as it only takes one “step”:
myDriverArray[2] = new Driver(constructor info);
What do you think the Order of Efficiency would be for the “add” ?
19
ANS: Constant O(1)
20
If we needed to look for a specific driver using their SSN, at most how many “steps” would we need to execute ?
at Least ?
21
ANS: 10 if driver was last item or not in array
Best case is 1 step
22
This is the essence of a Sequential Search: it iterates over each element in a list and stops either when the item is located or when the end of the list is reached
What do you think the Order of Efficiency is in the best and worst cases ?
23
ANS: If the driver being searched for is the first in the list, it is Constant O(1); otherwise it is Linear O(N)
24
This search is also known as a Linear Search
How is this coded ?
25
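Driver[] myDriver = new Driver[100];
String SSN = "123456789";
// check each driver in turn until the SSN matches
for (int i = 0; i < myDriver.length; i++) {
    // skip unfilled slots to avoid a NullPointerException
    if (myDriver[i] != null && myDriver[i].getSSN().equals(SSN))
        return i;   // found: report the index
}
return -1;          // reached the end of the list: not found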
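To see the best and worst cases in action, here is a self-contained sketch that counts the steps taken by each search. The `Driver` class below is a hypothetical stand-in with only an SSN, and the SSNs are made-up values generated for the demo.

```java
class SequentialSearchSketch {
    // Hypothetical minimal Driver: only the SSN matters for searching.
    static class Driver {
        private final String ssn;
        Driver(String ssn) { this.ssn = ssn; }
        String getSSN() { return ssn; }
    }

    static int steps; // comparisons made by the most recent search

    // Returns the index of the driver with the given SSN, or -1 if absent.
    static int find(Driver[] drivers, String ssn) {
        steps = 0;
        for (int i = 0; i < drivers.length; i++) {
            steps++;
            if (drivers[i].getSSN().equals(ssn)) {
                return i;
            }
        }
        return -1;
    }

    // Builds n drivers with made-up SSNs "000000000", "000000001", ...
    static Driver[] sampleDrivers(int n) {
        Driver[] drivers = new Driver[n];
        for (int i = 0; i < n; i++) {
            drivers[i] = new Driver(String.format("%09d", i));
        }
        return drivers;
    }

    public static void main(String[] args) {
        Driver[] drivers = sampleDrivers(10);
        System.out.println(find(drivers, "000000000") + " after " + steps + " step(s)"); // first item
        System.out.println(find(drivers, "000000009") + " after " + steps + " step(s)"); // last item
        System.out.println(find(drivers, "999999999") + " after " + steps + " step(s)"); // not in array
    }
}
```

The first search takes 1 step (best case); the other two take 10 steps each (worst case).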
26
A Need for Order:
Well, our little search works for 10 Drivers,
but if our list had 1 million drivers, then we can expect our linear search algorithm to execute up to 1 million comparisons EACH time we look for a specific Driver
We need a better way to search our list, but before we can think of a more efficient search we need to order the data in a way that can be used in a more advanced search
27
We need to make sure that our list is arranged so that the sequence of SSNs is ordered from “smallest” to “largest”
Now, it is important to note that just as there is a “cost” to performing a search against a list there is a “cost” for sorting a list
28
Therefore, we need to evaluate the relative value of sorting a list so that we may execute an efficient search AGAINST simply leaving the list unordered and performing a linear search
29
In order to make a decision we need to know what the Dominant factor or process is in our application
If the list is fairly static and there will be extensive searches for specific drivers then the search is the dominant factor and our solution needs to make sure that the search algorithm is efficient even at the expense of a costly SORT algorithm
30
If the list is dynamic and the searching is infrequent then the inserting or adding algorithm efficiency overrides the search efficiency
We will learn when we discuss Data Structures that this solution requires the evaluation of the efficiency (Big-O) of competing ways to store and maintain data
31
At this point we know of only the array or ArrayList as a potential Data Structure but we will soon cover Linked Lists, Binary Trees and Hash Tables
Let's assume that in our project, the list of licensed drivers will be about 1 million and that the list, once loaded, will remain static
32
Let's also assume that there will be frequent requests for information on specific drivers
This information gives us our solution: we will order the data so that we can provide an efficient method for searching the list
33
There are MANY ways to sort a list (MergeSort, QuickSort, InsertionSort)
We will cover all of them in a later lecture, but for now we will focus on using the Selection Sort, and we will look back at the Bubble Sort
34
Bubble Sort (a Review):
Sort an array in ascending or descending order by evaluating the nth element against the nth+1 element
If they are not in the prescribed order, swap them
Repeat these passes over the array; once a full pass makes no swaps, all of the items are sorted
How will we sort our Driver class list ?
35
int c1, c2, leng;
Driver temp;
leng = myDriver.length;
for (c1 = 0; c1 < (leng - 1); c1++) {
    for (c2 = (c1 + 1); c2 < leng; c2++) {
        // compareTo returns a positive number when myDriver[c1] belongs after myDriver[c2]
        if (myDriver[c1].compareTo(myDriver[c2]) > 0) {
            temp = myDriver[c1];
            myDriver[c1] = myDriver[c2];
            myDriver[c2] = temp;
        }
    }
}
36
What are we actually Swapping here ?
This sort has a nested for loop
This means that for each element of the list, the inner loop is executed
In effect the number of steps grows in proportion to the number of elements squared (about n(n-1)/2 comparisons)
That’s why we call this an O(n^2) sort
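For comparison, here is a minimal sketch of a bubble sort that swaps adjacent (nth and nth+1) elements and counts its comparisons. It sorts plain `int` values rather than Driver objects to keep the example short; the class name and the counter are assumptions added for illustration.

```java
import java.util.Arrays;

class BubbleSortSketch {
    static long comparisons; // comparisons made by the most recent sort

    // Sorts in ascending order by repeatedly swapping out-of-order neighbours.
    static void bubbleSort(int[] a) {
        comparisons = 0;
        for (int pass = 0; pass < a.length - 1; pass++) {
            // after each pass the largest remaining value has "bubbled" to the end
            for (int i = 0; i < a.length - 1 - pass; i++) {
                comparisons++;
                if (a[i] > a[i + 1]) {
                    int temp = a[i];
                    a[i] = a[i + 1];
                    a[i + 1] = temp;
                }
            }
        }
    }

    public static void main(String[] args) {
        int[] data = {2, 97, 17, 37, 12, 46, 10, 55, 80, 42, 39};
        bubbleSort(data);
        System.out.println(Arrays.toString(data));
        System.out.println(comparisons + " comparisons for " + data.length + " elements");
    }
}
```

For 11 elements this makes 10 + 9 + ... + 1 = 55 comparisons, which is n(n-1)/2. Doubling n roughly quadruples the work.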
37
Selection Sort:
An algorithm for arranging the elements of an array in (ascending) order
Find the largest element in the array and swap it with the last element, then continue the process for the first n-1 elements
38
1st iteration takes the LARGEST ARRAY element and swaps it with the LAST array element
The largest element is now in its correct place and will not be moved again
39
We logically reduce the size of the array and ignore the “last” element(s)
40
Steps in selection sort:
1. Initialize a variable, n, to the size of the array
2. Find the largest among the first n elements
3. Make it swap places with the nth element
4. Decrement n by 1
5. Repeat steps 2 to 4 while n >= 2
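The steps above can be sketched in Java as follows. This version sorts an `int` array for brevity; the class and variable names are illustrative assumptions, not part of the original slides.

```java
import java.util.Arrays;

class SelectionSortSketch {
    // Sorts in ascending order by moving the largest remaining value to the end.
    static void selectionSort(int[] a) {
        // n starts at the array size, is decremented each round, and stops below 2
        for (int n = a.length; n >= 2; n--) {
            // find the index of the largest among the first n elements
            int largest = 0;
            for (int i = 1; i < n; i++) {
                if (a[i] > a[largest]) {
                    largest = i;
                }
            }
            // swap it with the nth element (index n - 1)
            int temp = a[largest];
            a[largest] = a[n - 1];
            a[n - 1] = temp;
        }
    }

    public static void main(String[] args) {
        int[] data = {2, 97, 17, 37, 12, 46, 10, 55, 80, 42, 39};
        selectionSort(data);
        System.out.println(Arrays.toString(data));
    }
}
```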
41
SELECTION SORT OUTPUT:
initial array:
2 97 17 37 12 46 10 55 80 42 39
selection sort in progress...
2 39 17 37 12 46 10 55 80 42 97
2 39 17 37 12 46 10 55 42 80 97
2 39 17 37 12 46 10 42 55 80 97
2 39 17 37 12 42 10 46 55 80 97
42
SELECTION SORT OUTPUT:
initial array:
2 97 17 37 12 46 10 55 80 42 39
selection sort in progress...
2 10 12 17 37 39 42 46 55 80 97
43
The same procedure can be used to sort the array in descending order by finding the SMALLEST element in the array
44
For the same reasons as the bubble sort, this is also an O(n^2) sort
45
Once we have sorted the list, there is no need to apply a linear (Sequential) search unless you need to accumulate data about each driver in the list
We are now free to apply an efficient search algorithm
46
Binary Search:
The concept of a Binary Search is that it continually & logically divides the list in half until the element is found or the logical size of the list shrinks to zero
It is an algorithm for quickly finding the element with a given value in a sorted array
47
Used to find the location of a given “target” value in an array by searching the array
Works on sorted arrays. Unsorted arrays need to be searched element by element
48
Take a sorted (ascending) array of n elements and search for a given value, x
49
Locate the middle element
Compare that element with x
A match ends the search
50
If x is smaller, the target element is in the LEFT half of the array
51
If x is larger, the target element is in the RIGHT half of the array
52
With each iteration we narrow the search by 50%
The search eventually ends when a match is found or “right - left” becomes negative (target not found)
53
IN GENERAL, An array of (2^n) –1 elements requires, at MOST, n comparisons
54
For example: a 7 element array will require 3 iterations: (2^3) - 1
7 = 2 to the 3rd power minus 1, therefore n = 3
55
Left set to ZERO
Right set to array length - 1
middle = (left + right) / 2
if target val > middle val
    left = middle + 1
if target val < middle val
    right = middle - 1
56
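TARGET: 12
1 4 7 8 12 16 21
---- left is 0, right is 6, so middle is 6/2 or 3
---- element 3 is valued at 8
---- 12 is greater than 8, so change left to 4; the middle becomes 10/2 or 5 //10 is 4 + 6
---- element 5 is valued at 16
---- 12 is less than 16, so change right to 4; the middle becomes 8/2 or 4 //8 is 4 + 4
---- element 4 is valued at 12
---- 12 (target) = 12 (element 4, counting from 0)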
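A minimal Java sketch of this search, run against the seven-element example array, might look like the following. It searches an `int` array rather than Driver objects, uses 0-based indexes, and adds an `iterations` counter purely for illustration; the names are assumptions.

```java
class BinarySearchSketch {
    static int iterations; // loop passes made by the most recent search

    // Returns the index of target in the sorted (ascending) array, or -1 if absent.
    static int binarySearch(int[] sorted, int target) {
        iterations = 0;
        int left = 0;
        int right = sorted.length - 1;
        while (left <= right) {
            iterations++;
            // same value as (left + right) / 2, written to avoid int overflow
            int middle = left + (right - left) / 2;
            if (sorted[middle] == target) {
                return middle;            // a match ends the search
            } else if (target > sorted[middle]) {
                left = middle + 1;        // target is in the RIGHT half
            } else {
                right = middle - 1;       // target is in the LEFT half
            }
        }
        return -1;                        // right - left went negative: not found
    }

    public static void main(String[] args) {
        int[] data = {1, 4, 7, 8, 12, 16, 21};
        int index = binarySearch(data, 12);
        System.out.println("index " + index + " after " + iterations + " iterations");
    }
}
```

With 0-based indexing, the target 12 is found at index 4 after 3 iterations, the maximum for a 7 element array.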
57
An array of 15 elements requires a max of 4 iterations (2^4) - 1
An array of 1,000,000 elements requires a max of 20 iterations (2^20) - 1
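These maximums can be checked by finding the smallest k for which (2^k) - 1 is at least the array size. The helper below is a hypothetical sketch written for this check only.

```java
class MaxIterationsSketch {
    // Smallest k such that (2^k) - 1 >= n, i.e. the most iterations
    // a binary search can need on an array of n elements.
    static int maxIterations(long n) {
        int k = 0;
        long capacity = 0; // always equals (2^k) - 1
        while (capacity < n) {
            capacity = capacity * 2 + 1;
            k++;
        }
        return k;
    }

    public static void main(String[] args) {
        long[] sizes = {7, 15, 1_000_000};
        for (long n : sizes) {
            System.out.println(n + " elements: at most " + maxIterations(n) + " iterations");
        }
    }
}
```

Because (2^20) - 1 is 1,048,575, any array of up to that many elements, including one of 1,000,000, needs at most 20 iterations.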
58
Let's revisit and update our performance chart to see the efficiency of a Logarithmic algorithm
N          N^2               N Log(N)     Log(N)
100        10,000            664          6.64
300        90,000            2,468        8.22
1,000      1,000,000         9,965        9.97
100,000    10,000,000,000    1,660,964    16.61
59
TPS:
Write your own Binary Search that will work against your Driver class
60
Tips for the AP Exam:
Given an algorithm, count the number of times a specific statement executes
When writing a Sort algorithm, be aware of the “off by 1” problem: the last valid index is length – 1
An array must be sorted before a Binary Search will work
O(N Log(N)) algorithms are more efficient than O(N^2) algorithms
61
Project:
POE