edward-blurock
Abstract Data Structures and Algorithms
Overview of standard data structures and useful algorithms
Why different data types?
One criterion: Complexity of Manipulation
The data structure can have an effect on how difficult the task is
Vector of size n
Find ith element of vector
Efficient for finding the ith element: a single arithmetic calculation, Pos(v[i]) = Pos(v[0]) + i
Complexity does not increase as the vector gets bigger: O(1)
[Figure: vector holding 7, 8, 9, 10, 11, 12 at positions 0 to 5; the element at position 5 is found at Pos(v[0]) + 5]
This is exactly what a vector is designed for
Vector of size n
Insert before ith element in vector
[Figure: vector holding 7, 8, 9, 10, 11, 12 at positions 0 to 5, with 13 as the new element]
1. Allocate a vector of size n+1 (1 operation)
2. Copy elements 0 to i-1 to places 0 to i-1, and copy elements i to n-1 to places i+1 to n (n operations)
3. Set the new element at place i (1 operation)
Vectors are not designed to be used with insertion operations
As the vector gets larger, the insertion takes more time/operations: O(n)
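The three insertion steps above can be sketched in Python as follows. This is only an illustration: the function name insert_before and the choice of position 2 are assumptions, and a Python list stands in for the vector.

```python
def insert_before(vector, i, value):
    """Return a new vector with value inserted before position i."""
    n = len(vector)
    # Step 1: allocate a vector of size n + 1.
    result = [None] * (n + 1)
    # Step 2: copy elements 0..i-1 to the same places,
    # and elements i..n-1 one place to the right (n copies in total).
    for k in range(i):
        result[k] = vector[k]
    for k in range(i, n):
        result[k + 1] = vector[k]
    # Step 3: set the new element.
    result[i] = value
    return result


v = [7, 8, 9, 10, 11, 12]
print(v[5])                     # finding the ith element: one step
print(insert_before(v, 2, 13))  # inserting: about n steps
```

The lookup cost stays the same however long v is, while the copy loops grow with n.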
Complexity
O(1): Time/operation complexity does not increase with the size of the problem
Example: Find ith element of vector
O(n): Time/operation complexity increases linearly with the size of the problem
Example: Insert before ith element in vector
Linked List
Structure pair: Element, and Pointer to next element pair
[Figure: linked list with 6 elements, 0 to 5]
Linked List
Find ith element of linked list
Have to traverse the structure to reach the ith element
Linked lists are not designed to find the ith element
As the list increases in size, the number of steps increases: O(n)
Linked List
Insert an element
[Figure: a new element linked into the list 0 to 5 by changing pointers]
Change pointers: one operation
Linked lists are exactly designed to insert an element
Regardless of the size of the list, the insertion is still one operation: O(1)
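A minimal Python sketch of the element/pointer pair shows both behaviours. The names Node, find_ith and insert_after are illustrative assumptions, and the sketch assumes you already hold the node after which to insert, which is what makes the insertion a single pointer change.

```python
class Node:
    """Structure pair: an element and a pointer to the next pair."""

    def __init__(self, value, next=None):
        self.value = value
        self.next = next


def from_values(values):
    """Build a linked list from a Python sequence."""
    head = None
    for value in reversed(values):
        head = Node(value, head)
    return head


def find_ith(head, i):
    """Traverse i pointers from the head: O(n) in the worst case."""
    node = head
    for _ in range(i):
        node = node.next
    return node


def insert_after(node, value):
    """Change pointers only: one operation, whatever the list size."""
    node.next = Node(value, node.next)


def to_list(head):
    """Collect the elements in order, for printing."""
    out = []
    while head is not None:
        out.append(head.value)
        head = head.next
    return out


head = from_values([0, 1, 3, 4, 5])
insert_after(find_ith(head, 1), 2)
print(to_list(head))
```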
Linear Search
I am thinking of a number between 1 and 10
If you just guess numbers (for example sequentially)
Best case: correct on the 1st guess
Worst case: correct after 10 guesses
On average it will take you about 5 guesses
In general: for a number between 1 and n it will take you about n/2 guesses
Complexity: n/2 guesses, which is O(n)
Don’t worry about the constant ½…
The complexity increases linearly with the size of the problem
Binary Search
For every guess I will say whether it is correct, higher or lower
[Figure: search tree with root 4, children 2 and 6, and leaves 1, 3, 5, 7]
Best case: 1 guess
Worst case: 3 guesses
At most log2 8 = 3 guesses are needed
In general: log2 n guesses
O(log n)
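The two guessing strategies can be compared by counting guesses. The sketch below is illustrative only; the function names are assumptions, and the binary version always guesses the middle of the remaining range.

```python
def linear_guesses(secret, n):
    """Guess 1, 2, 3, ... and count guesses until the secret is hit."""
    guesses = 0
    for guess in range(1, n + 1):
        guesses += 1
        if guess == secret:
            return guesses
    raise ValueError("secret must be between 1 and n")


def binary_guesses(secret, n):
    """Guess the middle; 'higher' or 'lower' halves the remaining range."""
    low, high = 1, n
    guesses = 0
    while low <= high:
        guess = (low + high) // 2
        guesses += 1
        if guess == secret:
            return guesses
        if guess < secret:
            low = guess + 1   # answer was "higher"
        else:
            high = guess - 1  # answer was "lower"
    raise ValueError("secret must be between 1 and n")


for n in (7, 1000):
    worst_linear = max(linear_guesses(s, n) for s in range(1, n + 1))
    worst_binary = max(binary_guesses(s, n) for s in range(1, n + 1))
    print(n, worst_linear, worst_binary)
```

The worst case for linear guessing grows with n, while the worst case for binary guessing grows only with the logarithm of n.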
Extra information:
Complexity of operations
The proper data structure can increase the efficiency of an algorithm
For structures of size n:
O(n): Increases linearly with size of structure
O(1): Does not depend on size of structure
Complexity of an Algorithm
O(c) Complexity does not increase with the size of the problem
Example: Find ith element in a vector
O(n) Complexity increases linearly with the size of the problem
Example: Find ith element in a linked list
O(log n) Complexity increases with the logarithm of the size of the problem
Example: Binary search
As the problem grows in size, how much more difficult (in terms of computation time/operations) does the problem become?
Coupling
Relationship between data structures and algorithms
Choose the wrong data structure and the algorithm becomes more complex
Why different data types?
Another criterion: A specific object implies a data structure
Graph Data Structure
A set of nodes
A set of connections between the nodes
Both nodes and connections can have properties associated with them
Graph Data Structure
A graph can be a natural representation for many data objects and processes
Social Network
Node: The person (Facebook page)
Connection: Connects two people who know each other (the friends of the Facebook page)
Each node has a list of connections (the friends of the Facebook page)
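A small sketch of "each node has a list of connections", written as an adjacency structure in Python. The class name Graph, its methods, and the people's names are all hypothetical and serve only as illustration.

```python
from collections import defaultdict


class Graph:
    """Undirected graph: each node keeps a set of its connections."""

    def __init__(self):
        self.connections = defaultdict(set)

    def connect(self, a, b):
        # A friendship goes both ways, so record it on both nodes.
        self.connections[a].add(b)
        self.connections[b].add(a)

    def friends(self, person):
        # Sorted only to make the printed order predictable.
        return sorted(self.connections.get(person, ()))


network = Graph()
network.connect("Ann", "Bob")
network.connect("Ann", "Cid")
print(network.friends("Ann"))
print(network.friends("Bob"))
```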
Inheritance (object oriented classes)
Directed graph
Nodes: Data classes
Directed connection: One-way connection
Connections: One class inherits the properties of the other
Ontology (labeled connectors)
Nodes: Objects
Connections: Relationship between objects
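Arithmetic Expression (functional programming)
x + y + z * (a + b * c)
[Figure: expression tree; each operator node (+ or *) is a function and its children are the arguments. The root + has children x, y and *; that * has children z and +; that + has children a and *; the last * has children b and c]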
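The expression tree can be written as nested tuples, with the function first and its arguments after it, and evaluated recursively. This is a sketch under assumptions: the tuple encoding, the evaluate function and the variable values are illustrative choices, not part of the slides.

```python
import math


def evaluate(node, env):
    """Evaluate a tree whose inner nodes are (function, argument, ...)."""
    if isinstance(node, str):
        return env[node]  # a leaf: look up the variable's value
    op, *args = node
    values = [evaluate(arg, env) for arg in args]
    if op == "+":
        return sum(values)
    if op == "*":
        return math.prod(values)
    raise ValueError(f"unknown function: {op}")


# x + y + z * (a + b * c)
expr = ("+", "x", "y", ("*", "z", ("+", "a", ("*", "b", "c"))))
env = {"x": 1, "y": 2, "z": 3, "a": 4, "b": 5, "c": 6}
print(evaluate(expr, env))  # 1 + 2 + 3 * (4 + 5 * 6) = 105
```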
Molecular Graph
Nodes: The atoms
Connections: The bonds between the atoms
Graph as a Linked List
[Figure: nested linked lists with elements numbered 1 to 13]
Foundation of LISP: List programming
Functional programming (graph as a functional expression)
Stacks
Queues
Priority Queues
Stacks
Characteristics:
Top: the last thing added
To get to something in the middle, you have to remove what is on top first
LIFO: Last In, First Out
[Figure: stack contents listed from top to bottom, starting with A: Push B gives B A; Push C gives C B A; Push D gives D C B A; Push E gives E D C B A; Pop E gives D C B A]
Two main operations: Push and Pop
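The push and pop sequence from the figure can be replayed with a Python list used as a stack, where the end of the list is the top. Using a list this way is an illustrative choice.

```python
stack = ["A"]

for item in ["B", "C", "D", "E"]:
    stack.append(item)  # push: the new item becomes the top

popped = stack.pop()    # pop: removes the last item that was pushed

print(popped)
print(stack)            # bottom first, top last
```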
The Towers of Hanoi
A Stack-based Application
o GIVEN: three poles
o a set of discs on the first pole, discs of different sizes, the smallest discs at the top
o GOAL: move all the discs from the left pole to the right one.
o CONDITIONS: only one disc may be moved at a time.
o A disc can be placed either on an empty pole or on top of a larger disc.
Towers of Hanoi
Complexity: 2^n
Why?
To get to the bottom, you have to move all of the top objects: about 2^(n-1) moves
Then you move the bottom object: 1 move
Then you have to move all the other objects back on top again: about 2^(n-1) moves
2^(n-1) + 2^(n-1) = 2 * 2^(n-1) = 2^n
Counted exactly, the total is 2^n - 1 moves, which is still O(2^n)
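The argument above is itself a recursive procedure, and each pole behaves as a stack. The sketch below follows it directly; the pole names and the helper functions hanoi and solve are assumptions for illustration.

```python
def hanoi(n, source, target, spare, poles, moves):
    """Move the top n discs from source to target, using spare."""
    if n == 0:
        return
    # Move everything above the bottom disc out of the way.
    hanoi(n - 1, source, spare, target, poles, moves)
    # Move the bottom disc: one pop and one push.
    disc = poles[source].pop()
    assert not poles[target] or poles[target][-1] > disc
    poles[target].append(disc)
    moves.append((source, target))
    # Move the other discs back on top of it.
    hanoi(n - 1, spare, target, source, poles, moves)


def solve(n):
    """Start with n discs on the left pole, largest at the bottom."""
    poles = {"left": list(range(n, 0, -1)), "middle": [], "right": []}
    moves = []
    hanoi(n, "left", "right", "middle", poles, moves)
    return poles, moves


for n in range(1, 6):
    poles, moves = solve(n)
    print(n, len(moves), 2 ** n - 1)
```

The assert inside hanoi checks the rule that a disc is only ever placed on an empty pole or on a larger disc.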
A Legend
The Towers of Hanoi
In the great temple of Brahma in Benares, on a brass plate under the dome that marks the center of the world, there are 64 disks of pure gold that the priests carry one at a time between three diamond needles
According to Brahma's immutable law: No disk may be placed on a smaller disk.
In the beginning of the world all 64 disks formed the Tower of Brahma on one needle.
Now, however, the process of transfer of the tower from one needle to another is in mid course.
When the last disk is finally in place, once again forming the Tower of Brahma but on a different needle, then will come the end of the world and all will turn to dust.
Is the End of the World Approaching?
• Problem complexity 2^n
• 64 gold discs
• Given 1 move a second
About 600,000,000,000 years until the end of the world
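The estimate can be checked with a few lines of arithmetic. The sketch assumes the exact move count 2^64 - 1 and a year of 365.25 days; both are assumptions made here for the calculation.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60  # assumed length of a year

moves = 2 ** 64 - 1                # one move per second
years = moves / SECONDS_PER_YEAR

print(f"{years:.2e} years")        # on the order of 6 * 10^11 years
```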
Queues
FIFO: First In, First Out
Objects are inserted at the back and removed from the front
Queues
Computer systems must often provide a “holding area” for messages between two processes, two programs, or even two systems.
Real time systems
Queue: Buffering
Computer sends data faster than the printer can print
Printer Buffer
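A printer buffer can be sketched with collections.deque from the Python standard library: the computer appends at the back and the printer removes from the front. The page names are made up for illustration.

```python
from collections import deque

buffer = deque()

# The computer sends pages faster than the printer can print them.
for page in ["page 1", "page 2", "page 3"]:
    buffer.append(page)        # insert at the back

printed = buffer.popleft()     # the printer takes from the front

print(printed)
print(list(buffer))            # pages still waiting, in arrival order
```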
Priority Queue
Like a regular queue or stack data structure, but where additionally each element has a "priority" associated with it.
An element with high priority is served before an element with low priority.
If two elements have the same priority, they are served according to their order in the queue.
There is an ordering associated with the queue
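One common way to get this behaviour in Python is a binary heap from the standard heapq module. The sketch below is an assumption about implementation, not the slides' method: the class name and item names are hypothetical, priorities are negated because heapq serves the smallest entry first, and a counter keeps equal priorities in arrival order.

```python
import heapq
import itertools


class PriorityQueue:
    """Serve high priority first; ties are served in arrival order."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()

    def push(self, item, priority):
        # heapq is a min-heap, so store the negated priority.
        # The counter value breaks ties between equal priorities.
        heapq.heappush(self._heap, (-priority, next(self._counter), item))

    def pop(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


pq = PriorityQueue()
pq.push("low", 1)
pq.push("high, first in", 5)
pq.push("high, second in", 5)

served = [pq.pop() for _ in range(len(pq))]
print(served)
```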
Programming Paradigms
• Goto (like assembler and primitive/older languages)
• Iteration and Loops (while and for-next)
• Functional languages and Recursion
• Declarative
• Non-deterministic programming
Example: Factorial
Implies a loop: n! = 1 * 2 * 3 * … * n
Recursive mathematical definition: n! = n * (n-1)!, with 1! = 1
Goto statement
Loops with a GOTO (or similar) statement
The GOTO jumps to a specified location (label or address)
There is an index involved
The index is incremented until the end is reached
i = 1
factorial = 1
loop:
  factorial = factorial * i
  if (i = n) goto exit
  i = i + 1
  goto loop
exit:
Iteration
Repetition of a block of code
There is an index involved
The index is incremented until the end is reached
i = 1
factorial = 1
while (i <= n) {
  factorial = factorial * i
  i = i + 1
}
Once again this involves an iteration counter
factorial = 1
for i = 1 to n {
  factorial = factorial * i
}
Recursion
Numerische Mathematik 2, 312--318 (1960)
Content of Recursion
• Base case(s).
o Values of the input variables for which we perform no recursive calls are called base cases (there should be at least one base case).
o Every possible chain of recursive calls must eventually reach a base case.
• Recursive calls.
o Calls to the current method.
o Each recursive call should be defined so that it makes progress towards a base case.
factorial(n) {
  if (n = 1) return 1
  return factorial(n-1) * n
}
How do I write a recursive function?
• Determine the size factor
o The number: smaller number, smaller size
• Determine the base case(s)
o The case for n=1, the answer is 1
• Determine the general case(s)
o The recursive call: factorial(n)=factorial(n-1)*n
• Verify the algorithm (use the "Three-Question-Method")
factorial(n) {
  if (n = 1) return 1
  return factorial(n-1) * n
}
Three-Question Verification Method
1. The Base-Case Question: Is there a nonrecursive way out of the function, and does the routine work correctly for this "base" case?
2. The Smaller-Caller Question: Does each recursive call to the function involve a smaller case of the original problem, leading inescapably to the base case?
3. The General-Case Question: Assuming that the recursive call(s) work correctly, does the whole function work correctly?
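A runnable Python version of the recursive factorial, with comments marking where each of the three questions is answered. The check for n < 1 is an addition not present in the pseudocode; it is there so that the smaller-caller question holds for every accepted input.

```python
def factorial(n):
    # Added guard (an assumption, not in the pseudocode): without it,
    # n < 1 would recurse away from the base case forever.
    if n < 1:
        raise ValueError("n must be a positive integer")
    # 1. Base case: a nonrecursive way out, and 1! = 1 is correct.
    if n == 1:
        return 1
    # 2. Smaller caller: n - 1 is closer to the base case than n.
    # 3. General case: if factorial(n - 1) is correct,
    #    then factorial(n - 1) * n is n!.
    return factorial(n - 1) * n


print(factorial(5))
```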
Stacks in recursion
factorial(n)
  if (n = 1)
    return 1
  else
    return factorial(n-1) * n
n! = n*(n-1)*(n-2)*(n-3)*…*1
5! = 5*4*3*2*1
Calls pushed onto the stack: Factorial(5), Factorial(4), Factorial(3), Factorial(2), Factorial(1)
Returns as the stack unwinds: return 1, return 2, return 6, return 24, return 120
5! = 120
Deep recursion can result in running out of memory
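What the call stack does during the recursion can be imitated with an explicit stack. This is an illustration of the idea rather than of how any particular language runtime works; the function name is an assumption.

```python
def factorial_with_stack(n):
    """Imitate the recursion: push pending calls, then unwind them."""
    stack = []
    # Going down: each call waits for the result of the next one.
    while n > 1:
        stack.append(n)
        n -= 1
    result = 1  # the base case returns 1
    # Coming back up: each waiting call multiplies by its own n.
    while stack:
        result *= stack.pop()
        print("return", result)
    return result


factorial_with_stack(5)
```

The stack holds one entry per pending call, which is why very deep recursion can exhaust memory.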
Tail recursion
Tail recursion is iteration
factorial(n) {
  factorial-help(n, 1)
}
factorial-help(n, acc) {
  if (n = 1) return acc
  return factorial-help(n-1, acc*n)
}
Tail recursion is a pattern of use that can be compiled or interpreted as iteration, avoiding the inefficiencies of recursion
A tail recursive function is one where every recursive call is the last thing done by the function before returning and thus produces the function’s value
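The same pair of functions in Python, next to the loop that a tail-call-optimizing compiler could turn them into. Note that the standard CPython interpreter does not perform this optimization, so the loop is written by hand here; the name factorial_loop is an assumption, and the tests n <= 1 and n > 1 are used instead of n = 1 so that inputs below 1 also terminate.

```python
def factorial_help(n, acc):
    if n <= 1:
        return acc
    # Tail call: nothing is left to do after the recursive call returns.
    return factorial_help(n - 1, acc * n)


def factorial(n):
    return factorial_help(n, 1)


def factorial_loop(n):
    """The tail call rewritten as iteration: update (n, acc) and repeat."""
    acc = 1
    while n > 1:
        n, acc = n - 1, acc * n
    return acc


print(factorial(5), factorial_loop(5))
```

Because the accumulator carries the partial result, the loop needs no stack of pending multiplications.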
Declarative programming
Expresses the logic of a computation without describing its control flow.
factorial(1, 1).
factorial(N, F) :-
    N > 1,
    N1 is N-1,
    factorial(N1, F1),
    F is N*F1.
Constraint Logic Programming
factorial(1, 1).
factorial(N, F) :-
    N > 1,
    N1 is N-1,
    factorial(N1, F1),
    F is N*F1.
factorial(5, F) returns F = 120
factorial(N, 120) creates an instantiation error
PROLOG has no knowledge of Real or Integer numbers: "is" only evaluates an expression whose variables are already bound
Mathematical manipulations (such as solving for N) cannot be made
Constraint Logic Programming
factorial(1, 1).
factorial(N, F) :-
    N > 1,
    N1 is N-1,
    factorial(N1, F1),
    F is N*F1.
[Figure: Logic Programming passes formulas to CLP; CLP returns reduced or solved formulas]
CLP: Mathematical knowledge about the numbers used
Probabilistic Algorithms
Non-deterministic
No exact control of program flow
Leaves some of its decisions to chance
Outcome of the program in different runs is not necessarily the same
Monte Carlo Methods
Always gives an answer
But not necessarily correct
The probability of correctness goes up with time
Las Vegas Methods
Never returns an incorrect answer
But sometimes it doesn’t give an answer
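The difference between the two kinds of method can be seen on a toy problem: find a position holding "a" in a list where half the entries are "a", by probing random positions. The problem, the function names and the probe limits are all assumptions chosen for illustration.

```python
import random


def las_vegas_find(data, target, max_probes, rng):
    """Never wrong: returns a correct index, or None if it gives up."""
    for _ in range(max_probes):
        i = rng.randrange(len(data))
        if data[i] == target:
            return i
    return None


def monte_carlo_find(data, target, probes, rng):
    """Always answers: returns the last index probed, right or wrong."""
    i = rng.randrange(len(data))
    for _ in range(probes - 1):
        if data[i] == target:
            break
        i = rng.randrange(len(data))
    return i


rng = random.Random(1)
data = ["a", "b"] * 50

lv = las_vegas_find(data, "a", 3, rng)
mc = monte_carlo_find(data, "a", 3, rng)
print(lv, mc)
```

With more probes allowed, the Monte Carlo answer is more likely to be correct and the Las Vegas version is less likely to give up.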
Probabilistic Algorithms
Probabilistic Algorithms in optimization:
Closer to human reasoning and problem solving (for hard problems we don’t follow strict deterministic algorithms)
Finding local and global minimum
Probabilistic Algorithms
Classic gradient optimization finds a local minimum
The search path is always downhill toward the minimum
Probabilistic algorithms allow the search to go uphill sometimes
Randomness in the search for the next step
Genetic Algorithms
Simulated Annealing to find global minimum
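A minimal simulated annealing sketch shows the "sometimes uphill" idea. Everything specific here is an assumption made for illustration: the test function f, the step size, the starting temperature and the cooling rate. It is not a tuned or definitive implementation, and it is not guaranteed to find the global minimum.

```python
import math
import random


def f(x):
    # Hypothetical landscape with several local minima.
    return x * x + 10 * math.sin(3 * x)


def simulated_annealing(start, steps=5000, temp=10.0, cooling=0.999, seed=0):
    rng = random.Random(seed)
    current = best = start
    for _ in range(steps):
        # Randomness in the search for the next step.
        candidate = current + rng.uniform(-1.0, 1.0)
        delta = f(candidate) - f(current)
        # Downhill moves are always accepted. Uphill moves are accepted
        # with probability exp(-delta / temp), which shrinks as temp cools.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current = candidate
        if f(current) < f(best):
            best = current  # remember the best point seen so far
        temp *= cooling
    return best


best = simulated_annealing(start=4.0)
print(best, f(best))
```

A purely downhill search started at the same point would stop in the nearest local minimum; accepting some uphill steps lets the search leave it.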
Probabilistic Algorithms
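Calculate pi with a dart board
[Figure: a circle of diameter d inscribed in a square of side d]
Area of square: d^2
Area of circle: π (d/2)^2 = π d^2 / 4
Probability dart will be in circle: (π d^2 / 4) / d^2 = π / 4
(number of darts in circle) divided by (number of darts in total) times 4 is approximately π
Monte Carlo Method
Always gives an answer
But not necessarily correct
The probability of correctness goes up with time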
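The dart board estimate takes only a few lines of Python. The sketch assumes a square of side d = 1 centred on the origin and uniformly random darts; the function name and the seed are illustrative.

```python
import math
import random


def estimate_pi(darts, seed=0):
    """Throw darts at a unit square; count those inside the circle."""
    rng = random.Random(seed)
    in_circle = 0
    for _ in range(darts):
        x = rng.uniform(-0.5, 0.5)
        y = rng.uniform(-0.5, 0.5)
        # The inscribed circle has radius d/2 = 0.5.
        if x * x + y * y <= 0.25:
            in_circle += 1
    # The fraction inside approximates pi / 4.
    return 4 * in_circle / darts


for darts in (100, 10_000, 100_000):
    estimate = estimate_pi(darts)
    print(darts, estimate, abs(estimate - math.pi))
```

Every run returns an answer, and the answer tends to get closer to π as more darts are thrown, which is the Monte Carlo behaviour described above.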