Prelim 2 Review
CS 2110 Fall 2009

Page 1: Prelim 2 Review

Prelim 2 Review

CS 2110 Fall 2009

Page 2: Prelim 2 Review

Overview

• Complexity and Big-O notation
• ADTs and data structures
  – Linked lists, arrays, queues, stacks, priority queues, hash maps
• Searching and sorting
• Graphs and algorithms
  – Searching
  – Topological sort
  – Minimum spanning trees
  – Shortest paths
• Miscellaneous Java topics
  – Dynamic vs. static types, casting, garbage collectors, Java docs

Page 3: Prelim 2 Review

Big-O notation

• Big-O is an asymptotic upper bound on a function
  – “f(x) is O(g(x))” means: there exists some constant k such that f(x) ≤ k·g(x) as x goes to infinity
• Often used to describe upper bounds for both worst-case and average-case algorithm runtimes
  – Runtime is a function: the number of operations performed, usually as a function of input size

Page 4: Prelim 2 Review

Big-O notation

• For the prelim, you should know…
  – Worst-case Big-O complexity for the algorithms we’ve covered and for common implementations of ADT operations
    • Examples: Mergesort is worst-case O(n log n); PriorityQueue insert using a heap is O(log n)
  – Average-case time complexity for some algorithms and ADT operations, if it has been noted in class
    • Examples: Quicksort is average case O(n log n); HashMap insert is average case O(1)

Page 5: Prelim 2 Review

Big-O notation

• For the prelim, you should know…
  – How to estimate the Big-O worst-case runtimes of basic algorithms (written in Java or pseudocode)
    • Count the operations
    • Loops tend to multiply the loop body operations by the loop counter
    • Trees and divide-and-conquer algorithms tend to introduce log(n) as a factor in the complexity
    • Basic recursive algorithms, e.g., binary search or mergesort
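As a small sketch of “count the operations” (a hypothetical example, not from the lecture): the doubly nested loop below executes its body n·n times, so its runtime is O(n²).

```java
// Hypothetical example of estimating Big-O by counting operations.
// The inner statement runs once per (i, j) pair: n * n times -> O(n^2).
public class CountOps {
    static long comparisons(int n) {
        long count = 0;
        for (int i = 0; i < n; i++) {       // outer loop: n iterations
            for (int j = 0; j < n; j++) {   // inner loop: n iterations each
                count++;                    // one operation per pair
            }
        }
        return count;                       // = n^2
    }
}
```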

Page 6: Prelim 2 Review

Abstract Data Types

• What do we mean by “abstract”?
  – Defined in terms of operations that can be performed, not as a concrete structure
  – Example: Priority Queue is an ADT; Heap is a concrete data structure
• For ADTs, we should know:
  – Operations offered, and when to use them
  – Big-O complexity of these operations for standard implementations

Page 7: Prelim 2 Review

ADTs: The Bag Interface

interface Bag<E> {
  void insert(E obj);
  E extract();       // extract some element
  boolean isEmpty();
  E peek();          // optional: return next element without removing it
}

Examples: Queue, Stack, PriorityQueue

Page 8: Prelim 2 Review

Queues

• First-In-First-Out (FIFO)
  – Objects come out of a queue in the same order they were inserted
• Linked list implementation
  – insert(obj): O(1)
    • Add object to tail of list
    • Also called enqueue, add (Java)
  – extract(): O(1)
    • Remove object from head of list
    • Also called dequeue, poll (Java)
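A minimal sketch of FIFO behavior using Java’s LinkedList, which implements java.util.Queue: add() enqueues at the tail, poll() dequeues from the head, both O(1) on a linked list.

```java
import java.util.LinkedList;
import java.util.Queue;

// FIFO demo: elements come back out in the same order they went in.
public class QueueDemo {
    static int[] fifo(int[] values) {
        Queue<Integer> q = new LinkedList<>();
        for (int v : values) q.add(v);          // enqueue: add to tail, O(1)
        int[] out = new int[values.length];
        for (int i = 0; i < out.length; i++)
            out[i] = q.poll();                  // dequeue: remove from head, O(1)
        return out;                             // same order as inserted
    }
}
```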

Page 9: Prelim 2 Review

Stacks

• Last-In-First-Out (LIFO)
  – Objects come out of a stack in the opposite order from how they were inserted
• Linked list implementation
  – insert(obj): O(1)
    • Add object to head of list
    • Also called push (Java)
  – extract(): O(1)
    • Remove object from head of list
    • Also called pop (Java)
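A matching sketch of LIFO behavior using Java’s ArrayDeque: push() and pop() both operate at the head, so elements come back out in reverse order.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// LIFO demo: elements come back out in the reverse of insertion order.
public class StackDemo {
    static int[] lifo(int[] values) {
        Deque<Integer> s = new ArrayDeque<>();
        for (int v : values) s.push(v);         // push onto head, O(1)
        int[] out = new int[values.length];
        for (int i = 0; i < out.length; i++)
            out[i] = s.pop();                   // pop from head, O(1)
        return out;                             // reverse of insertion order
    }
}
```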

Page 10: Prelim 2 Review

Priority Queues

• Objects come out of a Priority Queue according to their priority
• Generalized
  – By using different priorities, can implement Stacks or Queues
• Heap implementation (as seen in lecture)
  – insert(obj, priority): O(log n)
    • Insert object into heap with given priority
    • Also called add (Java)
  – extract(): O(log n)
    • Remove and return top of heap (minimum priority element)
    • Also called poll (Java)

Page 11: Prelim 2 Review

Heaps

• Concrete data structure
• Balanced binary tree
• Obeys the heap order invariant: Priority(child) ≥ Priority(parent)
• Operations
  – insert(value, priority)
  – extract()

Page 12: Prelim 2 Review

Heap insert()

• Put the new element at the end of the array
• If this violates heap order because it is smaller than its parent, swap it with its parent
• Continue swapping it up until it finds its rightful place
• The heap invariant is maintained!
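The steps above can be sketched as an array-backed min-heap insert (a sketch, not the lecture’s exact code): append at the end, then swap up past any larger parent.

```java
import java.util.ArrayList;

// Min-heap insert by "sifting up": the new element starts at the end of
// the array and swaps with its parent while it is smaller than the parent.
public class HeapInsert {
    static void insert(ArrayList<Integer> heap, int value) {
        heap.add(value);                          // put new element at the end
        int i = heap.size() - 1;
        while (i > 0) {
            int parent = (i - 1) / 2;             // parent index in the array
            if (heap.get(i) >= heap.get(parent))
                break;                            // heap order holds; done
            int tmp = heap.get(i);                // swap up past larger parent
            heap.set(i, heap.get(parent));
            heap.set(parent, tmp);
            i = parent;
        }
    }
}
```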

Page 13: Prelim 2 Review

Heap insert()

[Figure, slides 13–17: worked example of inserting 5 into the min-heap rooted at 4. The new element 5 is placed at the end of the last level, then swapped up past larger parents (19, then 6) until the heap order invariant holds.]

Page 18: Prelim 2 Review

insert()

• Time is O(log n), since the tree is balanced
  – size of tree is exponential as a function of depth
  – depth of tree is logarithmic as a function of size

Page 19: Prelim 2 Review

extract()

• Remove the least element – it is at the root
• This leaves a hole at the root – fill it in with the last element of the array
• If this violates heap order because the root element is too big, swap it down with the smaller of its children
• Continue swapping it down until it finds its rightful place
• The heap invariant is maintained!
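The extract() steps can be sketched the same way (again a sketch, not the lecture’s code): remove the root, move the last element into the hole, and swap it down with its smaller child.

```java
import java.util.ArrayList;

// Min-heap extract by "sifting down": the last array element fills the
// hole at the root and swaps with the smaller of its children until the
// heap order invariant is restored.
public class HeapExtract {
    static int extract(ArrayList<Integer> heap) {
        int min = heap.get(0);                    // least element is at the root
        int last = heap.remove(heap.size() - 1);  // take the last array element
        if (heap.isEmpty()) return min;
        heap.set(0, last);                        // fill the hole at the root
        int i = 0;
        while (true) {
            int left = 2 * i + 1, right = 2 * i + 2;
            int smallest = i;
            if (left < heap.size() && heap.get(left) < heap.get(smallest))
                smallest = left;
            if (right < heap.size() && heap.get(right) < heap.get(smallest))
                smallest = right;
            if (smallest == i) break;             // heap order restored
            int tmp = heap.get(i);                // swap down with smaller child
            heap.set(i, heap.get(smallest));
            heap.set(smallest, tmp);
            i = smallest;
        }
        return min;
    }
}
```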

Page 20: Prelim 2 Review

extract()

[Figure, slides 20–33: worked example of extract() on the min-heap. The root (4) is removed, the last array element (19) is moved into the hole at the root, and it is swapped down with the smaller of its two children until the heap order invariant holds; a second extract() then removes 5 the same way.]

Page 34: Prelim 2 Review

extract()

• Time is O(log n), since the tree is balanced

Page 35: Prelim 2 Review

• Elements of the heap are stored in the array in order, going across each level from left to right, top to bottom
• The children of the node at array index n are found at indices 2n + 1 and 2n + 2
• The parent of the node at index n is found at index (n – 1)/2 (integer division)

Store in an ArrayList or Vector
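The index arithmetic above is small enough to write out directly:

```java
// Parent/child index arithmetic for a heap stored flat in an array:
// children of index n are at 2n+1 and 2n+2; parent is at (n-1)/2,
// which relies on Java's integer division truncating toward zero.
public class HeapIndex {
    static int leftChild(int n)  { return 2 * n + 1; }
    static int rightChild(int n) { return 2 * n + 2; }
    static int parent(int n)     { return (n - 1) / 2; } // integer division
}
```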

Page 36: Prelim 2 Review

Sets

• ADT Set
  – Operations:
    • void insert(Object element);
    • boolean contains(Object element);
    • void remove(Object element);
    • int size();
    • iteration
• No duplicates allowed
• Hash table implementation: O(1) insert and contains
• SortedSet tree implementation: O(log n) insert and contains

A set makes no promises about ordering, but you can still iterate over it.
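A minimal sketch of the no-duplicates property using java.util.HashSet (expected O(1) insert and contains):

```java
import java.util.HashSet;
import java.util.Set;

// Set demo: inserting duplicates has no effect on the set's contents.
public class SetDemo {
    static int sizeAfterInserts(int[] values) {
        Set<Integer> set = new HashSet<>();
        for (int v : values) set.add(v);   // duplicate adds are ignored
        return set.size();                 // number of distinct elements
    }
}
```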

Page 37: Prelim 2 Review

Dictionaries

• ADT Dictionary (aka Map)
  – Operations:
    • void insert(Object key, Object value);
    • void update(Object key, Object value);
    • Object find(Object key);
    • void remove(Object key);
    • boolean isEmpty();
    • void clear();
• Think of: key = word; value = definition
• Where used:
  – Symbol tables
  – Wide use within other algorithms

A HashMap is a particular implementation of the Map interface
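A minimal word-to-definition sketch using java.util.HashMap (the definitions below are made up for illustration): put() covers both insert and update, get() is find, and get() returns null for a missing key.

```java
import java.util.HashMap;
import java.util.Map;

// Map demo: key = word, value = definition. A second put() with the same
// key replaces the old value (the ADT's "update").
public class MapDemo {
    static String lookup(String word) {
        Map<String, String> dict = new HashMap<>();
        dict.put("heap", "a balanced binary tree");          // insert
        dict.put("heap", "a tree obeying the heap order");   // update same key
        return dict.get(word);                               // find; null if absent
    }
}
```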

Page 38: Prelim 2 Review

Dictionaries

• Hash table implementation:
  – Use a hash function to compute hashes of keys
  – Store values in an array, indexed by key hash
  – A collision occurs when two keys have the same hash
  – How to handle collisions?
    • Store another data structure, such as a linked list, in the array location for each hash value
    • Put (key, value) pairs into that data structure
  – insert and find are O(1) when there are no collisions
    • Expected complexity
  – Worst case, every hash is a collision
    • Complexity for insert and find comes from the per-bucket data structure’s complexity, e.g., O(n) for a linked list

Page 39: Prelim 2 Review

For the prelim…

• Don’t spend your time memorizing Java APIs!
• If you want to use an ADT, it’s acceptable to write code that looks reasonable, even if it’s not the exact Java API. For example:

Queue<Integer> myQueue = new Queue<Integer>();
myQueue.enqueue(5);
…
int x = myQueue.dequeue();

• This is not correct Java (Queue is an interface! And Java calls enqueue and dequeue “add” and “poll”)
• But it’s fine for the exam.

Page 40: Prelim 2 Review

Searching

• Find a specific element in a data structure
• Linear search, O(n)
  – Linked lists
  – Unsorted arrays
• Binary search, O(log n)
  – Sorted arrays
• Binary Search Tree search
  – O(log n) if the tree is balanced
  – O(n) worst case
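A sketch of binary search on a sorted array: each iteration halves the range, giving O(log n).

```java
// Binary search: O(log n) on a sorted array. Returns the index of target,
// or -1 if it is not present.
public class BinarySearch {
    static int search(int[] sorted, int target) {
        int lo = 0, hi = sorted.length - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2;        // avoids overflow of (lo + hi)
            if (sorted[mid] == target) return mid;
            if (sorted[mid] < target) lo = mid + 1;  // discard left half
            else hi = mid - 1;                       // discard right half
        }
        return -1;                                   // not found
    }
}
```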

Page 41: Prelim 2 Review

Sorting

• Comparison sorting algorithms
  – Sort based on a ≤ relation on objects
  – Examples:
    • QuickSort
    • HeapSort
    • MergeSort
    • InsertionSort
    • BubbleSort

Page 42: Prelim 2 Review

QuickSort

• Given an array A to sort, choose a pivot value p
• Partition A into two sub-arrays, AX and AY
  – AX contains only elements ≤ p
  – AY contains only elements ≥ p
• Recursively sort the sub-arrays separately
• Return AX + p + AY
• O(n log n) average case runtime
• O(n²) worst case runtime
  – But in the average case, very fast! So people often prefer QuickSort over other sorts.
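The scheme above can be sketched with an in-place partition. This version uses the Lomuto partition with the last element as pivot, one of several possible choices (the slides discuss pivot selection separately).

```java
// QuickSort sketch: partition around a pivot, then recurse on each side.
public class QuickSortDemo {
    static void quickSort(int[] a, int lo, int hi) {
        if (lo >= hi) return;               // 0 or 1 elements: already sorted
        int p = partition(a, lo, hi);       // pivot lands at its final index p
        quickSort(a, lo, p - 1);            // sort elements <= pivot
        quickSort(a, p + 1, hi);            // sort elements >= pivot
    }

    // Lomuto partition: pivot = last element; O(n) over the range.
    static int partition(int[] a, int lo, int hi) {
        int pivot = a[hi];
        int i = lo;                         // next slot for a small element
        for (int j = lo; j < hi; j++)
            if (a[j] <= pivot) swap(a, i++, j);
        swap(a, i, hi);                     // put pivot into its final place
        return i;
    }

    static void swap(int[] a, int i, int j) { int t = a[i]; a[i] = a[j]; a[j] = t; }
}
```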

Page 43: Prelim 2 Review

[Figure: QuickSort worked example. The array 20 31 24 19 45 56 4 65 5 72 14 99 is partitioned around the pivot 20; the two sub-arrays are QuickSorted recursively and concatenated, giving 4 5 14 19 20 24 31 45 56 65 72 99]

Page 44: Prelim 2 Review

QuickSort Questions

• Key problems
  – How should we choose a pivot?
  – How do we partition an array in place?
• Partitioning in place
  – Can be done in O(n)
• Choosing a pivot
  – The ideal pivot is the median, since this splits the array in half
  – Computing the median of an unsorted array is O(n), but the algorithm is quite complicated
  – Popular heuristics:
    • Use the first value in the array (usually not a good choice)
    • Use the middle value in the array
    • Use the median of the first, last, and middle values in the array
    • Choose a random element

Page 45: Prelim 2 Review

Heap Sort

• Insert all elements into a heap
• Extract all elements from the heap
  – Each extraction returns the minimum remaining element, so they come out in sorted order!
• Heap insert and extract are O(log n)
  – Repeated for n elements, we have O(n log n) worst-case runtime
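This is short enough to write directly with java.util.PriorityQueue (itself backed by a binary heap): n inserts and n extracts at O(log n) each.

```java
import java.util.PriorityQueue;

// Heap Sort as described: insert everything, then extract repeatedly;
// each poll() returns the minimum remaining element.
public class HeapSortDemo {
    static int[] heapSort(int[] a) {
        PriorityQueue<Integer> heap = new PriorityQueue<>();
        for (int v : a) heap.add(v);        // O(log n) per insert
        int[] out = new int[a.length];
        for (int i = 0; i < out.length; i++)
            out[i] = heap.poll();           // O(log n) per extract of the min
        return out;                         // sorted ascending
    }
}
```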

Page 46: Prelim 2 Review

Graphs

• Set of vertices (or nodes) V, set of edges E
• Number of vertices n = |V|
• Number of edges m = |E|
  – Upper bound O(n²) on the number of edges
  – A complete graph has m = n(n-1)/2
• Directed or undirected
  – Directed edges have distinct head and tail
• Weighted edges
• Cycles and paths
• Connected components
• DAGs
• Degree of a node (in- and out-degree for directed graphs)

Page 47: Prelim 2 Review

Graph Representation

• You should be able to write a Vertex class in Java and implement standard graph algorithms using this class

• However, understanding the algorithms is much more important than memorizing their code
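One possible shape for such a Vertex class (a sketch; the names here are illustrative, not the course's required code) is an adjacency-list representation:

```java
import java.util.ArrayList;
import java.util.List;

// Adjacency-list vertex: stores a label and the list of neighbors
// reachable by one (directed) edge from this vertex.
public class Vertex {
    final String label;
    final List<Vertex> neighbors = new ArrayList<>();

    Vertex(String label) { this.label = label; }

    void addEdge(Vertex other) {          // directed edge this -> other
        neighbors.add(other);
    }

    int outDegree() { return neighbors.size(); }
}
```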

Page 48: Prelim 2 Review

Topological Sort

• A topological sort is a partial order on the vertices of a DAG
  – Definition: If vertex A comes before B in topological sort order, then there is no path from B to A
  – A DAG can have many topological sort orderings; they aren’t necessarily unique (except, say, for a singly linked list)
• No topological sort is possible for a graph that contains cycles
• One algorithm for topological sorting:
  – Iteratively remove nodes with no incoming edges until all nodes are removed (see next slide)
  – Can be used to find cycles in a directed graph
    • If no remaining node has no incoming edges, there must be a cycle
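The remove-nodes-with-no-incoming-edges algorithm can be sketched as follows (vertices numbered 0..n-1, graph as an adjacency list; a sketch, not the lecture's code):

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Topological sort by repeatedly removing vertices with indegree 0.
// Returns null if not all vertices can be removed, i.e. there is a cycle.
public class TopoSort {
    static List<Integer> topoSort(List<List<Integer>> adj) {
        int n = adj.size();
        int[] indegree = new int[n];
        for (List<Integer> edges : adj)
            for (int to : edges) indegree[to]++;  // count incoming edges

        Queue<Integer> ready = new ArrayDeque<>(); // vertices with no incoming edges
        for (int v = 0; v < n; v++)
            if (indegree[v] == 0) ready.add(v);

        List<Integer> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            int v = ready.poll();
            order.add(v);                          // "remove" v from the graph
            for (int to : adj.get(v))
                if (--indegree[to] == 0) ready.add(to);
        }
        return order.size() == n ? order : null;   // null signals a cycle
    }
}
```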

Page 49: Prelim 2 Review

Topological Sorting

[Figure: a five-vertex DAG on A, B, C, D, E, and the topological ordering produced by repeatedly removing vertices with no incoming edges]

Page 50: Prelim 2 Review

Graph Searching

• Works on directed and undirected graphs
• You have a start vertex, which you visit first
• You want to visit all vertices reachable from the start vertex
  – For directed graphs, depending on your start vertex, some vertices may not be reachable
• You can traverse an edge from an already visited vertex to another vertex

Page 51: Prelim 2 Review

Graph Searching

• Why is choosing any path on a graph risky?
  – Cycles!
  – Could traverse a cycle forever
• Need to keep track of vertices already visited
  – No cycles if you do not visit a vertex twice
• Might also help to keep track of all unvisited vertices you can visit from a visited vertex

Page 52: Prelim 2 Review

Graph Searching

• Add the start vertex to the collection of vertices to visit
• Pick a vertex from the collection to visit
  – If you have already visited it, do nothing
  – If you have not visited it:
    • Visit that vertex
    • Follow its edges to neighboring vertices
    • Add unvisited neighboring vertices to the set to visit
    • (You may add the same unvisited vertex twice)
• Repeat until there are no more vertices to visit

Page 53: Prelim 2 Review

Graph Searching

• Runtime analysis
  – Visit each vertex only once
  – When you visit a vertex, you traverse its edges
    • You traverse all edges once on a directed graph
    • Twice on an undirected graph
  – At worst, you add a new vertex to the collection to visit for each edge (the collection has size O(m))
  – Lower bound is Ω(n + m)
    • Actual cost depends on the cost to add/delete vertices to/from the collection of vertices to visit

Page 54: Prelim 2 Review

Graph Searching

• Depth-first search and breadth-first search are two graph searching algorithms
• DFS pushes vertices to visit onto a stack
  – Examines a vertex by popping it off the stack
• BFS uses a queue instead
• Both have O(n + m) running time
  – Push/enqueue and pop/dequeue have O(1) time

Page 55: Prelim 2 Review

Graph Searching: Pseudocode

Node search(Node startNode, List<Node> nodes) {
  BagType<Node> bag = new BagType<Node>();
  Set<Node> visited = new Set<Node>();
  bag.insert(startNode);
  while (!bag.isEmpty()) {
    node = bag.extract();
    if (visited.contains(node))
      continue;                 // Already visited
    if (found(node))
      return node;              // Search has found its goal
    visited.add(node);          // Mark node as visited
    for (Node neighbor : node.getNeighbors())
      bag.insert(neighbor);
  }
  return null;                  // Goal not reachable from startNode
}

If generic type BagType is a Queue, this is BFS
If it’s a Stack, this is DFS

Page 56: Prelim 2 Review

Graph Searching: DFS

[Figure: DFS from A on a five-vertex graph. The stack receives the edges ∅-A, A-B, B-C, B-E, E-D, and the vertices are visited in the order A, B, E, D, C]

Page 57: Prelim 2 Review

Graph Searching: BFS

[Figure: BFS from A on the same graph. The queue receives the edges ∅-A, A-B, B-C, B-E, E-D, C-D, and the vertices are visited in breadth-first order]

Page 58: Prelim 2 Review

Spanning Trees

• A spanning tree is a subgraph of an undirected graph that:
  – Is a tree
  – Contains every vertex in the graph
• Number of edges in a tree: m = n - 1

Page 59: Prelim 2 Review

Minimum Spanning Trees (MST)

• Spanning tree with minimum sum of edge weights
  – Prim’s algorithm
  – Kruskal’s algorithm
  – Not necessarily unique

Page 60: Prelim 2 Review

Prim’s algorithm

• Graph search algorithm; builds up a spanning tree from one root vertex
• Like BFS, but it uses a priority queue
  – Priority is the weight of the edge to the vertex
  – Also need to keep track of which edge we used
• Always picks the smallest edge to an unvisited vertex
• Runtime is O(m log m)
  – O(m) priority queue operations at O(log m) each

Page 61: Prelim 2 Review

Prim’s Algorithm

[Figure: Prim’s algorithm on a seven-vertex weighted graph (A–G). The priority queue holds candidate edges ∅-A (0), A-C (2), A-B (5), C-B (4), C-D (3), D-E (8), D-F (10), B-E (7), D-G (12), E-G (9), G-F (1); at each step the cheapest edge to an unvisited vertex is extracted and added to the tree]

Page 62: Prelim 2 Review

Kruskal’s Algorithm

• Idea: Find the MST by connecting forest components using the shortest edges
  – Process edges from least to greatest weight
  – Initially, every node is its own component
  – Either an edge connects two different components or it connects a component to itself
    • Add an edge only in the former case
  – Picks the smallest edge between two components
  – O(m log m) time to sort the edges
    • Also need the union-find structure to keep track of components, but it does not change the running time

Page 63: Prelim 2 Review

Kruskal’s Algorithm

[Figure: Kruskal’s algorithm on the same seven-vertex weighted graph (A–G); edges are darkened as they are added to the tree, processed from least weight to greatest]

Page 64: Prelim 2 Review

Dijkstra’s Algorithm

• Compute the length of the shortest path from a source vertex to every other vertex
• Works on directed and undirected graphs
• Works only on graphs with non-negative edge weights
• O(m log m) runtime when implemented with a priority queue, same as Prim’s

Page 65: Prelim 2 Review

Dijkstra’s Algorithm

• Similar to Prim’s algorithm
• Difference lies in the priority
  – Priority is the length of the shortest path to a visited vertex + the cost of the edge to the unvisited vertex
  – We know the shortest path to every visited vertex
• On unweighted graphs, BFS gives us the same result as Dijkstra’s algorithm
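A compact sketch of Dijkstra's algorithm with java.util.PriorityQueue (a sketch under the stated assumptions: graph given as an adjacency matrix where 0 means no edge, and all weights non-negative):

```java
import java.util.Arrays;
import java.util.PriorityQueue;

// Dijkstra's algorithm: shortest distance from source to every vertex.
// Priority = current path length; entries in the queue are {dist, vertex}.
public class DijkstraDemo {
    static int[] shortestPaths(int[][] weight, int source) {
        int n = weight.length;
        int[] dist = new int[n];
        Arrays.fill(dist, Integer.MAX_VALUE);
        dist[source] = 0;
        PriorityQueue<int[]> pq = new PriorityQueue<>((x, y) -> x[0] - y[0]);
        pq.add(new int[]{0, source});
        boolean[] visited = new boolean[n];
        while (!pq.isEmpty()) {
            int v = pq.poll()[1];
            if (visited[v]) continue;         // already settled via a shorter path
            visited[v] = true;                // dist[v] is now final
            for (int u = 0; u < n; u++) {
                if (weight[v][u] > 0 && !visited[u]
                        && dist[v] + weight[v][u] < dist[u]) {
                    dist[u] = dist[v] + weight[v][u];  // shorter path found
                    pq.add(new int[]{dist[u], u});
                }
            }
        }
        return dist;
    }
}
```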

Page 66: Prelim 2 Review

Dijkstra’s Algorithm

[Figure: Dijkstra’s algorithm run from source A on a seven-vertex weighted graph (A–G). The priority queue holds candidate paths ∅-A (0), A-B (5), A-C (2), C-B (6), C-D (5), B-E (12), D-E (13), D-F (15), D-G (20), F-G (16), E-G (21); the final shortest-path distances are A=0, C=2, B=5, D=5, E=12, F=15, G=16]

Page 67: Prelim 2 Review

Dijkstra’s Algorithm

• Computes shortest path lengths
• Optionally, can compute the shortest paths as well
  – Normally, we store the shortest distance to the source for each node
  – But if we also store a back-pointer to the neighbor nearest the source, we can reconstruct the shortest path from any node to the source by following pointers
  – Example (from previous slide):
    • When edge B-E (12) is removed from the priority queue, we mark vertex E with distance 12. We can also store a pointer to B, since it is the neighbor of E closest to the source A.