
Symbol Tables

Chapter 12.1-12.3 Sedgewick


Searching

Searching is a fundamental element of many computational tasks:
- looking up a name in a phone book
- selecting records in databases
- searching for pages on the web

Characteristics of searching:
- typically, a very large amount of data (very many items)
- the "information need" is specified by keys (search terms)
- effective keys identify a small proportion of the data


…Searching

In our context (DS&A), we abstract the problem to:
- we have a large collection of items
- each item contains key values and other data

The search problem:
- input: a key value
- output: item(s) containing that key


…Searching

We assume:
- keys are unique
- each item has one key

How do we represent a key in an item?
- Extend Item.h to handle complex items.
- Items now have a key and data.


Complex Items – Item.h

typedef int Key;                  // Can change key type

struct record {
    Key  keyVal;
    char value[10];               // Can change value type
};

typedef struct record *Item;

#define NULLitem NULL             // This indicates no item
#define key(A) ((A)->keyVal)
#define compare(A,B) ((A) - (B))  // <0, 0 or >0, for use on keys

void itemShow(Item);
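
A minimal sketch of how a client might create and compare items with these definitions. The constructor itemNew is a hypothetical helper (Item.h above does not declare one), and the body of itemShow is only one possible implementation.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef int Key;
struct record {
    Key  keyVal;
    char value[10];
};
typedef struct record *Item;

#define NULLitem NULL
#define key(A) ((A)->keyVal)
#define compare(A,B) ((A) - (B))

// Hypothetical constructor (an assumption, not part of the original Item.h).
// Values longer than 9 characters are truncated to fit value[10].
Item itemNew(Key k, const char *value) {
    Item it = malloc(sizeof(struct record));
    assert(it != NULL);
    it->keyVal = k;
    strncpy(it->value, value, sizeof(it->value) - 1);
    it->value[sizeof(it->value) - 1] = '\0';
    return it;
}

// One possible itemShow: print "key: value" on a line
void itemShow(Item it) {
    if (it == NULLitem) {
        printf("(no item)\n");
    } else {
        printf("%d: %s\n", key(it), it->value);
    }
}
```

Note that compare is applied to keys, e.g. compare(key(a), key(b)), not to the items themselves.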


Symbol Tables

A symbol table (dictionary) is a collection of items with unique keys that has operations to:
- insert a new item
- return an item with a particular key

Applications of symbol tables:
- programming language processors (e.g. compilers, interpreters)
- text processing systems (spell-checkers, document retrieval)


Symbol Table Interface

SymbolTable.h

typedef struct sTabRep *STab;

// Create a new symbol table
STab stInit();

// Insert an item in the symbol table
void stInsert(STab s, Item i);

// Return the item with the given key in the table;
// return NULLitem if the key is not in the table
Item stSearch(STab s, Key k);


…First Class Symbol Table Interface

SymbolTable.h

// Return the number of items in the table
int stCount(STab s);

// Delete the given item from the table
void stDelete(STab s, Item i);

// Find the nth item in the table
Item stSelect(STab s, int n);

// Traverse items in key order
void stSort(STab s);


Insertion into the Symbol Table

What does insert do if the key already exists in the table?
- approach A: do nothing (insertion fails silently)
- approach B: return an error indication
- approach C: replace the existing item associated with the key

We use approach A and provide a replace function if necessary.


Example: Symbol Table Client

Using a symbol table:
- Generate an ordered list of random numbers with no duplicates
  - Use stSearch to check if the number is already in the table
  - If not, use stInsert to insert it!
- Use stSort to print out the numbers in order
- Only really care about the key of the item

See stClient1.c for full implementation


Exercise: Random Number Checker

Write a program that uses a symbol table to:
- generate many random #'s in the range 1..N
- count the frequency of occurrence of each
- Expectation: all frequencies roughly equal

What are the items?
- What does the key represent?
- What does the value represent?


Symbol Table Implementation 1: Key Indexed Array

- Use key to determine the position of the item in the array
- Requires dense keys (i.e. few gaps)
- Keys must be integral (or easy to map to integral values)

items:
  [0]       [1]     [2]       [3]     [4]     [5]     [6]       [7]
  NULLitem  1,data  NULLitem  3,data  4,data  5,data  NULLitem  7,data


ST_keyIndexed.c

struct sTabRep {
    Item *items;
    int count;
    int size;
};

// Assume keys are from 0 to (max-1) and are unique.
// max (the table capacity) is assumed to be defined elsewhere.
STab stInit() {
    int i;
    STab st = malloc(sizeof(struct sTabRep));
    assert(st != NULL);
    st->items = malloc(max * sizeof(Item));
    assert(st->items != NULL);
    for (i = 0; i < max; i++) {
        st->items[i] = NULLitem;
    }
    st->count = 0;
    st->size = max;
    return st;
}


…ST_keyIndexed.c

int stCount(STab st) {
    assert(st != NULL);
    return st->count;
}

void stInsert(STab st, Item i) {
    assert(st != NULL);
    // Insert only if the key is in range and its slot is empty (approach A)
    if (key(i) >= 0 && compare(key(i), st->size) < 0 &&
        st->items[key(i)] == NULLitem) {
        st->items[key(i)] = i;
        st->count++;
    }
}

// Exercise:
Item stSearch(STab st, Key k);
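
One possible answer to the stSearch exercise, as a self-contained sketch. Because the key is the array index, a lookup is a single array access. MAX_KEYS is a hypothetical capacity standing in for max, and the out-of-range check is an added assumption about how bad keys should be handled.

```c
#include <assert.h>
#include <stdlib.h>

typedef int Key;
struct record {
    Key  keyVal;
    char value[10];
};
typedef struct record *Item;

#define NULLitem NULL
#define key(A) ((A)->keyVal)

#define MAX_KEYS 8   // hypothetical capacity: keys are 0..MAX_KEYS-1

typedef struct sTabRep *STab;
struct sTabRep {
    Item *items;
    int count;
    int size;
};

STab stInit(void) {
    STab st = malloc(sizeof(struct sTabRep));
    assert(st != NULL);
    st->items = malloc(MAX_KEYS * sizeof(Item));
    assert(st->items != NULL);
    for (int i = 0; i < MAX_KEYS; i++) {
        st->items[i] = NULLitem;
    }
    st->count = 0;
    st->size = MAX_KEYS;
    return st;
}

// Approach A: silently ignore out-of-range keys and occupied slots
void stInsert(STab st, Item it) {
    assert(st != NULL && it != NULLitem);
    Key k = key(it);
    if (k >= 0 && k < st->size && st->items[k] == NULLitem) {
        st->items[k] = it;
        st->count++;
    }
}

// O(1) lookup: the key is the index; empty slots already hold NULLitem
Item stSearch(STab st, Key k) {
    assert(st != NULL);
    if (k < 0 || k >= st->size) {
        return NULLitem;
    }
    return st->items[k];
}
```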


…ST_keyIndexed.c

// Exercise
void stDelete(STab st, Item i) {
}

// Exercise
Item stSelect(STab st, int n) {
}
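
A sketch of one way to fill in the two exercises, under stated assumptions: stDelete removes whatever item occupies the slot for the given item's key, and stSelect counts from 0, so n = 0 returns the item with the smallest key. stInitMax is a hypothetical helper that takes the capacity as a parameter.

```c
#include <assert.h>
#include <stdlib.h>

typedef int Key;
struct record {
    Key  keyVal;
    char value[10];
};
typedef struct record *Item;

#define NULLitem NULL
#define key(A) ((A)->keyVal)

typedef struct sTabRep *STab;
struct sTabRep {
    Item *items;
    int count;
    int size;
};

// Hypothetical helper (not in the slides): table for keys 0..max-1
STab stInitMax(int max) {
    assert(max > 0);
    STab st = malloc(sizeof(struct sTabRep));
    assert(st != NULL);
    st->items = malloc(max * sizeof(Item));
    assert(st->items != NULL);
    for (int i = 0; i < max; i++) {
        st->items[i] = NULLitem;
    }
    st->count = 0;
    st->size = max;
    return st;
}

void stInsert(STab st, Item it) {
    assert(st != NULL && it != NULLitem);
    Key k = key(it);
    if (k >= 0 && k < st->size && st->items[k] == NULLitem) {
        st->items[k] = it;
        st->count++;
    }
}

// O(1): clear the slot indexed by the item's key, if it is occupied
void stDelete(STab st, Item it) {
    assert(st != NULL && it != NULLitem);
    Key k = key(it);
    if (k >= 0 && k < st->size && st->items[k] != NULLitem) {
        st->items[k] = NULLitem;
        st->count--;
    }
}

// O(size): skip empty slots until n occupied slots have been passed
Item stSelect(STab st, int n) {
    assert(st != NULL);
    if (n < 0) {
        return NULLitem;
    }
    for (int i = 0; i < st->size; i++) {
        if (st->items[i] != NULLitem) {
            if (n == 0) {
                return st->items[i];
            }
            n--;
        }
    }
    return NULLitem;
}
```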


…ST_keyIndexed.c

// Traverse all items in sorted order
void stSort(STab st) {
    int i;
    assert(st != NULL);
    for (i = 0; i < st->size; i++) {
        if (st->items[i] != NULLitem) {
            itemShow(st->items[i]);
        }
    }
}


Properties: Key Indexed Array Implementation

- Insert, search, delete and count are O(1)
- Init, select and sort are O(n), where n is the size of the array (the key range)

Problem: May have large gaps in the array due to sparse keys.

Not suitable for all types of data:
- Large range of keys
- Key cannot easily be mapped to a unique index


Symbol Table Implementation 2: Ordered Array

- Enter items into the array without leaving gaps
- Put items in key order
- Can use linear or binary search to find items


ST_orderedArray.c

// Data structure representation is the same
struct sTabRep {
    Item *items;
    int count;
    int size;
};

// max (the table capacity) is assumed to be defined elsewhere.
STab stInit() {
    STab st = malloc(sizeof(struct sTabRep));
    assert(st != NULL);
    st->items = malloc(max * sizeof(Item));
    assert(st->items != NULL);
    st->count = 0;
    st->size = max;
    return st;
}


…ST_orderedArray.c

Item stSearch(STab st, Key k) {
    int i;
    Item returnVal;
    assert(st != NULL && st->items != NULL);
    // i is the position of k, or the position where k would be inserted
    i = findInArray(k, st->items, 0, st->count - 1);
    if (i < st->count && compare(key(st->items[i]), k) == 0) {
        returnVal = st->items[i];
    } else {
        returnVal = NULLitem;
    }
    return returnVal;
}


…ST_orderedArray.c

void stInsert(STab st, Item it) {
    assert(st != NULL && st->items != NULL);
    assert(st->count < st->size);
    int i = findInArray(key(it), st->items, 0, st->count - 1);
    // Approach A: do nothing if the key is already in the table
    if (i == st->count || compare(key(st->items[i]), key(it)) != 0) {
        int j;
        // Shift larger items up one slot to make room at position i
        for (j = st->count; j > i; j--) {
            st->items[j] = st->items[j-1];
        }
        st->items[i] = it;
        st->count++;
    }
}


Linear Search for findInArray()

// Returns the index of the first item whose key is >= k.
// Does not indicate if k is there or not.
// Return value lo indicates k <= all keys in the array.
// Return value hi+1 (N for a whole array of size N) indicates that
// k is larger than all keys in the array.
int findInArray(Key k, Item a[], int lo, int hi) {
    int i;
    for (i = lo; i <= hi; i++) {
        if (compare(k, key(a[i])) <= 0) {
            break;
        }
    }
    return i;
}


Binary Search for findInArray()

int findInArray(Key k, Item a[], int lo, int hi) {
    int returnVal;
    if (hi < lo) {
        // Empty range: lo is where k would be inserted
        returnVal = lo;
    } else {
        int mid = lo + (hi - lo) / 2;
        int diff = compare(k, key(a[mid]));
        if (diff < 0) {
            returnVal = findInArray(k, a, lo, mid - 1);
        } else if (diff > 0) {
            returnVal = findInArray(k, a, mid + 1, hi);
        } else {
            returnVal = mid;
        }
    }
    return returnVal;
}


Cost Analysis of Searching

Linear Search:
- best case: key is min key (1 comparison)
- worst case: key is not in array (N comparisons)
- average case: key is in middle (N/2 comparisons)

Binary Search:
- best case: key is mid key (1 comparison)
- worst case: key is not in array (log2 N comparisons)
- average case: find key part-way through partitioning
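
The best and worst cases above can be checked by counting comparisons. This is an illustrative sketch over plain int arrays rather than Items; linearSearch and binarySearch are hypothetical names, and each probe of an array element is counted as one comparison. For a sorted array of N = 16 keys, binary search on a key larger than every element makes floor(log2 N) + 1 = 5 probes, while linear search makes 16.

```c
#include <assert.h>

// Linear search over a sorted array; stops early once a[i] > k.
// Returns the index of k or -1, and reports the number of probes.
int linearSearch(const int a[], int n, int k, int *comparisons) {
    *comparisons = 0;
    for (int i = 0; i < n; i++) {
        (*comparisons)++;
        if (a[i] == k) {
            return i;
        }
        if (a[i] > k) {
            break;
        }
    }
    return -1;
}

// Iterative binary search over a sorted array.
// Returns the index of k or -1, and reports the number of probes.
int binarySearch(const int a[], int n, int k, int *comparisons) {
    int lo = 0, hi = n - 1;
    *comparisons = 0;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;
        (*comparisons)++;
        if (k == a[mid]) {
            return mid;
        }
        if (k < a[mid]) {
            hi = mid - 1;
        } else {
            lo = mid + 1;
        }
    }
    return -1;
}
```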


Properties: Ordered Array Implementation

- Init, Count: O(1)
- Search: O(log n)
  - Assuming binary search
- Insertion, Deletion: O(n)
  - Need to shuffle items along to make room (insertion) or fill gaps (deletion)
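
A sketch of why deletion is O(n) in the ordered array: after the item is located, every later item moves down one slot to close the gap. The slides do not show stDelete for this implementation, so this is one plausible version. findInArray here is an iterative binary search that returns the position of the key, or the position where it would be inserted.

```c
#include <assert.h>
#include <stdlib.h>

typedef int Key;
struct record {
    Key  keyVal;
    char value[10];
};
typedef struct record *Item;

#define NULLitem NULL
#define key(A) ((A)->keyVal)
#define compare(A,B) ((A) - (B))

typedef struct sTabRep *STab;
struct sTabRep {
    Item *items;
    int count;
    int size;
};

// Iterative binary search: index of k, or its insertion point if absent
int findInArray(Key k, Item a[], int lo, int hi) {
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;
        int diff = compare(k, key(a[mid]));
        if (diff == 0) {
            return mid;
        }
        if (diff < 0) {
            hi = mid - 1;
        } else {
            lo = mid + 1;
        }
    }
    return lo;
}

// Remove the item whose key matches it's key; no effect if absent
void stDelete(STab st, Item it) {
    assert(st != NULL && st->items != NULL && it != NULLitem);
    int i = findInArray(key(it), st->items, 0, st->count - 1);
    if (i < st->count && compare(key(st->items[i]), key(it)) == 0) {
        // Shuffle the later items down one slot to fill the gap
        for (int j = i; j < st->count - 1; j++) {
            st->items[j] = st->items[j + 1];
        }
        st->count--;
    }
}
```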


Symbol Table Implementation 3: Linked Lists

- Linked list of items, maintained in key order
- No real need for max size
- Must use linear search


ST_LinkedList.c

typedef struct sTabRep *STab;
typedef struct node Node;

struct node {
    Item data;
    Node *next;
};

struct sTabRep {
    Node *items;
    int count;
    int max;
};
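
A minimal sketch of insert and search on this representation, assuming approach A for duplicate keys. The slides give only the structs, so the function bodies below are one plausible version. The max field is kept from the struct above but is unused here.

```c
#include <assert.h>
#include <stdlib.h>

typedef int Key;
struct record {
    Key  keyVal;
    char value[10];
};
typedef struct record *Item;

#define NULLitem NULL
#define key(A) ((A)->keyVal)
#define compare(A,B) ((A) - (B))

typedef struct sTabRep *STab;
typedef struct node Node;

struct node {
    Item data;
    Node *next;
};

struct sTabRep {
    Node *items;
    int count;
    int max;      // unused in this sketch
};

STab stInit(void) {
    STab st = malloc(sizeof(struct sTabRep));
    assert(st != NULL);
    st->items = NULL;
    st->count = 0;
    st->max = 0;
    return st;
}

// Linear search; stops early once list keys exceed k (list is ordered)
Item stSearch(STab st, Key k) {
    assert(st != NULL);
    for (Node *cur = st->items; cur != NULL; cur = cur->next) {
        int diff = compare(k, key(cur->data));
        if (diff == 0) {
            return cur->data;
        }
        if (diff < 0) {
            break;
        }
    }
    return NULLitem;
}

// Insert in key order; an existing key is left untouched (approach A)
void stInsert(STab st, Item it) {
    assert(st != NULL && it != NULLitem);
    Node **link = &st->items;
    while (*link != NULL && compare(key((*link)->data), key(it)) < 0) {
        link = &(*link)->next;
    }
    if (*link != NULL && compare(key((*link)->data), key(it)) == 0) {
        return;
    }
    Node *n = malloc(sizeof(Node));
    assert(n != NULL);
    n->data = it;
    n->next = *link;
    *link = n;
    st->count++;
}
```

Walking a pointer to the link field avoids a special case for inserting at the head of the list.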


Properties: Ordered Linked List Implementation

- Init, count: O(1)
- Search, Insert, Delete: O(n)
  - best case: key is min key (1 comparison)
  - worst case: key is max key (n comparisons)
  - average case: key is in middle (n/2 comparisons)


Symbol Table Implementation 4: Binary Search Tree


ST_bst.c

typedef struct sTabRep *STab;
typedef struct node Node;
typedef struct node *Tree;

struct node {
    Item data;
    Node *left;
    Node *right;
};

struct sTabRep {
    Node *items;
    int count;
    int size;
};
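
A minimal sketch of insert and search on this representation, again assuming approach A for duplicate keys. The slides give only the structs; insertNode is a hypothetical recursive helper, and the size field is kept from the struct above but unused.

```c
#include <assert.h>
#include <stdlib.h>

typedef int Key;
struct record {
    Key  keyVal;
    char value[10];
};
typedef struct record *Item;

#define NULLitem NULL
#define key(A) ((A)->keyVal)
#define compare(A,B) ((A) - (B))

typedef struct sTabRep *STab;
typedef struct node Node;

struct node {
    Item data;
    Node *left;
    Node *right;
};

struct sTabRep {
    Node *items;
    int count;
    int size;     // unused in this sketch
};

// Hypothetical helper: insert into subtree t and return its new root.
// *inserted is set to 1 only when a new node is created.
Node *insertNode(Node *t, Item it, int *inserted) {
    if (t == NULL) {
        Node *n = malloc(sizeof(Node));
        assert(n != NULL);
        n->data = it;
        n->left = NULL;
        n->right = NULL;
        *inserted = 1;
        return n;
    }
    int diff = compare(key(it), key(t->data));
    if (diff < 0) {
        t->left = insertNode(t->left, it, inserted);
    } else if (diff > 0) {
        t->right = insertNode(t->right, it, inserted);
    }
    return t;   // equal key: leave the tree unchanged (approach A)
}

void stInsert(STab st, Item it) {
    assert(st != NULL && it != NULLitem);
    int inserted = 0;
    st->items = insertNode(st->items, it, &inserted);
    st->count += inserted;
}

// Follow one root-to-leaf path, so the cost is bounded by the height
Item stSearch(STab st, Key k) {
    assert(st != NULL);
    Node *t = st->items;
    while (t != NULL) {
        int diff = compare(k, key(t->data));
        if (diff == 0) {
            return t->data;
        }
        t = (diff < 0) ? t->left : t->right;
    }
    return NULLitem;
}
```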


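Properties: Binary Search Tree Implementation

- Init, Count: O(1)
  - We use a counter to keep track of how many items are in the tree
- Insert, Search, Delete: proportional to the height of the tree
  - average case (keys inserted in random order): O(log n)
  - worst case (degenerate tree, e.g. keys inserted in sorted order): O(n)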
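
The effect of insertion order on height can be seen with a small experiment. This sketch uses plain int keys instead of Items to stay short; bstInsert, bstHeight, insertSorted and insertBalanced are hypothetical names. Inserting 1..15 in sorted order gives a chain of height 15, while inserting the middle key of each range first gives a perfectly balanced tree of height 4.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct node Node;
struct node {
    int key;
    Node *left;
    Node *right;
};

// Standard BST insert; duplicate keys are ignored
Node *bstInsert(Node *t, int k) {
    if (t == NULL) {
        Node *n = malloc(sizeof(Node));
        assert(n != NULL);
        n->key = k;
        n->left = NULL;
        n->right = NULL;
        return n;
    }
    if (k < t->key) {
        t->left = bstInsert(t->left, k);
    } else if (k > t->key) {
        t->right = bstInsert(t->right, k);
    }
    return t;
}

// Height as the number of nodes on the longest root-to-leaf path
int bstHeight(Node *t) {
    if (t == NULL) {
        return 0;
    }
    int l = bstHeight(t->left);
    int r = bstHeight(t->right);
    return 1 + (l > r ? l : r);
}

// Insert lo..hi in increasing order: every node becomes a right child
Node *insertSorted(Node *t, int lo, int hi) {
    for (int k = lo; k <= hi; k++) {
        t = bstInsert(t, k);
    }
    return t;
}

// Insert the middle key first, then each half, to keep the tree balanced
Node *insertBalanced(Node *t, int lo, int hi) {
    if (lo > hi) {
        return t;
    }
    int mid = lo + (hi - lo) / 2;
    t = bstInsert(t, mid);
    t = insertBalanced(t, lo, mid - 1);
    return insertBalanced(t, mid + 1, hi);
}
```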