Object-Oriented Programming in Python Goldwasser and Letscher Chapter 12 More Python Containers Terry Scott University of Northern Colorado 2007 Prentice Hall


Object-Oriented Programming in Python
Goldwasser and Letscher

Chapter 12
More Python Containers

Terry Scott
University of Northern Colorado

2007 Prentice Hall


Introduction: What is Covered in Chapter 12

• Lists and Tuples.

• Dictionaries.

• Containers of Containers.

• Sets.

• Arrays.

• Python’s internal use of dictionaries.

• Case Study: a simple search engine.


Aspects of Containers

• Order: ordered sequence (list, tuple, array).
• Mutability: a list is mutable; a tuple is immutable.
• Associativity: dict (dictionary).
• Heterogeneity: most Python containers allow elements of different types. Some containers in other languages, and one that will be introduced in this chapter, allow only one type for the individual components. Such a container is called homogeneous.
• Storage: Python uses referential containers, meaning that the container holds references (addresses) to the items rather than the items themselves.
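The referential storage described above can be observed directly. The following is a minimal sketch in Python 3 syntax (the slides use Python 2); the names scores and roster are made up for illustration.

```python
scores = [90, 85]
roster = ['Ann', scores]        # roster stores a reference to scores, not a copy

scores.append(70)               # change the object through its original name

same_object = roster[1] is scores   # the list element and scores are one object
seen_through_roster = roster[1]     # the change is visible through the container

shallow = list(roster)          # a new list, but it holds the same references
still_shared = shallow[1] is scores
```

Because only references are stored, copying a container with list() copies the references; the referenced items themselves are shared.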


Summary of Aspects of Python Containers

                  list   tuple   dict   set   frozenset   array
Ordered            X       X                                X
Mutable            X              X      X                  X
Associative                       X
Heterogeneous      X       X      X      X        X
Compact storage                                             X


Lists and Tuples

• Use indexes that are sequential.
• Can add or subtract a value so that index values start at 0, as these containers require (example: for months 1–12, subtract 1).
• Limitations:
  – Lack of permanency: with employee IDs as indexes, what happens when an employee leaves?
  – Using Social Security numbers for employees: SSNs are not sequential.
  – The index values may not be numeric.


Dictionaries

• Can use non-numeric index values.
  – director['Star Wars'] → 'George Lucas'
  – director['The Godfather'] → 'Francis Ford Coppola'
  – director['American Graffiti'] → 'George Lucas'
• Index values are called keys.
• Keys must be immutable (for example int, str, tuple).
• The items a dictionary maps to, such as 'George Lucas', are called values.
• No limits on values.


Python's Dictionary Class

    director = { }     # can also use director = dict()
    director['Star Wars'] = 'George Lucas'
    director['The Godfather'] = 'Francis Ford Coppola'
    director['American Graffiti'] = 'George Lucas'
    director['Princess Bride'] = 'Rob Reiner'

    # can also do the following
    director = {'Star Wars': 'George Lucas',
                'The Godfather': 'Francis Ford Coppola',
                'American Graffiti': 'George Lucas',
                'Princess Bride': 'Rob Reiner'}


Dictionary Behaviors

Syntax          Semantics
d[k]            Returns the value at key k; raises KeyError if k is not found.
d[k] = value    Assigns value to d at key k.
k in d          True if k is a key of d, else False.
len(d)          Returns the number of items in d.
d.clear()       Makes d empty.
d.pop(k)        Removes key k and returns its value to the caller.


Dictionary Behaviors

Syntax          Semantics
d.popitem()     Removes and returns an arbitrary (key, value) pair.
d.keys()        Returns a list of all keys in d.
d.values()      Returns a list of all values in d.
d.items()       Returns a list of (key, value) tuples.
for k in d      Iterates over all keys; short for: for k in d.keys()
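A short sketch exercising several of the behaviors in the two tables, written in Python 3 syntax (the slides use Python 2). One difference worth noting: in Python 3, keys(), values() and items() return view objects rather than lists, so the example passes the keys through sorted() to get a list. The key 'Jaws' is a made-up missing key.

```python
director = {'Star Wars': 'George Lucas',
            'The Godfather': 'Francis Ford Coppola',
            'Princess Bride': 'Rob Reiner'}

director['American Graffiti'] = 'George Lucas'    # d[k] = value
count = len(director)                              # number of items
key_found = 'Star Wars' in director                # membership tests keys...
value_found = 'George Lucas' in director           # ...not values

removed = director.pop('Princess Bride')           # removes the key, returns its value
titles = sorted(director.keys())                   # sorted() works on a list or a view

try:
    director['Jaws']                               # missing key
    missing_raised = False
except KeyError:
    missing_raised = True
```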


Iterating Through Entries of a Dictionary

    # display a dictionary in sorted order on keys
    titles = director.keys()
    titles.sort()
    for movie in titles:
        print movie, 'was directed by', director[movie]

    # can streamline this syntax
    for movie in sorted(director):
        print movie, 'was directed by', director[movie]

    # can iterate over both keys and values
    for movie, person in director.items():
        print movie, 'was directed by', person


Containers of Containers

• list, tuple and dict can have values of any type, including containers.
• Modeling a two-dimensional table (a list of lists):

    game = [['X','X','O'], ['O','O','X'], ['X','O','X']]

• bottomLeft = game[2][0]

    X X O
    O O X
    X O X
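With a list of lists, the first index selects a row and the second selects a column within that row. A minimal sketch using the same board (the variable names other than game are made up for illustration):

```python
game = [['X', 'X', 'O'],
        ['O', 'O', 'X'],
        ['X', 'O', 'X']]

bottom_left = game[2][0]                         # row 2, then column 0
middle_row = game[1]                             # a whole row is itself a list
right_column = [row[2] for row in game]          # same column from every row
main_diagonal = [game[i][i] for i in range(3)]   # top-left to bottom-right
```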


Tic-tac-toe Representation


Modeling Many-to-Many Relationships

• A dictionary is a many-to-one relationship: many keys may map to the same value.

• Modeling a many-to-many relationship: for example, a single movie may have many actors, and an actor may appear in many movies.

    cast['The Princess Bride'] = ('Cary Elwes', 'Robin Wright Penn', 'Chris Sarandon', 'Mandy Patinkin', 'Andre the Giant', ...)


Modeling Many-to-Many Relationships (continued)

>>> # A tuple is used since it is immutable, and so is the cast of a movie.
>>> 'Andre the Giant' in cast
False
>>> # 'in' tests the keys (movie titles), not the values
>>> 'Andre the Giant' in cast.values()
False
>>> # each value is an entire tuple, so test within a specific movie's cast
>>> 'Andre the Giant' in cast['The Princess Bride']
True
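To answer the reverse question (which movies include a given actor?) every cast has to be examined. The sketch below, in Python 3 syntax, is one way to do it; moviesWith is a hypothetical helper, and the second dictionary entry and its 'Some Other Actor' member are made up purely for illustration.

```python
cast = {
    'The Princess Bride': ('Cary Elwes', 'Robin Wright Penn',
                           'Mandy Patinkin', 'Andre the Giant'),
    'Hypothetical Sequel': ('Cary Elwes', 'Some Other Actor'),   # invented entry
}

def moviesWith(actor, cast):
    """Return a sorted list of titles whose cast tuple contains actor."""
    return sorted(title for title, actors in cast.items() if actor in actors)

both = moviesWith('Cary Elwes', cast)
one = moviesWith('Andre the Giant', cast)
none = moviesWith('George Lucas', cast)
```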


Reverse Dictionary

• What if we want to know the keys that are associated with an item?

• Can do this one at a time or build a complete reverse dictionary.

original = {'A':1, 'B':3, 'C':3, 'D':4, 'E': 1, 'F': 3}

reverse = {1: ['A', 'E'], 3: ['C', 'B', 'F'], 4: ['D'] }


Reverse Dictionary Code

    # can build a reverse dictionary
    def buildReverse(dictionary):
        reverse = { }
        for key, value in dictionary.items():
            if value in reverse:
                reverse[value].append(key)
            else:
                reverse[value] = [key]
        return reverse
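As an alternative sketch (not the slide's version), the if/else can be folded into a single call to dict.setdefault, which returns the existing list for a value or installs a new empty one. The example applies it to the original dictionary from the previous slide and is written in Python 3 syntax.

```python
def buildReverse(dictionary):
    """Map each value to the list of keys that had that value."""
    reverse = {}
    for key, value in dictionary.items():
        # setdefault returns reverse[value], creating [] first if it is absent
        reverse.setdefault(value, []).append(key)
    return reverse

original = {'A': 1, 'B': 3, 'C': 3, 'D': 4, 'E': 1, 'F': 3}
reverse = buildReverse(original)
keys_for_3 = sorted(reverse[3])
```

The order of keys inside each list follows the iteration order of the original dictionary, so it is sorted here before being relied upon.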


Sets and Frozensets

• Originally programmers created a mathematical set class using a list or dictionary.

• Python now contains a set class that is mutable and a frozenset class that is immutable.

• Elements within the set are in arbitrary order.
• Elements added to a set must be immutable.
• Elements of the set occur only once.
• The set constructor can be called as:
  – myset = set()
  – mysetb = set(container)   # container can be any container whose elements are immutable
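A brief sketch of the constructor rules above, in Python 3 syntax. The container passed to set() may itself be mutable (a list here); it is the elements that must be immutable. The variable names are made up for illustration.

```python
empty = set()
numbers = set([3, 1, 3, 2, 1])        # duplicates collapse to one occurrence
letters = set('mississippi')          # a string is a container of characters

pairs = set([(1, 2), (3, 4)])         # tuples are immutable, so they are accepted

try:
    set([[1, 2], [3, 4]])             # lists are mutable, so they are rejected
    lists_rejected = False
except TypeError:
    lists_rejected = True
```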


Set Accessor Methods: Standard Container Methods

Syntax        Semantics
len(s)        Returns the cardinality of set s.
v in s        Returns True if v is in set s, else False.
v not in s    Returns True if v is not in set s, else False.
for v in s    Iterates over all values in set s in arbitrary order.


Set Accessor Methods: Comparing Two Sets

Syntax                     Semantics
s == t                     Returns True if s and t have identical elements, else False.
s < t                      Returns True if s is a proper subset of t, else False.
s <= t, s.issubset(t)      Returns True if s is a subset of t, else False.
s > t                      Returns True if t is a proper subset of s, else False.
s >= t, s.issuperset(t)    Returns True if t is a subset of s, else False.


Set Accessor Methods: Creating a Third Set on Existing Sets

Syntax                              Semantics
s | t, s.union(t)                   Returns a new set of the elements that are in either set, with no repeats.
s & t, s.intersection(t)            Returns a new set of the elements that are in both sets s and t.
s - t, s.difference(t)              Returns a new set of the elements that are in set s but not in set t.
s ^ t, s.symmetric_difference(t)    Returns a new set of the elements that are in either set s or t but not both (s XOR t).
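The four operations in the table can be tried on two small sets. This is a minimal sketch in Python 3 syntax with made-up values; it also shows that the operators leave s and t unchanged, and that the method forms accept any iterable while the operator forms require sets.

```python
s = set([1, 2, 3, 4])
t = set([3, 4, 5])

union = s | t               # in either set
common = s & t              # in both sets
only_in_s = s - t           # in s but not in t
in_one_only = s ^ t         # in exactly one of the two

s_unchanged = (s == set([1, 2, 3, 4]))      # a new set was built each time
from_list = s.intersection([3, 9])          # method form accepts a plain list
```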


Set Mutator Methods

Syntax          Semantics
s.add(v)        Adds value v to set s. No effect if v is already in s.
s.discard(v)    Removes v from set s. No effect if v is not in s.
s.remove(v)     Removes v from set s if present; otherwise raises KeyError.
s.pop()         Removes and returns an arbitrary value from set s.
s.clear()       Removes all entries from set s.


Set Mutator Methods (continued)

Syntax                                      Semantics
s |= t, s.update(t)                         Alters set s, making it the union of sets s and t.
s &= t, s.intersection_update(t)            Alters set s, making it the intersection of sets s and t.
s -= t, s.difference_update(t)              Alters set s, making it s - t.
s ^= t, s.symmetric_difference_update(t)    Alters set s, making it the XOR of s and t.
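Unlike the operators that build a third set, the mutators change s in place. The sketch below (Python 3 syntax, made-up values) walks one set through the mutators from both tables and contrasts discard with remove.

```python
s = set([1, 2, 3])

s.add(2)                    # already present: no effect
s.add(4)                    # s is now {1, 2, 3, 4}
s.discard(99)               # absent: silently ignored

try:
    s.remove(99)            # absent: remove raises KeyError
    remove_raised = False
except KeyError:
    remove_raised = True

s |= set([5, 6])            # union:                {1, 2, 3, 4, 5, 6}
s &= set([2, 4, 6, 8])      # intersection:         {2, 4, 6}
s -= set([6])               # difference:           {2, 4}
s ^= set([4, 10])           # symmetric difference: {2, 10}
```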


Set Operation Examples

>>> set([3, 2]) < set([1, 2, 3])
True
>>> set([3, 2]) < set([2, 3])
False
>>> set([7]) < set([1, 2, 3])
False
>>> set([3, 2]) <= set([2, 3])
True


Frozensets

• A frozenset is immutable.
• A frozenset can perform all accessor methods previously listed for a set but none of the mutator methods.
• All elements of a set or frozenset must be immutable. Therefore a set cannot contain sets, but it can contain frozensets.
• Recall that dictionary keys must be immutable; therefore a frozenset can be a key of a dictionary.
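The points above can be checked with a few lines. This is a sketch in Python 3 syntax; the color names and the mixes dictionary are made up for illustration.

```python
primary = frozenset(['red', 'green', 'blue'])
can_mutate = hasattr(primary, 'add')          # frozenset has no mutator methods

try:
    set([set([1]), set([2])])                 # a set of sets is not allowed
    set_of_sets_ok = True
except TypeError:
    set_of_sets_ok = False

# a set of frozensets is fine; equal frozensets count as one element
groups = set([frozenset([1, 2]), frozenset([2, 1]), frozenset([3])])

# a frozenset can serve as a dictionary key
mixes = {frozenset(['red', 'blue']): 'purple'}
looked_up = mixes[frozenset(['blue', 'red'])]  # element order does not matter
```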


Sets and Frozensets

• An operation on two sets results in a set.

• An operation on two frozensets results in a frozenset.

• An operation on a set and a frozenset results in the type of the operand that occurs first in the operation.


Illustrating Set and Frozenset Operations

>>> colors = set(['red', 'green', 'blue'])

>>> stoplight = frozenset(['green', 'yellow', 'red'])

>>> print colors & stoplight

set(['green', 'red'])

>>> print stoplight & colors

frozenset(['green', 'red'])


Lists Versus Arrays

• A list stores a sequence of references to items.

• Advantage: a list can hold items of different types. Such a container is called heterogeneous.

• Disadvantage: slightly slower access, since retrieving an item takes two steps (one to get the reference and one to retrieve the value through the reference).


Lists Versus Arrays (continued)

• In contrast, the array type stores the actual items.

• This is necessarily a homogeneous data structure.

• It has slightly faster access than a list, since only one access is required to retrieve a value.


Illustrating Differences in How Information Is Stored in Arrays and Lists


Using Arrays

• array('i') stores integers.
• array('f') stores floats.
• Other codes are possible.
• array is not one of Python's built-in types, so it must be imported using: from array import array.
• Can also initialize an array at the same time as creating it.

>>> from array import array

>>> yearArray = array('i', [1776, 1789, 1917, 1979])
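A short sketch of working with the array above, in Python 3 syntax. It shows that an array supports the familiar sequence operations and that its homogeneity is enforced: appending a value of the wrong type is rejected. The float values and the string are made up for illustration.

```python
from array import array

yearArray = array('i', [1776, 1789, 1917, 1979])
yearArray.append(2007)              # same sequence interface as a list
first = yearArray[0]
size = len(yearArray)
code = yearArray.typecode           # the type code chosen at creation

try:
    yearArray.append('two thousand')    # not an integer: rejected
    wrong_type_accepted = True
except TypeError:
    wrong_type_accepted = False

floats = array('f', [1.5, 2.25])    # a second array with a different type code
as_list = list(yearArray)           # convert back to an ordinary list
```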


Python's Internal Use Of Dictionaries

• Python uses its dictionary class to keep track of namespaces.

• Namespaces consist of identifiers in a particular scope. Scope is where a variable is known.

• Python has a top-level scope called global.

• The namespace inside of a function is called a local scope.


Python's Namespaces

• globals(): function that returns a dictionary of the identifiers in the global namespace.

• locals(): function that returns a dictionary of the identifiers in the local namespace.

• Inside a function, identifiers are looked up in the local namespace first. If an identifier does not exist in the local namespace, then the global namespace is searched.
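The lookup order just described can be sketched as follows, in Python 3 syntax and assuming the code runs at the top level of a module or interactive session. The names greeting and demo are made up for illustration.

```python
greeting = 'hello'                  # defined at top level: the global namespace

def demo(param):
    x = len(param)                  # x and param belong to demo's local namespace
    local_names = sorted(locals())  # names known locally at this point
    # greeting is not local, so the lookup falls through to the global namespace
    return local_names, greeting

local_names, found = demo('abc')
```

Here local_names lists only param and x, yet the function can still read greeting.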


Code Showing Global and Local Namespaces

    def demo(param):
        x = len(param)
        print 'local dictionary is', locals()
        print 'global dictionary is', globals()

    y = 'hello'
    demo(y)

    # output
    local dictionary is {'x': 5, 'param': 'hello'}
    global dictionary is {(will list built-ins and) 'y': 'hello',
        'demo': <function demo at 0x37d270>}


Global and Local Scopes

• Referring to the previous slide, notice that y does not exist in the local scope, but we could print it out within the function.

• Using variables in the global scope (called global variables) is almost always considered bad programming style.

• Assigning to a name inside a function creates a local variable (unless the name is declared global), so a global variable of the same name keeps its value. This is what happens with immutable values such as numbers and strings, which can only be replaced, not changed.

• Mutating a mutable global object inside a function (for example, appending to a global list) does change it in the global scope, since the function works through a reference to the same object.
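The difference between assigning to a name and mutating the object it refers to can be sketched as follows (Python 3 syntax; count, names, rebind and mutate are made-up names).

```python
count = 0
names = ['Ann']

def rebind():
    count = 99              # assignment creates a new local name called count

def mutate():
    names.append('Bob')     # no assignment: the shared list object is changed

rebind()
mutate()
```

After both calls, count still has its original value while names has grown, because only mutate reached the global object.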


Built-in Module

• Python has a special module __builtins__.
• Some words, such as for, are reserved. These have a predefined meaning and cannot be used as identifiers.
• Other words have a predefined meaning but can be used as identifiers in other situations.
• These other words, such as abs or True, are contained in the __builtins__ module.

    >>> True = 5
    >>> # True can no longer be used in the usual way
    >>> # for the current Python session.

• The above is allowed because the definition 'True = 5' will be found before checking for the definition of 'True' in the __builtins__ module.


Importing Modules

• One form for importing is:

>>> from math import *

• This brings in all identifiers from math. This is said to import the namespace.

• This may be a problem when an identifier in the imported namespace hides an identifier you still want to use.
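One concrete case of such hiding is pow: the math module defines its own pow, which takes the place of the built-in pow after a star import. The sketch below (Python 3 syntax) runs the star import inside a scratch dictionary used as a namespace, so the effect can be inspected without disturbing the surrounding program; scratch is a made-up name.

```python
import builtins
import math

scratch = {}                              # a dictionary acting as a namespace
exec("from math import *", scratch)       # fills it with math's public names

pow_is_maths = scratch['pow'] is math.pow     # the name pow now refers to math.pow

builtin_result = builtins.pow(2, 3)       # built-in pow keeps integers as integers
math_result = scratch['pow'](2, 3)        # math.pow always returns a float
```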


Importing Modules (continued)

• It is better to explicitly import the wanted identifiers, as shown below:

>>> from math import sqrt, pi

• Another way to do this is:

>>> import math

• To use sqrt in this case, qualify it with the module name:

>>> math.sqrt(5)


Object-Oriented Name Resolution

• Except for primitive types such as int and str, the state of an individual object is stored in an instance-level dictionary.

• vars() retrieves identifiers and their values.

>>> myTV = Television()

>>> print vars(myTV)

{'_prevChan': 2, '_volume': 5, '_channel': 2, '_powerOn': False, '_muted': False}


Object-Oriented Name Resolution (continued)

• Notice it does not contain class-level identifiers such as volumeUp().

• These can be accessed using:

>>> vars(Television)

• When searching for an identifier, Python first looks in the instance namespace, then in the class namespace; then, if inheritance is involved, it moves up the inheritance chain.

• This explains why a child class overriding a method in a base class works: the child class namespace is searched before the parent class namespace.
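The search order can be sketched with a cut-down stand-in for the Television class (Python 3 syntax). The class body here, the brand attribute, and the DeluxeTelevision subclass are all assumptions made for illustration, not the book's actual class.

```python
class Television:
    brand = 'Generic'                    # class-level identifier (hypothetical)

    def __init__(self):
        self._volume = 5                 # instance-level identifier

    def volumeUp(self):
        self._volume += 1
        return self._volume


class DeluxeTelevision(Television):      # hypothetical child class
    def volumeUp(self):                  # overrides the parent's method
        self._volume += 2
        return self._volume


tv = DeluxeTelevision()
instance_names = sorted(vars(tv))                       # instance namespace only
method_in_instance = 'volumeUp' in vars(tv)             # methods live in the class
method_in_child = 'volumeUp' in vars(DeluxeTelevision)
brand = tv.brand                                        # found in the parent class
new_volume = tv.volumeUp()                              # child's version is found first
search_order = [cls.__name__ for cls in type(tv).__mro__]
```

The __mro__ attribute lists the classes in the order Python searches them after the instance namespace.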


A Simple Search Engine

• Searching for words in a corpus (multiple documents).

• Similar to generating an index of a book.
• Can use split() to delineate words by whitespace.
• Words may still have extra characters. Want (Text) to reduce to text and "document's" to reduce to document's.
  – Strip off non-alphabetic characters at start.
  – Strip off non-alphabetic characters at end.
  – Make lower case.


Code to Reduce to Actual Word

    def ourStrip(w):
        first = 0                        # find first desirable letter
        while first < len(w) and not w[first].isalpha():
            first += 1

        last = len(w) - 1                # find last desirable letter
        while last > first and not w[last].isalpha():
            last -= 1

        return w[first:last+1].lower()   # lower case letters
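To see the function at work, the sketch below repeats ourStrip so that it is self-contained and applies it to the two examples from the previous slide, to a word with no letters at all, and to a whole line after split(). It is written in Python 3 syntax, and the sample line is made up.

```python
def ourStrip(w):
    first = 0                                   # find first desirable letter
    while first < len(w) and not w[first].isalpha():
        first += 1
    last = len(w) - 1                           # find last desirable letter
    while last > first and not w[last].isalpha():
        last -= 1
    return w[first:last + 1].lower()            # lower case letters

a = ourStrip('(Text)')
b = ourStrip('"document\'s"')       # the inner apostrophe is kept
c = ourStrip('123')                 # no letters at all: empty string

line = 'The (Text) of the "document\'s" index.'
words = [ourStrip(w) for w in line.split()]
```

The empty-string result for a word without letters is why the indexing code checks that the stripped word is non-empty before recording it.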


Code to Create an Index for a Single Document

    class TextIndex:                                  # manage index of words
        def __init__(self, contents, sourceLabel):    # make new index
            self._lines = contents.split('\n')
            self._label = sourceLabel
            self._wordAt = { }      # new dictionary: has line numbers where word occurs
            for linenum in range(len(self._lines)):
                words = self._lines[linenum].split()
                for w in words:                       # step through words
                    w = ourStrip(w)                   # use our strip function
                    if w:                             # w is not empty
                        if w not in self._wordAt:     # word not in dictionary
                            self._wordAt[w] = [ linenum ]       # add to dictionary
                        elif self._wordAt[w][-1] != linenum:    # is linenum there?
                            self._wordAt[w].append(linenum)     # if not, add line


Code to Create an Index for a Single Document (continued)

        def getLabel(self):                  # return label for document
            return self._label

        def getWords(self):                  # return unordered words list
            return self._wordAt.keys()

        def getContext(self, word, maxOccur = 10):   # returns word's occurrences in context
            word = ourStrip(word)
            output = [ ]
            if word in self._wordAt:
                for lineNum in self._wordAt[word][:maxOccur]:   # at most maxOccur occurrences
                    startContext = max(lineNum - 1, 0)
                    stopContext = min(lineNum + 2, len(self._lines))
                    output.append('-' * 40)
                    output.extend(self._lines[startContext:stopContext])
            return '\n'.join(output)


Code for Search Engine Corpus

    class Engine:      # supports word occurrences in a collection of text documents
        def __init__(self):
            self._corpus = { }      # maps each label to its index
            self._hasWord = { }     # maps each word to the set of labels containing it

        def addDocument(self, contents, sourceLabel):
            if sourceLabel not in self._corpus:
                newIndex = TextIndex(contents, sourceLabel)
                self._corpus[sourceLabel] = newIndex
                for word in newIndex.getWords():
                    if word in self._hasWord:
                        self._hasWord[word].add(sourceLabel)
                    else:
                        self._hasWord[word] = set([sourceLabel])


Code for Search Engine Corpus (continued)

        def lookup(self, term):
            """Return set of labels for documents containing search term"""
            term = ourStrip(term)
            if term in self._hasWord:
                return set(self._hasWord[term])
            else:
                return set()

        def getContext(self, term, docLabel, maxOccur = 10):
            # search for word in a document, returning its context
            return self._corpus[docLabel].getContext(term, maxOccur)


Code for Search Engine Corpus (continued)

        def makeReport(self, term, maxDocuments=10, maxContext=3):
            # produce formatted report of occurrences of term in documents
            output = [ ]
            sources = self.lookup(term)
            num = min(len(sources), maxDocuments)
            labels = list(sources)[:num]
            for docLabel in labels:
                output.append('Document: ' + docLabel)
                context = self._corpus[docLabel].getContext(term, maxContext)
                output.append(context)
                output.append('=' * 40)
            return '\n'.join(output)
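The heart of the Engine class is the dictionary that maps each word to the set of document labels containing it. That idea can be sketched on its own, apart from the classes above. The code below is a simplified illustration in Python 3 syntax, not the book's implementation: normalize is a hypothetical stand-in for ourStrip (it trims punctuation only), buildHasWord is a made-up helper, and the two documents are invented.

```python
import string

def normalize(word):
    """Hypothetical stand-in for ourStrip: trim punctuation, make lower case."""
    return word.strip(string.punctuation).lower()

def buildHasWord(documents):
    """Map each word to the set of labels of the documents that contain it."""
    hasWord = {}
    for label, contents in documents.items():
        for raw in contents.split():
            word = normalize(raw)
            if word:                                    # skip empty results
                hasWord.setdefault(word, set()).add(label)
    return hasWord

documents = {'a.txt': 'The quick brown fox.',
             'b.txt': 'The lazy dog; the end.'}
hasWord = buildHasWord(documents)

in_both = hasWord['the']
in_one = hasWord['fox']
in_none = hasWord.get('cat', set())     # same fallback idea as lookup()
```

Using a set for each entry means a label is recorded once however often the word occurs in that document.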


Unit Test for Simple Search Engine

• See book pages 429-430 for the unit test for the simple search engine.