Welcome to the Brixton Library Technology Initiative (Coding for Adults) [email protected] [email protected] January 30th 2016 Week 4 – Collections 1

Collections

• In computer science a collection is a grouping of a variable number of data items (possibly zero) that have some shared significance to the problem being solved and need to be operated upon together in some controlled fashion – Wikipedia

• They formally belong to a group of data types called Abstract Data Types.

Collections - Data Structures

• A data structure is a specialized format for organizing and storing data. General data structure types include the array, the file, the record, the table, the tree, and so on. Any data structure is designed to organize data to suit a specific purpose so that it can be accessed and worked with in appropriate ways. – Wikipedia

• I will, in fact, claim that the difference between a bad programmer and a good one is whether they consider their code or their data structures more important. – Linus Torvalds (my minor change: ‘he’ -> ‘they’; sorry, Linus)

Kinds of collections

• Different kinds of collections are Arrays, Lists, Sets, Trees, Graphs and Maps or Dictionaries.

• Fixed-size arrays are usually not considered a collection because they hold a fixed number of data items, although they commonly play a role in the implementation of collections.

• Variable-size arrays are generally considered collections.

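Python Collection Types

• Built-in types : list, set, dict, tuple

• The collections module adds more :

• deque : double-ended queue with fast appends and pops at both ends

• Counter : dict subclass for counting hashable items

• OrderedDict : dict subclass that remembers the order entries were added

• defaultdict : dict subclass that supplies a default value for missing keys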
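As a rough illustration of the four collections module types listed above, here is a minimal sketch. The variable names and sample data are made up for this example, and print is written with parentheses so that it runs on both Python 2 and Python 3.

```python
from collections import Counter, OrderedDict, defaultdict, deque

# deque: items can be added (and removed) at either end
d = deque([20, 30])
d.appendleft(10)
d.append(40)

# Counter: counts how often each item appears
letters = Counter("banana")

# OrderedDict: keys come back in the order they were added
od = OrderedDict()
od["first"] = 1
od["second"] = 2

# defaultdict: a missing key is created with a default value (here an empty list)
groups = defaultdict(list)
groups["fruit"].append("apple")

print(list(d))           # the deque contents, left to right
print(letters["a"])      # how many times 'a' occurs in "banana"
print(list(od.keys()))   # keys in insertion order
print(groups["fruit"])   # the list that was created on first use
```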

Arrays, Python arrays and Lists

• In most computer languages an array is the simplest form of collection.

• It is a sequence of memory positions that can be used to store elements.

• In Python an array (from the array module) behaves like a List whose elements must all be the same type.

• The Python List has many very cool features that lists in other languages do not have. These might be the reason to write part or the whole of a system in Python.

Lists – indexed access

A sequence of memory positions that can be used to store elements.

# declare a variable myList of type List populated with elements 10 to 50
myList = [10, 20, 30, 40, 50]

# Can access the elements using an index.
print myList[3]
40

# Index position starts at zero.
print myList[0]
10

myList 10 20 30 40 50

index 0 1 2 3 4

Lists – indexed access

# Can use a variable to index elements in the array or list
index = 4
print myList[index]
50

# Access from the end using a negative index
print myList[-1]
50

print myList[-3]
30

myList 10 20 30 40 50

index 0 1 2 3 4

negative index -5 -4 -3 -2 -1

Lists – index out of range

# We get an IndexError when we try to index out of bounds

myList = [10, 20, 30, 40, 50]

print myList[ 20 ]

IndexError: list index out of range
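If a program should keep running when an index might be out of bounds, the IndexError can be caught. The sketch below assumes a hypothetical helper called get_or_default, which is not part of Python or of these slides.

```python
my_list = [10, 20, 30, 40, 50]

def get_or_default(items, index, default=None):
    """Return items[index], or default when the index is out of range."""
    try:
        return items[index]
    except IndexError:
        return default

print(get_or_default(my_list, 3))                      # a valid index
print(get_or_default(my_list, 20, "no such element"))  # out of range, so the default comes back
```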

Lists - slices

# Lists can be accessed using a ‘slice’
myList = [10, 20, 30, 40, 50]

# myList[ start : until ] UNTIL is not included

print myList[1:4]
[20, 30, 40]

# You can omit implied start and until

print myList[:4]
[10, 20, 30, 40]

print myList[2:]
[30, 40, 50]

Lists - slices

# A slice can be defined in steps
myList = [10, 20, 30, 40, 50]

# myList[ start : until : step ]

print myList[1:4:2]
[20, 40]

# With implied values for start and until

print myList[::2]
[10, 30, 50]

print myList[::3]
[10, 40]
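One way to read a slice is as a loop over index positions. The sketch below uses a hypothetical helper, slice_by_hand, to rebuild myList[start:until:step] with range(). It only covers the simple case of a positive step and a start and until that lie inside the list.

```python
my_list = [10, 20, 30, 40, 50]

def slice_by_hand(items, start, until, step=1):
    """Collect items[start], items[start + step], ... stopping before until."""
    return [items[i] for i in range(start, until, step)]

print(slice_by_hand(my_list, 1, 4))      # same positions as my_list[1:4]
print(slice_by_hand(my_list, 1, 4, 2))   # same positions as my_list[1:4:2]
```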

Lists – assignment to slices

myList = [10,20,30,40,50]

# replace some values
myList[2:4] = ['C', 'D', 'E']
print myList
[10, 20, 'C', 'D', 'E', 50]

# now remove them by assigning an empty list to the same positions
myList[2:5] = []
print myList
[10, 20, 50]

# clear the list by replacing all the elements with an empty list
myList[:] = []
print myList
[]

Lists – Strings

# Strings can be indexed and sliced like a list
name = "Felix The House Cat"
print name[2:5]
lix

# Strings are immutable – you cannot change them :
name[2:5] = "CAN NOT DO THIS"

TypeError: 'str' object does not support item assignment
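Because a string cannot be changed in place, the usual workaround is to build a new string from slices of the old one. A minimal sketch (new_name is just an illustrative variable):

```python
name = "Felix The House Cat"

# Glue together: everything before position 2, the replacement, everything from position 5 on
new_name = name[:2] + "LIX" + name[5:]

print(new_name)  # the new string with the middle part replaced
print(name)      # the original string is unchanged
```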

Lists – basic operation summary

Given myList = [10, 20, 30, 40, 50]

Expression                       Result
myList[3]                        40
myList[0]                        10
index = 4; myList[index]         50
myList[-1]                       50
myList[-3]                       30
myList[20]                       IndexError: list index out of range
myList[1:4]                      [20, 30, 40]
myList[:4]                       [10, 20, 30, 40]
myList[1:4:2]                    [20, 40]
myList[::2]                      [10, 30, 50]
myList[::-1]                     [50, 40, 30, 20, 10]
myList[2:4] = ['C', 'D', 'E']    [10, 20, 'C', 'D', 'E', 50]
myList[2:5] = []                 [10, 20, 50]
myList[:] = []                   []

The last three rows are assignments applied one after another; the result column shows myList after each one.

Lists – more operations

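Given myList = [10, 20, 30, 40, 50] (each row starts from this list)

Expression             Result
myList.remove(30)      [10, 20, 40, 50]
myList.index(40)       3
myList.index(99)       ValueError: 99 is not in list
myList.count(30)       1
myList.append(x)       adds x to the end of the list
myList.reverse()       [50, 40, 30, 20, 10]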
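The operations in this table can be tried out in sequence. Note that in the sketch below each call works on the list as the previous call left it, so some results differ from the table, where every row starts from the original list. The variable names are illustrative only.

```python
my_list = [10, 20, 30, 40, 50]

my_list.append(60)               # list is now [10, 20, 30, 40, 50, 60]
my_list.remove(30)               # list is now [10, 20, 40, 50, 60]
position = my_list.index(40)     # 40 has moved to position 2
occurrences = my_list.count(30)  # 30 was removed, so it occurs 0 times
my_list.reverse()                # reverses in place: [60, 50, 40, 20, 10]

# index() raises ValueError when the value is not in the list
try:
    my_list.index(99)
    found = True
except ValueError:
    found = False

print(my_list)
print(position)
print(occurrences)
print(found)
```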

Lists – more expressions

Python Expression         Results
len([1, 2, 3])            3
[1, 2, 3] + [4, 5, 6]     [1, 2, 3, 4, 5, 6]
['Hi!'] * 4               ['Hi!', 'Hi!', 'Hi!', 'Hi!']
3 in [1, 2, 3]            True

Lists

Lists and arrays can be multidimensional. Lists of lists.

myMulti = [ [1,2,3], ['a','b','c'], [100,200,300] ]

myMulti[0][2]
3

myMulti[1][1]
'b'

myMulti[1][1:]
['b', 'c']

Arrays

• An Array is different from a List because all elements in an array must be the same type

• A List can mix types: myList = [10, 20, 'C', 'D', 'E', 30, 40, 50]

Python docs: https://docs.python.org/2/library/array.html

The module defines the following type:

class array.array(typecode[, initializer])

A new array whose items are restricted by typecode, and initialized from the optional initializer value, which must be a list, string, or iterable over elements of the appropriate type.

Arrays - typecode

class array.array(typecode[, initializer])

Type code   C Type           Python Type         Minimum size in bytes
'c'         char             character           1
'b'         signed char      int                 1
'B'         unsigned char    int                 1
'u'         Py_UNICODE       Unicode character   2 (see note)
'h'         signed short     int                 2
'H'         unsigned short   int                 2
'i'         signed int       int                 2
'I'         unsigned int     long                2
'l'         signed long      int                 4
'L'         unsigned long    long                4
'f'         float            float               4
'd'         double           float               8

import array
myFloats = array.array('f', [3.1415, 0.6931, 2.7182])

Arrays – same type

import array
myIntArray = array.array('L', [10, 20, 30, 40, 50])
print myIntArray[1]

array.array('L', [10, 20, 'C', 'D', 'E', 30, 40, 50])
TypeError: an integer is required
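The same-type rule also applies after an array has been created. The sketch below is an illustration rather than part of the original slides: it uses typecode 'l' (signed long) and catches the TypeError instead of printing its message, because the exact wording differs between Python versions.

```python
import array

numbers = array.array('l', [10, 20, 30, 40, 50])
numbers.append(60)           # fine: 60 is an integer

try:
    numbers.append('C')      # not an integer, so the array rejects it
    accepted = True
except TypeError:
    accepted = False

mixed = [10, 20]
mixed.append('C')            # a list accepts any type

print(list(numbers))
print(accepted)
print(mixed)
```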

Why Is Data Structure Choice Important?

Remember what Linus said about the importance of data structures?

... whether they consider their code or their data structures more important ...

Let’s see what he means.

Consider the differences between a List and an Array.

Why Choose List Or Array

• In many languages a List is implemented as a chain of element positions called a linked list. (Python's built-in list is an exception: it is implemented as a resizable array.)

• Adding to the front of a linked list is cheap.

• Adding to the end of a simple linked list is expensive because we have to run along the whole list to find the end and then add the new element.

10 -> 20 -> 30 -> 40

Why Choose List Or Array

• Inserting an element in a linked list is relatively cheap, because only the neighbouring links have to change.

• Lists have the memory overhead of all the pointers.

[Diagram: element A being inserted into the linked list 10 -> 20 -> 30 -> 40]
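To make the cost difference concrete, here is a minimal sketch of a singly linked list. The Node and LinkedList classes are hypothetical teaching code (this is not how Python's own list works). Each add method returns a rough count of the nodes it had to visit, so the cheap and expensive cases can be compared.

```python
class Node(object):
    """One link in the chain: a value plus a pointer to the next node."""
    def __init__(self, value, next_node=None):
        self.value = value
        self.next_node = next_node

class LinkedList(object):
    def __init__(self):
        self.head = None

    def add_front(self, value):
        """Cheap: the new node simply points at the old head."""
        self.head = Node(value, self.head)
        return 1

    def add_end(self, value):
        """Expensive: walk along the whole chain to find the last node."""
        steps = 1
        if self.head is None:
            self.head = Node(value)
            return steps
        node = self.head
        while node.next_node is not None:
            node = node.next_node
            steps += 1
        node.next_node = Node(value)
        return steps

    def to_list(self):
        """Copy the values into an ordinary Python list for printing."""
        values = []
        node = self.head
        while node is not None:
            values.append(node.value)
            node = node.next_node
        return values

chain = LinkedList()
for value in [40, 30, 20, 10]:
    chain.add_front(value)            # builds 10 -> 20 -> 30 -> 40

front_steps = chain.add_front(5)      # always one step
end_steps = chain.add_end(50)         # visits every existing node first

print(chain.to_list())
print(front_steps)
print(end_steps)
```

The longer the chain grows, the more nodes add_end has to visit, while add_front stays at one step.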

Why Choose List Or Array

With arrays we always know the length so adding an element to the end is very cheap.

Depending on how arrays are implemented in your language:

Inserting is very expensive because we have to take copies of the parts and then glue them back together.

Adding to the front of an array is very expensive for the same reason.

Choosing the right data structure is important.
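The "copy the parts and glue them back together" idea can be sketched with Python's array module. The helper insert_by_copying is hypothetical and written only to show the copying; in real code array.insert() does this work for you.

```python
import array

def insert_by_copying(items, position, value):
    """Build a new array: copy of the front part + new value + copy of the back part."""
    before = items[:position]     # copy of everything before the insert point
    after = items[position:]      # copy of everything from the insert point on
    return before + array.array(items.typecode, [value]) + after

numbers = array.array('l', [10, 20, 30, 40])

with_insert = insert_by_copying(numbers, 2, 25)   # insert in the middle
with_front = insert_by_copying(numbers, 0, 5)     # insert at the front

print(list(with_insert))
print(list(with_front))
print(list(numbers))          # the original array is untouched
```

Every element ends up being copied, which is why the cost grows with the length of the array.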

Special list and arrays - Stack, Queue, Deque

• A stack is a last in, first out (LIFO) data structure
– Items are removed from a stack in the reverse order from the way they were inserted

• A queue is a first in, first out (FIFO) data structure
– Items are removed from a queue in the same order as they were inserted

• A deque is a double-ended queue: items can be inserted and removed at either end

Stack

Last In First out

Queue

First In First out

Deque – “deck” – Double ended Queue
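All three structures can be tried with what Python already provides. This is one possible sketch, not the only way to do it: a plain list serves as the stack, and collections.deque serves as both the queue and the deque. The variable names are illustrative.

```python
from collections import deque

# Stack: last in, first out
stack = []
stack.append(10)
stack.append(20)
stack.append(30)
last_in = stack.pop()          # removes the most recently added item

# Queue: first in, first out
queue = deque()
queue.append(10)
queue.append(20)
queue.append(30)
first_in = queue.popleft()     # removes the oldest item

# Deque: insert and remove at either end
both_ends = deque([20, 30])
both_ends.appendleft(10)
both_ends.append(40)
left = both_ends.popleft()
right = both_ends.pop()

print(last_in)
print(first_in)
print(left)
print(right)
print(list(both_ends))
```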