Introduction to computational thinking
Module 8 : Strings and Characters
Asst Prof Michael Lees
Office: N4-02c-76
email: mhlees[at]ntu.edu.sg
Contents
1. What is a string (defining)
2. Accessing strings
3. String operations
4. Functions and Methods for Strings
5. Formatting Strings
Chapter 4
What is a string?
• A string is a sequence of characters
• Notionally, a character is a letter or a symbol
• But we'll find out about characters later.
Defining a string
• A string is indicated using quotes:
– ' ' or " "
• The sequence of characters is important, and is maintained.
Defining a string (more)
• Then there is """ """ (triple quotes)
• Preserves vertical and horizontal formatting
• E.g.,
""" This is
a vertically
and horizontally formatted string. """
• Compare with:
"This is not vertically ordered"
More definitions
• Using single or double quotes is fine:
– S = "a string"
– S = 'a string'
• Don't mix the two on one string!
– S = "a string' WRONG!
• How to put in an apostrophe?
– S = "Mike's string" #Use different quotes
– S = 'Mike\'s string' #Use an escape character
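The quoting rules above can be tried out in a few lines. This is a minimal sketch assuming Python 3; the variable names are illustrative only.

```python
# Single and double quotes produce the same string.
s1 = "a string"
s2 = 'a string'
same = (s1 == s2)

# Two ways to include an apostrophe.
a1 = "Mike's string"      # use different quotes on the outside
a2 = 'Mike\'s string'     # escape the apostrophe with a backslash

# Triple quotes keep the line breaks that were typed.
block = """This is
a vertically
formatted string."""
line_count = len(block.split("\n"))

print(same, a1 == a2, line_count)
```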
Index
• Elements (or characters) in a string are in a sequence, so we can identify each element with an index (a position in the sequence)
• Can start index from either end of sequence.
– Positive values indicate counting from left, starting at 0
– Negative values indicate counting from right, starting with -1
Hello World
• String = “Hello World”
• Index 8 and -3 point to the same location: r
Accessing one element
• We can use [ ] to access particular characters in a string:
myStr = "Hello World"
x = myStr[1]
print(x) #will print e
print(myStr) => ?
print(myStr[-2]) => ?
print(myStr[11]) => ?
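To experiment with indexing from both ends, something like the following can be run. It is a small sketch assuming Python 3; it also shows what happens when an index falls outside the string.

```python
my_str = "Hello World"

first = my_str[0]        # 'H': positive indices count from the left, starting at 0
last = my_str[-1]        # 'd': negative indices count from the right, starting at -1
same_char = my_str[8] == my_str[-3]   # both refer to 'r'

# The valid positive indices run from 0 to len(my_str) - 1.
try:
    my_str[len(my_str)]
    out_of_range = False
except IndexError:
    out_of_range = True

print(first, last, same_char, out_of_range)
```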
Slicing: parts of a String
• You can also select a part of the string, a slice of characters
• Similar syntax, [ start : finish ]
– start: index of start of subsequence
– finish: index of the position after the one where we want the subsequence to end (i.e., the sequence will not contain character myStr[finish])
• If start or finish is not specified, it defaults to the beginning or end of the string respectively.
Slicing: parts of a String
• Slicing uses half-open ranges, common in Python. (Different from fully closed ranges)
• The first index is included, the last index is the one after what is included.
myStr = "Hello World"
myStr[1:4] => ?
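One way to see why half-open ranges are convenient is the sketch below (Python 3 assumed, with a hypothetical example word rather than the string from the slide).

```python
word = "Python"

middle = word[1:4]            # indices 1, 2, 3 -> 'yth'; index 4 is excluded
length_rule = len(word[1:4]) == 4 - 1   # slice length is finish - start

# Half-open ranges split cleanly: the two parts never overlap or leave a gap.
k = 2
rejoined = word[:k] + word[k:]

print(middle, length_rule, rejoined == word)
```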
Slicing: Examples
myStr = "Hello World"
print(myStr[1:6])    # 'ello ' (note the trailing space)
print(myStr[1:2])    # 'e'
print(myStr[-7:-1])  # 'o Worl'
print(myStr[-3:-5])  # '' (empty)
print(myStr[:6])     # 'Hello ' (note the trailing space)
print(myStr[5:])     # ' World' (note the leading space)
One more example
myStr[3:-2]
Slicing and skipping
• We can specify a third argument:
– [ start : finish : countBy ]
• countBy specifies a 'skip'
• Defaults:
– start = beginning
– finish = end
– countBy = 1
myStr = "Hello World"
myStr[1:11:2] => 'el ol'
Slicing and skipping
myStr[::2] => 'HloWrd'
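The skip can be checked against an explicit loop over indices. The sketch below assumes Python 3 and rebuilds one of the slices by hand with range(), which takes the same start, finish and step values.

```python
my_str = "Hello World"

every_second = my_str[::2]        # indices 0, 2, 4, 6, 8, 10
odd_positions = my_str[1:11:2]    # indices 1, 3, 5, 7, 9

# The same result, built by hand with range(start, finish, step).
by_hand = "".join(my_str[i] for i in range(1, 11, 2))

print(repr(every_second), repr(odd_positions), by_hand == odd_positions)
```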
Python Idioms
• Idiom: "A form of expression natural to a language, person, or group of people"
• These are things particularly unusual/unique to Python
• Copying a string:
aString = "String to copy"
newStr = aString[:]
newStr = ''.join(aString)
• To reverse a string:
aString = "Madam I'm Adam"
revString = aString[::-1] => ?
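Both idioms can be verified quickly. This is a minimal sketch assuming Python 3; the word used for the reversal is a hypothetical example, chosen so that the exercise above stays open.

```python
original = "String to copy"

copy_by_slice = original[:]          # full slice
copy_by_join = "".join(original)     # join the characters back together

word = "stressed"                    # hypothetical example string
reversed_word = word[::-1]           # a step of -1 walks the string backwards

print(copy_by_slice == original, copy_by_join == original, reversed_word)
```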
STRING OPERATIONS
Basic operations
opStr = "Basic"
• Length of a string: len()
len(opStr) => 5
• Concatenate strings: +
opStr + " operations" => "Basic operations"
• Repeat string: *
opStr * 3 => "BasicBasicBasic"
More operations
• Operations + and * make new strings; the input arguments are not changed
• Order of the operands: important for +, irrelevant for *
• Arguments to operators must be of certain types:
– + : two strings
– * : string and integer
• Notice the meaning of + depends on argument type (works with integers also)
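The points above about + and * can be checked directly. A short sketch, assuming Python 3, with illustrative variable names:

```python
base = "Basic"

joined = base + " operations"     # a new string; base itself is unchanged
repeated = base * 3               # 'BasicBasicBasic'

# Order matters for + but not for *.
order_matters = ("ab" + "cd") != ("cd" + "ab")
order_irrelevant = (base * 3) == (3 * base)

# The same + symbol adds integers instead of joining them.
int_sum = 2 + 3

print(joined, repeated, base, order_matters, order_irrelevant, int_sum)
```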
The type function
• Easy to check type of variable (output shown for Python 3; Python 2 prints <type 'str'>):
myStr = 'hello world'
type(myStr) => <class 'str'>
myStr2 = 245
type(myStr2) => <class 'int'>
myStr + myStr2 => ?
myStr * myStr2 => ?
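When the argument types do not fit an operator, Python raises a TypeError. The sketch below (Python 3 assumed, with a small hypothetical integer so the output stays short) shows the error being caught and one common fix, converting the integer with str().

```python
text = "hello world"
number = 3          # small hypothetical value so the repeated string stays short

print(type(text).__name__, type(number).__name__)

# + needs two strings, so mixing str and int raises TypeError.
try:
    text + number
    plus_failed = False
except TypeError:
    plus_failed = True

# Converting the integer first makes + legal.
fixed = text + str(number)

# * accepts a string and an integer and repeats the string.
repeated = text * number

print(plus_failed, fixed, len(repeated))
```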
Back to Characters
• There are two common systems for representing characters in machines now (to store a character in memory it must be binary, so we give it a code)
• ASCII (older) and Unicode (more modern, more characters)
• Tables indicate mapping of characters to ASCII or Unicode
ASCII vs Unicode
• ASCII is a 7-bit code, so it defines only 128 different characters; 8-bit extensions of ASCII can manage 256.
• Unicode is an extension of ASCII intended to include all characters of all languages.
• The Unicode code space is divided into 17 planes. Each plane contains 65,536 code points (16-bit) and consists of several charts.
• Total of 1,114,112 possible code points, of which about 96,000 were in use at the time of writing; later Unicode versions assign more.
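The total follows from simple arithmetic, and it can be cross-checked against the interpreter. This sketch assumes Python 3.3 or later, where sys.maxunicode is the largest code point that chr() accepts.

```python
import sys

planes = 17
points_per_plane = 2 ** 16          # 65,536 code points per plane

total = planes * points_per_plane   # 1,114,112

# Code points are numbered from 0, so the largest one is total - 1.
matches_interpreter = (sys.maxunicode + 1 == total)

print(total, matches_interpreter)
```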
Getting the code
• ord() takes a string of length 1 as input and returns the Unicode code of the character (for standard symbols this is the same as the ASCII code)
• chr() takes a code and returns a string of length 1 containing the corresponding character
ord('a') => 97
chr(97) => 'a'
Encrypt (see later):
code = ord('a')
chr(code+1) => 'b'
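A short sketch of ord() and chr() working as a pair, assuming Python 3 (where chr() accepts any Unicode code point, not only ASCII codes):

```python
code = ord("a")            # 97
next_char = chr(code + 1)  # 'b'

# ord() and chr() are inverses of each other.
round_trip = chr(ord("Z")) == "Z"

# Characters outside ASCII simply have larger codes.
euro_code = ord("\u20ac")  # the euro sign, U+20AC

print(code, next_char, round_trip, euro_code)
```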
Comparing characters
• When comparing single characters, the code is used to compare.
'a' == 'a' => True
'a' < 'b' => True
'A' < 'B' => True
'1' < '9' => True
'a' < 'B' => False!
Comparing strings
• Compare the first elements of the two strings; if they are equal, move on to the next pair (left to right)
• At the first pair that is not equal, the relationship between the strings is the same as between those two characters
• If one string is shorter (but equal up to that point), the shorter string is smaller.
Examples
• 'aa' < 'ab'
– First element is same, move to second. a < b so 'aa' < 'ab' => True
• 'aaab' > 'aaac'
– First three elements same, b < c so 'aaab' > 'aaac' => False.
• 'aa' < 'aaz'
– The first string is the same but shorter. Thus it is "smaller" => True.
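The comparison rules can be confirmed in the interpreter. The sketch below assumes Python 3; the last line shows one common workaround (not from the slides) for ignoring case, which is to compare lowercased copies.

```python
# Characters compare by their codes: uppercase letters come before lowercase.
print(ord("B"), ord("a"))
lower_vs_upper = "a" < "B"         # False, because the code of 'a' is larger

# Strings compare left to right, character by character.
first_difference = "aa" < "ab"     # True: decided at the second character
prefix_is_smaller = "aa" < "aaz"   # True: equal so far, but the first is shorter

# Workaround for case-insensitive ordering: compare lowercased copies.
ignore_case = "a".lower() < "B".lower()

print(lower_vs_upper, first_difference, prefix_is_smaller, ignore_case)
```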
Membership operations
• Does one string contain another?
– a in b: returns True if string b contains string a
myStr = "abcdefg"
'c' in myStr => True
'cde' in myStr => True
'cef' in myStr => False
myStr in myStr => True
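A runnable version of these checks, assuming Python 3. It also shows not in, the negated form of the same operator.

```python
my_str = "abcdefg"

has_c = "c" in my_str          # True
has_cde = "cde" in my_str      # True: the characters appear together, in order
has_cef = "cef" in my_str      # False: c, e, f are present but not contiguous
missing = "z" not in my_str    # 'not in' is the negated form

print(has_c, has_cde, has_cef, missing)
```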
Strings are Immutable
• Strings are immutable, that is, you cannot change one once you make it:
– aStr = 'spam'
– aStr[1] = 'l' ERROR
• However, you can use it to make another string (copy it, slice it, etc.).
– newStr = aStr[:1] + 'l' + aStr[2:]
– aStr => 'spam'
– newStr => 'slam'
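The error and the slicing workaround can be reproduced as follows. A minimal sketch, assuming Python 3, where the failed assignment raises a TypeError:

```python
a_str = "spam"

# Item assignment is not supported on strings.
try:
    a_str[1] = "l"
    assignment_failed = False
except TypeError:
    assignment_failed = True

# Build a new string from slices of the old one instead.
new_str = a_str[:1] + "l" + a_str[2:]

print(assignment_failed, a_str, new_str)
```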
FUNCTIONS & METHODS FOR STRINGS
Functions reminder
• A function is a piece of code that performs some operation. Details are hidden (encapsulated), only its interface is exposed.
• Way to arrange a program to make it easier to understand.
• Functions have arguments as input and may return some output.
A string function: len()
• We have already seen one string function:
– len('test string') => 11 (not 10)
• Input: a string
• Output: an integer indicating length of string.
String method
• A method is a variation on a function
– like a function, it represents a program
– like a function, it has input arguments and an output
• Unlike a function, it is applied in the context of a particular object.
• This is indicated by the 'dot notation' invocation
• Each string is itself an object.
Example
• upper() is a string method.
• It will output a new string which is the same as the string on which it was called, except all letters will now be upper case.
myStr = "shouting!"
myStr.upper() => "SHOUTING!"
• object: myStr
• method: upper()
• method call: myStr.upper()
Methods in general
• We'll touch on this more later.
• But in general the form is:
– object.method()
• We say object is calling the method method.
• Different objects have different methods available, defined by the type (class).
• How to find out all methods available on strings? Use a reference! : Python online…
– http://docs.python.org/lib/string-methods.html
• An Integrated Development Environment (IDE) such as IDLE will help.
Find
myStr = "Find in a string"
myStr.find('d') => 3
• find is another string method, called using the same object.method() notation.
• It takes a string as input (here a single character) and returns the index of the position where it is first seen (from left to right); it returns -1 if it is not found.
• 'd' is called an argument of the method.
Chaining methods together
Methods can be chained together.
• Perform first operation, yielding an object
• Use the resulting object for the next method
myStr = 'Python Rules!'
myStr.upper() => 'PYTHON RULES!'
myStr.upper().find('H') => 3
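The chain can be unpacked step by step. This sketch assumes Python 3 and also shows that find() is case sensitive, which is why upper() is applied first in the chained call.

```python
my_str = "Python Rules!"

shouted = my_str.upper()               # a new string: 'PYTHON RULES!'
index_of_h = my_str.upper().find("H")  # find() runs on the result of upper()

# find() is case sensitive and returns -1 when nothing matches.
not_found = my_str.find("H")           # the original contains 'h', not 'H'

print(shouted, index_of_h, not_found, my_str)
```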
CHALLENGE 8.1 – SECRET MESSAGE
Create a simple program to do simple encrypting and decrypting of a string. Nothing too complicated: just so a human cannot read the encrypted message.
Thought process
• Character replacement.
• Replace each character in the string with a different one
• Need to be able to translate back
• A lookup table?
• Or simple way to shift the ASCII code?
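One possible sketch of the code-shifting idea, assuming Python 3. The function names and the SHIFT value are hypothetical, and this is only one of many ways to approach the challenge; it does not wrap around at the end of the character table.

```python
SHIFT = 1   # hypothetical key; any small integer works for this sketch

def encrypt(message, shift=SHIFT):
    # Replace each character with the one `shift` codes later.
    return "".join(chr(ord(ch) + shift) for ch in message)

def decrypt(secret, shift=SHIFT):
    # Undo the shift to translate back.
    return "".join(chr(ord(ch) - shift) for ch in secret)

secret = encrypt("Hello World")
print(secret)
print(decrypt(secret))
```

Because decrypt() subtracts exactly what encrypt() added, the round trip always returns the original message.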
CHALLENGE 8.2 – REMOVE WHITE SPACE
Remove all white space characters from a string
Thought process
• Similar solution? Iterate and re-build string without space?
• Think about the task, is it common?
• If it seems common, there is probably a built-in method!
• Look in the Python reference (online or book)
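Both routes from the thought process can be sketched side by side. This assumes Python 3 and a hypothetical input string; the built-in route uses split() with no argument, which splits on any run of whitespace, followed by join().

```python
text = "  remove \t all\nwhite space  "   # hypothetical input

# Approach 1: iterate and rebuild the string, skipping whitespace characters.
rebuilt = ""
for ch in text:
    if not ch.isspace():
        rebuilt += ch

# Approach 2: built-in methods. split() breaks the text at whitespace,
# and join() glues the pieces back together with nothing in between.
joined = "".join(text.split())

print(rebuilt, rebuilt == joined)
```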
CHALLENGE 8.3 – MESSING WITH VOWELS
Make all vowels in the second half of a string capitalized
Thought process
• Common? No
• What do we have available to us?
• A method…
• Iterate through the string, check if it's in the second half of the string (using len()) and re-build, changing case where necessary
• Alternatively, split the string in half first.
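A sketch of the "split the string in half first" alternative, assuming Python 3. The function name is hypothetical, and the sketch assumes that for odd lengths the middle character belongs to the second half.

```python
def capitalize_late_vowels(text):
    # Split the string in half, then upper-case the vowels of the second half.
    half = len(text) // 2
    first, second = text[:half], text[half:]
    changed = ""
    for ch in second:
        changed += ch.upper() if ch in "aeiou" else ch
    return first + changed

print(capitalize_late_vowels("computational"))
```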
FORMATTING STRINGS
Pretty printing
• The default print function is fine.
• More is possible
• format allows us to perform low-level type-setting (prettier strings)
Basic form
• Look at an example:
n = 500
description = "a lot!"
print("In this class we have {} students, which is {}".format(n, description))
• This will show:
In this class we have 500 students, which is a lot!
Basic form
• format creates a new string!
"In this class we have {} students, which is {}".format(n, description)
• This outputs a new string where each {} is replaced by the data passed to the format method (in this case n and description)
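The substitution can be seen without print getting in the way by storing the result. The sketch assumes Python 3; the numbered and named placeholders at the end go slightly beyond the slide and are included only to show that the matching order can be made explicit.

```python
n = 500
description = "a lot!"

template = "In this class we have {} students, which is {}"
message = template.format(n, description)   # a new string; template is unchanged

# Positions can also be numbered or named explicitly.
numbered = "{1} comes before {0}".format("second", "first")
named = "{count} students".format(count=n)

print(message)
print(numbered)
print(named)
```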
Example
print("In this class we have {} students, which is {}".format(n, description))
[Diagram labels: the string in quotes; the format call]
• Objects are matched in order with format descriptors. The substitution is made and the resulting string created (then printed)
In this class we have 500 students, which is a lot!
Format String
• We can put ‘stuff’ inside the {} which specifies how the data should be printed (i.e., make it pretty)
• Overall:
{:[align][min_width][.precision][descriptor]}
[ ] means optional
Descriptors (codes)
• s string
• d decimal
• e floating point exponent
• f floating point decimal
• u unsigned integer (accepted only by the older % formatting, not by format; use d instead)
width and alignment
• < left
• > right
• ^ center
• Integers indicate width
print('{:>10s} is {:<10d} years old.'.format('Bill', 25))
• 'Bill' is printed right-aligned as a string, in a field at least 10 characters wide.
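Putting vertical bars around each field makes the padding visible. A small sketch assuming Python 3, with hypothetical values:

```python
row = "|{:<8s}|{:^8s}|{:>8d}|".format("left", "mid", 25)
print(row)          # |left    |  mid   |      25|

# The width is a minimum: longer values are not cut off.
long_value = "{:>3s}".format("Python")
print(long_value)   # Python
```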
Example of space
print("MIKE, OFFICE: {:3d} TEL: {:<10d} ".format(76, 67906277))
MIKE, OFFICE:  76 TEL: 67906277
• {:3d} : 3 spaces wide, right justified (by default), so 76 gets one leading space
• {:<10d} : 10 spaces wide, left justified, so the padding follows the number
Precision
• print(math.pi)
3.141592653589793 #default in Python 3 (Python 2 printed 3.14159265359)
• print('{:.4f}'.format(math.pi))
3.1416 (4 decimal places of precision, with rounding)
• print('{:10.2f}'.format(math.pi))
'      3.14' (a total field width of 10 characters, which includes the decimal point and the 2 decimal places)
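Using repr() shows the padding that print would otherwise hide. This sketch assumes Python 3; the e descriptor from the earlier list is included for comparison.

```python
import math

four_places = "{:.4f}".format(math.pi)     # 4 decimal places, rounded
padded = "{:10.2f}".format(math.pi)        # total width 10, 2 decimal places
scientific = "{:.3e}".format(math.pi)      # exponent notation, 3 decimal places

print(repr(four_places), repr(padded), len(padded), repr(scientific))
```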
CHALLENGE 8.4 – ASCII table
Create a formatted table of all the ASCII characters and codes. The code should be the first item in the table (right aligned) and the character the second item (left aligned). Only do printable characters, 33 to 126 (127 is the non-printing DEL character).
Thought process
• Iterate through the characters
• Iterate how? Use codes from 33 to 126.
• Use chr() and ord() functions to display
• For each character we should print a formatted string…
A Solution
for x in range(33, 127):    # codes 33 to 126; range excludes the end value
    print('|{:3d}|{:<3s}|'.format(x, chr(x)))
Take home lessons
• Understand various operations on strings, remember the basic ones.
• Understand slicing and indexing
• ASCII and Unicode: what they are and how to use them
• Method vs. function
• The use of triple quotes (formatting)
• Strings are useful… especially for files.
Further reading/exercises
• http://code.google.com/edu/languages/google-python-class/strings.html
• http://diveintopython3.org/strings.html
• http://docs.python.org/tutorial/introduction.html#strings
• (I always assume you will read the book)