Chapter 5 Strings
CSC1310 Fall 2009


Strings

A string is an ordered collection of characters that stores and represents text-based information.

Strings in Python are immutable (i.e., they cannot be changed in place) sequences (i.e., they have a left-to-right order).


Single and Double Quotes

Single and double quotes are interchangeable.

>>> 'Python', "Python"

Empty literal: '' or "".

Having both quote types allows you to embed a quote character of the other type inside a string:

>>> "knight's", 'knight"s'

Python automatically concatenates adjacent string literals:

>>> "Title" ' of' " the book"
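The three points on this slide can be checked together. This is a minimal sketch that assigns each result to a variable (the variable names are illustrative, not from the slides) and uses only syntax that behaves the same in Python 2 and Python 3.

```python
# Single and double quotes create equal strings.
single = 'Python'
double = "Python"
same = (single == double)          # True

# Embedding the other quote type needs no escaping.
apostrophe = "knight's"
quoted = 'knight"s'

# Adjacent literals are joined automatically, without the + operator.
title = "Title" ' of' " the book"  # 'Title of the book'

print(title)
```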


Escape Sequences

An escape sequence is a special byte code embedded into a string that cannot be easily typed on a keyboard.

A backslash (\) followed by one (or more) character(s) is replaced by a single character in the resulting string.

\n - Newline
\t - Horizontal Tab

>>> s = 'a\nb\tc'  # 5 characters!
>>> s
'a\nb\tc'
>>> print s
a
b       c


Escape Sequences

\\ - Backslash
\' - Single quote
\" - Double quote
\a - Bell
\b - Backspace
\r - Carriage return
\xhh - Hex digits value hh
\0 - Null (binary zero byte)

>>> print 'a\0m\0c'  # 5 characters!
a m c
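To confirm that each escape sequence occupies exactly one character, the lengths and character codes can be inspected directly. A small sketch, with illustrative variable names, that runs unchanged on Python 2 and Python 3:

```python
s = 'a\nb\tc'
length = len(s)                    # 5: 'a', newline, 'b', tab, 'c'
codes = [ord(ch) for ch in s]      # [97, 10, 98, 9, 99]

nul = 'a\0m\0c'
nul_length = len(nul)              # 5: the two null characters count too

hex_char = '\x41'                  # hex value 41 is the letter 'A'
backslash = '\\'                   # two typed characters, one stored character
```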


Raw Strings

>>> print 'C:\temp\new.txt'     # \t and \n are interpreted as tab and newline

>>> print 'C:\\temp\\new.txt'   # doubling each backslash works

A raw string suppresses escape processing. Format: r"text" or r'text' (R"text" or R'text')

>>> print r'C:\temp\new.txt'

Raw strings may be used for directory paths and text pattern matching.
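The difference between the three spellings of the path shows up in their lengths. The sketch below is an illustration with made-up variable names and works the same on Python 2 and Python 3.

```python
plain = 'C:\temp\new.txt'       # contains a real tab and a real newline
escaped = 'C:\\temp\\new.txt'   # each \\ stores one backslash
raw = r'C:\temp\new.txt'        # raw string: backslashes kept literally

plain_length = len(plain)       # 13
raw_length = len(raw)           # 15
same_path = (escaped == raw)    # True
```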


Triple Quoted Strings or Block Strings

A block string is a convenient literal format for coding multiline text data (error messages, HTML or XML code).

Format: """text""" or '''text'''
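As an illustration, a block string keeps the line breaks typed between the triple quotes. The message text here is invented for the example.

```python
# Line breaks inside the triple quotes become newline characters.
message = """Error: file not found.
Check the path and try again."""

lines = message.split('\n')       # one list item per line of text
line_count = len(lines)           # 2
has_newline = '\n' in message     # True
```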


Unicode Strings

Unicode ("wide" character) strings are used to support non-Latin characters that require more than one byte in memory.

Format: u"text" or u'text' (U"text" or U'text')

An expression that mixes Unicode and normal strings has a Unicode string as its result.

>>> 'fall' + u'08'
u'fall08'

>>> str(u'fall08'), unicode('fall08')
('fall08', u'fall08')


Basic operations: len(), +, *, in

len(str) returns the length of a string str.

>>> len('abc')

str1 + str2 (concatenation) creates a new string by joining operands str1 and str2.

>>> 'abc' + 'def', len('abc' + 'def')

str * i (repeat) creates a new string made of i copies of str.

>>> print '-' * 80

str1 in str2 (membership) returns True if str1 is a substring of str2; otherwise, it returns False.

>>> day = 'Monday 8th Sept 2008'
>>> 'sep' in day
>>> 'th Sep' in day
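The results of these operations, worked out for the slide's own examples, are collected below. The variable names are illustrative; the code runs on Python 2 and Python 3 alike.

```python
day = 'Monday 8th Sept 2008'

length = len('abc')               # 3
joined = 'abc' + 'def'            # 'abcdef'
rule = '-' * 80                   # a line of 80 dashes

# Membership is case-sensitive: 'Sept' does not match 'sep'.
lower_hit = 'sep' in day          # False
exact_hit = 'th Sep' in day       # True
```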


Indexing

Each character in the string can be accessed by its position (offset), called its index.

>>> S = 'STRINGINPYTHON'

A negative offset can be viewed as counting backward from the end (offset -x is the x-th character from the end).

>>> S[0], S[10], S[13], S[-5], S[-14]
('S', 'T', 'N', 'Y', 'S')

>>> S[14], S[-15]   # IndexError: valid offsets are 0..13 and -14..-1
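A negative offset -x refers to the same character as offset len(S) - x, and any offset outside the valid range raises IndexError. A short sketch of both facts, with illustrative variable names:

```python
S = 'STRINGINPYTHON'

first = S[0]                          # 'S'
last = S[-1]                          # 'N'
same = (S[-5] == S[len(S) - 5])       # True: both are offset 9, 'Y'

# Offsets outside the string raise IndexError instead of returning a value.
out_of_range = False
try:
    S[14]
except IndexError:
    out_of_range = True
```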


Slicing

Slicing allows us to extract an entire section (substring) in a single step.

str1[offset1:offset2] returns a substring of str1 starting from offset1 (including) and ending at offset2 (excluding).

>>> S[1:3]   # extract items at offsets 1 and 2
>>> S[1:]    # all items past the first
>>> S[:3]    # extract items at offsets 0, 1, 2
>>> S[:-1]   # fetch all but the last item
>>> S[-1:]   # extract last item
>>> S[:]     # a copy of the string
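Applied to the string S = 'STRINGINPYTHON' from the indexing slide (an assumption, since this slide does not redefine S), the slices evaluate as follows. The last line adds one fact not on the slide: unlike indexing, slice offsets past the end are clipped rather than rejected.

```python
S = 'STRINGINPYTHON'

middle = S[1:3]       # 'TR'             offsets 1 and 2
tail = S[1:]          # 'TRINGINPYTHON'  everything past the first
head = S[:3]          # 'STR'            offsets 0, 1, 2
but_last = S[:-1]     # 'STRINGINPYTHO'  all but the last item
last = S[-1:]         # 'N'              the last item, as a string
copy = S[:]           # equal to S

clipped = S[3:100]    # 'INGINPYTHON': the end offset is clipped to len(S)
```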


In Python 2.3

Third index: stride (step)

>>> S = '0123456789'
>>> S[1:10:2]
'13579'

To reverse a string, use step = -1

>>> "hello"[::-1]
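The stride examples evaluate as shown below; the S[::2] line is an extra case added for contrast and is not on the slide.

```python
S = '0123456789'

odds = S[1:10:2]                # '13579': every second item from offset 1
evens = S[::2]                  # '02468': every second item from offset 0
reversed_word = "hello"[::-1]   # 'olleh': a step of -1 walks backward
```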


String Conversion

You cannot add a number and a string together.

int(str1) converts string str1 into an integer.

float(str1) converts string str1 into a floating-point number.

Older techniques: functions string.atoi(str1) and string.atof(str1).

>>> int("42") + 1, float("42") + 1

str(i) converts numeric i to a string (older form: `i`)

>>> "fall0" + str(8)
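A brief sketch of the conversions, including the error raised when a string and a number are mixed without converting first. Variable names are illustrative; the behavior is the same on Python 2 and Python 3.

```python
total_int = int("42") + 1        # 43
total_float = float("42") + 1    # 43.0
label = "fall0" + str(8)         # 'fall08'

# Mixing the two types without a conversion raises TypeError.
mixed_failed = False
try:
    "42" + 1
except TypeError:
    mixed_failed = True
```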


Changing Strings

You cannot change a string in place by assigning a value to an index (S[0] = 'X').

To modify: create a new string with concatenation and slicing.

>>> s = 'spam'
>>> s = s + " again"   # same as s += " again"
>>> s
>>> s = s[:3] + " is here" + s[-1:]
>>> s

Alternatively, format a string.
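Tracing the slide's example step by step shows both the failure of in-place assignment and the new strings that concatenation and slicing build. The flag variable is illustrative.

```python
s = 'spam'

# Item assignment is rejected because strings are immutable.
in_place_failed = False
try:
    s[0] = 'X'
except TypeError:
    in_place_failed = True

s = s + " again"                    # 'spam again' (a new string object)
s = s[:3] + " is here" + s[-1:]     # 'spa' + ' is here' + 'n'
print(s)                            # spa is heren
```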


Formatting Strings

"format string" % "object to insert"

>>> s = "Sales tax"
>>> s = "%s is %d percent!" % (s, 8)

%s - string
%d - decimal integer
%i - integer (%u - unsigned integer)
%o - octal integer
%x - hex integer (%X - uppercase hex integer)
%e - floating-point exponent (%E - uppercase)
%f - floating-point decimal (%F - uppercase)
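A few of the format codes from the list, applied to sample values chosen for illustration (255, 1234.5 and 2.5 are not from the slides):

```python
s = "Sales tax"
s = "%s is %d percent!" % (s, 8)         # 'Sales tax is 8 percent!'

bases = "%o %x %X" % (255, 255, 255)     # '377 ff FF'
exponent = "%e" % 1234.5                 # '1.234500e+03'
decimal = "%f" % 2.5                     # '2.500000' (six decimals by default)
```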


String Methods (p.91 table 5-4)

str1.replace(str2, str3) returns a new string in which each occurrence of substring str2 in str1 is replaced with str3.

>>> 'string in python'.replace('in', 'XXXX')

str1.find(str2) returns the offset where substring str2 first appears in str1, or -1 if it is not found.

>>> where = 'string in python'.find('in')
>>> 'string in python'[:where]

An optional third argument to replace limits the number of replacements:

>>> 'in1in2in3in4in5'.replace('in', 'XX', 3)
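The slide's calls produce the values below. The search for 'java' is an added case that shows the -1 result; the variable names are illustrative.

```python
text = 'string in python'

replaced = text.replace('in', 'XXXX')                 # 'strXXXXg XXXX python'
where = text.find('in')                               # 3: the first match
prefix = text[:where]                                 # 'str'
missing = text.find('java')                           # -1: not found
limited = 'in1in2in3in4in5'.replace('in', 'XX', 3)    # 'XX1XX2XX3in4in5'

# replace() returns a new string; the original is unchanged.
unchanged = (text == 'string in python')              # True
```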


String Methods

str1.upper(), str1.lower(), str1.swapcase()
str1.count(substr, start, end)
str1.endswith(suffix, start, end)
str1.startswith(prefix, start, end)
str1.index(substr, start, end)
str1.isalnum(), str1.isalpha(), str1.isdigit(), str1.islower(), str1.isspace(), str1.isupper()
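The slide lists these methods without examples, so the sketch below tries each group on sample strings invented for illustration. It also shows how index differs from find: index raises ValueError when the substring is absent.

```python
name = 'Fall 2009'

upper = name.upper()                       # 'FALL 2009'
lower = name.lower()                       # 'fall 2009'
swapped = name.swapcase()                  # 'fALL 2009'

count = 'banana'.count('an')               # 2
is_txt = 'new.txt'.endswith('.txt')        # True
is_new = 'new.txt'.startswith('new')       # True

all_digits = '2009'.isdigit()              # True
all_alnum = 'Fall2009'.isalnum()           # True
all_alpha = name.isalpha()                 # False: space and digits present

# index() behaves like find() but raises ValueError instead of returning -1.
index_failed = False
try:
    name.index('Spring')
except ValueError:
    index_failed = True
```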


String Module

maketrans()/translate()

>>> import string
>>> convert = string.maketrans(" _-", "_-+")
>>> text = "It is a two_part - one_part"
>>> text.translate(convert)
'It_is_a_two-part_+_one-part'
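The slide uses the Python 2 function string.maketrans. For readers on Python 3, where that module-level function is not available, the equivalent is the static method str.maketrans. The sketch below is a Python 3 version of the same translation, not the slide's original code.

```python
# Python 3 only: build the table with str.maketrans instead of string.maketrans.
# Each character of the first argument maps to the character at the same
# position in the second: ' ' -> '_', '_' -> '-', '-' -> '+'.
convert = str.maketrans(" _-", "_-+")

text = "It is a two_part - one_part"
result = text.translate(convert)    # 'It_is_a_two-part_+_one-part'
print(result)
```

Each character is translated once, independently of the others, so a space that becomes '_' is not translated again into '-'.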


String Module

Constants

digits - '0123456789'
octdigits - '01234567'
hexdigits - '0123456789abcdefABCDEF'
lowercase - 'abcdefghijklmnopqrstuvwxyz'
uppercase - 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
letters - lowercase + uppercase
whitespace - ' \t\n\r\v\f' (space, tab, newline, carriage return, vertical tab, form feed)

>>> import string
>>> x = raw_input()
>>> x in string.digits
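One caveat with the test x in string.digits: for input longer than one character, in checks whether x is a contiguous substring of '0123456789', not whether every character is a digit. The helper below, all_digits, is a hypothetical function written for this illustration; it checks the input character by character.

```python
import string

def all_digits(text):
    """Return True if text is non-empty and every character is in string.digits."""
    return len(text) > 0 and all(ch in string.digits for ch in text)

single = '7' in string.digits             # True
run = '12' in string.digits               # True: '12' is a substring of the constant
substring_trap = '13' in string.digits    # False, although both are digits

checked = all_digits('13')                # True
rejected = all_digits('1a')               # False
```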