Chapter 5 Strings
CSC1310 Fall 2009
Strings
A string is an ordered collection of characters that stores and represents text-based information.
Strings in Python are immutable (i.e., they cannot be changed in place) sequences (i.e., they have a left-to-right order).
Single and Double Quotes
Single and double quotes are interchangeable.
>>> 'Python', "Python"
Empty literal: '' or "".
Having two quote types lets you embed a quote character of the other type inside a string:
>>> "knight's", 'knight"s'
Python automatically concatenates adjacent string literals:
>>> "Title" ' of' " the book"
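The quoting rules above can be checked with a short sketch. The variable names are illustrative only, and `print` is called with parentheses and a single argument so that the snippet runs under both Python 2 and Python 3.

```python
# Single- and double-quoted literals with the same text compare equal.
single = 'Python'
double = "Python"
same = (single == double)

# Embedding the other quote type avoids the need for a backslash escape.
possessive = "knight's"
quoted = 'knight"s'

# Adjacent literals are joined by the compiler; no + operator is needed.
title = "Title" ' of' " the book"

print(title)
```

Automatic concatenation applies only to literals written next to each other; joining variables still requires `+`.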
Escape Sequences
An escape sequence is a special byte code embedded into a string that cannot easily be typed on a keyboard.
A backslash (\) followed by one or more characters is replaced by a single character in the resulting string.
\n - Newline
\t - Horizontal tab
>>> s = 'a\nb\tc'   # 5 characters!
>>> s
'a\nb\tc'
>>> print s
a
b       c
Escape Sequences
\\ - Backslash
\' - Single quote
\" - Double quote
\a - Bell
\b - Backspace
\r - Carriage return
\xhh - Character with hex value hh
\0 - Null (binary zero byte)
>>> print 'a\0m\0c'   # 5 characters!
a m c
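A minimal sketch of the "one escape sequence, one character" rule, using `len()` and `ord()` to inspect what is actually stored. The variable names are illustrative.

```python
s = 'a\nb\tc'
length = len(s)                  # each escape sequence counts as one character
codes = [ord(ch) for ch in s]    # numeric code of every character

nul = 'a\0m\0c'                  # null bytes are stored like any other character
hex_a = '\x41'                   # \xhh: the character with hex value 41

print(length)
print(codes)
```

The interactive echo of `s` shows the escapes as typed, while `print` interprets them, which is why the two displays on the slide differ.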
Raw Strings
>>> print 'C:\temp\new.txt'     # \t and \n are read as tab and newline
>>> print 'C:\\temp\\new.txt'   # doubled backslashes stay literal
A raw string suppresses escape processing.
Format: r"text" or r'text' (R"text" or R'text')
>>> print r'C:\temp\new.txt'
Raw strings may be used for directory paths and text pattern matching.
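The three spellings of the path can be compared directly. This is a sketch only; the `re.findall` call and the sample text 'fall 2009' are added here to illustrate the pattern-matching use case and are not taken from the slides.

```python
import re

plain = 'C:\temp\new.txt'        # \t and \n become a tab and a newline
escaped = 'C:\\temp\\new.txt'    # each \\ is one literal backslash
raw = r'C:\temp\new.txt'         # raw string: backslashes are kept as typed

# Raw strings keep regular-expression patterns readable.
years = re.findall(r'\d+', 'fall 2009')

print(raw)
print(years)
```

The raw and escaped forms produce the same 15-character string, while the plain form is two characters shorter because each escape collapses to one character.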
Triple Quoted Strings or Block Strings
A block string is a convenient literal format for coding multiline text data (error messages, HTML or XML code).
Format: """text""" or '''text'''
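As a small illustration, a block string keeps the line breaks typed between the triple quotes. The message text below is made up for the example.

```python
# Line breaks inside the triple quotes become \n characters in the string.
message = """Error:
  file not found
  please check the path"""

lines = message.split('\n')

print(message)
```

Leading spaces on the continuation lines are part of the string too, so indentation inside a block string is significant.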
Unicode Strings
Unicode ("wide" character) strings are used to support non-Latin characters that require more than one byte in memory.
Format: u"text" or u'text' (U"text" or U'text')
An expression that mixes Unicode and normal strings has a Unicode string as its result.
>>> 'fall' + u'08'
u'fall08'
>>> str(u'fall08'), unicode('fall08')
('fall08', u'fall08')
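The "more than one byte" point can be made concrete by encoding a string and comparing lengths. This sketch uses the word 'café' and the UTF-8 encoding as assumptions for illustration. The `u` prefix is accepted by Python 2 and by Python 3.3 and later; the `unicode()` function from the slide exists only in Python 2 and is not used here.

```python
word = u'caf\xe9'                 # 'café', with the last character given as \xhh
encoded = word.encode('utf-8')    # byte representation of the same text

char_count = len(word)            # number of characters
byte_count = len(encoded)         # number of bytes after encoding

mixed = 'fall' + u'08'            # mixing a normal and a Unicode literal

print(char_count)
print(byte_count)
```

The accented character occupies two bytes in UTF-8, so the encoded form is one byte longer than the character count.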
Basic operations: len(), +, *, in
The len(str) function returns the length of a string str.
>>> len('abc')
str1 + str2 (concatenation) creates a new string by joining operands str1 and str2.
>>> 'abc' + 'def', len('abc' + 'def')
str * i (repeat) creates a new string that contains str repeated i times.
>>> print '-' * 80
str1 in str2 (membership) returns True if str1 is a substring of str2; otherwise, it returns False.
>>> day = 'Monday 8th Sept 2008'
>>> 'sep' in day
>>> 'th Sep' in day
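The four operations can be tried together in one sketch. The variable names are illustrative, and the repeat count is shortened from 80 to 10 to keep the output small.

```python
day = 'Monday 8th Sept 2008'

length = len('abc')
joined = 'abc' + 'def'
rule = '-' * 10                   # ten dashes

has_lower_sep = 'sep' in day      # membership is case-sensitive
has_th_sep = 'th Sep' in day      # a substring may span a space

print(rule)
print(has_lower_sep)
print(has_th_sep)
```

The first membership test fails only because of letter case: the string contains 'Sep' but not 'sep'.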
Indexing
Each character in the string can be accessed by its position (offset), called its index.
>>> S = 'STRINGINPYTHON'
A negative offset can be viewed as counting backward from the end (offset -x is the xth character from the end).
>>> S[0], S[10], S[13], S[-5], S[-14]
('S', 'T', 'N', 'Y', 'S')
>>> S[14], S[-15]   # IndexError: both offsets are out of range
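A short sketch of positive and negative offsets, including the out-of-range case. The helper `safe_index` is hypothetical and exists only to show that an invalid offset raises `IndexError`.

```python
S = 'STRINGINPYTHON'

first = S[0]
last = S[-1]

# A negative offset -x refers to the same character as len(S) - x.
same_char = S[-5] == S[len(S) - 5]


def safe_index(text, offset):
    """Hypothetical helper: return text[offset], or None if out of range."""
    try:
        return text[offset]
    except IndexError:
        return None


print(first)
print(last)
print(safe_index(S, 14))
```

Valid offsets for a string of length 14 run from -14 to 13, which is why both `S[14]` and `S[-15]` fail.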
Slicing
Slicing allows us to extract an entire section (substring) in a single step.
str1[offset1:offset2] returns a substring of str1 starting at offset1 (inclusive) and ending at offset2 (exclusive).
>>> S[1:3]    # extract items at offsets 1 and 2
>>> S[1:]     # all items past the first
>>> S[:3]     # extract items at offsets 0, 1, 2
>>> S[:-1]    # fetch all but the last item
>>> S[-1:]    # extract the last item
>>> S[:]      # a copy of the string
In Python 2.3 and later
Third index: stride (step)
>>> S = '0123456789'
>>> S[1:10:2]
'13579'
To reverse a string, use step = -1:
>>> "hello"[::-1]
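The slice forms above can be verified in one pass. This is a sketch with illustrative variable names, reusing the two sample strings from the slides.

```python
S = 'STRINGINPYTHON'

middle = S[1:3]            # offsets 1 and 2
head = S[:3]               # offsets 0, 1, 2
all_but_last = S[:-1]
last_item = S[-1:]

digits = '0123456789'
odds = digits[1:10:2]      # every second item, starting at offset 1
evens = digits[::2]        # every second item, starting at offset 0
backwards = 'hello'[::-1]  # a step of -1 walks the string in reverse

print(middle)
print(odds)
print(backwards)
```

Because the end offset is exclusive, `S[:n] + S[n:]` always rebuilds the original string for any `n`.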
String Conversion
You cannot add a number and a string together.
int(str1) converts string str1 into an integer.
float(str1) converts string str1 into a floating-point number.
Older techniques: functions string.atoi(str1) and string.atof(str1).
>>> int("42") + 1, float("42") + 1
str(i) converts numeric i to a string (the backquote form `i` also works).
>>> "fall0" + str(8)
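A minimal sketch of the conversions, including what happens when a string and a number are added without converting. The variable names are illustrative.

```python
total = int("42") + 1        # string to integer, then integer addition
ratio = float("42") + 1      # string to float, then float addition
label = "fall0" + str(8)     # number to string, then concatenation

# Mixing the two types without a conversion raises TypeError.
try:
    "fall0" + 8
    mixed_ok = True
except TypeError:
    mixed_ok = False

print(total)
print(label)
print(mixed_ok)
```

Python does not guess which operation is intended, so the conversion has to be written explicitly in one direction or the other.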
Changing Strings
You cannot change a string in place by assigning a value to an index (S[0] = 'X' raises an error).
To modify: create a new string with concatenation and slicing.
>>> s = 'spam'
>>> s = s + " again"   # or: s += " again"
>>> s
>>> s = s[:3] + " is here" + s[-1:]
>>> s
Alternatively, format a string.
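The following sketch traces the slide's example step by step and shows that index assignment fails. The names `step1`, `step2`, and `patched` are illustrative.

```python
s = 'spam'

# Assigning to an index raises TypeError because strings are immutable.
try:
    s[0] = 'X'
    changed_in_place = True
except TypeError:
    changed_in_place = False

s = s + " again"                     # builds a new string and rebinds s
step1 = s

s = s[:3] + " is here" + s[-1:]      # first three characters + text + last character
step2 = s

patched = 'X' + 'spam'[1:]           # the usual way to "replace" the first character

print(step1)
print(step2)
print(patched)
```

Each assignment binds the name `s` to a freshly built string; the original string objects are never altered.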
Formatting Strings
"format string" % "object to insert"
>>> s = "Sales tax"
>>> s = "%s is %d percent!" % (s, 8)
%s - string
%d - decimal integer
%i - integer (%u - unsigned integer)
%o - octal integer
%x - hex integer (%X - uppercase hex integer)
%e - floating-point exponent (%E - uppercase)
%f - floating-point decimal (%F - uppercase)
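A few of the format codes listed above, applied to sample values chosen for this sketch. The `%.2f` precision form is an addition not shown on the slide.

```python
s = "Sales tax"
sentence = "%s is %d percent!" % (s, 8)

octal = "%o" % 8             # base 8
hex_lower = "%x" % 255       # base 16, lowercase digits
hex_upper = "%X" % 255       # base 16, uppercase digits
exponent = "%e" % 1234.5     # scientific notation, six decimals by default
fixed = "%.2f" % 3.14159     # fixed-point, rounded to two decimals

print(sentence)
print(exponent)
```

When more than one value is inserted, the values on the right of `%` must be grouped in a tuple, as in the first line.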
String Methods (p. 91, table 5-4)
str1.replace(str2, str3) returns a new string in which each occurrence of substring str2 in str1 is replaced with str3.
>>> 'string in python'.replace('in', 'XXXX')
str1.find(str2) returns the offset where substring str2 first appears in str1, or -1 if it is not found.
>>> where = 'string in python'.find('in')
>>> 'string in python'[:where]
>>> 'in1in2in3in4in5'.replace('in', 'XX', 3)
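The slide's `replace` and `find` calls, evaluated in a sketch with illustrative variable names. The search for 'xyz' is added to show the not-found case.

```python
text = 'string in python'

replaced = text.replace('in', 'XXXX')    # every occurrence, including inside 'string'
where = text.find('in')                  # offset of the first occurrence
prefix = text[:where]                    # everything before that offset
missing = text.find('xyz')               # -1 signals "not found"

# The optional third argument limits the number of replacements.
limited = 'in1in2in3in4in5'.replace('in', 'XX', 3)

print(replaced)
print(prefix)
print(limited)
```

Note that `replace` matches substrings, not whole words, so the 'in' inside 'string' is replaced as well.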
String Methods
str1.upper(), str1.lower(), str1.swapcase()
str1.count(substr, start, end)
str1.endswith(suffix, start, end)
str1.startswith(prefix, start, end)
str1.index(substr, start, end)
str1.isalnum(), str1.isalpha(), str1.isdigit(), str1.islower(), str1.isspace(), str1.isupper()
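A brief tour of the listed methods on a sample string chosen for this sketch. The main behavioural difference worth noting is that `index` raises `ValueError` where `find` would return -1.

```python
s = 'Spam and Eggs'

upper = s.upper()
swapped = s.swapcase()

a_total = s.count('a')               # occurrences in the whole string
a_in_first_word = s.count('a', 0, 4) # occurrences within offsets 0..3

starts = s.startswith('Spam')
starts_at_5 = s.startswith('and', 5) # test begins at offset 5
ends = s.endswith('Eggs')

# index() behaves like find() but raises ValueError when nothing matches.
try:
    s.index('xyz')
    found = True
except ValueError:
    found = False

checks = ('fall2009'.isalnum(), 'fall'.isalpha(), '2009'.isdigit(),
          'fall'.islower(), ' \t'.isspace(), 'FALL'.isupper())

print(upper)
print(swapped)
print(checks)
```

The `start` and `end` arguments follow slicing conventions: `start` is inclusive and `end` is exclusive.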
String Module
maketrans()/translate()
>>> import string
>>> convert = string.maketrans(" _-", "_-+")
>>> input = "It is a two_part - one_part"
>>> input.translate(convert)
'It_is_a_two-part_+_one-part'
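The same translation can be written so that it runs on both Python 2 and Python 3. This is a sketch under one assumption worth stating: in Python 3 the table builder is `str.maketrans`, while `string.maketrans` (used on the slide) exists only in Python 2. The variable is named `text` rather than `input` to avoid shadowing the built-in function.

```python
try:
    maketrans = str.maketrans        # Python 3
except AttributeError:
    import string
    maketrans = string.maketrans     # Python 2, as on the slide

# Map each character of the first argument to the character at the
# same position in the second: ' ' -> '_', '_' -> '-', '-' -> '+'.
convert = maketrans(" _-", "_-+")

text = "It is a two_part - one_part"
result = text.translate(convert)

print(result)
```

All substitutions happen in a single pass, so a space that becomes '_' is not then turned into '-'.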
String Module
Constants
digits - '0123456789'
octdigits - '01234567'
hexdigits - '0123456789abcdefABCDEF'
lowercase - 'abcdefghijklmnopqrstuvwxyz'
uppercase - 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
letters - lowercase + uppercase
whitespace - space, '\t', '\n', '\r', '\v', '\f'
>>> import string
>>> x = raw_input()
>>> x in string.digits
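A sketch of the constants in use. Two assumptions are made here: the names `ascii_lowercase`, `ascii_uppercase`, and `ascii_letters` are used because they exist in both Python 2 and Python 3 (the slide's `lowercase`, `uppercase`, and `letters` are Python 2 only), and the helper `is_digit_char` is hypothetical. A fixed value replaces `raw_input()` so the snippet runs without user input.

```python
import string

letters_ok = string.ascii_letters == string.ascii_lowercase + string.ascii_uppercase

# 'in' is a substring test, so it accepts more than single digits.
run_of_digits = '12' in string.digits     # '12' is a substring of '0123456789'
not_a_run = '13' in string.digits         # '13' is not
empty_case = '' in string.digits          # the empty string is in every string


def is_digit_char(x):
    """Hypothetical helper: True only for exactly one digit character."""
    return len(x) == 1 and x in string.digits


print(run_of_digits)
print(not_a_run)
print(is_digit_char('7'))
```

For validating a whole number typed by a user, `x.isdigit()` is usually a closer fit than a membership test against `string.digits`.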