Strings The Basics

Strings The Basics. Strings can refer to a string variable as one variable or as many different components (characters) string values are delimited by


Strings

• can refer to a string variable as one variable or as many different components (characters)
• string values are delimited by either single quotes or double quotes
• operators + and * (overloaded operators)
• + “concatenation” sticks two strings together to get a new string
• * “replication” repeats a string a given number of times to give a new string
• precedence is * above +
• order of concatenation matters! "a" + "b" is not the same as "b" + "a"
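
A small sketch of the two operators, using made-up example values (the variable names are illustrative only):

```python
first = "ab"
second = "cd"

joined = first + second        # concatenation builds a new string
repeated = first * 3           # replication repeats the string 3 times
mixed = first + second * 2     # * binds tighter than +: first + (second * 2)

print(joined)                              # abcd
print(repeated)                            # ababab
print(mixed)                               # abcdcd
print(first + second == second + first)    # False: order matters
```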

Indexing

• Also called subscripts
• A way to tell individual characters of a string apart
• notation like that used for lists
• [ ] with an integer (constant or variable) inside
• Strings are numbered from 0 on left end and increasing
• Strings are also numbered from -1 on right end and decreasing
• You can use an expression as an index also, like str[k + 1], as long as the expression has an integer value
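
As a quick illustration of both numbering schemes, here is a sketch with a hypothetical string named word (chosen here instead of str, which would hide Python's built-in str):

```python
word = "python"   # hypothetical example value
k = 1

print(word[0])            # p  (leftmost character)
print(word[-1])           # n  (rightmost character)
print(word[k + 1])        # t  (the expression evaluates to index 2)
print(word[-len(word)])   # p  (the most negative valid index)
```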

The len function

• Note this is a function, not a method (do not call it with the dot notation)
• len operates on one argument, either a string or a list
• returns an integer result, tells how many characters are in the string or how many elements are in the list
• lowest return value is zero for the empty string
• Python claims “no upper limit on string length”; in practice it depends on your environment: how much RAM you have, which OS you are running, and whether Python is a 32-bit or 64-bit build. On a 32-bit build the cap is around 2 billion characters; on a 64-bit build available memory is the practical limit
• s[len(s)] would give an error! Remember that len tells you how many characters, not what the last subscript in the string would be (the last subscript is len(s) - 1)
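
The off-by-one point can be seen in a short sketch (the string value is just an example):

```python
s = "hello"

print(len(s))          # 5
print(len(""))         # 0, the smallest possible result
print(s[len(s) - 1])   # o, the last valid subscript is len(s) - 1

try:
    s[len(s)]          # one past the end
except IndexError:
    print("s[len(s)] is out of range")
```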

chr and ord functions

• Sometimes you need to work with the ASCII codes of individual characters (in Python 3 these are Unicode code points; for ASCII characters the two agree)
• chr(integer) will return the character corresponding to the ASCII code argument, example chr(65) will return "A"
• ord(char) will return the ASCII code (as an integer) of the single character that you send, example ord("A") will return 65
• In general, use the characters instead of their ASCII values in coding, it will be much more obvious what you are doing
• Do NOT say "if ord(ch) >= 65 and ord(ch) <= 90" when you are checking for upper case
• It is MUCH better to say if ch >= "A" and ch <= "Z" or even if ch in ascii_uppercase (a constant imported from the string module) or if ch.isupper()
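
The four ways of testing for an upper case letter can be compared side by side. This is only a sketch; ch is assumed to hold exactly one character:

```python
from string import ascii_uppercase

print(chr(65))    # A
print(ord("A"))   # 65

ch = "Q"          # assumed to be a single character

by_code = 65 <= ord(ch) <= 90        # works, but hard to read
by_char = "A" <= ch <= "Z"           # clearer
by_constant = ch in ascii_uppercase  # clearer still
by_method = ch.isupper()             # shortest

print(by_code, by_char, by_constant, by_method)   # True True True True
```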

Strings and the slice operator

The slice operator

• This is the “substring” method which most languages have
• Given a string, you can slice pieces of the string
• the syntax of the operator uses the square brackets
• s[5:7] means the part of the string s which starts at position 5 and includes everything up to but not including position 7, in other words, the characters at position 5 and at position 6
• Omitting an argument before the colon means to start at the beginning of the string: s[:3] means s[0], s[1] and s[2]
• Omitting an argument after the colon means to include the rest of the string: s[4:] means s[4], s[5]… up to and including the end of the string
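
The three slice forms look like this on a sample ten-character string (the value is chosen only for illustration):

```python
s = "abcdefghij"   # positions 0 through 9

print(s[5:7])   # fg      (positions 5 and 6)
print(s[:3])    # abc     (positions 0, 1 and 2)
print(s[4:])    # efghij  (position 4 through the end)
```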

slice returns a new string

• Suppose s were "abcdef" and you wrote an expression like s[1:4]
• Its value is "bcd", a new string (the original string is not changed at all)
• If the expression were on a line by itself, it does nothing!
• You need to use the expression IN some statement, like an assignment statement, an if statement, a while statement, a print statement
• print(s[1:4]) makes sense
• t = s[1:4] makes a copy of the 3 characters as a string and puts it into t
• s = s[1:4] this does the same as the statement above, but the previous value of s is also discarded
• if s[1:4] == "bcd": the condition would be True
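
Put together as a runnable sketch, the statements above behave like this:

```python
s = "abcdef"

s[1:4]           # legal, but the resulting string is thrown away
t = s[1:4]       # keep the slice in a new variable
print(t)         # bcd
print(s)         # abcdef, the original is untouched

if s[1:4] == "bcd":
    print("slice matches")

s = s[1:4]       # reassign: the old value of s is discarded
print(s)         # bcd
```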

Be careful of your range

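• str[i:i+1] is the same as saying str[i], as long as i is a valid non-negative subscript
• Unlike indexing, a slice does not give an error when it goes out of range: positions past the end of the string are silently ignored, so you can get back a shorter string (or an empty one) than you expected. Don’t go past the end of the string!
• You can leave off both the starting and ending points, as in s[:]. This produces a copy of the whole string.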
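
A short sketch of how slices and plain subscripts differ at the edges of a string (sample value only):

```python
s = "abc"

print(s[1:2] == s[1])   # True: a one-character slice equals the subscript
print(s[1:100])         # bc, the end point is clipped to the string's length
print(repr(s[5:6]))     # '', an empty string and no error

try:
    s[5]                # a plain subscript out of range does fail
except IndexError:
    print("s[5] is out of range")

print(s[:] == s)        # True: the whole string
```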

Strings and Lists – the split method

The split method for strings

• This method is very powerful (most languages offer something similar)
• It only applies to strings – remember this!
• The syntax is source.split(delimiter) or source.split()
• What does it do? It breaks the string into smaller pieces based on the delimiter string (if given) or using any run of whitespace characters (if no delimiter given)
• It returns a list of strings

With no delimiters

• Suppose that str is " abc \t def\t xyz "
• str has lots of whitespace in it
• If you give the expression str.split() you get the result ["abc", "def", "xyz"]
• Note that the elements of the list do NOT have any whitespace in them at all
• Note that something like " \t " is used as one delimiter, not used individually (which would give you lots of empty strings in the list)
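
The difference between splitting on runs of whitespace and splitting on one explicit space can be sketched as follows (the variable is named text here to avoid hiding the built-in str):

```python
text = " abc \t def\t xyz "

pieces = text.split()          # any run of whitespace is one delimiter
print(pieces)                  # ['abc', 'def', 'xyz']

by_space = text.split(" ")     # every single space is its own delimiter
print(by_space)                # ['', 'abc', '\t', 'def\t', 'xyz', '']
```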

With delimiters

• Suppose str contained "abbabbadefbb" and you wrote the expression str.split("bb")
• You get the result ["a", "a", "adef", ""]
• Note that there is an empty string at the end of the list because the delimiter was found at the very end of the string (same thing happens if the delimiter is at the start of the string also)
• You are only allowed one string as a delimiter, it can be as long as you wish
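
The same example as runnable code, plus a second made-up string showing the empty piece that appears when the delimiter starts the string:

```python
text = "abbabbadefbb"
print(text.split("bb"))      # ['a', 'a', 'adef', '']

edges = "bbxbb"              # delimiter at both the start and the end
print(edges.split("bb"))     # ['', 'x', '']
```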

Strings in Python: String Traversal

Traversing a string

• Traversing just means to process every character in a string, usually from left end to right end
• Python allows for 2 ways to do this – both useful but not identical
• if all you need is the value of each character in the string
    for ch in stringvar:
  will work fine. ch is given the value of each character from left to right, and can be used in the for loop as you need
• The other way uses the location of the characters in the string
    for ct in range(len(stringvar)):
  makes the variable ct an integer which has all the values of the subscripts of the characters in the string. If it is important that you know where a character is, this is the form you use.
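
Both traversal styles are sketched below. The two helper functions, count_vowels and positions_of, are hypothetical names invented for this example:

```python
def count_vowels(text):
    """Value-only traversal: we never need to know where a character is."""
    total = 0
    for ch in text:
        if ch in "aeiouAEIOU":
            total += 1
    return total


def positions_of(text, target):
    """Subscript traversal: we need the location of each match."""
    found = []
    for ct in range(len(text)):
        if text[ct] == target:
            found.append(ct)
    return found


print(count_vowels("banana"))        # 3
print(positions_of("banana", "a"))   # [1, 3, 5]
```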

Strings in Python: String Methods

String methods

• You do not have to import the string module to use these!
• Since strings are objects, you use the dot notation (like in graphics)
• Note that in the methods mentioned which return string results, they create a new string; they do NOT change the old one!
• upper() returns a new string with the alphabetic characters changed to upper case, and lower() returns one with them changed to lower case; they don’t change any other characters at all
• replace(old, new) will return a brand new string with every occurrence of the first pattern, "old", replaced by the second pattern, "new", wherever it is in the original string
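
A brief sketch of these three methods on a sample string; note that the original is unchanged after every call:

```python
original = "Hello, World 42"

print(original.upper())             # HELLO, WORLD 42
print(original.lower())             # hello, world 42
print(original.replace("l", "L"))   # HeLLo, WorLd 42
print(original)                     # Hello, World 42 (still the same)
```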

Strings are immutable

• Strings are immutable – meaning they cannot be changed once they are created
• They can be discarded; the variable name can be reassigned to another different string
• But individual characters cannot be changed in a string
• If name = "Mary" you cannot say name[0] = "S"
• To change the first letter of the string you would have to do something like name = "S" + name[1:]
• To make a string all upper case, you would say something like name = name.upper() - that is, upper returns a string and you assign that new string to the same variable, the old string is discarded
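
The Mary example can be run as a sketch to see the failure and the two workarounds:

```python
name = "Mary"

try:
    name[0] = "S"            # not allowed: strings are immutable
except TypeError:
    print("cannot assign to a character of a string")

name = "S" + name[1:]        # build a new string and reassign the variable
print(name)                  # Sary

name = name.upper()          # same idea: new string, same variable name
print(name)                  # SARY
```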

Strings and whitespace

• whitespace is a specific name for a set of characters, most commonly the blank or space character ' ', the tab character '\t', and the newline character '\n'
• They are very often used as delimiters (markers between) in various string methods (strip, split)
• str.strip() removes all leading and trailing whitespace characters from a string. It does NOT alter any whitespace characters in the middle of the string. Example: " ab\tcd\n".strip() will return "ab\tcd"
• str.split() will use whitespace characters as delimiters to break the string into a list of strings
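
The strip example above, next to split on the same string, as a small sketch (repr is used so the tab stays visible in the output):

```python
raw = " ab\tcd\n"

cleaned = raw.strip()      # only the leading and trailing whitespace goes
print(repr(cleaned))       # 'ab\tcd'

print(raw.split())         # ['ab', 'cd'], the tab in the middle is a delimiter
```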

The find method, the index method, the count method and the in operator

• All of these methods and operators involve finding something in a string
• They could all be written with loops and if statements but it’s much easier to use prewritten functions
• in is the simplest – syntax a in b. It returns a Boolean: True if it finds the first string a somewhere in the string b, False otherwise. It does not matter if there are multiple occurrences of the string a, and it does not matter where the string a is positioned in the string b. It does have to be an exact match, as far as case, spacing, etc.

The find method and index method

• find(target) – this is called with the dot notation on the source string, so a call like s.find("me") is looking in the string s for the string "me"
• It works from the left end of the string every time
• It returns the location of the first occurrence of the target string
• It returns -1 if the target string does not occur at all in the source
• The index method works the same as the find method, except that if the search fails, the index method raises an exception (a ValueError) and, unless that exception is handled, the program crashes!
• Safety rule: use the "in" operator first, before using the index method
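
One way to follow the safety rule is sketched below. The helper safe_index is a hypothetical name invented for this example, and returning None on failure is just one possible choice:

```python
def safe_index(source, target):
    """Hypothetical helper: check with 'in' before calling index."""
    if target in source:
        return source.index(target)
    return None


s = "come home with me"

print("me" in s)               # True
print(s.find("me"))            # 2, the left-most occurrence (inside "come")
print(s.find("xyz"))           # -1, not found
print(safe_index(s, "home"))   # 5
print(safe_index(s, "xyz"))    # None, and no exception

try:
    s.index("xyz")             # index raises instead of returning -1
except ValueError:
    print("index could not find the target")
```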

The count method

• The count method is similar to the find and index methods
• count("m") will return an integer from 0 up to length of the string
• It is a method so it’s called with the dot notation: s.count("m") would return an integer that shows how many m’s appear in the string
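
A quick sketch with a sample word. The last line shows that count counts non-overlapping occurrences, which matters when the target is longer than one character:

```python
s = "mammogram"

print(s.count("m"))         # 4
print(s.count("z"))         # 0
print("aaaa".count("aa"))   # 2, occurrences are not allowed to overlap
```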

Summary of location/membership

• the in operator gives you just True or False, tells you whether the target string is in the source string or not, NO location
• find and index return the location of the left-most occurrence of the target string in the source string (find returns -1 for failure, index crashes the program for failure)
• count returns the number of occurrences of the target string in the source string

Strings in Python: Creating a string

Creating a string

• There are many different ways to create a string
• The simplest way is to hard code it into your program
    mystring = "Hello"
• Remember individual characters of mystring cannot be changed afterward (strings are immutable)
• Many of the string methods return new strings
    newstring = mystring.upper()
• The typecast str will create a string from another data type
    mynumstring = str(num)
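
The three statements above can be run together as a sketch; the value of num is made up for the example:

```python
mystring = "Hello"               # hard coded
newstring = mystring.upper()     # returned by a string method
num = 42                         # example value
mynumstring = str(num)           # converted from an integer

print(newstring)                 # HELLO
print(mynumstring + "!")         # 42!  (it concatenates like any string)
print(len(mynumstring))          # 2
```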

Getting a string from input

• When you read from the keyboard, the input function will always return a string
    myname = input("What's your name? ")
• When you learn about external files, one thing you do is read from them (see Chapter 10)
• The input from a file always comes in as a string or list of strings

Other ways to create a string

• You can build up a string one character at a time
• The pattern is very similar to an accumulator
• You initialize a variable to an empty string
    new_one = ""
• Then in a loop you concatenate the new character onto the variable
    new_one = new_one + ch
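
The accumulator pattern in a complete loop might look like the sketch below. The function remove_spaces is a hypothetical example, not something from the slides:

```python
def remove_spaces(text):
    """Build a new string holding every character of text except spaces."""
    new_one = ""                       # start with the empty string
    for ch in text:
        if ch != " ":
            new_one = new_one + ch     # accumulate one character at a time
    return new_one


print(remove_spaces("a b  c"))   # abc
```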