June 27, 2022 1 Laura Broussard, Ph.D. Professor COS 131: Computing for Engineers Chapter 6: Character Strings

Covenant College September 3, 2015 Laura Broussard, Ph.D. Professor COS 131: Computing for Engineers Chapter 6: Character Strings



Laura Broussard, Ph.D.

Professor

COS 131: Computing for Engineers

Chapter 6: Character Strings


Introduction
• This chapter discusses the nature, implementation, and behavior of character strings in MATLAB
• We will consider:
  – Internal workings of character strings as vectors
  – Operations on character strings
  – Converting between numeric and character string representations
  – Input and output functions
  – Construction and uses for arrays of strings
• The M-files we use to store scripts and functions consist of lines of legible characters separated by an invisible “new-line” character


Introduction
• We will discuss the underlying concept of character storage and the tools MATLAB provides for operating on character strings
• Distinguish two different relationships between characters and numbers:
  – Individual characters have an internal numerical representation
  – Strings of characters represent numerical values to the user


Introduction
• Individual characters have an internal numerical representation
  – Characters are drawn as a pattern of white and black dots by special software called a character generator
  – The character generator lets us take the underlying concept of a character (for example, w) and draw that character on screen or paper in accordance with the rules defined by the current font


Introduction
• Individual characters have an internal numerical representation
  – We represent each individual character by a numerical equivalent
  – The dominant representation is the one defined by the American Standard Code for Information Interchange (ASCII)
  – The most common uppercase and lowercase characters, numbers, and many punctuation marks are represented by numbers between 1 and 127


Introduction

We briefly examined this character coding scheme in Chapter 1. It is repeated here for your convenience. You can observe the binary mapping, so a given ASCII binary code triggers the character generator to display the corresponding character.


Introduction
• Strings of characters represent numerical values to the user:
  – Whenever we need to see the value of a number in the Command window, MATLAB automatically converts its internal representation into a character string that shows the value in a form we can read
  – When we use the input(…) function, the set of characters that we enter is automatically translated to the internal number representation


Character String Concepts: Mapping and Casting

• MATLAB tools that deal with the first relationship between characters and numbers
  – The numerical representation of individual characters
  – Mapping: defines a relationship between two entities
    • Character mapping allows each individual graphic character to be uniquely represented by a numerical value (What about a computer makes us do this?)


MATLAB Implementation
• MATLAB's external specification of character strings uses the single quote mark (') to delimit character strings
• The editor colors the resulting string in purple
• When you need a single quote mark as actual text, write two single quote marks in a row, as in 'don''t'
• Exercise 6.1: Character Casting
• Smith text, page 137, bottom
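The quoting rule can be checked with a short sketch. The variable names are illustrative and are not taken from the Smith text.

```matlab
% Illustrative sketch: a doubled quote inside the delimiters yields one quote.
s = 'don''t';      % the stored string is d o n ' t
n = length(s);     % 5 characters, not 6
disp(s)            % the delimiting quotes are not shown in the output
```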


MATLAB Implementation
• In Exercise 6.1 the casting function uint8(…) takes a character or character string and converts it to a vector of numeric codes with the same length as the original string
• The casting function char(…) takes a vector of numbers and converts it to a string representation
• The casting function double(…) acts the same as uint8(…) but stores the values using 64 bits
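A minimal sketch of the three casts described above. It is not the text of Exercise 6.1, and the variable names are assumptions chosen for illustration.

```matlab
% Illustrative sketch: cast a string to its ASCII codes and back.
s       = 'Hi';           % 1x2 character vector
codes8  = uint8(s);       % 8-bit codes, same length as s: 72 and 105
codes64 = double(s);      % the same values stored as 64-bit doubles
back    = char(codes8);   % cast the codes back to characters
disp(codes8)
disp(back)
```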


MATLAB Implementation
• Use single quotes as delimiters within a MATLAB script
• Quotes are removed when displayed as output
• Arithmetic operations on character strings are legal: MATLAB operates on the numerical equivalents, and you can get the character equivalents back by using the char(…) casting function


MATLAB Implementation
• Slicing and Concatenating Strings
  – Strings are internally represented as vectors
  – Can perform all usual vector operations on strings
  – Exercise 6.2: Character strings
  – Smith text, page 138, bottom
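Because strings are vectors, the usual indexing and concatenation syntax applies. The following is a small sketch with made-up values, not the exercise from the text.

```matlab
% Illustrative sketch: strings behave like row vectors of characters.
s        = 'character';
first4   = s(1:4);                    % slice: 'char'
reversed = s(end:-1:1);               % reverse: 'retcarahc'
joined   = [first4, ' ', 'string'];   % concatenate: 'char string'
n        = length(joined);            % 11
```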


MATLAB Implementation
• Arithmetic and Logical Operations
  – Mathematical operations can be performed on the numerical mapping of a character string
  – MATLAB will do the cast for you
  – The result is of type double, and its values are not always valid character codes
  – Logical operations on character strings are exactly equivalent to logical operations on vectors
  – Exercise 6.3: Character string logic
  – Smith text, page 139, top
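A short sketch of arithmetic and logical operations on a string. The values are illustrative, and the offset of 32 between lowercase and uppercase letters is a property of ASCII rather than something stated on the slide.

```matlab
% Illustrative sketch: arithmetic casts to double; logic works element-wise.
s       = 'abc';
shifted = s + 1;            % MATLAB casts for you: [98 99 100]
cls     = class(shifted);   % 'double'
asText  = char(shifted);    % cast back: 'bcd'
caps    = char(s - 32);     % 'ABC' (upper and lower case differ by 32 in ASCII)
isA     = (s == 'a');       % logical vector [1 0 0]
afterA  = s(s > 'a');       % logical indexing: 'bc'
```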


MATLAB Implementation
• Useful functions
  – Functions useful in analyzing character strings:
    • ischar(a) returns TRUE if a is a character string
    • isspace(ch) returns TRUE if the character ch is a white-space character such as a space, tab, or new line
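These two functions can be combined for simple text analysis. The sketch below uses an assumed sample sentence and counts words by counting the spaces between them.

```matlab
% Illustrative sketch: test for character data and locate the spaces.
txt      = 'to be or not';
isText   = ischar(txt);      % true
isNum    = ischar(42);       % false: 42 is a double
spaces   = isspace(txt);     % logical vector, true at each space
nWords   = sum(spaces) + 1;  % 4 words separated by single spaces
noSpaces = txt(~spaces);     % 'tobeornot'
```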


Format Conversion Functions
• Second relationship between characters and numbers:
  – Using character strings to represent individual number values
  – Two separate capabilities:
    • Converting numbers from efficient, internal form to legible strings
    • Converting strings provided by users of MATLAB into the internal number representation


Format Conversion Functions
• Conversion from Numbers to Strings
  – MATLAB functions for a simple conversion of a single number, x, to its string equivalent representation:
    • int2str(x) if you want it displayed as an integer
    • num2str(x, n) to see the decimal parts; the parameter n sets the maximum number of significant digits (not decimal places); if n is omitted, MATLAB shows about four digits after the decimal point
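A minimal sketch of the two simple conversion functions, using an assumed value for x.

```matlab
% Illustrative sketch: simple number-to-string conversions.
x   = 3.14159;
a   = int2str(x);            % '3' (rounded to the nearest integer)
b   = num2str(x);            % '3.1416' with the default precision
c   = num2str(x, 3);         % '3.14' (three significant digits)
msg = ['x is about ', c];    % the result is an ordinary string
disp(msg)
```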


Format Conversion Functions
• Conversion from Numbers to Strings
  – Often need better control over the data conversion
    • Function sprintf(…) provides fine-grained control
    • First parameter is a format control string that defines exactly how the resulting string should be formatted
    • A variable number of value parameters follow the format string; provide data items as necessary to satisfy the formatting


Format Conversion Functions
• Conversion from Numbers to Strings
  – The format control string contains two types of special entry, introduced by the special characters '%' and '\':
    • The '%' character introduces a conversion specification, indicating how one of the value parameters should be represented
    • Most common conversions:
      – %d integer
      – %f real
      – %g general
      – %c character
      – %s string


Format Conversion Functions
• Conversion from Numbers to Strings
  – A number may be placed immediately after the % character to specify the minimum number of characters in the conversion
  – Add '.n' to indicate the number of decimal places required
  – If you require % as a character in a string, you must write %%
  – If there are more parameters than conversion specifications in the format control string, the format control string is repeated
  – The '\' character introduces format control information, the most common of which are \n (new line) and \t (tab)
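The formatting rules above can be sketched with a few sprintf(…) calls. All values and names here are illustrative.

```matlab
% Illustrative sketch: common sprintf conversion specifications.
s1 = sprintf('%d apples', 3);                 % '3 apples'
s2 = sprintf('%6.2f', pi);                    % '  3.14' (width 6, 2 decimals)
s3 = sprintf('%d%%', 50);                     % '50%' (%% gives a literal %)
s4 = sprintf('%d,', [1 2 3]);                 % '1,2,3,' (format is repeated)
s5 = sprintf('%s has %c grade', 'Ann', 'A');  % 'Ann has A grade'
s6 = sprintf('a\tb\n');                       % a, tab, b, new line
```

In s4 the single %d is reused for each of the three values, which is the repetition rule described in the list above.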


Format Conversion Functions
• Conversion from Numbers to Strings
  – Example: you need a comma after the format string and before the variable list


Format Conversion Functions
• Conversion from strings to numbers
  – This is a much messier and more complex process
  – Should avoid if possible
  – Use MATLAB's function input(…) to do these conversions for you
  – Use the function sscanf(…) if you must do these types of conversions yourself


Format Conversion Functions
• Conversion from strings to numbers
  – Function input(str) displays the string parameter to the user in the Command window and waits for the user to type some characters followed by the Enter key
  – It then converts the input string according to the following rules:
    • If the string begins with a numerical character, MATLAB converts the string to a number
    • If it begins with a non-numeric character, MATLAB constructs a variable name and looks for its definition
    • If it begins with an open bracket, '[', a vector is constructed
    • If it begins with the single quote character, MATLAB creates a string


Format Conversion Functions
• Conversion from strings to numbers
  – Function input(str)
    • If a format error occurs, MATLAB repeats the prompt in the Command window
    • This can be modified if 's' is provided as the second parameter, as in input(str, 's'); the complete input character sequence is then saved as a string
  – Exercise 6.4: The input(…) function
  – Smith text, page 141, bottom
  – Observations:
    • MATLAB attempts to distinguish between a variable and a number by the first character
    • When input(…) detects an error parsing the text entered, it automatically resets and requests a new entry


Format Conversion Functions
• Conversion from strings to numbers
  – Function sscanf(…) is very different from the C/C++ implementation
  – Simplest form: cv = sscanf(str, fmt) scans the string str and converts each data item according to the conversion specifications in the format string fmt
  – Each item discovered in str produces a new row in the result array, cv, a column vector
  – If you convert strings this way, each character in the string becomes a separate numerical result in the output vector
  – MATLAB allows you to substitute the character '*' for the conversion size parameter to suppress any strings in the input string. See example on the next slide.
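The slide's own example is not reproduced in this transcript. As a stand-in, here is a minimal sketch of the three behaviors just described, using made-up input strings.

```matlab
% Illustrative sketch: sscanf returns a column vector of converted items.
v = sscanf('3 4.5 6', '%f');        % [3; 4.5; 6], one row per item found
w = sscanf('width 42', '%*s %d');   % %*s skips the word, so w is 42
m = sscanf('7 abc', '%d %s');       % mixed input: [7; 97; 98; 99]
```

In the last call the letters a, b, and c come back as their ASCII codes, which is why string conversions rarely belong in the same format as numeric ones.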


Format Conversion Functions
• Conversion from strings to numbers
  – Function sscanf(…) example:


Character String Operations
• Can use simple functions with limited flexibility, or complex functions that offer far more control
• Simple data output: the disp(…) function
  – disp(…) presents the value of any variable, regardless of type, or of strings constructed by concatenation
  – Note that an explicit number conversion is required to concatenate variables with strings
  – Exercise 6.5: The disp(…) function
  – Smith text, page 143, top
  – Note that conversion from the ASCII code is not automatic
  – Must use the simple string conversion functions to enforce consistent information for concatenation
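The need for an explicit conversion shows up as soon as a number is concatenated with a string. This sketch uses an assumed value of 65 because its ASCII character is easy to recognize.

```matlab
% Illustrative sketch: convert numbers explicitly before concatenating.
n     = 65;
right = ['n = ', num2str(n)];   % 'n = 65'
wrong = ['n = ', n];            % 65 is taken as a character code: 'n = A'
disp(right)
disp(wrong)
```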


Character String Operations
• Complex Output
  – Function fprintf(…) is similar to sprintf(…), except that it prints its results to the Command window instead of returning a string
  – fprintf(…) returns the number of characters actually printed
  – Exercise 6.6: fprintf(…) and sprintf(…)
  – Smith text, pages 143-144, bottom-top
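A brief sketch contrasting the two functions with the same format string. The value of x is an assumption for illustration.

```matlab
% Illustrative sketch: sprintf returns the text, fprintf prints it.
x = 2.5;
s = sprintf('x = %4.1f\n', x);   % nothing is printed; s holds the text
n = fprintf('x = %4.1f\n', x);   % prints the line and returns its length
```

Here n and length(s) are both 9: four characters for 'x = ', four for the padded number, and one for the new line.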


Character String Operations
• Comparing Strings
  – In MATLAB strings are readily translated into vectors of numbers
  – They may be compared with the logical operators we used on numbers
  – There is the restriction that either the strings must be of the same length or one of them must be of length 1 before it is legal to compare them with these operators
  – MATLAB provides the C-style function strcmp(<s1>,<s2>) that returns true if the strings are identical and false if they are not
  – Exercise 6.7: Character string comparison
  – Smith text, page 144, bottom


Character String Operations
• Comparing Strings
  – Exercise 6.7 observations
    • Strings of the same length compare exactly as vectors do, returning a logical vector result
    • Cannot use the equality test on strings of unequal length
    • strcmp(…) deals gracefully with strings of unequal length
    • For case-independent testing, use strcmpi(…)
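These observations can be sketched as follows. The strings are made up and are not the ones used in Exercise 6.7.

```matlab
% Illustrative sketch: element-wise comparison versus strcmp.
a    = 'hello';
b    = 'help!';
same = (a == b);                    % logical vector [1 1 1 0 0]
isL  = (a == 'l');                  % length-1 operand is allowed: [0 0 1 1 0]
t1   = strcmp(a, b);                % false: the strings differ
t2   = strcmp('abc', 'abcd');       % false, and no error for unequal lengths
t3   = strcmpi('MATLAB', 'matlab'); % true: case is ignored
% 'abc' == 'abcd' would raise an error because the lengths differ.
```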


Arrays of Strings
• A single character string is stored in a vector
• Natural to consider storing a collection of strings as an array
• Character arrays are constructed by either of the following:
  – As a vertical vector of strings, all of which must be the same length
  – By using a special version of the char(…) cast function that accepts a variable number of strings with different lengths, pads them with blanks to make all rows the same length, and stores them in an array of characters
  – Exercise 6.8: Character string arrays
  – Smith text, page 145, bottom
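Both construction methods can be sketched in a few lines. The strings are illustrative, and deblank(…) is an extra function not mentioned on the slide, used here only to strip the padding.

```matlab
% Illustrative sketch: two ways to build a character array.
m1      = ['cat'; 'dog'];          % vertical concatenation, equal lengths
m2      = char('cat', 'horse');    % char pads 'cat' with blanks to length 5
dims    = size(m2);                % [2 5]
row1    = m2(1, :);                % 'cat  ' including two padding blanks
trimmed = deblank(row1);           % 'cat' with trailing blanks removed
```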


Engineering Example: Encryption
• Problem:
  – Increasing interest in the use of encryption to protect intellectual property and private communication from unauthorized access
  – The problem illustrates a very simple approach to developing an algorithm that is immune to all but the most obvious, brute-force code-breaking techniques


Engineering Example: Encryption
• Background:
  – Simple encryption has been accomplished by substituting one character for another in the message
  – More advanced techniques use a random letter selection process to substitute new letters


Engineering Example: Encryption
• Solution:
  – Propose a simple algorithm where a predetermined random series is used to select the replacement letters
  – The MATLAB rand(…) function is an excellent source for an appropriate random sequence (pseudo-random number generation)
  – Because there is an abundant set of different techniques for generating pseudo-random sequences, the specific generation technique must be known, in addition to the seed value, for successful decryption
  – See MATLAB code on following slide


Engineering Example: Encryption
• MATLAB code:


Engineering Example: Encryption
• MATLAB code (continued):


Engineering Example: Encryption
• Observations on Listing 6.1:
  – Original text taken from earlier in this chapter
  – Multiple lines of characters can be concatenated
  – The number 13 acts as a line break; strictly, 13 is the carriage return ('\r'), and the new-line escape sequence '\n' is 10
  – Line 15 seeds the pseudo-random number generator with a known value: rand('state', 123456)
  – No two characters of the original text are replaced by the same character
  – Program begins the decryption by seeding the generator with the same value
  – Must subtract the random sequence from the encrypted string and correct for underflow


Engineering Example: Encryption
• Observations on Listing 6.1:
  – Best to add the RANGE value to all the letters while subtracting the random offsets, and then bring back those values that remain above the highest printable character
  – Program then attempts to decrypt with the same code but a bad seed
  – Program then attempts to decrypt with the correct seed but a different random generator, here MATLAB's normal random generator limited to positive values within the letter range of interest
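Listing 6.1 itself is not reproduced in this transcript. The following is a minimal sketch of the same idea, not the listing. It assumes printable ASCII codes 32 to 126, uses rng(…) for seeding instead of the listing's rand('state', …), and uses mod(…) for wrap-around instead of the add-RANGE-then-correct step described above. The names LO, HI, txt, and seed are hypothetical.

```matlab
% Illustrative sketch (not Listing 6.1): random-offset substitution cipher.
LO    = 32;                    % lowest printable ASCII code (assumed range)
HI    = 126;                   % highest printable ASCII code (assumed range)
RANGE = HI - LO + 1;           % 95 printable characters
txt   = 'Attack at dawn';      % hypothetical message
seed  = 123456;

% Encrypt: add a pseudo-random offset to each character, wrapping in range.
rng(seed);
offsets = floor(rand(size(txt)) * RANGE);          % integers 0..RANGE-1
enc = char(mod(double(txt) - LO + offsets, RANGE) + LO);

% Decrypt: reseed with the same value to regenerate the same offsets.
rng(seed);
offsets = floor(rand(size(txt)) * RANGE);
dec = char(mod(double(enc) - LO - offsets, RANGE) + LO);

% A wrong seed regenerates different offsets, so the result is not the message.
rng(seed + 1);
badOffsets = floor(rand(size(txt)) * RANGE);
bad = char(mod(double(enc) - LO - badOffsets, RANGE) + LO);

disp(enc)
disp(dec)
disp(bad)
```

Because mod(…) with a positive divisor never returns a negative value in MATLAB, the subtraction during decryption needs no separate underflow correction in this variant.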


Engineering Example: Encryption
• Partial output for Listing 6.1: