C# Strings
CNS 3260
C# .NET Software Development

System.String
string maps to System.String
Strings are:
  • reference-type objects
  • immutable
  • sealed
Strings have properties and methods
Strings contain 16-bit Unicode characters

Immutability of a string
String methods do not modify the string; they return a new string.
    string str = "one two three";
    str.Replace("one", "1");        // Wrong way: the returned string is discarded
    str = str.Replace("one", "1");  // Right way: keep the returned string

Strings are Reference-Types
    string s1 = null;  // string reference that refers to no object
    string s2 = "";    // string reference to a string object that contains ""
Note that s1 != s2
String variables can be null
null is not the same as "" (the empty string)

Underlying Characters
Escape sequences
  • \a - bell
  • \b - backspace
  • \t - tab
  • \r - carriage return
  • \n - line feed (new-line)
  • \\ - backslash
  • \" - double quote
Literal string
  • preceded by '@'
  • needs no escape characters (a double quote inside it is written as "")
    string s1 = @"C:\Demo";
    string s2 = "C:\\Demo";

Interfaces
System.String
  • IComparable
  • ICloneable
  • IConvertible
  • IEnumerable

Enumerating through a string
Contained type is 'char'
    string str = "something, something, something";
    foreach (char ch in str)
    {
        // do work here
    }

String static Members
  • Empty
  • Compare()
  • CompareOrdinal()
  • Concat()
  • Copy()
  • Format()
  • Intern()
  • IsInterned()
  • Join()

string.Empty
static read-only field equivalent to ""
    string str = string.Empty;
    // means
    string str = "";

Compare and Concatenation
string.Compare(string s1, string s2)
  • Returns less-than zero if s1 is lexicographically less-than s2
  • Returns greater-than zero if it's greater
  • Returns zero if they are the same
string.CompareOrdinal
  • Compares the numeric value of each underlying char
string.Concat(params object[] strings)
  • Calls ToString on each object and concatenates them
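
A minimal sketch of the three methods above. The class and method names (CompareDemo, OrdinalSign, ConcatMixed) are made up for illustration. Only the sign of a comparison result is meaningful, so the example reduces it with Math.Sign. The culture-sensitive Compare call is printed without an expected value because its result can depend on the current culture.

```csharp
using System;

static class CompareDemo
{
    // Reduces an ordinal comparison to -1, 0, or 1; only the sign matters.
    public static int OrdinalSign(string a, string b)
    {
        return Math.Sign(string.CompareOrdinal(a, b));
    }

    // Concat calls ToString() on each argument and joins the results.
    public static string ConcatMixed()
    {
        return string.Concat("x", 1, true);
    }

    static void Main()
    {
        // 'a' is 97 and 'B' is 66, so the ordinal result is positive.
        Console.WriteLine(OrdinalSign("a", "B"));      // 1
        Console.WriteLine(OrdinalSign("abc", "abc"));  // 0

        // Culture-sensitive comparison: sign depends on the current culture.
        Console.WriteLine(Math.Sign(string.Compare("a", "B")));

        Console.WriteLine(ConcatMixed());              // x1True
    }
}
```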

string + and ==
Operators + and == are also defined
    string s1 = "1", s2 = "2";
    string s3 = s1 + s2;
    s3 += "hahahaha";
    if (s1 == s2)
    {
        // do something
    }
    if (s1 != s2)
    {
        // do something
    }

string.Format()
string.Format(string format, params object[] args)
  • format is the string to perform the formatting on
  • format may contain zero or more embedded format items
  • format items have the following structure:
      {index[,alignment][:formatString]}
    to create a format item, you must have at least "{index}"
  • args is a list of parameters that will be inserted into the format string
    decimal price = 5.99m;  // the m suffix is required for a decimal literal
    string output = string.Format("Price: {0,-5:C}", price);

Format Items
{index[,alignment][:formatString]}
index refers to the parameter in the params array
  • it is 0-based
alignment
  • requires the comma if used
  • is an integer specifying the minimum width
  • negative values mean left-justify the value
  • positive values mean right-justify the value
formatString
  • requires the colon if used
  • is sent to the parameter if it implements IFormattable
(See StringFormat Demo)
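
A small sketch of index, alignment, and formatString working together. It is not the StringFormat Demo referenced above; FormatDemo, Left, and Right are illustrative names. It assumes the invariant culture and the "F2" format so the output does not vary by machine (the "C" currency format does).

```csharp
using System;
using System.Globalization;

static class FormatDemo
{
    // Negative alignment: left-justify in an 8-character field.
    public static string Left(decimal price)
    {
        return string.Format(CultureInfo.InvariantCulture, "[{0,-8:F2}]", price);
    }

    // Positive alignment: right-justify in an 8-character field.
    public static string Right(decimal price)
    {
        return string.Format(CultureInfo.InvariantCulture, "[{0,8:F2}]", price);
    }

    static void Main()
    {
        Console.WriteLine(Left(5.99m));   // [5.99    ]
        Console.WriteLine(Right(5.99m));  // [    5.99]

        // Indexes are 0-based and may appear in any order.
        Console.WriteLine(string.Format("{1} before {0}", "a", "b"));  // b before a
    }
}
```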

IFormattable
Overloads the ToString() method:
    ToString(string format, IFormatProvider formatProvider)
Available IFormatProvider classes are:
  • NumberFormatInfo
  • DateTimeFormatInfo

Back to string.Format()
If the parameter does not accept the formatString passed, a FormatException is thrown
If the parameter is not IFormattable, the formatString field is ignored
If you need to use '{' or '}' in your string, double them up to escape them:
    string str = string.Format("{{ {0} }}", 555);

string.Join()
string.Join(string separator, string[] values)
  • joins each of the strings in values separated by separator
    string str = string.Join(":", new string[] { "1", "2", "3", "4" });
(See StringManipulation Demo)

Last 2 Static Members
Literal strings are stored in a table called the Intern Pool
  • saves storage
  • speeds up some operations
Intern(string str)
  • if str is already interned, the system reference is returned
  • if str is not interned, it is added to the pool and the new reference is returned
IsInterned(string str)
  • if str is already interned, the system reference is returned
  • if str is not interned, a null reference is returned
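
The following sketch illustrates Intern and IsInterned; it is not the InternPool Demo from the course. InternDemo and BuildAtRuntime are illustrative names. A string is assembled at run time so that it is a separate object from the literal with the same value, and a fresh GUID is assumed never to be in the pool already.

```csharp
using System;
using System.Text;

static class InternDemo
{
    // Builds "hello" at run time, so the result is a new object
    // rather than the interned literal.
    public static string BuildAtRuntime()
    {
        return new StringBuilder().Append("hel").Append("lo").ToString();
    }

    static void Main()
    {
        string literal = "hello";
        string built = BuildAtRuntime();

        Console.WriteLine(literal == built);                       // True (same value)
        Console.WriteLine(object.ReferenceEquals(literal, built)); // False (two objects)

        // Intern maps equal values to one shared pool reference.
        Console.WriteLine(object.ReferenceEquals(
            string.Intern(literal), string.Intern(built)));        // True

        // IsInterned returns null for a value that is not in the pool.
        string fresh = Guid.NewGuid().ToString();
        Console.WriteLine(string.IsInterned(fresh) == null);       // True
    }
}
```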

Switching on strings
The Intern Pool should *theoretically* make switching on strings faster than doing multiple if-else statements
  • but...
Results from the StringSwitchBench Demo show otherwise:
(See StringSwitchingBench Demo)

Instance Members
  • Indexer
  • Length
  • EndsWith()
  • Equals()
  • IndexOf()
  • IndexOfAny()
  • Insert()
  • LastIndexOf()
  • LastIndexOfAny()
  • PadLeft()
  • PadRight()
  • Remove()
  • Replace()
  • Split()
  • StartsWith()
  • Substring()
  • ToLower()
  • ToUpper()
  • Trim()
  • TrimStart()
  • TrimEnd()

String Indexer
gets the char at the specified location
readonly (no set)
    string str = "hello world";
    char ch = str[4];

String Equals
Equals(string str)
  • tests the equality of the value of the calling string and str
(See InternPool Demo)

EndsWith() and StartsWith()
EndsWith(string str)
  • returns true if the calling string ends with str
StartsWith(string str)
  • returns true if the calling string starts with str

Searching strings
IndexOf(string substr)
  • Returns the integer index of the first occurrence of substr in the calling string
LastIndexOf(string substr)
  • Returns the integer index of the last occurrence of substr in the calling string
IndexOfAny(char[] search)
  • Returns the integer index of the first occurrence of any of the characters in search
LastIndexOfAny(char[] search)
  • Returns the integer index of the last occurrence of any of the characters in search
**Each of these returns -1 if the substring or character cannot be found
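
A short sketch of the four search methods on one sample string. SearchDemo and Text are illustrative names; indexes are 0-based.

```csharp
using System;

static class SearchDemo
{
    // Positions: "one" starts at 0, first "two" at 4, "three" at 8, last "two" at 14.
    public const string Text = "one two three two";

    static void Main()
    {
        char[] search = { 'w', 'h' };

        Console.WriteLine(Text.IndexOf("two"));          // 4
        Console.WriteLine(Text.LastIndexOf("two"));      // 14
        Console.WriteLine(Text.IndexOfAny(search));      // 5  (the 'w' in the first "two")
        Console.WriteLine(Text.LastIndexOfAny(search));  // 15 (the 'w' in the last "two")
        Console.WriteLine(Text.IndexOf("four"));         // -1 (not found)
    }
}
```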

Insert, Remove, Replace
Insert(int position, string str)
  • Returns a new copy of the calling string with str inserted at position
Remove(int start, int count)
  • Returns a new copy of the calling string with the specified section removed
Replace(string oldValue, string newValue)
Replace(char oldValue, char newValue)
  • Returns a new copy of the calling string with all instances of oldValue replaced with newValue

Substring()
Substring(int start, int count)
  • Returns a new string with the substring of the calling string from start for count characters
Substring(int start)
  • Returns a new string with the substring of the calling string from start to the end of the string

Padding strings
PadLeft(int width)
PadLeft(int width, char padChar)
  • If the string is shorter than width:
      Returns a new copy of the calling string, padding the left side with padChar
      If no padChar is specified, space is used
PadRight(int width)
PadRight(int width, char padChar)
  • If the string is shorter than width:
      Returns a new copy of the calling string, padding the right side with padChar
      If no padChar is specified, space is used
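
A brief sketch of the padding rules above. PadDemo and Bracket are illustrative names; the brackets only make the padding visible.

```csharp
using System;

static class PadDemo
{
    // Wraps a value in brackets so leading and trailing padding is visible.
    public static string Bracket(string s)
    {
        return "[" + s + "]";
    }

    static void Main()
    {
        Console.WriteLine(Bracket("42".PadLeft(5)));        // [   42]
        Console.WriteLine(Bracket("42".PadLeft(5, '0')));   // [00042]
        Console.WriteLine(Bracket("42".PadRight(5, '.')));  // [42...]

        // A string already at least as long as width comes back unchanged.
        Console.WriteLine(Bracket("123456".PadLeft(5)));    // [123456]
    }
}
```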

Manipulating Case
ToUpper()
  • Returns a new copy of the calling string with all letter characters set to upper-case
ToLower()
  • Returns a new copy of the calling string with all letter characters set to lower-case

Trimming
Trim()
  • Returns a new copy of the calling string with all white space removed from the beginning and end
Trim(params char[] trimChars)
  • Returns a new copy of the calling string with all characters in the trimChars set removed from the beginning and end
TrimEnd(params char[] trimChars)
  • Returns a new copy of the calling string with all characters in the trimChars set removed from the end
TrimStart(params char[] trimChars)
  • Returns a new copy of the calling string with all characters in the trimChars set removed from the beginning

Split()
The opposite of string.Join()
Split(params char[] delimiters)
  • Returns an array of strings
  • Splits the calling string each time it finds any one of the delimiters
  • adjacent delimiters split into an empty string
The following example reads a file in and separates the lines into a string[]:
    // requires: using System.IO;
    string[] lines;
    using (StreamReader sr = new StreamReader("MyFile.txt"))
    {
        // Drop carriage returns, then split on line feeds.
        lines = sr.ReadToEnd().Replace("\r", "").Split('\n');
    }
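
A minimal sketch of the "adjacent delimiters" point. SplitDemo and its two methods are illustrative names. The second method uses the Split overload that takes StringSplitOptions.RemoveEmptyEntries, which is not mentioned on the slide but is part of the framework.

```csharp
using System;

static class SplitDemo
{
    // Plain Split: two adjacent commas produce an empty string between them.
    public static string[] SplitKeepEmpty(string s)
    {
        return s.Split(',');
    }

    // Same split, but empty entries are discarded.
    public static string[] SplitDropEmpty(string s)
    {
        return s.Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries);
    }

    static void Main()
    {
        Console.WriteLine(SplitKeepEmpty("a,,b").Length);  // 3: "a", "", "b"
        Console.WriteLine(SplitDropEmpty("a,,b").Length);  // 2: "a", "b"
    }
}
```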

Tokenizing a string
Tokenizing means splitting into parts
  • Why not use Split()?
      Leaves empty strings sometimes
      We must know all possible tokens (delimiters)
(See SearchingStrings Demo)

Tokenizing using IndexOf()
Exactly what we want
  • No extra empty strings
  • We only need to know what makes up a word (not what doesn't)
Not the most elegant code though...
(See SearchingStrings Demo)
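
The SearchingStrings Demo itself is not reproduced here. The following is only a rough sketch of the idea, under the assumption that a "word" is a run of ASCII letters, hyphens, and apostrophes. Tokenizer, WordChars, and Tokenize are hypothetical names. IndexOfAny finds where the next word starts, and a scan finds where it ends.

```csharp
using System;
using System.Collections.Generic;

static class Tokenizer
{
    // Assumption: these are the only characters that make up a word.
    private const string WordChars =
        "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ-'";

    public static List<string> Tokenize(string text)
    {
        var tokens = new List<string>();
        char[] wordChars = WordChars.ToCharArray();

        // Find the first word character, if any.
        int start = text.IndexOfAny(wordChars);
        while (start != -1)
        {
            // Advance to the first character that is not part of the word.
            int end = start;
            while (end < text.Length && WordChars.IndexOf(text[end]) != -1)
            {
                end++;
            }
            tokens.Add(text.Substring(start, end - start));

            // Look for the start of the next word after this one.
            start = end < text.Length ? text.IndexOfAny(wordChars, end) : -1;
        }
        return tokens;
    }

    static void Main()
    {
        // Prints one, two, it's on separate lines; no empty strings appear.
        foreach (string token in Tokenize("  one,, two;  it's!"))
        {
            Console.WriteLine(token);
        }
    }
}
```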

Introducing Regular Expressions
String pattern matching tool
Regular expressions constitute a language
  • C# regular expressions are a language inside a language
Used in many languages (Perl most notably)
There's a whole class on the theory
  • CNS 3240: Computational Theory
  • By Chuck Allison

Pattern Matching
Match any of the characters in brackets [] once
  • [a-zA-Z]
Anything not in brackets is matched exactly
  • Except for special characters
  • abc[a-zA-Z]
Match preceding pattern zero or more times
  • [a-zA-Z]*
Match preceding pattern one or more times
  • [a-zA-Z]+

Language Elements
  ()   groups patterns
  |    "or", choose between patterns
  []   defines a range of characters
  {}   used as a quantifier
  \    escape character
  .    matches any character
  ^    beginning of line
  $    end of line
  [^]  not character specified

Quantifiers
  *      Matches zero or more
  +      Matches one or more
  ?      Matches zero or one
  {n}    Matches exactly n
  {n,}   Matches at least n
  {n,m}  Matches at least n, up to m
These quantifiers always take the largest pattern they can match
Lazy quantifiers always take the smallest pattern they can match
  • The lazy quantifiers are the same as those listed above, except followed by a ?
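
A small sketch contrasting a greedy and a lazy quantifier on the same input. QuantifierDemo and FirstMatch are illustrative names.

```csharp
using System;
using System.Text.RegularExpressions;

static class QuantifierDemo
{
    // Returns the text of the first match of pattern in input.
    public static string FirstMatch(string input, string pattern)
    {
        return Regex.Match(input, pattern).Value;
    }

    static void Main()
    {
        string html = "<b>bold</b>";

        // Greedy: .+ runs to the last '>' it can reach.
        Console.WriteLine(FirstMatch(html, "<.+>"));   // <b>bold</b>

        // Lazy: .+? stops at the first '>' that completes a match.
        Console.WriteLine(FirstMatch(html, "<.+?>"));  // <b>
    }
}
```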

Character Classes
  \w  Matches any word character
      • Same as: [a-zA-Z_0-9]
  \W  Matches any non-word character
      • Same as: [^a-zA-Z_0-9]
  \s  Matches any white-space character
      • Same as: [ \f\n\r\t\v]
  \S  Matches any non-white-space character
      • Same as: [^ \f\n\r\t\v]
  \d  Matches any digit character
      • Same as: [0-9]
  \D  Matches any non-digit character
      • Same as: [^0-9]
Note: these equivalences are exact with RegexOptions.ECMAScript; by default .NET also matches the corresponding Unicode letters, digits, and white space.

Putting It Together
Regular expression for C# identifiers (ASCII only; '$' is not legal in a C# identifier):
  • [a-zA-Z_][a-zA-Z0-9_]*
Floating point numbers:
  • (0|([1-9][0-9]*))?\.[0-9]+
C# hexadecimal numbers:
  • [0][xX][0-9a-fA-F]+
'words' in Project 4:
  • [a-zA-Z\-']+
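
To try the floating point and hexadecimal patterns above, a sketch like the following may help. PatternDemo, IsWhole, IsFloat, and IsHex are illustrative names. Because Regex.IsMatch succeeds on any matching substring, the helper wraps each pattern in ^(?:...)$ so that the whole input has to match.

```csharp
using System;
using System.Text.RegularExpressions;

static class PatternDemo
{
    private const string FloatPattern = @"(0|([1-9][0-9]*))?\.[0-9]+";
    private const string HexPattern = @"[0][xX][0-9a-fA-F]+";

    // Anchors the pattern so it must cover the entire input.
    public static bool IsWhole(string input, string pattern)
    {
        return Regex.IsMatch(input, "^(?:" + pattern + ")$");
    }

    public static bool IsFloat(string s) { return IsWhole(s, FloatPattern); }
    public static bool IsHex(string s) { return IsWhole(s, HexPattern); }

    static void Main()
    {
        Console.WriteLine(IsFloat("0.25"));   // True
        Console.WriteLine(IsFloat(".5"));     // True
        Console.WriteLine(IsFloat("012.5"));  // False (leading zero)
        Console.WriteLine(IsFloat("5"));      // False (no decimal point)
        Console.WriteLine(IsHex("0xFF"));     // True
        Console.WriteLine(IsHex("0XG1"));     // False (G is not a hex digit)
    }
}
```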

C# Regular Expressions
System.Text.RegularExpressions
  • Regex
  • Match
  • MatchCollection
  • Capture
  • CaptureCollection
  • Group

Regex Class
Exposes static methods for doing Regular Expression matching
Or, holds a Regular Expression as an object
  • Compiles the expression to make it faster

Regex Members
The non-static methods echo the static methods
  • Options
  • Escape()
  • GetGroupNames()
  • GetGroupNumbers()
  • GetNameFromNumber()
  • GetNumberFromName()
  • IsMatch()
  • Match()
  • Matches()
  • Replace()
  • Split()
  • Unescape()

Regex.Options
Options
  • RegexOptions Enum
      Compiled: speeds up the searches
      IgnoreCase
      Multiline
      None
      RightToLeft
      Singleline

Regex.Escape()
Not sure what needs to be escaped?
Regex.Escape(string pattern)
  • Returns a new string with the necessary characters escaped
Need to undo it?
Regex.Unescape(string pattern)
  • Returns a new string with all escape characters removed

Matches
    private Regex re1 = new Regex(@"(([2-9]\d{2})-)?(\d{3})-(\d{4})");
    private string input1 = "801-224-6707";
Match members:
  • Value
  • Index
  • Length
  • Success
  • NextMatch()
  • Captures
  • Groups

Linked Matches
Follow links using NextMatch()
Last link Success == false
    Match1 (Success == true)  -> NextMatch()
    Match2 (Success == true)  -> NextMatch()
    Match3 (Success == true)  -> NextMatch()
    Match4 (Success == false)
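
A minimal sketch of walking the chain of matches with NextMatch(), using the phone-number pattern from the Matches slide. MatchChainDemo, FindAll, and the sample input are illustrative.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class MatchChainDemo
{
    private static readonly Regex Phone =
        new Regex(@"(([2-9]\d{2})-)?(\d{3})-(\d{4})");

    // Follows NextMatch() until a match reports Success == false.
    public static List<string> FindAll(string input)
    {
        var found = new List<string>();
        for (Match m = Phone.Match(input); m.Success; m = m.NextMatch())
        {
            found.Add(m.Value);
        }
        return found;
    }

    static void Main()
    {
        // Prints 801-224-6707, 555-1234, 212-555-0000 on separate lines.
        foreach (string number in FindAll("801-224-6707, 555-1234, 212-555-0000"))
        {
            Console.WriteLine(number);
        }
    }
}
```

Regex.Matches() returns the same matches as a MatchCollection; the loop above only makes the linking explicit.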

Groups
    private Regex re1 = new Regex(@"(([2-9]\d{2})-)?(\d{3})-(\d{4})");
Group members:
  • Value
  • Index
  • Length
  • Success
  • Captures
Captures a matching substring for future use
  • Captures in a Capture object
Group 0 represents the entire match
Groups in re1 are numbered by their opening parenthesis, left to right:
    1: (([2-9]\d{2})-)    2: ([2-9]\d{2})    3: (\d{3})    4: (\d{4})

Named Groups
(?<name>expression)
Non-capturing group
  • (?:expression)
    (@"(?:(?<areaCode>[2-9]\d{2})-)?(?<prefix>\d{3})-(?<lastFour>\d{4})")
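
A sketch of reading named groups from the pattern above. NamedGroupDemo and Describe are illustrative names. A group that did not take part in the match (here, a missing area code) reports Success == false and an empty Value.

```csharp
using System;
using System.Text.RegularExpressions;

static class NamedGroupDemo
{
    private static readonly Regex Phone = new Regex(
        @"(?:(?<areaCode>[2-9]\d{2})-)?(?<prefix>\d{3})-(?<lastFour>\d{4})");

    // Formats the three named parts of the first phone number in input.
    public static string Describe(string input)
    {
        Match m = Phone.Match(input);
        if (!m.Success)
        {
            return "no match";
        }

        Group area = m.Groups["areaCode"];
        string areaText = area.Success ? area.Value : "(none)";
        return areaText + " / " + m.Groups["prefix"].Value
                        + " / " + m.Groups["lastFour"].Value;
    }

    static void Main()
    {
        Console.WriteLine(Describe("801-224-6707"));  // 801 / 224 / 6707
        Console.WriteLine(Describe("224-6707"));      // (none) / 224 / 6707
        Console.WriteLine(Describe("hello"));         // no match
    }
}
```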

Captures
Capture members:
  • Value
  • Index
  • Length
    private Regex re1 = new Regex(@"(([2-9]\d{2})-)?(\d{3})-(\d{4})");
    private string input1 = "801-224-6707";

Regex.Match()
Regex.IsMatch(string input, string pattern)
  • returns true if input matches pattern at least once
Regex.Match(string input, string pattern)
  • Returns a Match object
  • Use Match.Value to get the string value of the match
Regex.Matches(string input, string pattern)
  • Returns a MatchCollection of all the occurrences of pattern in input

Regex Groups
GetGroupNames()
  • Returns all the group names in a string[]
GetGroupNumbers()
  • Returns the group numbers in an int[]
GetNameFromNumber()
GetNumberFromName()

Regex.Split()
Splits the string on a Regular Expression Pattern
    string input = "one%%two%%%%three%%%four";
    Console.WriteLine();
    Console.WriteLine("Split...");
    Console.WriteLine(string.Join(",", Regex.Split(input, @"[%]+")));
    Console.ReadLine();

Regex.Replace
Refer to a group capture in the regex using a $
Replace(string input, string replacement, int count)
    string input2 = "aaabbbccc:aaabbbccc:aaabbbccc";
    Regex re2 = new Regex("(aaa)(bbb)(ccc)");
    Console.WriteLine();
    Console.WriteLine("Replace...");
    Console.WriteLine(re2.Replace(input2, "$3$2$1", 1));
    Console.ReadLine();

Constructing Strings
Because strings are immutable, building them is slow
  • Each change creates a new string
Use StringBuilder to speed things up

StringBuilder
In System.Text
Contains a mutable string
Allocates space as needed
Build the string then call:
  • myStringBuilder.ToString()

StringBuilder Members
  • Capacity
  • Indexer
  • Length
  • Append()
  • AppendFormat()
  • Insert()
  • Remove()
  • Replace()
  • ToString()
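
A minimal sketch of building a string with StringBuilder instead of repeated concatenation. BuilderDemo and JoinNumbers are illustrative names. The buffer is modified in place, and a string object is created only by the final ToString() call.

```csharp
using System;
using System.Text;

static class BuilderDemo
{
    // Builds "0,1,2,..." for the first count non-negative integers.
    public static string JoinNumbers(int count)
    {
        var sb = new StringBuilder();
        for (int i = 0; i < count; i++)
        {
            if (i > 0)
            {
                sb.Append(',');  // separator between items, not after the last
            }
            sb.Append(i);        // appends the number's text to the same buffer
        }
        return sb.ToString();
    }

    static void Main()
    {
        Console.WriteLine(JoinNumbers(5));  // 0,1,2,3,4
    }
}
```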