34
http://www.comp.nus.edu.sg/~cs 1010/ UNIT 16 Characters and Strings

Http://cs1010/ UNIT 16 Characters and Strings

Embed Size (px)

Citation preview

Page 1: Http://cs1010/ UNIT 16 Characters and Strings

http://www.comp.nus.edu.sg/~cs1010/

UNIT 16

Characters and Strings

Page 2: Http://cs1010/ UNIT 16 Characters and Strings

Unit 16: Characters and Strings

CS1010 (AY2014/5 Semester 1) Unit16 - 2© NUS

Objectives: Declare and manipulate data of char data type Learn fundamental operations on strings Write string processing programs

References: Lesson 1.4.1 Characters and Symbols Chapter 7: Strings and Pointers

Page 3: Http://cs1010/ UNIT 16 Characters and Strings

Unit 16: Characters and Strings (1/2)

CS1010 (AY2014/5 Semester 1) Unit16 - 3© NUS

1. Motivation

2. Characters2.1 ASCII Table

2.2 Demo #1: Using Characters

2.3 Demo #2: Character I/O

2.4 Demo #3: Character Functions

2.5 Common Error

3. Strings3.1 Basics

3.2 String I/O

3.3 Demo #4: String I/O

3.4 Demo #5: Remove Vowels

3.5 Demo #6: Character Array without terminating ‘\0’

Page 4: Http://cs1010/ UNIT 16 Characters and Strings

Unit 16: Characters and Strings (2/2)

CS1010 (AY2014/5 Semester 1) Unit16 - 4© NUS

4. String Functions

5. Pointer to String

6. Array of Strings

7. Demo #7: Using String Functions

8. Strings and Pointers

Page 5: Http://cs1010/ UNIT 16 Characters and Strings

1. Motivation

CS1010 (AY2014/5 Semester 1) Unit16 - 5© NUS

Why study characters and strings? Hangman game – Player tries to guess a word by filling

in the blanks. Each incorrect guess brings the player closer to being “hanged”

Let’s play! http://www.hangman.no/

Page 6: Http://cs1010/ UNIT 16 Characters and Strings

2. Characters

CS1010 (AY2014/5 Semester 1) Unit16 - 6© NUS

In C, single characters are represented using the data type char

Character constants are written as symbols enclosed in single quotes Examples: 'g', '8', '*', ' ', '\n', '\0' Recall: Week 3 Exercise #7 NRIC Check Code

Characters are stored in one byte, and are encoded as numbers using the ASCII scheme.

ASCII (American Standard Code for Information Interchange), is one of the document coding schemes widely used today.

Unicode is another commonly used standard for multi-language texts.

Page 7: Http://cs1010/ UNIT 16 Characters and Strings

2.1 Characters: ASCII Table

CS1010 (AY2014/5 Semester 1) Unit16 - 7© NUS

©The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

For example, character 'O' is 79 (row value 70 + col value 9 = 79).

For example, character 'O' is 79 (row value 70 + col value 9 = 79).

O

9

70

Page 8: Http://cs1010/ UNIT 16 Characters and Strings

2.2 Demo #1: Using Characters (1/2)

CS1010 (AY2014/5 Semester 1) Unit16 - 8© NUS

// Unit16_CharacterDemo1.c#include <stdio.h>

int main(void) {

char grade = 'A', newgrade, ch;int value;

printf("grade = %c\n", grade);newgrade = grade + 2; printf("newgrade = %c\n", newgrade);printf("newgrade = %d\n", newgrade);

value = 65;printf("value = %d\n", value);printf("value = %c\n", value);

Unit16_CharacterDemo1.c

Declaring and initialising char variables.

Relationship between character and integer.

Using %cgrade = Anewgrade = C newgrade = 67

value = 65value = A

Page 9: Http://cs1010/ UNIT 16 Characters and Strings

2.2 Demo #1: Using Characters (2/2)

CS1010 (AY2014/5 Semester 1) Unit16 - 9© NUS

if ('A' < 'c')printf("'A' is less than 'c'\n");

elseprintf("'A' is not less than 'c'\n");

for (ch = 'p'; ch <= 't'; ch++)printf("ch = %c\n", ch);

return 0;}

Comparing characters.

'A' is less than 'c'

ch = pch = qch = rch = sch = t

Using character variable as a loop variable.

ASCII value of 'A' is 65. ASCII value of 'c' is 99.

Page 10: Http://cs1010/ UNIT 16 Characters and Strings

2.3 Demo #2: Character I/O

CS1010 (AY2014/5 Semester 1) Unit16 - 10© NUS

Besides scanf() and printf(), we can also use getchar() and putchar(). Note how they are used below:

// Unit16_CharacterDemo2.c#include <stdio.h>

int main(void) {char ch;

printf("Enter a character: ");ch = getchar();

printf("The character entered is ");putchar(ch);putchar('\n');

return 0;}

// Unit16_CharacterDemo2.c#include <stdio.h>

int main(void) {char ch;

printf("Enter a character: ");ch = getchar();

printf("The character entered is ");putchar(ch);putchar('\n');

return 0;}

Unit16_CharacterDemo2.c

Read a character from stdin.

Enter a character: WCharacter entered is W

Print a character to stdout.

Page 11: Http://cs1010/ UNIT 16 Characters and Strings

2.4 Demo #3: Character Functions

CS1010 (AY2014/5 Semester 1) Unit16 - 11© NUS

Must include <ctype.h> to use these functions.

// Unit16_CharacterDemo3.c#include <stdio.h>#include <ctype.h>int main(void) {

char ch;

printf("Enter a character: ");ch = getchar();if (isalpha(ch)) {

if (isupper(ch)) {printf("'%c' is a uppercase-letter.\n", ch);printf("Converted to lowercase: %c\n", tolower(ch));

}if (islower(ch)) {

printf("'%c' is a lowercase-letter.\n", ch);printf("Converted to uppercase: %c\n", toupper(ch));

}}if (isdigit(ch)) printf("'%c' is a digit character.\n", ch);if (isalnum(ch)) printf("'%c' is an alphanumeric character.\n", ch);if (isspace(ch)) printf("'%c' is a whitespace character.\n", ch);if (ispunct(ch)) printf("'%c' is a punctuation character.\n", ch);return 0;

}

Unit16_CharacterDemo3.c

Download this program and test it out.For a complete list of character functions, refer to the Internet (eg: http://www.csd.uwo.ca/staff/magi/175/refs/char-funcs.html)

Note that tolower(ch) and toupper(ch) do NOT change ch!

Page 12: Http://cs1010/ UNIT 16 Characters and Strings

2.5 Characters: Common Error

CS1010 (AY2014/5 Semester 1) Unit16 - 12© NUS

A character variable named z does not means it is equivalent to 'z' or it contains 'z'!

if (marks >= 80)return 'A';

else if (marks >= 70)return 'B';

else if (marks >= 60)return 'C';

. . .

if (marks >= 80)return 'A';

else if (marks >= 70)return 'B';

else if (marks >= 60)return 'C';

. . .

char A, B, C, D, F;

if (marks >= 80)return A;

else if (marks >= 70)return B;

else if (marks >= 60)return C;

. . .

char A, B, C, D, F;

if (marks >= 80)return A;

else if (marks >= 70)return B;

else if (marks >= 60)return C;

. . .char grade;if (marks >= 80)

grade = 'A';else if (marks >= 70)

grade = 'B';else if (marks >= 60)

grade = 'C';. . .return grade;

char grade;if (marks >= 80)

grade = 'A';else if (marks >= 70)

grade = 'B';else if (marks >= 60)

grade = 'C';. . .return grade;

Page 13: Http://cs1010/ UNIT 16 Characters and Strings

3. Strings

CS1010 (AY2014/5 Semester 1) Unit16 - 13© NUS

We have seen arrays of numeric values (types int, float, double)

We have seen string constants printf("Average = %.2f", avg); #define ERROR "*****Error –"

A string is an array of characters, terminated by a null character '\0' (which has ASCII value of zero)

c s 1 0 1 0 \0

Page 14: Http://cs1010/ UNIT 16 Characters and Strings

3.1 Strings: Basics

CS1010 (AY2014/5 Semester 1) Unit16 - 14© NUS

Declaration an array of characters• char str[6];

Assigning character to an element of an array of characters

• str[0] = 'e';• str[1] = 'g';• str[2] = 'g';• str[3] = '\0';

Initializer for string Two ways:

• char fruit_name[] = "apple";• char fruit_name[] = {'a','p','p','l','e','\0'};

e g g \0 ? ?

Without ‘\0’, it is just an array of character, not a string.

Do not need ‘\0’ as it is automatically added.

Page 15: Http://cs1010/ UNIT 16 Characters and Strings

3.2 Strings: I/O (1/2)

CS1010 (AY2014/5 Semester 1) Unit16 - 15© NUS

Read string from stdin• fgets(str, size, stdin) // reads size – 1 char, • // or until newline• scanf("%s", str); // reads until white space

Print string to stdoutputs(str); // terminates with newlineprintf("%s\n", str);

Note: There is another function gets(str) to read a string interactively. However, due to security reason, we avoid it and use fgets() function instead.

Page 16: Http://cs1010/ UNIT 16 Characters and Strings

3.2 Strings: I/O (2/2)

CS1010 (AY2014/5 Semester 1) Unit16 - 16© NUS

fgets() On interactive input, fgets() also reads in the newline

character

Hence, we may need to replace it with '\0' if necessary

fgets(str, size, stdin);len = strlen(str);if (str[len – 1] == '\n')

str[len – 1] = '\0';

e a t \n \0 ? ?User input: eat

Page 17: Http://cs1010/ UNIT 16 Characters and Strings

3.3 Demo #4: String I/O

CS1010 (AY2014/5 Semester 1) Unit16 - 17© NUS

#include <stdio.h>#define LENGTH 10

int main(void) {char str[LENGTH];

printf("Enter string (at most %d characters): ", LENGTH-1);scanf("%s", str); printf("str = %s\n", str); return 0;

}

#include <stdio.h>#define LENGTH 10

int main(void) {char str[LENGTH];

printf("Enter string (at most %d characters): ", LENGTH-1);scanf("%s", str); printf("str = %s\n", str); return 0;

}

Unit16_StringIO1.c

#include <stdio.h>#define LENGTH 10

int main(void) {char str[LENGTH];

printf("Enter string (at most %d characters): ", LENGTH-1);fgets(str, LENGTH, stdin);printf("str = ");puts(str);return 0;

}

#include <stdio.h>#define LENGTH 10

int main(void) {char str[LENGTH];

printf("Enter string (at most %d characters): ", LENGTH-1);fgets(str, LENGTH, stdin);printf("str = ");puts(str);return 0;

}

Unit16_StringIO2.c

Test out the programs with this input: My book

Output:str = My

Output:str = My book

Note that puts(str) adds a newline automatically.

Page 18: Http://cs1010/ UNIT 16 Characters and Strings

3.4 Demo #5: Remove Vowels (1/2)

CS1010 (AY2014/5 Semester 1) Unit16 - 18© NUS

Write a program Unit16_RemoveVowels.c to remove all vowels in a given input string.

Assume the input string has at most 100 characters.

Sample run:

Enter a string: How HAVE you been, James?Changed string: Hw HV y bn, Jms?

Page 19: Http://cs1010/ UNIT 16 Characters and Strings

3.4 Demo #5: Remove Vowels (2/2)

CS1010 (AY2014/5 Semester 1) Unit16 - 19© NUS

#include <stdio.h>#include <string.h>#include <ctype.h>int main(void) {

int i, len, count = 0;char str[101], newstr[101];

printf("Enter a string (at most 100 characters): ");fgets(str, 101, stdin); //what happens if you use scanf() here?len = strlen(str); // strlen() returns number of char in stringif (str[len – 1] == '\n')

str[len – 1] = '\0';len = strlen(str); // check length again

for (i=0; i<len; i++) {switch (toupper(str[i])) {

case 'A': case 'E':case 'I': case 'O': case 'U': break; default: newstr[count++] = str[i];

}}

newstr[count] = '\0';printf("New string: %s\n", newstr);return 0;

}

#include <stdio.h>#include <string.h>#include <ctype.h>int main(void) {

int i, len, count = 0;char str[101], newstr[101];

printf("Enter a string (at most 100 characters): ");fgets(str, 101, stdin); //what happens if you use scanf() here?len = strlen(str); // strlen() returns number of char in stringif (str[len – 1] == '\n')

str[len – 1] = '\0';len = strlen(str); // check length again

for (i=0; i<len; i++) {switch (toupper(str[i])) {

case 'A': case 'E':case 'I': case 'O': case 'U': break; default: newstr[count++] = str[i];

}}

newstr[count] = '\0';printf("New string: %s\n", newstr);return 0;

}

Unit16_RemoveVowels.c

Need to include <string.h> to use string functions such as strlen().

Page 20: Http://cs1010/ UNIT 16 Characters and Strings

3.5 Demo #6: Character Array without terminating ‘\0’

CS1010 (AY2014/5 Semester 1) Unit16 - 20© NUS

What is the output of this code?

#include <stdio.h>#include <string.h>

int main(void) {char str[10];

str[0] = 'a';str[1] = 'p';str[2] = 'p';str[3] = 'l';str[4] = 'e';

printf("Length = %d\n", strlen(str));printf("str = %s\n", str);

return 0;}

Unit16_without_null_char.c

One possible output:Length = 8str = apple¿ø<

Compare the output if you add:str[5] = '\0';

or, you have:char str[10] = "apple";

%s and string functions work only on “true” strings. Without the terminating null character ‘\0’, string functions will not work properly.

printf() will print %s from the starting address of str until it encounters the ‘\0’ character.

Page 21: Http://cs1010/ UNIT 16 Characters and Strings

4. String Functions (1/3)

CS1010 (AY2014/5 Semester 1) Unit16 - 21© NUS

C provides a library of string functions Must include <string.h> Table 7.3 (pg 509 – 514) http://www.edcc.edu/faculty/paul.bladek/c_string_functions.htm http://www.cs.cf.ac.uk/Dave/C/node19.html and other links you can find on the Internet

strcmp(s1, s2) Compare the ASCII values of the corresponding characters in

strings s1 and s2. Return

a negative integer if s1 is lexicographically less than s2, or a positive integer if s1 is lexicographically greater than s2, or 0 if s1 and s2 are equal.

strncmp(s1, s2, n) Compare first n characters of s1 and s2.

Page 22: Http://cs1010/ UNIT 16 Characters and Strings

4. String Functions (2/3)

CS1010 (AY2014/5 Semester 1) Unit16 - 22© NUS

strcpy(s1, s2) Copy the string pointed to by s2 into array pointed to by s1. Function returns s1. Example:

char name[10];strcpy(name, "Matthew");

The following assignment statement does not work:name = "Matthew";

What happens when string to be copied is too long? strcpy(name, "A very long name");

strncpy(s1, s2, n) Copy first n characters of string pointed to by s2 to s1.

M a t t h e w \0 ? ?

A v e r y l o n g n a m e \0

Page 23: Http://cs1010/ UNIT 16 Characters and Strings

4. String Functions (3/3)

CS1010 (AY2014/5 Semester 1) Unit16 - 23© NUS

strstr(s1, s2) Returns a pointer to the first instance of string s2 in s1. Returns a NULL pointer if s2 is not found in s1,

We will use the functions above in Demo #7.

Read up on the above functions (Table 7.3 [pg 405 – 411] and Table 7.4 [pg 412 – 413]) We will do some more exercises on them next week

Other functions (atoi, strcat, strchr, strtok, etc.) We will explore these in your discussion session

Page 24: Http://cs1010/ UNIT 16 Characters and Strings

5. Pointer to String (1/2)

CS1010 (AY2014/5 Semester 1) Unit16 - 24© NUS

#include <stdio.h>#include <string.h>int main(void) {

char name[12] = "Chan Tan"; char *namePtr = "Chan Tan";

printf("name = %s\n", name);printf("namePtr = %s\n", namePtr);printf("Address of 1st array element for name = %p\n", name);printf("Address of 1st array element for namePtr = %p\n",namePtr);

strcpy(name, "Lee Hsu");namePtr = "Lee Hsu";

printf("name = %s\n", name);printf("namePtr = %s\n", namePtr);printf("Address of 1st array element for name = %p\n", name);printf("Address of 1st array element for namePtr = %p\n",namePtr);

}

Unit16_StringPointer.c

name is a character array of 12 elements. namePtr is a pointer to a character. Both have strings assigned. Difference is name sets aside space for 12 characters, but namePtr is a char pointer variable that is initialized to point to a string constant of 9 characters.

name updated using strcpy(). namePtr assigned to another string using =.

Address of first array element for name remains constant, string assigned to namePtr changes on new assignment.

Page 25: Http://cs1010/ UNIT 16 Characters and Strings

5. Pointer to String (2/2)

CS1010 (AY2014/5 Semester 1) Unit16 - 25© NUS

Comparison

char name[12] = "Chan Tan";

char *namePtr = "Chan Tan";namePtr

C h a n T a n \0

C h a n T a n \0 \0 \0\0name[0] [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11]

Page 26: Http://cs1010/ UNIT 16 Characters and Strings

6. Array of Strings

CS1010 (AY2014/5 Semester 1) Unit16 - 26© NUS

Declarationchar fruits[MAXNUM][STRSIZE]; // where MAXNUM is the maximum number of

names// and STRSIZE is the size of each name

Initializationchar fruits[][6] = {"apple", "mango",

"pear"};or

char fruits[3][6] = {"apple", "mango", "pear"};

Outputprintf("fruits: %s %s\n", fruits[0], fruits[1]);printf("character: %c\n", fruits[2][1]);

fruits: apple mangocharacter: e

Page 27: Http://cs1010/ UNIT 16 Characters and Strings

7. Demo #7: Using String Functions

CS1010 (AY2014/5 Semester 1) Unit16 - 27© NUS

#include <stdio.h>#include <string.h>#define MAX_LEN 10

int main(void) {char s1[MAX_LEN + 1], s2[MAX_LEN + 1], *p;int len;

printf("Enter string (at most %d characters) for s1: ", MAX_LEN);fgets(s1, MAX_LEN+1, stdin);len = strlen(s1);if (s1[len – 1] == '\n') s1[len – 1] = '\0';

printf("Enter string (at most %d characters) for s2: ", MAX_LEN);fgets(s2, MAX_LEN+1, stdin);len = strlen(s2);if (s2[len – 1] == '\n') s2[len – 1] = '\0';

printf("strcmp(s1,s2) = %d\n", strcmp(s1,s2));

p = strstr(s1,s2);if (p != NULL) printf("strstr(s1,s2) returns %s\n", p);else printf("strstr(s1,s2) returns NULL\n");

strcpy(s1,s2);printf("After strcpy(s1,s2), s1 = %s\n", s1);return 0;

}

#include <stdio.h>#include <string.h>#define MAX_LEN 10

int main(void) {char s1[MAX_LEN + 1], s2[MAX_LEN + 1], *p;int len;

printf("Enter string (at most %d characters) for s1: ", MAX_LEN);fgets(s1, MAX_LEN+1, stdin);len = strlen(s1);if (s1[len – 1] == '\n') s1[len – 1] = '\0';

printf("Enter string (at most %d characters) for s2: ", MAX_LEN);fgets(s2, MAX_LEN+1, stdin);len = strlen(s2);if (s2[len – 1] == '\n') s2[len – 1] = '\0';

printf("strcmp(s1,s2) = %d\n", strcmp(s1,s2));

p = strstr(s1,s2);if (p != NULL) printf("strstr(s1,s2) returns %s\n", p);else printf("strstr(s1,s2) returns NULL\n");

strcpy(s1,s2);printf("After strcpy(s1,s2), s1 = %s\n", s1);return 0;

}

Unit16_StringFunctions.c

Page 28: Http://cs1010/ UNIT 16 Characters and Strings

8. Strings and Pointers (1/4)

CS1010 (AY2014/5 Semester 1) Unit16 - 28© NUS

We discussed in Unit #9 Section 4 that an array name is a pointer (that points to the first array element)

Likewise, since a string is physically an array of characters, the name of a string is also a pointer (that points to the first character of the string)

char str[] = "apple";

printf("1st character: %c\n", str[0]);printf("1st character: %c\n", *str);

printf("5th character: %c\n", str[4]);printf("5th character: %c\n", *(str+4));

Unit16_String_vs_Pointer.c

1st character: a1st character: a5th character: e5th character: e

Page 29: Http://cs1010/ UNIT 16 Characters and Strings

8. Strings and Pointers (2/4)

CS1010 (AY2014/5 Semester 1) Unit16 - 29© NUS

Unit16_strlen.c shows how we could compute the length of a string if we are not using strlen()

See full program on CS1010 website

int mystrlen(char *p) {int count = 0;

while (*p != '\0') {count++;p++;

}

return count;}

Unit16_strlen.c

Page 30: Http://cs1010/ UNIT 16 Characters and Strings

8. Strings and Pointers (3/4)

CS1010 (AY2014/5 Semester 1) Unit16 - 30© NUS

Since ASCII value of null character '\0' is zero, the condition in the while loop is equivalent to (*p != 0) and that can be further simplified to just (*p) (see left box)

We can combine *p with p++ (see right box) (why?)

int mystrlen(char *p) {int count = 0;

while (*p) {count++;p++;

}

return count;}

int mystrlen(char *p) {int count = 0;

while (*p++) {count++;

}

return count;}

Unit16_strlen_v2.c

Page 31: Http://cs1010/ UNIT 16 Characters and Strings

8. Strings and Pointers (4/4)

CS1010 (AY2014/5 Semester 1) Unit16 - 31© NUS

How to interpret the following?

while (*p++)

Check whether *p is 0 (that is, whether *p is the null character ‘\0’)…

Then, increment p by 1 (so that p points to the next character).Not increment *p by 1!

(*p++) is not the same as (*p)++(*p)++ is to increment *p (the character that p points to) by 1. (Hence, if p is pointing to character ‘a’, that character becomes ‘b’.)

Page 32: Http://cs1010/ UNIT 16 Characters and Strings

Extra topics

CS1010 (AY2014/5 Semester 1) Unit16 - 32© NUS

2 additional topics that are not in the syllabus are in the file Unit16_Extra.pptx Array of Pointers to Strings

Command-line arguments

Page 33: Http://cs1010/ UNIT 16 Characters and Strings

Summary

CS1010 (AY2014/5 Semester 1) Unit16 - 33© NUS

In this unit, you have learned about Characters

Declaring and using characters Characters I/O Character functions

Strings Declaring and initialising strings String I/O String functions Array of strings

Page 34: Http://cs1010/ UNIT 16 Characters and Strings

End of File

CS1010 (AY2014/5 Semester 1) Unit16 - 34© NUS