This is part of the book
COMPUTER PROGRAMMING
THE C LANGUAGE
by Adriana ALBU
Conspress Publisher, Bucureşti, România, 2013 ISBN: 978-973-100-270-5
Chapter 8
Strings
A string variable stores a sequence of characters. The C programming language
does not have a data type dedicated to strings (unlike Java and Pascal, which
have a string type). In C, strings are represented as one-dimensional arrays
that hold one character in each location and whose last used location holds the
character '\0', which marks the end of the string. Because the basic element
of a string is the character, this chapter starts by presenting several aspects
connected to characters; then the features of strings are described in detail.
8.1 Characters
The data type char is used to declare a character. This type usually has 8 bits;
therefore 2^8, meaning 256, characters can be represented. Each character is
stored as an integer number (in the range [0, 255] when the type is treated as
unsigned); that number is the ASCII code of the character [CH96]. Appendix 1
contains the most used characters together with their ASCII codes.
When a character variable must be declared, the keyword char is used, followed
by the name of the new variable. If the variable requires an initial value,
that value must be enclosed in apostrophes (not quotation marks). The
following instruction declares a variable named ch and gives it the
ASCII value of the letter A.
char ch='A';
For input/output actions on character variables, the functions getchar() and
putchar() can be used. If formatted input/output is required, it can be
performed through the functions scanf() and printf(), using the format
descriptor %c. The following instruction displays the variable ch, previously
declared; therefore the letter A will be printed.
printf("%c",ch);
The characters can be analyzed and processed through several functions that are
declared in the header file <ctype.h>. Some of these functions are presented
here (each function receives a character as argument).
isalpha(c) – returns a non-zero value if the argument is a letter (A to Z or a to z);
isdigit(c) – returns a non-zero value if the argument is a digit (0 to 9);
isalnum(c) – returns a non-zero value if the argument is a letter or a digit;
isupper(c) – returns a non-zero value if the argument is an uppercase letter (A to Z);
islower(c) – returns a non-zero value if the argument is a lowercase letter (a to z);
isspace(c) – returns a non-zero value if the argument is space, tab, new line;
toupper(c) – returns the argument converted to its uppercase value;
tolower(c) – returns the argument converted to its lowercase value.
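As an illustration of how these classification functions combine, the sketch below checks whether a text could serve as a C variable name (a letter or underscore first, then letters, digits or underscores). The function name is_valid_identifier is hypothetical and is not part of <ctype.h>; the sketch also does not reject reserved keywords.

```c
#include <ctype.h>

/* Hypothetical helper: returns 1 if s looks like a C identifier, 0 otherwise. */
int is_valid_identifier(const char *s)
{
    /* The <ctype.h> functions expect a value representable as
       unsigned char, so each character is converted first. */
    unsigned char first = (unsigned char)s[0];

    if (!(isalpha(first) || first == '_'))
        return 0;                       /* also rejects the empty string */

    for (s++; *s != '\0'; s++) {
        unsigned char ch = (unsigned char)*s;
        if (!(isalnum(ch) || ch == '_'))
            return 0;                   /* space, punctuation, etc. */
    }
    return 1;
}
```

For example, is_valid_identifier("str_name") yields 1, while is_valid_identifier("2fast") yields 0 because the first character is a digit.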
Program 8.1.1 is an example of using a character variable. The user decides
whether to go on displaying and incrementing a counter whose initial value is
zero. The counter is displayed, then it is incremented, and the user is asked
whether to continue. The answer is read through the getch() function (declared
in the non-standard header <conio.h>) and stored in a char variable. This
sequence of code is repeated within a do while instruction until the user
enters the letter 'N' or 'n'. To accept both the uppercase and the lowercase
letter without writing a compound condition (comparing the answer to both 'N'
and 'n'), the letter read from the keyboard is converted to lowercase using
the function tolower() and is compared only to 'n'.
Program 8.1.1
#include <stdio.h>
#include <conio.h>
#include <ctype.h>
void main(void){
    int counter=0;
    char option;
    do{
        printf("Counter: %d", counter++);
        printf("\nDo you want to continue? (Y/N)");
        option=getch();
    }while(tolower(option)!='n');
}
8.2 Strings
A string is a sequence of characters that ends with '\0' (the character whose
ASCII code is 0). A string constant is written between double quotes
("This is a string"). The double quotes specify that it is a string, and the
compiler adds the string terminator [CH96]. For a string constant, the
programmer therefore does not have to write the null character explicitly, but
it does occupy one location in memory. The string from above can have the
following memory representation:
| T | h | i | s |   | i | s |   | a |   | s | t | r | i | n | g | \0 |
In order to create a string element, a single-dimensional array of character
variables will be used. The declaration is made as in the case of other variables,
specifying the type (char) and the number of elements and giving a valid
name for the new variable:
char str_name[str_length];
The number of elements specified when the variable is declared must take into
account that the string ends with the null character ('\0'); the declaration
must therefore allocate memory space for this character too. For instance, if
the string "Programming", which has 11 characters, must be stored, then the
variable must be declared with a length of at least 12 characters.
A string can also be declared as a pointer (more about pointers in Chapter 9).
The variable stores the address of the first character of the string; the
declaration itself does not reserve memory for the characters:
char *name;
A string can be initialized later in the program or when it is declared. In
the latter case, it is not mandatory to specify the length of the string (it
will have one character more than the number of characters used as the initial
value). If the string receives its value character by character, then the
programmer must "manually" add the string terminator '\0'. The following
instructions show three alternative forms of initialization:
char my_name[8]="Adriana";
char my_name[]="Adriana";
char my_name[8]={'A','d','r','i','a','n','a','\0'};
Though a string is an array, it is not necessary to access it character by
character. It can be handled as a whole, as can be observed in the first two
initializations above. The functions that process a string start from its first
character (whose address is given by the name of the string) and traverse it
until the null character (which marks the end of the string) is encountered. It
is therefore very important not to omit this final character when the string is
built from individual characters, as in the third initialization above.
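The traversal described above can be sketched with a hand-written length function. The name my_strlen is hypothetical; it only illustrates the loop that a library function such as strlen() performs, under the assumption that the array really contains a terminator.

```c
#include <stddef.h>

/* Hypothetical re-implementation of the idea behind strlen():
   walk from the first character until '\0' is met. */
size_t my_strlen(const char *s)
{
    size_t n = 0;

    while (s[n] != '\0')    /* stops at the terminator, which is not counted */
        n++;
    return n;
}
```

For an array declared as char my_name[]="Adriana"; the call my_strlen(my_name) gives 7, while sizeof my_name gives 8, because the array also stores the terminator. If the terminator were missing, the loop would run past the end of the array.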
The input/output actions on strings can be performed through the functions
gets() and scanf() for reading and puts() and printf() for writing. The format
descriptor for scanf() and printf() is "%s". Because the name of an array is
converted to the address of its first element, string variables do not need the
address operator (&) when they are read using scanf(). Note that gets() cannot
limit the number of characters it stores and was removed from the C11 standard;
fgets() is the standard replacement.
In Program 8.2.1 a string is declared, its value is read from the keyboard and
then it is displayed on the screen. Note that scanf() with "%s" reads until a
whitespace character (space, tab or new line) is encountered, while gets()
reads until a new line (Enter) is found.
Program 8.2.1
#include <stdio.h>
#include <conio.h>
#include <ctype.h>
void main(void){
    char name[30];
    printf("Enter the name ");
    scanf("%s", name);
    //or:
    //gets(name);
    //if the name contains spaces
    printf("The name is: %s", name);
    getch();
}
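Program 8.2.1 reads into a 30-character array, but neither scanf("%s") nor gets() knows that limit. A bounded alternative based on the standard fgets() function can be sketched as follows; the helper name read_line and its return convention are assumptions made for this illustration.

```c
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line from in into buf (capacity size)
   and removes the trailing new line. Returns 1 on success and 0 on
   end of input or error. Characters that do not fit stay unread. */
int read_line(char *buf, size_t size, FILE *in)
{
    if (size == 0)
        return 0;
    if (size > INT_MAX)                 /* fgets() takes an int count */
        size = INT_MAX;
    if (fgets(buf, (int)size, in) == NULL)
        return 0;
    buf[strcspn(buf, "\n")] = '\0';     /* cut at the new line, if present */
    return 1;
}
```

With char name[30]; the call read_line(name, sizeof name, stdin) stores at most 29 characters plus the terminator, and it accepts names that contain spaces.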
Because the C programming language does not have a special type dedicated to
strings, it does not have operators for strings either. Therefore, instructions
such as if(str1==str2) or str="Abc"+"def"; do not work as they would in other
languages: the first compares the addresses of the two arrays, not their
contents, and the second does not compile. String processing (copying,
concatenating, comparing) is done through several functions declared in the
header file <string.h>. Some of these functions are presented here.
size_t strlen(const char *s);
It returns the length of the string s (the number of characters), not counting
the terminating null character. The result has the unsigned integer type
size_t. Program 8.2.2 shows how this function can be used. When the program is
run, the values 14 and 17 (the lengths of the two strings) are displayed on
the screen.
Program 8.2.2
#include <stdio.h>
#include <conio.h>
#include <string.h>
void main(void){
    char name[]="Dennis Ritchie";
    int l;
    l=strlen("Bell Laboratories");
    //strlen() returns size_t; the cast matches the %d descriptor
    printf("%d, %d", (int)strlen(name), l);
    getch();
}
char *strcpy(char *dest, const char *src);
The function is used to make assignments between two strings, copying the
source string src into the destination string dest. It returns the destination
string and assumes that it is long enough to store the source string (if it is
not, the program can behave improperly, even though it compiles without
errors). This function is necessary because in the C programming language
strings cannot be assigned through simple instructions such as str1=str2.
Program 8.2.3 is a short example that uses this function. The text abcdefg
will be displayed on the screen. Note that strcpy() modifies the content of the
destination string (the initial content of this string is lost), but leaves the
source string unchanged (in fact, the source string is declared as a constant
in the function's prototype).
Program 8.2.3
#include <stdio.h>
#include <conio.h>
#include <string.h>
void main(void){
    char dest[20];
    char src[] = "abcdefg";
    strcpy(dest, src);
    printf("%s", dest);
    getch();
}
char *strcat(char *dest, const char *src);
Through this function the string src is appended to the end of the string dest.
The address of the destination string is returned; it contains the result of
this action and must be long enough to store the new string. Program 8.2.4
appends a space to the end of the variable str1. Then the variable str2 is
appended to the result, obtaining the text C Programming.
Program 8.2.4
#include <stdio.h>
#include <conio.h>
#include <string.h>
void main(void){
    char str1[20]="C";
    char str2[]="Programming";
    strcat(str1, " ");
    strcat(str1, str2);
    printf("%s", str1);
    getch();
}
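Both strcpy() and strcat() rely on the destination being long enough. One possible way to make that requirement explicit is to test the available space before appending. The function below is only a sketch; the name append_checked and its return convention are hypothetical, and it assumes that dest already holds a properly terminated string.

```c
#include <string.h>

/* Hypothetical helper: appends src to dest only if the result, including
   the terminator, fits in dest_size characters. Returns 1 if the text was
   appended and 0 if dest was left unchanged. */
int append_checked(char *dest, size_t dest_size, const char *src)
{
    size_t used = strlen(dest);
    size_t extra = strlen(src);

    if (used + extra + 1 > dest_size)   /* +1 for the '\0' terminator */
        return 0;
    strcat(dest, src);
    return 1;
}
```

With char str1[20]="C"; appending " " and then "Programming" succeeds (13 characters plus the terminator), whereas a further attempt to append " Language" is refused because 23 locations would be needed.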
int strcmp(const char *s1, const char *s2);
This function compares two strings in lexicographical order. The function is
case sensitive, so it distinguishes between uppercase and lowercase letters
(lowercase letters have greater ASCII codes than uppercase letters, and the
function acts accordingly). The returned value is an integer with the
following meanings:
o a negative value if string s1 is less than string s2 (e.g. "calculator" is less than "key");
o 0 if strings s1 and s2 are identical;
o a positive value if string s1 is greater than string s2.
Program 8.2.5 compares the password entered by the user to the password that
was set in the program; a message is displayed accordingly. The passwords are
stored in strings.
Program 8.2.5
#include <stdio.h>
#include <conio.h>
#include <string.h>
void main(void){
    char pass[20]="Secret!";
    char user_pass[20];
    gets(user_pass);
    if(strcmp(pass, user_pass)==0)
        printf("Welcome!");
    else
        printf("Incorrect password");
    getch();
}
char *strchr(const char *s, int c);
This function searches for the character c in the string s. If it finds the
character, it returns a pointer to the character's first occurrence in the
string; otherwise, it returns NULL.
char *strrchr(const char *s, int c);
The character c is also searched for in the string s, but the function looks
for the rightmost match; therefore it returns the address of the character's
last occurrence (or NULL if string s does not contain the character c).
char *strstr(const char *s1, const char *s2);
The function looks for string s2 in string s1. It returns a pointer to the
first occurrence of s2 in s1, or NULL if s2 is not part of s1.
Program 8.2.6 is an example of using the three functions described above. The
string "C Programming" is considered and three substrings are extracted from
it: the first one begins where the character 'r' occurs for the first time, the
second one begins where the character 'r' occurs for the last time and the
third one begins where the string "gram" occurs. Therefore, the following will
be displayed on the screen:
Substring 1: "rogramming"
Substring 2: "ramming"
Substring 3: "gramming"
Program 8.2.6
#include <stdio.h>
#include <conio.h>
#include <string.h>
void main(void){
    char str1[] = "C Programming", str2[] = "gram";
    char substr1[20], substr2[20], substr3[20];
    char ch='r';
    strcpy(substr1,strchr(str1,ch));
    strcpy(substr2,strrchr(str1,ch));
    strcpy(substr3,strstr(str1,str2));
    printf("\nSubstring 1: \"%s\"", substr1);
    printf("\nSubstring 2: \"%s\"", substr2);
    printf("\nSubstring 3: \"%s\"", substr3);
    getch();
}
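Program 8.2.6 passes the results of the search functions directly to strcpy(), which works only because every search succeeds. When the character or substring might be absent, the returned pointer should be compared with NULL before it is used. A minimal sketch, with the hypothetical helper offset_or_minus1, turns a search result into a position inside the string:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: converts the pointer returned by strchr(),
   strrchr() or strstr() into an index relative to base, or -1 when
   the search returned NULL (nothing found). */
ptrdiff_t offset_or_minus1(const char *base, const char *hit)
{
    return hit != NULL ? hit - base : -1;
}
```

For s pointing to "C Programming", offset_or_minus1(s, strchr(s,'r')) is 3, offset_or_minus1(s, strrchr(s,'r')) is 6, offset_or_minus1(s, strstr(s,"gram")) is 5, and a search for 'z' gives -1 instead of a crash.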
char *strncpy(char *dest, const char *src, size_t n);
It is similar to the function strcpy(), but it copies at most n characters
from the source string to the destination string. If the source string has n
or more characters, no terminating null character is written to the
destination.
char *strncat(char *dest, const char *src, size_t n);
It is similar to the function strcat(), but it appends at most n characters
from the source string to the destination string, followed by a terminating
null character.
int strncmp(const char *s1, const char *s2, size_t n);
It is similar to the function strcmp(), but it compares at most the first n
characters of the two strings. The function is case sensitive.
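Because strncpy() does not write a terminator when the source has n or more characters, a common pattern is to copy one character less than the capacity and to add '\0' explicitly. The sketch below assumes this pattern; the name copy_truncated is hypothetical.

```c
#include <string.h>

/* Hypothetical helper: copies as much of src as fits into dest
   (capacity dest_size) and always leaves dest terminated. */
void copy_truncated(char *dest, size_t dest_size, const char *src)
{
    if (dest_size == 0)
        return;                         /* no room even for '\0' */
    strncpy(dest, src, dest_size - 1);  /* may leave dest unterminated */
    dest[dest_size - 1] = '\0';         /* so the terminator is forced */
}
```

With char d[5]; the call copy_truncated(d, sizeof d, "Programming") leaves "Prog" in d. In the same spirit, strncmp("Programming", "Program", 7) returns 0 because only the first seven characters are compared.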
int stricmp(const char *s1, const char *s2);
It is similar to the function strcmp(), but it does not distinguish between
lowercase and uppercase letters; therefore it is not case sensitive (the
strings "ABC" and "abc" are considered identical by this function).
int strnicmp(const char *s1, const char *s2, size_t n);
It is similar to the function strncmp(), but it does not distinguish between
lowercase and uppercase letters; therefore it is not case sensitive.
Neither function is part of the C standard; they are provided by some
compilers, and POSIX systems offer strcasecmp() and strncasecmp() in
<strings.h> for the same purpose.
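Where a case-insensitive comparison function is not available, the same behaviour can be approximated with tolower() from <ctype.h>. The function below is a sketch, not the implementation used by any particular compiler; the name compare_ignore_case is hypothetical.

```c
#include <ctype.h>

/* Hypothetical portable stand-in for a case-insensitive strcmp():
   compares the lowercase forms of the characters, pair by pair. */
int compare_ignore_case(const char *s1, const char *s2)
{
    for (;; s1++, s2++) {
        int c1 = tolower((unsigned char)*s1);
        int c2 = tolower((unsigned char)*s2);

        if (c1 != c2)
            return c1 - c2;     /* negative, or positive, as for strcmp() */
        if (c1 == '\0')
            return 0;           /* both strings ended together */
    }
}
```

compare_ignore_case("ABC", "abc") returns 0, while compare_ignore_case("key", "Calculator") returns a positive value, because 'k' follows 'c' once case is ignored.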
Conversion functions – these are used to transform a string into another data
type (usually a number) and vice versa. These functions are part of the
header file <stdlib.h>.
o int atoi(const char * str );
This function transforms the string received as argument into an integer
number. First, the function ignores the spaces at the beginning of the string
(if there are any); then it interprets an optional plus or minus sign; finally
it transforms the next characters into an integer number (if these characters
have such a meaning). When a character that cannot be interpreted as part of a
number is encountered, the function stops, returning the number built so far
and ignoring the rest of the string. If the string does not begin with a
sequence of characters that can be transformed into an integer, the function
returns zero.
o long int atol(const char * str );
It is similar to the function atoi(), but it returns a value of type long int.
o double atof(const char * str );
It is similar to the function atoi(), but transforms the string (or a part of it)
into a real number (of type double).
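Since atoi() returns zero both for the string "0" and for a string that contains no number at all, the two cases cannot be told apart. The standard function strtol() from <stdlib.h> reports where the conversion stopped, which allows a check. The wrapper below is a sketch; the name parse_int and its return convention are assumptions made for this illustration.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: converts the beginning of str to an int.
   Returns 1 and stores the value in *out on success; returns 0 if
   no digits were found or the value does not fit in an int. */
int parse_int(const char *str, int *out)
{
    char *end;
    long value;

    errno = 0;
    value = strtol(str, &end, 10);
    if (end == str)                     /* nothing could be converted */
        return 0;
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX)
        return 0;                       /* out of range */
    *out = (int)value;
    return 1;
}
```

parse_int("  -987abc", &n) succeeds and stores -987 in n, as atoi() would, whereas parse_int("abc", &n) returns 0 instead of silently producing zero.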
o char *itoa(int val, char * str, int base);
This is not a standard function, but it is accepted by some compilers. Using
this function, an integer number is transformed into a string. The number is
the first argument of the function. The string obtained through the conversion
is returned by the function and is also stored in the second argument. The
third argument specifies the base to be used in the conversion (for
instance, 10).
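Where itoa() is not available, a base-10 conversion can be obtained with the standard call sprintf(str, "%d", val), while other bases require the digits to be produced by repeated division. The function below is a sketch of that idea; the name int_to_string, the extra size parameter and the treatment of negative numbers (a minus sign in every base) are assumptions and may differ from a compiler's own itoa().

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for itoa(): writes val in the given base (2..36)
   into str, which has room for size characters. Returns str, or NULL if
   the base is invalid or the result does not fit. */
char *int_to_string(int val, char *str, size_t size, int base)
{
    static const char digits[] = "0123456789abcdefghijklmnopqrstuvwxyz";
    char tmp[sizeof(int) * CHAR_BIT + 2];   /* base-2 digits plus sign */
    size_t n = 0, i;
    unsigned int mag;

    if (base < 2 || base > 36)
        return NULL;
    /* work on the magnitude so that INT_MIN is handled too */
    mag = val < 0 ? 0u - (unsigned int)val : (unsigned int)val;
    do {                                    /* digits come out reversed */
        tmp[n++] = digits[mag % (unsigned int)base];
        mag /= (unsigned int)base;
    } while (mag != 0);
    if (val < 0)
        tmp[n++] = '-';
    if (n + 1 > size)                       /* +1 for the terminator */
        return NULL;
    for (i = 0; i < n; i++)                 /* reverse into str */
        str[i] = tmp[n - 1 - i];
    str[n] = '\0';
    return str;
}
```

With char buf[40]; the call int_to_string(7, buf, sizeof buf, 2) stores "111" in buf, and int_to_string(12345, buf, sizeof buf, 10) stores "12345".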
Program 8.2.7 uses the functions atoi() and itoa() to convert a string into
an integer and two integers into strings (the third string is the base 2
representation of the third number). When the program is run, the following
will be displayed on the screen:
number1=-987, str1=-987
number2=12345, str2=12345
number3=7, str3=111
Program 8.2.7
#include <stdlib.h>
#include <stdio.h>
#include <conio.h>
void main(void){
    int number1, number2=12345, number3=7;
    char str1[25]="-987", str2[25], str3[25];
    number1=atoi(str1);
    itoa(number2, str2, 10);
    itoa(number3, str3, 2);
    printf("number1=%d, str1=%s\n",number1,str1);
    printf("number2=%d, str2=%s\n",number2,str2);
    printf("number3=%d, str3=%s",number3,str3);
    getch();
}
Though a string is usually handled as a whole, it can also be analyzed or
processed character by character because it is in fact an array. Program 8.2.8
verifies whether a string read from the keyboard is a palindrome (a word or a
phrase that reads the same in either direction: "eye", "madam", "level",
"1234321" or, if letter case, spaces and punctuation are ignored,
"A car, a man, a maraca"). The program compares the characters exactly as they
were entered, so it checks the symmetry of the characters with respect to the
middle of the string. An integer variable called flag is declared; it receives
the initial value 1, on the assumption that the string is a palindrome. When
the string is found not to be a palindrome, the variable is changed to 0.
First, the algorithm determines the length of the string (using the function
strlen()). Then the first half of the string is traversed character by
character, and the characters that are symmetrical with respect to the middle
of the string are compared (the first with the last, the second with the last
but one and so on). When a pair of characters that are not identical is found,
the variable flag changes its value to 0 and the checking ends; the for loop is
left through the break instruction. If the variable flag is unchanged after the
for loop (its value is still 1), then all the characters of the string are
symmetrical with respect to the middle of the string, and therefore the string
is a palindrome; otherwise, it is not.
Program 8.2.8
#include <stdio.h>
#include <conio.h>
#include <string.h>
void main(void){
    char str[25];
    int flag=1;
    int i, n;
    clrscr();
    printf("Enter the string: ");
    gets(str);
    n=strlen(str);
    for(i=0;i<n/2;i++)
        if(str[i]!=str[n-i-1]){
            flag=0;
            break;
        }
    if(flag)
        printf("The string is palindrome.");
    else
        printf("The string is not palindrome.");
    getch();
}
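Program 8.2.8 compares the characters exactly, so a phrase such as "A car, a man, a maraca" is reported as not being a palindrome because of its uppercase letter, spaces and commas. A possible variant that skips everything except letters and digits and ignores letter case is sketched below; the name is_palindrome_loose is hypothetical.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical variant of the palindrome test: only letters and digits
   are compared, and uppercase/lowercase differences are ignored.
   Returns 1 for a palindrome, 0 otherwise. */
int is_palindrome_loose(const char *s)
{
    size_t i = 0;               /* index of the left character */
    size_t j = strlen(s);       /* one past the right character */

    while (i < j) {
        unsigned char left = (unsigned char)s[i];
        unsigned char right = (unsigned char)s[j - 1];

        if (!isalnum(left)) {           /* skip spaces, commas, ... */
            i++;
        } else if (!isalnum(right)) {
            j--;
        } else if (tolower(left) != tolower(right)) {
            return 0;                   /* asymmetric pair found */
        } else {
            i++;                        /* pair matches: move inwards */
            j--;
        }
    }
    return 1;
}
```

is_palindrome_loose("A car, a man, a maraca") returns 1, and is_palindrome_loose("Programming") returns 0.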
8.3 Questions and exercises
A. Find the error.
1. A character must be declared and initialized.
char ch="A";
2. The value 'a' is assigned to a char variable and then that variable is
displayed.
char ch='a';
printf("%d",ch);
3. A string variable must be declared and initialized.
char str[5]="C Programming";
4. Two strings s1 and s2 must be compared.
if(s1==s2)
    //something
else
    //something else
B. Consider the following sequences of code. Specify what happens during their
execution.
5.
char key;
do{
    key=getch();
}while(toupper(key)!='X');
6.
char s1[10], s2[10];
int flag;
strcpy(s1,"abc");
strcpy(s2,"abc");
if(strcmp(s1,s2))
    flag=1;
else
    flag=2;
printf("%d",flag);
7.
char str[25]={'a','b','c','\0','1','2','3'};
printf("%s, %d",str,strlen(str));
8.
char s1[10], s2[10];
strcpy(s1,"abc");
strcpy(s2,"123");
if(strcmp(s1,s2)==0)
    strcpy(s1,s2);
else
    strcat(s1,s2);
printf("%s, %s",s1,s2);
C. Choose the correct answer (one only).
9. Which of the following functions can be applied to a char variable?
□ a) isalnum(); □ b) isspace(); □ c) toupper(); □ d) tolower();
□ e) all answers are correct.
10. Two strings a="abra" and b="cadabra" are considered. After the execution
of the instruction strcat(a,b);, the variables a and b will have the following
values:
□ a) a="abra", b="cadabra"; □ b) a="abra si cadabra", b="cadabra";
□ c) a="abracadabra", b="cadabra"; □ d) a="abra cadabra", b="cadabra";
□ e) a="abra", b="abracadabra".
11. Two strings s1="abc" and s2="abc" and two integer variables a=0 and b=0 are
considered. The following sequence of code is executed:
if(strcmp(s1,s2)==0) a++; else b+=1;. What will be the values of the variables
a and b after this execution?
□ a) a=0, b=0; □ b) a=1, b=0; □ c) a=1, b=1; □ d) a=0, b=1;
□ e) all answers are wrong.
12. Which of the following functions should be used to determine the length
of a string?
□ a) strlength(); □ b) length(); □ c) String.Len(); □ d) strlen();
□ e) len().
13. Which of the following actions is possible with two strings s1 and s2?
□ a) s=s1+s2; □ b) s=s1-s2; □ c) s=s1*s2; □ d) s=s1/s2;
□ e) all answers are wrong.
14. A string is:
□ a) a data type; □ b) an array of characters; □ c) an operator;
□ d) a function; □ e) a program.
15. The function that concatenates two strings is:
□ a) strcat(); □ b) strconcatenate(); □ c) concatenate(); □ d) strconc();
□ e) conc().
16. Which of the following functions is case sensitive?
□ a) strcpy(); □ b) strcat(); □ c) strlen(); □ d) strcmp(); □ e) stricmp().
17. Which of the following instructions copies a string s1 to a string s2?
□ a) strcpy(s1,s2); □ b) strcpy(s2,s1); □ c) s2=s1; □ d) s2==s1;
□ e) none of them.
D. Write programs to solve the following problems.
18. Read into a single string the first name and the last name of a person
(both entered on the same line).
19. Read, in two different variables, the first name and the last name of a
person. Concatenate these values without losing any of them and display
the complete name.
20. A string is read from the keyboard. Find out how many digits, how
many uppercase letters and how many lowercase letters that string contains.
21. A string is read from the keyboard. Display the string from right to left.
22. Determine how many times a character is present in a text. Both the
character and the text are read from the keyboard.
23. Ten words are read from the keyboard. Display a warning for those
words that are greater (from the lexicographical point of view) than the first
word.
24. The first name and the last name of a person are stored in two strings.
Modify these strings so that the first name has its first letter in uppercase
and the rest in lowercase, and the last name has all its letters in uppercase.
8.4 Answers to questions and exercises
A. 1. The variable is correctly declared, but it is not correctly initialized
because double quotes are used (these indicate a string to the compiler)
instead of apostrophes. "A" is a string of two characters ('A' and the string
terminator '\0') and cannot be stored in a variable that is defined as a single
character. 2. The character is displayed in the wrong way. The format
descriptor "%d" is used to display integer numbers, therefore the ASCII code of
the character is printed, not the character itself. In order to print the
character, the format descriptor "%c" should be used. 3. Variable str does not
have enough space allocated to store the given value. There are two solutions
for this problem: either the number of characters allocated to the string is
increased to fit the desired value, or this number is deleted and the compiler
calculates the required space. 4. The operator "==" does not compare the
contents of two strings (it compares their addresses); the function strcmp()
should be used.
B. 5. A key is read repeatedly within a do while loop. The loop ends when the
user presses 'X' or 'x'. 6. The value "abc" is assigned to string s1; string s2
receives the same value "abc". Because the two strings are equal, function
strcmp() returns the value 0, which means false from the logical point of view;
therefore, the else branch of the if instruction is executed and variable flag
receives the value 2. 7. The text "abc, 3" is displayed, meaning that the
string is abc and it has the length 3, because the string ends where the
character '\0' is encountered. 8. The value "abc" is assigned to string s1;
string s2 receives the value "123". Function strcmp() does not return zero,
because the strings are not identical. Therefore, the condition of the if
instruction is false and function strcat() on the else branch is executed.
Thus, string s1 becomes "abc123" and string s2 remains "123".
C. 9. e). 10. c) – string b is appended to string a; the result is stored in
string a and string b retains its initial value. 11. b) – the two strings are
identical, function strcmp() returns the value 0, therefore the condition of if
is true; variable a is incremented and b keeps its value 0. 12. d). 13. e).
14. b). 15. a). 16. d). 17. b).