CSCI 3431: OPERATING SYSTEMS
Chapter 0 – C Programming
The C Language
We are programming in C, which is (very nearly) a subset of C++
C++ was originally compiled into C
No classes, templates, namespaces, user-defined overloading, streams
Here is a good reference to the language and the basic C libraries: http://www.gnu.org/software/libc/manual/
C's Foundations
C is really just a structured version of assembly language
Very minimal library support (see handout)
Meant for systems programming; originally created as a language to program Unix
It is said that C was programmed in a week and that the manual was written after they determined what the compiler did/accepted
Fundamentally linked to the Unix OS and to system programming
Created by Dennis Ritchie at Bell Labs; Brian Kernighan co-wrote the classic book on the language ("K&R")
Developed before mice and GUIs
Is a foundational language that spawned C++, Java, C# and many other more recent languages
A reference manual (online or paper) is essential
Philosophies
C is made to let a programmer do pretty much anything which means that it's easy to really botch something up
It was designed to be as close to assembly language as was comfortable for a programmer
The language is simple which means that complexity is handled by the programmer and programs
Puts the emphasis on the programmer to develop good style, structure, and practices
C versus Java
Much of the two languages is similar (e.g., operators, control structures)
Differences mainly occur in the type system
C is much simpler, but in turn, leads to very complex programs, i.e., complexity is in the programs, not the language as is the case with Java
C's libraries are similarly simple and easy to learn, but most find them hard to use because of their flexibility and the very low level of detail
Java protects the programmer while C exposes the programmer, but this is partly due to historical context
The File Model
All C I/O is done using files
3 special files are provided: stdin, stdout, and stderr
These files are automatically opened at program start-up and closed at exit by the C runtime (not by the compiler, and not by your code)
stdin = keyboard, stdout = monitor, but both can be redirected as needed
stderr also goes to the monitor, but is often redirected to a file to be used as an error log
I/O can be done directly using fread() and fwrite()
More useful to do formatted I/O using fprintf() and fscanf()
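As a minimal sketch of direct I/O with fread() and fwrite(), the hypothetical helper copy_stream() below (the name and the 256-byte chunk size are assumptions, not part of the course material) copies one open file to another:

```c
#include <stdio.h>

/* Copy everything from `in` to `out` in fixed-size chunks.
 * Returns the number of bytes copied, or -1 on a read or write error. */
long copy_stream(FILE *in, FILE *out)
{
    char chunk[256];
    size_t got;
    long total = 0;

    /* fread() returns how many items (here: bytes) it actually read. */
    while ((got = fread(chunk, 1, sizeof chunk, in)) > 0) {
        if (fwrite(chunk, 1, got, out) != got) {
            return -1;              /* short write */
        }
        total += (long) got;
    }
    /* A zero from fread() means end of file or an error; tell them apart. */
    return ferror(in) ? -1 : total;
}
```

Calling copy_stream(stdin, stdout) would behave like a very small version of cat, and because both streams can be redirected it works equally well on files.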
File I/O
Files are represented by a file pointer of type FILE *
Files must be opened (fopen)
Files are opened to support a specific mode of use
Files should be closed (fclose)
Files are often buffered so flushing might be necessary to force output (fflush)
Buffering is controlled by the programmer (setvbuf) for one of:
  _IONBF No buffering (unbuffered)
  _IOLBF Line buffering
  _IOFBF Full buffering
The buffer size is also controllable to achieve optimum performance (usually matched to a disk block)
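The buffering calls can be sketched as follows. The helper name write_with_mode() is an assumption made for illustration; setvbuf(), fputs(), and fflush() are the standard library calls.

```c
#include <stdio.h>

/* Switch `fp` to the requested buffering mode, write `msg`, and flush.
 * `mode` is one of _IONBF, _IOLBF, or _IOFBF. setvbuf() has to be
 * called before any other operation on the stream.
 * Returns 0 on success, -1 on failure. */
int write_with_mode(FILE *fp, int mode, const char *msg)
{
    /* Passing NULL lets the library allocate the buffer (BUFSIZ bytes). */
    if (setvbuf(fp, NULL, mode, BUFSIZ) != 0) {
        return -1;
    }
    if (fputs(msg, fp) == EOF) {
        return -1;
    }
    /* Force anything still sitting in the buffer out to the file. */
    return (fflush(fp) == EOF) ? -1 : 0;
}
```

For debugging output, making stdout unbuffered (_IONBF) right at the start of main() is a common alternative to sprinkling fflush(stdout) calls through the code.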
File Mode Values
"r"   read
"w"   write (create, or discard previous contents)
"a"   append
"r+"  reading and writing
"w+"  create then reading and writing (discard previous contents)
"a+"  open or create for reading and writing, with all writes at the end
FILE *fp = fopen ("errors.txt","a+");
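A small sketch of two of the modes in use, with error checking on fopen(). The function names append_line() and count_lines() are hypothetical and chosen only for this example.

```c
#include <stdio.h>

/* Append one line of text to the file at `path`, creating it if needed.
 * Returns 0 on success, -1 on failure. */
int append_line(const char *path, const char *text)
{
    FILE *fp = fopen(path, "a");        /* "a": writes always go to the end */
    int ok;

    if (fp == NULL) {
        return -1;
    }
    ok = (fprintf(fp, "%s\n", text) >= 0);
    if (fclose(fp) == EOF) {            /* fclose() flushes; it can fail too */
        ok = 0;
    }
    return ok ? 0 : -1;
}

/* Count the newline characters in the file at `path`.
 * Returns -1 if the file cannot be opened for reading. */
long count_lines(const char *path)
{
    FILE *fp = fopen(path, "r");        /* "r": fails if the file is missing */
    long lines = 0;
    int ch;

    if (fp == NULL) {
        return -1;
    }
    while ((ch = fgetc(fp)) != EOF) {
        if (ch == '\n') {
            lines++;
        }
    }
    fclose(fp);
    return lines;
}
```

Opening the same file with "w" instead of "a" would discard the earlier lines on every call, which is the usual way an error log gets emptied by accident.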
File Output
Formatted: fprintf, printf, sprintf, vprintf ...
fputc: write a char to a FILE *
fputs: write a char* to a FILE * (char* = null terminated string)
putc: macro version of fputc that may evaluate its stream argument more than once (avoid using it)
puts: write a char* to stdout, followed by a newline
putchar: write a char to stdout
Each function has its own subtleties
You MUST learn how to use these functions!
Example: Writing a string (not necessarily efficiently or clearly)
#include <stdio.h>
#include <string.h>
#define LENGTH 80
int
main(void) {
    FILE *stream = stdout;
    size_t i;   /* same type as strlen() returns, so the comparison is safe */
    int ch;
    char buffer[LENGTH + 1] = "Hello world";
    for (i = 0;
         (i < strlen(buffer)) && ((ch = putc(buffer[i], stream)) != EOF);
         ++i);
    return 0;
} /* end main() */
/******************** Expected output: *************************
Hello world
*/
Formatted Output
fprintf(file, "format", values);
Various versions of fprintf() exist:
printf(...) is the same as fprintf(stdout,...)
sprintf(): print to a string (or character buffer)
v_printf() (vprintf, vfprintf, vsprintf): take the arguments as a single va_list, for use inside functions that themselves accept a variable number of args
Other variations exist (see reference material)
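To illustrate where the v...printf() variants fit, here is a minimal sketch of a variadic wrapper. The function format_tagged() and its "[tag] " prefix are assumptions made for this example; it uses the bounded C99 calls snprintf() and vsnprintf() rather than sprintf().

```c
#include <stdarg.h>
#include <stdio.h>

/* Write "[tag] " followed by the formatted message into `buf`.
 * Returns the number of characters written (excluding the '\0'),
 * or -1 if the result would not fit in `size` bytes. */
int format_tagged(char *buf, size_t size, const char *tag,
                  const char *fmt, ...)
{
    va_list args;
    int used, rest;

    used = snprintf(buf, size, "[%s] ", tag);
    if (used < 0 || (size_t) used >= size) {
        return -1;
    }

    /* Hand our own variable arguments on to the va_list version. */
    va_start(args, fmt);
    rest = vsnprintf(buf + used, size - (size_t) used, fmt, args);
    va_end(args);

    if (rest < 0 || (size_t) rest >= size - (size_t) used) {
        return -1;
    }
    return used + rest;
}
```

A call such as format_tagged(buf, sizeof buf, "warn", "%d of %d", 3, 7) leaves "[warn] 3 of 7" in buf.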
Formatted Output
printf("format", args ...);
1. Format contains format specifiers, such as %d (integer), %s (string), %c (character)
2. There must be an argument for every specifier
3. Format can contain escape sequences such as \n to insert a newline
printf("%d is %s than %d\n", i[0], "smaller", max);
Format Specifiers
1. %
2. flags (optional): sign, zero padding, etc.
3. minimum width (optional)
4. .precision (optional): maximum chars
5. length modifier (optional): short, long
6. argument type conversion: string, pointer, character, float, integer, etc.
Examples: %-2.8s, %hd, %*s
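The parts of a specifier can be seen working together in a short sketch. The function name format_demo() and the sample values are assumptions; the specifiers themselves are standard.

```c
#include <stdio.h>

/* Format four fields that each exercise a different part of a specifier.
 * Returns what snprintf() returns: the length of the formatted text. */
int format_demo(char *buf, size_t size)
{
    short small = 12;

    return snprintf(buf, size, "[%-8.3s|%hd|%*s|%05d]",
                    "abcdef",   /* %-8.3s: at most 3 chars, left-justified in 8 */
                    small,      /* %hd:    length modifier h for a short        */
                    4, "x",     /* %*s:    width (4) is taken from an argument  */
                    7);         /* %05d:   zero padded to a width of 5          */
}
```

With a large enough buffer this produces "[abc     |12|   x|00007]".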
Conversions (used by specifiers)
s string
d signed integer
f float (double)
c single character
p pointer
n stores the number of characters printed so far into an int * argument (prints nothing)
All these are preceded by a % as part of the format specifier
ALL the Conversions ...
%c The character format specifier.
%d The integer format specifier.
%i The integer format specifier (same as %d).
%f The floating-point format specifier.
%e The scientific notation format specifier.
%E The scientific notation format specifier.
%g Uses %f or %e, whichever result is shorter.
%G Uses %f or %E, whichever result is shorter.
%o The unsigned octal format specifier.
%s The string format specifier.
%u The unsigned integer format specifier.
%x The unsigned hexadecimal format specifier.
%X The unsigned hexadecimal format specifier.
%p Displays the corresponding argument that is a pointer.
%n Records the number of characters written so far.
%% Outputs a percent sign.
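A quick way to get a feel for the conversions is to run several through one call. In this sketch the function name conversion_demo() and the sample values are assumptions; %p and %n are left out because their results depend on the platform.

```c
#include <stdio.h>

/* Run one value through each of the common conversions.
 * Returns the length of the formatted text, as snprintf() does. */
int conversion_demo(char *buf, size_t size)
{
    return snprintf(buf, size,
                    "%c %d %u %o %x %X %.2f %e %g %g %s %%",
                    'A',            /* %c   character                        */
                    -5,             /* %d   signed integer                   */
                    5u,             /* %u   unsigned integer                 */
                    8u,             /* %o   octal: 10                        */
                    255u, 255u,     /* %x %X hexadecimal: ff and FF          */
                    3.14159,        /* %.2f two digits after the point       */
                    12345.678,      /* %e   scientific notation              */
                    100000.0,       /* %g   picks the %f style here          */
                    1000000.0,      /* %g   picks the %e style here          */
                    "ok");          /* %s   string; %% prints a single %     */
}
```

The resulting text is "A -5 5 10 ff FF 3.14 1.234568e+04 100000 1e+06 ok %".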
Examples of Output
printf ("Hello World\n");
char buf[256];
sprintf(buf,"%8d\t%8d", a, b);
fprintf(stderr,
"%s: fatal: %s not found",
argv[0], argv[1]);
FILE *log = fopen("log.txt","w+");
fprintf(log,"%s: %s\n", time, msg);
Command Line Arguments
#include <stdio.h>
int
main (int argc, char **argv) {
    int i;
    for (i = 0; i < argc; i++) {
        printf("Argument %d: %s\n", i, argv[i]);
    }
    return (i);
} /* end main () */
%>./a.out -h -o test
Argument 0: ./a.out
Argument 1: -h
Argument 2: -o
Argument 3: test
C Input
Formatted input is done using various versions of fscanf
fscanf(stdin,...) is the same as scanf(...)
scanf() is hard to use at first!
Also have: fgetc(), getc(), getchar(), fgets(), gets(), and ungetc()
Avoid gets(): it cannot limit how much it reads and was removed from the language in C11; use fgets() instead
Each function has its own little quirks
You must learn to use the printf() and scanf() families of functions!
Learning to read the documentation while you practice using them is crucial
C Input
To match fprintf() for output there is fscanf(file,fmt,args) for input. E.g.,
fscanf(stdin, "%s %d", name, &grade);
Format specifiers are pretty similar, but do have a few differences
Args MUST be addresses (pointers)
name is a char* (already an address)
grade is an int so we use its address
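The same name-and-grade format can be tried out safely with sscanf(), which reads from a string instead of a file. This is a minimal sketch: parse_record() and the 31-character limit are assumptions made for the example.

```c
#include <stdio.h>

/* Parse "<name> <grade>" from `line`.
 * `name` must have room for at least 32 chars (31 plus the '\0').
 * Returns 1 if both fields were converted, 0 otherwise. */
int parse_record(const char *line, char *name, int *grade)
{
    /* %31s stops after 31 characters so `name` cannot overflow.
     * `name` and `grade` are already addresses, so no & is needed here. */
    return sscanf(line, "%31s %d", name, grade) == 2;
}
```

The scanf() family returns the number of items it converted, so comparing that count with the number of specifiers is the simplest way to detect bad input.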
scanf() Conversions
/* Example: Tami Meredith, 2011 */
#include <stdio.h>
#define debug 0
#define COUNT 10
#define SUCCESS 0
int main (int argc, char** argv) {
    int i, data[COUNT], max = 0;
    for (i = 0; i < COUNT; i++) {
        scanf("%d", &data[i]);
#if debug
        printf("%d: Read %d\n", i, data[i]);
#endif
        if (data[i] > data[max]) { max = i; }
    }
    printf("Max = %d\n", data[max]);
    return(SUCCESS);
} /* end main () */
C Operators (highest precedence first)
 1. Parentheses, subscript   ( ) [ ]                              L to R
 1. Structure Access         . ->                                 L to R
 2. Unary                    ! ~ ++ -- + - * & (type) sizeof      R to L
 3. Mult., Div., Modulus     * / %                                L to R
 4. Add, Subtract            + -                                  L to R
 5. Shift                    << >>                                L to R
 6. Comparison               < <= > >=                            L to R
 7. Equality                 == !=                                L to R
 8. Bitwise And              &                                    L to R
 9. Bitwise Exor             ^                                    L to R
10. Bitwise Or               |                                    L to R
11. Logical And              &&                                   L to R
12. Logical Or               ||                                   L to R
13. Conditional              ?:                                   R to L
14. Assignment               = += -= *= /= %= &= ^= |= <<= >>=    R to L
15. Comma                    ,                                    L to R
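A few consequences of the precedence and associativity rules, as a small sketch. The function names are hypothetical; each comment shows how the compiler groups the expression.

```c
/* Postfix ++ binds tighter than unary *, so *p++ means *(p++):
 * fetch the current element, then advance the pointer. */
int sum_ints(const int *p, int n)
{
    int total = 0;

    while (n-- > 0) {
        total += *p++;
    }
    return total;
}

/* The conditional operator groups right to left, so this reads as
 * x > 0 ? 1 : (x < 0 ? -1 : 0). */
int sign_of(int x)
{
    return x > 0 ? 1 : x < 0 ? -1 : 0;
}

/* * binds tighter than +, and assignment groups right to left:
 * a = (b = (2 + (3 * 4))). */
int precedence_demo(void)
{
    int a, b;

    a = b = 2 + 3 * 4;
    return a + b;       /* 14 + 14 */
}
```

When the grouping is not obvious at a glance, adding parentheses costs nothing and makes the intent explicit.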
Control Structures
Pretty much the same as Java
f(), return
if () ... , if () ... else ...
switch () { case _: ... default: ... }
for (;;) ..., while () ..., do { ... } while ();
break, continue
label:, goto
Example: Switch
switch (c) {
    default:    /* need not come last; with no break it falls through */
        printf("This is just a mess ...\n");
    case '\n':
    case '\t':
        printf("c is whitespace\n"); break;
    case '_':
        printf("c is an underscore\n"); break;
}
A switch can have only one default, but it can appear anywhere. Cases can come in any order. If no break exists, execution "falls through" into the next case.
Quirks and Tricks
for (i = 1, j = 2; x < 10; a++, b++) { … }
for (;;) { /* infinite loop */ }
#define loop for(;;)
continue – jumps to test of a loop
break – exits innermost loop or switch
a = (x < y) ? x : y;
printf("sex: %smale\n", (sex == 'f') ? "fe" : "");
while ((c = getchar()) != EOF) { … }
x++; ++x; x += 1; x = x + 1; all add one to x
But x[i++] is not the same as x[++i]
x = x * 2; x *= 2; x <<= 1; all double x
0 = false, anything else = true in booleans
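The difference between x[i++] and x[++i], and the "anything non-zero is true" rule, can be checked with a short sketch. Both function names are assumptions made for this example.

```c
/* Index with post-increment, then with pre-increment. */
int post_then_pre(void)
{
    int x[] = { 10, 20, 30 };
    int i = 0;
    int first = x[i++];     /* uses x[0], then i becomes 1 */
    int second = x[++i];    /* i becomes 2, then uses x[2] */

    return first + second;  /* 10 + 30 */
}

/* Any non-zero value counts as true, including negative ones. */
int is_true(int v)
{
    return v ? 1 : 0;
}
```

As statements on their own, x++ and ++x do the same thing; the difference only shows when the value of the expression is used.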
Example: goto
#include <stdio.h>
int
main () {
    int a = 10;
start:
    do {
        /* start: could also go here */
        if ((a % 3) == 0) {
            /* skip iteration if divisible by 3 */
            a = a + 1; /* or: a += 1; */
            goto start;
        }
        printf("value of a: %d\n", a++);
    } while (a < 20);
    return 0;
}
Type Definitions
Type synonyms or aliases
Just an alternative name for a type
Often used to improve code portability and readability, has no semantic value
Type system uses structural equivalence, not name equivalence
typedef int number;
number y;
number zero() { return (0); }
Structural Equivalence
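Structural equivalence means that two data values have the same type if they have the same structure
What something is called/named is unimportant, it's what it looks like structurally
Typedef names are not important at all and only improve readability (i.e., they are syntactic sugar)
Caveat: this applies to typedef aliases; two separately declared struct or union types are distinct types even if their fields match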
Example: Structural Equivalence
typedef int integer;
int x = 5;
integer y;
y = x; /* Allowed, both are ints */
integer sum(int a, int b) {
    return (a + b);
}
Def before Use
All C variables must be defined (or declared) before they are used.
Definition = memory allocated
Declaration = no memory allocated
Was NOT originally block scoped and had only file and function scoping
Also used to require all variables to be declared/defined (at start of function) before any executable code
Many C programmers do not use block scoping as a result (I don't)
Many also consider it good style to put all variable definitions at the beginning of a function (as was historically necessary) to create a data dictionary for the function
Scoping
Block scoping means that a variable can be defined within a block and that it can be accessed only in that block (or nested blocks) – "new" ANSI C
Function scoping is when a variable is accessible anywhere within the body of a function and no finer-grained control is possible – old "K&R" C
Example: Scoping
#include <stdio.h>
int main (int argc, char **argv) {
    int x = 3; /* Function Scope */
    for (x = 0; x < 10; x++) {
        int x = 7; /* Block scope: shadows and hides other x */
        printf("%d\n", x);
    }
    return (x);
}
The inner "int x = 7;" is not allowed in Old C:
1. Block scope
2. Defined after executable code
It also shadows and hides the other x
PreDefined Namespaces
Predefined namespaces exist and can't be user defined
"objects" – variables within a scope (object is used in the general sense not the OOP sense)
functions
typedef names
enum constants
labels
SUE (struct, union, enum) tags
Fields per SU (each struct or union has its own)
In general, if you avoid re-using names too much, you won't have any problems with naming conflicts
Structures
A "class" with only data fields
Structure (and union, enum) "tags" have their own namespace
tags are optional – usually leave them out in a typedef
typedef struct {
    int x, y;
} t_point;
t_point origin = { 0, 0 };
printf("x: %d, y: %d\n", origin.x, origin.y);
Unions
A technique for treating a block of memory in various ways
Like a structure, but only one of the fields can be used at any time
May be used as a complex casting mechanism
Gives a programmer considerable control over memory management
Will provide more on structures and unions when we discuss pointers and memory
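Until then, here is a minimal sketch of the "only one field at a time" idea. The type and function names (value, value_kind, value_as_double) are assumptions for this example; a separate tag records which member of the union is currently in use.

```c
/* Which member of the union currently holds the data. */
typedef enum { V_INT, V_REAL } value_kind;

/* A struct pairing the tag with a union: both members of `as`
 * share the same block of memory. */
typedef struct {
    value_kind kind;
    union {
        int    i;
        double d;
    } as;
} value;

/* Read the value back through whichever member the tag says is live. */
double value_as_double(const value *v)
{
    return (v->kind == V_INT) ? (double) v->as.i : v->as.d;
}
```

The union is only as large as its largest member (plus any padding), which is where the memory saving comes from. The language does not track which member was stored last, so keeping the tag accurate is the programmer's job.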
Enumerations
A way of giving names to a set of integers
Allows names to be used in switches
Used to define finite sets of pre-defined values
Often used in a typedef
enum mode { mUnknown = 0, mRead, mWrite };
switch (m) {
    case mRead: … break;
    case mWrite: … break;
    default:
        /* mUnknown */ … break;
}
Program Execution
1. Preprocessing: Process and remove #directives
2. Compilation
   a) Tokenisation: Stream of Tokens
   b) Parsing: Abstract Syntax Tree
   c) Semantic Analysis: Type checking etc.
   d) Optimisation: Register Transfer Language
   e) Code Generation and Optimisation
3. Linking: Produce a relocatable executable
4. Loading: Resolve virtual addresses
5. Execution
C Preprocessor Constructs
Trigraph replacement: Obscure compliance with ISO 646-1983 Invariant Code Set
Line Splicing: Lines that end with \ are joined
File Inclusion: #include
Macro Definition and Expansion: #define
Conditional Compilation: #if
Line Identification: Insertion of #line constructs
Error Generation: #error causes CPP to write an error message
Pragmas: Implementation dependent PP commands
Predefined CPP Names
__LINE__  current source line
__FILE__  current source file
__DATE__  date of compilation
__TIME__  time of compilation
__STDC__  1 if CC is standard-conforming
Other names may be defined by the implementation
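One common use of the predefined names is in debugging output. In this sketch the macro LOG_HERE and the function build_stamp() are hypothetical names; because LOG_HERE is a macro, __FILE__ and __LINE__ are expanded at the place it is used.

```c
#include <stdio.h>

/* Print "file:line: message" to the given stream. Being a macro,
 * it reports the location of the caller, not of a helper function. */
#define LOG_HERE(fp, msg) \
    fprintf((fp), "%s:%d: %s\n", __FILE__, __LINE__, (msg))

/* Adjacent string literals are joined by the compiler, so this is a
 * single string holding the compilation date and time. */
const char *build_stamp(void)
{
    return __DATE__ " " __TIME__;
}
```

For example, LOG_HERE(stderr, "checkpoint reached") writes the source file name and the line number of that call to stderr.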
File Inclusion
You are encouraged to create your own include files containing: Macros, Declarations, Typedefs, Types, etc.
Included files should not contain executable code!
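Two Variants
1. #include <file>
   Searches in the compiler's include path for file
2. #include "file"
   Searches in a path from the CWD for file (in practice, usually the directory of the including source file first), then falls back to the include path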
Libraries Part 1
1. assert.h: the assert() diagnostic macro
2. ctype.h: character class tests
3. errno.h: declaration of int errno;
4. float.h: implementation limits for floats
5. limits.h: implementation limits for integers
6. locale.h: localization information
7. math.h: mathematical functions
8. setjmp.h: non-local jumps to avoid normal function calls and returns
Libraries Part 2
9. signal.h: signal handling and raising
10. stdarg.h: variable arg lists for functions
11. stddef.h: std. type defs (NULL, size_t, ...)
12. stdio.h: I/O, 1/3 of the C library
13. stdlib.h: utility (conversion, storage alloc.)
14. string.h: string manipulation
15. time.h: time and date functions
Macros
Not recursive, but args are expanded/evaluated each time they appear in the body
Two variants:
1. #define identifier token-sequence
   e.g.: #define loop while (1)
         #define COUNT 20
2. #define identifier(args) token-sequence
   e.g.: #define max(a,b) (((a)>(b))?(a):(b))
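The "evaluated multiple times" warning is easiest to see with an argument that has a side effect. A minimal sketch, in which MAX, max_int(), and max_side_effect() are names chosen for this example:

```c
#define MAX(a, b) (((a) > (b)) ? (a) : (b))

/* A real function evaluates each argument exactly once. */
int max_int(int a, int b)
{
    return (a > b) ? a : b;
}

/* The macro pastes its argument text in twice, so i++ runs twice. */
int max_side_effect(void)
{
    int i = 5;
    int m = MAX(i++, 0);    /* (((i++) > (0)) ? (i++) : (0)) */

    /* The test used 5 and bumped i to 6; the result is that 6,
     * and the second i++ then bumps i to 7. */
    return m * 10 + i;      /* 6 * 10 + 7 */
}
```

With max_int(i++, 0) the result would be 5 and i would end up as 6, which is what most readers expect. This is the main reason to keep arguments to macros free of side effects.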
Macro Quirks
#undef will undefine a macro
e.g.: #undef COUNT
Macros can be redefined if desired
Token concatenation is allowed in macro bodies
e.g., #define cat(x,y) x ## y
      cat(var,123) produces var123
#arg as a use causes stringification
e.g., #define string(x) #x
      string(hi\n) produces "hi\n"
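Concatenation and stringification can be tried out with a short sketch. The macro names CAT, STR, and XSTR and the three demo functions are assumptions for this example; XSTR shows the usual two-step trick for stringifying the value of a macro rather than its name.

```c
#define COUNT 20

#define CAT(x, y) x ## y    /* paste two tokens into one            */
#define STR(x)    #x        /* turn the argument text into a string */
#define XSTR(x)   STR(x)    /* expand the argument first, then STR  */

/* CAT(var, 123) becomes the single identifier var123. */
int cat_demo(void)
{
    int CAT(var, 123) = 7;

    return var123;
}

/* # does not expand its argument, so this is the string "COUNT". */
const char *str_demo(void)
{
    return STR(COUNT);
}

/* Going through XSTR expands COUNT first, giving the string "20". */
const char *xstr_demo(void)
{
    return XSTR(COUNT);
}
```

Running the compiler with -E is a convenient way to see exactly what each of these macros expands to.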
Preprocessor
#define COUNT 10
#define MAX(x,y) (((x)>(y))?(x):(y))
#undef COUNT
#include <stdio.h>
#include "myjunk.h"
#ifdef CONST ccode #endif -- also ifndef
#if defined(CONST) ccode #endif
#if (1) ccode #endif
#elif, #else
Makefiles
makefiles automatically build your program in Unix environments
lab1: lab1.c
	gcc -o lab1 lab1.c
Format of entries in makefile
file: dependencies
<tab>command
To use type: make
Will use the first entry as the target by default
Good Code
Simple, clear
Readable, understandable
Maintainable
Efficient, but only sacrifice readability and maintainability if the efficiency is critical
Uniform with regard to coding standard
Comments are "value added"
Neither under nor over commented
Coding Thoughts 1
Write it once only; if you duplicate code then refactor! (#bugs = LOC)
No magic numbers; use symbolic constants
Check for errors!
Clarity before efficiency unless needed
Trust the compiler to optimise for you
Make things explicit; casts not conversions
Aim for type consistency as much as possible
Use coding standards consistently
Coding Thoughts 2
Know your hardware (e.g., sizeof(int))
Use sizeof instead of explicit numeric values
You may need to fflush(stdout) to ensure you see all your debugging output
If you modify any typedef, you must recompile the entire program
Don't delete code! Hide it with:
#if 0
... code ...
#endif
Keep backups! Make checkpoints; consider using a repository (e.g., subversion)
Comments cannot be nested
Abstract Data Types
Separation of interface and implementation
Predates OOP, Classes
Requires programmer compliance/honesty
E.g., stack, binary tree, hash table
Only have Arrays, SUEs to create composite data types
Interface: predefined functions, SUEs, and types in the "public" header file
Implementation: function definitions in the C file
possible second "private" header file with implementation-specific SUEs, types, and declarations
A Simple List
/* Private type declaration */
typedef struct listruct {
    void *data;
    struct listruct *next;
} list;
/* Public Interface */
list *newList(void);
void *car(list *l);
list *cdr(list *l);
list *cons(void *d, list *l);
int length(list *l);
int isEmpty(list *l);
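The slides give only the interface. One possible implementation is sketched below, under two assumptions that are not in the original: the empty list is represented by NULL, and the list never owns or frees the data it points to. The type is repeated so the sketch stands on its own.

```c
#include <stdlib.h>

typedef struct listruct {
    void *data;
    struct listruct *next;
} list;

/* Assumption: an empty list is simply a NULL pointer. */
list *newList(void)
{
    return NULL;
}

int isEmpty(list *l)
{
    return l == NULL;
}

/* Put a new cell holding d in front of l.
 * Returns NULL if memory cannot be allocated. */
list *cons(void *d, list *l)
{
    list *cell = malloc(sizeof *cell);

    if (cell == NULL) {
        return NULL;
    }
    cell->data = d;
    cell->next = l;
    return cell;
}

/* First item of the list (NULL for an empty list). */
void *car(list *l)
{
    return isEmpty(l) ? NULL : l->data;
}

/* Everything after the first item (NULL for an empty list). */
list *cdr(list *l)
{
    return isEmpty(l) ? NULL : l->next;
}

int length(list *l)
{
    int n = 0;

    for (; l != NULL; l = l->next) {
        n++;
    }
    return n;
}
```

Code that uses the list should go through these functions only and never touch data or next directly; that discipline is the "programmer compliance/honesty" an ADT in C depends on. One weakness of this sketch is that cons() returns NULL on allocation failure, which looks the same as an empty list.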
Hungarian Notation (Charles Simonyi)
Two variations exist
Systems Hungarian: Prefix identifiers with their actual physical data type
e.g., piSum = pointer to an integer
Application Hungarian: Prefix identifiers with some useful semantic information
e.g., dVertical = difference/delta, rwVal = row
CLI Development on Unix
Editor: vi, vim
Compiler, Linker, PP: gcc
Text tools: AWK, perl, m4
Building: make, cmake, imake
Search: grep, egrep, fgrep
Comparison: diff, diff3
Debugging: gdb, lint
Profiling: gprof
Repository: subversion, cvs, rcs, git
Unix tools: bash, sort, uniq, ...
GCC Flags
-Ox         Use optimisation level x = 0,1,2,3 (none to most)
-c          compile, do not link
-E          preprocess only
-S          generate assembly language
-o name     rename the output to name
-Wall       Issue all warnings
-Wextra     Issue extra warnings over -Wall
-Wpedantic  Issue all ISO C warnings
-Werror     Turn warnings into errors
-g          Enable debugging support
-ggdb       Enable gdb support
There are several hundred command line flags, these are just a few of the common ones you will use
Debugging + optimisation can yield very strange results, best to turn optimisation off when debugging
/*
 * Tami Meredith: Pipe example
 */
#include <sys/types.h>
#include <sys/wait.h>
#include <errno.h>
#include <unistd.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

#define BUFSIZE 64
#define READING 0
#define WRITING 1

/* A variation on perror() */
void
failure (const char *msg)
{
    fprintf(stderr, "%s: error %d\n", msg, errno);
    exit(EXIT_FAILURE);
} /* end failure () */

int main (int argc, char **argv) {
    pid_t pid;
    int p[2], i, value, sum = 0;
    char buf[BUFSIZE];

    /* Create pipe BEFORE we fork (so both have it) */
    if (pipe(p) == -1) { failure ("pipe allocation failure"); }

    for (i = 0; i < 10; i++)
    {
        if ((pid = fork()) == -1) { failure ("can't fork"); }
        if (pid == 0)
        {   /* Child process writes its pid as text, including the '\0' */
            sprintf(buf, "%d", (int) getpid());
            write(p[WRITING], buf, strlen(buf)+1);
            exit(EXIT_SUCCESS);
        } else { /* Parent reads one message, then reaps the child */
            if (read(p[READING], buf, BUFSIZE) <= 0) { failure ("read failed"); }
            sscanf(buf, "%d", &value);
            sum = sum + value;
            wait(NULL);
        }
    }
    printf("Average is %f\n", sum/10.0);
    exit(EXIT_SUCCESS);
} /* end main () */
Variations in Style
int fubar (argc, argv)
int argc;
char **argv;
{ ... }

int
fubar (int argc, char **argv)
{ ... }

int
fubar (int argc, char **argv) { ... }

int fubar (
    int argc, char **argv
) { ... }

int fubar (int argc, char **argv) { ... }

There is a space after the name in the definition and not in any use to help search for definitions. e.g.: fubar(2,{"a.exe"});
Confused?
C can be completely unreadable, perhaps more so than any other language
It takes decades to fully grasp C which is rather amazing since it's one of the smallest, most concise languages there is
For some fun, see: http://www.cise.ufl.edu/~manuel/obfuscate/obfuscate.html