Chapter: Pointers

A pointer is an address in memory, together with the type and size of the item stored at that address. The type of a pointer to an int is int * ('int pointer').


The sections are:

Addresses
Objects
Memory
Array Lists
Linked Lists


Section: Addresses

A computer's memory is an array of bytes:

You can pick out one byte using its index.

That's called the address of the byte.

In C, you can assume that addresses are always byte-based (not bit-based or word-based).


Multi-byte values

Suppose the memory holds an array of ints.

The address of an int is still a byte-address.

Int addresses go up in 4's (assuming 4-byte ints).

But you want to index the array by 0, 1, 2, ..., not 0, 4, 8, ....


Pointers

A pointer is an address in memory, together with the type and size of the item stored at that address.

The type of a pointer to an int is int * ('int pointer').

C gives you direct and total access to pointers, but without worrying about exactly what the addresses actually are.

On typical systems, a pointer is represented as an unsigned integer with 4 bytes (on a 32-bit system) or 8 bytes (on a 64-bit system, so you can have more than 4 GB of memory).


What are pointers for?

Pointers allow call-by-reference so that functions can access data belonging to the caller.

A pointer allows you to store something such as a string or array that has a different size at different times.

And a pointer can be used with dynamic allocation, to allow an item's lifetime to be more flexible than the duration of a single function call.


Pointer variables

Pointers can be stored in memory, in variables.

    int i, j, k;
    int a[3];
    int *p;

p is declared to be of type int * and must point to the beginning or end of an actual int in memory.

For example, p could point to the location of i or of j or of k, or to any of the elements of the array a, or to the end of the array.

Suppose p is made to point to k.


Picturing pointers

It is important, when programming or debugging, to create pictures of pointers, in your head or on paper.

We don't know how memory is allocated, so the picture of p pointing to k could equally well be drawn differently.


Scattering

Since we don't know (and don't need to know) where things are located in memory, they are often pictured as 'randomly' scattered:


Notation

Which of these should you write to declare a pointer variable?

    int *p;
    int* p;

The compiler doesn't care, so it is a matter of taste.

Some programmers even write int * p;.


Poor notation

If you write int *p then this is not clear:

int *p = x;

The problem is that it means:

int *p; p = x;

even though it looks like:

int *p; *p = x;


Poor notation

If you write int* p then this is not clear:

int* p, q;

The problem is that it means:

int* p; int q;

even though it looks like:

int *p; int *q;


Decision

The choice between int *p and int* p is a no-win situation.

So in this tutorial, we will follow the most common convention and write int *p.


Reason for notation

Why are C's pointer types written in this way?

With pointers, types can get very complicated, and the designers of C wanted types to be written the same way round as the operations performed on the variables, not the opposite way round.

A declaration is written as an example of using the variable, plus the basic type you reach at the end.

So int *p means "p is a variable to which you can apply the * operator, and then you reach an int".


Pointer arithmetic

If a pointer variable p of type int * points to an int, then the expression p+1 points one int further on.

For example, if p points to a[1] then p+1 points to a[2].

The C compiler uses the knowledge of the type of the item which a pointer points to, and its size, to make the arithmetic as convenient as possible.

If you need to know a size (in bytes) yourself, apply the sizeof() pseudo-function to a variable or a type.


Two operators

The & operator takes a variable, and creates a pointer to its memory location.

The * operator takes a pointer, and follows it to find the value stored at that memory location.

These go in 'opposite directions' along the pointer.


The & operator

The & operator creates a pointer to a variable.

pointer.c

    /* Print a pointer. */
    #include <stdio.h>

    int main() {
        int n;
        int *p = &n;
        printf("pointer %p\n", (void *)p);
    }

The expression &n is often read "address of n", even though it should really be "pointer to n". The %p usually prints pointers in hex. Strictly, it expects a void *, hence the cast.


The * operator

The * operator finds the value which a pointer refers to.

value.c

    /* Print a value. */
    #include <stdio.h>

    int main() {
        int n = 42;
        int *p = &n;
        printf("value %d\n", *p);
    }


NULL

One special pointer is provided in C, called NULL, available from stdio.h for example.

Don't confuse it with the null character, written '\0', which is only one byte long.

The NULL pointer never points to a usable item, and following it is undefined behaviour. On typical systems it is represented as location 0, which never belongs to your program, so an access causes a segfault.

It is used as the value of pointers which don't yet point to anything, as an error indicator for functions that return pointers, and so on.


Deliberate segfault

Here's a deliberate segfault:

    /* Demo: cause a segfault */
    #include <stdio.h>

    // Point to the beginning of the memory, and demonstrate
    // that it doesn't belong to the program.
    int main() {
        char *s = NULL;
        s[0] = 'x';
    }


Section: Objects

C doesn't properly support object oriented programming.

But it is reasonable to use the word object to mean a structure or array, accessed using a pointer.

This represents another step towards object oriented programming.

This section looks at what changes when structures and arrays are handled using pointers.


Strings

Compare these program fragments:

    char s[] = "bat";

    char *s = "bat";

They are nearly identical. Extracting a character s[i] is the same, printing is the same, comparing with other strings is the same, ...


Meaning 1

    char s[] = "bat";

The constant string "bat" is stored in an array of four bytes somewhere within the program.

This declaration involves allocating a new array of four bytes, and copying the original string into the new array.

So for long strings it is less efficient than leaving the original string where it is.


Meaning 2

    char *s = "bat";

This creates a pointer s to the original string.

But the original string, being part of the program, is read only.

An update s[0] = 'c' compiles, but it is undefined behaviour, and typically causes a segfault.

So this is less flexible than the array copying version.

That means both versions are useful in different circumstances.


Puzzle

What does this program do?

    #include <stdio.h>

    int main() {
        char *s1 = "cat";
        char *s2 = "cat";
        if (s1 == s2) printf("same\n");
        else printf("different\n");
    }

Try to decide before moving on.


Answer

The question was, what does this program do?

    #include <stdio.h>

    int main() {
        char *s1 = "cat";
        char *s2 = "cat";
        if (s1 == s2) printf("same\n");
        else printf("different\n");
    }

Answer: it depends whether the compiler optimises by noticing that it can reuse the same constant string.

The important thing is that s1 == s2 is pointer comparison: it compares the two addresses, not the characters stored there.


Passing arrays

Compare these functions:

    void print(char s[]) { ... }

    void print(char *s) { ... }

These are identical, because in the first version, the array is passed by reference, i.e. by pointer.

In other words, the compiler converts the first version into the second.


Pointer to array

There are many circumstances where C treats "array" and "pointer to array" as the same.

    char s[10];
    char *p = s;
    char *q = &s[0];
    char *r = &s;

Which of the pointers p, q and r are legal?

Which of them are the same?


Equal

    char *p = s;
    char *q = &s[0];

The pointers p and q are legal and the same, and can be used in almost the same way as s.

Both p[0], p[1], p[2]... and q[0], q[1], q[2]... can be used to index the array.

In most situations, C converts automatically between s and &s[0] as necessary, and p[i] means exactly the same as *(p + i).


Odd one out

    char *r = &s;

This isn't legal, because &s is a pointer to the whole array, not to the first character of the array.

    char (*r)[10] = &s;

This is a legal definition of r as a pointer to the array. It holds the same address as p and q.

But r + 1 points to the end of s, not to its second element, because C adds the size of the type that r points to.


Returning arrays

Pointer notation seems to allow us to return an array from a function, which we couldn't do before:

    char *show(int n) { // BAD
        char s[12];
        snprintf(s, sizeof s, "%d", n); // itoa is not standard C
        return s;
    }

But this is illegal, with undefined behaviour, because the array s disappears when the show function returns.

You are returning a dangling pointer.


Returning arrays 2

This is OK, though:

    char *min(char *s1, char *s2) {
        if (strcmp(s1, s2) < 0) return s1;
        else return s2;
    }

It returns one of the two arrays that were allocated by the caller.


Allocation

The fact that newly allocated memory can't be returned from a function is a serious restriction.

A complex program might need to allocate lots of memory, but without knowing in advance how much is going to be needed.

We will sort this out when we get to malloc.


Pointing into arrays

Pointers allow us to do this:

    int *p = &a[i];

This allows us to handle subarrays and substrings, loop through arrays using pointers (not usually recommended) and so on.


Structs without pointers

Reminder: when you pass a struct directly to a function, it is copied, so any updates have to be returned and put back in the original structure:

    struct car move(struct car b, ...) {
        ...
        b.x = ...
    }
    ...
    b = move(b, ...);

This is very fussy, so let's tidy it up.


The struct keyword

Let's use typedef like before to get rid of the struct keyword:

    typedef struct car car;

    car move(car b, ...) {
        ...
        b.x = ...
    }

One-word type names seem more readable.

(The only slight downside is that syntax colouring editors and tools may not colour the type name nicely.)


The return problem

If a function move updates the struct, then it is only updating its local copy (in its local argument variable b) so the updated struct has to be returned.

    car racer;
    ...
    racer = move(racer, ...);

It is incredibly easy to forget the "racer =" bit which copies the updated struct back into the original variable.


The copying problem

The fields in the struct are all copied across into the function's argument variable, and then copied back into the original (whether they have been updated or not).

This is inefficient.

The inefficiency may matter for programs where structs are passed around a lot, or where some structs are very big (e.g. containing an array).


The solution

The answer is to pass structs around using pointers.

Passing structs without pointers is very rare in real C programs.

There are some changes to the functions that get called, and there are some changes to the calling functions.


Called functions

A called function is passed a pointer to a struct:

    void move(car *b, ...) {
        ...
        b->x = ...
    }

The function no longer needs to return anything.

It uses b->x to access fields, which is a shorthand, which all C programmers use, for (*b).x.


Local allocation

In calling functions, one strategy is to continue to allocate structs as local variables.

    int main(...) {
        car racerdata = { ... };
        car *racer = &racerdata;
        ...
        move(racer);
    }

The struct variable is given an obscure name, e.g. ending in ...data, because it is temporary, and the pointer variable is given a nice name for general use.


Naming

This is inferior:

    car racer = { ... };
    ...
    move(&racer);

It is far too easy to forget the &, and it doesn't match the way that pointer variables are used elsewhere.

It is the pointer which represents the object and which deserves the nice name.


Returning

This is illegal:

    car *newCar(...) { // BAD
        car cardata = { ... };
        car *c = &cardata;
        ...
        return c;
    }

The memory for cardata disappears (to be reused by other function calls) when the function returns, so this is returning a dangling pointer again.


Full example

car.c

    /* Passing structures using pointers */
    #include <stdio.h>

    struct car { int x, y; };

    // Move a car by a given amount
    void move(struct car *b, int dx, int dy) {
        b->x = b->x + dx;
        b->y = b->y + dy;
    }

    int main() {
        struct car racerdata = {41, 37};
        struct car *racer = &racerdata;
        move(racer, 1, 5);
        printf("%d %d\n", racer->x, racer->y);
    }


Section: Memory

A lot of details will be simplified in this section, to get a programmer's view of what is going on. The layout of a computer's memory is roughly:

Memory is allocated in segments. Your program consists of one or more segments, plus indirect access to the operating system segment.

That's why an access to memory not belonging to your program results in a segmentation fault (segfault).


Virtual memory

The processor translates addresses so that the operating system and the segments that belong to your program appear to your program as contiguous:

The OS at the start of virtual memory means that it is always illegal to access memory via NULL.

Apart from that, virtual memory can be ignored.


Zooming in

The layout of your program is roughly:

The machine code and constants are loaded when the program is run, then the heap expands upwards, and the call stack expands downwards.

The three parts are often separate segments, with the code and constants being read-only.


The stack

The call stack is where local variables are allocated during function calls.

New space is allocated on entry to a function, then discarded on exit.

This allows functions to be recursive, e.g.

    int fact(int n) {
        if (n == 0) return 1;
        int f = fact(n - 1);
        return n * f;
    }


Calls

Here's the stack during fact(3) (simplified).


Normal calls

The example shows a recursive function, but the same sort of thing happens with normal functions.

Memory is allocated for main, then for sort (say), then for compare, then compare returns and sort calls swap, then sort repeatedly makes similar calls, then returns, then maybe main calls something else, then eventually returns.

The stack grows and shrinks 'at random' as functions are called and return, until eventually main returns.


The heap

The heap is used for dynamically allocated memory, i.e. for items which can't be handled by function-call-based nested lifetimes.

The most common case is an array or any other data structure which needs to grow when new data arrives.

The heap is managed by the malloc and free library functions.


malloc and free

The library functions malloc ("memory allocate") and free allocate and deallocate memory blocks.

    /* Demo: string using malloc/free */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int main() {
        char *s = malloc(4); // was char s[4];
        strcpy(s, "cat");
        printf("%s\n", s);
        free(s);
    }


stdlib

The stdlib library contains the functions malloc and free so we need to include its header.

    #include <stdlib.h>

This provides the compiler with the prototypes of the library functions, so it knows how to generate calls.

The machine code of standard libraries like stdlib is linked automatically by the compiler, but other libraries may need to be mentioned explicitly.


Calling malloc

The call to malloc allocates the memory.

    char *s = malloc(4);

The variable is declared as a pointer to the first element of an array.

The argument to malloc is the number of bytes desired.

The return type of malloc is void * which means "pointer to something", and it converts automatically to any other pointer-to-data type.


Visualising malloc

The effect of malloc needs to be visualised.

Before the call, s contains random rubbish.

After the call, s is a pointer to some new memory.


Freeing

The new memory is freed explicitly when not needed any more.

    free(s);

The call is unnecessary in this case because the program is about to end, and all of its memory will be returned to the operating system.

But you should free all malloced memory, to avoid memory leaks.


Compiler option

The -fsanitize=undefined option can't cope with dynamically allocated memory. For that, another compiler option -fsanitize=address is needed.

As well as adding extra code to a program, it also replaces malloc and free by versions which allow allocations to be monitored and checked.

On Windows, whether using Cygwin, MSYS2 or a native environment, this option is usually unavailable.


Memory leaks

A further option -fsanitize=leak can be useful. Depending on your platform, it may automatically be switched on when you use -fsanitize=address. On Windows and macOS, it is usually unavailable.

When the program ends, it checks for memory leaks, i.e. allocations which haven't been freed. This only normally matters for long-running programs such as servers, or frequently used libraries.


Access

The newly allocated memory is accessed via the pointer returned from malloc:

    char *s = malloc(4);
    s[0] = 'c';
    strcpy(s, "cat");


The heap

Here's the heap after some malloc and free calls.

The heap never shrinks, but gaps appear after free.

malloc searches for the best gap, free merges gaps, and both use a header of about 8 bytes, not shown, at the start of allocations and gaps to keep track of them.

So, they can be a bit expensive, but there are further details which reduce the cost.


What's wrong?

Why is this not a good thing to do?

    char *s = malloc(4);
    s = "cat";


Answer

The question is: why is this not a good thing to do?

    char *s = malloc(4);
    s = "cat";

The pointer s is updated to point to the constant string, so it no longer points to the allocated memory.

The allocated memory remains allocated but unused, i.e. wasted, for the rest of the program and, if leak detection is on, will be reported as a memory leak at the end.


Allocating an array

Suppose you want an array of 10 integers:

    int *ns = malloc(10 * sizeof(int));

Don't forget to multiply by the size of the things you are allocating.


calloc

There is an alternative function calloc.

    int *ns = calloc(10, sizeof(int));

One difference is trivial (comma instead of *).

The other is that the memory is cleared (set to zero).

Some textbooks and tutorials use calloc all the time, but (a) clearing the memory is inefficient if you are about to initialise it yourself, and (b) it might give you the mistaken idea that variables in C are always cleared.


Reallocation

How do you change the capacity of an array?

    char *array = malloc(8);
    int capacity = 8;
    ...
    capacity = capacity * 3 / 2;
    array = realloc(array, capacity);

The realloc function allocates a new array, copies the old array into the start of the new array, and deallocates the old array!

The pointer changes, so array needs to be updated.


Reallocation efficiency

The realloc function sounds costly (searching for a new gap and copying the old array into it).

But there are two circumstances where it is cheap.

If the old array is at the end of the heap, realloc can just make it bigger without moving it.

If the array is large, realloc uses a separate virtual memory segment for it, to avoid any further copying costs.


Strategy

Suppose you increase arrays in size (using realloc) when they run out of space. What size should they start at, and how much should their sizes be increased by?

You should start small so lots of empty arrays aren't memory-inefficient. (Don't use less than 24 bytes: it is the minimum malloc gives out.)

And multiply the size so that copying large arrays isn't time-inefficient. (Multiply by 1.5 because multiplying by 2 may mean merged old arrays aren't enough to store new ones.)
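To get a feel for why multiplying keeps the copying cost down, here is a small sketch that counts how many resizes are needed to reach a given size. The function name resizesNeeded is an assumption for this example; it uses the same integer arithmetic as capacity * 3 / 2.

```c
#include <assert.h>

// Count how many times a capacity must be multiplied by 3/2 (in integer
// arithmetic) before it is at least the target size.
int resizesNeeded(int capacity, int target) {
    assert(capacity >= 2);             // so that each step really grows
    int count = 0;
    while (capacity < target) {
        capacity = capacity * 3 / 2;
        count++;
    }
    return count;
}
```

Starting at 24 bytes, four resizes (36, 54, 81, 121) are enough to pass 100 bytes, and starting at 8, twelve resizes pass 1000. The number of resizes grows only with the logarithm of the final size.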

Structures

Before, we did this:

    struct car ...;
    typedef struct car car;

    int main() {
        car racerdata = { 41, 37 };
        car *racer = &racerdata;
        ...
    }

But there are problems if we don't know in advance how many cars we are going to want.

Allocating structures

Instead we can now do this.

    car *newCar(int x0, int y0) {
        car *c = malloc(sizeof(car));
        c->x = x0;
        c->y = y0;
        return c;
    }

    int main() {
        car *racer = newCar(41, 37);
        ...
    }

Initialising structures

To initialise more compactly, we can do this:

    car *newCar(int x0, int y0) {
        car *c = malloc(sizeof(struct car));
        *c = (car) {x0, y0};
        return c;
    }

Or this:

    ...
    *c = (car) {.x = x0, .y = y0};
    ...

Visualisation

Visualising the memory during newCar:

Word counting

Before, we did this:

    struct word { char s[10]; int count; };
    typedef struct word word;

The problem is that words have different lengths.

Flexible array fields

Now, we can do this:

    struct word { int count; char s[]; };
    typedef struct word word;

The array field must go last in the structure, with no length specified. Then it can have a variable length (stretching past the notional end of the structure).

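Allocation

Here's how to allocate a flexible array:

    #include <stdlib.h>
    #include <string.h>

    struct word { int count; char s[]; };
    typedef struct word word;

    // Allocate a word holding a copy of s, with a count of zero.
    word *newWord(char *s) {
        int n = strlen(s) + 1;
        word *w = malloc(sizeof(word) + n);
        strcpy(w->s, s);
        w->count = 0;
        return w;
    }

You allocate memory for the structure plus the array.

Note that this is a relatively recent C feature (flexible array fields were standardised in C99).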

Lines

Suppose a program reads in a line of text.

We might guess that this would be enough:

    char line[1000];

But if a user feeds a line into our program which has been generated from some other program, this is probably not enough!

We've already seen that we can use realloc to increase the size of an array.

Flexible array field?

So maybe we could write this:

    struct line { int size; char s[]; };
    typedef struct line line;

    // Resize to make room for at least n characters
    line *resize(line *l, int n) { ... }

But the pointer to the structure changes on resize, so this would have to be called with:

    l = resize(l, n);

It is incredibly easy to forget the l = bit.

Pointer field

The normal solution is to write this:

    struct line { int size; char *s; };
    ...
    void resize(line *l, int n) { ... }

Now there are two lumps of memory and two pointers.

The structure pointer allows functions to update the fields in place. The array pointer makes sure that the structure never moves; only the pointer field inside it changes.
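One possible way to fill in the resize function is sketched below. The helpers newLine and freeLine, the starting size of 24 and the growth factor of 1.5 are assumptions made for this example, and error handling is reduced to an assert.

```c
#include <assert.h>
#include <stdlib.h>

struct line { int size; char *s; };
typedef struct line line;

// Create an empty line with a small initial array (24 is an assumption).
line *newLine(void) {
    line *l = malloc(sizeof(line));
    if (l == NULL) return NULL;
    l->size = 24;
    l->s = malloc(l->size);
    if (l->s == NULL) { free(l); return NULL; }
    l->s[0] = '\0';
    return l;
}

// Resize to make room for at least n characters. Only the field l->s
// changes; the structure that l points to stays where it is.
void resize(line *l, int n) {
    int size = l->size;
    while (size < n) size = size * 3 / 2;
    if (size == l->size) return;       // already big enough
    char *s = realloc(l->s, size);
    assert(s != NULL);                 // a sketch: real code might report this
    l->s = s;
    l->size = size;
}

// Free the array first, then the structure that points to it.
void freeLine(line *l) {
    free(l->s);
    free(l);
}
```

A caller writes resize(l, n) with no assignment, because l itself never changes.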

Overview

The rest of this section gives a very brief overview of a number of other details about memory and the way it is used.

It is rare that this level of detail is needed in ordinary everyday programming, so it can be skipped reasonably safely.

But some of it is relevant to complex or low-level programs.

Binary compatibility

Many other languages are compiled to have 'binary compatibility' with C.

That means they use the same conventions about code, heap, stack, and function calls, either for the whole language, or at least for the operating system service calls and cross-language calls.

Stack details

The compiler uses the stack memory for each function call to store:

- local variables, including arguments
- the return address in the code
- saved register contents from outer calls
- intermediate calculations that don't fit in registers

The result is that the exact layout of the stack is very much dependent on architecture and compiler choices, and can't easily be analysed by hand (hence the -g option and gdb for debugging).

Arguments

Conventionally, arguments belong to the caller (calling function) rather than the callee (called function).

This allows variable-argument functions like printf.

In retrospect, this was a bad design choice: it is illogical, and it prevents simple tail-call optimisations.

It would have been better to generate special-case code for (fairly rare) variable-argument calls.

But the issue isn't as simple as described here!
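For reference, a variable-argument function can be written with the standard stdarg.h macros. The sketch below uses a made-up function called sum. The callee has no way of finding out how many arguments were passed, so the caller says so explicitly, much as printf relies on its format string.

```c
#include <assert.h>
#include <stdarg.h>

// Add up 'count' int arguments passed after the count itself.
int sum(int count, ...) {
    assert(count >= 0);
    va_list args;
    va_start(args, count);             // start after the last named argument
    int total = 0;
    for (int i = 0; i < count; i++) total += va_arg(args, int);
    va_end(args);
    return total;
}
```

For example, sum(3, 1, 2, 3) adds up the three arguments after the count.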

Two improvements

There are two common ways to improve programs where dynamic allocation efficiency is an issue.

One is to use the glib library, which contains improved versions of malloc and free.

Another is to allocate memory in large lumps, and implement a custom system for efficient high-turnover, small-object allocation within the lumps.

Memory leaks

Since calling free is up to the programmer, even a correct program may gradually use up more and more memory unnecessarily.

That's called a memory leak, and is an important potential flaw in long-running programs such as servers.

Counter-measures are to use a library which deallocates automatically via garbage collection, or to use a library which detects leaks so they can be fixed, or to use the -fsanitize=address compiler option.

Relocation

A program can be compiled into code which expects to be loaded at a particular location in memory.

Alternatively, a program can be compiled into code plus extra information about the location-sensitive parts.

The extra info allows the program to be relocated, i.e. loaded into different locations on different runs.

Position independent code

A scheme which is much more elegant and flexible is for compiled machine code to be independent of where it is loaded in memory.

Then relocation issues are avoided.

It involves having an instruction set where jumps and calls are relative to the current location rather than absolute (e.g. "call the function 100 bytes further on from here").

Despite its clear superiority, this hasn't become normal.

Linking

Even with position independent code, linking is necessary.

This involves sorting out function calls (and other references) from one program component to another, e.g. calls to library functions.

Static linking

With static linking, the parts of the library which are actually used by the program are copied into the program by the compiler.

That way, the compiler can relocate the library code in advance, sort out all the function calls and other references between parts, and create a complete program which is ready to run.

Dynamic linking

With dynamic linking, the library code is potentially shared between programs to save memory space.

The compiler needs to know, somehow, where to expect the library to be in memory when the program runs.

Then the program and the library are linked by the system when the program is loaded and prepared for execution.

Shared libraries are called DLLs in Windows, and SOs on Linux/macOS.

The DLL approach

The approach taken by Windows is:

Compile a DLL library into code which is always at a fixed place in virtual memory.

Then compile each program into fixed code which refers to the library code at its known location.

When loading each program, arrange its virtual memory so that the virtual library location refers to the actual physical library location.

A DLL problem

The DLL approach has a fundamental problem:

What happens if two independent DLL libraries have been compiled into the same place in virtual memory, and a program wants to use both?

The solution in Windows is (a) have a central authority for 'official' libraries which allocates locations and (b) if that fails, abandon sharing and copy one of the libraries into the program.

For further problems, look up "DLL hell" online!

The SO approach

The "Shared Object" approach in Unix-based systems is:

Compile a program which uses an SO to retain relocation information about the library references.

When loading the program, find the library location and complete the linking of the program by resolving the library references.

SO problems

The SO approach still has administrative problems:

The compiler needs to know where the SO library file is, to find out what functions it makes available.

The loader needs to know which SO the program needs, and where the SO library file is in case it needs to be loaded into memory for the first time.

There are considerable potential problems with installation locations on disk, library versions, where to put the location information, and discrepancies between compile-time and load-time information.

Shared library design

Shared libraries often have a monolithic design, making them unsuitable for static linking (because the whole library gets copied into the program).

The libraries are typically very big - programs only load quickly because the platform-specific libraries they use are already loaded into memory.

If you port a program to another platform, it typically takes 20 seconds to load, because the shared libraries have to be loaded as well.

So true cross-platform programming is very difficult.

The future

Nobody knows the future, but these would be good:

- scrap virtual memory
- make all machine code position independent
- make data position independent (relative pointers)
- make pointers variable-sized
- ensure all machine code is validated in advance
- make all memory 'object oriented', even the stack
- provide hardware support for garbage collection
- make all platforms compatible

Section: Array Lists

A list is like an array, except that the length changes.

Items are added to a list over time, and you don't know in advance how many there will be.

This chapter implements lists in two ways in C.

The same ideas apply to lists in almost every other programming language.

Two approaches

There are two approaches that you can take:

- array lists
- linked lists

Array lists are better when the most common operations are to add or remove at one end.

Linked lists can be better when frequently inserting and deleting in the middle.

Library approach

We will start with array lists, and implement them as if we were creating a library. That means creating functions which are reasonably reusable, and which are easy to use rather than easy to write.

On the other hand the program we are writing doesn't have to do anything other than test the functions.

The first step is to decide what items are going to be stored in the lists.

Generics

Arrays in C are generic. In other words, there can be an array of anything, e.g. char cs[], int ns[], double ds[], and so on.

Unfortunately, C makes it nearly impossible to create generic lists, so that there can be lists of anything.

So our target will be lists of int, but written so that the functions can reasonably easily be copied and reused to make lists of some other type.

Items

Let's make a type synonym:

    typedef int item;

Now all the functions can be written as acting on lists of items. For any other simple type, this one typedef can be changed, without changing the functions.

The next step is to decide what we want the functions to look like when they are used.

Calls

A program which uses the lists might look like this:

    ...
    list *ns = newList();
    add(ns, 3);
    add(ns, 5);
    add(ns, 41);
    get(ns, 2);
    set(ns, 2, 42);
    for (int i = 0; i < length(ns); i++) ...
    ...

A list starts empty, it can have items added to it indefinitely, and there are functions length, get and set similar to an array.

Testing

As well as the list functions, we need to think about how testing is going to work.

A test should check that the result of an operation is a self-consistent list containing given items, like this:

    assert(check(ns, 3, (int[]){1, 2, 3}));

This checks that ns is a list with length three containing 1, 2, 3. The raw notation {1, 2, 3} can only be used in declarations, but if the type int[] is made explicit with a 'cast', it can be used elsewhere. (The result is known in C as a compound literal.)
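The notation can be tried out separately from the list functions. In this sketch, total and demo are made-up names. The same array is written once as an initialiser in a declaration and once as an explicitly typed literal passed straight to a function.

```c
#include <assert.h>

// Add up the n items of an array.
int total(int n, int xs[n]) {
    int sum = 0;
    for (int i = 0; i < n; i++) sum += xs[i];
    return sum;
}

// Pass the same three items in two different ways.
int demo(void) {
    int declared[] = {1, 2, 3};              // raw notation in a declaration
    int a = total(3, declared);
    int b = total(3, (int[]){1, 2, 3});      // explicit type, used as an argument
    assert(a == b);
    return b;
}
```

The literal array lives at least until the end of the enclosing block, so it is safe to pass it to a function in this way.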

Prototypes

We can now write prototypes of all the functions:

    list *newList();
    int length(list *xs);
    void add(list *xs, item x);
    item get(list *xs, int i);
    void set(list *xs, int i, item x);
    bool check(list *xs, int n, item ys[n]);

First attempt

We could use a flexible array, with a length variable to say how full it is:

    int length = 0, capacity = 4;
    item *items = malloc(capacity * sizeof(item));
    ...
    if (length >= capacity) {
        capacity = capacity * 3 / 2;
        items = realloc(items, capacity * sizeof(item));
    }
    ...

Add function

The add function seems to need to be like this:

    void add(int length, int capacity, item *items, item x) { ...

We need to pass the length and capacity as well as the array and new item. But that's not what the add function is supposed to look like.

Failure

It would be tiresome to pass the three variables around separately. We want calls to look like add(xs, x).

And the function wouldn't work anyway, because it can't update the caller's length variable, or the caller's capacity variable.

And although it can update items in the caller's items array, it can't update the items variable itself, in case the array needs to be moved by calling realloc.
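The failure can be seen in a small experiment. The names brokenAdd and lengthAfterBrokenAdd are hypothetical, and the sketch avoids the realloc case altogether by insisting that there is spare room.

```c
#include <assert.h>

typedef int item;

// A deliberately broken add. The parameters length and capacity are copies
// of the caller's variables, so changing them here has no lasting effect.
void brokenAdd(int length, int capacity, item *items, item x) {
    assert(length < capacity);   // this sketch never needs to grow the array
    items[length] = x;           // visible to the caller: it is the same array
    length++;                    // lost: only the local copy changes
}

// Call brokenAdd once and report the caller's length afterwards.
int lengthAfterBrokenAdd(void) {
    int length = 0, capacity = 4;
    item items[4];
    brokenAdd(length, capacity, items, 42);
    assert(items[0] == 42);      // the item did arrive in the array
    return length;               // but the length is still 0
}
```

The item is stored, but the caller's length is unchanged, so the next call would overwrite it.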

Second attempt

We could pass in all three variables in one go in a structure, and return the updated structure:

    struct list { int length, capacity; item *items; };
    typedef struct list list;

    list add(list xs, item x) { ... }
    ...
    xs = add(xs, x);

Poor attempt

The second attempt does work, but has two flaws.

- it is too easy to make the mistake of writing add(xs,x) instead of xs = add(xs,x)
- if a list function needs to return some other result, as well as the updated list, we are stuck

It is important to be able to write add(xs, x), not xs = add(xs, x).

Third attempt

How do we achieve really simple function calls like add(xs,x)?

Answer: pass the list structure by pointer:

    struct list { int length, capacity; item *items; };
    typedef struct list list;

    void add(list *xs, item x) { ... }

This treats a list as an 'object'.

Picture

A good picture of the situation may help:

    struct list { int length, capacity; item *items; };
    typedef struct list list;
    ...
    list *xs;

The xs pointer points to a fixed structure with three fields, one of which is a pointer to a variable array.

Pointer purposes

The pointers in the picture have two different purposes.

The first, xs, allows functions to update the list structure in place. The structure never moves, so the pointer never changes, so it never needs to be updated, so functions never need to return an updated pointer.

The second pointer, xs->items, allows the array to be moved and resized. Only one pointer xs->items needs to be updated.

The newList function

Here's a function to make a new list:

    // Make a new empty list
    list *newList() {
        list *xs = malloc(sizeof(list));
        item *items = malloc(6 * sizeof(item));
        *xs = (list) { 0, 6, items };
        return xs;
    }

How should this be tested?

Test strategy

It is really easy to make mistakes with pointers or with array bounds.

So it is vital to compile with the -fsanitize=undefined and -fsanitize=address options to catch those mistakes as quickly as possible.

Then the next task is to write the check function. What do you think it should look like?

The check function

Here's the check function:

    // Check that a list matches a given array.
    bool check(list *xs, int n, item ys[n]) {
        bool ok = true;
        if (xs->length != n) ok = false;
        if (xs->capacity < n) ok = false;
        for (int i = 0; i < n; i++) {
            if (xs->items[i] != ys[i]) ok = false;
        }
        return ok;
    }

This checks everything it can. It can't check that pointers are valid, but that's checked by the compiler options.

The program

Now we can wrap the functions in a program:

    #include <stdio.h>
    #include <stdbool.h>
    #include <stdlib.h>
    #include <assert.h>

    typedef int item;
    struct list { int length, capacity; item *items; };
    typedef struct list list;

    ... newList ... check ...

    int main() {
        assert(check(newList(), 0, NULL));
    }

Testing

What happens when we test the program?

It compiles, but the -fsanitize=address option reports a memory leak, because we have allocated a list and not freed it.

This change to main fixes the problem:

    int main() {
        list *ns = newList();
        assert(check(ns, 0, NULL));
        free(ns->items);
        free(ns);
    }

Missing function

The memory leak incident suggests that we need another function.

With a real library, a programmer would not be able to get at ns->items to free it.

So a freeList function is needed:

void freeList(list *xs);

What does it look like?

The freeList function

Here's the freeList function:

    // Free the memory for a list.
    void freeList(list *xs) {
        free(xs->items);
        free(xs);
    }

The two calls can't be in the opposite order, because it is illegal to use xs after it has been freed. In general, structures need to be freed from the bottom upwards.

The next task is to write the length function.

The length function

Here's the length function:

    // Find the length of a list
    int length(list *ns) {
        return ns->length;
    }

It doesn't seem necessary to add any testing for it.

The next task is to prepare for writing the add function.

An expand function

Here's a function to expand a list, to do part of the job of the add function:

    // Make a list bigger
    static void expand(list *ns) {
        ns->capacity = ns->capacity * 3 / 2;
        ns->items = realloc(
            ns->items,
            ns->capacity * sizeof(item)
        );
    }

This is hidden from the library user by declaring it static. It is only to be called by list functions.

Continuation

    ns->items = realloc(
        ns->items,
        ns->capacity * sizeof(item)
    );

This is a single function call which is a bit long to fit on one line. The convention used in this tutorial (not common in C, but borrowed from Python) is that splitting a statement over several lines is signalled by round brackets in a similar way to curly bracket blocks.

The add function

Here is the add function:

    // Add an item to a list
    void add(list *ns, item n) {
        if (ns->length >= ns->capacity) expand(ns);
        ns->items[ns->length] = n;
        ns->length++;
    }

It is time to add more tests.

Test add

A reasonable way to test the add function is:

    int main() {
        list *ns = newList();
        assert(check(ns, 0, NULL));
        add(ns, 40);
        assert(check(ns, 1, (int[]) {40}));
        add(ns, 41);
        assert(check(ns, 2, (int[]) {40, 41}));
        add(ns, 42);
        assert(check(ns, 3, (int[]) {40, 41, 42}));
        freeList(ns);
        return 0;
    }

This tests newList, so it replaces the previous testing.

Independence

Some purists would argue that tests should be independent and isolated from each other.

In other words, each test should involve building a list from scratch, then applying an operation and checking the result.

It is an important principle, but it is not an absolute rule. There are also gains to be had from checking sequences of operations on the same structure, and it is certainly simpler in this case.

Next we need to prepare for get and set.

A fail function

Here's a function to report an error:

    // Report a list error
    static void fail(char *message) {
        fprintf(stderr, "List failure: %s\n", message);
        exit(1);
    }

This is not for general use. It is only for the get and set functions to call, so it is declared static.

Get and set

Here are the get and set functions:

    // Get a list item, like ns[i]
    item get(list *ns, int i) {
        if (i < 0 || i >= ns->length) fail("get");
        return ns->items[i];
    }

    // Set a list item, like ns[i] = n
    void set(list *ns, int i, item n) {
        if (i < 0 || i >= ns->length) fail("set");
        ns->items[i] = n;
    }

Tests

Tests for these can be added near the end of main:

    int main() {
        ...
        set(ns, 2, 84);
        assert(check(ns, 3, (int[]) {40, 41, 84}));
        assert(get(ns, 2) == 84);
        freeList(ns);
    }

Is there anything else that needs to be tested?

list.c

Expansion

The main thing that hasn't been tested is what happens when the list gets longer. Does it expand properly? Instead of adding lots more items to the list, it is probably enough to add a direct test of the expand function at the end:

    int main() {
        ...
        expand(ns);
        assert(check(ns, 3, (int[]) {40, 41, 84}));
        freeList(ns);
    }
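A complementary check, not part of the original program, is to add enough items to force several expansions and then confirm that every item survived. The sketch below repeats the list functions so that it is self-contained. The helper testGrowth is a made-up name, and, as in the code above, the results of malloc and realloc are not checked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

typedef int item;
struct list { int length, capacity; item *items; };
typedef struct list list;

// Make a new empty list
list *newList(void) {
    list *xs = malloc(sizeof(list));
    item *items = malloc(6 * sizeof(item));
    *xs = (list) { 0, 6, items };
    return xs;
}

// Free the array first, then the structure
void freeList(list *xs) {
    free(xs->items);
    free(xs);
}

// Make a list bigger
static void expand(list *xs) {
    xs->capacity = xs->capacity * 3 / 2;
    xs->items = realloc(xs->items, xs->capacity * sizeof(item));
}

// Add an item to a list
void add(list *xs, item x) {
    if (xs->length >= xs->capacity) expand(xs);
    xs->items[xs->length] = x;
    xs->length++;
}

// Add the items 0..n-1 to a new list, then check that the length and
// capacity are consistent and that each item is still in place.
bool testGrowth(int n) {
    assert(n >= 0);
    list *xs = newList();
    for (int i = 0; i < n; i++) add(xs, i);
    bool ok = xs->length == n && xs->capacity >= n;
    for (int i = 0; i < n && ok; i++) {
        if (xs->items[i] != i) ok = false;
    }
    freeList(xs);
    return ok;
}
```

Calling testGrowth(1000) from main, with the sanitizer options switched on, exercises expand many times over.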

Section: Linked lists

A problem with array lists is that, to insert or delete an item in the middle, lots of items have to be moved up or down to make space.

Can we find a way of storing a list so that items never have to be moved?

One way is to introduce a pointer to go with each item, pointing to the next item.

Example: primes

A linked list of primes (without 5) might look like this.

Example: insertion

After inserting 5, it might look like this.

Insertion

To insert 5 into the list, these steps are needed:

- find the structures containing 3 and 7
- allocate some space for a new structure
- set the first field to 5
- set the second field to point to the 7 structure
- change 3's pointer to point to the new structure

That's a small fixed number of operations. The list entries end up scattered in memory, but it doesn't matter where they are.
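The insertion steps above can be sketched in C roughly as follows. This is only an illustration: the cell layout and the function name insertAfter are assumptions made for this sketch, and it takes for granted that the structure containing 3 has already been found.

```c
#include <stdlib.h>

// Hypothetical cell type for this sketch: an int plus a next pointer.
struct cell { int x; struct cell *next; };
typedef struct cell cell;

// Insert n immediately after the cell 'before' (assumed non-NULL).
// Returns the new cell, or NULL if allocation fails.
cell *insertAfter(cell *before, int n) {
    cell *c = malloc(sizeof(cell));   // allocate space for a new structure
    if (c == NULL) return NULL;
    c->x = n;                         // set the first field to the new item
    c->next = before->next;           // point it at the following structure
    before->next = c;                 // redirect the preceding structure
    return c;
}
```

Calling insertAfter on the structure containing 3 with n = 5 leaves 3 pointing to 5 and 5 pointing to 7, without moving any existing item.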

Stack

The easy and efficient operations on a linked list are called the stack operations:

- isEmpty: check if there are any items
- push: insert an item at the start of the list
- top: look at the first item (sometimes called peek)
- pop: remove an item from the start of the list

Prototypes

The functions needed for stacks are:

stack *newStack();
void freeStack(stack *xs);
bool isEmpty(stack *xs);
void push(stack *xs, item x);
item top(stack *xs);
item pop(stack *xs);
bool check(stack *xs, int n, item ys[n]);

Cells

This structure holds each item and its next pointer:

struct cell {
    item x;
    struct cell *next;
};
typedef struct cell cell;

Temptation

It is tempting to say that a stack is just a pointer to the first item, or NULL when the stack is empty:

cell *xs = NULL;

The push function adds a new first cell, so the stack variable xs becomes a different pointer. That means we would have to write xs = push(xs, x). The pop function also updates the xs pointer, but it can't easily return that and the first item.
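To make the awkwardness concrete, here is a rough sketch of what the bare-pointer design might look like. The names pushCell and popCell and the out-parameter are assumptions for illustration only; this is not the design used in the rest of the section.

```c
#include <stdlib.h>

typedef int item;
struct cell { item x; struct cell *next; };
typedef struct cell cell;

// Push in the bare-pointer style: the caller must store the result,
// because the first cell of the stack has changed.
cell *pushCell(cell *xs, item x) {
    cell *c = malloc(sizeof(cell));
    if (c == NULL) exit(1);
    *c = (cell) { x, xs };
    return c;
}

// Pop in the same style (stack assumed non-empty). The return value is
// taken up by the new first pointer, so the removed item has to come
// back through an extra out-parameter.
cell *popCell(cell *xs, item *out) {
    cell *rest = xs->next;
    *out = xs->x;
    free(xs);
    return rest;
}
```

Every call site has to remember to write xs = pushCell(xs, x) or xs = popCell(xs, &n); forgetting the assignment leaves xs pointing at the wrong cell or at freed memory.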

So let's have a separate unmoving structure for the stack itself, as with array lists.

Start

The stack program needs to start like this:

#include <stdio.h>
#include <stdbool.h>
#include <stdlib.h>
#include <assert.h>

typedef int item;

struct cell {
    item x;
    struct cell *next;
};
typedef struct cell cell;

struct stack {
    cell *first;
};
typedef struct stack stack;
...

New stack

The function to create a new stack is:

// Create a new empty stack
stack *newStack() {
    stack *xs = malloc(sizeof(stack));
    xs->first = NULL;
    return xs;
}

The check function

Here's the check function:

// Check that a stack matches a given array.
bool check(stack *xs, int n, item ys[n]) {
    bool ok = true;
    cell *c = xs->first;
    for (int i = 0; i < n; i++) {
        if (c == NULL) return false;  // stack is shorter than the array
        if (c->x != ys[i]) ok = false;
        c = c->next;
    }
    if (c != NULL) ok = false;
    return ok;
}

The main function

To finish the initial version of the program, add this:

int main() {
    stack *ns = newStack();
    assert(check(ns, 0, NULL));
    return 0;
}

Again, we are relying on the sanitize compiler options to check things that the check function can't check.

Everything works, except for the memory leak caused by the lack of a freeStack function.

The freeStack function

Here is the freeStack function:

void freeStack(stack *xs) {
    cell *c = xs->first;
    while (c != NULL) {
        cell *next = c->next;
        free(c);
        c = next;
    }
    free(xs);
}

The next pointer has to be extracted from a cell before the cell is freed, because it is illegal to access the cell contents afterwards.

Check stack empty

The function to check if a stack is empty is:

bool isEmpty(stack *xs) {
    return xs->first == NULL;
}

The push function

The function to push an item onto a stack is:

void push(stack *xs, item x) {
    cell *c = malloc(sizeof(cell));
    *c = (cell) { x, xs->first };
    xs->first = c;
}

Here are pictures of pushing 5 onto a stack:

Classic mistake

A very common mistake with pointer handling is to do things in the wrong order:

void push(stack *xs, item x) {
    cell *c = malloc(sizeof(cell));
    xs->first = c;  // BAD
    *c = (cell) { x, xs->first };
}
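The effect of the wrong order can be seen in a small sketch. Once xs->first has been overwritten, the old first pointer is lost, so the new cell's next field ends up pointing at the new cell itself. The names badPush and firstIsSelfLoop are made up for this illustration.

```c
#include <stdlib.h>
#include <stdbool.h>

typedef int item;
struct cell { item x; struct cell *next; };
typedef struct cell cell;
struct stack { cell *first; };
typedef struct stack stack;

// The wrong-order push, kept only to show what it does.
void badPush(stack *xs, item x) {
    cell *c = malloc(sizeof(cell));
    if (c == NULL) exit(1);
    xs->first = c;                  // the old first pointer is lost here
    *c = (cell) { x, xs->first };   // so c->next becomes c itself
}

// Hypothetical helper: does the first cell point back to itself?
bool firstIsSelfLoop(stack *xs) {
    return xs->first != NULL && xs->first->next == xs->first;
}
```

After a single badPush the list is a one-cell cycle: any cells that were on the stack before are unreachable, and a traversal such as the one in freeStack would never reach NULL.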

Testing

To test push, main becomes:

int main() {
    stack *ns = newStack();
    assert(check(ns, 0, NULL));
    push(ns, 40);
    assert(check(ns, 1, (int[]) {40}));
    push(ns, 41);
    assert(check(ns, 2, (int[]) {41, 40}));
    push(ns, 42);
    assert(check(ns, 3, (int[]) {42, 41, 40}));
    freeStack(ns);
    return 0;
}

The fail function

Here's a function to call if something goes wrong:

void fail(char *message) {
    fprintf(stderr, "Stack failure: %s\n", message);
    exit(1);
}

The function prints to stderr, and stops the program with an error code (as if returning 1 from main) to play nicely with any scripts that include the program.

The top function

The function to look at the top item is:

item top(stack *xs) {
    if (xs->first == NULL) fail("top of empty");
    return xs->first->x;
}

If the caller tries to get the top item from an empty stack, the fail function is called, to make sure the program doesn't do anything terrible.

The pop function

The function to remove the top item is:

item pop(stack *xs) {
    cell *c = xs->first;
    if (c == NULL) fail("pop of empty");
    xs->first = c->next;
    item x = c->x;
    free(c);
    return x;
}

This has to be written incredibly carefully, saving the first cell in a variable before removing it from the list, and extracting its fields before freeing up its space.

Visualising pop

The main steps in pop are:

stack.c

Testing

To test top and pop, add:

int main() {
    ...
    assert(top(ns) == 42);
    assert(check(ns, 3, (int[]) {42, 41, 40}));
    assert(pop(ns) == 42);
    assert(check(ns, 2, (int[]) {41, 40}));
    freeStack(ns);
    return 0;
}

Structure lists

To store structures instead of ints, you could include the next field in the structure, e.g.

struct cell {
    char *name;
    int number;
    struct cell *next;
};

The next field can be ignored everywhere except in the list functions.

Although this is common, it doesn't allow an item to be stored in more than one list.

Object lists

A more flexible approach is to store objects, i.e. pointers to structures, in lists:

struct cell {
    struct entry *item;
    struct cell *next;
};

This has an extra layer of pointers, but now an object can appear in any number of lists, and updates to objects are shared by all occurrences.
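A small sketch may help show the sharing. The fields of struct entry and the helper addFront are assumptions made for this example; the text only names the entry type.

```c
#include <stdlib.h>

// Hypothetical entry type; only its name appears in the text.
struct entry { const char *name; int number; };
struct cell { struct entry *item; struct cell *next; };

// Add a pointer to e at the front of a list; returns the new first cell.
struct cell *addFront(struct cell *first, struct entry *e) {
    struct cell *c = malloc(sizeof(struct cell));
    if (c == NULL) exit(1);
    c->item = e;      // store the pointer, not a copy of the structure
    c->next = first;
    return c;
}
```

If the same entry is added to two lists and its number field is then changed, both lists see the new value, because each cell holds a pointer to the one shared object.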

Efficiency

There is an efficiency problem with what we have done.

All the stack functions are supposed to be O(1), but they may not be.

That is because of the cost of malloc and free which can, at worst, have O(n) behaviour.

When complexities are discussed, they usually exclude malloc and free costs.

Free list

To overcome the problem, it is common for a list structure to contain a free list, i.e. a list (stack) of cells which are currently unused but are free to be re-used.

struct list {
    struct cell *first;
    struct cell *free;
};

You put cells on the free list instead of calling free.

And when you want a new cell, you get it from the free list if possible, and only allocate a new one if the free list is empty.
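As a rough sketch of how push and pop might use such a free list (the helper name newCell is an assumption, and error handling is kept minimal):

```c
#include <stdlib.h>

typedef int item;
struct cell { item x; struct cell *next; };
struct list { struct cell *first; struct cell *free; };

// Get a cell: re-use one from the free list if possible, else allocate.
static struct cell *newCell(struct list *xs) {
    struct cell *c = xs->free;
    if (c != NULL) xs->free = c->next;
    else c = malloc(sizeof(struct cell));
    if (c == NULL) exit(1);
    return c;
}

// Push an item onto the front of the list.
void push(struct list *xs, item x) {
    struct cell *c = newCell(xs);
    c->x = x;
    c->next = xs->first;
    xs->first = c;
}

// Pop the first item (list assumed non-empty) and recycle its cell.
item pop(struct list *xs) {
    struct cell *c = xs->first;
    xs->first = c->next;
    item x = c->x;
    c->next = xs->free;   // goes onto the free list instead of free(c)
    xs->free = c;
    return x;
}
```

With this arrangement, a function that frees the whole list would need to release the cells on both the main list and the free list.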

Modules

Once you have built a good implementation of stacks, it is natural to re-use it in other programs.

To do that, you put the stack functions into a separate module.

And you make sure that programs cannot access the cells being used, and in fact cannot tell how the stack is being implemented - it is just a service, and a robust one.
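One common way to hide the implementation in C is an incomplete ("opaque") structure type. The sketch below shows the idea in a single listing; the split into stack.h and stack.c is an assumption, marked by comments, and in a real module the two parts would live in separate files.

```c
#include <stdbool.h>
#include <stdlib.h>

/* ---- stack.h (hypothetical): everything a client program can see ---- */
typedef int item;
typedef struct stack stack;   // incomplete type: the fields stay hidden
stack *newStack(void);
void freeStack(stack *xs);
bool isEmpty(stack *xs);
void push(stack *xs, item x);
item pop(stack *xs);

/* ---- stack.c (hypothetical): the only place that knows about cells ---- */
struct cell { item x; struct cell *next; };
struct stack { struct cell *first; };

stack *newStack(void) {
    stack *xs = malloc(sizeof(stack));
    if (xs == NULL) exit(1);
    xs->first = NULL;
    return xs;
}

bool isEmpty(stack *xs) { return xs->first == NULL; }

void push(stack *xs, item x) {
    struct cell *c = malloc(sizeof(struct cell));
    if (c == NULL) exit(1);
    c->x = x;
    c->next = xs->first;
    xs->first = c;
}

item pop(stack *xs) {
    struct cell *c = xs->first;
    if (c == NULL) exit(1);   // popping an empty stack is a caller error
    xs->first = c->next;
    item x = c->x;
    free(c);
    return x;
}

void freeStack(stack *xs) {
    while (! isEmpty(xs)) pop(xs);
    free(xs);
}
```

A program that includes only the header part can hold a stack pointer and call the functions, but it cannot mention cells or the first field, so the implementation could later be swapped for an array without changing client code.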

List variations

There are many variations on linked lists, such as:

- keep track of the last cell in the list structure, to allow adding at the end
- keep track of the length of the list
- keep track of a current position within the list, to allow traversal and insertion in the middle
- have a previous pointer as well as a next pointer in each cell, to make deletions easier
- have dummy cells which go before the first one and after the last, to simplify the code by getting rid of NULL tests
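As an illustration of the first two variations, here is a minimal sketch of a list structure that tracks its last cell and its length. The structure layout and the name addLast are assumptions for this example.

```c
#include <stdlib.h>

typedef int item;
struct cell { item x; struct cell *next; };

// Hypothetical list structure combining two of the variations:
// a pointer to the last cell, and a stored length.
struct list { struct cell *first; struct cell *last; int length; };

// Add an item at the end in a fixed number of steps.
void addLast(struct list *xs, item x) {
    struct cell *c = malloc(sizeof(struct cell));
    if (c == NULL) exit(1);
    c->x = x;
    c->next = NULL;
    if (xs->last == NULL) xs->first = c;   // the list was empty
    else xs->last->next = c;               // link after the old last cell
    xs->last = c;
    xs->length++;
}
```

Without the last pointer, adding at the end would mean walking the whole list first. The price is that every function that removes cells must also keep last and length up to date.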

The real truth

For implementing general lists, and stacks, an array list is almost always better than a linked list.

Linked lists use a lot of memory, you can't index them efficiently, the efficiency of insertion and deletion in the middle is offset by the cost of finding or keeping track of the right place in the list to make the changes, and they are difficult to encapsulate well.

Most library implementations of linked lists would be better if they were replaced by an array implementation (e.g. a circular gap buffer).

But...

But the idea of linked lists comes up a lot in computing.

They are used in operating systems and other low-level software, they often generate design ideas, they are often used as part of something else, e.g. hash tables, and variations are often used, e.g. a linked list simulated in an array.
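A linked list simulated in an array might look roughly like the sketch below, where the next field is an index into the array rather than a pointer. The fixed capacity, the names, and the use of -1 in place of NULL are all assumptions made for this illustration.

```c
enum { CAPACITY = 8, NONE = -1 };

// Hypothetical layout: cells live in arrays, and 'next' holds an index
// into the same arrays instead of a pointer. NONE plays the role of NULL.
struct alist {
    int items[CAPACITY];
    int next[CAPACITY];
    int first;   // index of the first cell, or NONE
    int used;    // number of slots handed out so far
};

void initList(struct alist *xs) {
    xs->first = NONE;
    xs->used = 0;
}

// Push at the front; returns 0 on success or -1 if the array is full.
int pushFront(struct alist *xs, int n) {
    if (xs->used >= CAPACITY) return -1;
    int i = xs->used++;
    xs->items[i] = n;
    xs->next[i] = xs->first;
    xs->first = i;
    return 0;
}

// Sum the items by following the index links from the first cell.
int sum(const struct alist *xs) {
    int total = 0;
    for (int i = xs->first; i != NONE; i = xs->next[i]) {
        total += xs->items[i];
    }
    return total;
}
```

Because the links are indexes, no malloc calls are needed per cell and the cells stay close together in memory.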
