CS 213 Fall 1998 Namespaces Inline functions Preprocessing Compiling Name mangling Linking


CS 213

Fall 1998

Namespaces

Inline functions

Preprocessing

Compiling

Name mangling

Linking


Namespaces

Suppose we purchase a library of cartographic tools from the Acme Geographic company. These tools are written as a collection of C++ classes:

map, legend, road, city, ...

Also, say we want to use container classes from STL:

string, list, vector, map, ...

There’s a problem: when we type “map”, does this refer to Acme Geographic’s map class, or STL’s map class?


To deal with this, Acme Geographic and STL can declare their classes, functions, and variables in separate namespaces:

// STL header file:
namespace std {
    class string {...};
    template<...> class vector {...};
    template<...> class list {...};
    template<...> class map {...};
    ...
}

// Acme Geographic header file:
namespace AcmeGeo {
    class map {...};
    class legend {...};
    ...
}


To use classes, functions, or variables declared in a namespace, a program gives the namespace name followed by a :: and the name of the class, function or variable:

AcmeGeo::map ithacaStreetMap;

std::map<std::string, int> ithacaPhoneNumbers;
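The qualified names above can be tried out with a minimal sketch like the following. The body of AcmeGeo::map (its title member) and the function lookupPhoneNumber are invented here purely for illustration; the real vendor class is not shown in these notes.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the vendor library; the members are assumptions.
namespace AcmeGeo {
    class map {
    public:
        explicit map(const std::string& title) : title_(title) {}
        const std::string& title() const { return title_; }
    private:
        std::string title_;
    };
}

// Both kinds of "map" coexist because each use names its namespace.
int lookupPhoneNumber() {
    AcmeGeo::map ithacaStreetMap("Ithaca streets");
    std::map<std::string, int> ithacaPhoneNumbers;
    ithacaPhoneNumbers["library"] = 5551234;  // made-up number
    assert(ithacaStreetMap.title() == "Ithaca streets");
    return ithacaPhoneNumbers["library"];
}
```

Calling lookupPhoneNumber() from main() exercises both classes without any name clash.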


Names can be added to a namespace in many different places. For instance, STL names are spread over many different header files:

// “string” header file:

namespace std {

class string {...};

}

// “list” header file:

namespace std {

template<...> class list {...};

}

// “map” header file:

namespace std {

template<...> class map {...};

}
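User code may not add names to std, so the sketch below reopens a hypothetical AcmeGeo namespace instead. The two blocks stand in for two separate header files; the struct members and totalLanes are assumptions made for the example.

```cpp
#include <cassert>

// First "header": opens the namespace and declares one name.
namespace AcmeGeo {
    struct road { int lanes; };
}

// Second "header": reopens the same namespace and adds another name.
namespace AcmeGeo {
    struct city { int population; };
}

// Both names are now members of the single namespace AcmeGeo.
int totalLanes() {
    AcmeGeo::road a = {2};
    AcmeGeo::road b = {4};
    AcmeGeo::city ithaca = {30000};  // made-up figure
    assert(ithaca.population > 0);
    return a.lanes + b.lanes;
}
```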


A program can import a name into a scope with a using declaration:

#include <string>
#include <map>
#include "AcmeGeo.h"

int main() {
    using std::string;
    using AcmeGeo::map;

    map ithacaStreetMap;
    std::map<string, int> ithacaPhoneNumbers;
    ...
}

Within main(), “string” refers to std::string, and “map” refers to AcmeGeo::map.


It is also possible to import all the names from an entire namespace at once, with a using directive:

int main() {
    using std::string;
    using namespace AcmeGeo;

    map ithacaStreetMap;
    std::map<string, int> ithacaPhoneNumbers;
    ...
}
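The difference between the two forms can be sketched as follows. OtherGeo and the describe functions are hypothetical names introduced only for this example. A using declaration imports exactly one name, while a using directive makes every name in the namespace visible.

```cpp
#include <cassert>
#include <string>

// Two hypothetical libraries that happen to pick the same function name.
namespace AcmeGeo  { inline std::string describe() { return "AcmeGeo"; } }
namespace OtherGeo { inline std::string describe() { return "OtherGeo"; } }

std::string viaDeclaration() {
    using AcmeGeo::describe;   // using declaration: imports this one name
    return describe();
}

std::string viaDirective() {
    using namespace OtherGeo;  // using directive: all OtherGeo names visible
    return describe();
}

// A function containing using directives for BOTH namespaces could not call
// an unqualified describe(): the compiler would report the call as ambiguous.
```

Because each using statement sits inside a function body, its effect ends at the closing brace of that function.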


Namespaces are a fairly new feature in C++. Not all compilers implement them correctly yet. In addition, a good deal of C++ code was written before namespaces were part of the language:

// Old-style C++ code
#include <iostream.h>

int main() {
    cout << "hello" << endl;
}

// New, namespace-savvy C++ code:
#include <iostream>

int main() {
    std::cout << "hello" << std::endl;
}

On compilers that still supply iostream.h, the old code typically continues to work because iostream.h contains a using directive that imports all of std into the program (iostream does not contain this using directive).


In other words,

#include <iostream.h>

is equivalent to

#include <iostream>

using namespace std;

Officially, the old .h standard library header files (iostream.h, stdlib.h, etc.) are deprecated. The new header files drop the .h from the older C++ header files, while the even older header files inherited from C drop the .h and add a ‘c’ prefix:

iostream.h -> iostream

fstream.h -> fstream

stdlib.h -> cstdlib

ctype.h -> cctype
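As a small sketch of the new-style headers in use, the functions below call std::toupper from cctype and std::abs from cstdlib through the std namespace. The helper names shout and gap are invented for the example.

```cpp
#include <cassert>   // formerly assert.h
#include <cctype>    // formerly ctype.h
#include <cstdlib>   // formerly stdlib.h
#include <string>

// Upper-case a string using std::toupper from <cctype>.
std::string shout(const std::string& s) {
    std::string out;
    for (std::string::size_type i = 0; i < s.size(); ++i) {
        // The cast to unsigned char avoids passing a negative value.
        out += static_cast<char>(std::toupper(static_cast<unsigned char>(s[i])));
    }
    return out;
}

// Distance between two integers using std::abs from <cstdlib>.
int gap(int a, int b) {
    return std::abs(a - b);
}
```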


Inline Functions

To avoid the cost of a function call, functions may be defined to be inline:

inline double square(double x) {return x * x;}

An inline function’s code is replicated at each point in the program where it is called. This can enable many optimizations. Consider the following:

double f(double y) {return y * square(3);}

Because square is declared inline, most compilers will optimize y * square(3) to y * 9 at compile-time.


Watch out, though: only small functions should be declared “inline”. If you make too many big functions inline, the size of your executable code will grow. Because large executables lead to bad cache usage and even paging, excessive inlining may make your program slower, rather than faster.


When you define a member function inside the definition of a class, the function is automatically considered inline. So

class Foo {
    ...
    int f(int x) {return x + 1;}
};

is equivalent to:

class Foo {
    ...
    int f(int x);
};

inline int Foo::f(int x) {return x + 1;}
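A complete version of the same idea, using a hypothetical Counter class rather than Foo, might look like this. One member function is defined inside the class (implicitly inline) and one outside it (explicitly inline).

```cpp
#include <cassert>

class Counter {
public:
    Counter() : count_(0) {}
    void increment() { ++count_; }  // defined in the class: implicitly inline
    int value() const;              // only declared here
private:
    int count_;
};

// Defined outside the class, so "inline" has to be written explicitly.
inline int Counter::value() const { return count_; }

int countTo(int n) {
    Counter c;
    for (int i = 0; i < n; ++i) {
        c.increment();
    }
    return c.value();
}
```

Either way, inline is a request: the compiler decides whether to expand each call in place.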


Let’s look at how a collection of header files and source files is turned into an executing program in memory:

[Figure: the header file AcmeGeo.h and the source files AcmeGeo.cpp and ithaca.cpp are transformed into a program running in memory.]

I will assume that header files have the suffix “.h”, while regular source files have the suffix “.cpp”. Other suffixes (.C, .cxx, .cc) are also common.


There are 4 steps involved:

• preprocessing: the .cpp files are preprocessed by a macro processor. The macro processor handles #include, #ifdef, #define, etc.

• compiling: the preprocessed .cpp files (called translation units) are compiled into object files.

• linking: the object files are linked together to form a single executable program.

• loading: the executable program is loaded into memory and executed.

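[Figure: the macro processor combines AcmeGeo.cpp and ithaca.cpp with the included headers (AcmeGeo.h, string, map) to produce the AcmeGeo and ithaca translation units; the compiler turns these into AcmeGeo.obj and ithaca.obj; the linker combines the object files into ithaca.exe; the loader places ithaca.exe in memory.]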


The preprocessor

The preprocessor (or “macro processor”) interprets directives and macros in a C or C++ file. Typical preprocessor directives are:

#include

#define

#ifdef

#ifndef

#endif
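A minimal sketch of the conditional directives in action follows. BUFFER_SIZE and VERBOSE are hypothetical macro names chosen for the example; VERBOSE is deliberately left undefined.

```cpp
#include <cassert>

#define BUFFER_SIZE 64   // hypothetical macro, defined here

// #ifdef keeps the first branch because BUFFER_SIZE is defined.
#ifdef BUFFER_SIZE
const int bufferSize = BUFFER_SIZE;
#else
const int bufferSize = 16;
#endif

// #ifndef keeps the first branch because VERBOSE is NOT defined.
#ifndef VERBOSE
const bool verbose = false;
#else
const bool verbose = true;
#endif
```

Only the selected branches reach the compiler; the preprocessor discards the others before compilation begins.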


The most important directive in C++ is the #include directive. When the preprocessor sees a #include directive, it substitutes the contents of the included file into the file that is being preprocessed:

[Figure: preprocessing ithaca.cpp replaces its three #include lines with the contents of the string and map standard headers and of AcmeGeo.h, producing the ithaca translation unit; the body of main() follows unchanged.]


#include is a fairly blunt tool. One common problem is that a header file gets included multiple times within the same translation unit. To prevent this, a header file can use an #ifndef as a guard:

// AcmeGeo.h:
// (The guard name avoids double underscores, which are reserved for the implementation.)
#ifndef ACME_GEO_H
#define ACME_GEO_H

namespace AcmeGeo {
    class map {...};
    class legend {...};
    class road {...};
    class city {...};
}

#endif // ACME_GEO_H
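The effect of the guard can be sketched in a single file by pasting the guarded header text twice, which is what two #include lines for the same header would produce. The guard macro name ACME_GEO_H, the legend member, and legendEntries are assumptions made for this illustration.

```cpp
#include <cassert>

// First "inclusion": ACME_GEO_H is not yet defined, so the body is kept.
#ifndef ACME_GEO_H
#define ACME_GEO_H
namespace AcmeGeo {
    class legend {
    public:
        int entries() const { return 3; }  // made-up member
    };
}
#endif // ACME_GEO_H

// Second "inclusion": ACME_GEO_H is already defined, so the preprocessor
// skips everything up to the #endif and the class is not defined twice.
#ifndef ACME_GEO_H
#define ACME_GEO_H
namespace AcmeGeo {
    class legend {
    public:
        int entries() const { return 3; }
    };
}
#endif // ACME_GEO_H

int legendEntries() {
    AcmeGeo::legend l;
    return l.entries();
}
```

Without the guard lines, the second copy would be a duplicate class definition and the file would fail to compile.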


In C, the preprocessor was often used to imitate const variables and inline functions:

#define PI 3.141592654

#define square(x) ((x) * (x))

However, preprocessor macros used in this way were rather dangerous. One common mistake was to say:

#define square(x) (x * x)

Then square(1 + 2) evaluates as:

square(1 + 2) => 1 + 2 * 1 + 2
              = 1 + 2 + 2
              = 5

C++ inline functions and const variables are safer:

const double PI = 3.141592654;

inline double square(double x) {return x * x;}
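The comparison can be checked with a short sketch. The macros are renamed BAD_SQUARE and MACRO_SQUARE so that they can coexist with the inline function, and int is used instead of double to keep the comparisons exact; callCount and nextValue are helper names invented here. The sketch also shows a second hazard of macros: even the correctly parenthesized macro evaluates its argument twice.

```cpp
#include <cassert>

#define BAD_SQUARE(x) (x * x)          // the mistaken macro
#define MACRO_SQUARE(x) ((x) * (x))    // the parenthesized macro

inline int square(int x) { return x * x; }

int badResult()    { return BAD_SQUARE(1 + 2); }    // (1 + 2 * 1 + 2)
int macroResult()  { return MACRO_SQUARE(1 + 2); }  // ((1 + 2) * (1 + 2))
int inlineResult() { return square(1 + 2); }        // argument evaluated first

// Helper with a visible side effect, used to count argument evaluations.
int callCount = 0;
int nextValue() { return ++callCount; }

int callsMadeByMacro() {
    callCount = 0;
    int result = MACRO_SQUARE(nextValue());  // ((nextValue()) * (nextValue()))
    (void)result;
    return callCount;
}

int callsMadeByInline() {
    callCount = 0;
    int result = square(nextValue());        // nextValue() runs once
    (void)result;
    return callCount;
}
```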


The compiler

The preprocessor generates a translation unit containing pure C++ code (with no preprocessor directives).

The compiler’s job is to turn this translation unit into an object file containing machine code.

[Figure: the ithaca translation unit, which contains the included headers followed by definitions of main(), processInput(AcmeGeo::map m, int arg), and printResults(int format, bool verbose), is compiled into ithaca.obj. The object file’s symbol table lists _main, ?printResults@@YAXH_N@Z, and ?processInput@@YAXVmap@AcmeGeo@@H@Z, each pointing to a run of machine code bytes.]


An object file is language independent. On many systems, the format of an object file dates back to the days when programming was done largely in C.

Roughly, an object file contains raw data (mostly consisting of machine code), and a symbol table whose entries point into the raw data.

(A real object file contains much more information, such as import and export specifications, relocation specifications and debugging data.)


Each symbol in the symbol table corresponds to a function or global variable defined in the C/C++ source code.

In C, the name of a symbol was often just the name of the function or variable. For instance, the symbol _main represents the function main(), and points to main()’s machine code.

However, this is not sufficient for C++, because the same function name can be overloaded with many different argument types:

double square(double x);

int square(int x);

float square(float x);

We need a different symbol name for each square function.
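That the three overloads really are three separate functions can be sketched by taking the address of each one. The function overloadsAreDistinct is a name invented for the example; the pointer types select which overload is meant.

```cpp
#include <cassert>

double square(double x) { return x * x; }
int    square(int x)    { return x * x; }
float  square(float x)  { return x * x; }

// Each pointer type picks out exactly one overload, so the compiler must
// emit three distinct functions, and each needs its own symbol name.
bool overloadsAreDistinct() {
    double (*squareDouble)(double) = square;
    int    (*squareInt)(int)       = square;
    float  (*squareFloat)(float)   = square;
    return squareDouble(1.5) == 2.25
        && squareInt(3) == 9
        && squareFloat(2.0f) == 4.0f;
}
```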


To handle overloading, C++ compilers encode a function’s types into the symbol name for the function. For instance,

void processInput(AcmeGeo::map m, int arg);

void printResults(int format, bool verbose);

are encoded (under Visual Studio) as:

?processInput@@YAXVmap@AcmeGeo@@H@Z

?printResults@@YAXH_N@Z

This technique is known as name mangling. In the symbols above, X encodes void, H encodes int, _N encodes bool, and Vmap@AcmeGeo@@ encodes AcmeGeo::map.


Name mangling can cause difficulties when interfacing to programs written in other languages, such as C, that know nothing about name mangling.

Suppose that a function foo was written in C, and we would like to access it from C++. Name mangling can be disabled for this function by declaring it to be extern "C":

extern "C" {
    void foo();
}

Now the C++ compiler will refer to foo using an unmangled symbol, which will match the symbol used for foo by the C compiler.
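A self-contained sketch of the same mechanism follows. The function acme_scale is hypothetical, and it is defined in the same file only so that the example links on its own; in practice the definition would normally live in a separately compiled C source file.

```cpp
#include <cassert>

extern "C" {
    // Declaration with C linkage: the symbol name is not mangled.
    int acme_scale(int value);
}

// Stand-in for the C implementation (an assumption for this sketch).
extern "C" int acme_scale(int value) {
    return value * 10;
}

// Ordinary C++ code calls the function like any other.
int scaledTwice(int v) {
    return acme_scale(acme_scale(v));
}
```

One consequence of the unmangled name is that functions with C linkage cannot be overloaded, since two overloads would need the same symbol.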


The Linker

The linker combines multiple object files into a single executable program.

Object files may refer to symbols defined by other object files, and the linker’s main job is to connect these references to the correct locations in the other object files.

[Figure: the linker combines AcmeGeo.obj and ithaca.obj into ithaca.exe.]


The linker allows us to combine several files into a single executable. Although an entire software project can be kept in a single file, it is useful to break up large pieces of software into many files:

• The structure of a large project is clearer if it is divided into a number of files.

• In a project developed by multiple people, different people can work on different files at the same time.

• Different files can be compiled separately, so that you don’t have to recompile the entire project every time you make a small change.

• You can use libraries from other vendors without having to paste their source code into your source code (in fact, they may not even want to give their source code to you).


Declarations and Definitions

A declaration describes the nature of a particular entity, but does not define it in detail. Examples of declarations:

class Matrix; // declare (but don’t define) a class

void foo(int i); // declare (but don’t define) a function

extern int a; // declare (but don’t define) global variable

Definitions describe an entity in full detail:

class Matrix {double **arr; public:...}; // define class

void foo(int i) {...} // define function

int a; // define global variable

const int b = 5; // define constant
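The two lists can be combined into one compilable sketch, in which every declaration comes first and the matching definition follows. The members of Matrix and the function trace are invented for the illustration.

```cpp
#include <cassert>

// Declarations: enough for the compiler to know what each name is.
class Matrix;                    // declare (but don't define) a class
double trace(const Matrix& m);   // declare (but don't define) a function
extern int a;                    // declare (but don't define) a global

// Definitions: each entity is now described in full detail.
class Matrix {
public:
    Matrix(double d0, double d1) { diag[0] = d0; diag[1] = d1; }
    double diag[2];              // made-up representation: diagonal only
};

double trace(const Matrix& m) { return m.diag[0] + m.diag[1]; }

int a = 7;                       // define global variable
const int b = 5;                 // define constant
```

The declaration of trace can mention Matrix before the class is defined because a reference parameter does not require the complete type.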


Classes, global variables, and functions can be declared as many times as you want within a translation unit or program, as long as all the declarations are consistent:

// Declare Matrix twice:

class Matrix; // declare (but don’t define) a class

class Matrix; // declare (but don’t define) a class

In general, definitions cannot be repeated:

class Matrix {...}; // define class

class Matrix {...}; // error: duplicate definition


In more detail:

• A class definition may appear at most once in each translation unit.

• A function definition may appear at most once in each executable program.

• A global or static variable definition may appear at most once in each executable program.

These rules correspond to the organization of header and source files:

• Class definitions go in a header file (so that they are included once in each translation unit).

• Function and global/static variable definitions go in a .cpp file, so that they are defined once for the entire program.


classes: one definition per translation unit

global/static variables, functions: one definition per program

There are a few exceptions:

• const variables and inline functions may be defined once in each translation unit. Therefore const variable and inline function definitions should go in a header file.

• Function templates may be defined once in each translation unit, and their definitions should go in a header file.
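As a sketch of what such a header might contain, the marked section below stands in for a hypothetical file geometry.h, pasted inline so that the example is self-contained. The names largest and circleArea are assumptions made for the illustration.

```cpp
#include <cassert>

// ---- contents of a hypothetical header, geometry.h ----
const double PI = 3.141592654;                    // const variable

inline double square(double x) { return x * x; }  // inline function

template <typename T>                             // function template
T largest(T a, T b) { return a < b ? b : a; }
// ---- end of header ----

// A .cpp file that includes the header can use all three definitions.
double circleArea(double r) {
    return PI * square(r);
}
```

Every .cpp file that includes such a header receives its own copy of these definitions, and the program still links because all three kinds fall under the exceptions listed above.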