
1. Introduction

Motivation

Library design is language design. [Stroustrup]

Course Goal

To learn how to implement software libraries, such as STL, CGAL, LEDA, ..., that have a focus on algorithms and data structures. To learn advanced programming techniques in C++, such as templates, generic programming, object-oriented design, design patterns, and large-scale C++ software design.

We will approach library design from different angles:

- General design methods: here, Generative Programming and Domain Analysis with Feature Diagrams
- Studying the expressiveness of the library implementation language C++
- Studying example libraries
- Studying individual and often occurring problems

In contrast to the software design of an application, the design of a software library is an obvious quality criterion for the user, i.e., the software developer using the library. In addition, the library designer is more often faced with the problem of a greenfield design, i.e., the library designer does not know the specific needs of the (potentially many) library users.

We would like to emphasize here that the user's point of view of library design can differ quite drastically from the library designer's point of view. The use of a library can be much easier than its implementation. A complicated implementation technique for a library does not necessarily make the library complex for the user. However, maintainability and the required training of the library developers have to be taken into account.

Generative Programming

Def. [pg. 5, Czarnecki00]

Generative Programming (GP) is a software engineering paradigm based on modeling software system families such that, given a particular requirements specification, a highly customized and optimized intermediate or end-product can be automatically manufactured on demand from elementary, reusable implementation components by means of configuration knowledge.

So, generative programming has a strong focus on reuse. This is reflected in the analysis and design phases of software development, which are broken into two tracks: domain engineering develops the design for the system families, and application engineering develops the design for a specific product created with specific members of the system families. So, domain engineering is design for reuse, and application engineering is design with reuse.

We focus on domain engineering and later on domain engineering for algorithm libraries. Domain engineering breaks down into the following tasks:

- Domain analysis: determine the scope of the domain and define a set of reusable, configurable requirements for the systems in the domain.
- Domain design: develop a systems architecture and a production plan.
- Domain implementation: implement the reusable assets, from components up to the configuration knowledge.

In more detail from the domain engineering method DEMRAL [Czarnecki00]:

1. Domain analysis
   1. Domain definition
      1. Goal and stakeholder analysis
      2. Domain scoping and context analysis
         1. Analysis of application areas and existing systems
         2. Identification of domain features
         3. Identification of relationships to other domains
   2. Domain modeling
      1. Identification of key concepts
      2. Feature modeling of the key concepts, i.e., identification of commonalities, variabilities, feature dependencies, and feature interactions
2. Domain design

Page 1 of 72, 1/13/2010, file://C:\Users\rmanoj\Desktop\Untitled.htm


   1. Identification and specification of the overall implementation architecture
   2. Identification and specification of domain-specific languages
   3. Specification of the configuration knowledge
3. Domain implementation

Besides [Czarnecki00], see also Multi-Paradigm Design [Coplien98] for details on domain analysis.

Feature Modeling

A feature model represents the common and the variable features of concept instances and the dependencies between the variable features. It consists of a feature diagram and some additional information described later. The feature diagram is a tree with a concept as root and features in each node. The following decorations on the tree edges are distinguished:

- Mandatory features: all features have to be present.
- Optional features: any feature can be chosen optionally.
- Alternative features: exactly one feature has to be chosen.
- Or-features: at least one feature has to be chosen.

In principle, all combinations of decorations are allowed. In particular, the arcs indicating alternative or or-features do not have to cover all edges of a node. For example, two arcs could cover two groups, creating two sets of children.

Some combinations are redundant and are normalized as follows:

- Optional alternative features: if one feature in an alternative is optional, then all features in that alternative are optional.
- Optional or-features: if one feature in an or-group is optional, then the `at least one feature' requirement becomes meaningless, and we normalize to a set of optional features.

Features can be direct children of the concept root node or of other feature nodes. If they are direct children of another feature node, then they are only considered if the parent feature has been chosen.

Feature diagrams are a compact notation to record the set of feasible combinations of features of a concept. Let's consider an example:

(Feature diagram of a car: the mandatory features Car body, Transmission, and Engine, the optional feature Pulls trailer; Transmission with the alternative features Automatic and Manual, Engine with the or-features Electric and Gasoline.)

In words: a car must have a car body, a transmission, an engine, and it can optionally pull a trailer. Note that although it looks like a physical decomposition, features can be arbitrary predicate statements, such as `maximum speed is 100mph', and the box `transmission' is read as `the car has a transmission'.

In the second layer of the tree, we see that the transmission can be either automatic or manual, but not both at the same time (alternative features). The car must have an engine that is electric or gasoline powered. However, it can have both types of engines at the same time (or-features).

So the feature set encoded in this feature diagram is, written in a set notation:

{ {Car body, {Automatic}, {Electric}},
  {Car body, {Automatic}, {Electric}, Pulls trailer},
  {Car body, {Manual}, {Electric}},
  {Car body, {Manual}, {Electric}, Pulls trailer},
  {Car body, {Automatic}, {Gasoline}},
  {Car body, {Automatic}, {Gasoline}, Pulls trailer},
  {Car body, {Manual}, {Gasoline}},
  {Car body, {Manual}, {Gasoline}, Pulls trailer},
  {Car body, {Automatic}, {Electric, Gasoline}},
  {Car body, {Automatic}, {Electric, Gasoline}, Pulls trailer},
  {Car body, {Manual}, {Electric, Gasoline}},
  {Car body, {Manual}, {Electric, Gasoline}, Pulls trailer} }

We have some additional information associated with a feature diagram:

- Semantic description: A description for each feature. Informal text or any other suitable formalism can be used here.
- Rationale: Why is this feature in this model? Conditions and recommendations when and when not to select a variable feature.
- Stakeholders and client programs: A list of stakeholders (users, customers, managers, etc.) who are interested in this feature, and in the case of a component, a list of client programs that need this feature. (Such a list can be helpful to avoid unnecessary feature bloat.)
- Exemplar systems: If possible, mention known systems that have this feature.
- Constraints and default dependency rules: Dependency rules can be expressed between any set of variable features. We have mutual-exclusion constraints and requires constraints. Default dependency rules help in automatically configuring a system.
- Availability sites, binding sites, and binding mode: Availability site describes when, where, and to whom a variable feature is available, and binding site describes when, where, and by whom a feature may be bound. Binding mode determines whether a feature is statically, changeably, or dynamically bound.
- Open/closed attribute: A feature is open if we expect more variable direct subfeatures; for example, the number types possible for an element in a matrix class might increase in the future. Otherwise we would mark the feature as closed.
- Priorities: Record the relevance for the project.
- Implementation strategies: Identify implementation strategies for this feature (already part of the domain design).

We give some more options for the binding mode, i.e., the binding time of a variable feature:

- Source time: C++ templates, e.g., std::list<int>
- Compile time: C++ function overloading, templates, C preprocessor -DNDEBUG
- Link time: makefile configuration
- Load time: dynamic link libraries
- Infrequent at run time: Sun's HotSpot technology for Java
- Frequent at run time: variable values, virtual member functions

Brainstorming

Brainstorming is a possible technique to get a first set of concepts and features in the domain analysis. A variant of brainstorming, known as MetaMind, uses written notes instead of voicing ideas freely in a group. Voicing ideas tends to hinder creativity, since a dominant opinion or idea could take over the session.

For MetaMind, a neutral organizer writes a central question on the board. The participants have about 10 minutes to write their free associations on index cards, preferably single words or short phrases, one per card. The participants tend to be busy for the first few minutes. It is important to continue a few minutes past the first rush of obvious answers. Thereafter, the cards are pinned on a board in an unorganized manner. Unclear cards have to be clarified; duplicates and synonyms are removed. Then the organizer moderates a discussion about how to relate the cards to each other, for example by commonalities, abstractions, etc. The outcome is a set of concepts and features, and relations among features that lead to a feature diagram with annotations, for example, commonalities and variabilities.

Feature Starter Sets

Another way of getting started is predefined feature starter sets. Specifically for our area of algorithm library design, the feature starter sets for ADTs and for algorithms from [Czarnecki00] can be used. Here is the set for ADTs:

- Attributes: Named properties of an ADT, e.g., the size of a container. Features of an attribute can be: mutable versus const, volatile or not, stored or computed, internally or externally owned.
- Data structures: The data structures on top of which the ADT is implemented.
- Operations: Accessors and the core algebra operations. Signature, i.e., name, operands, operand types. Different alternative implementations. Possible optimizations and specializations. Binding mode and binding time.
- Error detection, response, and handling: Pre- and postconditions, invariants. Exception safety.
- Memory management: Heap or stack. Custom or standard allocation. Thread safety. Persistence.
- Synchronization: Thread safety of shared data.
- Persistency: Storing and restoring a state of the ADT to and from disk.
- Perspectives and subjectivity: Different views of stakeholders or different client programs on the ADT.


For container-like ADTs, we can also consider:

- Element type: The types managed by the ADT.
- Indexing: Are the elements accessible with an index, e.g., with integer keys or symbolic keys?
- Structure: The structure of the representation, e.g., matrix.

The feature starter set for algorithms:

- Computational aspect: The functional description of the algorithm, e.g., a textbook description or pseudocode. Possible classification of several algorithms into groups, e.g., searching and sorting.
- Data access: The interface to the ADTs, e.g., iterators or data accessors.
- Optimizations.
- Error detection, response, and handling.
- Memory management: Might be needed here as well, for local data or optimization purposes.
- Parallelization: Related to optimization, here for specific hardware or system architectures.

Odds and Ends

From software engineering we list some general design goals one should keep in mind:

- Flexibility
  - Adaptability (to existing environments)
  - Extensibility
  - Openness (to other standards/libraries)
- Ease of Use
  - Smooth learning curve (number of concepts, structure)
  - Uniformity
  - Complete and minimal interfaces
  - Rich and complete functionality (is it useful?)
- Efficiency
  - Space and time
  - O-calculus
  - Practice (log(1,000,000) = 6, and sqrt(sqrt(1,000,000)) = 31.6)
  - Worst case/average case
  - Input model: are the hard cases realistic?
- Correctness
  - Documentation
  - Hard because of complexity
- Robustness
  - Rounding errors (in geometric computing)
  - Boundary cases, degenerate situations (in geometry)
- Modularity
- Maintainability
- ...

The famous Occam's razor:

"Pluralitas non est ponenda sine neccesitate" or "plurality should not be posited without necessity."  

The words are those of the medieval English philosopher and Franciscan monk William of Ockham (ca. 1285-1349). It leads to the useful KISS (Keep It Simple, Stupid) principle that can be used to fend off "creeping featurism".


Lutz Kettner (<surname>@mpi-sb.mpg.de). Last modified on Tuesday, 17-Jan-2006 17:53:41 MET.  

2. The C++ Language

Introduction

A basic familiarity with C++ is assumed. We briefly review classes, member functions, constructors and destructors, derivation, virtual member functions, and static class members. We then continue with an introduction to templates. Recommended introductions to C++ are [Stroustrup97] (advanced), [Lippman98] (more basic), and [ISO-C++-98] (the actual standard ;-).


Classes and Member Functions

struct A { int i; };

We call the type A a class. If we create a variable of type A, we call this variable an object of class A, which is also known as an instance of class A. Each object of type A has a member variable i which can be accessed with the dot notation:

int main() {
    A a;
    a.i = 5;
}

C++ provides access control for class members, which can be either private, protected, or public. Private members can only be used within the class. Protected members can also be used by derived classes. Public members can be used from everywhere. The members of a struct are by default public; the members of a class are by default private. Thus the above definition of A is equivalent to:

class A {
public:
    int i;
};

We will use struct or class interchangeably, whichever default is more convenient.

A member function looks like a normal function declaration within a class definition. It is usually placed in a header file *.h.

class A {
    int i;  // private
public:
    int get_i();
    void set_i( int n);
};

A member function's definition is written outside of the class definition and uses the scope operator :: to name the function within the class. It is usually placed in a source file *.C. Member functions can access all other class members. To accomplish this, the C++ compiler automatically adds a hidden function parameter that points to the current object. This hidden function parameter is named this and is of type A*.

int A::get_i() {
    return i;
}

void A::set_i( int n) {
    i = n;
}

The member variable i is now inaccessible from the outside, and the object can only be manipulated through its member functions.

int main() {
    A a;
    a.set_i(5);
    int j = a.get_i();
}

For efficiency, functions and member functions (collectively called functions in the following) can be declared inline, which advises the compiler to replace a function call directly with the function body if possible, instead of creating a function call and a separately compiled function body. Small inline functions can lead to faster and smaller code. Not so small inline functions can still lead to faster, but larger code. Inline functions have to be defined before their use, thus their implementation usually moves to the header file. If the member function is defined within the class, the inline declaration is automatically assumed.

class A {
    int i;  // private
public:
    int get_i() { return i; }       // both inline
    void set_i( int n) { i = n; }
};

Variables can be declared const in C++. To check const correctness across classes and member functions, a member function has to be declared const if it does not change the member variables of the class. Otherwise, a member function cannot be called for an object that is declared const. So the complete definition of our example class A looks as follows: (See also const correctness.)

class A {
    int i;  // private
public:
    int get_i() const { return i; }     // inline and const
    void set_i( int n) { i = n; }       // inline
};

int main() {
    A a;
    a.set_i(5);
    int j = a.get_i();
    const A a2;      // uninitialized constant
    // a2.set_i(5);  // this is forbidden by the C++ type system
                     // So, const A is currently pretty useless.
}

Constructors, Assignment, and Destructor

Constructors are special `member functions' of which exactly one is called automatically at creation time of an object. Which one depends on the form of the variable declaration. Symmetrically, there exists a destructor which is automatically called at destruction time of an object.

Three types of constructors are distinguished: the default constructor, the copy constructor, and all others (user defined). A constructor has the same name as the class and has no return type. The parameter lists for the default and the copy constructor are fixed. The user-defined constructors can have arbitrary parameter lists following C++ overloading rules.

class A {
    int i;  // private
public:
    A();                // default constructor
    A( const A& a);     // copy constructor
    A( int n);          // user defined
    ~A();               // destructor
};

int main() {
    A a1;       // calls the default constructor
    A a2 = a1;  // calls the copy constructor (not the assignment operator)
    A a3(a1);   // calls the copy constructor (usual constructor call syntax)
    A a4(1);    // calls the user defined constructor
}   // automatic destructor calls for a4, a3, a2, a1 at the end of the block

The compiler generates a missing default constructor, copy constructor, or destructor automatically. The default implementation of the default constructor calls the default constructors of all class member variables. The default constructor is not automatically generated if other constructors, including an explicitly declared copy constructor, are declared in the class. The default copy constructor calls the copy constructors of all class member variables, which performs a bitwise copy for built-in data types. The default destructor does nothing. All default implementations can be explicitly inhibited by declaring the respective constructor/destructor in the private part of the class.

Constructors initialize member variables. A special syntax, which resembles constructor calls, allows calling the respective constructors of the member variables (instead of assigning new values). The constructor call syntax also extends to built-in types. The order of the initializations should follow the order of the declarations; multiple initializations are separated by commas.

class A {
    int i;  // private
public:
    A() : i(0) {}               // default constructor
    A( const A& a) : i(a.i) {}  // copy constructor, equal to the compiler default
    A( int n) : i(n) {}         // user defined
    ~A() {}                     // destructor, equal to the compiler default
};

Usually, only the default constructor (if the semantics are reasonable) and some user-defined constructors are defined for a class. As soon as the class manages some external resources, e.g., dynamically allocated memory, the following four implementations have to work together to avoid resource allocation errors: default constructor, copy constructor, assignment operator, and destructor. Note that the compiler would create a default implementation for the assignment operator if it is not defined explicitly. See the following example and [Item 11 and 17, Meyers97]. Note the use of this, the pointer to the current object. (see also Buffer.C)

class Buffer {
    char* p;
public:
    Buffer() : p( new char[100]) {}
    ~Buffer() { delete[] p; }
    Buffer( const Buffer& buf) : p( new char[100]) {
        memcpy( p, buf.p, 100);
    }
    void swap( Buffer& buf) {
        char* tmp = p;
        p = buf.p;
        buf.p = tmp;
    }
    Buffer& operator=( const Buffer& buf) {
        // Check for self-assignment, but it's only an optimization
        if ( this != &buf) {
            // In general: perform copy constructor and destructor.
            // Make sure that self-assignment is not harmful.
            Buffer newbuf( buf);  // create the new copy
            swap( newbuf);        // exchange it with 'this'
            // the destructor of newbuf cleans up the data previously
            // stored in 'this'.
        }
        return *this;
    }
};

Automatic Conversion and the explicit Keyword for Constructors

A user-defined constructor with a single argument defines a conversion between two types: the type of the argument and the class the constructor belongs to. The C++ compiler is allowed to perform this conversion automatically to find a matching function call.

struct Buffer {
    Buffer( int n);  // constructor to allocate n bytes for buffer
    // ...
};

void rotate( Buffer& buf);  // a function to rotate buffer cyclically

int main() {
    rotate( 5);  // oops, a temporary Buffer initialized with 5 will be created
}

These automatic conversions can be a source of errors that are difficult to spot. It is advisable to forbid them with the keyword explicit.

struct Buffer {
    explicit Buffer( int n);
    // ...
};

Ambiguity between Function-Style Cast and Declaration

The constructor notation and its implied function-style cast give rise to a couple of ambiguities in C++ with the heavily overloaded use of parentheses. In case of an ambiguity between a statement and a declaration, the compiler chooses the declaration.

struct S {
    S(int);
};

void foo( double d) {
    S v( int(d));      // function declaration
    S w = S( int(d));  // object declaration
    S x(( int(d)));    // object declaration
}

v is a function declaration because the parentheses around d are redundant, so S v( int d); is obviously a function declaration and not an object of type S that gets initialized with d cast to int.

To get the second interpretation, we have to disambiguate the declaration explicitly, either by using the other initializer notation as for w, or by adding parentheses that exclude the function declaration as for x. (see also decl.C)

Derivation

If we derive a class B from a base class A, B inherits all member variables and all member functions from A, but it can access only those that are not private. We consider only public inheritance here (inheritance can also be qualified as private).

class B : public A {
    int j;
};

B now has two integer member variables. Objects of class B can be assigned to objects of class A. Doing so, they lose their additional member variable j (this is known as slicing).

int main() {
    B b;
    A a = b;
}

Constructors and the destructor are not inherited, but the default implementations of the derived class automatically call the respective implementations of the base class. Only the additional, user-defined constructors are missing and must be repeated. Calling a base class constructor explicitly follows the same syntax as the member variable initialization.

class B : public A {
    int j;
public:
    B( int n) : A(n) {}
    B( int n, int m) : A(n), j(m) {}
};

The first constructor and the default constructor leave the value of j uninitialized. We solve this in the following example and use default values to implement the three constructors in one.

class B : public A {
    int j;
public:
    B( int n = 0, int m = 0) : A(n), j(m) {}
};

Virtual Member Functions

Virtual member functions and derivation provide the backbone of flexibility in C++ for the object-oriented paradigm. A base class defines an interface with virtual member functions, here a pure virtual member function.

struct Shape {
    virtual void draw() = 0;
};

We derive different concrete classes from Shape and implement the member function draw for each of them.

struct Circle : public Shape {
    void draw();
};

struct Square : public Shape {
    void draw();
};

We cannot create an object of a class that contains pure virtual member functions, but we can have pointers of this type, and we can assign pointers of the derived types to them. If we call a member function through such a pointer, the program figures out at runtime which member function is meant, Circle::draw or Square::draw.

int main() {
    Shape* s1 = new Circle;
    Shape* s2 = new Square;
    s1->draw();  // calls Circle::draw
    s2->draw();  // calls Square::draw
}

This runtime flexibility is achieved with a virtual function table per class (a dispatch table with function pointers). Each object gets an additional pointer referring to this table. Thus, each object knows at runtime of which type it is, which is also used for the runtime type information in C++. These extra costs, an additional pointer and one more indirection per function call, are only imposed on objects whose class or base classes use virtual member functions.

Since we don't know the size of the actually allocated objects any more, we also have to use a virtual member function to delete the objects properly. It is sufficient to define a virtual, empty destructor in the base class. (see also Shape.C)

struct Shape {
    virtual void draw() = 0;
    virtual ~Shape() {}
};

// ... the derived shape classes

int main() {
    Shape* s1 = new Circle;
    Shape* s2 = new Square;
    s1->draw();  // calls Circle::draw
    s2->draw();  // calls Square::draw
    delete s1;
    delete s2;
}

Static Class Members

Static member variables belong to the class, not to the object. They can be accessed from outside the class with the scope operator. For example, a class that counts how many objects of its type have been created looks as follows: (see also Counter.C)

#include <iostream>

struct Counter {
    static int counter;
    Counter() { counter++; }
};

int Counter::counter = 0;

int main() {
    Counter a;
    Counter b;
    Counter c;
    std::cout << Counter::counter << std::endl;
}

Note that an explicit definition of the static member outside of the class is needed. This definition is supposed to show up in only one compilation unit (much like a global variable). Thus, the definition is usually in a *.C source file.

Static member variables are guaranteed to be initialized before main gets executed. The order of initialization for multiple static member variables is specified to be the order of declaration within a single compilation unit, but the order is unspecified between different compilation units. This implies in particular that a class which relies on the proper initialization of a static member variable (such as our Counter example) cannot be used as the type of a static member variable in another compilation unit (see [Item 47, Meyers97] for how to get around this restriction with local static variables in global functions).

A static variable can be used to initialize a library automatically, see Automatic Library Initialization and Housekeeping.

A member function exists only once per class, but it has a pointer (this) to the current object hidden in its parameter list. A static member function omits this pointer. It is called like a normal function using the scope operator. A static member function cannot access member variables of an object; it can only access static member variables.

struct A {
    static int i;
    int j;
    static void init();
};

int A::i = 0;  // definition of the static member

void A::init() {
    i = 5;     // fine
    // j = 6;  // is not allowed
}

int main() {
    A::init();
    assert( A::i == 5);
}

Templates

Templates provide the backbone of flexibility in C++ for the generic-programming paradigm. Their flexibility is resolved at compile time thus 

retaining efficiency. 

C++ supports two kinds of templates: class templates and function templates. Templates are incompletely specified components in which a few types are left open and represented by formal placeholders, the template arguments. The compiler generates a separate translation of the component with actual types replacing the formal placeholders wherever this template is used. This process is called template instantiation. The actual types for a function template are implicitly given by the types of the function arguments at instantiation time. Therefore, all template arguments must be used somewhere in the function parameter list. An example is a swap function that exchanges the value of two variables of arbitrary types.

template <class T>
void swap( T& a, T& b) {
    T tmp = a;
    a = b;
    b = tmp;
}
int main() {
    int i = 5;
    int j = 7;
    swap( i, j);   // uses "int" for T
}

The actual types for a class template are explicitly provided by the programmer. An example is a generic list class for arbitrary item types.

template <class T>
struct list {
    void push_back( const T& t);   // append t to list
};
int main() {
    list<int> ls;       // uses "int" for T
    ls.push_back(5);
}

Defining push_back outside of the list class requires the repetition of the template declaration template <class T>, and the name of the list for the scope, which is list<T>. For the naming convention, list is a class template, while list<T> is a template class (in particular, list is a template, and list<T> is a class).

template <class T>
void list<T>::push_back( const T& t) { ... }

The C++ compiler uses pattern matching to automatically derive the template argument types for function templates. Consider for example a function template that works only for lists:

template <class T> void foo( list<T>& ls);

Template arguments can have default values, e.g., a stack class built with our list. Note the space between the two > > . Otherwise this would be parsed as the right-shift operator.

template <class T, class Container = list<T> > struct stack { ... };

Besides type parameters, class templates can also have built-in integral types as template parameters. In that case, the template arguments must be constant expressions of that integral type. A useful example is points of constant dimension. The parameter is determined at compile time and thus constant. We can use it to declare a fixed-size array of coordinates for the point. Features like default arguments and specialization (see below) also work for this kind of parameter.

template <int dim>
struct Point {
    double coordinates[dim];   // fixed-size coordinate array
    // ...
};
int main() {
    Point<3> p;   // a point in 3d space
}


The C++ standard aims for separate compilation of templates, but this is not available in most compilers today. The current model for templates is that all source code including definitions goes into the header files.

"Lazy" Implicit Instantiation

As already mentioned above, the process of using a function template forces the compiler to instantiate the template, which is more precisely called implicit instantiation. There is also an explicit instantiation, which we do not use and do not mention any further.

Member functions of class templates are also instantiated implicitly.

Assume that we want to add a sort member function to our list template from above. Quicksort is not efficient on lists (it needs an extra container), so our specialized member function realizes, for example, a more efficient merge sort. For a sort function, the item type of the list needs to be comparable. But lists in general are also useful for types that are not comparable. This causes no problem in C++, since the compiler does not compile the sort member function unless it is instantiated somewhere, and it is not allowed to complain about the missing comparison operator in a sort member function that is never instantiated. In consequence, we can use the list class with a type that is not comparable, as long as we do not try to sort this list anywhere in our program. In the chapter on the STL we will see examples where this is useful to implement generic adaptors for iterators.

This lazy behavior is restricted to member functions of class templates.

Member Templates

Member functions are basically normal global functions (with some syntactic sugar and compiler support for the this pointer). Thus, function templates were easy to extend to member functions and even constructors (although this happened late in the standardization process).

The small convenience class pair<T1,T2> from the STL makes use of a member template constructor. The class pair<T1,T2> contains a member variable of type T1 and a member variable of type T2, i.e., a tuple (first,second) with types (T1,T2). The default copy constructor allows the construction of a pair if the types for the template arguments are exactly the same. However, several automatic conversions are possible in C++, for example mutable to const, pointer to derived class to pointer to base class, short to int, etc. It would be nice if we could create one pair from another pair whenever the member types are convertible accordingly. The solution is a template constructor accepting all pairs. The actual construction of its member variables compiles only if these constructions are permitted.

template <class T1, class T2>
struct pair {
    T1 first;
    T2 second;
    pair() {}   // don't forget the default constructor if there are also others
    // template constructor
    template <class U1, class U2>
    pair( const pair<U1,U2>& p);
};

Now, let's assume we want to define this constructor not inline (however, inline would be preferable here). Here is how we nest the two template declarations (see also pair.C):

template <class T1, class T2>
template <class U1, class U2>
pair<T1,T2>::pair( const pair<U1,U2>& p)
    : first( p.first), second( p.second) {}

Specialization and Partial Specialization

Let us assume we have a generic vector class vector<T> that is just fine for the general case, but for booleans we could do better with a bit vector. We can write a specialized class for booleans.

template <>
struct vector<bool> {
    // specialized implementation
};

The compiler matches vector<bool> automatically with this specialization. The empty template declaration template <> was previously superfluous, but is now mandatory.

Now suppose vector has a second template argument for a memory allocator (which it does in the STL, but hidden by a default setting). The resulting partial specialization is still a template.

template <class Allocator>
struct vector<bool, Allocator> {
    // partially specialized implementation
};

Specialization and partial specialization also work for function templates. Since we have already mentioned pattern matching for function templates, this might not come as a surprise. However, it needs to be clarified how the resulting overloading of the function name is resolved. The general rule of thumb is that the compiler tries to instantiate all function templates that can match the function call, and it chooses the `most specific' instantiation. If there is more than one `most specific' instantiation, an ambiguity error is reported. The bad news here is that a sound type theory as known from functional languages is missing.

Local Types and Keyword typename

Besides variables and member functions, classes can also contain enums and types. They are accessed with the scope operator `::'.

template <class T>
struct list {
    typedef T value_type;
};
int main() {
    list<int> ls;
    list<int>::value_type i;   // i is of type int
}

Let us assume a class X uses a container, such as list<T>. Now class X needs the value type T of the container, which we have already prepared with the typedef in the list class template. For convenience, we use a typedef and the same name in class X.

template <class Container>
struct X {
    typedef Container::value_type value_type;   // not correct
    // ...
};

But how can the compiler know that Container::value_type is actually a type and not a static member variable without knowing the actual type for Container, i.e., before actually instantiating the template? It cannot. The solution is the keyword typename. By default, the compiler assumes that in case of such an ambiguity the token is not a type. If it is a type, we have to say so explicitly with the keyword. Thus, the correct example is:

template <class Container>
struct X {
    typedef typename Container::value_type value_type;
    // ...
};

The keyword typename is used to indicate that the name following the keyword does in fact denote the name of a type. However, one cannot just liberally sprinkle code with typename. More precisely, one must use the keyword typename in front of a name that:

1. denotes a type; and
2. is qualified, i.e., it contains a scope operator `::'; and
3. appears in a template; and
4. is not used in a list of base classes or as an item to be initialized by a constructor initializer list; and
5. has a component left of a scope resolution operator that depends on a template parameter.

Furthermore, one is only allowed to use the keyword in this sense if the first four apply. To illustrate this rule, consider this code fragment:

template <class T>
struct S : public X<T>::Base {        // no typename, because of 4
    S() : X<T>::Base(                 // no typename, because of 4
          typename X<T>::Base(0)) {}  // typename needed
    X<T> f() {                        // no typename, because of 2
        typename X<T>::C *p;          // declaration of pointer p, typename needed
        X<T>::D *q;                   // no typename ==> multiplication!
    }
    X<int>::C *s_;                    // typename allowed but not needed
};
struct U {
    X<int>::C *pc_;                   // no typename, because of 3
};

Dynamic and Static Polymorphism

Polymorphism refers to the ability of a single piece of code to work with multiple types. C++ supports two kinds of polymorphism: dynamic (runtime) polymorphism through virtual functions and static (compile-time) polymorphism through templates. Dynamic polymorphism plays a central role in object-oriented programming, while static polymorphism is at the heart of generic programming.

Let us look at the shape example we saw earlier:

struct Shape {
    virtual void draw() = 0;
    virtual ~Shape() {}
};
struct Circle : public Shape {
    void draw();
};
struct Square : public Shape {
    void draw();
};

Now we have two ways of writing a single function that works for circles, squares, and any other classes derived from Shape:

void display( const Shape& s) {   // dynamic polymorphism
    s.draw();
}

template <class T>                // static polymorphism
void display( const T& s) {       // T does not need to be derived from Shape
    s.draw();
}

In this case, dynamic polymorphism is more appropriate. The problems with static polymorphism are:

- Heterogeneous dynamic collections are difficult to handle. For example, a drawing class might have a list of its component shapes:

  class Drawing {
      list<Shape*> components;   // relies on dynamic polymorphism
      // ...
  };

- Code bloat. There is a separate instance of display<T> for each type T.

- No separate compilation. Compilation of display<T> needs the definition of T.

Templates have their advantages, too. Let us look at the swap example:

template <class T>
void swap( T& a, T& b) {
    T tmp = a;
    a = b;
    b = tmp;
}

The basic operations used by swap are the copy constructor and assignment. To do this with virtual functions, we need a base class with virtual versions of copy constructor and assignment. Because a constructor cannot be virtual and virtual assignment has its problems (see [Gillam98]), we use normal member functions:

struct Swappable {
    virtual Swappable* clone() const = 0;
    virtual Swappable& assign( const Swappable& rhs) = 0;
    virtual ~Swappable() {}
};

void swap( Swappable& a, Swappable& b) {
    Swappable* tmp = a.clone();
    a.assign(b);
    b.assign(*tmp);
    delete tmp;
}

Now swap works with any type that is derived from Swappable and defines clone and assign appropriately. This is clearly more awkward to use than the swap template. Other problems with dynamic polymorphism are:

- Built-in types cannot be handled directly.

- Inefficiency (virtual function call overhead, no inlining).

- Static type checking is compromised. For example, trying to swap objects of two different types cannot be detected at compile time.

Name Spaces

As library developers we do not own the universe, for example, when it comes to identifier names. An application programmer most likely uses more than one library and creates additional identifier names. It frequently happened that common names such as min, max, swap, or Byte were defined in more than one library, with surprising results when those libraries were used together. Assume a header file "a.h" contains the macro definition

#define Byte unsigned char;

and another header file "b.h" defines

typedef unsigned char Byte;

What happens if we include both header files? Does it depend on the order of inclusion?

A common way out of this dilemma was, and still is, the use of a common prefix for all identifiers of one library. The prefix is typically a short abbreviation, such as std_, CGAL_, or Q for the Qt GUI toolkit, and it is supposed to be different from all other prefixes.

In C++ we have the new concept of name spaces. They act as a scope and group identifiers together. They can be extended at any time. An example:

namespace CGAL {
    int max( int a, int b);
    class A;
    void foo( const A& a);
}   // ends the namespace CGAL


We could use the above declarations in the following way:

int main() {
    int i = CGAL::max( 3, 4);
    CGAL::A a;   // assumes that we have also seen the full definition of A somewhere
    CGAL::foo( a);
}

So far, the name space scope CGAL:: with the so-called scope operator :: has just replaced the prefix in user code. However, within the name space itself we do not have to repeat the scope all the time. Name lookup proceeds from the innermost name space scope to the outside. The global name space scope is denoted by just :: and can be used to name identifiers in the global scope that are hidden by identifiers of the same name in the current local scope.

There is an alternative possibility for calling the function foo; see this example instead:

int main() {
    CGAL::A a;
    foo( a);   // no CGAL:: prefix on foo
}

We just omitted the scope, but now the compiler also examines the argument types and includes the scopes of the argument types in the name lookup. This is well known in C++ as Koenig lookup (also called argument-dependent lookup).

The entire standard C and C++ library has been enclosed in the std namespace. One can import a whole name space or just selected identifiers into the current scope with a using directive or a using declaration, respectively:

using namespace std;   // using directive: imports all names from std
using std::vector;     // using declaration: imports the single name vector

C Preprocessor: Include Guards and assert Macro

C++ has inherited from C its preprocessor, a separate phase of the compiler that processes its own language before the compiler looks at the (transformed) source code. We discuss only a few aspects of the preprocessor, largely because it has almost become superfluous with the new language elements of C++, namely constants and templates.

Symbolic Constants

One can define symbolic constants, such as:

#define M_PI 3.14159265358979323846 #define CGAL_CFG_NO_KOENIG_LOOKUP 1

Each of these definitions defines a replacement rule, where the identifier following the #define gets literally replaced with the text following it. Note that this replacement happens on a purely textual basis; the preprocessor knows nothing about classes, protection, name spaces, and scopes!

Include Guards

The preprocessor has control structures that can control which parts of the source code are compiled and which parts are excluded from the compiler.

assert Macro

3. STL and Generic Programming

Introduction

The Standard Template Library (STL) falls into the class of foundation libraries. It provides basic data types, such as list, vector, set, and map, and it provides basic algorithms, such as find and sort. The STL is part of the C++ Standard Library, but not all templates in the standard library belong to the STL; for example, strings are not part of the STL. Historically, the STL started as an Ada library [Musser89]. It became widely recognized as the report [Stepanov95] started circulating in the C++ standardization committee. The STL became, in a slightly modified form, part of the C++ standard. These modifications make the standard version incompatible with the first STL, but despite that it is still called the STL. A good up-to-date introduction and reference for the STL can be found in [Austern98]. The reference part is also available online from SGI, see [SGI-STL].

The programming paradigm underlying STL is called generic programming. Here is one definition [Jazaeri98]:

Generic programming is a sub-discipline of computer science that deals with finding abstract representations of efficient algorithms, data structures, and other software concepts, and with their systematic organization. The goal of generic programming is to express algorithms and data structures in a broadly adaptable, interoperable form that allows their direct use in software construction. Key ideas include:

- Expressing algorithms with minimal assumptions about data abstractions, and vice versa, thus making them as interoperable as possible.

- Lifting a concrete algorithm to as general a level as possible without losing efficiency; i.e., the most abstract form such that, when specialized back to the concrete case, the result is just as efficient as the original algorithm.

- When the result of lifting is not general enough to cover all uses of an algorithm, additionally providing a more general form, but ensuring that the most efficient specialized form is automatically chosen when applicable.

- Providing more than one generic algorithm for the same purpose and at the same level of abstraction, when none dominates the others in efficiency for all inputs. This introduces the necessity to provide sufficiently precise characterizations of the domain for which each algorithm is the most efficient.

Concept and Model

Consider our first example of a function template, swap:

template <class T>
void swap( T& a, T& b) {
    T tmp = a;
    a = b;
    b = tmp;
}

When the template is instantiated (by calling the function), the placeholder T becomes an actual type. However, compilation can only succeed if this actual type has an assignment operator and a copy constructor. The function could have been implemented using a default constructor and assignment, but the copy constructor is more likely to exist than the default constructor (given that the assignment operator is required anyway).

We can distinguish between syntactic requirements and semantic requirements. The syntactic requirements in our example are the assignment operator and the copy constructor. If an actual type fails to comply with these requirements, a compilation error points that out. The semantic requirements are that the copy constructor and the assignment operator should actually copy the values, should be free of side effects, and in general should behave according to the C++ object model, e.g., tmp = y; x = tmp; should give the same result as x = y;. Remember that these are user-defined functions. Semantic requirements are not checkable at compile time.

Instead of always documenting requirements in all detail, it is convenient to group them in often-used combinations. We call these collections of requirements concepts. The concept for the swap function parameter is called Assignable.

If an actual type fulfills the requirements of a concept, it is called a model for this concept. In our example, int is a model of the concept Assignable.

Common basic concepts

A regular type is one that is a model of Assignable, Default Constructible, and Equality Comparable, and one in which these expressions interact in the expected way; for example, after x = y; we may assume that x == y is true.

In general, concepts factor out common signature and behavior for template arguments. One can think of a concept as the `greatest common denominator' of all types for which a function template is supposed to work. Of course, the function has then to be implemented using only the operations specified in the concept.

In analogy to the object-oriented paradigm, concepts correspond to abstract base classes, and models correspond to derived classes. However, there is the important difference that concepts are nowhere explicitly coded in the language. They are only communicated in documentation. This is a maintenance disadvantage, but also an advantage, because it avoids the coupling of a common base class. A common base class needs a header file, and all derived classes have to agree on this single header file, linking, etc.

In general, the flexibility is resolved at compile time, which gives us the advantages of strong type checking and inline efficiency where needed. If runtime flexibility is needed, the generic data structures and algorithms can be parameterized with a base class, as used in object-oriented programming, to regain the runtime flexibility.

Generic Algorithms Based on Iterators

Algorithmic abstraction is a key goal in generic programming [Musser89]. One aspect is to reduce the interface to the data types used in the algorithm to a set of simple and general concepts. One of them is the Iterator concept in STL, which is an abstraction of pointers. Iterators serve two purposes: they refer to an item, and they traverse over the sequence of items that are stored in a data structure, also known as a container class in STL. Five different categories are defined for iterators: input, output, forward, bidirectional, and random-access iterators, according to the different possibilities of accessing items in a container class. The usual C-pointer referring to a C-array is a model for a random-access iterator.

For reference, the basic concepts from the previous section and their syntactic requirements:

  Concept                 Syntactic requirements
  Assignable              copy constructor, assignment operator
  Default Constructible   default constructor
  Equality Comparable     equality and inequality operators
  LessThan Comparable     order comparison with operators <, <=, >=, and >

The following table shows the different iterator concepts and the refinement relation between them and the basic concepts (see above). The syntactic requirements are only sketched here, see [ISO-C++-98, SGI-STL] for the full requirements.

Sequences of items are specified by a range [first,beyond) of two iterators. This notion of a half-open interval denotes the sequence of all iterators obtained by starting with the iterator first and advancing first until the iterator beyond is reached, but it does not include beyond . The iterator beyond is also referred to as the past-the-end position.

A container class is supposed to provide a member type called iterator , which is a model of the Iterator concept, and two member functions: begin() returns the start iterator of the sequence and end() returns the iterator referring to the past-the-end position of the sequence. The list class template example from the previous section can be extended as follows, though we leave the actual implementation of the iterator open.

template <class T>
class list {
public:
    void push_back( const T& t);   // append t to list
    typedef ... iterator;          // actual iterator type left open
    iterator begin();
    iterator end();
};

Generic algorithms in the STL are not written for a particular container class; they use iterators instead. For example, a generic contains function can be written to work for any model of an input iterator. It returns true iff the value is contained in the values of the range [first,beyond).

template <class InputIterator, class T>
bool contains( InputIterator first, InputIterator beyond, const T& value) {
    while ((first != beyond) && (*first != value))
        ++first;
    return (first != beyond);
}

This generic contains function can be used with C-pointers referring to a C-array. Recall that C-pointers are a model for a random-access iterator, which is more general than an input iterator. The following example declares an array of a hundred integers and searches for a 42.

int a[100];
// ... initialize elements of a.
bool found = contains( a, a+100, 42);

We can also search only a part of an array.

bool in_first_half    = contains( a, a+50, 42);
bool in_third_quarter = contains( a+50, a+75, 42);

This generic contains function can also be used with our list class template, as illustrated in the following example:

list<int> ls;
// ... insert some elements into ls.
bool found = contains( ls.begin(), ls.end(), 42);

A generic copy function copies the values of an iterator range to a sequence starting where another iterator points to. The copy function returns an iterator pointing to the past-the-end position of the target sequence after copying.

template <class InputIterator, class OutputIterator>
OutputIterator copy( InputIterator first, InputIterator beyond, OutputIterator result) {
    while (first != beyond)
        *result++ = *first++;
    return result;
}

Let's copy 100 elements from an array of integers to another array of integers.

int a1[100];
int a2[100];
// ... initialize elements of a1.
copy( a1, a1+100, a2);

The copy function writes over the already existing elements in a2. If we want to copy the 100 elements into a list that is empty at the beginning, we cannot use the begin() iterator of the list. For an empty list the begin() iterator is actually equal to the end() iterator, which is not dereferenceable.

  Concept                  Refinement of                                            Syntactic requirements
  Trivial Iterator         Assignable, Equality Comparable                          operator*(), operator->()
  Input Iterator           Trivial Iterator                                         operator++(), ...
  Output Iterator          Assignable                                               operator*(), operator++(), ...
  Forward Iterator         Input Iterator, Output Iterator, Default Constructible   ...
  Bidirectional Iterator   Forward Iterator                                         operator--(), ...
  Random Access Iterator   Bidirectional Iterator, LessThan Comparable              operator+(), operator+=(), operator-(), operator[](), ...


The STL provides for these cases small adaptors that interface between the concepts. Here, the adaptor is a model of an output iterator, and it uses a model of a container class, here the list, to append a new element to the end of this container whenever an element is written to the iterator. We will see later on how this back_inserter adaptor is actually implemented. Here is an example of how it is used with the copy function and the list class, assuming we still have the array a1 at hand.

list<int> ls;
copy( a1, a1+100, back_inserter(ls));

There are also adaptors to interface between C++ I/O streams and iterators. The following example reads integers from the standard input stream and writes them to the standard output stream, each integer followed by a newline "\n". The istream_iterator constructed with empty parentheses denotes the past-the-end position for this range, which is the end-of-file condition for the stream.

copy( istream_iterator<int>(cin), istream_iterator<int>(),
      ostream_iterator<int>( cout, "\n"));

The concepts in the STL and the adaptors form an extremely flexible toolkit. Most adaptors are small classes and functions. Adaptors for other concepts are easy to add. The whole is more than the sum of its parts.

A First Partial Implementation of an Iterator

The stream iterator adaptor example makes a point: streams, and ranges, can be infinite. For technical reasons, this idea works best with input iterators that generate the sequence on the fly, i.e., they compute the sequence from a small internal state. A first example is an iterator that always refers to a constant value.

template <class T>
class Const_value {
    T t;
public:
    // Default Constructible!
    Const_value() {}
    explicit Const_value( const T& s) : t(s) {}
    // Assignable by default.
    // Equality Comparable (not so easy what that should mean here)
    bool operator==( const Const_value<T>& cv) const {
        return ( this == &cv);
    }
    bool operator!=( const Const_value<T>& cv) const {
        return !(*this == cv);
    }
    // Trivial Iterator:
    const T& operator* () const { return t; }
    const T* operator->() const { return & operator*(); }
    // Input Iterator:
    Const_value<T>& operator++() { return *this; }
    Const_value<T>  operator++(int) {
        Const_value<T> tmp = *this;
        ++*this;
        return tmp;
    }
};

Note that operator!= and operator++(int) are implemented in terms of other member functions of the iterator. In this example, they are unnecessarily complicated. But in general, only a small subset of the member functions needs to be implemented for a new iterator; all other member functions are generic.

Other examples for such simple input iterators are a counting iterator and a random number generator.

Using the concept of lazy evaluation from functional programming languages we can also imagine iterators representing more complex and potentially infinite sequences, for example, the sequence of prime numbers.

However, there is no point in copying an infinite sequence. Instead, we might be interested in a finite subsequence. Another generic function, copy_n solves this. Note that copy_n is not part of the C++ standard, but it is available in most implementations of the STL (or easy to write). (see also Const_value.C)

int a[100];
Const_value<int> cv( 42);
copy_n( cv, 100, a);   // fills a with 100 times 42.

Function Objects

A function object is basically an instance of a class that implements the operator() member function, such that a call to this member function of the object looks like a function call.

  Concept            Refinement of     Syntactic requirements
  Generator          Assignable        function call, no arguments: Result operator()()
  Unary Function     Assignable        function call, one argument: Result operator()(Arg1)
  Binary Function    Assignable        function call, two arguments: Result operator()(Arg1, Arg2)
  Predicate          Unary Function    result type is bool
  Binary Predicate   Binary Function   result type is bool


Function objects are well suited as parameters for generic functions. A typical example is replacing the equality comparison, which is currently hard coded as operator== in the generic contains function from above, with a function object. First, we define a function object equals that performs the same comparison.

template <class T>
struct equals {
    bool operator()( const T& a, const T& b) { return a == b; }
};

We modify the iterator-based generic contains function from above. It needs an additional template parameter Eq and takes an additional function parameter eq for a binary function object, which is used for the comparison.

template <class InputIterator, class T, class Eq>
bool contains( InputIterator first, InputIterator beyond, const T& value, Eq eq) {
    while ((first != beyond) && ( ! eq( *first, value)))
        ++first;
    return (first != beyond);
}

The example using C-arrays with the contains function now needs an additional argument -- the function object. The expression equals<int>() calls the default constructor of the template class equals<int> from above, which yields a function object comparing two integers for equality.

int a[100];
// ... initialize elements of a.
bool found = contains( a, a+100, 42, equals<int>());

The next section illustrates how the additional parameter of the contains function can be selected automatically if the value type of the iterator is known. C++ also allows simple function pointers to be used as function objects. The advantage of objects is that they can have an internal state. We continue our example of the contains function and define a comparison object that is true when the absolute value of the difference of its two arguments is at most eps. The eps value is stored in the function object itself. At construction time of the function object the actual value for eps is initialized, in our example to one, so that the contains function will also return true if the values 41 or 43 occur in the range.

template <class T>
struct eps_equals {
    T epsilon;
    eps_equals( const T& eps) : epsilon(eps) {}
    bool operator()( const T& a, const T& b) {
        return (a-b <= epsilon) && (b-a <= epsilon);
    }
};

bool found = contains( a, a+100, 42, eps_equals<int>(1));

How about a function object that counts the number of comparisons needed as a side effect? Here it is:

template <class T>
struct count_equals {
    size_t& count;
    count_equals( size_t& c) : count(c) {}
    bool operator()( const T& a, const T& b) {
        ++count;
        return a == b;
    }
};

size_t counter = 0;
bool found = contains( a, a+100, 42, count_equals<int>(counter));
// counter contains the number of comparisons needed.

Note that since function objects are usually passed by value in the STL, we store a reference to an external counter, and not the counter value itself, in the function object.

Iterator Traits

Iterators refer to items of a particular value type. Algorithms parameterized with iterators might need the value type directly. Assuming that iterators are implemented as classes, the value type can be defined as a local type of the iterator, as in the following example of an iterator referring to integer values. The value type can be referred to with the expression iterator_over_ints::value_type.

struct iterator_over_ints {
    typedef int value_type;
    // ...
};

Since a C-pointer is a valid iterator but cannot have local types, this approach is not sufficient. The solution chosen for the STL is the iterator traits class, a class template parameterized with an iterator:

template <class Iterator>
struct iterator_traits {
    typedef typename Iterator::value_type value_type;
    // ...
};

The value type of the iterator example class above can now be expressed as iterator_traits< iterator_over_ints >::value_type . 

(For reference: a Predicate is a Unary Function whose result type is bool; a Binary Predicate is a Binary Function whose result type is bool.)

For C-pointers, a specialized version of the iterator traits class exists:

template <class T>
struct iterator_traits<T*> {
    typedef T value_type;
    // ...
};

Now the value type of a C-pointer, e.g., to int, can be expressed as iterator_traits< int* >::value_type. Here, partial specialization is required. The iterator traits class also contains definitions for the difference_type, the iterator_category, the pointer type and the reference type of the iterator.

The example of the generic contains function with the function object from above can be made more convenient for the default use with a default initializer as follows: (see also contains.C)

template <class InputIterator, class T>
bool contains( InputIterator first, InputIterator beyond, const T& value) {
    typedef typename iterator_traits<InputIterator>::value_type value_type;
    typedef equals<value_type> Equal;
    return contains( first, beyond, value, Equal());
}

STL makes use of traits classes in other places as well, for example, char_traits to define the equality test and other operations for a character type. In addition, this character traits class is used as a template parameter for the basic_string class template, which allows the adaptation of the string class to different character sets.

Implementing Adaptable Function Objects

Adaptable function objects require, in addition to the requirements on regular function objects, some local types that describe the result type and the argument types. A function pointer can be a valid model of a function object, but it cannot be a valid model of an adaptable function object.

Small helper classes help to define adaptable function objects easily. For example, our function object equals from above could be derived from std::binary_function to declare the appropriate types.

#include <functional>

template <class T>
struct equals : public std::binary_function<T,T,bool> {
    bool operator()( const T& a, const T& b) { return a == b; }
};

The definition of binary_function in the STL is as follows:

template <class Arg1, class Arg2, class Result>
struct binary_function {
    typedef Arg1   first_argument_type;
    typedef Arg2   second_argument_type;
    typedef Result result_type;
};

Adaptable function objects can be used with adaptors to compose function objects. The adaptors need the annotated type information to declare proper function signatures etc. An example is the negater unary_negate, which takes a unary predicate and is itself a model of a unary predicate, but with negated boolean values.

template <class Predicate>
class unary_negate
    : public unary_function< typename Predicate::argument_type, bool> {
protected:
    Predicate pred;
public:
    explicit unary_negate( const Predicate& x) : pred(x) {}
    bool operator()(const typename Predicate::argument_type& x) const {
        return ! pred(x);
    }
};

The function adaptors are paired with function templates for easy creation. The idea is that the function template derives the type for the template argument automatically (because of the matching types).

template <class Predicate>
inline unary_negate< Predicate> not1( const Predicate& pred) {
    return unary_negate< Predicate>( pred);
}

The adaptable function object concepts and their syntactic requirements (for a model T):

Concept                     Refinement of                                Syntactic requirements
Adaptable Generator         Generator                                    T::result_type
Adaptable Unary Function    Unary Function                               T::result_type, T::argument_type
Adaptable Binary Function   Binary Function                              T::result_type, T::first_argument_type, T::second_argument_type
Adaptable Predicate         Predicate, Adaptable Unary Function
Adaptable Binary Predicate  Binary Predicate, Adaptable Binary Function

A short program in [Stepanov95] makes use of this negater. The program copies all integers from cin to cout that cannot be divided by the integer parameter given to the program. (see also remove_if_divides.C)

int main( int argc, char** argv) {
    if ( argc != 2)
        throw( "usage: remove_if_divides integer\n");
    remove_copy_if( istream_iterator<int>(cin), istream_iterator<int>(),
                    ostream_iterator<int>(cout, "\n"),
                    not1( bind2nd( modulus<int>(), atoi( argv[1]))));
    return 0;
}

The other function object adaptor in this example, bind2nd, is again a small helper function to create an object of type binder2nd.

template < class Operation, class Tp>
inline binder2nd< Operation>
bind2nd( const Operation& fn, const Tp& x) {
    typedef typename Operation::second_argument_type Arg2_type;
    return binder2nd< Operation>( fn, Arg2_type(x));
}

An object of type binder2nd stores an adaptable binary function object and a value compatible with the type of the second argument of the adaptable binary function object. The object itself then behaves like a unary function object. Whenever its operator() is called, it returns the value of the binary function object called with its argument and its internally stored value as second argument. This adaptor binds a value to the free variable of the second argument of a binary function object. There is a similar adaptor called binder1st that binds a value to the first argument. This is similar to currying known from functional programming languages (it needs much more writing in C++ to make it work, but then it works). So, these are higher-order function objects.

template <class Operation>
class binder2nd
    : public unary_function< typename Operation::first_argument_type,
                             typename Operation::result_type> {
protected:
    Operation op;
    typename Operation::second_argument_type value;
public:
    binder2nd( const Operation& x,
               const typename Operation::second_argument_type& y)
        : op(x), value(y) {}
    typename Operation::result_type
    operator()(const typename Operation::first_argument_type& x) const {
        return op(x, value);
    }
};

Other function object adaptors exist that can compose function objects, or encapsulate function pointers and member function pointers in adaptable function objects.

Implementation of the Iterator Adaptor back_inserter

The back_insert_iterator is a class template that is a model of an output iterator. It keeps a pointer to a container class as internal state. Each time an expression of the form *i = value; for a back_insert iterator i is evaluated, the value is appended to the container using the push_back() member function.

template <class Container>
class back_insert_iterator {
protected:
    Container* container;
public:
    typedef Container           container_type;
    typedef output_iterator_tag iterator_category;
    typedef void                value_type;
    typedef void                difference_type;
    typedef void                pointer;
    typedef void                reference;
    explicit back_insert_iterator(Container& x) : container(&x) {}
    back_insert_iterator<Container>&
    operator=(const typename Container::value_type& value) {
        container->push_back(value);
        return *this;
    }
    back_insert_iterator<Container>& operator*()     { return *this; }
    back_insert_iterator<Container>& operator++()    { return *this; }
    back_insert_iterator<Container>& operator++(int) { return *this; }
};

A small helper function template again provides the convenience of not typing the template arguments explicitly.

template <class Container>
inline back_insert_iterator<Container> back_inserter(Container& x) {
    return back_insert_iterator<Container>(x);
}

Here is a short example of its use with a list class (a1 is an array as in the earlier examples).

list<int> ls;
copy( a1, a1+100, back_inserter(ls));


More Iterators

See Iterator_identity.h and Iterator_identity.C for an adaptor class that takes an iterator and behaves exactly like this iterator. The example in Iterator_base.h and Iterator_base.C implements the same adaptor, but based on the Barton-Nackman trick.

Function Dispatch using Iterator Category at Compile Time

An iterator belongs to a specific iterator category. This category can be used to select between different algorithms. For example, the difference between two iterators can be computed in constant time for random access iterators, but can only be computed in linear time (by counting) for all other categories.

The C++ standard defines five empty classes to denote the different iterator categories. These types will be used as symbolic tags at compile time.

struct input_iterator_tag {};
struct output_iterator_tag {};
struct forward_iterator_tag : public input_iterator_tag {};
struct bidirectional_iterator_tag : public forward_iterator_tag {};
struct random_access_iterator_tag : public bidirectional_iterator_tag {};

An iterator is assumed to have a local type iterator_category that is defined to be one of these tags.

struct Some_iterator {
    typedef forward_iterator_tag iterator_category;
    // ...
};

This iterator category is accessed using iterator traits. Now we can implement a generic distance function (original implementation as it is in the STL):

template <class InputIterator>
inline typename iterator_traits<InputIterator>::difference_type
__distance( InputIterator first, InputIterator last, input_iterator_tag) {
    typename iterator_traits<InputIterator>::difference_type n = 0;
    while (first != last) {
        ++first;
        ++n;
    }
    return n;
}

template <class RandomAccessIterator>
inline typename iterator_traits<RandomAccessIterator>::difference_type
__distance( RandomAccessIterator first, RandomAccessIterator last,
            random_access_iterator_tag) {
    return last - first;
}

template <class InputIterator>
inline typename iterator_traits<InputIterator>::difference_type
distance( InputIterator first, InputIterator last) {
    typedef typename iterator_traits<InputIterator>::iterator_category Category;
    return __distance(first, last, Category());
}

Note how the class hierarchy among the iterator tags is used to reduce the number of overloaded functions __distance that need to be implemented here. Following the refinement relation of the iterator concepts, the forward_iterator_tag should also be derived from the output_iterator_tag. Obscure reasons concerning multiple derivation kept this derivation out of the standard. On the other hand, this derivation isn't likely to simplify real implementations anyway.

These tags are quite convenient for annotating symbolic information at compile time. However, there is a catch. An object always has non-zero size, even an object of an empty class. This is reasonable (the address identifies an object) and helps in defining invariants about size, allocation, arrays, etc. However, if we derive from an empty class, as we do with function objects and binary_function<Arg1,Arg2,Result>, we would like to avoid any size penalties. In principle the compiler can perform this optimization, known as the empty base optimization; whether a given compiler does can be checked with the following program.

#include <iostream>
using namespace std;

class A {};
class B : public A { int i; };
class C { int i; };

int main() {
    cout << "size of A = " << sizeof(A) << endl;
    cout << "size of B = " << sizeof(B) << endl;
    cout << "size of C = " << sizeof(C) << endl;
    return 0;
}

4. Design Strategies

Domain Design

The domain engineering process (see Generative Programming) consists of three main tasks: domain analysis, domain design, and domain implementation. The purpose of the second step, domain design, is to develop the architecture. The domain engineering method DEMRAL [Czarnecki00] divides domain design into three activities:

1. Identification and specification of the overall implementation architecture  

2. Identification and specification of domain-specific languages (DSLs)  

3. Specification of the configuration knowledge  

It involves the following tasks:  

- Scope domain model for implementation
- Identify packages
- Develop architectures and identify implementation components
- Identify user DSLs
- Identify interactions between DSLs
- Specify DSLs and their translation

Domain Specific Languages

A domain specific language (DSL) is a specialized problem-oriented language. DSLs range from separate special languages such as SQL to implicit language extensions such as a library user interface. Conventional library interfaces as DSLs are restricted with respect to:

- syntactic and semantic extensions
- domain-specific optimizations
- domain-specific error checking

In C++, template metaprogramming offers a tool to address these limitations.

Generative programming emphasizes multiple modular, composable DSLs instead of a single monolithic DSL. DEMRAL has two kinds of DSLs:

- Configuration DSLs: Used for specifying the configuration of data types and algorithms.
- Expression DSLs: Used for writing expressions involving the library data types. (see Expression Templates)

An example from [Czarnecki00]:

// specify matrix configurations using a configuration DSL
typedef MATRIX_GENERATOR< matrix< double, rect<> > >::RET RectMatrixType;
typedef MATRIX_GENERATOR< matrix< double, symm<> > >::RET SymmMatrixType;

// matrix definitions
RectMatrixType R1(3, 3), R2(3, 3);
SymmMatrixType S(3, 3);

// initialization and computation using an expression DSL
R1 = 1, 0, 3,
     0, 1, 4,
     2, 0, 1;
S = 4, 4, 5,
    4, 2, 6,
    5, 6, 1;

R2 = (R1 + S) * R1;

Configuration DSLs

A user of a library specifies the configuration of a library data structure or algorithm using a configuration DSL. The specification is then translated into a concrete configuration expressed in the implementation components configuration language (ICCL).


The DSL and the ICCL serve different purposes. The DSL is designed to provide a convenient interface for the user of the library. The ICCL is designed to allow flexible, reusable implementation components. The ICCL describes the full configuration, which can be complex, while the DSL allows specification at different levels of detail:

- no details (use defaults)
- usage profile (for example, optimize for space or time)
- direct specification of components
- user-provided implementation

Having both DSL and ICCL separates the problem space and the solution space. This allows independent evolution of the user code and the implementation component code. Most libraries do not make a clear separation between DSL and ICCL. For example, in STL, the configuration is usually directly specified by the user. This separation has a central role in generative programming, where the translation is performed by generators.

Policy-Based Class Design

Policy-based class design [Alexandrescu01] is a component-oriented architectural design technique. It decomposes the functionality of a class into policies. Each policy can have multiple implementations, which are called policy classes. The main class, or host class, obtains the functionality of a policy by taking a policy class as a template argument.

 

As an example, let us define a policy for creating objects. Here are two implementations:

template <class T>
struct OpNewCreator {
    static T* create() { return new T; }
};

template <class T>
struct MallocCreator {
    static T* create() {
        void* buf = std::malloc(sizeof(T));
        if (!buf)
            return 0;
        return new(buf) T;
    }
};

We also have a host class that uses the policy.

template <class CreationPolicy = OpNewCreator<Widget> >
class WidgetManager : public CreationPolicy {
    // ...
};

Now the user is able to select the way WidgetManager creates objects.

typedef WidgetManager<> DefaultWidgetManager;
typedef WidgetManager< MallocCreator<Widget> > MallocWidgetManager;

Assuming WidgetManager always wants a creation policy for objects of type Widget , requiring the user to specify this is redundant and even dangerous. To avoid this, we can use template template parameters. This also allows the WidgetManager to create objects of other types using the same policy:

// library code
template <template <class> class CreationPolicy = OpNewCreator>
class WidgetManager : public CreationPolicy<Widget> {
    // ...
    void doSomething() {
        Gadget* q = CreationPolicy<Gadget>().create();
        // ...
    }
    // ...
};

// application code
typedef WidgetManager<MallocCreator> MallocWidgetManager;

Enriched Policies

A creation policy class is required to have a member function create, but it can also have additional functionality:

template <class T>
struct PrototypeCreator {
    PrototypeCreator(T* p = 0) : prototype(p) {}
    T* create() {
        return prototype ? prototype->clone() : 0;
    }
    T* get_prototype()       { return prototype; }
    void set_prototype(T* p) { prototype = p; }
private:
    T* prototype;
};

The host class inherits this additional functionality, and the user can take advantage of it:

typedef WidgetManager<PrototypeCreator> MyWidgetMgr;
// ...
Widget* p = ...;
MyWidgetMgr mgr;
mgr.set_prototype(p);
// ...

The "lazy" implicit instantiation of member functions of class templates enables even WidgetManager itself to use the additional functionality:

template <template <class> class CreationPolicy = OpNewCreator>
class WidgetManager : public CreationPolicy<Widget> {
    // ...
    void switch_prototype(Widget* p) {
        CreationPolicy<Widget>& myPolicy = *this;
        delete myPolicy.get_prototype();
        myPolicy.set_prototype(p);
    }
    // ...
};

Users can still create objects of type WidgetManager<OpNewCreator> even though OpNewCreator does not have the member functions get_prototype and set_prototype. A compiler error will occur only if the user tries to call switch_prototype.

Combining Policies

The power of policies becomes more apparent when there are multiple policies. The Loki library [Alexandrescu01] contains an implementation of smart pointers as a host class with four policies:

template <
    class T,
    template <class> class OwnershipPolicy = RefCounted,
    class ConversionPolicy                 = DisallowConversions,
    template <class> class CheckingPolicy  = AssertCheck,
    template <class> class StoragePolicy   = DefaultSPStorage
>
class SmartPtr;

Each policy has multiple implementations in Loki:

- Ownership policy: DeepCopy, RefCounted, RefCountedMT, COMRefCounted, RefLinked, DestructiveCopy, and NoCopy.
- Conversion policy: AllowConversion and DisallowConversion.
- Checking policy: AssertCheck, AssertCheckStrict, RejectNullStatic, RejectNull, RejectNullStrict, and NoCheck.
- Storage policy: DefaultSPStorage, ArrayStorage, LockedStorage, and HeapStorage.

Altogether this gives 7x2x6x4 = 336 combinations using 1+7+2+6+4 = 20 components. That is, 336 different smart pointer classes! Furthermore, later versions of the library could add new policy classes, and users can define their own policy classes.

Named Template Arguments

In SmartPtr, all the policies have defaults; just writing SmartPtr<int> gives the default configuration. Overriding the default ownership policy is also simple: SmartPtr<int, NoCopy>. But if one wants to override only the default storage policy, all the template arguments must be given. A more convenient configuration DSL would allow one to write, for example

SmartPtr<int, StoragePolicy_is<LockedStorage> > sp;

A technique for implementing such named template arguments is described in [Vandevoorde03, Section 16.1]. Let us consider a simple example: a BreadSlicer class with three policy parameters Policy1, Policy2 and Policy3. The BreadSlicer itself is defined as follows.

template <
    class PolicySetter1 = DefaultSetter,
    class PolicySetter2 = DefaultSetter,
    class PolicySetter3 = DefaultSetter
>
class BreadSlicer {
    typedef PolicySelector< PolicySetter1, PolicySetter2, PolicySetter3 > Policies;
    // use policies: Policies::P1
    //               Policies::P2
    //               Policies::P3
    // ...
};

Here BreadSlicer's template arguments are policy setters instead of the policies themselves. The policy setters can be defined like this.

struct DefaultPolicies {
    typedef DefaultPolicy1 P1;
    typedef DefaultPolicy2 P2;
    typedef DefaultPolicy3 P3;
};

class DefaultSetter : virtual public DefaultPolicies {};

template <class Policy>
struct Policy1_is : virtual public DefaultPolicies { typedef Policy P1; };
template <class Policy>
struct Policy2_is : virtual public DefaultPolicies { typedef Policy P2; };
template <class Policy>
struct Policy3_is : virtual public DefaultPolicies { typedef Policy P3; };

Finally, the PolicySelector extracts the actual policies from the policy setters.

template <class Base, int D>
struct Discriminator : public Base {};

template < class Setter1, class Setter2, class Setter3 >
class PolicySelector : public Discriminator<Setter1,1>,
                       public Discriminator<Setter2,2>,
                       public Discriminator<Setter3,3> {};

The purpose of the Discriminator template is to avoid an error when two or more of the setters are the same class (DefaultSetter), since a class may not inherit directly from the same base class twice.

Let us see what happens when we override the default for the second policy: BreadSlicer< Policy2_is<CustomPolicy> >. This leads to the definition of PolicySelector< Policy2_is<CustomPolicy>, DefaultSetter, DefaultSetter >, which inherits (through the discriminators) from Policy2_is<CustomPolicy> and from two copies of DefaultSetter, all of which derive virtually from DefaultPolicies.

Note that there is only one DefaultPolicies subobject, because DefaultPolicies is a virtual base class of the setters. This is important to avoid defining P1 and P3 multiple times, which would cause a compiler error due to ambiguity. P2 is defined twice, but the one in Policy2_is<CustomPolicy> dominates the one in DefaultPolicies, because Policy2_is<CustomPolicy> is the more derived class. This is exactly what we want.

GenVoca Architecture


GenVoca is a software architecture model that is similar to policy-based architectures in that final types are built from components, but the components are organized differently. In GenVoca the components are typically not policies that are plugged in to implement some part of the functionality, but wrappers (layers) on top of more basic components, adding functionality to them.

The key GenVoca terms are layers and realms. A layer represents a component and a realm is a standardized interface exported by components. A layer is said to belong to a realm if it exports the required interface. A layer can also import one or more realms. Layers and realms correspond roughly to models and concepts in STL parlance.

A specific GenVoca model can be represented by a GenVoca grammar. For example, in the grammar

R : A | B[R] S : C[R] | D[S] | E[R,S]

R and S are realms, and A, B, C, D and E are layers. A GenVoca grammar defines an ICCL. Possible configurations of the model can be described by GenVoca expressions. For example, valid configurations of the realm S in the above grammar include C[A], C[B[A]], D[C[A]], and E[B[A], D[C[A]]].

In an important class of GenVoca models, called stacking models, layers are organized into a hierarchical set of layer categories. Each layer in a category belongs to the same realm and, except for the bottom category, takes exactly one parameter which must belong to the realm of the layer category below. Then valid configurations include exactly one layer from each category. As an extension, parameters of the same realm and optional layer categories can be allowed. For example, A has a parameter of the same realm and R3 is an optional category in the following stacking model:

R1 : A[R1] | B[R2] | C[R2] R2 : D[R3] R3 : F[R4] | G[R4] | R4 R4 : H | I

GenVoca Example: List

As an example of a GenVoca architecture, let us design an architecture for a simple list. The design starts from a feature diagram of the list domain (not reproduced here); more details on this example can be found in [Czarnecki00, Chapter 12].

Designing a GenVoca architecture consists of the following steps.

1. Identify the main responsibilities in the feature diagram. In our example, the responsibilities include: 

- storage of elements
- copying of elements
- destroying elements
- dynamic type checking (to ensure monomorphism)
- length counter
- tracing

2. Enumerate component categories and components per category. There is a component category for each responsibility of the previous step and a component for each alternative implementation of the responsibility. In the example, the categories and components are:

- BasicList: PtrList
- Copier: PolymorphicCopier, MonomorphicCopier, EmptyCopier
- Destroyer: ElementDestroyer, EmptyDestroyer


- TypeChecker: DynamicTypeChecker, EmptyTypeChecker
- Counter: LengthList
- Tracing: TracedList

3. Identify "use" dependencies between component categories. The dependencies of the example are shown in a use-dependency diagram (not reproduced here).

4. Sort the categories into a layered architecture. A "user" component category is placed in a higher layer category than the "used" one. For our example, we get four layer categories. Tracing, Counter and BasicList each form their own layer categories. The component categories that do not depend on other categories are grouped into the bottom ConfigurationRepository (Config for short) layer category. Tracing and Counter are optional layer categories.

5. Write down the GenVoca grammar.

List           : TracedList[OptCounterList] | OptCounterList
OptCounterList : LengthList[BasicList] | BasicList
BasicList      : PtrList[Config]
Config         :
Copier         : PolymorphicCopier | MonomorphicCopier | EmptyCopier
Destroyer      : ElementDestroyer | EmptyDestroyer
TypeChecker    : DynamicTypeChecker | EmptyTypeChecker
ElementType    : [ElementType]
LengthType     : int | short | long | ...

Note how there is no one-to-one correspondence between the feature diagram and the GenVoca grammar. For example, Copier is affected by both the Ownership and the Morphology features.

Implementing a GenVoca Architecture

We will now implement the list described by the above GenVoca grammar. We need to implement the following components:

- TracedList
- LengthList
- PtrList
- PolymorphicCopier
- MonomorphicCopier
- EmptyCopier
- ElementDestroyer
- EmptyDestroyer
- DynamicTypeChecker
- EmptyTypeChecker

The full implementation of the components is in list_components.h. (See also list_generator.h and list_example.C.)  

A user can define a particular configuration by writing a configuration repository, for example:

struct TracedIntListConfig {
    typedef int ElementType;
    typedef const ElementType ElementArgumentType;
    typedef MonomorphicCopier<ElementType> Copier;
    typedef ElementDestroyer<ElementType> Destroyer;
    typedef DynamicTypeChecker<ElementType> TypeChecker;
    typedef TracedList<PtrList<TracedIntListConfig> > FinalListType;
};
typedef TracedIntListConfig::FinalListType TracedIntList;

Note the circularity in the definition. The configuration repository is the innermost (bottom) layer of the list, but contains the final list type as its member. This is necessary, because the higher layers need the final list type. The type ElementArgumentType is used as the argument type when inserting elements. It is either ElementType or const ElementType depending on the configuration. LengthType is not defined here, since this list configuration has no length counter.

Let us then implement the components. This is a very bare-bones implementation concentrating on the architectural aspects. We start with the basic list PtrList:

template <class Config_>
class PtrList {
public:
    // export Config
    typedef Config_ Config;
private:
    // retrieve the needed types from the repository
    typedef typename Config::ElementType         ElementType;
    typedef typename Config::ElementArgumentType ElementArgumentType;
    typedef typename Config::Copier              Copier;
    typedef typename Config::Destroyer           Destroyer;
    typedef typename Config::TypeChecker         TypeChecker;
    typedef typename Config::FinalListType       FinalListType;

    // data members
    ElementType*   head_;
    FinalListType* tail_;  // note: not PtrList* but FinalListType*

public:
    PtrList( ElementArgumentType& h, FinalListType* t = 0)
        : head_(0), tail_(t) { set_head(h); }
    ~PtrList() { Destroyer::destroy(head_); }
    void set_head(ElementArgumentType& h) {
        TypeChecker::check(h);
        head_ = Copier::copy(h);
    }
    ElementType& head()             { return *head_; }
    void set_tail(FinalListType* t) { tail_ = t; }
    FinalListType* tail() const     { return tail_; }
};

This primitive singly linked list could be used as follows (see list_example.C for more):

template <class List>
void print_list(List* l) {
    std::cout << "[ ";
    for ( ; l; l = l->tail())
        std::cout << l->head() << " ";
    std::cout << "]\n";
}

template <class List>
void push_front(typename List::ElementArgumentType& e, List*& l) {
    l = new List(e, l);
}

int main() {
    typedef ListConfig::FinalListType List;
    List* ls = 0;
    push_front(1, ls);
    push_front(2, ls);
    push_front(3, ls);
    print_list(ls);  // prints "[ 3 2 1 ]"
}

PtrList delegates some of the work to other components. The way elements are copied (or not) is determined by the Copier, which is one of the following:

template <class ElementType>
struct MonomorphicCopier {
    static ElementType* copy(const ElementType& e) {
        return new ElementType(e);
    }
};

template <class ElementType>
struct PolymorphicCopier {
    static ElementType* copy(const ElementType& e) {
        return e.clone();  // polymorphic copy using
    }                      // virtual member function clone()
};

template <class ElementType>
struct EmptyCopier {
    static ElementType* copy(ElementType& e) {  // note: not const
        return &e;  // no copy
    }
};

The components for element destruction and type checking are even simpler:

template <class ElementType>
struct ElementDestroyer {
    static void destroy(ElementType* e) { delete e; }
};

template <class ElementType>
struct EmptyDestroyer {
    static void destroy(ElementType* e) {}  // do nothing
};

template <class ElementType>
struct DynamicTypeChecker {
    static void check(const ElementType& e) {
        assert(typeid(e) == typeid(ElementType));
    }
};

template <class ElementType>
struct EmptyTypeChecker {
    static void check(const ElementType& e) {}
};

Finally, the higher layers are implemented as inheritance-based wrappers. Only LengthList is given here; TracedList is similar and can be found in list_components.h.

template <class BasicList>
class LengthList : public BasicList {
public:
    // export config
    typedef typename BasicList::Config Config;
private:
    // retrieve the needed types from the repository
    typedef typename Config::ElementType         ElementType;
    typedef typename Config::ElementArgumentType ElementArgumentType;
    typedef typename Config::LengthType          LengthType;
    typedef typename Config::FinalListType       FinalListType;

    LengthType length_;
    LengthType compute_length() const {
        return this->tail() ? this->tail()->length() + 1 : 1;
    }
public:
    LengthList( ElementArgumentType& h, FinalListType* t = 0)
        : BasicList(h, t) { length_ = compute_length(); }
    void set_tail(FinalListType* t) {
        BasicList::set_tail(t);
        length_ = compute_length();
    }
    LengthType length() const { return length_; }
};

Generators

The list implementation described above defines the components but contains little configuration information. This makes the components simple, flexible, and reusable, but leaves a significant part of the final list definition to the writer of the configuration repository. This is too tedious and error-prone a task to be left to the user. For example, the combination of MonomorphicCopier and EmptyDestroyer would likely lead to memory leaks. We will next write a list generator to automate the configuration.

A generator takes a possibly incomplete requirements specification written in a more convenient configuration DSL and produces the finished type. In general, a configuration generator performs the following tasks:

- complete the specification (compute defaults)
- check that the specification is valid
- assemble the components into the finished type.

In our example, the configuration DSL does not allow invalid specifications, and the only defaults are provided by default template arguments.  

First, we need to define the configuration DSL for specifying list configurations. It is based on the features in the feature diagram rather than the implementation components. We will represent the features using enumeration types:

enum Ownership   {ext_ref, own_ref, cp};
enum Morphology  {mono, poly};
enum CounterFlag {with_counter, no_counter};
enum TracingFlag {with_tracing, no_tracing};

The generator takes the set of features as template arguments, translates the specification into a configuration repository (configuration DSL to ICCL translation), and produces the final list type as result. The generator does the translation at compile time, and uses a helper metafunction IF to choose types based on boolean constants (see template metaprogramming for more about metafunctions):

template <bool condition, class Then, class Else>
struct IF {
    typedef Then RET;
};

template <class Then, class Else>
struct IF<false, Then, Else> {
    typedef Else RET;
};

The generator itself is here (and in list_generator.h).

template <
    class ElementType_,
    Ownership ownership = cp,
    Morphology morphology = mono,
    CounterFlag counter_flag = no_counter,
    TracingFlag tracing_flag = no_tracing,
    class LengthType_ = int >
class LIST_GENERATOR {
public:
    // forward declaration of the configuration repository
    struct Config;
private:
    // define the constants used for type selection
    enum {
        is_copy      = ownership == cp,
        is_own_ref   = ownership == own_ref,
        is_mono      = morphology == mono,
        has_counter  = counter_flag == with_counter,
        does_tracing = tracing_flag == with_tracing
    };
    // select the components
    typedef typename IF< is_copy || is_own_ref,
                         ElementDestroyer<ElementType_>,
                         EmptyDestroyer<ElementType_> >::RET Destroyer_;
    typedef typename IF< is_mono,
                         DynamicTypeChecker<ElementType_>,
                         EmptyTypeChecker<ElementType_> >::RET TypeChecker_;
    typedef typename IF< is_copy,
                         typename IF< is_mono,
                                      MonomorphicCopier<ElementType_>,
                                      PolymorphicCopier<ElementType_> >::RET,
                         EmptyCopier<ElementType_> >::RET Copier_;
    typedef typename IF< is_copy,
                         const ElementType_,
                         ElementType_ >::RET ElementArgumentType_;
    // define the list type
    typedef PtrList<Config> BasicList;
    typedef typename IF< has_counter,
                         LengthList<BasicList>,
                         BasicList >::RET OptLengthList;
    typedef typename IF< does_tracing,
                         TracedList<OptLengthList>,
                         OptLengthList >::RET List;
public:
    // return the final list type
    typedef List RET;
    // define the configuration repository
    struct Config {
        typedef ElementType_         ElementType;
        typedef ElementArgumentType_ ElementArgumentType;
        typedef Copier_              Copier;
        typedef Destroyer_           Destroyer;
        typedef TypeChecker_         TypeChecker;
        typedef LengthType_          LengthType;
        typedef RET                  FinalListType;
    };
};

Now we can specify the list configuration TracedIntList we saw earlier as follows:

typedef LIST_GENERATOR<int, cp, mono, no_counter, with_tracing>::RET TracedIntList;

If no tracing were required, we could simply write

typedef LIST_GENERATOR<int>::RET IntList;

More examples can be found in list_example.C.

5. Advanced C++ Programming

Introduction

C++ provides many ways of abstraction, ways to shield the user from implementation details. We will discuss several important concepts in this chapter: library initialization, const correctness, breaking cyclic type dependencies with templates, proxy classes, various smart pointers, and double dispatch, a technique to make a function `virtual' for two arguments simultaneously.

However, it is useful to remember that all the protections in C++ require cooperation from the user. John Lakos describes in [Lakos96] the extreme measures a developer team chose when the library they were using turned out to be too restrictive:

#define private   public
#define protected public
#define class     struct

The standard does not guarantee that this really works, but it is pretty effective and probably works with most compilers. The developers were well aware of their sin, but they were desperate.

We can distinguish between two kinds of protection a design can provide: protection against Murphy, and protection against Machiavelli. Murphy describes a user who occasionally makes mistakes, while Machiavelli describes a user who willingly tries to get around the protection mechanism. Protection against Machiavelli in C++ is almost impossible, as C-style type casts and the example above illustrate. An effective but expensive solution would be opaque pointers and a link library manipulating the opaque pointer, where the sources of the underlying data are not published. Here, we usually discuss solutions that protect against Murphy.

Automatic Library Initialization and Housekeeping

In C

Initialization of global and static variables is automatic. But no function call can be triggered automatically in C, not even with the C preprocessor. Libraries have to be initialized explicitly if the initialization isn't trivial.

Depending on the C runtime library, it might be possible to register, with atexit() or on_exit() during initialization, a callback function that will be called upon the exit of the main function. Otherwise, housekeeping also has to be implemented as an explicit function call. (see also housekeeping.c)

In C++

We use a static member variable. Its default constructor initializes the library, its destructor performs housekeeping. However, an explicit housekeeping function might be appropriate to avoid the uncertainties about when the destructor finally gets called. The only restriction of this solution is that the initialization cannot rely on the initialization of static member variables of other classes. The order of initialization for non-local static objects is unspecified. Finding a feasible order would be too hard; in fact, it is Halting-Problem equivalent [Item 47, Meyers97] (this reference also describes how to get around this restriction with local static variables in global functions). Note that C++ always initializes non-local static objects, even if they are not used, but they have to be in a compilation unit that actually gets linked to the executable. Thus we write a header file with the following declaration and include it in each header that relies on the library being initialized. The static member variable count now counts how many different compilation units are initialized. The library is initialized for the first compilation unit, and housekeeping is performed once the last compilation unit destructs its static _init_var (see also [Page 640, Stroustrup97] for this solution).

#ifndef INIT_H
#define INIT_H
class Init {
    static unsigned int count;
public:
    Init();
    ~Init();
};

// Trigger constructor and destructor calls in each compilation unit.
// An unnamed namespace can be used to avoid name collisions.
namespace {
    static Init _init_var;
}
#endif // INIT_H

There is a catch with this solution if we have a template library that does not have any object files for linking -- we need to link with an object file that contains the definition of the static member variable count, and this definition must exist in only one compilation unit. We also implement the constructor and the destructor to maintain the counter and to perform the initialization and housekeeping. (see also EX/Init.h and Init.C for a full implementation in the library example lib/)

unsigned int Init::count = 0;

Init::Init() {             // default constructor
    if ( 0 == count++) {
        // perform initialization
    }
}

Init::~Init() {            // destructor
    if ( 0 == --count) {
        // perform housekeeping
    }
}

For a template library which consists only of header files, we can basically use the same class, but we make it a class template with a dummy template parameter, which allows us to move the static member variable definition from the object file into the header file. (see EX/Template_init.h in the library example lib/)

Const Correctness

(See also [Item 21 and 29, Meyers97].)  

Const Declarations in C and C++

A const declaration of a variable forbids changes of the variable after its initialization.

const int i = 5;
// i = 6;   // violates const declaration

A pointer can be declared const as well; then the value of the pointer cannot change, but the value it refers to can change. As well, a pointer can be declared to point to a constant value. And both can be combined. Here is the table of all four combinations:

                      // the pointer    the data it refers to
int* p;               // non-const      non-const
int* const q;         // const          non-const
const int* r;         // non-const      const
const int* const s;   // const          const

I read these declarations from the inside out. For example, for const int* const s I start with s, the variable. const makes it constant. * makes it a pointer, thus a constant pointer. int makes it a constant pointer referring to a value of type int, and finally the left const declares the value of type int to be constant as well.

Member variables in classes can be declared constant as well. However, they can only be initialized with the constructor initializer syntax, not by an assignment in the constructor body. An example:

struct A {
    const int i;
    A() : i(42) {}
};

Make Temporary Return Objects in C++ Classes Const

Member functions of classes in C++ have a hidden parameter this. For a class C, this parameter is of type C* const if the member function is declared non-const, or it is of type const C* const if the member function is declared const.

struct C {
    void foo();         // hidden parameter: C* const this;
    void bar() const;   // hidden parameter: const C* const this;
};

For built-in data types, C and C++ distinguish between l-values and r-values. L-values can be used on the left side of an assignment; they are non-const. R-values cannot be used on the left side of an assignment, only on the right side; they are const. For example, the post-increment operator requires an l-value, but is itself an r-value. Thus, we cannot write:

int i;
i++ ++;   // second ++ forbidden!

For classes in C++ we need to model this behavior explicitly using const declarations. Consider the following class:

struct A {
    A operator++ (int);   // the post-increment operator
};

Now, let's try:


A a;
a++++;

Fine, that works. It works because a++ returns a temporary object of type A. But it probably does not do what one would expect. Since the second ++ works on a temporary object, a itself only gets incremented once. We can forbid the second increment explicitly by making the return type, the type of the temporary object, const. This should be considered for all similar temporary return types.

struct A {
    const A operator++ (int);   // the post-increment operator
};

Const Correctness and C++ Classes

Given that an object of a class is declared const, the compiler guarantees bitwise constness, i.e., none of its non-static member variables can change their value. On the other hand, a user of a class expects conceptual constness, which means that an object declared const cannot change its observable state.

Bitwise constness and conceptual constness are not the same, for two reasons: internal, unobservable variables, and pointers.

There might be internal variables that can change but that cannot be observed. An example would be a string class that maintains a cache of the string length. The status of the cache cannot be observed from the outside (except in the runtime difference).

class string {
    char*  s;
    size_t l;
    bool   valid;
public:
    string() : s(0), l(0), valid(true) {}
    size_t length() const;   // const, does not change string conceptually
    ...                      // some more functions modifying s and setting valid to false
};

size_t string::length() const {
    if ( ! valid) {
        l = strlen(s);   // error, violates bitwise constness checked by the compiler!
        valid = true;    // error, violates bitwise constness checked by the compiler!
    }
    return l;
}

The keyword mutable solves the problem here. It can be applied to a member variable to cancel out the const declaration. Changing the class definition as follows, the example from above works.

class string {
    char*          s;
    mutable size_t l;
    mutable bool   valid;
public:
    string() : s(0), l(0), valid(true) {}
    size_t length() const;   // const, does not change string conceptually
    ...                      // some more functions modifying s and setting valid to false
};

The second problem of bitwise constness occurs with pointers and references. Bitwise constness of a pointer says that the pointer cannot change, but the referenced value can change. However, if the referenced value is considered to be part of the observable state, the referenced value should be considered constant as well. In the example of the string class, the pointer s cannot change, but the referenced character array could be changed. The implementor of the string class explicitly has to take care (and can take care) that this cannot happen. The key is that the class never exposes, i.e., returns, a non-const pointer or non-const reference to the array elements. Two examples: the array index operator to a single character element in the array, and a member function returning the raw character pointer. Note that we want to keep the full functionality for the non-const case as well. We implement both member functions twice, once for the const case, and once for the non-const case. (see also string.C)

class string {
    char*          s;
    mutable size_t l;
    mutable bool   valid;
public:
    string() : s(0), l(0), valid(true) {}
    char& operator[]( int idx)             { valid = false; return s[idx]; }
    const char& operator[]( int idx) const { return s[idx]; }
    char* & get_pointer()                  { valid = false; return s; }
    const char* get_pointer() const        { return s; }
    ...   // some more functions
};

Given this string class, we can use it safely without breaking const correctness.

int main() {
    char buf[6] = "Hallo";   // German
    string s;
    s.get_pointer() = buf;
    assert( s[1] == 'a');
    s[1] = 'e';              // Now it's in English
    assert( s[1] == 'e');
    // get a const reference to the same string:
    const string& r = s;
    // r.get_pointer() = "Salute";   // does not work with const char*, r-value
    assert( r[1] == 'e');
    // r[1] = 'a';   // no, does not work with const char&
}

Proxy Classes

A dynamic two-dimensional array of integers could be declared in C++ as follows:

class Array2D {
public:
    Array2D( int dim1, int dim2);
    // ...
};

Of course, in a program we would like to use the array similarly to the built-in (static) two-dimensional arrays and access an element as follows:

int main() {
    Array2D a(5,10);
    // ...
    int i = a[2][8];
}

However, there is no operator[][] in C++. Instead, we can implement operator[] to return, conceptually, a one-dimensional array, to which we can apply operator[] again to retrieve the element.

class Array1D {
public:
    Array1D( int dim);
    int operator[](int i);
    // ...
};

class Array2D {
public:
    Array2D( int dim1, int dim2);
    Array1D& operator[](int i);
    // ...
};

The intermediate class Array1D is called a proxy class, also known as a surrogate [Item 30, Meyers97]. Conceptually, it represents a one-dimensional array, but in this application we surely do not want to copy the elements to actually create a one-dimensional array. The proxy class will just behave as if it were a one-dimensional array, and internally it will use a pointer to the two-dimensional array to implement its operations.

Another typical example for proxy classes is to distinguish between read and write access. The problem appears with string classes and their operator[] assuming the string class uses reference counting with 'copy-on-write' strategy.

struct string {
    const char& operator[](size_t index) const;
    char& operator[](size_t index);
    // ...
};

int main() {
    string s = "Hello!";
    char c = s[3];
    s[5] = 'a';
}

Since s is not declared constant, the non-const index operator will be used for both uses of operator[]. Since the string could be modified through this operator, and it cannot see that the result of operator[] appears on the right side of the assignment in the first example, the string class must make the conservative assumption that it will be changed. We can defer this decision by introducing a proxy for the return type of operator[]:

struct string {
    const char& operator[](size_t index) const;
    Proxy operator[](size_t index);
    // ...
};

The proxy distinguishes two cases: automatic conversion to a character r-value, which does not change the string, such as in the first example above, or the assignment of a character to the proxy, which corresponds to an l-value assignment, such as in the second example above. The proxy contains a reference to the string it belongs to:

class Proxy {
    string& str;
    int index;
public:
    Proxy( string& s, int i) : str(s), index(i) {}
    // l-value uses, manipulate str if necessary
    Proxy& operator= ( const Proxy& rhs);
    Proxy& operator= ( char c);
    // r-value uses (str does not change) (automatic conversion to char)
    operator char() const;
};

Note that the client code does not change. Clients can pretend that operator[]  returns a char in most cases. 


Limitations pop up for other l-value uses of the proxy than assignment. For example, taking the address of the result of operator[] , or the operators +=, -=, *= etc. However, all these operators can be implemented to behave correctly.

But if we were dealing with a class with member functions instead of a built-in type, we should also consider the distinction between constant and non-constant member functions. They would have to be implemented in the Proxy as well, which changes the problem of writing a proxy from a moderate-sized finite set of operators to a potentially unbounded set of member functions for a general proxy for arbitrary types.

Another limitation is that proxies rely on an automatic conversion and the compiler can only use one automatic conversion to resolve expressions. The compiler cannot compose automatic conversions. Thus, some expressions that work without proxies do not work with proxies.

We have already seen the use of a proxy in the STL when implementing back_insert_iterator (see this Section). The back_insert_iterator is an output iterator and we use a proxy as a return type of operator* . Assigning a value to the proxy writes it to the output iterator, i.e., appends it to the underlying container in this case. Note that in this example the proxy class is actually the back_insert_iterator itself (for no good reason except to write less code).

Smart Pointers

A smart pointer is an object that behaves much like a pointer, but has some additional ``smart'' functionality. Smart pointer classes differ greatly in what additional functionality they provide and how closely they mimic built-in pointers. (For more information on smart pointers, see [Alexandrescu01, Chapter 7] or the Boost smart pointer documentation.)

Smart pointers usually point to an object in dynamic memory and own it, that is, they are responsible for destroying and deleting the object when appropriate. This is the biggest deficiency of built-in pointers:

void f() {
    int* p = new int(42);
    // do something
    delete p;
}

This looks fine at first, but it can lead to a memory leak if the ``do something'' part of the code throws an exception (see Exception Safety) or contains something like:

    if (done) return;

Another context, where the failure of a built-in pointer to delete its pointee is problematic, is a container of pointers (see below).  

Perhaps the most fundamental difference between various smart pointers is how they deal with copying. Consider:

void f() {
    SmartPtr p(new int(42));
    SmartPtr q = p;
}

The most obvious implementation would lead to deleting the same object twice, which is not acceptable. This is a big problem, because copying occurs in many, not always obvious, places such as passing an argument by value, returning by value, and inserting into a standard container. The following three sections present four different approaches to copying smart pointers:

- destructive copy (move)
- no copying allowed
- (polymorphic) deep copy
- reference counting

Smart Pointers: std::auto_ptr<T>

The standard library provides one smart pointer template, auto_ptr. It is meant as a replacement for built-in pointers for:

- holding dynamically allocated objects
- passing ownership of dynamically allocated objects into and out of functions

It is not meant for containers of pointers.

The auto_ptr solves the copying problem with destructive copy (or move) semantics. In a copy operation, the new pointer obtains the ownership and the old pointer becomes a null pointer.

#include <memory>
using namespace std;

auto_ptr<int> source() {
    return auto_ptr<int>( new int(42));
}

void sink( auto_ptr<int> pt) {}


int main() {
    // these are legal and safe expressions with auto_ptr
    sink( source());
    auto_ptr<int> pt = source();
    sink( pt);
    pt = source();
    auto_ptr<int> pt2 = pt;
    // but this is not
    ++*pt;   // error: dereference of a null pointer
}

How about the following two lines?

std::vector< auto_ptr<int> > vec;
std::sort( vec.begin(), vec.end());

It depends, but most likely it does not work as expected. The reason is that the unusual copy semantics of auto_ptr<T> (it modifies the right-hand side) does not comply with the requirements for the template parameters of STL container classes and algorithms.

How about the following line?

const auto_ptr<int> ptr( new int(42));

That is a truly const pointer. It cannot even be copied to another const auto_ptr<int>. This represents another approach to copying: no copying allowed. However, the value it refers to can be changed. This value could be declared constant as well, e.g., const auto_ptr<const int>. (See auto_ptr.C for the full example.)

Smart Pointers: Polymorphic Deep Copy

As auto_ptr is not suitable for containers, let us take a look at a smart pointer that is. This smart pointer helps in managing objects of a class hierarchy in a container class [Chapter 5, Koenig96]. We use a small hierarchy of shapes as an example.

struct Shape {
    virtual Shape* clone() = 0;
    virtual ~Shape() {}
};

struct Circle : public Shape {
    virtual Circle* clone();
};

struct Square : public Shape {
    virtual Square* clone();
};

Container classes in the STL store objects by value. Thus, storing objects of type Shape wouldn't allow us to store objects of type Circle; they would be sliced to type Shape when we try to store them in the container.

Instead, we can store pointers to Shape in the container. Since plain built-in pointers lack ownership, we will use a smart pointer. For copying, we use deep-copy semantics: when a pointer is copied, so is the object it points to. Here, the virtual clone() function is provided for performing a polymorphic copy.

class ShapePtr {
    Shape* shape;
public:
    ShapePtr(Shape* s) : shape(s) {}
    ShapePtr(const ShapePtr& other) : shape(other.shape->clone()) {}
    ~ShapePtr() { delete shape; }
    void swap(ShapePtr& other) { std::swap(shape, other.shape); }
    ShapePtr& operator= (const ShapePtr& rhs) {
        ShapePtr tmp(rhs);
        swap(tmp);
        return *this;
    }
    // still make it behave like a pointer
    Shape& operator*() { return *shape; }
    Shape* operator->() { return shape; }
};

The same idea generalizes to a class template:

template <class T>
class DeepPtr {
    T* ptr;
public:
    DeepPtr(T* p) : ptr(p) {}
    DeepPtr(const DeepPtr& other) : ptr(other.ptr->clone()) {}
    ~DeepPtr() { delete ptr; }
    void swap(DeepPtr& other) { std::swap(ptr, other.ptr); }
    DeepPtr& operator= (const DeepPtr& rhs) {
        DeepPtr tmp(rhs);
        swap(tmp);
        return *this;
    }
    T& operator*() const { return *ptr; }
    T* operator->() const { return ptr; }
};

Smart Pointers: Reference Counting

The above solutions to the copying problem, destructive copy, no copy, and deep copy, maintain the invariant that there is exactly one pointer pointing to each object. However, sometimes we want multiple pointers to share the same object. We distinguish between two such cases:

- Invisible sharing. From the user's point of view, there is no sharing. This is useful as an optimization to reduce the cost of copying large objects such as strings.
- Visible sharing. Used when an object needs to be accessed through multiple routes, for example, a set of objects stored in multiple associative containers with different search keys. (Note that reference counting is not suitable for cyclic data structures such as graphs, because cycles would never get deleted.)

One solution is to have just one owning pointer per object. However, this requires a guarantee that the owning pointer lives longer than all the non-owning ones. A safer and more user-friendly alternative is reference counting. The pointed-to object has an associated counter keeping track of the number of pointers referring to it. The object is deleted when the counter drops to zero.

Reference counting smart pointers can be classified into intrusive (the counter is stored as a part of the object) and non-intrusive (the counter is stored separately). The non-intrusive variants can be used with any type of object (just as ordinary pointers) but they cannot be implemented without some overhead. In the two non-intrusive variants illustrated below, the penalty is either a larger pointer or a more costly dereference operation. Furthermore, an extra memory allocation is needed for the counter. Boost shared_ptr uses the design on the right.

 

The intrusive smart pointer requires modifying the pointed-to class to provide the counter. If we are in a position to design the class from scratch, it often makes sense to use handles instead of pointers. A handle does not behave like a pointer but like the object it points to. Instead of dereference operations it has all the public member functions of the pointed-to class. The pointed-to object (representation) contains the data members and the reference counter, but might not have any of the public member functions. For example, the standard string class is a handle.

The common functionality of handles and representations can be factored out into common base classes. Note that the handle is a class template parameterized by the representation it refers to. The representation is supposed to contain a count variable, for example, by deriving from the class Rep:

struct Rep {
    int count;
    Rep() : count( 1) {}
};

// Precondition: REP must have a member 'int count'.
template < class REP >
class Handle {
protected:
    // Invariant: ptr is always != 0.
    REP* ptr;
public:
    Handle() : ptr( new REP) {}
    Handle( const REP& rep) : ptr( new REP(rep)) {}
    Handle( const Handle& x) : ptr(x.ptr) { ++ptr->count; }
    ~Handle() { if ( --ptr->count == 0) delete ptr; }
    void swap(Handle& other) { std::swap(ptr, other.ptr); }
    Handle& operator= (const Handle& other) {
        Handle tmp(other);
        swap(tmp);
        return *this;
    }
};

The following example shows how to use these classes to implement a class for integers that uses reference counting (of course, reference counting does not pay off for plain integers).

struct Integer_rep : public Rep {
    int i;
    Integer_rep() {}
    Integer_rep( int j) : i(j) {}
};

class Integer : public Handle<Integer_rep> {
public:
    Integer() {}
    Integer( int i) : Handle<Integer_rep>(i) {}
    int value() const { return ptr->i; }
};

In a program, a handle can be used just as a normal type. (see also handle_ref.C for the full example, and handle_ref_extended.C for a more protected example, where the reference count is a private variable to protect the reference counting scheme from user code)

int main() {
    Integer a(5);
    Integer b(a);
    a = b;
    assert( b.value() == 5);
}

So far, the representation has been constant. Allowing modifications is a problem when the sharing is invisible, because the user does not expect that the modification of one object affects another. A solution is a strategy called copy-on-write. Before modifying the representation, the count is checked. If it is greater than 1, a new copy will be created in which the modification takes place.

Exception Safety

When an error occurs in a program, there are two possible responses:  

- terminate
- recover and continue

Often we want to do both. That is, we want to terminate a low level operation but recover and continue at a higher level. Then we need to transfer the execution from the point of error to the point of recovery. The two points may be separated by a long chain of function calls, which we call the error path. Some cleanup may be necessary on the error path: destroying objects, freeing dynamic memory, closing files, cancelling partial changes, etc. The functions on the error path may come from different sources, including libraries.

Exceptions are a mechanism in C++ to handle this kind of error situations. An exception is an object (of an arbitrary type) that is thrown at the point of error and caught at the point of recovery. It may carry information about the error. All automatic (non-static) local objects on the error path are destroyed by calling their destructor. Functions on the error path may also catch the exception, perform cleanup, and re-throw the (same or a different) exception.

void error() {
    throw "error";   // point of error
}

void unsafe() {
    int* a = new int[100];   // leak!
    error();
    delete[] a;              // never executed
}

void safe_the_hard_way() {
    int* a = new int[100];   // safe now
    try {
        unsafe();
    }
    catch (...) {            // catch anything
        delete[] a;
        throw;               // re-throw the original exception
    }
    delete[] a;
}

void safe_the_easy_way() {
    vector<int> v(100);             // safe: destructor cleans up
    auto_ptr<int> p(new int(42));   // safe: destructor cleans up
    safe_the_hard_way();
}

int main() {
    try {
        safe_the_easy_way();
    }
    catch (std::exception& e) {     // does not catch "error"
        cout << e.what() << endl;   // but would catch std::bad_alloc
    }
    catch (const char* s) {         // point of recovery
        cout << s << endl;
    }
}

A function is exception-safe if it performs proper cleanup for any possible exception, and exception-neutral if it propagates all exceptions to the caller. There are three levels of exception safety:

1. Basic guarantee: No resource leaks. The guarantee includes indirect resource leaks. For example, a member function may not leave an object in a state where the destructor might not be able to free all resources.
2. Strong guarantee: Commit-or-rollback. The function either completes the operation it was performing or leaves the program state unchanged, as if it was never called.
3. Nothrow guarantee: Never emits an exception.

It is difficult or impossible to make a function exception-safe and -neutral if it calls a function that is not exception-safe and -neutral. One weak 

link can destroy the guarantee. For this reason library code should be exception-safe and -neutral. Templates are particularly vulnerable since 

they know little about the exceptions the template parameters might throw.  

Guideline: Resource Acquisition Is Initialization

The preferred way to achieve exception safety is to do resource acquisition (such as memory allocation) in a constructor and let the destructor take care of cleanup. This technique is frequently referred to as ``resource acquisition is initialization''. As demonstrated by the safe_the_easy_way() function above, the standard library provides convenient facilities such as containers and auto_ptr for this. If the standard library facilities are not enough, it often makes sense to write a separate class just for managing a resource. Note the singular: ``a resource'' (see the next guideline).
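Such a resource-managing class can be quite small. The following sketch (our illustration, not a standard library class) wraps a C FILE*:

```cpp
#include <cstdio>
#include <stdexcept>

// Minimal resource-managing class: the constructor acquires the file,
// the destructor releases it on every path, including the error path.
class File {
    std::FILE* f;
public:
    File(const char* name, const char* mode)
        : f(std::fopen(name, mode)) {
        if (!f) throw std::runtime_error("cannot open file");
    }
    ~File() { std::fclose(f); }       // cleanup; never throws
    std::FILE* handle() const { return f; }
private:
    File(const File&);                // copying is not meaningful here
    File& operator=(const File&);
};
```

Any function that creates a File as an automatic local object is exception-safe with respect to this resource: if an exception passes through, the destructor still closes the file.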


Writing exception-safe constructors and destructors requires some extra care.

A destructor should never emit an exception. Recall that a destructor may be called during exception handling; a destructor throwing in this situation terminates the program immediately. Furthermore, the destructor of a container (even a basic array) calls the destructors of all its elements. If one of the element destructors throws, the resulting behavior is undefined. Therefore, if a destructor has to do something that might throw, it should catch the exception:

    X::~X() {
        try {
            write_log("Destroying X");
        }
        catch (...) {}   // catch everything without re-throwing
    }

If a constructor throws, the object remains partially constructed, which means that the destructor is never called.

    class Person {
        Image* img;
        void init();
    public:
        Person(const Image& i) : img(new Image(i)) { init(); }  // leaks if init() throws
        ~Person() { delete img; }
    };

Fully constructed subobjects (members and bases) are destroyed. Thus the simple cure here is to use auto_ptr:

    class Person {
        auto_ptr<Image> img;
        void init();
    public:
        Person(const Image& i) : img(new Image(i)) { init(); }  // safe now
        ~Person() {}                                            // bonus: empty destructor
    };

Guideline: Separate Responsibilities

A function or a class with multiple responsibilities is harder to make exception-safe than one with a single responsibility. Moving each 

responsibility to a different entity helps. For example, in the Person  class above, moving the responsibility for the image memory to 

auto_ptr  was the key.  

Here is a function with two responsibilities: incrementing a counter and returning a value. Is it exception-safe?

    int counter = 0;
    string f(const Person& x) {
        ++counter;
        return x.name();
    }

At the basic level, yes. At the strong level, no. If x.name() throws, the function does not complete its job (which includes returning a string), but the counter has been incremented. What about this?

    int counter = 0;
    string f(const Person& x) {
        string tmp = x.name();
        ++counter;
        return tmp;
    }

Still not strongly exception-safe. Returning involves a string copy construction, which might throw. We can achieve the strong guarantee by returning a pointer to a string (copying a pointer cannot throw). A better alternative is to use auto_ptr:

    int counter = 0;
    auto_ptr<string> f(const Person& x) {
        auto_ptr<string> tmp(new string(x.name()));
        ++counter;
        return tmp;
    }

Here we achieved exception safety by changing the return value of the function. This is often undesirable. For example, a stack operation pop() that both removes the top element and returns it has a similar two-responsibilities problem, but changing the return type to auto_ptr is not really acceptable. For this reason, the STL stack (adapter) splits the two responsibilities between two functions: top() returns the top element and pop() removes it. (See [Sutter99, pp. 25-54].)

Another point illustrated by these examples is that exception safety is not just a matter of implementation details: it can affect the interface. Thus exception safety should be considered early in the design process.

Guideline: Separate Throwing Code from Critical Code

Consider the following line.

    f(auto_ptr<int>(new int(42)), g());

The order of evaluation of function arguments is not specified by the standard. In the above line, the evaluation order could be new int(42), g(), auto_ptr<int>(...). This could lead to a memory leak if g() throws. The following is better.


    auto_ptr<int> p(new int(42));
    f(p, g());

More generally, it is useful to separate the code that might throw an exception from the code that performs critical operations, preferably doing the throwing code first. The throwing code could be performed on automatic local objects that get destroyed if an exception is thrown, effectively canceling the operation. A good example is the canonical copy assignment operator implemented in terms of a copy constructor and a swap:

    struct A {
        // ...
        A(const A& other);              // copy constructor
        void swap(A& other) throw();    // swap *this with other
        A& operator=(const A& other) {  // copy assignment
            A tmp(other);               // if this throws, the operation has no effect
            swap(tmp);                  // cannot throw
            return *this;
        }
        // ...
    };

The only place in operator= where an exception could be thrown is the copy construction. If this happens, the operation exits without any permanent effect. Thus we have the strong guarantee. The guarantee relies on the no-throw guarantee of swap. Usually swap performs a bitwise swap (for example, swapping pointers, not pointees) and can be implemented using only copy operations on built-in types, which cannot throw.
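To make the idiom concrete, here is a self-contained sketch for a class owning a dynamic buffer (the Buffer class is invented for illustration; its swap only exchanges built-in types and therefore cannot throw):

```cpp
#include <algorithm>   // std::copy, std::swap
#include <cstddef>

class Buffer {
    std::size_t n;
    int*        data;
public:
    explicit Buffer(std::size_t size) : n(size), data(new int[size]) {}
    Buffer(const Buffer& other) : n(other.n), data(new int[other.n]) {
        std::copy(other.data, other.data + other.n, data);
    }
    ~Buffer() { delete[] data; }
    void swap(Buffer& other) {          // swaps built-ins only: cannot throw
        std::swap(n, other.n);
        std::swap(data, other.data);
    }
    Buffer& operator=(const Buffer& other) {
        Buffer tmp(other);              // may throw; *this is still untouched
        swap(tmp);                      // cannot throw
        return *this;                   // tmp's destructor frees the old data
    }
    std::size_t size() const { return n; }
    int&        operator[](std::size_t i)       { return data[i]; }
    int         operator[](std::size_t i) const { return data[i]; }
};
```

If the copy construction of tmp throws, the assignment exits before *this has been touched, which is exactly the strong guarantee.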

The no-throw guarantee of swap was formalized in the exception specification throw(). In its general form, an exception specification lists the exceptions that might be thrown. Exception specifications should be used carefully. While they look similar to const specifications, they are less useful:

- Violations are discovered at run time, not at compile time as with const specifications, and they cause an immediate termination of the program.

- Determining what exceptions could possibly be thrown is not easy. In particular, templates cannot be expected to know what exceptions their template parameter types might throw.

- The addition of a new exception could cause a lot of changes if exception specifications are used frequently.

- There can be a run-time overhead even if no exception is thrown.

Note that a lack of formal exception specification in the code is not an excuse for omitting (a more informal) exception specification in the 

documentation.  

Associating Data to Items in a Data Structure

Given a data structure with items, for example a graph with nodes and edges, we want to associate an additional data field with each item. A typical application would be a boolean field for a graph traversal algorithm such as depth-first search. This application also highlights the possibly dynamic nature of these associated fields: only a particular sub-algorithm might require them. Several solutions are possible:

- Provide all possible fields: Make the field a requirement for the data structure and implement all fields that come up in the library.

- Provide some general-purpose fields: For example, the Stanford GraphBase, a C library for graph algorithms by D. E. Knuth, has six utility fields and macros to access these fields as integer values, typed pointers to edges, vertices, or other uses. The algorithms have to cooperate on the use of these fields.

- Manage some general-purpose fields dynamically: This solution is especially useful for boolean fields. Provide a bitfield, e.g., an integer, in each item, and let the data structure manage dynamically who uses which bits. An algorithm can allocate a bitmask from the data structure and use this bitmask to manipulate the corresponding bit in the bitfield of the items. After the algorithm finishes, it returns the bitmask to the data structure, and the bit is free to be used by some other algorithm. However, it has to be decided what should happen if the data structure runs out of bitmasks. Algorithms could state the number of needed bits as a precondition, or they could implement an alternative method to create additional bits, see below.

- Let each item manage additional fields dynamically: Each node contains a dictionary (hash array) of name-value pairs. Additional attributes can simply be added to the dictionary.

- Templates: Instead of hardcoding all fields, make them a template parameter. The user has to supply the right arguments (at compile time) depending on the algorithms the user wants to use. The benefit is the space saved if some algorithms are not used.

- Derivation: Derive a more specialized item type and program the data structure and the algorithms in the object-oriented style such that they can cope with derived classes. This solution can introduce a noticeable runtime overhead if rather small functions have to be made virtual. To access the specialized item type, a dynamic cast is usually needed.

- Enumerate items and associate fields dynamically at runtime with an array: An array is used to create, within an algorithm, additional fields that are associated with the enumerated items of the container class. This solution is limited to cases where the enumeration is easy and the data structure does not change. It can hide the need for specific fields from the user.

- Associate fields dynamically at runtime with a hash array: This is the more flexible variant of the previous item. Instead of a simple array, a hash array or an associative map is used to create the additional fields. This is the most flexible and easy-to-use solution, and it can hide the need for specific fields from the user. However, hashing is quite costly if all that is needed is to associate a boolean with each node.

- Adaptor pattern: With the adaptor pattern, a new data structure, the adaptor (a.k.a. wrapper), is written using the old data structure underneath. Changes in the new data structure are also reflected in the underlying old data structure. New data can be integrated in the adaptor. This solution is as flexible as and even easier to use than the hash maps in the previous item. However, implementing an adaptor just to add a boolean for a depth-first search is a bit of an overkill, and it can take quite some effort to support all operations on a graph class. The additional pointers linking the adaptor with the adaptee also cost some space.
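As a small illustration of the map-based variant (all names here are ours, not LEDA or Boost code), a ``visited'' flag can live outside the graph, keyed by node address, and exist only while the algorithm runs:

```cpp
#include <map>

struct Node { int id; };      // stand-in for a graph node type

// external field: node -> bool, allocated only for the algorithm's lifetime
typedef std::map<const Node*, bool> Visited_map;

bool is_visited(const Visited_map& m, const Node* n) {
    Visited_map::const_iterator it = m.find(n);
    return it != m.end() && it->second;
}

void set_visited(Visited_map& m, const Node* n) {
    m[n] = true;
}
```

The data structure itself is untouched; the cost is one map lookup per access instead of a direct member access.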

LEDA graphs offer several of these possibilities: general-purpose fields, templates, arrays, and hash arrays. A generic programming approach is taken by the Boost Graph Library [Siek01], where algorithms are implemented using property maps that hide the actual mechanism. For example:

    template <class Vertex, class NameMap>
    void foo(Vertex v, NameMap name) {
        typedef typename boost::property_traits<NameMap>::value_type value_type;
        value_type oldname = get(name, v);     // get old name
        value_type newname = "New";
        put(name, v, newname);                 // assign new name
        value_type& name_of_v = name[v];
        assert(name_of_v == newname);          // check the change
        name_of_v = oldname;                   // restore old name
    }

The example illustrates the three operations on property maps: get(), put(), and operator[](). Not all of them are provided for every property map. There are four property map categories (similar to iterator categories):

    Concept                Refinement of                              Syntactic requirements
    ReadablePropertyMap    CopyConstructible                          get()
    WritablePropertyMap    CopyConstructible                          put()
    ReadWritePropertyMap   ReadablePropertyMap, WritablePropertyMap   -
    LvaluePropertyMap      ReadWritePropertyMap                       operator[]()

For example, a boolean implemented as a bit in an integer cannot be a model of LvaluePropertyMap, because it is not possible to form a reference to it.

Solving Mutual Dependencies with the Barton-Nackman Trick

The following C++ technique is usually referred to as the Barton-Nackman trick, since Barton and Nackman introduced it in their book [page 352, Barton97]. However, [Coplien95] documents some more occurrences of this ``curiously recurring template pattern''.

We start with a simple class that, among other operations, provides an equality and an inequality comparison operator.

    class A {
    public:
        bool operator==(const A& a) const;
        bool operator!=(const A& a) const { return !(*this == a); }
        // ...
    };

The inequality comparison operator is implemented in terms of the equality operator. Wouldn't it be nice to factor out this generic implementation into a base class and share it with all classes of this kind? The problem is a cyclic type dependency. Of course, the (then) derived class A needs to know the base class. But the base class needs to know the derived class as well, since otherwise it cannot call the correct equality operator, and the derived class shows up in the operator's type signature as well. Here, the solution is to inject the derived class as a template argument into the base class.

    class A : public Inequality<A> {
    public:
        bool operator==(const A& a) const;
    };

Since we intend to call the equality operator of the derived class, we have to use a type cast, which is a safe down-cast in this case (see also barton_nackman.C).

    template <class T>
    class Inequality {
    public:
        bool operator!=(const T& t) const {
            return !(static_cast<const T&>(*this) == t);
        }
    };

The same technique can be used to implement a base class for iterators that contains all those small member functions that are defined in terms of a much smaller set of member functions. Even better, since the base class is a class template, we can make use of ``lazy'' implicit instantiation and implement the most general base class, for iterators of the random access category. If we derive an iterator of the forward category, the extra random access operators in the base class are simply ignored and do not cause error messages as long as they are not used (see also Iterator_base.h and Iterator_base.C).

There is only one pitfall with this solution: the name lookup rules for overloaded functions in class hierarchies. Name lookup stops as soon as the function name has been found; it does not search for more overloaded functions in base classes. The pre- and post-increment operators are an example (note also the use of a private member function to encapsulate the type cast, which is also const-overloaded):

    template <class Derived>
    class Iterator_base {
        Derived& derived() { return static_cast<Derived&>(*this); }
        const Derived& derived() const {
            return static_cast<const Derived&>(*this);
        }
    public:
        const Derived operator++(int) {
            Derived tmp = derived();
            ++derived();
            return tmp;
        }
        // ...
    };

    class Some_iterator : public Iterator_base<Some_iterator> {
        // ...
    public:
        Self& operator++();
        // ...
    };

    int main() {
        Some_iterator i;
        i++;
    }

The post-increment call causes the compiler to look for the function name operator++ (without the type signature of the arguments) in the class hierarchy. It finds the pre-increment operator in Some_iterator and stops the name lookup there. Only overloaded instances of the function in this class and global functions are now used to resolve the function call. The compiler does not find the correct post-increment operator in the base class and gives an error message. The solution is a workaround: implement all overloaded functions in the base class and give the involved functions in the derived class a new name, see Iterator_base.h.

Solving Mutual Dependencies between Class Templates

A graph consists of nodes and edges. A node knows its incoming and outgoing edges; an edge knows its two incident nodes. Implementing these cyclic type dependencies between the node and edge types would use a forward declaration in C:

    struct Edge;
    struct Node {
        Edge* edge;
        // ... maybe more than one edge ...
    };
    struct Edge {
        Node* source;
        Node* dest;
    };

If we want to parameterize nodes and edges with a template parameter for some additional auxiliary data, we can follow the same idea. But since the node has to know the type of the edge, and the edge has to know the type of the node, both also have to know the template parameters of the other. A solution could look like this (see also graph.C for an example with an additional graph class):

    // forward declaration
    template <class A, class B> struct Edge;

    template <class A, class B>
    struct Node {
        Edge<A,B>* edge;
        A aux;
    };

    template <class A, class B>
    struct Edge {
        Node<A,B>* node;
        B aux;
    };

    int main() {
        Node<int,double> node;
        Edge<int,double> edge;
        node.edge = &edge;
        edge.node = &node;
        node.aux = 5;
        edge.aux = 6.6;
    }

One disadvantage of this solution is that the auxiliary data is always there and consumes space, even if it is not needed. It would be nice to specify it as void in this case. However, a member variable of type void is not going to work in C++. Instead, we can use partial specialization to write a specialized version, here for the node, that gets rid of the reserved auxiliary data if we specify it to be void.

    template <class B>
    struct Node<void,B> {
        Edge<void,B>* edge;
    };
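Putting the general templates and the specialization together, the following self-contained sketch (assembled by us from the fragments above) compiles and shows that the specialized node reserves no auxiliary storage:

```cpp
template <class A, class B> struct Edge;   // forward declaration

template <class A, class B>
struct Node {
    Edge<A,B>* edge;
    A aux;                                 // auxiliary data reserved here
};

template <class B>                         // partial specialization:
struct Node<void,B> {                      // no aux member at all
    Edge<void,B>* edge;
};

template <class A, class B>
struct Edge {
    Node<A,B>* node;
    B aux;
};
```

A Node<void,B> consists of a single pointer, so it can never be larger than a node carrying auxiliary data.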

In this solution the node type and the edge type are tightly coupled (as in the C solution). In the spirit of generic programming, we might want to decouple them. Imagine exchanging a vertex type Node with another vertex type Node' as in the following figure.

We might specify a concept for a node and a concept for an edge. Now, given a model for a node and a model for an edge, they should work together in a graph class.

We use a similar mind-twister as for the Barton-Nackman trick in the previous section. The missing type information for the node as well as for the edge is provided by a template parameter Graph. There is a graph class that takes two parameters, a class template for a node and a class template for an edge (note the nested template declarators needed to pass a class template as a template argument). The graph class uses itself as the argument for the node and edge class templates to define its local types Node and Edge. To summarize, the graph knows which node class and which edge class are supposed to work together, and therefore the graph class passes itself as template argument to both types.

    template <class Graph>
    struct Node {
        typedef typename Graph::Edge Edge;
        Edge* edge;
        // ... maybe some more edges ...
    };

    template <class Graph>
    struct Edge {
        typedef typename Graph::Node Node;
        Node* node;
    };

    template < template <class G> class T_Node,
               template <class G> class T_Edge >
    struct Graph {
        typedef Graph<T_Node, T_Edge> Self;
        typedef T_Node<Self> Node;
        typedef T_Edge<Self> Edge;
    };

    int main() {
        typedef Graph<Node, Edge> G;
        G::Node node;
        G::Edge edge;
        node.edge = &edge;
        edge.node = &node;
    }

To illustrate the flexibility in this design, we implement a new node class by deriving from the old one and adding a member variable for a color (see also graph2.C for this example, and graph3.C for a more extensive example).

    template <class Graph>
    struct Colored_node : public Node<Graph> {
        int color;
    };

    int main() {
        typedef Graph<Colored_node, Edge> G;
        G::Node node;
        G::Edge edge;
        node.edge = &edge;
        edge.node = &node;
        node.color = 3;
    }

It is important to understand that these cyclic definitions work -- as in the C example -- because we can use a declared type to define pointers and references to this type before the type itself is defined. For example, we cannot change the pointer member Edge* edge of the node class to a value member Edge edge.

Const-correctness is an issue for a graph data structure. For example, if a node is declared constant, it should not be possible to traverse the graph in any fashion and reach a mutable node or edge. In the more extensive example graph3.C, the adjacency list Edge_list edges is of particular interest. This list contains iterators to edges, but if the node is declared constant, all access to the contents of this list has to return const_iterators to edges.

Double Dispatch, Making a Function Virtual for Two Arguments

Imagine a game in space with ships, base stations and asteroids (see also [Item 31, Meyers96]). Collisions are handled as follows: 


Assume all three objects are derived from a single abstract base class. We start by defining a virtual collision handling function that takes the second game object as a parameter.

    struct Game_object {
        virtual void collision(Game_object* other) = 0;
        virtual ~Game_object() {}
    };

    struct Ship : public Game_object {
        virtual void collision(Game_object* other) {
            // this (of type Ship) collides here with other
        }
    };

    struct Station : public Game_object {
        // similar
    };
    struct Asteroid : public Game_object {
        // similar
    };

Each collision function knows the correct type of its this pointer, which resolves the first dispatch, along the type of the first argument. But the other pointer is still of the abstract base class type; the second dispatch, along the type of the second argument, remains unresolved. We could use a switch/case statement to distinguish its actual type (using run-time type information (RTTI) with typeid). However, this is not the object-oriented way: switch/case statements tend to be unmaintainable and are not extendible. A possible object-oriented solution uses a set of virtual member functions to dispatch along the second argument (see double_dispatch_static.C for this example).

    struct Ship;
    struct Station;
    struct Asteroid;

    struct Game_object {
        virtual void collision(Game_object* other) = 0;
        virtual void collision2(Ship* other) = 0;
        virtual void collision2(Station* other) = 0;
        virtual void collision2(Asteroid* other) = 0;
        virtual ~Game_object() {}
    };

    struct Ship : public Game_object {
        virtual void collision(Game_object* other) {
            // this (of type Ship) collides here with other, call second dispatch
            other->collision2(this);
        }
        virtual void collision2(Ship* other) {
            // Ship collides with Ship.
        }
        virtual void collision2(Station* other) {
            // Ship collides with Station.
        }
        virtual void collision2(Asteroid* other) {
            // Ship collides with Asteroid.
        }
    };

    struct Station  : public Game_object {};  // similar
    struct Asteroid : public Game_object {};

If the class hierarchy is known and does not change in the future, this is the solution. But this solution does not solve the problem of extendibility: adding another game object, for example stars, would require adding a virtual member function for stars to each class. Furthermore, we can observe that the collision handling is symmetric with respect to its two arguments, but we probably do not want to implement each function twice.

One extendible solution would be to implement double dispatch by hand as a (dynamic) array of function pointers that encodes the collision handling table from above. We would implement the two-dimensional extension of the one-dimensional virtual function pointer table that is used by the compiler to implement virtual member functions. Since we do not have the support of the compiler here (convenient index generation into this table, etc.), the solution described at the end of [Item 31, Meyers96] is rather complicated and uses STL maps.

A simpler solution uses an if/then/else cascade to decide for each class, using RTTI, with which other classes it can handle the collisions. For unknown types, it calls the collision function again, but with the order of arguments reversed. There has to exist an order among the classes: a class has to implement the collision for all classes that are smaller in this order (in other words, a new class has to implement the collision with all old classes). There is the danger of forgetting to test for one class, in which case this solution produces an infinite loop (until the stack overflows). This can be solved by breaking the symmetry and giving the second call with reversed parameters a different name (see double_dispatch_ext.C for this example).

We change the problem slightly and think of two different types that should interact with each other. An example would be a tree hierarchy composed of different node types, and a set of algorithms that work on the tree nodes. We can make this double dispatch problem extendible for either one of these types. We can encode the algorithms as member functions in the different tree node types; then it is easy to add a new tree node type. Or we can use the visitor pattern, see the Section Visitor Pattern, to make the family of algorithms extendible.

    Ship x Ship:          damage proportional to speed
    Ship x Station:       docking if slow, else damage
    Ship x Asteroid:      destroys the asteroid if it is small, else the ship gets destroyed
    Station x Station:    damage proportional to speed
    Station x Asteroid:   destroys the asteroid if it is small, else the station gets destroyed
    Asteroid x Asteroid:  asteroids break into smaller pieces


Templates can be used to solve the double dispatch problem conveniently if we change the setting slightly. Since templates work at compile time, we have to know the actual types of the colliding objects and cannot work with base class pointers. The generic template implements the symmetry; the specialized functions implement the actual collision handling. This solution is extendible: a new class has to implement all possible combinations with old classes.

    template <class T1, class T2>
    void collision(T1& o1, T2& o2) {
        collision(o2, o1);
    }

    void collision(Ship& o1, Ship& o2);
    void collision(Ship& o1, Station& o2);
    void collision(Ship& o1, Asteroid& o2);
    void collision(Station& o1, Station& o2);
    void collision(Station& o1, Asteroid& o2);
    void collision(Asteroid& o1, Asteroid& o2);
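A self-contained sketch of this dispatch with stub types (the bodies merely record which handler ran; everything here is invented for illustration):

```cpp
#include <string>

struct Ship {};
struct Station {};
struct Asteroid {};

std::string last_handler;   // records which specialized function ran

// the specialized functions implement the actual collision handling
void collision(Ship&, Ship&)         { last_handler = "Ship/Ship"; }
void collision(Ship&, Station&)      { last_handler = "Ship/Station"; }
void collision(Ship&, Asteroid&)     { last_handler = "Ship/Asteroid"; }
void collision(Station&, Station&)   { last_handler = "Station/Station"; }
void collision(Station&, Asteroid&)  { last_handler = "Station/Asteroid"; }
void collision(Asteroid&, Asteroid&) { last_handler = "Asteroid/Asteroid"; }

// the generic template implements the symmetry: with no direct match,
// swap the arguments and dispatch again
template <class T1, class T2>
void collision(T1& o1, T2& o2) { collision(o2, o1); }
```

Calling collision(asteroid, ship) finds no direct overload, so the template swaps the arguments, and the exact non-template match collision(Ship&, Asteroid&) then wins over a second template instantiation.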

6. CGAL, the Computational Geometry Algorithms Library

Introduction

Computational geometry is the sub-area of algorithm design that deals with the design and analysis of algorithms for geometric problems 

involving objects like points, lines, polygons, and polyhedra. Over the past two decades, the field has developed a rich body of solutions to a 

huge variety of geometric problems including intersection problems, visibility problems, and proximity problems. A number of fundamental 

techniques have been designed, and key problems and problem classes have emerged. 

Geometric algorithms arise in various areas of computer science. Computer graphics and virtual reality, computer aided design and manufacturing, solid modeling, robotics, geographical information systems, computer vision, shape reconstruction, molecular modeling, and circuit design are well-known examples.

To a large extent the theory has been developed with asymptotic worst-case complexity analysis and under the assumption of the real RAM model, where computations with real numbers are assumed to take constant time. For many (perhaps most) algorithms and problems this is a justified assumption that can be perfectly simulated with finite-precision numbers if the input has limited precision. In 1996 the Computational Geometry Impact Task Force published a task force report, Application Challenges to Computational Geometry. While crediting the remarkable success of the field in theory, the report demanded that the applicability in practice also be addressed. Recommendation number one on the list of four was ``production and distribution of usable (and useful) geometric codes''. Where are the difficulties in doing so? There are four major reasons why implementing geometric algorithms in particular is seen to be more difficult than in other fields:

- Algorithms in geometry are among the most advanced algorithms in algorithm design, and they make frequent use of complicated data structures.

Furthermore, software engineering has ignored the difficulties of algorithm engineering for quite a while. For example, the main focus in object-oriented design is on data abstraction, data encapsulation, relationships among data, its reuse, and the design of large-scale systems. Only recently have implementations of algorithms been rediscovered as an active topic in software engineering, for example with the generic programming paradigm [Musser89].

- The asymptotic worst-case complexity analysis does not match the practical need, for two reasons. In practice, the constant hidden in the asymptotic analysis can easily outweigh asymptotic factors, such as log factors, and the worst case usually depends on a general input model that may be unrealistic. More realistic input models, such as fatness, can help in understanding the practicability of algorithms.

- Theoretical papers assume exact arithmetic with real numbers. The correctness proofs of the algorithms rely on exact computation, and replacing exact arithmetic by imprecise built-in floating-point arithmetic does not work in general. Geometric algorithms in particular are sensitive to rounding errors, since numerical data and control-flow decisions usually have a strong interrelation. The numerical problems may destroy the geometric consistency that an algorithm relies on. As a result, the program may crash, may run into an infinite loop, or -- perhaps worst of all -- may produce unpredictable erroneous output.

The requirements on the arithmetic vary with the algorithm. Some algorithms require only sign computations of polynomial expressions of bounded degree in the input variables. Others require unbounded degree or algebraic roots. Various packages for exact arithmetic are available for different needs. Another approach is to redesign the algorithm to cope with inexact arithmetic; usually the output is then only an approximation of the exact solution. As a common prerequisite for exact arithmetic, the input is rounded, either to convert it into the format required for the arithmetic (rounding floating point to integer) or to lessen the precision requirements on the arithmetic.
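The rounding problem can be demonstrated without any geometry (a two-line illustration of ours, not CGAL code): with IEEE doubles, even addition is not associative, so a branch based on comparing such sums can go either way depending on how an expression is written.

```cpp
// both functions compute "the same" sum, but round differently
double sum_left(double a, double b, double c)  { return (a + b) + c; }
double sum_right(double a, double b, double c) { return a + (b + c); }
```

For example, sum_left(0.1, 0.2, 0.3) and sum_right(0.1, 0.2, 0.3) differ in the last bit; a sign test on their difference is therefore already inconsistent.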

- Often, theoretical papers exclude degenerate configurations in the input. Typically, these degeneracies are specific to the algorithm and the problem, and they would involve the treatment of special cases in the algorithm. Simple examples of configurations considered degenerate are duplicate points in a point set or three lines intersecting in one point. For some problems it is not difficult to handle the degeneracies, but for others the special-case treatment distracts from the solution of the general problem, and it can amount to a considerable fraction of the coding effort.

In theory, this approach of excluding degeneracies from consideration is justified by the argument that degenerate cases are very rare in the set of all possible inputs over the real numbers, i.e., they have zero probability if the input is randomly chosen over the real numbers. Another argument is that it is first of all important to understand the general case before treating special cases.

In practice, however, degenerate input occurs frequently. For instance, the coordinates of the geometric objects may not be randomly chosen over the real numbers, but lie on a grid; they may have been created by clicking in a window of a graphical user interface. In some applications, what are called degeneracies are even high-valued design criteria: in architecture, features of buildings align on purpose. As a consequence, practical implementations usually must address the handling of degeneracies.

General approaches to handling degeneracies are symbolic perturbation or randomized perturbation with performance and correctness guarantees.

The community has addressed these topics from time to time and with increasing intensity (several references are given above), but many useful geometric algorithms have not found their way into the application domains of computational geometry yet. This situation is also a severe hindrance for researchers who wish to implement and evaluate their algorithms. Thus, the constants hidden in the analysis of the otherwise theoretically efficient algorithms are often not known.

To remedy this situation the Computational Geometry Algorithms Library, CGAL (http://www.cgal.org/), was started five years ago in Europe in order to provide correct, efficient, and reusable implementations [Fabri99]. The library is being developed by several universities and research institutes in Europe and Israel.

The major design goals for CGAL include correctness, robustness, flexibility, efficiency, and ease of use. One aspect of flexibility is that CGAL algorithms can be easily adapted to work on data types in applications that already exist. The design goals, especially flexibility and efficient robust computation, led us to opt for the generic programming paradigm using templates in C++, and to reject the object-oriented paradigm in C++ (as well as in Java). In several appropriate places, however, we make use of object-oriented solutions and design patterns. Generic programming with templates in C++ also provides strong type checking at compile time. Moreover, the C++ abstractions we use do not cause any runtime overhead.

The birth of the CGAL-library dates back to a meeting in Utrecht in January 1995. Shortly afterwards, the five authors of [Fabri99] started developing the kernel. The CGAL-project has been funded officially since October 1996, and the team of developers has grown considerably with professionals in the field of computational geometry and related areas, in academia especially research assistants, PhD students, and postdocs. The CGAL release 1.2 of January 1999 consists of approximately 110,000 lines of C++ source code for the library, plus 50,000 lines for accompanying sources, such as the test suite and example programs, not counting C++ comments or empty lines (the release 2.4 of May 2002 consists of approximately 290,000 lines of code, comments and empty lines included this time, plus test suite and example programs). In terms of the older Constructive Cost Model (COCOMO), the line counts, the people involved, and the time schedule indicate a large project comparable to operating systems or database management systems. The WWW home page of CGAL (http://www.cgal.org/) provides a list of publications about CGAL and related research.

Related Work

Three approaches of disseminating geometric software can be distinguished: collections of isolated implementations, integrated applications or workbenches, and software libraries. An overview of the state of the art of computational geometry software before CGAL, including many references, is given in [Amenta97].

Collecting isolated implementations, also called the Gems approach after the successful Graphics Gems series, usually requires some adaptation effort to make things work together. Compared to the graphics gems, computational geometry implementations usually use more involved data structures and more advanced algorithms, which makes adaptation harder. A good collection is provided by the Directory of Computational Geometry Software (http://www.geom.umn.edu/software/cglist/).

Integrated applications and workbenches provide a homogeneous environment, for example with animation and interaction capabilities, and all parts work smoothly together. However, they tend to be monolithic, hard to extend, and hard to reuse in other projects. Examples date back to the end of the eighties; XYZ GeoBench (http://wwwjn.inf.ethz.ch/geobench/XYZGeoBench.html), developed at ETH Zurich, Switzerland, is one of the precursors of CGAL.

Software libraries promise that the components work seamlessly together, that the library is extensible, and that the components can be reused in other projects. Examples are the precursors of CGAL developed by members of the CGAL consortium. These precursors are PlaGeo developed at Utrecht University, C++Gal developed at Inria Sophia-Antipolis, and the geometric part of LEDA (http://www.mpi-sb.mpg.de/LEDA/), a library for combinatorial and geometric computing, which has been developed at Max-Planck-Institut für Informatik, Saarbrücken. Another example is GeomLib, a computational geometry library implemented in Java at the Center for Geometric Computing, located at Brown University, Duke University, and Johns Hopkins University in the United States. They state their goal as an effective technology transfer from Computational Geometry to relevant applied fields.

Overview of the Library Structure


CGAL is structured into three layers and a support library, which stands apart. The three layers are the core library with basic non-geometric functionality, the geometric kernel with basic geometric objects and operations, and the basic library with algorithms and data structures.

The library layers and the support library are further subdivided into smaller modular units. The modular approach has several benefits: The library is easier to learn, the implementation work is more easily spread among the project partners, and the reduction of dependencies facilitates testing and maintenance.

The geometric kernel contains simple geometric objects of constant size such as points, lines, segments, triangles, tetrahedra, circles, and more. It provides geometric predicates on those objects, operations to compute intersections of and distances between objects, and affine transformations. The kernel objects are closed under affine transformations, e.g., the existence of circles implies that there are also ellipses in the kernel.

The geometric kernel is split into three parts, one for two-dimensional objects, one for three-dimensional objects, and one for general-dimensional objects. Geometry in two and three dimensions is well studied and has lots of applications, which explains their special status. For all dimensions, both Cartesian and homogeneous representations are available for the coordinates.

To solve robustness problems, CGAL advocates the use of exact arithmetic instead of floating point arithmetic. An arithmetic is associated with a number type in CGAL, and the classes in the geometric kernel are parameterized by number types. CGAL provides its own number types and supports number types from other sources, e.g., from LEDA or the GNU Multiple Precision library. Since the arithmetic operations needed in CGAL are quite basic, every library supplying number types can be adapted easily to work with CGAL.

The basic library contains more complex geometric objects and data structures: polygons, triangulations, planar maps, polyhedra, and so on. It also contains algorithms, such as computing the convex hull of a set of points, the union of two polygons, the smallest enclosing ellipse, and so on. The figure above indicates the major parts in the basic library. These parts are mostly independent from each other and even independent from the kernel. This independence has been achieved with geometric traits classes, to be discussed later. Default implementations of the traits classes use the CGAL kernel for the types and primitive operations. Other implementations of the traits classes provided in CGAL use the LEDA geometric part. The traits class requirements are simple enough for a user to be able to write a traits class for their own geometric data types and operations.

The core library offers basic non-geometric functionality that is needed in the geometric kernel or the basic library, for example support for coping with different C++ compilers which all have their own limitations. The core library contains the support for assertions, preconditions and postconditions. Circulators and random number generators belong here as well.

The support library also contains functionality with non-geometric aspects. In contrast to the core library, this functionality is needed neither by the geometric kernel nor by the basic library. The support library interfaces the geometric objects with external representations, like visualizations or external file formats. Among the list of supported formats are VRML and PostScript as well as the GeomView program and LEDA windows for 2D and 3D visualization. The support library also contains generators for synthetic test data sets, for example random points uniformly distributed in a certain domain. The adaptation of number types from other libraries is contained in the support library as well. The separation from the kernel and the basic library makes the functionality in the support library orthogonal and therefore open for future extensions.

Geometric Kernel

The geometric kernel contains types for objects of constant size, such as point, vector, direction, line, ray, segment, triangle, iso-oriented rectangle, and tetrahedron. Each type provides a set of member functions, for example access to the defining objects, the bounding box of the object if existing, and affine transformation. Global functions are available for the detection and computation of intersections as well as for distance computations.

The current geometric kernel provides two families of geometric objects: one family is based on the representation of points using Cartesian coordinates, the other on the representation of points using homogeneous coordinates. The homogeneous representation extends the Cartesian representation by an additional coordinate which is used as a common denominator. More formally, in d-dimensional space, a point with homogeneous coordinates (x0, x1, ..., xd-1, xd), where xd != 0, has Cartesian coordinates (x0/xd, x1/xd, ..., xd-1/xd). This avoids divisions and reduces many computations in geometric algorithms to calculations over the integers. The homogeneous representation is used for affine geometry in CGAL, and not for projective geometry, from which the homogeneous representation is usually known. Both families are parameterized by the number type used to represent the Cartesian or homogeneous coordinates. The type CGAL::Cartesian<double> specifies the Cartesian representation with coordinates of type double, and the type CGAL::Homogeneous<int> specifies the homogeneous representation with coordinates of type int. These representation types are used as template argument in all geometric kernel types; for example, a two-dimensional point is declared as

namespace CGAL {
    template <class R> class Point_2;
}

with a template parameter R for the representation class. Typedefs can be used to introduce conveniently short names for the types. Here is an example for the point type with the homogeneous representation and coordinates of type int:

    typedef CGAL::Point_2< CGAL::Homogeneous<int> > Point_2;

The class templates parameterized with CGAL::Cartesian or CGAL::Homogeneous provide the user with a common interface to the underlying representation. This common interface can be used in higher-level implementations independently of the actual coordinate representation. The list of requirements on the template parameter defines the concept of a representation class for the geometric kernel.

CGAL provides clean mathematical concepts to the user without sacrificing efficiency. For example, CGAL strictly distinguishes points and (mathematical) vectors, i.e., it distinguishes affine geometry from the underlying linear algebra. Identifying points and vectors invites geometrically illicit computations; in particular, points and vectors behave differently under affine transformations. We do not even provide automatic conversion between points and vectors, but use the geometric concept of an origin instead. The symbolic constant CGAL::ORIGIN represents a point and can be used to compute the locus vector as the difference between a point and the origin. Function overloading is used to implement this operation internally as a simple conversion without any overhead. Note that we do not provide the geometrically invalid addition of two points, since this might lead to ambiguous expressions: assuming three points p, q, and r and an affine transformation A, one can write in CGAL the perfectly legal expression A( p + ( q - r)). The slightly different expression A(( p + q) - r) contains the illegal addition of two points. However, if we allowed this addition, we would expect the same result coordinatewise as in the previous, legal expression. But this is not necessarily intended, since the expression within the affine transformation is then meant to evaluate to a point, and not to a vector as in the previous expression, and vectors and points behave differently under affine transformations. To avoid these ambiguities, neither the addition of points nor the automatic conversion between points and vectors is provided. (See Origin.C for an example point/vector/origin implementation.)

Class hierarchies are used rarely in CGAL. One example is affine transformations, which maintain distinct internal representations specialized for restricted transformations. The internal representations differ considerably in their space requirements and the efficiency of their member functions. For all but the most general representation, we gain performance in terms of space and time. And for the most general representation, the performance penalty caused by the virtual functions is negligible, because the member functions are computationally expensive for this general representation. Alternatively, we could have used only this general representation for affine transformations. But the use of a hierarchy is justified, since the specialized representations, namely translation, rotation, and scaling, arise frequently in geometric computing.

Another design decision was to make the (constant-size) geometric objects in the kernel non-modifiable (value semantics). For example, there are no member functions to set the Cartesian coordinates of a point. Points are viewed as atomic units, and no assumption is made on how these objects are represented. In particular, there is no assumption that points are represented with Cartesian coordinates. They might use polar coordinates or homogeneous coordinates instead. Then, member functions to set the Cartesian coordinates are expensive. Nevertheless, in current CGAL the types based on the Cartesian representation as well as the types based on the homogeneous representation have both member functions returning Cartesian coordinates and member functions returning homogeneous coordinates. These access functions are provided to make implementing own predicates and operations more convenient.

Like other libraries we use reference counting for the kernel objects. Objects point to a shared representation. Each representation counts the number of objects pointing to it. Copying objects increments the counter of the shared representation, deleting an object decrements the counter of its representation. If the counter reaches zero by the decrement, the representation itself is deleted. The implementation of reference counting is simplified by the non-modifiability of the objects. However, the use of reference counting was not the reason for choosing non-modifiability. Using `copy on write' (a new representation is created for an object whenever its value is changed by a modifying operation), reference counting with modifiable objects is possible and only slightly more involved. Reference counting costs about 15% to 30% runtime for the types double and float , but gains 2% to 11% runtime for the type leda_real .

Basic Library

The basic library contains more complex geometric objects and data structures, such as polygons, polyhedra, triangulations (including Delaunay triangulations), planar maps, range trees, segment trees, and kd-trees. It also contains geometric algorithms, such as convex hull, smallest enclosing circle, ellipse, and sphere, Boolean operations on polygons, and map overlay.

Following the generic programming paradigm as introduced above, CGAL is made to comply with STL. The interfaces of geometric objects and data structures in the basic library make extensive use of iterators, circulators, and handles (trivial iterator), so that algorithms and data structures can be easily combined with each other and with those provided by STL and other libraries.

An example of a geometric algorithmic problem is the computation of the convex hull. The algorithm takes a set of points and outputs the sequence of extreme points on the boundary of the convex hull. The following program computes the convex hull of 100 random points uniformly distributed in the disc of radius one centered at the origin. The point generator gets as parameter a random source. This random source is initialized with the fixed seed 1 in this example. The result is drawn in a LEDA window, first all points in black, then the hull as a polygon in green, and finally the vertices in red. The header file hides the usual typedefs and declares all types parameterized with the representation class CGAL::Cartesian<double>.

    #include "cartesian_double.h"

    int main() {
        Random rnd(1);
        Random_points_in_disc_2 rnd_pts( 1.0, rnd);
        list<Point_2> pts;
        copy_n( rnd_pts, 100, back_inserter( pts));
        Polygon_2 ch;
        CGAL::convex_hull_points_2( pts.begin(), pts.end(), back_inserter( ch));
        Window* window = demo_window();
        Window_iterator_point_2 wout( *window);
        copy( pts.begin(), pts.end(), wout);
        *window << CGAL::GREEN << ch << CGAL::RED;
        copy( ch.vertices_begin(), ch.vertices_end(), wout);
        Point_2 p;
        *window >> p; // wait for mouse click
        delete window;
        return 0;
    }

This program also illustrates the use of the CGAL polygon as a container class. The back_inserter adaptor of STL is applicable as expected, which illustrates the generic tool-box character of STL and its concepts.

Triangulations are another example of a container-like data structure in the basic library. Triangulations in CGAL support the incremental construction. The following program is therefore even simpler than the previous one; without using any intermediate container to store all input points, 100 random points are copied into the triangulation data structure.

    #include "cartesian_double.h"

    int main() {
        Random rnd(1);
        Random_points_in_disc_2 rnd_pts( 1.0, rnd);
        Delaunay_triangulation_2 dt;
        copy_n( rnd_pts, 100, back_inserter( dt));
        Window* window = demo_window();
        *window << dt;
        Point_2 p;
        *window >> p; // wait for mouse click
        delete window;
        return 0;
    }

The major technological achievement in the design of the basic library was the concept of the geometric traits class, see the next Section, which allows the reuse of the triangulation data structure, for example to triangulate a set of three-dimensional points with respect to their xy-projection (useful to reconstruct terrains). We assume in the following example that the representation class for the geometric kernel is named REP in the header file. The program reads three-dimensional points from cin, triangulates them, and writes the triangulation to cout. Note that most of these typedefs are equal to those hidden previously in the header file; the only change is the geometric traits class, from CGAL::Triangulation_euclidean_traits_2 to the one given here.

    #include "cartesian_double.h"
    #include <CGAL/Triangulation_euclidean_traits_xy_3.h>

    typedef CGAL::Triangulation_euclidean_traits_xy_3<REP> Traits;
    typedef CGAL::Delaunay_triangulation_2<Traits>         Triangulation_xy;

    int main() {
        Triangulation_xy dt;
        copy( istream_iterator<Point_3>(cin), istream_iterator<Point_3>(),
              back_inserter( dt));
        cout << dt;
        return 0;
    }

The triangle-based data structure and the halfedge data structure used for the planar map and the polyhedral surface in the basic library are based on the design of combinatorial data structures described in [Kettner98]. The revised design of the halfedge data structure for polyhedral surfaces is described in [Kettner99]. It is based on the design in Section Solving Mutual Dependencies between Class Templates.

Geometric Traits Classes

A geometric traits class separates a geometric algorithm or geometric data structure from its underlying geometric kernel. 

The following example of a convex hull algorithm illustrates the use of a geometric traits class. The algorithm used is Andrew's variant of Graham's scan [Andrew79]. The actual implementation presented here stems from our framework for one-sided error predicates [Kettner98b]. The implementation has been modified to use the iterator-based interface from above and the traits class.

Andrew's variant of Graham's scan needs only a point type and a leftturn predicate from the geometric kernel, given that the input points are already sorted lexicographically. Thus, the geometric traits class is quite short in this example. The leftturn predicate for three points p, q, and r in the plane is true if the points in this order perform a left turn. For points represented in Cartesian coordinates, the predicate is equivalent to the positive sign of the following determinant:

    | q.x-p.x   q.y-p.y |
    | r.x-p.x   r.y-p.y |  >  0

In the following example, the geometric traits class for the convex hull algorithm is itself a class template parameterized with the number type NT for the point coordinates. NT is the same type as used for the Cartesian point type Point_2 . The evaluation of the determinant is implemented as a function object assuming exact arithmetic and the leftturn member function gives access to the function object.

    template <class NT>
    struct Point_2 {
        typedef NT Number_type;
        NT x;
        NT y;
    };

    template <class NT>
    struct Convex_hull_traits {
        typedef Point_2<NT> Point;
        struct Leftturn {
            bool operator()( const Point& p, const Point& q, const Point& r) {
                return (q.x-p.x) * (r.y-p.y) > (r.x-p.x) * (q.y-p.y);
            }
        };
        Leftturn leftturn() const { return Leftturn(); }
    };

The algorithm is parameterized with iterators. It requires that the sequence of input points from the range [first,beyond) of bidirectional iterators is lexicographically sorted, contains only pairwise distinct points, and has at least two points. The algorithm computes the convex hull and copies all points on the boundary of the convex hull (not only the vertices) in counterclockwise order to the output iterator result. It runs in linear time and space and can produce up to 2n-2 output points in the degenerate case of all points lying on a segment, where n is the number of input points. The local vector could be omitted if the algorithm could use the output container as a stack, but this is beyond the capabilities of the currently defined iterator categories and would restrict the applicability of the algorithm. Furthermore, this change would distract here from the purpose of this example, the illustration of the use of a geometric traits class in a geometric algorithm.

    template <class BidirectionalIterator, class OutputIterator, class Traits>
    OutputIterator
    convex_hull( BidirectionalIterator first, BidirectionalIterator beyond,
                 OutputIterator result, const Traits& traits)
    {
        typedef typename Traits::Point Point;
        vector<Point> hull;
        hull.push_back( *first);  // sentinel
        hull.push_back( *first);
        // lower convex hull (left to right)
        BidirectionalIterator i = first;
        for ( ++i; i != beyond; ++i) {
            while ( traits.leftturn()( hull.end()[-2], *i, hull.back()))
                hull.pop_back();
            hull.push_back( *i);
        }
        // upper convex hull (right to left)
        i = beyond;
        for ( --i; i != first; ) {
            --i;
            while ( traits.leftturn()( hull.end()[-2], *i, hull.back()))
                hull.pop_back();
            hull.push_back( *i);
        }
        // clean up and copy hull to output iterator
        hull.pop_back();
        hull.front() = hull.back();
        hull.pop_back();
        return copy( hull.begin(), hull.end(), result);
    }

Calling the algorithm with the traits class is straightforward; the default constructor of the traits class is used. We can use iterator_traits to deduce the point type from the iterator type, and from the point type the number type, which can be used to provide our geometric traits class Convex_hull_traits as default argument to the convex hull algorithm. We use an overloaded definition of the convex hull algorithm with three parameters to do so.

    template <class BidirectionalIterator, class OutputIterator>
    OutputIterator
    convex_hull( BidirectionalIterator first, BidirectionalIterator beyond,
                 OutputIterator result)
    {
        typedef typename iterator_traits<BidirectionalIterator>::value_type P;
        typedef typename P::Number_type          Number_type;
        typedef Convex_hull_traits<Number_type>  Traits;
        return convex_hull( first, beyond, result, Traits());
    }

One benefit of using function objects in the traits class instead of plain member functions is the possible association of a state with the function object. We extend this to a traits class with a state.

In our framework on one-sided error predicates [Kettner98b], we introduced the notion of a conservative implementation of a predicate. If a conservative implementation of the leftturn predicate returns true, the three points perform a left turn; but if it returns false, we do not know the orientation of the points. Thus, decision errors due to rounding errors in inexact arithmetic are limited to one side of the two possible answers. Useful applications of such predicates will assume that these false answers occur rarely, although in principle an implementation that always answers false is a legal implementation. The paper gives examples of convex hull algorithms and a triangulation of point sets that compute a well-defined output even if the predicate is not exact but a conservative implementation. Andrew's variant of Graham's scan as presented above, when used with a conservative implementation of the predicate, computes a sequence of points of which the points on the convex hull are a subsequence. The output can easily be postprocessed with the same algorithm but with an exact implementation of the predicate. For more properties of the computed output and other examples see [Kettner98b].

For floating point arithmetic an error bound can be computed such that if the expression computing the determinant is larger than the error bound, the exact value of the determinant is greater than zero. The expression to compute the determinant and its error bound give us the conservative implementation

    (q.x-p.x)*(r.y-p.y) - (q.y-p.y)*(r.x-p.x)  >  8(3u + 6u^2 + 4u^3 + u^4) B^2,

where u is the unit roundoff of the floating-point number system and B is the absolute value of the maximal coordinate value of the points. We have u = 2^-53 for IEEE double precision and u = 2^-24 for IEEE single precision floating-point numbers. In the following traits class we assume a built-in type double following the IEEE standard, and we make the value B a state value of the traits class. In order to make the error bound representable as double, we round it to (3*2^-50 + 2^-100) B^2 and require B to be a power of two.

    class Convex_hull_traits_2 {
        double B;
    public:
        typedef Point_2<double> Point;
        Convex_hull_traits_2( double b) : B(b) {}
        struct Leftturn {
            double B;
            Leftturn( double b) : B(b) {}
            bool operator()( const Point& p, const Point& q, const Point& r) {
                const double C = 1.0 / 1024.0 / 1048576.0 / 1048576.0; // 2^-50
                return (q.x-p.x) * (r.y-p.y) - (r.x-p.x) * (q.y-p.y)
                       > (3.0 * C + C * C) * B * B;
            }
        };
        Leftturn leftturn() const { return Leftturn(B); }
    };

Just to show a specialization of a class template, the following definition replaces the generic traits class, which assumes exact arithmetic, with our specialized conservative predicate traits for the number type double. In this implementation, B is arbitrarily set to one.

    template <>
    struct Convex_hull_traits<double> : public Convex_hull_traits_2 {
        Convex_hull_traits() : Convex_hull_traits_2(1.0) {}
    };

Since B is probably a constant parameter, it might be more appropriate to make it a template parameter in Convex_hull_traits_2. However, another example of a useful traits class with a state is the computation of the two-dimensional convex hull for a set of three-dimensional points projected onto a two-dimensional plane. Instead of projecting the points and computing the convex hull on the projected points, the convex hull can be computed with the original three-dimensional points and a modified leftturn predicate that takes the projection, stored as a local state, into account.

Another example of the flexibility of the geometric traits classes is the reconstruction of a terrain from a set of three-dimensional sample points. A common approach is to triangulate the sample points using a Delaunay triangulation in the xy-projection, simply ignoring the elevation in the z-coordinate. Similar to the convex hull algorithm, a geometric traits class can be used to parameterize the two-dimensional triangulation algorithm to work on the three-dimensional data set. See the example for CGAL in the previous Section.

7. Design Patterns

Introduction

The term design patterns was introduced to a broad audience of object-oriented developers by the book of the same name [Gamma95].

Design patterns capture the essence of a design solution that has been proven to be useful in practice. Patterns are not invented, patterns are discovered in existing systems. A pattern has to show up in different systems before it is considered to be useful in practice.

A pattern consists of:


Name: such as Adapter, Observer, Singleton, ...

Problem: the problem statement, its context, and preconditions.

Solution: classes, objects, their relationships, responsibilities, and collaborations.

Consequences: results and tradeoffs.

Further information on patterns on the net:

Appleton, Essential concepts: http://www.cmcrossroads.com/bradapp/docs/patterns-intro.html

Patterns Homepage: http://hillside.net/patterns/

WUSTL Patterns page: http://www.cs.wustl.edu/~schmidt/patterns.html

Abstract Factory Pattern

Problem: Consider an application using different graphical user interfaces (GUIs) with widgets of different look and feel. The application should be separated from the decision about the look and feel of its graphical user interface.

Solution: Instead of creating widgets directly, we ask an object -- the factory -- to create them for us. Each widget has its own abstract base class and concrete implementations according to the different GUIs.

    class Window {
    public:
        virtual ~Window() {}
    };
    class Motif_window  : public Window {};
    class Athena_window : public Window {};

    class Button {
    public:
        virtual ~Button() {}
    };
    class Motif_button  : public Button {};
    class Athena_button : public Button {};

The factory itself is also implemented as an abstract base class and a concrete class for each GUI. The abstract base class defines an interface with a member function to create each widget. The concrete factory classes create the widgets of the respective GUI.

    class Widget_factory {
    public:
        virtual Window* create_window() = 0;
        virtual Button* create_button() = 0;
        virtual ~Widget_factory() {}
    };
    class Motif_widget_factory : public Widget_factory {
    public:
        virtual Window* create_window() { return new Motif_window; }
        virtual Button* create_button() { return new Motif_button; }
    };
    class Athena_widget_factory : public Widget_factory {
    public:
        virtual Window* create_window() { return new Athena_window; }
        virtual Button* create_button() { return new Athena_button; }
    };

The application uses only pointers to the abstract base classes of the various widgets. It uses a pointer to a factory to create concrete instances of widgets.

Consequences: A factory isolates concrete classes. It makes exchanging a whole family of classes easy and promotes consistency within the family. New kinds of widgets are difficult to add.

Singleton Pattern

Problem: Ensure a class has only one instance, and provide a global point of access to it. Examples are classes representing unique resources in a system, such as a printer spooler or a window manager.

Solution: The singleton class has only private constructors that forbid the creation of any objects outside of the class. A static member function is used as a global point of access. It returns a pointer to the unique instance of the singleton that is stored internally in a static member variable.

    class Singleton {
        Singleton();
        Singleton( const Singleton&);
        static Singleton* instance;
    public:
        static Singleton* Instance();
    };


    Singleton* Singleton::instance = 0;
    Singleton* Singleton::Instance() {
        if ( instance == 0)
            instance = new Singleton;
        return instance;
    }

Consequences: controlled access to the sole instance, a reduced namespace (compared to global variables), and the possibility of permitting other limits on the number of instances.

Adaptor Pattern (a.k.a. Wrapper Pattern)

Problem: A class C provides the right functionality, but not the correct interface, e.g., it is not derived from a required base class.

Solution: Write another class A with the correct interface that contains an instance of class C. Class A implements its interface by calling the right functions in its object of class C. This pattern is so simple that I omit an example here.

The boundary is blurred between an adaptor pattern and a class using some other class to implement its functionality. An adaptor does not perform the major part of the task itself.

Consequences: Helps if derivation from a required base class is missing (e.g., when combining different libraries). The same holds for sets of template requirements (see the adaptors in the STL). Tradeoff: how much work does the adaptor have to do itself?

Visitor Pattern

Problem: Given a collection of objects, whose types come from a class hierarchy, and a set of operations that operate on these objects. The visitor pattern lets us define new operations without changing the classes of the elements on which they operate.

Solution: First, the canonical solution that is not extendible. Let us assume a scene graph API with different node types Transform and Geometry , both derived from the abstract base class Node. Let us define two operations, render and optimize , which we make virtual member functions of Node.

    struct Node {
        virtual void render()   = 0;
        virtual void optimize() = 0;
        virtual ~Node() {}
    };
    struct Transform : public Node {
        virtual void render();
        virtual void optimize();
    };
    struct Geometry : public Node {
        virtual void render();
        virtual void optimize();
    };

This solution is easy to extend with new subclasses of Node, but it is hard to extend with new operations.

Now, the visitor pattern factors out the operations into a separate hierarchy of classes, all derived from the abstract base class Visitor, with each derived class representing one operation. The Visitor classes have a virtual member function for each type in the Node class hierarchy, which performs the operation on that type. We also change the Node hierarchy to accept an object of type Visitor and to call the member function that corresponds to the actual type of the node object. The application now takes an operation object and calls accept on each node object with it.

    struct Transform; // forward declaration
    struct Geometry;  // forward declaration
    struct Visitor {
        virtual void visit_transform( Transform* ) = 0;
        virtual void visit_geometry ( Geometry*  ) = 0;
        virtual ~Visitor() {}
    };
    struct Render : public Visitor {
        virtual void visit_transform( Transform* ); // render Transform
        virtual void visit_geometry ( Geometry*  ); // render Geometry
    };
    struct Optimize : public Visitor {
        virtual void visit_transform( Transform* ); // optimize Transform
        virtual void visit_geometry ( Geometry*  ); // optimize Geometry
    };


    struct Node {
        virtual void accept( Visitor* ) = 0;
        virtual ~Node() {}
    };
    struct Transform : public Node {
        virtual void accept( Visitor* v) { v->visit_transform( this); }
    };
    struct Geometry : public Node {
        virtual void accept( Visitor* v) { v->visit_geometry( this); }
    };

Consequences: The visitor pattern is easy to extend with new operations in the Visitor hierarchy, but it is hard to extend with new object types. The visitor pattern groups related operations together. Visitors can be used to accumulate state. Visitors might require breaking some encapsulation (exposing more implementation details of the objects), since the operation is no longer a member function of the object.

CGAL::Object, a Polymorphic Type in CGAL

Problem: We need a polymorphic return type for intersections in CGAL; for example, the intersection of two segments can be either empty, a point, or a segment. But we do not want a single global base class and class hierarchy for all classes in CGAL. Here is the example of segment intersection in CGAL:

    Segment s( Point(1,1), Point(1,5));
    Segment t( Point(1,3), Point(1,8));
    CGAL::Object result = CGAL::intersection( s, t);
    Point pt;
    if (CGAL::assign( pt, result))
        cout << "intersection point = " << pt << endl;

Solution: We do not impose any restriction on CGAL classes. Instead, we create an additional class hierarchy only where we need it, here for the polymorphic return type (see also [Weihe98]).

    class Base {                     // base class for wrapper classes
    public:
        virtual ~Base() {}
    };
    template <class T>
    class Wrapper : public Base {    // generic wrapper class
        T object;
    public:
        Wrapper(const T& obj) : object(obj) {}
        Wrapper() {}
        operator T() { return object; }
    };
    class Object {                   // polymorphic object (smart pointer)
        Base* p;
    public:
        // ...
        Base* base() const { return p; }
    };

The assign function uses runtime type information to check whether the object stored in the wrapper behind CGAL::Object is of the appropriate type to actually copy it.

    template <class T>
    bool assign(T& t, const Object& o) {
        Wrapper<T>* wp = dynamic_cast<Wrapper<T>*>(o.base());
        if (wp == 0)
            return false;
        t = *wp;
        return true;
    }

Consequences: The original CGAL classes are not influenced by the design decision for intersection computation (locality of this design decision). They do not carry the extra vptr and associated costs for RTTI (runtime type information).

Remark: In CGAL, the design is a bit more involved since it exploits the handle/rep scheme with reference counting of CGAL to avoid the cost of copying the types into and out of the wrapper class.

8. Template Metaprograms

Introduction


Template metaprogramming refers to a technique where the template instantiation mechanism of the C++ compiler is used to partially evaluate a program at compile time. At some point during the standardization process of C++ it was discovered that templates actually make the C++ compiler Turing equivalent at compile time. The first examples using the C++ compiler as an interpreter are credited to Erwin Unruh, among them the following program that computes prime numbers at compile time. The result is communicated via compiler warnings containing the prime numbers. This program circulated among the members of the ANSI/ISO C++ standardization committee. The program does not work anymore on current compilers; see prime.C for a modified version that compiles as promised on the current g++ compiler.

    // prime.C
    // Program by Erwin Unruh
    template <int i> struct D { D(void*); operator int(); };

    template <int p, int i> struct is_prime {
        enum { prim = ((p%i) && is_prime< (i>2 ? p : 0), i-1>::prim) };
    };

    template <int i> struct Prime_print {
        Prime_print<i-1> a;
        enum { prim = is_prime<i,i-1>::prim };
        void f() { D<i> d = prim; }
    };

    struct is_prime<0,0> { enum { prim = 1 }; };
    struct is_prime<0,1> { enum { prim = 1 }; };
    struct Prime_print<2> {
        enum { prim = 1 };
        void f() { D<2> d = prim; }
    };

    void foo() { Prime_print<20> a; }

Todd Veldhuizen uses these advanced template techniques with template metaprograms [Veldhuizen95a] and with expression templates [Veldhuizen95b], which we are going to detail later.

In [Veldhuizen95a] a template metaprogram for bubble sort is described. Instantiated for a constant number of elements, the metaprogram unrolls the loops of bubble sort and creates the decision tree to sort the elements. The second example is a compile-time function for sine and cosine that can be used to implement a metaprogram for the FFT, resulting in a single unrolled function for a 256-point FFT with all roots evaluated as constants.

In [Veldhuizen95b] templates are used to write code such as:

    // Integrate a function from 0 to 10
    DoublePlaceholder x;
    double result = integrate( x / (1.0 + x), 0.0, 10.0 );

The term x / (1.0 + x) is a template expression. Its benefit is that it can be expanded inline in the integrate function. Template expressions also have a nicer syntax than functors. The same technique is used in the paper to implement vector arithmetic more efficiently, namely by avoiding large temporary results in vector expressions.

Both techniques are used in Blitz++, a library for numerical algorithms with vectors and matrices, see http://oonumerics.org/blitz/ .

Compile-time programming

The C++ language provides two entities which can be used to program at compile time:

- types
- constant integral values

We now describe a list of basic operations on these entities, which can be used to program "at compile time". Let us begin by making an analogy with the following usual function:

    int f(T t) { return t.category(); }
    int a = f(t); // function (f) : associates a value (a) to an object (t)

Now observe the similarity with the following function on types:

    template < typename T >
    struct F {
        typedef typename T::Category type;
    };
    typedef F<T>::type A; // function (F) : associates a type (A) to the type (T)

Constant integral values can be associated to types and vice versa, at compile time.

    template < typename T > struct G { enum { value = 3 }; };
    template < int i >      struct H { typedef H<i+1> type; };
    template < int i >      struct K { enum { value = i+1 }; };

    int a = G<T>::value;  // (G) : associates an integral value (a) to the type (T)
    typedef H<2>::type A; // (H) : associates a type (A) to an integral constant value (2)
    int a = K<2>::value;  // (K) : associates an integral value (a) to an integral constant value (2)

Of course, such "meta" functions are not restricted to only one argument, nor even to one return entity.

    template < int i, typename T, int j, typename U >
    struct Z {
        enum { value1 = i+1 };
        enum { value2 = j+i };
        typedef Y<T, U> type;
    };

Built-in operators (+, -, *, /, %, ?:, ^, !, &, |, &&, ||, <<, >>, <, >, <=, >=, ==, !=) can also be used to generate constant integral values directly:

    int a = K<2+3>::value; // (+) : associates an integral value (5) to two integral constant values (2 and 3)

We can apply composition of meta functions (F maps a type to a type, G maps a type to a value):

    int a = G< F<T>::type >::value; // Composition of F and G.

To do real programming, we need to be able to perform branches. This can be achieved using (possibly partial) specialization.

    template < int i > struct D    { enum { value = D<i-1>::value }; };
    template <>        struct D<0> { enum { value = 0 }; };

    // Which is equivalent to the run-time program:
    int d(int i) {
        if (i == 0) return 0;
        return d(i-1);
    }

    // Note that the following works as well in this particular case:
    template < int i > struct E { enum { value = i==0 ? 0 : D<i-1>::value }; };

The is_prime  example in the introduction illustrates some more possibilities. 

Another tool, which allows connecting compile-time programming to the powerful function overload resolution mechanism of C++, is the sizeof() operator:

    struct LargeStruct { char c[2]; };

    char        f(...);
    LargeStruct f(double);

    template < int s >
    struct Is_double { enum { value = (s == sizeof(LargeStruct)) }; };

    // x is some expression, e.g. "std::sqrt(2.0)"
    int a = Is_double< sizeof(f(x)) >::value;

(We compare against sizeof(LargeStruct): for a double argument the overload returning LargeStruct is chosen; the ellipsis overload, returning char, catches everything else.)

Note that sizeof() returns an integral constant value. It also has the property that the expression to which it is applied need not be defined (being declared is enough). Overload resolution is a complex mechanism, and we will see later how it allows us to extract some properties of types automatically.

Constraining templates using the SFINAE principle


In C++, overload resolution is the process by which the compiler selects which specific function to use in a given call, when that function is overloaded. Consider the following overloaded function f:

    void f();
    void f(int);
    void f(double);
    void f(...);
    template < typename T > void f(T const&);

    int main () {
        float x;
        f(x); // which function f is going to be called ?
    }

The compiler proceeds in two stages: it first selects all functions that could match the argument (in this case, the last three declarations), then it discards those which require a conversion of lower "priority". If there is no unique best match, the ambiguity leads to an error. So conversion rules come into play, and this is a very complicated area of C++. In this case, the template version is going to be chosen: there is no exact match otherwise, the other candidates require conversions of lower priority, and the ellipsis version has the least priority of all.

There is a particular feature of the first processing stage performed by the compiler, when it tries to substitute the types of the actual arguments of the function call into each function declaration : this process does not produce an error if it fails, it just discards the considered function from the set of possible matching functions. This is called the "Substitution Failure Is Not An Error" (SFINAE) principle. Now let's add the following additional overloaded function to the previous program :

    template < typename T > struct G {};
    template < typename T > void f(T const&, typename G<T>::type = 0);

Given that G does not define a nested type type, trying to substitute float in this overloaded function is going to fail. The SFINAE principle implies that the program is perfectly valid, and that the first template is still chosen.

Now what happens if we define a nested type type in G ?

template < typename T > struct G { typedef int type; };

Then the second template matches as well (there is no failure during the substitution process), and we get an ambiguity error with the first template, because it matches with equal priority.

Now this opens new horizons, because the template class G can be specialized for particular types, and can define or not define a nested type type. This makes it possible to select among templates, constraining them depending on properties of the types involved. However, the process is not yet satisfactory, because it requires an additional argument to the function (although we can specify a default value), and this is impossible for overloaded operators.

The remark that is going to save this idea for operators is that the SFINAE principle applies to the whole function signature, including the return type.

template < typename T > typename G<T>::type f(T const&);

Let us now have a look at how we could make use of this in a practical case: the CGAL library. There, we would like to define generic functions for adding geometric vectors:

    template < typename Vector_2 >
    Vector_2 operator+(Vector_2 const &v, Vector_2 const &w)
    {
        return Vector_2(v.x() + w.x(), v.y() + w.y());
    }

Obviously, this doesn't work, because this template is way too general (it matches almost any type), and it would clash with, for example:

    template < typename Vector_3 >
    Vector_3 operator+(Vector_3 const &v, Vector_3 const &w)
    {
        return Vector_3(v.x() + w.x(), v.y() + w.y(), v.z() + w.z());
    }

So, let's consider concrete types provided by the user, to which we would like to apply our generic operator:

    class My_vector { // 2D vector type
        int _x, _y;
    public:
        My_vector(int x, int y) : _x(x), _y(y) {}
        int x() const { return _x; }
        int y() const { return _y; }
    };
    class My_point { // 2D point type
        int _x, _y;
    public:
        My_point(int x, int y) : _x(x), _y(y) {}
        int x() const { return _x; }
        int y() const { return _y; }
    };

The types My_vector and My_point have identical interfaces, and we would like our generic operator to apply only to My_vector.

So the idea is to constrain the function template, in such a way that it will be considered only for types that are 2D geometric vector types, and which provide the needed requirements (.x() and .y() member functions, correct constructor, adequate semantics...). This kind of thing is typically a job for a traits class.

    template < typename T >
    struct IsVector_2 {
        enum { value = false };  // By default, a type is not a 2D vector.
    };
    template <>
    struct IsVector_2<My_vector> {
        enum { value = true };   // But My_vector is a 2D vector.
    };

So now we can easily constrain the template using another accessory tool:

    template < typename T, typename, bool = T::value >
    struct Enable_if;

    template < typename T, typename R >
    struct Enable_if<T, R, true> {
        typedef R type;
    };

    template < typename Vector_2 >
    typename Enable_if< IsVector_2<Vector_2>, Vector_2 >::type
    operator+(Vector_2 const &v, Vector_2 const &w)
    {
        return Vector_2(v.x() + w.x(), v.y() + w.y());
    }

The following main function illustrates what we finally get:

    int main() {
        My_vector v(1,2), w(3,4);
        My_vector z = v + w; // OK
        My_point p(1,2), q(3,4);
        My_point r = p + q;  // error:
                             // no match for `My_point& + My_point&' operator
    }

The whole program can be found here : vector_2.C.  

Type Classification

(Note : this section is inspired by chapter 19 of [Vandevoorde03]) 

We have just seen an example of a traits class that expresses a property of a type. We are now going to see ways to write traits classes that automatically extract fundamental properties of types.

Let's start by writing a traits class which determines if a type is a fundamental type or not. Given that there is a small finite number of such types, it is easy to enumerate them :

    template < typename > struct IsFundamental    { enum { value = false }; };
    template <> struct IsFundamental<int>         { enum { value = true }; };
    template <> struct IsFundamental<short>       { enum { value = true }; };
    template <> struct IsFundamental<double>      { enum { value = true }; };
    // similarly for bool, char, long, long double and the unsigned versions.

Now let's try to classify compound types, that is, types which are constructed from other types: plain pointers, references, arrays... We can use partial specialization to identify some of these categories. In this case, it is also useful to extract which type(s) the compound is made of (e.g. the type of each item for an array).

    // Primary template
    template < typename T > struct CompoundType {
        enum { IsPointer = false, IsReference = false, IsArray = false,
               IsFunction = false, IsPointerToMember = false };
        typedef T base_type;
    };
    // Partial specialization for references
    template < typename T > struct CompoundType<T&> {
        enum { IsPointer = false, IsReference = true, IsArray = false,
               IsFunction = false, IsPointerToMember = false };
        typedef T base_type;
    };
    // Partial specialization for pointers
    template < typename T > struct CompoundType<T*> {
        enum { IsPointer = true, IsReference = false, IsArray = false,
               IsFunction = false, IsPointerToMember = false };
        typedef T base_type;
    };
    // Partial specialization for arrays
    template < typename T, size_t N > struct CompoundType<T[N]> {
        enum { IsPointer = false, IsReference = false, IsArray = true,
               IsFunction = false, IsPointerToMember = false };
        typedef T base_type;
    };
    // Partial specialization for arrays of unknown bound
    template < typename T > struct CompoundType<T[]> {
        enum { IsPointer = false, IsReference = false, IsArray = true,
               IsFunction = false, IsPointerToMember = false };
        typedef T base_type;
    };
    // Partial specialization for pointers to members
    template < typename T, typename C > struct CompoundType<T C::*> {
        enum { IsPointer = false, IsReference = false, IsArray = false,
               IsFunction = false, IsPointerToMember = true };
        typedef T base_type;
    };

(The pointer specialization must of course match T*, not T& a second time.)

Unfortunately, function types cannot be recognized that easily. We therefore use the SFINAE principle, after making the following remark: the only types which cannot be gathered in an array are function types, void, and references. We exploit this fact by trying to form an array of the given type; if that fails, the type is one of these special types. "Trying", here, is a synonym for using the SFINAE principle.

    template < typename T >
    class IsFunction {
        struct LargeStruct { char c[2]; };
        template < typename U > static char        test(...);
        template < typename U > static LargeStruct test(U (*)[1]);
    public:
        enum { value = sizeof( test<T>(0) ) == 1 };
    };
    template < typename T > struct IsFunction<T&>         { enum { value = false }; };
    template <>             struct IsFunction<void>       { enum { value = false }; };
    template <>             struct IsFunction<void const> { enum { value = false }; };

How do we determine enumerated types? What features of these types could we exploit to detect them? We know that they are convertible to integral types. But we need something more, because a user-defined class could also define a conversion to an integral type. We are not going to describe the full code here, but the curious reader can find the solution in [Vandevoorde03]. Hint: two user-defined conversions cannot be applied consecutively in an implicit conversion, whereas promotions between enums and integral types do not have this restriction.

To conclude: it is possible to write traits classes that determine almost all basic properties of types. Libraries such as Boost provide such mechanisms in an extensive way, and the next revision of the standard library may contain such a type_traits mechanism. However, there are properties which cannot be determined using the current language: it is possible to determine whether a class defines a static data member of a given name using SFINAE (static_member_detection.C), but it is not possible to extract the list of all data members of a class. Such a possibility would require changes in the core language. The following section illustrates what could be done with such a feature.

Self Reflection of Types in C++: The flatten.C Example

In this section I describe how far one could go with generic algorithms if types were first-class citizens in C++.

The following is a correct program in the functional programming language PolyP (see http://www.cs.chalmers.se/~patrikj/poly/polyp/ ):

    module Main where

    import Flatten(flatten)

    data Foo a = Bar | Cee a | Doo (Foo a) a (Foo a)

    x = Doo (Doo (Cee 'P') 'a' (Doo Bar 't' Bar)) 'r' (Doo Bar 'i' (Doo Bar 'k' Bar))

    main = print (flatten "Patrik", flatten x)
    -- expected output: ("Patrik", "Patrik")

Foo a is a typical parameterized recursive type definition, here a binary tree, where the nodes with constructor Doo and the leaves with constructor Cee contain an element of type a, while the leaves with constructor Bar are empty. The variable x contains such a tree. The generic function (called polytypic in PolyP) flatten performs an in-order traversal and concatenates all elements of type a it can find.

The trick is the implementation of flatten without knowing the structure of Foo. Instead, PolyP allows writing functions over the structure of how types can be defined in the language in general. There is only a small set of possibilities for how types can be defined in PolyP, for example, alternatives of types or Cartesian products of types.

Can we write such a flatten function in C++? No, we cannot, but almost. The missing information is what could be called self-inspection of types: we need meta-information about how a type is constructed. For example, for a struct we need to know its member types.

Let us assume we had such information, and for the running example we provide this information by hand (as a clever annotation to the type definition). Then, we could write the following program in C++ (see flatten.C for the full program):

    // ----------------------------------- USER DATA TYPES
    struct B;
    struct A { int i; A* a; B* b; };
    struct B { A* a; };

    // ----------------------------------- EXAMPLE MAIN PROGRAM
    int main() {
        A a1; a1.i = 1;
        A a2; a2.i = 2;
        A a3; a3.i = 3;
        B b;
        a1.a = &a2; a2.a = &a3; a3.a = 0;
        a1.b = &b;  a2.b = 0;   a3.b = 0;
        b.a = &a2;
        list<int> ls;
        flatten(a1,ls);
        copy( ls.begin(), ls.end(), ostream_iterator<int>(cout, " "));
        cout << endl;
    }

The program creates an acyclic graph rooted at a1. Nodes of type A can point to nodes of type A and to nodes of type B. They also contain an integer value, the value we want to concatenate during our flatten call. Nodes of type B can just point to a node of type A. In the accompanying figure (not reproduced here), nodes are labelled with their types, and nodes of type A additionally show their integer value.


The return value of the flatten call is stored in the list ls . The output of the program is the sequence [1,2,3,2,3].

We have to add some annotation for our type definitions of A and B to make this example work. For the flatten function, we can get away with a rather dense notation that gives us for each structure the types of the members that it contains (incl. how many), and a way to access the different members. We define a class template with two arguments. The first argument is the type we want to annotate. The second argument is the cardinal number of the current member we are describing. The general class template has an operator() that just returns an object of type UNKNOWN. For each member in each type we write a specialization of the class template. The specialization has an operator() that, given an object of that type, returns the member of the given cardinal number.

    // ----------------------------------- ANNOTATE TYPES FOR SELF-INSPECTION
    struct UNKNOWN {};
    template <class X, int i>
    struct T { UNKNOWN operator()( X&) { return UNKNOWN(); } };
    template <> struct T<A,1> { int operator()(A& a) { return a.i; } };
    template <> struct T<A,2> { A*  operator()(A& a) { return a.a; } };
    template <> struct T<A,3> { B*  operator()(A& a) { return a.b; } };
    template <> struct T<B,1> { A*  operator()(B& b) { return b.a; } };

The flatten function unrolls along the ways types can be defined in C++. We have restricted this example to deal with pointers and structures; we have omitted, for example, reference types, unions, and arrays. We follow the flatten function call. First, we distinguish between pointer types and non-pointer types, using partial specialization of a class template to implement this distinction. For pointer types, we apply flatten recursively on the dereferenced value of the pointer.

    // ----------------------------------- VALUE / POINTER
    template <class X, class I>
    struct flatten_class {
        void operator()( X x, list<I>& ls) {
            flatten_items( T<X,1>()(x), T<X,1>(), x, ls);
        }
    };
    template <class P, class I>
    struct flatten_class<P*,I> {
        void operator()( P* x, list<I>& ls) {
            if ( x != NULL)
                flatten_class<P,I>()(*x, ls);
        }
    };
    template <class X, class I>
    void flatten( X x, list<I>& ls) {
        flatten_class<X,I>()( x, ls);
    }

Second, we distinguish between structs and simple types. For structs, we iterate through all members of the struct and call flatten recursively. If the type is simple and equal to the value type of the list, we append the value to the list. Thus, in this implementation of the flatten function, the value type of the list actually determines what gets collected. We again use a class template and partial specialization to do the match. Note how the UNKNOWN return type causes the recursive enumeration of all the members to stop. Note also that in the case where we want to flatten a struct that has a meta description, we would get an ambiguous match between the second and fourth specializations; the third specialization resolves this ambiguity.

    // ----------------------------------- LEAF / STRUCT WITH SUBITEMS
    template <class Item, class X, int n, class I>
    struct flatten_items_class {             // flatten all subitems
        void operator()( Item i, X x, list<I>& ls) {
            flatten( i, ls);
            flatten_items( T<X,n+1>()(x), T<X,n+1>(), x, ls);
        }
    };
    template <class Item, int n, class I>
    struct flatten_items_class<Item,I,n,I> { // item of type I found
        void operator()( Item, I i, list<I>& ls) { ls.push_back(i); }
    };
    template <int n, class I>                // item of type I found
    struct flatten_items_class<UNKNOWN,I,n,I> {
        void operator()( UNKNOWN, I i, list<I>& ls) { ls.push_back(i); }
    };
    template <class X, int n, class I>       // no (further) subitems
    struct flatten_items_class<UNKNOWN,X,n,I> {
        void operator()( UNKNOWN, X x, list<I>& ls) {}
    };
    template <class Item, class X, int n, class I>
    void flatten_items( Item i, T<X,n>, X x, list<I>& ls) {
        flatten_items_class<Item,X,n,I>()(i,x,ls);
    }

This concludes our presentation of flatten.C. The purpose of this example was to stretch the imagination with what would be possible in C++ (and is possible in other languages), but also to show how ugly and unmaintainable it looks in C++.

Expression Templates

(Note : this section is inspired by chapter 18 of [Vandevoorde03]) 

Expression templates have several uses, but the main advantages of this technique are:

- they allow writing complex types in a natural manner; these types represent expressions.
- they can be applied to implement problem-specific optimizations automatically, reaching performance that otherwise only hand-coded low-level code could reach, by rephrasing expressions at compile time from a high-level description.

The typical example of expression templates is the optimization of array operations. Let's consider:

    class SArray {
        double * _a;
        size_t   _s;
    public:
        SArray (size_t n) : _a(new double[n]), _s(n) {}
        // copy constructor needed because of the owned array
        SArray (SArray const& o) : _a(new double[o._s]), _s(o._s) {
            for (size_t i = 0; i < _s; ++i) _a[i] = o._a[i];
        }
        double const& operator[](size_t i) const { return _a[i]; }
        double &      operator[](size_t i)       { return _a[i]; }
        size_t size() const { return _s; }
        ~SArray() { delete[] _a; }
    };

    SArray operator+(SArray const& a, SArray const& b)
    {
        assert(a.size() == b.size());
        SArray tmp(a.size());
        for (size_t i = 0; i < a.size(); ++i)
            tmp[i] = a[i] + b[i];
        return tmp;
    }

Now, if we consider the typical use below, we can see that there is an efficiency problem with the last line: a temporary array is created in order to store the result of x + y. This does not increase the number of additions to be performed, but it increases the amount of memory required by the program, which makes it slower due to cache effects.

    SArray x, y, z, t;
    ... // do something useful with x, y, z
    t = x + y + z;

We can solve the problem by using a dedicated function that adds three arrays in one shot:

    SArray add_3(SArray const& a, SArray const& b, SArray const& c)
    {
        assert(a.size() == b.size() && a.size() == c.size());
        SArray tmp(a.size());
        for (size_t i = 0; i < a.size(); ++i)
            tmp[i] = a[i] + b[i] + c[i];
        return tmp;
    }
    ...
    t = add_3(x, y, z);

This is inconvenient because the notation is cumbersome, and for any expression based on arrays we would like to benefit from the optimization without having to write a new function for, e.g., x+y*z-z*x, or whatever the user of our array class wants to compute. This is where expression templates come into play.

The idea is to postpone the actual evaluation of the expression until the assignment operator is seen. This is done by creating an object that will encode the expression in the form of a tree :


Each node in this tree has a particular type encoding the operation (e.g. +) that it represents, together with references to its operands.

    template <typename OP1, typename OP2>
    class A_Add {
        OP1 const& op1;  // first operand
        OP2 const& op2;  // second operand
    public:
        A_Add (OP1 const& a, OP2 const& b) : op1(a), op2(b) {}
        // the node's size (needed by the wrapper below)
        size_t size() const { assert(op1.size() == op2.size()); return op1.size(); }
        // what this node contributes to the final computation
        double operator[] (size_t i) const { return op1[i] + op2[i]; }
    };

Notice now how operator+ creates a node of the expression tree and performs no computation on the arrays at all. We first need to write a wrapper around the concrete array SArray; its assignment operator is what triggers the recursive computation over the tree.

    template <typename Rep = SArray>
    class Array {
        Rep expr_rep;
    public:
        // initialization with a size
        explicit Array(size_t s) : expr_rep(s) {}
        // create an array from a possible representation
        Array (Rep const& rb) : expr_rep(rb) {}
        // assignment of arrays of the same type
        Array& operator=(Array const& b) {
            assert(size() == b.size());
            for (size_t i = 0; i < b.size(); ++i)
                expr_rep[i] = b[i];
            return *this;
        }
        // assignment of arrays of a different type
        template <typename Rep2>
        Array& operator=(Array<Rep2> const& b) {
            assert(size() == b.size());
            for (size_t i = 0; i < b.size(); ++i)
                expr_rep[i] = b[i];
            return *this;
        }
        // requires the representation to provide size()
        size_t size() const { return expr_rep.size(); }
        double operator[] (size_t i) const {
            assert(i < size());
            return expr_rep[i];
        }
        Rep const& rep() const { return expr_rep; }
        Rep&       rep()       { return expr_rep; }
    };

    template <typename R1, typename R2>
    Array<A_Add<R1, R2> >
    operator+(Array<R1> const& a, Array<R2> const& b) {
        return Array<A_Add<R1, R2> >(A_Add<R1, R2>(a.rep(), b.rep()));
    }

Similarly for the other operations:

    template <typename OP1, typename OP2>
    class A_Mul {
        OP1 const& op1;  // first operand
        OP2 const& op2;  // second operand
    public:
        A_Mul (OP1 const& a, OP2 const& b) : op1(a), op2(b) {}
        size_t size() const { assert(op1.size() == op2.size()); return op1.size(); }
        double operator[] (size_t i) const { return op1[i] * op2[i]; }
    };

    template <typename R1, typename R2>
    Array<A_Mul<R1, R2> >
    operator*(Array<R1> const& a, Array<R2> const& b) {
        return Array<A_Mul<R1, R2> >(A_Mul<R1, R2>(a.rep(), b.rep()));
    }

With this implementation, we have achieved what we promised. Note that a complete implementation would also have to take care of some corner cases, like the proper handling of local variables (by copying them into the tree nodes, not referencing them), and of cases where the variable being assigned to also appears in the arguments (e.g. x = y + x)...

Caveats: for the optimizations to apply, the compiler itself needs to perform two particular optimizations: inlining all the small functions that locally create a tree on the stack, and splitting the tree-node structures into their scalar components. The latter, for example, is not performed by g++ at the moment (version 3.3). As with some other template techniques, expression templates also have the potential to increase compilation times considerably, depending on the program.

Another caveat of expression templates is their bad interaction with generic libraries. Consider:

    template <typename T>
    T square(T const& t) { return t*t; }

    Array<> x, y, z;
    ...
    z = square(x);   // OK
    z = square(x+y); // error

The problem here is that x+y does not have the "base" type Array<>, but the type of a tree node. Inside the square function, the return value is therefore going to be of that particular type, and there is no conversion between two different "internal" tree-node types. The only possible conversion is from an internal tree node to the "base" type Array<>. Therefore the argument has to be explicitly converted before the call to the square function:

    ...
    z = square(Array<>(x+y));  // OK
    z = square<Array<> >(x+y); // OK

The other possibility is to overload the square function as shown below. The advantage is that the call sites do not require any change; the disadvantage is that the overload has to know about the R parameter of Array, which otherwise never shows up in the usage of this class.

    template <typename R>
    Array<> square(Array<R> const& t) { return t*t; }

    // or, taking further advantage of the "optimization":
    template <typename R>
    Array<A_Mul<R, R> > square(Array<R> const& t) { return t*t; }

There are other useful applications of expression templates, for example in the Boost Lambda library, which allows creating functors "on the fly", starting from placeholder variables _1, _2... provided by the library. This way you can write functors using a natural syntax like:

    std::vector<double*> V;
    ...
    std::sort(V.begin(), V.end(), *_1 > *_2);

Another application can be found in the GMP library (GNU Multi Precision), which provides C++ types for computing with multi-precision numbers (integers, rationals...). In this case, expression templates are used to avoid creating temporaries. We can also cite PETE (Portable Expression Template Engine), a tool that makes it easy to create expression templates.

9. Large Scale C++ Software Design

Introduction

This section contains material from [Lakos96]. This book contains many more examples that help in understanding the different techniques. 


We have seen various implementation aspects for single classes. We have also seen designs with several classes, for example, design patterns and template metaprograms. If we develop a large application or library, we have to consider a new level of organization: How to distribute classes and functions over files -- and also bigger units of organization.

We distinguish between the logical design and the physical design. The logical design describes how to write a class or a function and how to relate them to each other. Physical design describes how to organize the code in files.

In large systems, it is crucial to keep the complexity manageable. One measure of complexity is the dependency between different units. Specifically, cyclic dependencies increase the complexity. Complex systems are hard to understand and hard to test. Compilation times and link times can grow unmanageably large for complex systems.

After some definitions, which basically set up a sane way of organizing source code into components, we present techniques for analysing physical dependencies between packages and for breaking up cyclic dependencies between packages.

Internal and External Linkage

A name in C++ has internal linkage if it is local to its translation unit and cannot collide with an identical name defined in another translation unit. Examples are type names and static variables.

A name in C++ has external linkage if, in a multi-file program, that name can interact with other translation units at link time. Examples are non-static function names, member function names, and non-static global variables.

Components and Dependency Relations

A component consists of one header file name.h and one source file name.C (or whatever suffix for C++ source files is appropriate). Components are not restricted to a single class or function; they will usually contain a few closely related classes and functions. A couple of sanity rules apply:

- name.C only implements name.h, and name.h is only implemented by name.C.
- name.C includes name.h first, to assert that name.h includes all header files that are needed when using name.h somewhere else.
- name.h uses include guards to prevent multiple inclusion. For example:

    #ifndef NAME_H
    #define NAME_H 1

    // here goes the body of the header file

    #endif // NAME_H //

Some compilers also accept a #pragma once statement. Nowadays, compilers also detect include guards automatically, such that a second attempt to include the header file will not even result in opening and scanning the file. Thus, redundant include guards (those around the include statements in the including file, as proposed in [Lakos96]) are superfluous.

- All definitions with external linkage in name.C are declared in name.h.
- Whenever a name of external linkage is used, we include name.h (as opposed to just declaring the name again).

We are interested in the physical dependencies between components (the dependencies within a component are not of interest here). The sanity rules make it easy to see the physical dependencies from the header-file inclusion graph.

A component y DependsOn a component x if x is needed in order to compile or link y. More specifically: Component y exhibits a compile-time dependency on x if x.h is needed in order to compile y.C . Component y exhibits a link-time dependency on x if the object file y.o contains undefined symbols for which x.o may be called upon either directly or indirectly to help resolve them at link time. Compile-time dependency almost always implies link-time dependency (see also [Page 127 ff., Lakos96]).

The IsA relation and the HasA relation from the logical design always form compile-time dependencies.

Physical Hierarchy

The DependsOn relation forms a graph over components. The major design rule is: avoid cycles in the dependency graph! Designs with cycles are hard to understand, and they can have much larger compile and link times for testing.

Let us take a closer look at testing. We assume a test-driver program for each component. The benefit of components (and the intended modularization) is hierarchical testing: we test each component in isolation before we test the components that depend on it. Of course, this does not work if we have cycles in the dependency graph; all components participating in a cycle have to be tested at once and together. However, this also shows a way out of cycles between components: reorganize the parts that participate in the cycle into one component (since we do not care what happens within one component). Some of the possible ways


of reorganizing a design are covered in the next section.

Furthermore, the link time for building all test drivers increases. We compare two extreme cases for n components: first, each component depends on all other components; second, no component depends on any other component. If we assume unit cost for linking with each component, we get O(n^2) cost for linking all test drivers in the first case, and O(n) in the second. However, every non-trivial realistic system will have dependencies. A well-designed system will aim for a flat acyclic hierarchy, approximately shaped like a balanced tree; the total linking cost would then be O(n log n).

The cost for linking all test-drivers can be captured in a useful metric.

The Cumulative Component Dependency, CCD, is the sum, over all components Ci in a subsystem, of the number of components needed in order to test each Ci incrementally.

Derived metrics are the average component dependency, ACD = CCD / n, and the normalized cumulative component dependency, NCCD, which is the CCD divided by the CCD of a perfectly balanced binary dependency tree with the same number of components. The CCD of a perfectly balanced binary dependency tree of n components is (n+1) * log2(n+1) - n.

The book [Lakos96] describes tools to analyse the dependencies of components and to compute these metrics automatically, assuming the above rules for packages have been followed. The sources for the tools are available at ftp://ftp.aw.com/cp/lakos/.

Reducing Link-Time Dependencies: Levelization

We introduce several techniques for eliminating cyclic dependencies in the dependency graph. The underlying assumption is that an initial design is likely to be free of cycles, but that cycles are introduced as the design evolves over time.

An example: We are given a bunch of geometric objects, among others a rectangle, in a component of the same name:

    // rectangle.h
    #ifndef RECTANGLE_H
    #define RECTANGLE_H 1

    class Rectangle {
        // ...
    public:
        Rectangle( int x1, int y1, int x2, int y2);
        // ...
    };
    #endif // RECTANGLE_H //

We also work with a graphical user interface and have a component with a class for a window:

    // window.h
    #ifndef WINDOW_H
    #define WINDOW_H 1

    class Window {
        // ...
    public:
        Window( int xCenter, int yCenter, int width, int height);
        // ...
    };
    #endif // WINDOW_H //

We realize that both represent (among others) a two-dimensional box, and we would like to be able to construct a rectangle from a window and vice versa. A first attempt might just add the respective constructors. But as a consequence, we have to include the respective header file in each component and obtain a cyclic dependency.

    // rectangle.h
    #ifndef RECTANGLE_H
    #define RECTANGLE_H 1
    #include "window.h"

    class Rectangle {
        // ...
    public:
        Rectangle( int x1, int y1, int x2, int y2);
        Rectangle( const Window& w);
        // ...
    };
    #endif // RECTANGLE_H //

    // window.h
    #ifndef WINDOW_H
    #define WINDOW_H 1
    #include "rectangle.h"

    class Window {
        // ...
    public:
        Window( int xCenter, int yCenter, int width, int height);
        Window( const Rectangle& r);
        // ...
    };
    #endif // WINDOW_H //

The dependency graph now contains a cycle between the rectangle and the window component.

Actually, if we follow the include statements and the include guards, we will find that this solution does not even compile yet, since each header includes the other before declaring its own class. We need forward declarations to solve this problem.

In fact, since we use the class Window only by reference in the Rectangle constructor, we do not need the full definition of the class Window (which we get by including the header file); a declaration is sufficient. The same is true for the class Rectangle in the window header file.

    // rectangle.h
    #ifndef RECTANGLE_H
    #define RECTANGLE_H 1

    class Window;

    class Rectangle {
        // ...
    public:
        Rectangle( int x1, int y1, int x2, int y2);
        Rectangle( const Window& w);
        // ...
    };
    #endif // RECTANGLE_H //

    // window.h
    #ifndef WINDOW_H
    #define WINDOW_H 1

    class Rectangle;

    class Window {
        // ...
    public:
        Window( int xCenter, int yCenter, int width, int height);
        Window( const Rectangle& r);
        // ...
    };
    #endif // WINDOW_H //

However, in order to implement the constructors in the source files rectangle.C and window.C, the respectively other header file has to be included again. The resulting dependency graph shows that we still have a cyclic dependency between the components; we have only reduced the compile-time dependencies, not the link-time dependencies.

Definition: A subsystem is levelizable if it compiles and the graph implied by the include directives of the individual components (including the .C files) is acyclic.

Thus, our example so far is not levelizable. We will now see some techniques to break cycles and make a design levelizable.

Escalation


Escalation breaks a cycle by lifting the interdependent functionality one level up into a new component. The interdependent functionality is supposed to be small compared to the involved components; thus, the extracted functionality is small enough to be put into a single component.

In our example we introduce a component, boxutil, that contains only the two conversion functions, here as static member functions of a class. The rectangle and the window component remain untouched.

    // boxutil.h
    #ifndef BOXUTIL_H
    #define BOXUTIL_H 1

    class Rectangle;
    class Window;

    struct Boxutil {
        static Window    toWindow( const Rectangle& r);
        static Rectangle toRectangle( const Window& w);
    };
    #endif // BOXUTIL_H //

Demotion

Demotion is similar to escalation, but instead of collecting the interdependent functionality in a component one level up, we collect it in a component one level down. It does not work nicely with our running example.

Factoring

Factoring is the general version of escalation and demotion. The interdependent functionality is isolated and repackaged in components, not necessarily in a single one. The goal is to reduce the complexity of the remaining cycle.

Opaque Pointers

Definition: A function f uses a type T in size if compiling the body of f requires having first seen the definition of T. 

Definition: A function f uses a type T in name only if compiling f and any of the components on which f may depend does not require having first seen the definition of T.

Examples for in name only are reference and pointer types. Both definitions extend naturally to components.

Components that use objects in name only can be tested thoroughly and independently of the named objects. Examples are container classes, nodes, and handles that just pass their data around as pointers.

Redundancy

This is not necessarily a technique to break cycles, but to reduce coupling and dependencies in general. The idea is: whenever only a small fraction of a component is actually used in another component and causes the dependency, it might be worthwhile to reimplement this small fraction in the other component. Consider the example of a cell class that contains, among others, a name. The name is implemented as a string, but is presented at the interface of Cell as an old C-style char* pointer.

    // cell.h
    #ifndef CELL_H
    #define CELL_H 1
    #include <string>

    class Cell {
        std::string d_name;
        // ...
    public:
        Cell( const char* name);
        const char* name() const;
        // ...
    };
    #endif // CELL_H //

Here it might be worthwhile to reimplement the small fraction we need from string, namely storing a dynamically allocated array of characters.

    // cell.h
    #ifndef CELL_H
    #define CELL_H 1

    class Cell {
        char* d_name;
        // ...
    public:
        Cell( const char* name);
        Cell( const Cell& cell);
        ~Cell();
        Cell& operator=( const Cell& cell);
        const char* name() const;
        // ...
    };
    #endif // CELL_H //

However, it is arguable for this example that a dependency on strings is not a big issue, and that the old C-style interface is actually a bit clunky. 

Callback

Callback functions allow breaking a cycle. Typical examples are graphical user interfaces, or the simple qsort function in the standard C library:

    NAME
        qsort - sorts an array
    SYNOPSIS
        #include <stdlib.h>

        void qsort(void *base, size_t nmemb, size_t size,
                   int (*compar)(const void *, const void *));

The function pointer compar  is the callback function. However, callback functions are difficult to understand, debug, and maintain. 

Reducing Compile-Time Dependencies: Insulation

We give a list of the parts of a class that can create compile-time dependencies. Another component using this component has to be recompiled if one of the following parts of this component changes:

- Base class (incl. private inheritance).
- Layering (HasA relationship, but not HoldsA relationship, which is by name only).
- Inline functions and inline member functions.
- Private and protected members.
- Compiler-generated member functions, such as the assignment operator.
- Include directives.
- Default arguments.
- Enumerations.

Insulation techniques eliminate the above dependencies. For example, compiler-generated member functions can be implemented explicitly, even if they do the same as the default implementation would; in case a future revision would like to change these semantics, it could then do so without recompiling dependent components.

Besides the obvious ones, we address two techniques for partial insulation in the next section. Two techniques for full insulation are covered in the section thereafter.

Sometimes, going from partial insulation to full insulation is very easy. But sometimes, the last 5 percent are the hardest and the most costly. Full insulation is usually not appropriate at the bottom layers of a library. Full insulation is appropriate at the higher layers of a library that are exposed to the users.

Major runtime costs for insulation can happen because inline functions are no longer possible, virtual dispatch tables can add another indirection, and dynamic allocation of memory is slow. Memory use can increase with dynamic memory or virtual tables.

Techniques for Partial Insulation

A HasA relationship can be changed to a HoldsA relationship. The cost for this is usually dynamic memory management. 

A private member function can be changed to a static non-local function of the component. If the private member function can be implemented using the public interface of the class, we just need to add the this pointer as an explicit function argument. If the member function needs exclusive access to private member variables, references to the private member variables can be added to the function signature. The price could be a penalty in runtime: the extended function signatures with larger parameter list cost time when calling the function if the function is not inline.

Techniques for Full Insulation

Definition: An abstract class is a protocol class if

1. it neither contains nor inherits from classes that contain member data, non-virtual functions, or private (or protected) members of any kind,
2. it has a non-inline virtual destructor defined with an empty implementation, and
3. all member functions other than the destructor, including inherited functions, are declared pure virtual and left undefined.

A protocol class is a nearly perfect insulator. Protocol classes in C++ are similar to interfaces in Java. Several of the design patterns have protocol classes as part of their design, for example the adaptor pattern.

Another technique for full insulation is an opaque pointer. The insulated class contains only one opaque pointer to its private data and no other member variables.

    // Insulated.h
    #ifndef INSULATED_H
    #define INSULATED_H 1

    class Insulated_private;

    class Insulated {
        Insulated_private* d_data; // opaque pointer
    public:
        // ... constructors and member functions
    };
    #endif // INSULATED_H //

The .C file implements the private data type and all member functions and constructors. The .C file can be changed and recompiled without forcing any other component to recompile.

    // Insulated.C
    #include "Insulated.h"

    class Insulated_private {
        // ...
    };

    // ....

10. Overview of Some Foundation Libraries

Introduction

The following overview highlights only a few aspects of the different C++ libraries that are of interest for this course. All the libraries implement strings, lists, maps, and similar container classes. All emphasize the separation of policy and implementation.

Gnu C++ Library

The libg++ was one of the first available C++ libraries; it first became available in 1985. It contains strings, number types, container classes, IO streams, storage allocators, and C-lib wrappers. See [page 33ff, Lippman96].

An emphasized distinction was between the view of abstract data types with value semantics and object-oriented implementations with state-changing operations. Smaller classes, such as complex numbers and strings, are implemented with value semantics, while the container classes are implemented in the object-oriented way.

An implication of the value semantics for strings (and others) is that modifying operations are most naturally implemented such that they return a newly created object. See for example an operator + to concatenate two strings:

    class string;
    string operator+ ( const string& s1, const string& s2);

The return type causes problems if we derive a class from string and use this operator.

    class special_string : public string { ... };

    void foo() {
        special_string s1, s2;
        special_string s3 = s1 + s2; // type error
    }

We get a type error because we cannot assign a string to a special_string; the class string is not suitable for subclassing. To lessen the temptation to derive from string, the developers of libg++ gave the string class a rich (not to say fat) interface.

The string class was reference counted at the beginning, using copy-on-write for modifying accesses. Reference counting was eliminated later with the argument that it is too clever and that the user knows better where to copy and where to use references. (An interesting yet exceptional opinion; I have not seen it anywhere else yet. However, we had this discussion in CGAL with reference counting for the small kernel objects. There, it pays off once we use number types that are bigger than the built-in number types.)

Tools.h++

Tools.h++ evolved out of in-house developments at Rogue Wave in 1987. It has about 40000 lines of code. A sister library from these in-house developments is Math.h++. Todd Veldhuizen (author of Blitz++) was an intern at Rogue Wave, which motivated his work on efficient vector expressions in Blitz++. See [page 43ff, Lippman96].

Tools.h++ provides container classes following the abstractions of the Smalltalk class hierarchy. However, the classes in Tools.h++ are not derived from a common base class; they form a loose set of classes.

Each container class is provided in three forms: as a class template, as a generic implementation based on the C preprocessor (based on the generic.h facilities), and as a heterogeneous collection (Smalltalk-like).

The template-based collections are available in three versions: intrusive, value-based, and pointer-based collections. Consider the


example of a list. An intrusive list assumes that the items already have the pointers needed to link them into the list. The value-based list copies the items into internal nodes, just as the STL container classes do. The pointer-based list keeps only a pointer to the items. Value- and pointer-based lists can handle arbitrary item types.

Strings are reference counted using copy-on-write for modifying functions.

Booch Components

The Booch components started as an Ada library, 1984-1987. They consisted of 501 packages with about 150000 lines of code, which is about 125000 non-comment source lines (NCSL). The first C++ release in 1990 was already shortened to 17000 NCSL and continued to shrink to 15000 NCSL with release 1.4.7 and with release 2.0. This code-size reduction was even accompanied by an increase in the flexibility and functionality of the library. However, future releases are expected to add more functionality that will finally increase the NCSL count again. See [page 59ff, Lippman96].

The library contains monolithic data structures (stacks, strings, queues, deques, rings, maps, sets, and bags) and polylithic data structures (lists, trees, and graphs). When copied, a monolithic data structure performs a deep copy (copying all of its contained items). Polylithic data structures can share substructures, such as sublists, subtrees, or graph nodes and edges. When copied, a polylithic data structure performs a shallow copy (it just creates pointers to the original parts).

Each container class can have several different implementations. The implementation can be bounded or unbounded, and the implementation can provide different forms of synchronisation for multi-threaded program executions.

The library distinguishes between the abstract data types (the interfaces) and the concrete data types (the implementations). The set of available implementations has been rigorously factored and reduced to three: Simple_List, allocating its nodes on the heap; Simple_Bounded_List, allocating a limited number of nodes from the stack; and Simple_Vector, allocating its array from the stack. This rigorous factoring reduced the library size, coming from Ada, by 20-30%.

Polymorphic iterators increased the flexibility and functionality of the C++ library and reduced its size by about 50%.

Virtual constructors address memory-allocation schemes. Factoring memory allocation and its synchronisation into its own components saved another 50%.

Multi-threaded support is factored into a lock class. Resource allocation and de-allocation are handled in the constructor and the destructor, respectively. Moving to lock classes saved another 50%.

So this report doesn't seem to be a recommendation for Ada.

LEDA, the Library of Efficient Datatypes and Algorithms

LEDA started in 1988 at the Max-Planck Institute in Saarbrücken, Germany. Release 4.0 consists of 210000 lines of code; however, this count includes the comments, which also contain the reference manual of LEDA. [Mehlhorn99]

- Data types
- Number types
- Graphs and supporting data structures (node arrays, node priority queues, ...)
- Graph algorithms
- Graph editor
- Geometric objects
- Geometric algorithms
- 2d- and 3d-visualization

LEDA aims for the (theoretically and practically) fastest solution, and usually provides several implementations of the same data structure, for example for dictionaries or priority queues. LEDA covers the algorithms and data structures of most textbooks.

An important design goal for LEDA is ease-of-use. Programs in LEDA look similar to their pseudo-code counterparts in text books. Therefore, LEDA claims to fulfill the equation "algorithm + LEDA = program", or emphasizing LEDA's performance and often worst-case optimality "algorithm + LEDA = efficient program".

LEDA uses a concept of items that is similar to iterators. A main difference to iterators is that accessing another item from the current item (e.g. going to the successor item in a list) requires the container class. For ease of use, LEDA defines convenient macros for for-loops; here are some examples:

    forall_items( item, container) { ... }
    forall( elems, container) { ... }
    forall_nodes( node, graph) { ... }
    forall_edges( edge, graph) { ... }
    forall_adj_edges( edge, node) { ... }


Ease of use and flexibility can come with a price tag on the performance of the library. For LEDA, the overhead is claimed to be less than a factor of two in memory size and less than a factor of three in runtime, compared to hand-coded solutions.

The designers of LEDA did not choose to implement data structures as templates for the following reasons:

1. Long compilation times for templates.
2. Source code must be exposed to the library client.
3. Templates were not available when LEDA started.

The current design in LEDA (versions 3.0 and 4.0) splits data structures into a base class and a derived class template. The base class implements as much functionality as possible, based on void* pointers to the user data; it can be compiled and distributed in object-code format. The derived class template builds a wrapper around the base class that performs the safe type casts between void* pointers and the user data. The derivation is actually a private derivation and would not be necessary for the wrapper functionality alone.

Instead, this derivation has been chosen as a compact solution to provide the base class with some knowledge of how to operate on the user data, for example how to copy and how to compare it, using dynamic binding at runtime. For this, the base class defines abstract virtual functions accepting void* pointers, and the derived class template performs the required operation after casting the void* pointers back to the user data.

Here is a small example of an ordered pair as a data structure. The comparison function is implemented using a virtual member function. First, the base class, which can be compiled and put into a library; for simplicity of presentation it is implemented inline here. Note that the void* pointer storage requires dynamic allocation. We ignore the proper copy and destruction functionality here. (see also leda_impl.C)

class pair_impl {
    void* d_min;
    void* d_max;
protected: // this class is only meant for derivation
    // to be implemented in derived class:
    virtual bool cmp( void* a, void* b) = 0;
    // note, a constructor doesn't work here instead of set() because
    // the virtual cmp function wouldn't be available (it will be available
    // after the derived class is constructed).
    void set( void* a, void* b) {
        if ( cmp(a,b)) { d_min = a; d_max = b; }
        else           { d_min = b; d_max = a; }
    }
    void* min() { return d_min; }
    void* max() { return d_max; }
};

Second, the (privately) derived class template. It implements the comparison required by the base class.

template <class T>
class pair : private pair_impl {
    virtual bool cmp( void* a, void* b) { // use casts to T
        return * static_cast<T*>(a) < * static_cast<T*>(b);
    }
public:
    pair( const T& a, const T& b) { set( new T(a), new T(b)); }
    T& min() { return * static_cast<T*>( pair_impl::min()); }
    T& max() { return * static_cast<T*>( pair_impl::max()); }
};
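Putting the two fragments together, a minimal self-contained sketch of the technique looks as follows. The classes are reproduced as in the text, with one addition not present there: a virtual destructor in the base class, added here for safety. Copy and destruction functionality is still ignored (the `new T` allocations leak), as in the text.

```cpp
// Base class: type-unsafe storage via void*, comparison deferred to a
// pure virtual function implemented in the derived class template.
class pair_impl {
    void* d_min;
    void* d_max;
protected:
    virtual bool cmp(void* a, void* b) = 0;
    void set(void* a, void* b) {
        if (cmp(a, b)) { d_min = a; d_max = b; }
        else           { d_min = b; d_max = a; }
    }
    void* min() { return d_min; }
    void* max() { return d_max; }
public:
    virtual ~pair_impl() {} // added for safety; omitted in the text
};

// Derived class template: the type-safe wrapper. It implements cmp by
// casting the void* pointers back to T.
template <class T>
class pair : private pair_impl {
    virtual bool cmp(void* a, void* b) {
        return *static_cast<T*>(a) < *static_cast<T*>(b);
    }
public:
    pair(const T& a, const T& b) { set(new T(a), new T(b)); }
    T& min() { return *static_cast<T*>(pair_impl::min()); }
    T& max() { return *static_cast<T*>(pair_impl::max()); }
};
```

With this in place, `pair<int> p(7, 3);` stores both values on the heap, and `p.min()` returns 3 via one virtual call to `cmp` during construction and a cast on access.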

This separation into derived class templates and base classes coincides with the separation into data types, such as a dictionary, and possible implementations, such as balanced trees, skip lists, etc. This separation has also been mentioned as policy versus implementation for the other libraries.

The default dictionary array in LEDA is called d_array<I,E>, where I is the index type and E is the associated element type. Since LEDA does not rely on default template arguments, a different name has been given to the dictionary type where the user can select among different implementations, _d_array<I,E,Impl>.

The dictionary with an implementation parameter is (publicly) derived from the default dictionary, and its member functions are virtual. Thus, a user can write generic algorithms based on the default dictionary and still choose a specific implementation when using them.

Both dictionaries are implemented using private inheritance from the implementation. Using the above example with the ordered pair data structure, we get the following inheritance relationship:

pair_impl <------ pair<T>
                     ^
                     |
                     |
Impl      <------ pair_x<T,Impl>


Since Impl could be equal to pair_impl, we have to make the private inheritance virtual. Assuming we had also made the min and max member functions of our pair class from above virtual, we can continue with the derived class including an implementation parameter. (see also leda_impl2.C for the full example)

template <class T, class Impl>
class pair_x : private virtual Impl, public pair<T> {
    virtual bool cmp( void* a, void* b) { // use casts to T
        return * static_cast<T*>(a) < * static_cast<T*>(b);
    }
public:
    pair_x( const T& a, const T& b) { Impl::set( new T(a), new T(b)); }
    virtual T& min() { return * static_cast<T*>( Impl::min()); }
    virtual T& max() { return * static_cast<T*>( Impl::max()); }
};
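A self-contained sketch of the full arrangement (in the spirit of the leda_impl2.C example, but not that file's actual contents) is given below. Two details beyond the text are assumptions made so the sketch compiles: a protected default constructor in pair<T>, so that pair_x can defer initialization to its own constructor body, and a virtual destructor in pair_impl. Copy/destruction handling is still ignored.

```cpp
// Base implementation with void* storage and a pure virtual comparison.
class pair_impl {
    void* d_min;
    void* d_max;
protected:
    virtual bool cmp(void* a, void* b) = 0;
    void set(void* a, void* b) {
        if (cmp(a, b)) { d_min = a; d_max = b; }
        else           { d_min = b; d_max = a; }
    }
    void* min() { return d_min; }
    void* max() { return d_max; }
public:
    virtual ~pair_impl() {} // assumption: added for safety
};

// Default data type; the private derivation is now *virtual*, and
// min/max are virtual, as the text requires.
template <class T>
class pair : private virtual pair_impl {
    virtual bool cmp(void* a, void* b) {
        return *static_cast<T*>(a) < *static_cast<T*>(b);
    }
protected:
    pair() {} // assumption: lets pair_x initialize via Impl::set instead
public:
    pair(const T& a, const T& b) { set(new T(a), new T(b)); }
    virtual T& min() { return *static_cast<T*>(pair_impl::min()); }
    virtual T& max() { return *static_cast<T*>(pair_impl::max()); }
};

// Data type with implementation parameter. Thanks to the virtual
// derivation, Impl == pair_impl yields a single shared base subobject.
template <class T, class Impl>
class pair_x : private virtual Impl, public pair<T> {
    virtual bool cmp(void* a, void* b) {
        return *static_cast<T*>(a) < *static_cast<T*>(b);
    }
public:
    pair_x(const T& a, const T& b) { Impl::set(new T(a), new T(b)); }
    virtual T& min() { return *static_cast<T*>(Impl::min()); }
    virtual T& max() { return *static_cast<T*>(Impl::max()); }
};
```

A generic algorithm written against pair<T>& then dispatches to the chosen implementation: binding a `pair<int>&` to a `pair_x<int, pair_impl>` object calls pair_x's min/max through the virtual interface.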

Note that this solution actually derives from two implementations, thus carrying around an unused implementation and its space.

There are two major sources of inefficiency in this design: the dynamic memory allocation and the virtual function calls for basic operations, such as comparisons. LEDA partially compensates for these inefficiencies with two optimizations.

The first optimization gets rid of dynamic memory management for small data types. LEDA distinguishes between big data types and small data types. Small data types are small enough to fit into the size of a pointer. LEDA uses a smart pointer, called GenPtr , to actually store small data types in the place of the pointer instead of allocating memory. Only for big data types, LEDA uses dynamic memory, which is then drawn from an efficiently self-managed pool of memory.
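The small/big distinction can be sketched as follows. This is a hypothetical simplification, not LEDA's actual GenPtr: a compile-time size check decides whether a value is copied bit-wise into the pointer slot itself or allocated on the heap. For simplicity, the sketch assumes trivially copyable types and omits deallocation of big values.

```cpp
#include <cstring>

typedef void* GenPtr; // the one-word slot every container entry occupies

// Tag dispatch on "fits into a pointer" (C++98-style, no <type_traits>).
template <bool Small> struct slot;

template <> struct slot<true> {   // small type: lives inside the slot
    template <class T> static GenPtr store(const T& x) {
        GenPtr p = 0;
        std::memcpy(&p, &x, sizeof(T)); // no allocation at all
        return p;
    }
    template <class T> static T load(GenPtr p) {
        T x;
        std::memcpy(&x, &p, sizeof(T));
        return x;
    }
};

template <> struct slot<false> {  // big type: heap allocation (leak ignored)
    template <class T> static GenPtr store(const T& x) { return new T(x); }
    template <class T> static T load(GenPtr p) { return *static_cast<T*>(p); }
};

template <class T>
GenPtr store(const T& x) {
    return slot<(sizeof(T) <= sizeof(GenPtr))>::store(x);
}

template <class T>
T load(GenPtr p) {
    return slot<(sizeof(T) <= sizeof(GenPtr))>::template load<T>(p);
}

// A "big" example type: larger than a pointer, so it goes to the heap.
struct Big { double x; double y; };
```

An int is stored directly in the slot with no allocation, while a Big goes through `new`; in LEDA, that allocation would additionally come from its self-managed memory pool.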

The second optimization selectively optimizes operations for builtin data types to get rid of the overhead of a virtual function call for rather basic tasks. An example is the binary search in ordered dictionaries, which is hand-coded for integers to call operator< directly. However, this path for optimizing LEDA's performance is not readily documented for library users, though it could already be achieved by deriving from a container class and rewriting the member functions in question.
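The effect of this optimization can be illustrated with a hedged sketch (hypothetical code, not LEDA's): a generic routine that compares through a virtual function, next to a hand-tuned variant for int that calls operator< directly and can therefore be inlined by the compiler.

```cpp
// Generic comparison interface in the style of the void*-based design.
struct cmp_base {
    virtual bool less(void* a, void* b) const = 0;
    virtual ~cmp_base() {}
};

template <class T>
struct cmp_t : cmp_base {
    bool less(void* a, void* b) const {
        return *static_cast<T*>(a) < *static_cast<T*>(b);
    }
};

// Generic path: one virtual call per comparison.
int min_index(void** v, int n, const cmp_base& c) {
    int m = 0;
    for (int i = 1; i < n; ++i)
        if (c.less(v[i], v[m])) m = i;
    return m;
}

// Hand-coded path for int: operator< is called (and inlined) directly,
// mirroring LEDA's selectively optimized integer operations.
int min_index_int(const int* v, int n) {
    int m = 0;
    for (int i = 1; i < n; ++i)
        if (v[i] < v[m]) m = i;
    return m;
}
```

Both functions compute the same result; the second avoids the virtual dispatch per element, which is exactly the overhead the hand-coded integer search in LEDA eliminates.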

In conclusion, when benchmarking against LEDA, choose a builtin type that fits into a pointer (e.g., int) to benefit from LEDA's optimizations, or write a trivial class slightly larger than a pointer to see the full overhead of LEDA's design.
