A Critique of C++ (2nd edition) - Ian Joyner

Embed Size (px)

DESCRIPTION

A Critique of C++ (2nd edition)

Citation preview

  • C++??A Critique of C++

    2nd Edition

    Ian Joyner

    c/- Unisys - ACUS115 Wicks Rd, North Ryde

    Australia 2113Tel: +61-2-390 1328 Fax +61-2-390-1391

    [email protected]

    Ian Joyner 1992

  • 1. Introduction......................................................................................................................12. The Role of a Programming Language..............................................................................1

    2.1. Safety and Courtesy Concerns .............................................................................34. C++ Specific Criticisms ..................................................................................................4

    3.1. Virtual Functions ................................................................................................43.2. Pure Virtual Functions ........................................................................................63.3. The Nature of Inheritance....................................................................................73.4. Function Overloading .........................................................................................73.5. Virtual Classes....................................................................................................83.6. Name overloading...............................................................................................83.7. Polymorphism and Inheritance ............................................................................93.8. . and -> .........................................................................................................103.9. Anonymous parameters in Class Definitions .......................................................103.10. Nameless Constructors ......................................................................................113.11. Constructors and Temporaries ...........................................................................113.12. Optional Parameters ..........................................................................................113.13. Bad deletions.....................................................................................................113.14. Local entity declarations....................................................................................123.15. Members ...........................................................................................................123.16. Friends .............................................................................................................123.17. Static .................................................................................................................123.18. Union ...............................................................................................................133.19. Nested Classes..................................................................................................133.20. Global Environments........................................................................................133.21. Header Files .....................................................................................................143.22. Class Interfaces.................................................................................................143.23. Class header declarations ..................................................................................143.24. Garbage Collection ...........................................................................................153.25. Type-safe linkage .............................................................................................153.26. C++ and the software lifecycle..........................................................................163.27. Reusability and Communication .......................................................................173.28. Reusability and Trust........................................................................................173.29. Reusability and Compatibility ..........................................................................173.30. Reusability and Portability................................................................................173.31. Idiomatic Programming ....................................................................................173.32.Concurrent Programming ...................................................................................18

    4. The role of Language.......................................................................................................185. On Writing ......................................................................................................................206. Generic C criticisms ........................................................................................................20

    6.1. Pointers...............................................................................................................216.2. Arrays.................................................................................................................216.3. Function Parameters............................................................................................226.4. void * .................................................................................................................226.5. void fn ().............................................................................................................226.6. fn ().....................................................................................................................236.7. Metadata in Strings .............................................................................................246.8. ++, -- ..................................................................................................................246.9. Defines ...............................................................................................................256.10. NULL vs 0 ........................................................................................................256.11. Case Distinction ................................................................................................256.12. Assignment Operator ........................................................................................266.13. Type Casting ....................................................................................................266.14. Semicolons.......................................................................................................27

    7. Conclusions......................................................................................................................278. Bibliography ....................................................................................................................29

  • 1. Introduction most difficult to understand and technical sectionof the paper, but it is fundamental to theunderstanding of the weaknesses of C++.The C++ programming language is becoming

    widely used. So it is important and timely toquestion its success. Two books are alreadypublished on the subject [Sakkinen 92] and[Yoshida 92]. This critique addresses thefollowing questions. How well does C++implement object-oriented concepts? Can it easilyimplement small, quick projects? Does it scale upwell for large projects? Does it support or hindergood programming practices? As a result, does itease the production of quality software? What isthe relationship between a language, compiler andsoftware developers; and between the language,compiler and the target system? This last questionaddresses issues of correctness, compatibility,portability, and efficiency.

    Having said that, I hope that you find thiscritique useful, and enjoyable. If by any chanceyou do, please feel free to distribute it to yourmanagement, peers and friends.

    2. The Role of a ProgrammingLanguage

    A programming language functions at manydifferent levels and has many roles. It should becritiqued with respect to those levels and roles.Historically, programming languages had a verylimited role, that of writing executable programs.As programs have grown in complexity, this rolealone has proved insufficient. Many design andanalysis techniques have arisen to support othernecessary roles. The organisation of projects alsorequired tools external to the language andcompiler, like make. Object-oriented techniqueshave arisen to help in the analysis and designphases, and object-oriented languages to supportthe implementation phase of OO. Traditional,tried and tested but failed software practices areinfiltrating the object-oriented world. Object-orientation, however, offers a better rationalapproach to software development. Thecomplementary roles of analysis, design,implementation and project organisation shouldbe better integrated in the object-oriented scheme.This results in economical software production.

    A paper on the recommended practices for usein C++ [Ellemtel 92] suggests C++ is a difficultlanguage in which there may be a very fine linebetween a feature and a bug. This places a largeresponsibility upon the programmer. Is this aresponsibility or a costly burden? The fine lineis a result of poor language definition. The C++standardisation committee warns C++ is alreadytoo large and complicated for our taste [X3J1692].

    While it is true that C++ is immediatelyusable by many C programmers, and many seethis as a strength, the C base is C++s greatestweakness. This is the engineering compromisethat C++ devotees talk about. Adoption of C++does not suddenly transform C programmers intoobject-oriented programmers. A complete changeof thinking is required, and C++ actually makesthis difficult. A critique of C++ cannot beseparated from criticism of the C base language,as it is essential for the C++ programmer to befluent in C. Many of Cs problems affect the waythat object-orientation is implemented and used inC++. This critique is not exhaustive of theweaknesses of C++, but it illustrates the practicalconsequences of these weaknesses with respect tothe timely and economic production of qualitysoftware.

    C++ is an interesting experiment in adaptingthe advantages of object-orientation to atraditional programming language. BjarneStroustrup is to be applauded for having theinsight to put the two technologies together. C++,however, retains the problems of the old order ofsoftware production. C++ has an advantage overC as it supports many facets of object-orientation.These can be used for limited analysis and design.The processes of analysis, design, andorganisation, however, are still largely external toC++. Thus C++ has not realised the importantadvantages of object-orientation that will indeedlead to the economic production of software.

    This critique criticises C++ in its own right,without comparison to other languages. Section 2considers the role of a programming language.Section 3 examines some specific aspects of C++.Section 4 examines the general role of language.Section 5 is a short comment on writing. Section6 looks specifically at C. The conclusionexamines where C++ has left us, and considers thefuture. The approach taken is to criticise specificaspects of C++ and C. Each section tries to be selfcontained. It is expected that not everyone willagree with all of the sections. It is probably best toapproach the paper, not by reading it entirely, butto read those sections that interest you. Onesection, however, is fundamental to the criticismof C++, that on virtual functions. This is also the

    A language should not only be critiqued froma technical point of view, considering its syntacticand semantic features. It should also be critiquedfrom the viewpoint of its contribution to the entiresoftware development process. It should enablecommunication between project members actingat different levels, from management, who have arequirement for the product, to testers, who musttest the result. It should also enablecommunication between project membersseparated in space and time. Often oneprogrammer is not responsible for a task over itsentire lifetime.

    C++?? 2nd Edition page 1

  • The primary purpose of any language iscommunication. A programming language shouldsupport the exchange of ideas, intentions, anddecisions between project members. Aprogramming language should provide a formal,yet readable, notation to support consistentdescriptions of systems that satisfy therequirements of diverse problems. A languageshould also provide methods for automatedproject tracking. This ensures that modules(classes and functionality) that satisfy projectrequirements are completed in a timely andeconomic fashion. A programming language aidsreasoning about the design, implementation,extension, correction, and optimisation of asystem.

    techniques of schema checking are often criticisedas being restrictive and therefore unusable for realworld software. This is nonsense andmisunderstands of the power of these languages. Itis an immature conception; the best programmersrealise that programming is difficult. As a whole,the computing profession is still learning toprogram.

    Another example of consistency checkingcomes from the user interface world. Instead ofcorrecting a user after an erroneous action, a gooduser interface will not offer the action as apossibility in the first place. It is cheaper to avoiderror than to fix it. Most people drive their carswith this principle in mind. Smash repair is timeconsuming and expensive.

    A language definition should enable thedevelopment of integrated automated tools tosupport software development. For example,browsers, editors and debuggers. The compiler isanother such tool. The role of a compiler istwofold. Firstly, to generate code for the targetmachine. The role of the machine is to execute theproduced programs. A compiler has to check thata program conforms to the language syntax andgrammar, so it can understand the program inorder to translate it into an executable form.Secondly, and more importantly, the compilershould check that the programmers expression ofthe system is complete, valid and consistent. Acompiler should perform semantics checking. Thisis checking that a program is internally consistent.Generating a system that has detectableinconsistencies is pointless.

    Program development is a dynamic process. Aprogram description is constantly modified duringdevelopment. Modifications often lead toinconsistencies and error. Languages andcompilers that provide consistency checks helpprevent such bugs, which can creep into apreviously working system. These checks helpverify that as a program is modified, previousdecisions and work are not invalidated.

    It is interesting to consider how muchchecking could be integrated in an editor. Thefocus of many current generation editors is text.What happens if we change this focus from text toprogram components? Such editors might checknot only syntax, but semantics. Alertingprogrammers of potential errors earlier andinteractively will shorten development times.Future languages should be defined very cleanlyin order to enable such editor technology.Semantics checking is done by ensuring that a

    specification conforms to some schema. Forexample, the sentence The boy drank thecomputer and switched on the glass of water isgrammatically correct. But the sentence isnonsense. It does not conform to the mentalschema we have of computers and glasses ofwater. A programming language should includetechniques for the detection of similar nonsense.The language definition provides the frameworkthat makes this role of the compiler possible.

    A programming language should provide aformal notation. During requirements analysis anddesign phases, formal and semi-formal notationsare required. Notations used in analysis, design,and implementation phases should becomplementary, rather than contradictory.Currently, analysis, design and modellingnotations are too far removed from programming,while programming languages are in general toolow level. Both designers and programmers mustcompromise to fill the gap. Current notationsprovide difficult transition paths between stages.This semantic gap contributes to errors andomissions between the requirements, design andimplementation phases. Future programminglanguages will be an implementation extension ofthe high level notations used for requirementsanalysis and design. This will lead to improvedconsistency between analysis, design andimplementation. Object-oriented techniquesemphasise the importance of this, as abstractdefinition and concrete implementation can beseparate, yet provided by the same syntax.

    Checking is often enabled by the specificationof redundant information. Declarations are anexample of redundancy that help check formisspellings. Declarations define the vocabularyof a program, ie the elements in its universe. Thecompiler uses redundant information forconsistency checking, and strips it away toproduce efficient executable systems. Type safetyis another technique. Declarations also associatean entity with a type, to define the entities role.Typing ensures that you cant drink computers orswitch on glasses of water. C++ is animprovement over C in type safety.

    It is a misconception that consistency checksare training wheels for student programmers,and that syntax errors are a hindrance toprofessional programmers. Languages that exploit

    Programming languages also providenotations to formally document a system.Program source is the only reliable documentationof a system, so a language should explicitly

    C++?? 2nd Edition page 2

  • support documentation. As with all language, theeffectiveness of communication is dependent uponthe skill of the writer. Good program writersrequire languages that support the role ofdocumentation. They require that the syntax of alanguage is perspicuous, and easy to learn. Thosenot trained in the skill of writing programs, canread them to gain understanding of the system.After all, it is not necessary for newspaper readersto be journalists.

    These quotes from Reade are a good summaryof the principles from which I criticise C++. WhatReade calls administrative tasks, I callbookkeeping. C and C++ are often criticised forbeing cryptic. The reason is that C concentrates onpoints 2 and 3, while the description of what is tobe computed is obscured. High level languagesdescribe what is to be computed. This is theproblem domain. How a computation isachieved is in the low-level machine-orienteddomain. The conflict between these aspects recursfrequently throughout this critique. Automatingthe bookkeeping tasks enhances correctness,compatibility, portability and efficiency.Bookkeeping tasks arise from having to specifyhow a computation is done. Specifying howthings are done in some environments hindersportability to other platforms.

    Chris Reade [Reade 89] gives the followingexplanation of programming and languages. One,rather narrow, view is that a program is asequence of instructions for a machine. We hopeto show that there is much to be gained fromtaking the much broader view that programs aredescriptions of values, properties, methods, prob-lems and solutions. The role of the machine is tospeed up the manipulation of these descriptions toprovide solutions to particular problems. Aprogramming language is a convention forwriting descriptions which can be evaluated.

    The industry should be moving towards theseideals. They will help in the economic productionof software, rather than the costly techniques oftoday. We should consider what we need, andassess the problems of what we have against that.Object-orientation provides one solution to theseproblems. Its effectiveness, however, depends onthe quality of its implementation.

    [Reade 89] also describes programming asbeing a Separation of concerns. He says:

    The programmer is having to do severalthings at the same time, namely,

    It is relevant to ask if grafting OO conceptsonto a conventional language realises the fullbenefits of OO? Perhaps a biblical quote can beconsidered: No one sews a patch of unshrunkcloth on to an old garment; if he does, the patchtears away from it, the new from the old, andleaves a bigger hole. No one puts new wine intoold wineskins; if he does, the wine will burst theskins, and then wine and skins are both lost. Newwine goes into fresh skins. Mark 2:22

    (1) describe what is to be computed;(2) organise the computation sequencing into

    small steps;(3) organise memory management during the

    computation.Reade continues, Ideally, the programmer

    should be able to concentrate on the first of thethree tasks (describing what is to be computed)without being distracted by the other two, moreadministrative, tasks. Clearly, administration isimportant but by separating it from the main taskwe are likely to get more reliable results and wecan ease the programming problem by automatingmuch of the administration.

    We must abandon disorganised and error-prone practices, not adapt them to new contexts.How well can hybrid languages support thesophisticated requirements of modern softwareproduction? Surely a basic premise of object-oriented programming is to enable thedevelopment of sophisticated systems through theadoption of the simplest techniques possible?Software development technologies andmethodologies should not impede the productionof such sophisticated systems.

    The separation of concerns has otheradvantages as well. For example, program provingbecomes much more feasible when details ofsequencing and memory management are absentfrom the program. Furthermore, descriptions ofwhat is to be computed should be free of suchdetailed step-by-step descriptions of how to do itif they are to be evaluated with different machinearchitectures. Sequences of small changes to adata object held in a store may be an inappropriatedescription of how to compute something when ahighly parallel machine is being used withthousands of processors distributed throughout themachine and local rather than global storagefacilities.

    2.1. Safety and Courtesy ConcernsThis critique makes two general types of

    criticism, about safety concerns and courtesyconcerns. These themes recur throughout thiscritique, as C and C++ have flaws thatcompromise them frequently. Safety concernsaffect the external perception of the quality of theprogram. Failure to meet safety concerns results inunfulfilled requirements and program crashes.Automating the administrative aspects means

    that the language implementor has to deal withthem, but he/she has far more opportunity to makeuse of very different computation mechanismswith different machine architectures.

    Courtesy concerns affect the internal view ofthe quality of a program in the development andmaintenance process. Courtesy concerns areusually stylistic and syntactic, whereas safetyconcerns are semantic. The two often go together.

    C++?? 2nd Edition page 3

  • It is courtesy for an airline to keep its fleet wellmaintained. This courtesy concern is also verymuch a safety concern.

    descendant classes are part of the same namespace as classes they inherit from. Theredeclaration of a name within the same scopeshould cause a name clash. Allowing two entitiesto have the same name within one scope causesambiguity and other problems. (See the section onname overloading.)

    Courtesy issues are even more important inthe context of reusable software. Reusabilitydepends on the clear communication of thepurpose of a module. Courtesy is important toestablish social interactions, such as com-munication. Courtesy implies inconvenience tothe provider, but provides convenience to others.Courtesy issues include choosing meaningfulidentifiers, consistent layout and typography,meaningful and non-redundant commentary, etc.Courtesy issues are more than just a styleconsideration. A language design should directlysupport courtesy issues. A language, however,cannot enforce courtesy issues, and it is oftenpointed out that poor, discourteous programs canbe written in any language. But this is no reasonfor being careless about the languages that wedevelop and choose for software development.

    The following example illustrates the secondproblem:

    class A{

    public:void nonvirt ();virtual void virt ();

    }

    class B : public A{

    public:void nonvirt ();void virt ();

    }3. C++ Specific CriticismsA a;B b;3.1. Virtual Functions A *ap = &b;

    Polymorphism is a key concept of OOP.Virtual functions are one way to implementpolymorphism. A language designers choice iswhether this should be specified in the parent orthe inheriting class. Is it the decision of thedesigner of the parent or descendant class? Casescan be made for both. They are not mutuallyexclusive and can be catered for quite easily in anobject-oriented language.

    B *bp = &b;

    bp->nonvirt (); // calls B::nonvirt // as you would // expectap->nonvirt (); // calls A::nonvirt, // even though this // object is of type // B.

    There are three options, corresponding tomust not, can, and must be redefined: ap->virt (); // calls B::virt, the

    // correct version of1) The redefinition of a routine is prohibited;descendant classes must use the routine as is. // the routine for B

    // objects.2) A routine could be redefined. Descendantclasses can use the routine as provided, or providetheir own implementation as long as it conformsto the original interface definition andaccomplishes at least as much.

    In this example, class B has extended orreplaced routines in class A. B::nonvirt is theroutine that should be called for objects of type B.It could be pointed out that C++ gives the clientprogrammer flexibility to call either A::nonvirt orB::nonvirt. But this can be provided in a simplermore direct way. A::nonvirt and B::nonvirt shouldbe given different names. That way theprogrammer calls the correct routine explicitly,not by an obscure and error prone trick of thelanguage, as follows:

    3) A routine is abstract. No implementationis provided and each non-abstract descendent classmust provide its own implementation. This ispolymorphism.

    The base class designer must decide options 1and 3. Descendant class designers must decideoption 2. A language should provide direct syntaxfor these options. class B : public A

    {Option 1 public:

    C++ does not cater for the first option. Notusing a virtual function is the closest. But in thatcase the routine can be completely replaced. Thiscauses two problems. Firstly, a routine can beunintentionally replaced in a descendent. Thecompiler should report a syntax error due toduplicate declaration. This is logical as

    void b_nonvirt ();void virt ();

    }

    B b;B *bp = &b;

    C++?? 2nd Edition page 4

  • bp->nonvirt (); // calls A::nonvirt whether the function f() is defined virtual or non-virtual in order to interpret exactly what a->f ()means. Therefore, the statement a->f () is notimplementation independent. A change in thedeclaration of f () will change the semantics of theinvocation. Implementation independence meansthat a change in the implementation DOES NOTchange the semantics, of executable statements.

    bp->b_nonvirt (); // calls // B::b_nonvirt

    Now the designer of class B has direct controlover Bs interface. The application requires thatclients of B can call both A::nonvirt, andB::b_nonvirt. Bs designer has explicitly providedfor this. This is good object-oriented design,which provides strongly defined interfaces. C++allows client programmers to play tricks with theclass interfaces, external to the class, and Bsdesigner cannot prevent A::nonvirt from beingcalled. This is opposite to good modular design.This shows the unsafeness C++s virtualmechanism. Objects of class B have their ownspecialised nonvirt. But Bs designer does nothave control over Bs interface to ensure that thecorrect version of nonvirt is called.

    If a change in the declaration changes thesemantics, this should generate a compilerdetected error. The programmer should make thestatement semantically consistent with thechanged declaration. This reflects the dynamicnature of software development, where theprogram text is subject to perpetual change.

    For yet another case of the inconsistentsemantics of the statement a->f () vs constructors,consult section 10.9c, p 232 of the C++ ARM.[Sakkinen 92] points out that a descendant classcan redefine a private virtual function even thoughit cannot access that function in other ways. Whenthe ancestor class calls the function it insteadinvokes the function in the descendant class.

    C++ also does not protect class B from otherchanges in the system. Suppose we need to write aclass C that needs nonvirt to be virtual. Thennonvirt in A will be changed to virtual. But thisbreaks the B::nonvirt trick. The requirement ofclass C to have a virtual routine forces a change inthe base class. This has an effect on all otherdescendants of the base class, instead of thespecific new requirement being localised to thenew class. This is opposite to the reason for OOPhaving loosely coupled classes, so that newrequirements, and modifications will havelocalised effects, and not require changeselsewhere which can potentially break otherexisting parts of the system.

    Option 2The second option should be left open for the

    programmers of descendant classes. In C++,however, the decision must be made in the baseclass. In object-oriented design, the decisions youdecide not to make are as important as thedecisions you make. Decisions should be made aslate as possible. This strategy prevents mistakesbeing built into the system at early stages. Bymaking early decisions, you are often stuck withassumptions that later prove to be incorrect. C++requires the parent class to specify potentialpolymorphism by virtual (although anintermediate class in the inheritance chain canintroduce virtual). This prejudges that a routinemight be redefined in descendants. This can be aproblem because routines that arent actuallypolymorphic are accessed via the slightly lessefficient virtual table technique instead of astraight procedure call. (This is never a large over-head but object-oriented programs tend to usemore and smaller routines making routineinvocation a more significant overhead.) Thepolicy in C++ should be that routines that mightbe redefined should be declared virtual.

    Rumbaugh et al, put their criticism of C++svirtual as follows: C++ contains facilities forinheritance and run-time method resolution, but aC++ data structure is not automatically object-oriented. Method resolution and the ability tooverride an operation in a subclass are onlyavailable if the operation is declared virtual in thesuperclass. Thus, the need to override a methodmust be anticipated and written into the originclass definition. Unfortunately, the writer of aclass may not expect the need to definespecialized subclasses or may not know whatoperations will have to be redefined by a subclass.This means that the superclass often must bemodified when a subclass is defined and places aserious restriction on the ability to reuse libraryclasses by creating subclasses, especially if thesource code library is not available. (Of course,you could declare all operations as virtual, at aslight cost in memory and function-callingoverhead.) [RBPEL91]

    Virtual, however, is the wrong mechanism forthe programmer to deal with. A compilationsystem can detect polymorphism, and generate theunderlying virtual code, where and only wherenecessary. Having to specify virtual burdens theprogrammer with another bookkeeping task. Thisis the main reason why C++ is a weak object-oriented language as the programmer mustconstantly be concerned with low level details.The compiler should take care of such detail andso relieve the programmer.

    A further argument is that any statementshould consistently have the same semantics. Theobject-oriented interpretation of a statement likea->f () is that the most suitable implementation off() is invoked for the object referred to by a,whether the object is of type A, or a descendent ofA. In C++, however, the programmer must know

    Another problem in C++ is mistakenredefinition. The base class routine can be

    C++?? 2nd Edition page 5

  • redefined unwittingly. The compiler should reportan erroneous name redefinition within the samename space unless the descendant classprogrammer specifies that the routine redefinitionis really intended. The same name can be used,but the programmer must be conscious of this,and state this explicitly, especially inenvironments where systems are assembled out ofpreexisting components. Unless the programmerexplicitly overrides the original name a syntaxerror should report that the name is a duplicatedeclaration. C++, however, adopted the originalapproach of Simula. This approach has beenimproved upon, and other languages have adoptedbetter, more explicit approaches, that avoid theerror of mistaken redefinition.

    3.2. Pure Virtual FunctionsAs mentioned above, pure virtual functions

    provide a means of leaving a function undefinedand abstract. A class that has such an abstractfunction cannot be directly instantiated. A non-abstract descendant class must define the function.The C++ pure virtual syntax is:

    virtual void fn () = 0;

    This leaves the reader to guess its meaning,even those well versed in object-orientedconcepts. A better choice would have been akeyword such as abstract. Direct expression ofconcepts enhances communication, and the easewith which a language can be learnt. Whenlearning a language it is often important to use theindex of a text book. A keyword like abstractwould be easily found in an index. But what doyou look for in the case of = 0? You might noteven realise it is significant. It should havesyntactic significance as abstract functions are avery important concept in object-oriented design.The C++ decision is in keeping with the Cphilosophy of avoiding keywords. This is often atthe expense of clarity. A keyword wouldimplement this concept more clearly. Forexample:

    Eiffel and Object Pascal cater for this situationas the descendant class programmer is required tospecify that redefinition is intended. This has theextra benefit that a later reader or maintainer ofthe class can easily identify the routines that havebeen redefined, and that this definition is relatedto a definition in an ancestor class without havingto refer to ancestor class definitions. Thus option2 is exactly where it should be, in descendantclasses.

    Option 3 pure virtual void fn ();The pure virtual function caters for the thirdoption. The routine is undefined, the class isabstract and cannot be directly instantiated. Adescendant class must define the routine if it is tobe instantiated. Any descendants that do notdefine the routine are also abstract classes. Thisconcept is correct, but see the section on purevirtual functions for a criticism of the syntax.

    or

    abstract void fn ();

    The mathematical notation used in C++suggests that values other than zero could be used.What if the function is equated to 13? -

    virtual void fn () = 13;Virtual is a difficult notion to grasp. The

    related concepts of polymorphism and dynamicbinding, redefinition, and overloading are easier tograsp, being oriented towards the problemdomain. Virtual routines are an implementationmechanism for polymorphism. Polymorphism isthe what, and virtual is the how. Smalltalk andObjective-C use a different mechanism toimplement polymorphism. Virtual is an exampleof where C++ obscures the concepts of OOP. Theprogrammer has to come to terms with low levelconcepts, rather than the higher level object-oriented concepts. Interesting as underlyingmechanisms might be for the theoretician orcompiler implementer, the practitioner should notbe required to understand or use them to makesense of the higher level concepts. Having to usethem in practice is tedious and error-prone, andcan prevent the adaptation of software to furtheradvances in the underlying technology andexecution mechanisms (see concurrency).

    A function is either pure, or it is not. This toany analyst suggests a boolean state, which asingle keyword conveys. A simple suggestion tofix this is to define = 0 as abstract:

    #define abstract = 0

    thenvirtual void fn () abstract;

    Pure virtual is also an abuse of naturallanguage. It is a combination of words that aresomewhat opposite in meaning. Pure meanssomething that really is what it appears to be. Forexample pure gold. Virtual means something thatappears to be what it actually is not. For examplevirtual memory. Perhaps virtual gold could befools gold. As has been said before, virtual is adifficult concept to grasp. When it is combinedwith a word such as pure, the meaning becomeseven more obscure. Modern language designersshould be very careful in the vocabulary theychoose.

    C++?? 2nd Edition page 6

  • 3.3. The Nature of Inheritance before. Assembling software components isbuilding a system that has never existed before.Inheritance is a close relationship. It provides

    a fundamental way to assemble softwarecomponents. Objects that are instances of a classare also instances of all ancestors of that class. Foreffective object-oriented design the consistency ofthis relationship should be preserved. Eachredefinition in a subclass should be checked forconsistency with the original definition in anancestor class. A subclass should preserve therequirements of an ancestor class. Requirementsthat cannot be preserved indicate a design errorand perhaps inheritance is not appropriate.Consistency due to inheritance is fundamental toobject-oriented design. C++s implementation ofnon-virtual overloading, and overloading bysignature (see below) means that the compilercannot check for this consistency. C++ does notrealise this aspect of object-oriented design. Thiscontributes to a wide and costly gap betweenanalysis and design, and implementation.

    Inheritance in C++ is like a jig-saw where thepieces fit together, but the compiler has no way ofchecking that the resultant picture makes sense. Inother words C++ has provided the syntax forclasses and inheritance but not the semantics.Certainly, not very many reusable C++ librariesare available, which suggests that C++ might notsupport reusability as well as possible. C++ failsto provide this fundamental goal of object-oriented design and programming.

    3.4. Function OverloadingC++ allows functions to be overloaded if the

    arguments in the signature are of different types.Such overloading can be useful as these examplesshow:

    max (int, int);max (real, real);

    Inheritance has been classified as syntacticinheritance and semantic inheritance. Saake et aldescribe these as follows : Syntactic inheritancedenotes inheritance of structure or methoddefinitions and is therefore related to the reuse ofcode (and to overriding of code for inheritedmethods). Semantic inheritance denotes in-heritance of object semantics, ie of objectsthemselves. This kind of inheritance is knownfrom semantic data models, where it is used tomodel one object that appears in several roles inan application. [SJE91]. Saake et al concentrateon the semantic form of inheritance. Behaviouralor semantic inheritance expresses the role of anobject within a system.

    This will ensure that the best max routine forthe types int and real will be invoked. Object-oriented programming, however, provides avariant on this. Since the object is passed to theroutine as a hidden parameter (this in C++), anequivalent but more restricted form is alreadyimplicitly included in object-oriented concepts. Asimple example such as the above would beexpressed as:

    int i, j;real r, s;

    i.max (j);r.max (s);

    Wegner, however, believes code inheritance tobe of more practical value. He classifies thedifference between syntactic and semanticinheritance as code and behaviour hierarchies[Weg90] (p43). He suggests these are rarelycompatible with each other and are oftennegatively correlated. Wegner also poses thequestion of How should modification ofinherited attributes be constrained? Codeinheritance provides a basis for modularisation.Behavioural inheritance provides modelling bythe is-a relationship. Both are useful in theirplace. Both require consistency checks thatcombinations due to inheritance actually makesense.

    but i.max (r) and r.max (j) result incompilation errors because the types of thearguments do not agree. (By operator overloadingof course, these can be better expressed, i max jand r max s, but min and max are peculiarfunctions that might want to accept two or moreparameters of the same type.)

    The above shows that in most cases, theobject-oriented paradigm can consistently expressfunction overloading, without the need for thefunction overloading of C++. C++, however, doesmake the notion more general. The advantage isthat more than one parameter can overload afunction, not just the implicit current object pa-rameter.

    It seems that inheritance is most powerful inthe most restrictive form of a semanticspreserving relationship. A subclass should notbreak the assumptions of an ancestor class.

    The disadvantage is that C++ introduces someinconsistencies that the compiler cannot detect. Ifthe programmer intends to redefine a virtualroutine, but makes a mistake in the declaration ofthe function signature, the compiler willerroneously assume an overloaded function. Anycalls to the function using one or other of thesignatures will also fail to detect theinconsistency.

    Software components are like jig-saw pieces.When assembling a jig-saw the shape of thepieces must fit, but more importantly, theresulting picture must make sense. Assemblingsoftware components is more difficult. A jig-sawis reassembling a picture that was complete When calling the routine, if the programmer

    makes a mistake in supplying the actual

    C++?? 2nd Edition page 7

  • parameters, a C++ compiler cannot be specificabout the error. It can only report that no functionwith a matching signature could be found.Programmers make this sort of mistake for subtlereasons, and it can be time consuming to pinpointthe parameter at fault. Secondly, the incorrectparameter might accidentally match, one of theother routines. In that case this error will bepropagated into the production code, and couldremain undetected a long time.

    and attempting to do so considerably complicatesdesign.

    3.6. Name overloadingNaming is fundamentally important in

    producing self-documenting software. Naminghelps realise maintainable and reusable softwarecomponents. Names are fundamental in freeingprogrammers from low level manipulation ofaddresses. Naming is the basis for differentiatingbetween different entities in a software module.Name overloading allows the same name to referto two or more different entities. The problem iswhether the resultant ambiguity is useful, and howto resolve it, as ambiguity weakens the power ofnames to distinguish entities.

    If it is felt that C++s scheme of havingparameters of different types is useful, it shouldbe realised that object-oriented programmingprovides this in a more restricted and disciplinedform. This is done by specifying that theparameter needs to conform to a base class. Anyparameter passed to the routine can only be a typeof the base class, or a subclass of the base class.For example:

    Name overloading is useful for two purposes.Firstly it allows programmers to work on two ormore modules without concern about nameclashes. The ambiguity can be tolerated as withinthe context of each module, the nameunambiguously refers to a unique entity.Secondly, name overloading providespolymorphism, where the same name applied todifferent types refers to different implementationsfor those types. Polymorphism allows one word todescribe what is to be computed. Differentclasses might require different specifications ofhow, a computation is done. For example drawis an operation that is applicable to all differentshapes, even though circles and squares, etc aredrawn differently.

    A.f (B someB) {...};

    class B ...;class D : public B ...

    A a;D d;

    a.f (d);

    The entity d must conform to the class B,and the compiler checks this.

    The alternative to function overloading bysignature, is to require functions with differentsignatures to have different names. Names shouldbe the basis of distinction of entities. This isknown to work and solves the above problems.The compiler can cross check that the parameterssupplied are correct for the given routine name.This also results in better self-documentedsoftware. It is often difficult to choose appropriatenames for entities, but it is well worth the effort.

    These two uses of name overloading provide apowerful concept. But use of the same name inthe same context must be resolved. Errors canresult from ambiguity. In this case theprogrammer needs to differentiate betweenentities in ways other than name alone. Acommon way to do this is to introduce extradistinguishing names. For example in a group ofpeople, where two or more share the same firstname, they can be distinguished by their surname.Similarly a unique first name will distinguish themembers of a family with a common surname.

    3.5. Virtual ClassesIf class D multiply inherits class A via classes

    B and C, then if D wants to inherit only a singlecopy of A, the inheritance of A must be specifiedas virtual in both B and C. This raises twoquestions. Firstly, what happens if A is declaredvirtual in only one of B or C? Secondly, what ifanother class E wants to inherit multiple copies ofA via B and C? In C++, the virtual class decisionmust be made early, reducing the flexibility thatmight be required in the assembly of derivedclasses. In a shared software environmentdifferent vendors might supply classes B and C. Itshould be left to the implementer of class D or E,exactly how to resolve this problem. And this isthe simplest case. What if A is inherited via morethan two paths, with more than two levels ofinheritance? Such flexibility is key to reusablesoftware. You cannot envisage when designing abase class all the possible uses in derived classes,

    This is analogous to classes, where each classin a system is given a unique name. Each memberwithin a class is also given a unique name. Wheretwo objects with members of the same name areused within the same context, the object name canqualify the members. For example a.mem andb.mem.

    [Reade 89] points out the difference betweenoverloading and polymorphism. Overloadingmeans the use of the same name in the samecontext for different entities with completelydifferent definitions and types. Polymorphismthough has one definition, and all types aresubtypes of a principle type. C. Strachey referredto polymorphism as parametric polymorphism andoverloading as ad hoc polymorphism.

    Block structured languages provideoverloading by scoping. Scoping allows the same

    C++?? 2nd Edition page 8

  • name to be used in different contexts withoutclash or confusion. Nested blocks provide a subtleproblem. Names in an outer block are in scope ininner blocks. Many languages, however, allow aname to be overloaded in an inner block. Thisdoes more than overload the name, it hides it. Theuse of a name in the inner block does not indicateany relationship with the same name in the outerblock. Textually nested blocks inherit namedentities from outer blocks. Inheritanceaccomplishes this in object-oriented languages.Inheritance eliminates the need to textually nestentities, and also accomplishes loose coupling.Nesting makes entities tightly coupled.

    a combination of components, which quicklyleads to an exponentiation in the number of testsrequired.

    C++ has an analogous form of hiding. A non-virtual function in a derived class hides a functionin an ancestor class. This hiding is explained insection 13.1 of the C++ ARM. This is adiscrepancy with declaring multiple functionswith the same name in the same class withdifferent signatures. A function in the derivedclass will hide the functions of the ancestor class,rather than add its signature to the list of possiblefunctions which can be called. This is confusingand error prone. Learning all these ins and outs ofthe language is extremely burdensome to theprogrammer. Often they will only be learnt afterfalling into a trap.

    Contrary to most high level languages, a nameshould not be overloaded while it is in scope. Thisinconveniently hides the outer declaration, and theprogrammer cannot access the outer entity. It isalso error prone. The following exampleillustrates this:

    3.7. Polymorphism and InheritanceInheritance provides a form of name

    overloading similar to overloading in subblocks.The scope of a name is the class in which itoccurs. If a name occurs twice in a class, it is asyntax error. Inheritance introduces somequestions over and above this simpleconsideration of scope. Should a name declared ina base class be in scope in a derived class? Thereare three choices:

    { int i; {

    int i; // hide the outer i. i = 13; // assign to the inner i.

    // Cant get to the outer i here. // It is in scope, but hidden. } 1) Names are in scope only in the immediate

    class but not in subclasses. Subclasses can freelyreuse names because there is no potential for aclash. This precludes software reusability.Subclasses will not inherit definitions ofimplementation. Therefore case 1 is not worthconsidering.

    }

    Now delete the inner declaration:{

    int i;2) The name is in scope in a subclass, but the

    name can be overloaded without restriction. Thisis closest to the overloading of names in nestedblocks. This is C++s approach. Two problemsarise. Firstly, the name can be unintentionallyreused. Secondly, because the new entity is notassumed to have any relationship to the original,its signature cannot be type checked with theoriginal entity. Since consistency checks betweenthe superclass and subclass are not possible, thetight relationship that inheritance implies, whichis fundamental to object-oriented design, is notguaranteed. This can lead to inconsistenciesbetween the abstract definition of a base class, andthe implementation of a derived class. If thederived class does not conform to the base class inthis way, it should be questioned why the derivedclass is inheriting from the base class in the firstplace. (See the nature of inheritance.)

    { i = 13; // Syntactically valid,

    // but not the // intention.

    }}

    The inner overloaded declaration is removed,and references to that name do not result in syntaxerrors due to the same name being in the outerenvironment. The inner instruction nowmistakenly changes the value of the outer entity.A compiler cannot detect this situation unless thelanguage definition forbids nested redeclarations.E.W. Dijkstra uses similar reasoning in An essayon the Notion: The Scope of Variables in ADiscipline of Programming, [Dijkstra 76].

    The above example demonstrates how nestingresults in unmaintainable programs. This isbecause the inner block is tightly coupled to theouter block, and each is sensitive to changes in theother. The advantage of keeping componentsdecoupled and separate is that a programmer canconfidently make modifications to one componentwithout affecting other components. Testing canbe limited to the changed component, rather than

    3) The name is in scope in the subclass, butcan only be overloaded in a disciplined way toprovide a specialisation of the original. Other usesof the name are reported as duplicate name errors.This form of overloading in a subclass ensures theentity referred to in the subclass is closely relatedto the entity in the ancestor class. This helpsensure design consistency. The relationship of

    C++?? 2nd Edition page 9

  • name scope is not symmetric. Names in a subclassare not in scope in a superclass (although this isnot the case in typeless languages such asSmalltalk). In order to provide the consistentcustomisation of reusable software components,the same name should only be used by explicitlyredefining the original entity. The programmer ofthe descendant class should indicate that this isnot a syntax error due to a duplicate name, butthat redefinition is intended. (This has alreadybeen covered in the virtual section.) This choiceensures that the resultant class is logicallyconstructed. This might seem restrictive, but isanalogous to strong typing, and makes inheritancea much more powerful concept.

    3.9. Anonymous parameters in ClassDefinitions

    C++ does not require parameters in functiontemplates to be named. The type alone can bespecified. For example a function f in a classheader can be declared as f (int, int, char). Thisgives the client no clue to the purpose of theparameters, without referring to theimplementation of the function. Meaningfulidentifiers are essential in this situation, becausethis is the abstract definition of a routine. A clientof the class and routine must know that the firstint represents a count of apples, etc. It is truethat well known routines might not require aname, for example sqrt (int). But this is notappropriate for large scale software development.The use of anonymous parameters handicaps thepurpose of abstract descriptions of classes andmembers: to facilitate the reusability of software.Program text captures the meaning of the systemfor some future activity, such as extension ormaintenance. To achieve reusability,communication of intent of a software element isessential. A compiler strips away this level ofcommunication, producing a machine executableentity. Languages and compilers that perform lessthan optimal translations should not penalisecareful production of semantic entities. Butneither should a language definition allow lessthan optimal expression to the human reader.Languages do not have to be cryptic to achieveefficiency. In fact cryptic languages impairefficiency, as they make it harder for the pro-grammer to develop efficient systems, andfurthermore, they make it harder for automaticcode optimisers.

    3.8. . and ->The . and -> member access syntax came

    from C structures. It illustrates where the C baseadversely affects flexibility. Semantically bothaccess a member of an object. They are, however,operationally defined in terms of how they work.The dot (.) syntax accesses a member in anobject directly. For example x.y means accessthe member y in the object x.

    obj x; // declare object x of // class obj // with a member y.

    x.y; // access y in object x // directlyx->y; // syntax error . expected

    The -> syntax means access a member in anobject referenced by a pointer. For example x->y(or the equivalent *(x).y) means access themember y in the object pointer x refers to . Names are not strictly necessary in

    programming. Naming exists to help the humanreader identify different entities within theprogram, and to reason about their function. Forthis reason naming is essential. Without naming,development of sophisticated systems would benearly impossible. Some languages accessparameters by their address (position) in theparameter list ($1, $2, etc). This is quiteunsatisfactory, even for shell scripts. Anonymousparameters can save typing in a function template,but then programming is not a matter of conve-nience. This is inconvenient for later readers. Theredundancy is beneficial and saves laterprogrammers having to look up the information inanother place. A real convenience in functiontemplates would be that abstract functiontemplates be automatically generated from theimplementation text (see header files for moredetails).

    obj *x; // declare a pointer x to an // object of class obj.x->y; // access y via pointer xx.y; // syntax error -> expected

    In this example, what is to be computed isaccess the element y of object x. In C++,however, the programmer must specify for everyaccess the trivial detail of how this is done. Thecompiler can easily remove this burden from theprogrammer, as in fact most languages do.Furthermore, this reduces flexibility as if the objx declaration is changed to obj *x, the effect iswidespread as all x.y must be changed to x->y.Since the compiler gives a syntax error if thewrong access is used, this shows it already knowswhat access code is required and can generate itautomatically. Good programming centralisesdecisions. The decision to access the objectdirectly or via a pointer should be centralised inthe declaration.

    Anonymous parameters illustrate the linkbetween courtesy and safety issues inprogramming. Due to pressure of work, a clientprogrammer might wrongly guess the purpose of aparameter from the type. Thus the failure of theoriginal programmer to provide a courtesy has

    C++?? 2nd Edition page 10

  • caused a later programmer to breach safety. Aninterface client must know the intention of theinterface for it to be used effectively.

    3.12. Optional ParametersOptional parameters that assume a default

    value according to the routines declaration aresupposed to provide a shorthand notation.Shorthand notations are intended to speed upsoftware development. Such shorthand notationscan be convenient in shell scripts, and interactivesystems. In large scale software production,however, precision is mandatory, and defaults canlead to ambiguities and mistakes. With optionalparameters the programmer could assume thewrong default for a parameter. More importantly,optional parameters undermine type safety. Thetype of a function is defined by the compositionof its input types, and its output type:

    3.10. Nameless ConstructorsMultiple constructors can have different

    signatures, similar to overloaded functions. Thisprecludes two or more constructors having thesame signature. Constructors are also not named(apart from the same name as the class). Thismakes it difficult to discern from the class headerthe purpose of the different constructors. It isdifficult to match an object creation with thecalled constructor. Constructors suffer from all ofthe problems described with regards to functionswith the same name but different signatures. Itwould be easy to mark routines as constructors,for example:

    f: T1 x T2 x T3... -> T4

    The entire signature determines the type of thefunction, not just the return type. Optionalparameters mean that C++ is not type safe, andthat the compiler cannot check that the parametersin the call exactly match the function signature.

    constructor make (...)...constructor clone (...)...constructor initialise (...)...

    Furthermore, they do not provide a great dealof convenience. If a routine has five parameters,the last three of which are optional, and callerwants to assume the defaults for parameters 3 and4, but must specify parameter 5, then all fiveparameters must be specified. A better schemewould be to have a default keyword in functioncalls:

    where each constructor leaves the object invalid, but potentially different states. Namedconstructors would aid comprehension as to whatthe constructor is used for in the same way asfunction names document the purpose of afunction. Secondly, named constructors wouldallow multiple constructors with the samesignature. Thirdly, it is easier to match up anobject creation with the constructor actuallycalled.

    f (a, b, default, default, e);

    Other means, already in the language, caneasily provide this mechanism. For example, acall to another (possibly inline) function couldprovide the defaults for the optional parameters.This not only provides the convenience ofoptional parameters, but is more powerful. Anyparameter or combination can be filled in with anycombination of defaults, not just the lastparameters. Multiple intermediate routines canprovide multiple sets of defaults.

    3.11. Constructors and TemporariesA return can result in a

    different value than the result of . Insection 6.6.3, the C++ ARM says If required theexpression is converted, as in an initialisation, tothe return type of the function in which it appears.This may involve the construction and copy of atemporary object (S12.2).

    Section 12.2 explains In some circumstancesit may be necessary or convenient for the compilerto generate a temporary object. Such introductionof temporaries is implementation dependent.When a compiler introduces a temporary object ofa class that has a constructor it must ensure that aconstructor is called for the temporary object.

    3.13. Bad deletionsThe following example is given on p.63 in the

    C++ ARM as a warning about bad deletions thatcannot be caught at compile-time, and probablynot immediately at run-time:

    A note says The implementations use oftemporaries can be observed, therefore, throughthe side effects produced by constructors anddestructors.

    p = new int[10];p++;delete p; // errorp = 0;Putting this together, creation of a temporary

    is implementation dependent, so might or mightnot be done. If a temporary is created, aconstructor is called as a side effect, which canchange the state of the object. Different C++implementations could therefore return differentresults for the same code.

    delete p; // ok

    One of the restrictions of the design of C++ isthat it must remain compatible with C. Thisresults in examples like the above, that are ill-defined language constructs, that can only becovered by warnings of potential disaster.Removal of such language deficiencies wouldresult in loss of compatibility with C. This might

    C++?? 2nd Edition page 11

  • be a good thing if problems such as the abovedisappear. But then the resultant language mightbe so far removed from C that C might be bestabandoned altogether.

    data. Friend is a limited export mechanism.Friends have three problems:

    1) They can change the internal state ofobjects from outside the definition of the class.

    2) They introduce extra coupling betweencomponents, and therefore should be usedsparingly.

    3.14. Local entity declarationsDeclaring an entity close to where it is used,

    has both advantages and disadvantages. It isconvenient, but can make a routine appear morecomplex and cluttered. A problem is that anidentifier can be mistakenly overloaded within anested block in a function, with the resultantproblems covered in the sections on nameoverloading and nesting. C does not have nestedroutines or blocks so does not have this problem.ALGOL uses this simple form of nameoverloading. (A block in the ALGOL sensecontains both declarations and instructions.)

    3) They have access to everything, ratherthan being restricted to the members of interest tothem.

    Friends are useful, and a case can be made forshades of grey between public, protected andprivate members. Multiple interfaces to a classprovide the functionality of friends and avoid theabove problems. Each interface to the class can beexported to everything, or selected classes only. Aselective export mechanism is more general thanpublic, private, protected and friend, andexplicitly documents the couplings betweenentities in the system. Selective export specifiesnot only that a member is exported but to whichclasses it is exported.

    The ARM explains problems of localdeclarations with branching, which shows thecomplications in intermingling declarations andinstructions. Caveats cannot make up for or fixfaulty language definition.

    One reason given for friends, is that theyallow more efficient access to data members thana function call. The way C++ is often used is thatdata members are not put in the public section,because this breaks the data hiding principle.

    The C++ FAQ [Cline] (Q83) is unclear on thispoint (although it is mostly excellent), claimingthat an object is created and initialised at themoment it is declared. This only applies to auto,in stack objects. Dynamic entities are not createdand initialised until they are the subject of a newinstruction. In well written object-orientedsoftware, routines will be small, typicallyperforming one atomic action per routine.

    Data hiding is better described asimplementation hiding. Only a classes abstractfunctional interface should be visible to theoutside world. That is data members can beexported, but are viewed externally as functionalentities. This is because, when used inexpressions, functions and variables have nosemantic difference. They both return values of agiven type. (See fn () for an explanation of whyvariables and functions are best regarded assimilar entities.) (See also Marshall Clinesexplanation of friends in the FAQ for furtherclarification of the friend concept.)

    Small routines that implement atomicoperations are fundamental to loose coupling. Forexample, a base class that provides a singleroutine that logically performs operations A andB, is not useful to a subclass that needs to provideits own implementation of B, but does not want tochange A. The descendant must reimplement thelogic of both A and B, missing an opportunity toreuse the logic of A. Tight coupling reducesflexibility. Splitting A and B into differentroutines accomplishes loose coupling, andtherefore flexibility. Efficiency is also attainedwithout the mess of local entity declarations.Good design and clean modularisation achieveefficiency, as the entities which would be locals toa block in C++ are only created when the routineis entered.

    The Cambridge Encyclopedia of Languagehas an interesting point about public and privatenames. It says Many primitive people do not liketo hear their name used, especially inunfavourable circumstances, for they believe thatthe whole of their being resides in it, and theymay thereby fall under the influence of others.The danger is even greater in tribes (in Australiaand New Zealand, for example), where people aregiven two names - a public name, for generaluse, and a secret name, which is only known byGod, or to the closest members of their group. Toget to know a secret name is to have total powerover its owner.

    3.15. MembersCare should be taken with the C++ use of the

    term member. In general use, an object is amember of a class. This corresponds to membersin set theory. But in C++, the term member meansa data item, or function of the class. Thisambiguity could have easily been avoided. 3.17. Static

    The word static is confusing in C++. Page98 of the C++ Annotated Reference Manual(ARM) mentions this confusion and gives twomeanings. Firstly, a class can have static

    3.16. FriendsFriends are a mechanism to override data

    hiding. Friends of a class have access to its private

    C++?? 2nd Edition page 12

  • members, and a function can have static entities.The second meaning comes from C, where a staticentity is local in scope to the current file. Thechoice of different keywords would easily solvethis trivial problem. There is also a third moregeneral meaning that objects are statically orautomatically allocated and deallocated on thestack when a block is entered and exited, asopposed to dynamically allocated in free space.

    variants. Inheritance and polymorphism providethis in OOP. A reference to a superclass can alsobe used to refer to any subclass, and thus providesthe same semantics as union, only in a type safemanner, as the alternatives can never be confused.An object reference is implicitly a union of allsubclasses.

    3.19. Nested ClassesStatic class members are useful. Page 181 of

    the ARM states that statics reduce the need forglobal variables. It is good to reduce globalvariables, but the C syntax obscures the purpose.

    Simula provided textually nested classessimilar to nested procedures in ALGOL. Textual(syntactic) nesting should not be confused withsemantic nesting, nor static modelling withdynamic run time nesting. Modelling is done inthe semantic domain, and should be divorcedfrom syntax. You do not need textually nestedclasses to have nested objects. Nested classes arecontrary to good object-oriented design, and thefree spirit of object-oriented decomposition,where classes should be loosely coupled, tosupport software reusability. Semantic nesting isachieved independently of textual nesting. Inobject-oriented design all objects should interactonly via well defined interfaces. Objects of a classthat is textually nested in another class haveaccess to the outer object without the benefit of aclean interface. C avoided the complexity ofnested functions, but C++ has chosen to imple-ment this complexity for classes, which is of lessuse than nested functions.

    Entities declared in functions can also bestatic. These are not needed in an object-orientedlanguage. The reason and history is this. ALGOLhas the notion of OWN locals in blocks. Thesemantics of an OWN entity is that when a blockis exited, the value of the OWN is preserved forthe next entry to the block. I.e. the value ispersistent. The implementation is that at compiletime, the OWN entity is limited in scope to theblock, but at run time, it is located in the globalstack frame. The same instance of the variable isused in all invocations of the procedure, ratherthan each invocation using separate local storageon the stack. This causes complication in recur-sion.

    Simulas designers generalised the ALGOLnotion of block into class, and so object-orientation was born. Instead of discarding a classblock on exit, it is made persistent. Declarationswithin the class block are persistent, and thereforeprovide the functionality of static and OWN.Classes are more flexible than statics. Statics arepersistent in the same way as globals, ie for theduration of the program. Class member lifetime isgoverned by the lifetime of the object. Object-oriented languages do not need OWNs or statics.

    OOP achieves nesting in two ways: byinheritance and object-oriented composition. Thusmodelling nesting is achieved without tighttextual coupling. For example, consider a car. Weknow in the real world that the engine isembedded within the car. In object-orientedmodelling, however, this embedding is modelledwithout textual nesting. Both car and engine areseparate classes. The car contains a reference to anengine object. This also allows the vehicle andengine hierarchy to be independently defined.Engine is derived independently into petrol,diesel, and electric engines. This is simpler andmore flexible than having to define a petrol enginecar, a diesel engine car, etc, which you have to doif you textually nest the engine class in the car.Other examples can also be structured withouttextual nesting, and no loss of generality.

    3.18. UnionUnion is another construct that is superfluous

    in OOP. Similar constructs in other languages arerecognised as problematic. For example,FORTRANs equivalences, COBOLsREDEFINES, and Pascals variant records. Whenused to overload memory space these force theprogrammer to think about memory allocation.Recursive languages use a stack mechanism thatmakes overloading memory space unnecessary, asit is allocated and deallocated automatically forlocals when procedures are entered and exited.The compiler and run time system automaticallyallocate and deallocate storage as required,ensuring that two pieces of data never clash forthe same memory space. This is essential so thatthe programmer can concentrate on the problemdomain, rather than machine oriented details.When union is used similarly to FORTRANsequivalences it is not needed.

    In C++, not only can classes be nested withinother classes, but also within functions, therebytightly coupling a class to a function. Thisconfuses class definition with object declaration.The class is the fundamental structure in object-oriented programming and nothing has existenceseparate from class (including globals). C++ isconfused as to whether it is procedure-oriented orobject-oriented.3.20. Global Environments

    The global environment provides a specialcase of nested classes. When classes are nested ina global environment, dependencies can arise that

    Union is also not needed to provide theequivalent to COBOL REDEFINES or Pascals

    C++?? 2nd Edition page 13

  • make the classes difficult to decouple from thatenvironment, and therefore not reusable. Even if aclass is not intended for use in another context, itwill benefit from the discipline of object-orienteddesign. Each class is designed independently ofthe surrounding environment, and relationshipsand dependencies between classes are explicitlystated.

    header B also includes header C. A simple butmessy fix in all headers solves this problem:

    #ifndef thismod#define thismod

    ... rest of header#endif

    Headers show how C++ addresses theproblem of independent modules by a non-object-oriented approach that is sub-optimal; theprogrammer must supply this bookkeepinginformation manually. A class interface isequivalent to a module header. A module headercontains data and routines exported to othermodules. This is exactly the purpose of the classinterface. A class definition contains allknowledge of component classes and theirdependencies (inheritance and client) in the classtext. Dependency analysis is derivable from theclass text. Tools like make can be integrated intothe compiler itself, and the errors and tediumencountered in the use of make are avoided.#includes relate to the organisation andadministration of a project. Rational language de-sign eliminates such bookkeeping mechanisms.

    In C++ functions can change the globalenvironment, beyond the object in which they areencapsulated. Such changes are side-effects thatlimit the opportunity to produce loosely-coupledobjects, which is essential to enable reusablesoftware. This is a drawback of both global andnested environments.

    A good OO language will only permit routinesin an object to change its state. Removing theglobal environment is trivial. It is simplyencapsulated in an object or set of objects of itsown. Therefore global entities are subject to thediscipline of object-oriented design. Havingglobals in a system circumvents OOD. Objectscan also provide a clean interface to the externalenvironment, or operating system, without loss ofgenerality, for a negligible performance penalty.Thus classes are independent of a surroundingenvironment, and the project for which they werefirst developed, and are more easily adaptable tonew environments and projects.

    A traditional system is assembled bycombining modules. An object-oriented system isassembled by combining classes. Modules are aprimitive form of classes. Classes are moresophisticated. They express more preciselyrelationships with other classes. C++ #includesand modules have problems. This primitivemethod is not required in an object-orientedlanguage.

    3.21. Header FilesIn C++ a class interface must be maintained

    separately from its body. While an abstractinterface should be distinct from a concreteimplementation, the interface and implementationcan both be derived from one source. In C++though, programmers must maintain the two setsof information. Replicated information has wellknown drawbacks. In the event of change, bothcopies must be updated. This can lead toinconsistencies that must be detected andcorrected. Tools can automatically extract abstractclass descriptions from class implementations,and guarantee consistency.

    3.22. Class InterfacesSection 9.1c of the C++ ARM points out that

    C++ has no direct support for interfacedefinition and implementation module. In aC++ class definition, all private and protectedmembers must be included in the public text ofthe class. The ARM points out that whenever theprivate or protected parts are changed, the wholeprogram must be recompiled. Further to what theARM says, all modules that are dependent on theheader file must be recompiled, even though theprivate and protected members do not affect othermodules. Private members should not be in theabstract class interface, as this exposesimplementation details to programmers of othermodules.

    The programmer must also use #includes tomanually import class headers. #include is an oldand unsophisticated mechanism to providemodularity. #include is a weak form of inheritanceand import. C++ still uses this 30 year oldtechnique for modularisation, while otherlanguages have adopted more sophisticatedapproaches, for example, Pascal with Units,Modula with modules, Ada with packages. InEiffel the unit of modularisation is the class itself,and includes are handled automatically. The OOPclass is a more sophisticated way to modulariseprograms. Inheritance implements reusability andmodularisation, so #include is superfluous.

    3.23. Class header declarationsCs syntax for function declarations is

    [] (). For (a verysimple) example:

    class CAnother problem is that if header A includesheader B, and header B includes header A acircular dependency occurs. The same problemoccurs if header A includes headers B and C, and

    {a ();b ();

    C++?? 2nd Edition page 14

  • int c (); In C++ the programmer must manuallymanage storage due to the lack of garbagecollection. This is a difficult bookkeeping taskthat leads to two opposite problems. Firstly, anobject can be deallocated prematurely, while validreferences still exist (dangling pointers).Secondly, dead objects might not be deallocatedleading to memory filling up with dead objects(memory leaks). Attempts to correct eitherproblem can lead to overcompensation and theother problem occurring. A correct system is afine balance. This is illustrated in the figurebelow.

    d ();char e ();virtual void f ();

    }

    To find an identifier in this layout, the eyemust trace a course around the type specifications.This is a tiring activity. The eye has a greaterchance of missing the sought identifier, and theprogrammer must resort to using the searchfunction of a text editor to help out.

    Other languages place the entity names first.For example:

    DanglingPointers

    CorrectSystem

    MemoryLeaksclass C

    {a (); These problems contribute to the fragility of

    C++ programs, and usually result in systemfailure. Garbage-collection solves both problems.Garbage-collection has an undeserved badreputation due to some early garbage-collectorshaving performance problems, instead of workingtransparently in the background, as they can andshould. These problems are often over-emphasised as a justification for C++ ignoringgarbage collection. A possible solution is to buildgarbage collection into the run time architecture,but allow the programmer to activate anddeactivate it manually. Garbage collection can bedisabled in systems where it is inappropriate.

    b ();c () int;d ();e () char;f () virtual void;

    }

    To those used to the ALGOL and FORTRANstyle of type first, this seems backwards. Butname first is logical as a real world exampleillustrates. Imagine if a dictionary is published,and the keywords are not placed first, but ratherthe entry order is -

    In C++ it might be argued that the lack ofgarbage-collection is not an engineeringcompromise. Its inclusion is nearly an engineeringimpossibility, as a programmer can undermine thestructures required for implementing correctlyworking garbage-collection. While garbage-collection might not actually be an impossibilityin C++ (EC++), it is difficult, and programmerswould have to settle for a more restricted way ofprogramming. This could be a good thing. Butthen the compromise to remain compatible with Cbecomes difficult, if the compiler is to detectpractices inconsistent with the operation ofgarbage-collection.

    noun /obvrzen/ obversion, the act or result of obverting

    Such a dictionary would not sell many copies,unless the marketers managed to fool manypeople that the explanation of the meaning wasmore correct because the order of layout wasmysteriously magical. This example illustrateshow important subtle syntax decisions are, andwhy PASCAL style languages might have orderedthings contrary to FORTRAN, ALGOL andothers. The language designer must consider thesetrivial but important alternatives. The layout ofprogramming entities is essential for effectivecommunication. The dual roles of languagesyntax, and programming style affectcomprehension. A dictionary or index style layoutsuggests placing entity names first, followed bytheir definition.

    3.25. Type-safe linkageThe C++ ARM explains that type-safe linkage

    is not 100% type safe. If it is not 100% type-safe,then it is unsafe. It is the subtle errors that causethe most problems, not the simple or obviousones. Often such errors remain undetected in thesystem until critical moments. The seriousness ofthis situation cannot be underestimated. Manyforms of transport, such as planes, and spaceprograms depend on software to provide safety intheir operation. The financial survival oforganisations can also depend on software. Toaccept such unsafe situations is at best ir-responsible.

    3.24. Garbage CollectionOne of the hallmarks of high level languages

    is that programmers declare data without regard tohow the data is allocated in memory. In blockstructured languages, local variables are allocatedon the stack, and automatically deallocated whenthe block exits. This relieves the programmer of agreat burden. Garbage collection providesequivalent relief in languages with dynamic entityallocation.

    C++?? 2nd Edition page 15

  • The C++ ARM summarises the situation asfollows - Handling all inconsistencies - thusmaking a C++ implementation 100% type-safe -would require either linker support or amechanism (an environment) allowing thecompiler access to information from separatecompilations.

    methodology of choice of disciplined thinkers.Some people can hold a whole problem andsolution in their head and work in a disciplinedfashion until the solution is complete. Mozart issaid to have composed this way, producing hislast three symphonies in as many months in 1788.Beethoven toiled far more over the production ofhis works, taking years to complete onesymphony. Both composers producedmasterpieces. Mozart wrote music directly,whereas Beethoven wrote themes and ideas in hisfamous sketchbooks. The production ofmasterpieces depends on skill, not on method-ologies.

    So why does the C++ compiler (at leastAT&Ts) not provide for accessing informationfrom separate compilations? Why is there not aspecialised linker for C++, that actually provides100% type safety? There is no reason why C++should not be implemented this way. Buildingsystems out of preexisting elements is thecommon Unix style of software production. Thisimplements a form of reusability, but not in thetruly flexible manner of object-orientedreusability.

    It is becoming accepted that the softwarelifecycle should be an integrated process.Analysis, design and implementation should be aseamless continuum.The activities of the lifecycleshould progress in parallel to expedite softwaredevelopment. Facts found out only as late as theimplementation stage can be fed back into theanalysis and design stages. The object-orientedapproach supports this process. Artificialseparation of the steps leads to a large semanticgap between the steps. The transformationsrequired to bridge such semantic gaps are prone tomisinterpretation, time consuming and costly.

    In the future, Unix could be replaced byobject-oriented operating systems, that are indeedopen to be tailored to best suit the purpose athand. By the use of pipes and flags, Unix softwareelements can be reused to provide functionalitythat approximates what is desired. This approachis valid and works with efficacy in someinstances, like small in-house applications, orperhaps for research prototyping, but isunacceptable for widespread and expensivesoftware, or safety critical applications. In the lastten years the advantages of integrated softwarehave been acknowledged. Classic Unix systemsdont provide those advantages. Integrated sys-tems are more ambitious, and place more demandson developers. But this is the sort of software nowbeing demanded by end users. Systems that arecobbled together are unacceptable.

    The same people should be responsible for allstages. This way they take responsibility for thesystem as a whole, rather than passing the buckand blame which occurs when analysts, designersand implementers are different groups. This is nota popular viewpoint in traditional hierarchicalmanagement structures where programmers getpromoted to designers who get promoted toanalysts. Hierarchical management alsodiscourages people from feeling responsible for aproduct. This culture must radically change if weare to produce quality systems.

    A further problem with linking is thatdifferent compilation and linking systems shoulduse different name encoding schemes. Thisproblem is related to type-safe linkage, but iscovered in the section on reusability andcompatibility.

    We should have learnt from the extremesSA/SD. Some quarters believed that methodologywas all important, while programming andprogramming languages were unimportant.Arcane and machine-oriented programminglanguages strengthened this attitude. Theselanguages concentrate on the how ofcomputation, whereas the modellers correctlydemand notations that express the what, in orderto be implementation independent. A modernsoftware language supports the integration of theactivities of design and implementation by beingreadable, and problem-oriented. A languageshould be as close to design as possible. Theneeds and requirements of an enterprise canchange much more rapidly than programmers cankeep up, especially in a highly competitive andcommercial world.

    3.26. C++ and the software lifecycleThe software lifecycle has attracted a great

    deal of attention. It is at least generally acceptedthat the activities in the lifecycle are analysis ofrequirements, design, implementation, testing anderror correction, extension. Unfortunately, theresult of identifying these activities has resulted ina school of thought that the boundaries betweenthese activities are fixed, and that they should besystematically separate, each being completedbefore the next is commenced. It is often arguedthat if they are not cleanly separated, then you arenot practicing disciplined system development.

    This view is incorrect. Someone who writes aprogram straight away is actually doing all thesteps in parallel. It might not be the best way dodo things in many circumstances, might or mightnot suit the style and thinking of different people,but this works in some scenarios, and can be the

    So how does C++ fit into this picture? Well itis based on C that was designed mainly as animplementation and machine-oriented language. Itis an old language, that did not need to considerthe integrated lifecycle approach. C++ might have

    C++?? 2nd Edition page 16

  • some of the trappings of object-oriented concepts,but it is the marriage of a problem-orientedtechnique with a machine-oriented language. Itaddresses implementation, but not so well theother aspects of the software lifecycle. Since C++is not so well integrated with analysis and design,the transformation required to go from analysisand design to implementation is costly. Thesemantic gap between design languages and theimplementation language is great.

    software component, require assurance that thecomponent is trustworthy. Trusting programmersis against the commercial interest of both parties.This is not to cast dispersion on programmers, butmerely recognises that computers are good atperforming mundane tasks and checks, but peopleare not. If people were good at such things, wewould not need computers in the first place.Building trustworthy components is a safetyconcern.

    We should have learnt from the structuredw