21
C++ Code Analysis: an Open Architecture for the Verification of Coding Rules Paolo Tonella ITC-irst, Centro per la Ricerca Scientifica e Tecnologica [email protected]

C++ Code Analysis: an Open Architecture for the Verification of Coding Rules Paolo Tonella ITC-irst, Centro per la Ricerca Scientifica e Tecnologica [email protected]

Embed Size (px)

Citation preview

Page 1: C++ Code Analysis: an Open Architecture for the Verification of Coding Rules Paolo Tonella ITC-irst, Centro per la Ricerca Scientifica e Tecnologica tonella@itc.it

C++ Code Analysis: an Open Architecture for the Verification of Coding Rules

Paolo Tonella

ITC-irst, Centro per la Ricerca Scientifica e Tecnologica

[email protected]

Page 2: C++ Code Analysis: an Open Architecture for the Verification of Coding Rules Paolo Tonella ITC-irst, Centro per la Ricerca Scientifica e Tecnologica tonella@itc.it

ITC/CERN collaboration

The collaboration aims at improving the quality of the code developed at CERN, by means of:

Automatic check of coding rules. Recovery of the design from the code. Refactoring of the design.

All objectives share a common C++ code analysis functionality.

Page 3: C++ Code Analysis: an Open Architecture for the Verification of Coding Rules Paolo Tonella ITC-irst, Centro per la Ricerca Scientifica e Tecnologica tonella@itc.it

Outline of the talk

C++ analysis model Tool architecture Preprocessing Language issues Implementation of coding rules State of development

Page 4: C++ Code Analysis: an Open Architecture for the Verification of Coding Rules Paolo Tonella ITC-irst, Centro per la Ricerca Scientifica e Tecnologica tonella@itc.it

C++ analysis model

Page 5: C++ Code Analysis: an Open Architecture for the Verification of Coding Rules Paolo Tonella ITC-irst, Centro per la Ricerca Scientifica e Tecnologica tonella@itc.it

C++ analysis model

The model of the C++ language enjoys the following properties:

Generality. Extensibility. Abstraction.

Page 6: C++ Code Analysis: an Open Architecture for the Verification of Coding Rules Paolo Tonella ITC-irst, Centro per la Ricerca Scientifica e Tecnologica tonella@itc.it

Tool Architecture

Packages syntax and entities collaborate to generate a network of objects according to the C++ model.Package rules contains the coding conventions to be checked.

Page 7: C++ Code Analysis: an Open Architecture for the Verification of Coding Rules Paolo Tonella ITC-irst, Centro per la Ricerca Scientifica e Tecnologica tonella@itc.it

Tool architecture

The adoption of this architecture provides a remarkable flexibility.

All rules relying on properties of entities in the C++ model can be directly encoded.

The C++ model can be extended if additional properties need to be collected.

Adding a new application package is simple.

Page 8: C++ Code Analysis: an Open Architecture for the Verification of Coding Rules Paolo Tonella ITC-irst, Centro per la Ricerca Scientifica e Tecnologica tonella@itc.it

Preprocessing

C++ macros are expanded in the code by the preprocessor.

Macros do not necessarily comply with the C++ syntax.

#define BEGIN {#define END }

void f()BEGIN

int x = 0;...

END

Page 9: C++ Code Analysis: an Open Architecture for the Verification of Coding Rules Paolo Tonella ITC-irst, Centro per la Ricerca Scientifica e Tecnologica tonella@itc.it

Strip filter

The C++ preprocessor prepends all directly and indirectly included files. The strip filter removes those that are not user defined.

Moreover, the C++ preprocessor inserts some flags that are useful for the successive compilation step. Examples are: __extension__, __const__

The output of the strip filtering is a legal C++ module, that can be analyzed by the parser.

Page 10: C++ Code Analysis: an Open Architecture for the Verification of Coding Rules Paolo Tonella ITC-irst, Centro per la Ricerca Scientifica e Tecnologica tonella@itc.it

C++ language

C++ was conceived as an object oriented evolution of C. A strong requirement in its design was a total backward compatibility with C.

C++ had also a controversial evolution in its more advanced features, like exception handling and generic classes.

Is it a function declaration or a global object creation?

A x();

Page 11: C++ Code Analysis: an Open Architecture for the Verification of Coding Rules Paolo Tonella ITC-irst, Centro per la Ricerca Scientifica e Tecnologica tonella@itc.it

Language issues

To deal with the complexity of C++, it is important to distinguish between the compilation perspective and the analysis perspective.

The analyzer can assume that the input program is compilable with no errors.

The compiler needs to capture the statement level semantics.

The performances expected from the compiler are substantially superior.

All these considerations led to the choice of a javacc based C++ grammar

Page 12: C++ Code Analysis: an Open Architecture for the Verification of Coding Rules Paolo Tonella ITC-irst, Centro per la Ricerca Scientifica e Tecnologica tonella@itc.it

Compatibility with C

Structures and unions are reinterpreted as classes. Although methods are available from classes,

functions are still usable. Functions may operate on class objects, and

classes may invoke functions. Global variables violate encapsulation, but are

allowed. Types other than classes can be defined with the typedef.

Page 13: C++ Code Analysis: an Open Architecture for the Verification of Coding Rules Paolo Tonella ITC-irst, Centro per la Ricerca Scientifica e Tecnologica tonella@itc.it

Language issues (cont.) The language model contains C as a subset. Type equivalence affects the association

between declaration and definition.

Additional difficulties: Body of methods within class definition. Constructors, destructors, conversion functions

and operators. Encapsulation violation via friend construct. Generic classes (template). Exception throwing and catching.

Page 14: C++ Code Analysis: an Open Architecture for the Verification of Coding Rules Paolo Tonella ITC-irst, Centro per la Ricerca Scientifica e Tecnologica tonella@itc.it

Coding rules

Adding a new coding rule involves the following steps: A new class is defined which extends the general

class Rule. Its constructor passes the rule name and

description to the superclass constructor. A method check must be defined to implement the

interface of the superclass. The body of the method check can use the access

functions of the analysis package.

Page 15: C++ Code Analysis: an Open Architecture for the Verification of Coding Rules Paolo Tonella ITC-irst, Centro per la Ricerca Scientifica e Tecnologica tonella@itc.it

Coding rule exampleThe following coding rule is taken from the Naming

Rules enforced within the CERN experiment ALICE:

RN3 No special characters in names are allowed (_, #, &, @, -, %).

check() {

classes = Module.getClasses();

foreach (c in classes) {

if (c.getName().hasChar(_, #, &, @, -, %))

printViolationMessage(...);

methods = c.getMethods();

foreach (m in methods) {

if (m.getName().hasChar(_, #, &, @, -, %))

printViolationMessage(...);

locals = m.getLocals();

foreach (l in locals) ...

Page 16: C++ Code Analysis: an Open Architecture for the Verification of Coding Rules Paolo Tonella ITC-irst, Centro per la Ricerca Scientifica e Tecnologica tonella@itc.it

Adding new coding rules

The only constraint is that a formal description of the rule can be derived, for which a procedure can be written.

It may be necessary to augment the set of entities extracted by the CPPParser.

When entities are available, rule introduction is simple.

There is a clear and sharp separation between the responsibilities of packages rules and analysis.

Page 17: C++ Code Analysis: an Open Architecture for the Verification of Coding Rules Paolo Tonella ITC-irst, Centro per la Ricerca Scientifica e Tecnologica tonella@itc.it

Current limitations

Known limitations are related to the difficulties of covering the whole range of C++.

Genericity is not handled. Exception throwing and catching is not detected. Type equivalence is implemented only in a

simplified form.

Such limitations did not substantially limit the possibility of analyzing ALICE code, which does not exploits genericity and exceptions.

Page 18: C++ Code Analysis: an Open Architecture for the Verification of Coding Rules Paolo Tonella ITC-irst, Centro per la Ricerca Scientifica e Tecnologica tonella@itc.it

State of development

Conventions Tot. Impl. To be impl. Non impl. Excl.

Naming rules 21 17 0 3 1

Coding rules 14 8 2 2 2

Style rules 5 1 4 0 0

Namingguidelines

2 0 0 2 0

Codingguidelines

4 0 1 3 0

Styleguidelines

0 0 0 0 0

See: http://AliSoft.cern.ch/offline/codingconv.html

Page 19: C++ Code Analysis: an Open Architecture for the Verification of Coding Rules Paolo Tonella ITC-irst, Centro per la Ricerca Scientifica e Tecnologica tonella@itc.it

State of development (cont.)Coverage of the coding conventions for which an

automatic check is feasible:

Conventions Abs. Perc.

Total 46

Implementable 33 72%

Implemented 26 79%

Page 20: C++ Code Analysis: an Open Architecture for the Verification of Coding Rules Paolo Tonella ITC-irst, Centro per la Ricerca Scientifica e Tecnologica tonella@itc.it

Analyzed code

The RuleChecker tool was successfully executed with no errors on all the code in the current release of the ALICE experiment software.

Lines of code 85730

Subsystems 18

Classes 215

Modules (.cxx) 136

A violation report was generated for each module under analysis.

Page 21: C++ Code Analysis: an Open Architecture for the Verification of Coding Rules Paolo Tonella ITC-irst, Centro per la Ricerca Scientifica e Tecnologica tonella@itc.it

Conclusion

To make analysis independent from the applications using its outcomes:

a C++ language model was defined, a simple query protocol was used to access code

entities.

Executed on ALICE code, the tool RuleChecker: collected information about 85730 lines of code, reported no parse error, produced a violation report associated to each

input module.