Improving Code Quality Through Effective Review Process

By Dr. Syed Hassan Amin

The objective of a code review exercise is to ensure the highest code quality.

Quality code is:
- Understandable
- Extendable and easily modified with changing business requirements
- Scalable: covers performance, maintainability and fault tolerance
- Reliable: does not easily break down

In this discussion, we look at commonly found problems in code, and find ways of overcoming them.

Code Quality

- Coding Guidelines
- Understanding and Avoiding Common Coding and Design Mistakes
  - Memory Leaks
  - Bad smells of code
  - Excessive and unnecessary API calls
- Best Practices
  - Centralized Server Communication
  - Following Single Responsibility Principle
  - Centralized Threading
  - Optimizing Use of Images in terms of file access, storage, and draw calls
  - Reducing Code Complexity
- Static Code Analysis
- Profiling

Path to High Quality Code

Coding conventions are a set of guidelines for a specific programming language that recommend programming style, practices and methods for each aspect of a program written in that language.

These conventions usually cover file organization, indentation, comments, declarations, statements, white space, naming conventions, programming practices, programming principles, programming rules of thumb, etc.

Coding Guidelines

- Redundant code
- Writing functions which span many pages
- Writing functions with deeply nested ifs and cases
- Writing badly typed functions
- Function names which do not reflect what the functions do
- Variable names which are meaningless
- Using processes when they are not needed
- Badly chosen data structures (bad representations)
- Bad comments or no comments at all (always document arguments and return value)
- Hard-coded values
- Unindented code
- Inaccessible parts of code (dead code)

The Most Common Mistakes

- Leads to lack of reusability
- Can lead to unforeseen results

Example: code to perform the sell operation was added in 21 different places. As a consequence, the coin (virtual money) balance used in the project ended up negative in a multi-threaded game.
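One way to avoid the situation in the example is to keep every balance change in a single, lock-protected place. The following is a minimal Python sketch under that assumption; `Economy`, `buy`, `sell` and `snapshot` are hypothetical names chosen for illustration, not code from the project described above.

```python
import threading
from collections import Counter


class Economy:
    """Hypothetical game economy: the only place where coins change hands."""

    def __init__(self, coins: int) -> None:
        self._lock = threading.Lock()  # guards coins and inventory together
        self._coins = coins
        self._inventory = Counter()

    def buy(self, item: str, price: int) -> bool:
        """Spend coins on an item; refuse rather than let the balance go negative."""
        with self._lock:
            if price > self._coins:
                return False
            self._coins -= price
            self._inventory[item] += 1
            return True

    def sell(self, item: str, price: int) -> bool:
        """Sell one owned item; refuse if the player does not own it."""
        with self._lock:
            if self._inventory[item] <= 0:
                return False
            self._inventory[item] -= 1
            self._coins += price
            return True

    def snapshot(self) -> tuple:
        """Return (coins, inventory) as a consistent pair."""
        with self._lock:
            return self._coins, dict(self._inventory)
```

Because the check and the update happen under the same lock, concurrent callers cannot drive the balance below zero, and a fix to the sell rule only has to be made once.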

Redundant Code

- Long functions may not always be bad, especially when a single algorithm is implemented.
- Long functions are bad when they have too many if blocks. Seriously consider dividing those functions.
- Extremely long functions are definitely bad. As a general rule of thumb, consider subdividing your method when it exceeds 50 lines of code.

Don’t Write Very Long Functions

Nested code is code containing case/if/receive statements within other case/if/receive statements.

It is bad programming style to write deeply nested code - the code has a tendency to drift across the page to the right and soon becomes unreadable.

Try to limit most of your code to a maximum of two levels of indentation.

This can be achieved by dividing the code into shorter functions.
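As an illustration of dividing nested code into shorter functions, here is a small Python sketch. The order structure and the names `shipping_cost_nested`, `shipping_cost_flat` and `_rate_for` are hypothetical; the point is only the shape of the code.

```python
def shipping_cost_nested(order):
    """Three levels of nesting: the interesting lines drift to the right."""
    if order is not None:
        if order["items"]:
            if order["country"] == "US":
                return 5
            else:
                return 15
        else:
            return 0
    else:
        raise ValueError("order is required")


def shipping_cost_flat(order):
    """Same behavior using early returns: one level of indentation."""
    if order is None:
        raise ValueError("order is required")
    if not order["items"]:
        return 0
    return _rate_for(order["country"])


def _rate_for(country):
    """Small helper extracted from the innermost branch."""
    return 5 if country == "US" else 15
```

Both versions return the same results; the second handles the exceptional cases first and leaves the main path unindented.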

Don't Write Deeply Nested Code

A module should not contain more than 400 lines of source code.

It is better to have several small modules than one large one.

A module is also too large when it has too many subroutines/functions.

Don't Write Very Large Modules

Comments should be clear and concise and avoid unnecessary wordiness.

Make sure that comments are kept up to date with the code.

Comments should add to the understanding of the code.

Comments

The important things to document are:
- The purpose of the function.
- The domain of valid inputs to the function. That is, data structures of the arguments to the function together with their meaning.
- The domain of the output of the function. That is, all possible data structures of the return value together with their meaning.
- If the function implements a complicated algorithm, describe it.
- The possible causes of failure and exit signals which may be generated by exit/1, throw/1 or any non-obvious run time errors. Note the difference between failure and returning an error.
- Any side effect of the function.

Comments About Function/Method

Example:

%%--------------------------------------------------------
%% Function: get_server_statistics/2
%% Purpose: Get various information from a process.
%% Args: Option is normal|all.
%% Returns: A list of {Key, Value}
%%          or {error, Reason} (if the process is dead)
%% Modification History:
%%--------------------------------------------------------
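The same checklist carries over to other languages. Below is a rough Python analogue of the header above, written as a docstring. The function body and the shape of the `process` dictionary are assumptions made only so the example runs; the header itself says nothing about them. It also shows the difference between failing (raising) and returning an error.

```python
def get_server_statistics(process: dict, option: str = "normal"):
    """Get various information from a process record.

    Args:
        process: dict with keys "alive" (bool), "name" (str), "queue_len" (int).
        option: "normal" for basic keys, "all" to include the queue length.

    Returns:
        A list of (key, value) tuples,
        or ("error", reason) if the process is dead.

    Raises:
        ValueError: if option is neither "normal" nor "all".

    Side effects:
        None.
    """
    if option not in ("normal", "all"):
        # Failure: the caller passed something outside the documented domain.
        raise ValueError(f"unknown option: {option!r}")
    if not process.get("alive", False):
        # Returned error: an expected outcome the caller should handle.
        return ("error", "process is dead")
    stats = [("name", process["name"])]
    if option == "all":
        stats.append(("queue_len", process["queue_len"]))
    return stats
```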

Comments About Function/Method (Cont’d)

Never leave commented-out code in place when submitting/committing code.

Remember the source code control system will help you!

Do Not Comment Out Old Code – Remove It

Many errors go unnoticed until runtime at the client site. Exceptions provide a powerful and flexible way to handle issues in your code. Use them wisely. Common examples of such errors include:
- Sending messages an object doesn’t respond to
- Going out of bounds of an array
- Accessing invalid data
- File not found
- Unable to save/retrieve a file on a network server because of incorrect encoding, e.g. ASCII on the client side and Unicode on the server side
- Losing the connection to the window server
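A few of the errors listed above (file not found, encoding mismatch, out-of-bounds access) can be sketched in Python as follows. `DataReadError` and `read_first_line` are hypothetical names; the idea is to catch the specific low-level exceptions and re-raise one application-level error with a readable reason.

```python
class DataReadError(Exception):
    """Hypothetical application-level error carrying a readable reason."""


def read_first_line(path: str, encoding: str = "utf-8") -> str:
    """Return the first line of a text file or raise DataReadError."""
    try:
        with open(path, encoding=encoding) as handle:
            lines = handle.read().splitlines()
        return lines[0]  # IndexError if the file is empty
    except FileNotFoundError as exc:
        raise DataReadError(f"file not found: {path}") from exc
    except UnicodeDecodeError as exc:
        raise DataReadError(f"{path} is not valid {encoding}") from exc
    except IndexError as exc:
        raise DataReadError(f"{path} is empty") from exc
```

Callers then handle a single exception type, while the original cause stays attached through `from exc` for debugging.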

Exception Handling

When reviewing code, make sure that there are no segments or blocks of code to which control will never be transferred.

Inaccessible code is also called dead code. Examples include:
- A function that is never used.
- An if/else block that is never supposed to get control transferred to it.

Use exception handling to cover unexpected scenarios rather than leaving inaccessible code for end-of-the-world scenarios.
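A small Python sketch of that last point: instead of a fallback branch that is never expected to run, the unexpected case raises an exception. `Status` and `label` are hypothetical names used only for illustration.

```python
from enum import Enum


class Status(Enum):
    ACTIVE = "active"
    CANCELLED = "cancelled"


def label(status: Status) -> str:
    """Map a status to display text; unknown values are reported, not hidden."""
    if status is Status.ACTIVE:
        return "In progress"
    if status is Status.CANCELLED:
        return "Cancelled"
    # No silent "should never happen" branch: fail loudly instead.
    raise ValueError(f"unhandled status: {status!r}")
```

If a new status is added later and this function is not updated, the error surfaces at the first call rather than being masked by a default value.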

Inaccessible Code

- Avoid hard-coding values into the source code.
- Hard coding requires the program's source code to be changed any time the input data changes.
- Constants may be put in a separate configuration file when necessary.
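A minimal Python sketch of the configuration-file approach, assuming a JSON file. The setting names, the default values and `load_settings` are all hypothetical.

```python
import json

# Fallback values used only when the configuration file does not override them.
DEFAULTS = {"max_retries": 3, "request_timeout_seconds": 10}


def load_settings(path: str) -> dict:
    """Merge values from a JSON configuration file over the defaults."""
    settings = dict(DEFAULTS)
    try:
        with open(path, encoding="utf-8") as handle:
            settings.update(json.load(handle))
    except FileNotFoundError:
        pass  # no configuration file: keep the defaults
    return settings
```

Changing the retry count then means editing the JSON file, not the source code.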

Avoid Hard Coded Values

How efficient is an algorithm or piece of code?
- CPU (time) usage
- Memory usage
- Disk usage
- Network usage

We express complexity of our algorithms using big-O notation.

For a problem of size N:
- a constant-time method is "order 1": O(1)
- a linear-time method is "order N": O(N)
- a quadratic-time method is "order N squared": O(N^2)
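To make the three orders concrete, here is one small Python function for each. The function names are made up for this illustration.

```python
def first_patch(patches):
    """O(1): one step regardless of how many patches there are."""
    return patches[0]


def contains(patches, wanted):
    """O(N): in the worst case every element is inspected once."""
    for patch in patches:
        if patch == wanted:
            return True
    return False


def has_duplicates(patches):
    """O(N^2): every element is compared with every later element."""
    for i in range(len(patches)):
        for j in range(i + 1, len(patches)):
            if patches[i] == patches[j]:
                return True
    return False
```

Doubling the input size leaves the first function unchanged, roughly doubles the worst-case work of the second, and roughly quadruples it for the third.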

Code Complexity

In 1976, Tom McCabe published a paper arguing that code complexity is defined by its control flow.

It seems to be generally accepted that control flow is one of the most useful measurements of complexity.

High complexity scores have been shown to be a strong indicator of low reliability and frequent errors.

The Cyclomatic Complexity computation is based on Tom McCabe's work and is defined in Steve McConnell's book, Code Complete, on page 395:
- Start with 1 for the straight path through the routine
- Add 1 for each of the following keywords or their equivalents: if, while, repeat, for, and, or
- Add 1 for each case in a case statement

Code Complexity (Cont’d)

So, if we have this C# example:

    while (nextPage != true)
    {
        if ((lineCount <= linesPerPage) && (status != Status.Cancelled) && (morePages == true))
        {
            // ...
        }
    }

In the code above, we start with 1 for the routine, add 1 for the while, add 1 for the if, and add 1 for each && for a total calculated complexity of 5.

Anything with a greater complexity than 10 or so is an excellent candidate for simplification and refactoring.

Minimizing complexity is a great goal for writing high-quality, maintainable code.
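The counting rule is simple enough to automate. The sketch below applies it to Python source using the standard `ast` module. It is a simplified counter for illustration (it ignores case-style constructs such as `match`), not a replacement for a real metrics tool, and the names `cyclomatic_complexity`, `needs_review` and `EXAMPLE` are hypothetical. `EXAMPLE` is a Python transcription of the C# loop above.

```python
import ast


def cyclomatic_complexity(source: str) -> int:
    """Count decision points using the rule quoted above."""
    score = 1  # the straight path through the routine
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.If, ast.While, ast.For)):
            score += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" holds three values but only two operators
            score += len(node.values) - 1
    return score


def needs_review(source: str, threshold: int = 10) -> bool:
    """Flag code whose score exceeds the rule-of-thumb threshold."""
    return cyclomatic_complexity(source) > threshold


EXAMPLE = """
while not next_page:
    if line_count <= lines_per_page and status != "cancelled" and more_pages:
        pass
"""
```

For `EXAMPLE` the counter gives 5, matching the manual calculation: 1 for the routine, 1 for the `while`, 1 for the `if`, and 1 for each of the two `and` operators.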


Some advantages of McCabe's Cyclomatic Complexity include:

- It is very easy to compute, as illustrated in the example
- It can be computed immediately in the development lifecycle (which makes it Agile-friendly)
- It provides a good indicator of the ease of code maintenance
- It can help focus testing efforts
- It makes it easy to find complex code for formal review


Formal code review coupled with complexity measurements provides a very compelling technique for quality improvement, and it is something that can easily be adopted by an Agile team.

So, what can you do to implement this technique for your project?
- Find a tool that computes code metrics (specifically complexity) for your language and toolset
- Schedule the tool so that it automatically runs and captures metrics every day
- Use the code complexity measurement to help identify candidates for formal code review
- Capture the results of the code review and monitor their follow-up (too many teams forget about the follow-up)


Programmers love to write loops! Loops can significantly increase code complexity while decreasing code quality.

Common mistakes to avoid in loops are:
- Memory allocations in loops
- Variable declarations in loops
- Function calls that just return a constant value

Loops

Example: in this for loop, the array's count method is called on every iteration, although its value remains fixed throughout the execution of the loop.

    for (int i = 0; i < [landPatchesArray count]; i++)
        for (int j = 0; ………) {}

Alternatively:

    int i = 0;
    int j = 0;
    int landPatchArrayCount = (int)[landPatchesArray count];
    for (i = 0; i < landPatchArrayCount; i++)
        for (j = 0; ………) {}
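The effect of hoisting the call out of the loop condition can be shown with a runnable Python sketch. `LandMap` and its `count_calls` counter are hypothetical and exist only to show how often the count method runs.

```python
class LandMap:
    """Hypothetical container whose count() call we want to avoid repeating."""

    def __init__(self, patches):
        self._patches = list(patches)
        self.count_calls = 0  # instrumentation for this example only

    def count(self):
        self.count_calls += 1
        return len(self._patches)


def visit_naive(land_map):
    """Calls count() in the loop condition, so once per iteration plus one."""
    visited = 0
    i = 0
    while i < land_map.count():
        visited += 1
        i += 1
    return visited


def visit_hoisted(land_map):
    """Calls count() once and reuses the result."""
    patch_count = land_map.count()
    visited = 0
    for _ in range(patch_count):
        visited += 1
    return visited
```

With five patches the naive version calls `count()` six times (five passing checks and the final failing one), while the hoisted version calls it once.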

Loops (Cont’d)

Pay careful attention to memory allocation and deallocation.

Always run a profiler to check for memory leaks.

Memory Leaks

It's difficult to think of applications that do not use multiple threads.

Threads are horrible:
- They tend to make the behavior of your programs unpredictable.
- As soon as you notice unpredictable behavior, you start putting synchronized blocks around your code and data.

Developers often start creating threads without ever thinking about organizing or managing them.

Centralized Threading Model

Grand Central Dispatch is a mechanism for controlling multiple threads of execution.

It is an implementation of task parallelism based on the thread pool pattern.

GCD works by allowing specific tasks in a program that can be run in parallel to be queued up for execution and, depending on availability of processing resources, scheduling them to execute on any of the available processor cores.
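GCD itself is an Apple technology, but the underlying thread pool pattern can be sketched in Python with the standard `concurrent.futures` module. This is an analogue of the idea, not GCD's API; `_POOL`, `submit`, `square` and `run_all` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor

# One shared pool for the whole application instead of ad-hoc threads.
_POOL = ThreadPoolExecutor(max_workers=4, thread_name_prefix="app-worker")


def submit(task, *args):
    """Queue a task on the shared pool; returns a Future for the result."""
    return _POOL.submit(task, *args)


def square(n):
    return n * n


def run_all(numbers):
    """Queue one task per number and collect the results in input order."""
    futures = [submit(square, n) for n in numbers]
    return [future.result() for future in futures]
```

Modules call `submit` rather than creating threads themselves, so the number of threads and their lifetime are decided in one place.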

Centralized Threading Model (Cont’d)

Centralized server communication is important because we need to ensure consistent behavior of the system when updating or retrieving its state from the server.

A common mistake is to allow individual modules to update their own state with the server.

Example: in one scenario, inconsistent game state updates resulted from non-adherence to the single responsibility principle. Many classes were managing their own NSURLConnections, and none of them did much error handling or had robust retry logic.
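A minimal Python sketch of the centralized alternative: one gateway object owns retries and error handling, and every module goes through it. `ServerClient`, `ServerUnavailable` and the injected `transport` callable are hypothetical; the real network call is deliberately left out.

```python
import time


class ServerUnavailable(Exception):
    """Raised when every attempt to reach the server has failed."""


class ServerClient:
    """Hypothetical single gateway for all server communication."""

    def __init__(self, transport, max_attempts=3, backoff_seconds=0.0):
        self._transport = transport  # callable(endpoint, payload) -> response
        self._max_attempts = max_attempts
        self._backoff_seconds = backoff_seconds

    def send(self, endpoint, payload):
        """Send a request, retrying on connection errors."""
        last_error = None
        for attempt in range(1, self._max_attempts + 1):
            try:
                return self._transport(endpoint, payload)
            except ConnectionError as exc:
                last_error = exc
                if attempt < self._max_attempts:
                    # Wait a little longer before each further attempt.
                    time.sleep(self._backoff_seconds * attempt)
        raise ServerUnavailable(
            f"{endpoint} failed after {self._max_attempts} attempts"
        ) from last_error
```

Because the retry policy lives in one class, changing it (or adding logging, queuing or authentication) does not require touching every module that talks to the server.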

Centralized Server Communication

Pay careful attention to optimizing use of Images in terms of file access, storage, and draw calls.

PNG and JPG files are relatively expensive to load, because the data must be parsed into bitmap form on the fly at load time.

iOS-specific image formats can come to your rescue in such a scenario.

One of the better formats is pvr.ccz, which cocos2d natively supports.

pvr.ccz images are faster to read from disk, and require no parsing to load into memory.

Use sprite sheets to minimize image loading and to render faster through sprite batching.

Optimizing Image Use

Static code analysis is the process of detecting errors and defects in software's source code without executing it.

Static analysis can be viewed as an automated code review process.
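To give a feel for what such a tool does, here is a toy Python checker built on the standard `ast` module. It reports statements that directly follow a `return`, `raise`, `break` or `continue` in the same block, which is one simple form of dead code. `find_unreachable` and `SAMPLE` are hypothetical names, and real analyzers perform far deeper checks.

```python
import ast

_TERMINATORS = (ast.Return, ast.Raise, ast.Break, ast.Continue)


def find_unreachable(source: str) -> list:
    """Return line numbers of statements placed right after a terminator."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        for field in ("body", "orelse", "finalbody"):
            block = getattr(node, field, None)
            if not isinstance(block, list):
                continue  # e.g. a lambda body is an expression, not a block
            for previous, current in zip(block, block[1:]):
                if isinstance(previous, _TERMINATORS):
                    findings.append(current.lineno)
    return sorted(findings)


SAMPLE = '''
def price(total):
    return total * 2
    print("never runs")
'''
```

The code is inspected without being run: `find_unreachable(SAMPLE)` reports line 4, the `print` call that can never execute.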

Static Code Analysis

Static analysis is usually poor at diagnosing memory leaks and concurrency errors.

To detect such errors, the tool effectively needs to execute part of the program virtually, which requires extra effort and care.

A static analysis tool warns you about suspicious fragments, even though the code may actually be correct.

Such warnings are called false positives.

Disadvantages of Static Code Analysis

Profiling is a form of dynamic program analysis that measures, for example, the usage of memory, the usage of particular instructions, or frequency and duration of function calls.

The most common use of profiling information is to aid program optimization.

Profiling is achieved by instrumenting either the program source code or its binary executable form using a tool called a profiler (or code profiler).
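As a small illustration, Python ships a profiler in its standard library. The sketch below runs a function under `cProfile` and returns a text report of call counts and durations. `slow_sum` and `profile_call` are hypothetical names for this example.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Deliberately simple workload to profile."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def profile_call(func, *args):
    """Run func under cProfile; return its result and a text report."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(5)  # top 5 entries
    return result, stream.getvalue()
```

Calling `profile_call(slow_sum, 100_000)` returns the computed sum together with a report whose rows show how often each function was called and how long it took, which is the information used to decide where optimization is worthwhile.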

Profiling

http://www.erlang.se/doc/programming_rules.shtml#REF32141

http://blogs.msdn.com/b/mswanson/archive/2004/06/12/154460.aspx

http://en.wikipedia.org/wiki/Cyclomatic_complexity

Links









