Unit Testing and Performance Using Entity Framework 4.0

Tommy Hörnlund

January 24, 2013
Master's Thesis in Computing Science, 30 credits

Supervisor at CS-UmU: Jan-Erik Moström
Examiner: Fredrik Georgsson

Umeå University
Department of Computing Science
SE-901 87 UMEÅ
SWEDEN


Abstract

POANGEN is a web application for rent management. The core of the application is a module that performs rent calculations. In the past the application relied heavily on business logic in stored procedures, which made the program hard to test and maintain.

The purpose of this thesis was to find a new method for combining unit testing and data access. A new implementation of the rent calculation had to be created that was easier to test and maintain while still performing well.

This thesis shows how to combine data access and unit tests using Entity Framework 4.0, an object-relational mapping framework from Microsoft. The new module uses the Repository and Specification design patterns to create a data abstraction that is suitable for unit testing.

The performance of Entity Framework 4.0 is also evaluated and compared to traditional data loading. The evaluation shows that Entity Framework 4.0 is severely lacking in performance when loading or saving large amounts of data. However, the use of POCO entities makes it possible to create optimized functionality for time-critical data access.


Contents

1 Introduction
    1.1 Overview
    1.2 Problem Statement
    1.3 Goals & Purpose
2 POANGEN
    2.1 Utility principle
    2.2 Properties
    2.3 Residential unit
    2.4 Apartment
    2.5 Model
    2.6 Score
    2.7 Rent
    2.8 Example
3 Entity Framework
    3.1 Entity Data Model
    3.2 Mapping
    3.3 LINQ
    3.4 Loading Related Entities
    3.5 Change Tracking
4 Testability
    4.1 Repository
    4.2 Unit of Work
    4.3 POCO
    4.4 Mocking
    4.5 Inversion of Control
    4.6 Unit Testing
5 Result
    5.1 Overview
    5.2 Entity Data Model
    5.3 POCO
    5.4 Specification
    5.5 FetchStrategy
    5.6 Repository
    5.7 Calculation
    5.8 Dependencies
    5.9 Data Access
    5.10 Data Persistence
    5.11 Unit Tests
        5.11.1 Testing Data Access
        5.11.2 Test Data
        5.11.3 Mocking
        5.11.4 Example
6 Performance
    6.1 Test Data
    6.2 Test Application
    6.3 Execution
    6.4 Result
        6.4.1 Calculation time
        6.4.2 Memory Usage
        6.4.3 Persistence
        6.4.4 Legacy Calculator
7 Conclusions
    7.1 Limitations
    7.2 Future Work
8 Acknowledgements
References

List of Figures

3.1 An example database diagram
3.2 The entities that are mapped to the database tables in figure 3.1
5.1 Conceptual overview of the system
5.2 The real and the mock context implement the same interface
5.3 Calculation module dependencies
6.1 Data loading comparison
6.2 Memory usage comparison
6.3 Entity Framework persistence performance
6.4 Entity Framework persistence memory usage
6.5 Comparison with legacy rent calculator

List of Tables

2.1 Common apartment properties
2.2 An example model
2.3 Two apartments with different property values
2.4 Property values converted to score
2.5 Formula calculated score
2.6 Example

Listings

3.1 Using LINQ to query a list of integers
3.2 Using LINQ to query an entity collection
3.3 Loading related entities by including them in the query
3.4 Explicitly loading related entities after the query
3.5 Loading related entities with lazy loading enabled
3.6 Loading related entities before the query is executed
3.7 Saving changes to the context
4.1 Regular dependency
4.2 Inversion of Control
5.1 The calculation module interface
5.2 The specification interface
5.3 An excerpt from the generic repository interface
5.4 Ordinary object instantiation
5.5 Dependency injection using a factory lambda expression
5.6 Specification for an active apartment
5.7 Unit testing the Specification for an active apartment
5.8 Mock example
5.9 IFormulaCalculator interface
5.10 IModelLayoutCalculator interface excerpt
5.11 Example unit test

Chapter 1

Introduction

TRIMMA is an IT company focusing on solutions for business management and decision-making software. One of its major products is POANGEN (translates to "Score"), a complete solution for rent management according to the utility principle. An explanation of the utility principle can be found in section 2.1.

POANGEN started out as a small and simple application, but it has quickly grown in both functionality and number of customers. The application is becoming harder and harder to maintain and develop, and new development methods have to be found in order to ensure the future of POANGEN.

A core part of the application is the rent calculation module. It is a data-heavy process where large amounts of data are brought together to calculate the rent for each apartment. The current implementation suffers from several issues.

The first problem is that it is hard to test the correctness of the module. The only evidence that it works correctly is that it has worked in the past. This means that the module cannot easily be extended, because its correctness cannot be verified after a change is made.

Another issue is performance. As customers grow larger, the system cannot handle the tens of thousands of apartments these companies manage. If a new system design is proposed, it has to be highly efficient.

1.1 Overview

Chapter 1 contains the background and problem statement.

Chapter 2 is a brief introduction to the problem domain.

Chapter 3 is an overview of Entity Framework, the object-relational mapping framework used in this project.

Chapter 4 contains a theoretical study about testability and Entity Framework.

Chapter 5 describes the final application that was created.

Chapter 6 contains a performance study on the resulting application.

Chapter 7 contains the discussion of the results and conclusions.


1.2 Problem Statement

The task is to create a new rent calculation module with automated testing. The possibility to add new features should be taken into consideration, as well as the performance of the calculation.

A large part of the problem is that the existing calculation logic resides in the database in the form of stored procedures. It is not possible with the current tools to test this functionality automatically. If a change is made, there is no safeguard ensuring that the previous functionality is not altered. The problem is to move this logic to the application code without losing performance.

Presently POANGEN is using an object-relational mapping (ORM) framework known as dOOdads[16]. This framework, however, is obsolete and no longer maintained. Therefore another ORM framework will be used: Microsoft's Entity Framework 4.0[14]. Part of the project will be to evaluate the testability and performance of Entity Framework.

1.3 Goals & Purpose

The purpose is to develop a method to incorporate automated testing in the development process. The method should be evaluated to determine if it is suitable for the existing application as well as for future development. The goal is to develop software that is easier to maintain and modify, without introducing new bugs in the existing functionality.

Chapter 2

POANGEN

This chapter is a brief introduction to the utility principle. The utility principle is the underlying principle of POANGEN, a complete solution for rent management.

2.1 Utility principle

The idea behind the utility principle is to provide an alternative to market prices. The difference in price between apartments should be easily explainable by differences in the standard of the apartments[3]. In Sweden, landlords are required by law to apply the utility principle[7].

The method used in POANGEN is based on work by the Swedish Association of Public Housing Companies (SABO)[3]. The basic idea is that a score is calculated for each apartment and then the current rent is redistributed in proportion to the scores to calculate the new rent. This means that the total income remains constant while some apartments get increased rent and others decreased.

2.2 Properties

To describe the different aspects of the apartments a number of properties are defined. Properties can be of different data types: string, numeric, boolean and predefined values. The most common properties have been defined by SABO[3], while others have to be chosen based on experience. Some of the common properties can be seen in table 2.1. Properties can also be undefined, represented by a null value.

Table 2.1: Common apartment properties.

Name            Data type   Possible values
Area            Numeric     Real numbers
Apartment type  Predefined  1ROK, 2ROK, 3ROK
Balcony         Boolean     yes or no
Address         String      Any string
Location        Predefined  A, B, C


2.3 Residential unit

Apartments are contained in groups called residential units. A unit usually consists of a single building or several very similar buildings. Properties such as building year, surroundings and distance to different services can be associated with residential units.

2.4 Apartment

The apartment score is calculated from the number of rooms and the type of kitchen. For example, one room and a kitchenette receives 24 points, while two rooms and a regular kitchen receives 40 points. This score is added to the area of the apartment, and the sum becomes the apartment score (the A score in section 2.6).

The apartment score assumes that the apartment has all the standard equipment of a regular apartment. If some part of the apartment differs from the standard, an adjustment score has to be added. For example, if the balcony is extra large the apartment should gain extra points, while if the balcony is extra small or missing the apartment should lose points. The different types of scores are described in section 2.6.

2.5 Model

A model is a mapping from properties to points; different property values can be assigned different points. An example model can be seen in table 2.2. Each property is also assigned a formula alias. All points with the same formula alias are summed and substituted into the corresponding variable in the formula, described in the next section.

Table 2.2: An example model. The apartment type has a fixed score for each possible value. The area property is a numerical property so it can be directly converted to points.

Formula alias  Property        Value  Points
A              Apartment type  1ROK   34
A              Apartment type  2ROK   40
A              Apartment type  3ROK   44
A              Area            X      X
B              Balcony         Yes    0
B              Balcony         No     -5
C              Location        A      35
C              Location        B      33
C              Location        C      21
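The lookup that a model performs can be sketched in C# as follows. This is a minimal illustration and not the POANGEN implementation: the `ModelRow` and `ExampleModel` names are hypothetical, the rows are copied from table 2.2, and treating an undefined property as contributing no points is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical row type mirroring the columns of table 2.2.
public sealed class ModelRow
{
    public string Alias { get; set; }     // formula alias: "A", "B" or "C"
    public string Property { get; set; }  // property name
    public string Value { get; set; }     // null marks a numeric property (value == points)
    public decimal Points { get; set; }   // fixed points for a predefined value
}

public static class ExampleModel
{
    // The rows of table 2.2.
    public static readonly List<ModelRow> Rows = new List<ModelRow>
    {
        new ModelRow { Alias = "A", Property = "Apartment type", Value = "1ROK", Points = 34 },
        new ModelRow { Alias = "A", Property = "Apartment type", Value = "2ROK", Points = 40 },
        new ModelRow { Alias = "A", Property = "Apartment type", Value = "3ROK", Points = 44 },
        new ModelRow { Alias = "A", Property = "Area", Value = null },
        new ModelRow { Alias = "B", Property = "Balcony", Value = "Yes", Points = 0 },
        new ModelRow { Alias = "B", Property = "Balcony", Value = "No", Points = -5 },
        new ModelRow { Alias = "C", Property = "Location", Value = "A", Points = 35 },
        new ModelRow { Alias = "C", Property = "Location", Value = "B", Points = 33 },
        new ModelRow { Alias = "C", Property = "Location", Value = "C", Points = 21 }
    };

    // Sums the points per formula alias for one apartment.
    // Assumption: undefined (null or missing) properties contribute no points.
    public static Dictionary<string, decimal> Score(IDictionary<string, string> apartment)
    {
        var sums = new Dictionary<string, decimal> { { "A", 0m }, { "B", 0m }, { "C", 0m } };
        foreach (var row in Rows)
        {
            string value;
            if (!apartment.TryGetValue(row.Property, out value) || value == null)
                continue;

            if (row.Value == null)
                sums[row.Alias] += decimal.Parse(value, CultureInfo.InvariantCulture);
            else if (row.Value == value)
                sums[row.Alias] += row.Points;
        }
        return sums;
    }
}
```

An apartment with area 32, type 2ROK and a balcony gets A = 40 + 32 = 72 and B = 0; with no location defined, C stays at 0.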


2.6 Score

There are three different types of points in the model. The A score is known as the apartment score. It is a measure of the size and number of rooms in the apartment. Each apartment is assumed to satisfy the minimum standard requirements. For example, rooms should have at least one window and heating should be included in the rent.

The second type is the B score, called the adjustment score. If the apartment differs from a standard apartment an adjustment has to be made. For example, if an apartment has no balcony a negative score will be added to the B score.

The final type is the C score, called the residential unit score. This is the score for all properties that are shared by all apartments in a residential unit. The desirability of the building location can be one such property.

The total score is calculated by taking the residential unit score C and adding 100. This sum is then multiplied by the apartment score A. Finally the adjustment score B is multiplied by 100 and added to the product. The final formula can be seen in equation 2.1.

\[ \text{Total score} = (100 + C) \times A + (100 \times B) \tag{2.1} \]

2.7 Rent

To convert from points to rent, the total income from all apartments is divided by the sum of the scores for all apartments, see equation 2.2.

\[ \text{Factor} = \frac{\text{Total income}}{\text{Sum of all scores}} \tag{2.2} \]

The resulting factor is then multiplied by the score of the apartment to calculate the rent, equation 2.3.

\[ \text{Apartment rent} = \text{Score} \times \text{Factor} \tag{2.3} \]
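Equations 2.1 to 2.3 translate almost directly into code. The following C# sketch is only an illustration of the formulas: the `RentFormulas` class and its method names are hypothetical and are not part of the POANGEN code base, and rejecting a zero score sum is an assumption about how an empty or degenerate input should be handled.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RentFormulas
{
    // Equation 2.1: total score from the A, B and C scores.
    public static decimal TotalScore(decimal a, decimal b, decimal c)
    {
        return (100m + c) * a + 100m * b;
    }

    // Equations 2.2 and 2.3: redistribute the total income in proportion to the scores.
    public static List<decimal> Redistribute(IList<decimal> scores, decimal totalIncome)
    {
        decimal sum = scores.Sum();
        if (sum == 0m)
            throw new ArgumentException("The scores must not sum to zero.", "scores");

        decimal factor = totalIncome / sum;              // equation 2.2
        return scores.Select(s => s * factor).ToList();  // equation 2.3
    }
}
```

Because every rent is the same factor times a score, the rents returned by `Redistribute` add up to the total income again, apart from any rounding applied afterwards.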

2.8 Example

An example is two apartments having the same rent of 3000 SEK, but different standard. With the utility principle the rent should be redistributed to reflect the differences between the apartments.

Table 2.3: Two apartments with different property values.

         Value
         Apartment one  Apartment two
Area     32             30
Type     2ROK           1ROK
Balcony  Yes            No


The first apartment in table 2.3 has an area of 32 m², two rooms and a kitchen (2ROK) and a balcony. The second apartment has an area of 30 m², one room and a kitchen (1ROK) but no balcony. Using the model in table 2.2 the properties can be converted to points, as shown in table 2.4.

Table 2.4: The score for each property in table 2.3.

         Points
         Apartment one  Apartment two
Area     32             30
Type     40             34
Balcony  0              -5

Again using the model, the A, B and C scores can be calculated, as shown in table 2.5. The final score is calculated by using equation 2.1.

Table 2.5: The score for the two apartments. The total score is calculated using the formula (100 + C) × A + (100 × B).

       Points
       Apartment one  Apartment two
A      72             64
B      0              -5
C      0              0
Total  7200           5900

Apartment one's type is worth 40 points. Added to the area, this gives an A score of 72. The total score becomes (100 + 0) × 72 + (100 × 0) = 7200. The second apartment receives (100 + 0) × 64 + (100 × −5) = 5900.

The factor becomes

\[ \text{Factor} = \frac{3000 + 3000}{7200 + 5900} = \frac{60}{131} \]

Now the factor can be multiplied by the new score to calculate the new rent.

Table 2.6: Example

          Apartment one  Apartment two
Points    7200           5900
Old rent  3000 SEK       3000 SEK
New rent  3298 SEK       2702 SEK

Note that the sum of the new rents is the same as the sum of the old rents. The total rent has been redistributed to better reflect the utility of the apartments.
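Written out, the new rents in table 2.6 follow from equation 2.3 with the factor 60/131, rounded to whole SEK. The intermediate fractions and decimals below are computed here for illustration and do not appear in the tables:

```latex
\begin{align*}
\text{Rent}_1 &= 7200 \times \tfrac{60}{131} = \tfrac{432000}{131} \approx 3297.71 \approx 3298 \text{ SEK} \\
\text{Rent}_2 &= 5900 \times \tfrac{60}{131} = \tfrac{354000}{131} \approx 2702.29 \approx 2702 \text{ SEK} \\
\text{Rent}_1 + \text{Rent}_2 &= 13100 \times \tfrac{60}{131} = 6000 \text{ SEK}
\end{align*}
```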

Chapter 3

Entity Framework

According to the requirements Entity Framework had to be used, the reason being that the framework was already in use for other projects at the company. Entity Framework is an object-relational mapping framework from Microsoft that is included in the .NET framework[14]. The version used for this project is Entity Framework 4.0.

The basic idea of Entity Framework is to eliminate the impedance mismatch between business logic and data representation. This is done using the Entity Data Model (EDM).

3.1 Entity Data Model

The Entity Data Model has two basic components.

– Entities are strictly typed data structures that contain the record data and an identifier key.

– Relationships are associations between entities.

More advanced features of the EDM are inheritance and complex types[14], but these are not used in this project.

The Entity Data Model should be created to reflect the structure of the business objects used in the application. It may be necessary to create different data models for different parts of the application, while still using the same database.

3.2 Mapping

To populate the data model with data from an actual relational database a mapping has to be created. Entities can be mapped to database tables, but several tables can also map to a single entity or a table can be split up into several entities. In figure 3.1 the Employee and ContactInfo tables are combined into a single entity. The mapped entities can be seen in figure 3.2. The Company entity can be accessed from a property on the Employee entity, and the Company entity contains a list of all employees associated with the company.

When accessing the entities in the application code, Entity Framework will automatically fetch data from the database and populate the in-memory data structure.


[Figure 3.1: database diagram with three tables]
Employee table: EmployeeID (PK), Salary, ContactInfoID (FK1), CompanyID (FK2)
ContactInfo table: ContactInfoID (PK), Name, Adress, Phone
Company table: CompanyID (PK), Name

Figure 3.1: An example database diagram. The dashed boxes show the entities that the database tables are mapped to.

[Figure 3.2: class diagram with two entities]
Employee: +Salary : decimal, +Name : string, +Adress : string, +Phone : string, +Company : Company
Company: +Name : string, +Employees : List<Employee>
Association: Company (1) to Employee (0..*)

Figure 3.2: The entities that are mapped to the database tables in figure 3.1.


3.3 LINQ

Instead of using SQL query strings to query the entity model, the C# language has introduced a feature called Language Integrated Query (LINQ). LINQ can be used to query a number of different data sources, such as databases, collections, XML documents and entity models, using the same syntax. In listing 3.1 a LINQ query is made against a list of numbers. All numbers less than or equal to five are selected and sorted in ascending order.

Listing 3.1: Using LINQ to query a list of integers.

    int[] numbers = new int[] {5, 7, 1, 4, 9, 3, 2, 6, 8};
    var smallnumbers = from n in numbers
                       where n <= 5
                       orderby n
                       select n;
    foreach (var n in smallnumbers) {
        Console.Write(n);
    }

    OUTPUT: 12345
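The C# compiler translates query syntax into calls to the standard query operators, so the query in listing 3.1 can also be written in method syntax. The sketch below wraps it in a hypothetical `SmallNumbers` class so that it compiles on its own; the class and method names are not part of the thesis.

```csharp
using System;
using System.Linq;

public static class SmallNumbers
{
    // Method-syntax equivalent of the query in listing 3.1.
    public static int[] UpToFive(int[] numbers)
    {
        return numbers.Where(n => n <= 5)   // where n <= 5
                      .OrderBy(n => n)      // orderby n
                      .ToArray();           // run the query and copy the result
    }
}
```

Unlike the query in listing 3.1, which is executed when the foreach loop enumerates it, the `ToArray` call here runs the query immediately.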

The same syntax is used to load entities from the database, and when used with entities it is usually referred to as LINQ to Entities. Each entity type is represented as a collection and all collections are contained in an ObjectContext class. The object context acts as a repository and a unit of work, concepts described in chapter 4. For now the important thing is that entities are accessed through a collection, in the same way as the numbers in listing 3.1.

Listing 3.2 shows an example where LINQ is used to query the company context for all employees named "Bob".

Listing 3.2: Using LINQ to query an entity collection.

    CompanyContext companyContext = new CompanyContext();
    var bobs = from e in companyContext.Employees
               where e.Name == "Bob"
               select e;


3.4 Loading Related Entities

When loading an entity, related entities can be loaded as well, as defined by the relationships in the EDM. There are several ways related entities can be loaded.

1. Specified in the query

2. Explicit loading

3. Lazy loading

4. Eager loading

The first method is used in listing 3.3. It references the related fields in the query and selects them.

Listing 3.3: Loading related entities by including them in the query.

    var result = from e in companyContext.Employees
                 select new { Name = e.Name, Company = e.Company.Name };

In listing 3.4 the employee entity is loaded first, then the Company navigation property of the employee is explicitly loaded. The First() method returns only the first employee in the result set. This method requires two round trips to the database to retrieve the data.

Listing 3.4: Explicitly loading related entities after the query.

    var employee = (from e in companyContext.Employees
                    select e).First();
    employee.Company.Load();

If the lazy loading option is enabled in Entity Framework there is no need to explicitly load the related entity; it is automatically loaded when it is accessed, as in listing 3.5. This too requires two round trips to the database, and care must be taken when accessing a navigation property. If, for example, lazy loading happens inside a loop iterating over a list of employees, an SQL query will be executed for every iteration.

Listing 3.5: Loading related entities with lazy loading enabled.

    var employee = (from e in companyContext.Employees
                    select e).First();
    Company company = employee.Company;
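The cost of lazy loading inside a loop can be illustrated without a database. The sketch below is a simulation and not Entity Framework code: the `FakeDatabase` and `LazyEmployee` classes are hypothetical, and the `QueryCount` counter stands in for the SQL queries that a real context would send.

```csharp
using System;
using System.Collections.Generic;

// Stand-in for the database; counts how many "queries" are executed.
public sealed class FakeDatabase
{
    public int QueryCount { get; private set; }

    public List<LazyEmployee> LoadEmployees(int count)
    {
        QueryCount++; // one query for the whole employee list
        var result = new List<LazyEmployee>();
        for (int i = 0; i < count; i++)
            result.Add(new LazyEmployee(this, "Employee " + i));
        return result;
    }

    public string LoadCompanyName()
    {
        QueryCount++; // one query per lazily loaded navigation property
        return "Example Company";
    }
}

public sealed class LazyEmployee
{
    private readonly FakeDatabase database;
    private string companyName; // null until first access

    public LazyEmployee(FakeDatabase database, string name)
    {
        this.database = database;
        Name = name;
    }

    public string Name { get; private set; }

    // Simulated navigation property: loads on first access, then caches the value.
    public string CompanyName
    {
        get { return companyName ?? (companyName = database.LoadCompanyName()); }
    }
}
```

In this simulation, loading 100 employees and reading `CompanyName` for each of them inside a loop leaves `QueryCount` at 101: one query for the list and one per employee.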

The final method in listing 3.6 is eager loading. Here the related Company entity is included just after the LINQ query. This only creates a single SQL query joining the tables together.

Listing 3.6: Loading related entities before the query is executed.

    var result = (from e in companyContext.Employees
                  select e).Include("Company");


3.5 Change Tracking

When a change is made to an entity the change is automatically tracked by Entity Framework. To persist the changes to the database the SaveChanges method is called on the context object, as in listing 3.7.

Listing 3.7: Persisting the entity changes made to the object context.

    var employee = (from e in companyContext.Employees
                    select e).First();
    employee.Salary += 1000;
    companyContext.SaveChanges();


Chapter 4

Testability

The focus of the study has been on the subject of testability, specifically how to unit test data access code.

Because one of the requirements was to use Microsoft Entity Framework 4.0, a lot of effort was put into finding information about testability when using EF4. In an article published on MSDN, Scott Allen demonstrates some common unit testing techniques that can be applied to EF4 [1]. Allen argues that extensive unit testing is a valuable tool for developer teams. However, the effort required to create these unit tests depends on the testability of the code. Therefore Entity Framework 4.0 was designed with testability in mind.

Allen presents two metrics that will always be exhibited by highly testable code. The first one is observability. If a method is observable, it is easy to observe the output of the method for a given input. Methods with many side effects are hard to observe.

The other metric is isolation. When you unit test a method you only want to test the logic inside the method. But if the method depends on some external resource, for example a network socket or a database, the unit test might fail if the resource is offline. The resource might also take a very long time to respond, causing the automated test to take a very long time to run. To achieve testable code a separation of concerns should be maintained. This concept was termed the Single responsibility principle by Robert C. Martin [11]. It is based on the concept of cohesion and can be summarized as: "There should never be more than one reason for a class to change". In this case the logic should reside in one class or module and the external resource access should reside in another. They can then be unit tested in isolation.

The metrics presented by Allen are very basic, but they can easily be applied to any newly developed code. The concepts will also recur when other patterns are discussed, so these metrics will be used to evaluate the resulting code of the project.

4.1 Repository

Allen goes on to explain some common abstractions that are useful for abstracting data persistence. One very common abstraction is the Repository pattern. This design pattern has been documented by Martin Fowler in his book Patterns of Enterprise Application Architecture[5] and a short overview of the pattern can also be found on his website[6]. The repository pattern is very commonly used, both for unit testing and for other purposes.

According to Fowler a repository "mediates between the domain and data mapping layers using a collection-like interface for accessing domain objects". Allen says that this isolates the details of the data persistence, and that it fulfills the isolation principle required for testability. However, he also adds that the interface for a repository shouldn't contain any operations for actually persisting the objects back to the data source. In the spirit of the single responsibility principle, a separate structure should be used, and he presents the Unit of Work pattern.
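A collection-like repository interface and an in-memory implementation for tests might look as follows. This is a sketch of the general pattern only: the `IRepository<T>` and `InMemoryRepository<T>` names and their members are assumptions made for illustration, and they are taken neither from Allen's article nor from the repository interface of this project. In line with the text above, the interface has no operation for saving changes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical collection-like repository interface.
public interface IRepository<T> where T : class
{
    IQueryable<T> FindAll();
    IQueryable<T> FindWhere(Expression<Func<T, bool>> predicate);
    void Add(T entity);
    void Remove(T entity);
}

// In-memory implementation intended for unit tests; no database is involved.
public sealed class InMemoryRepository<T> : IRepository<T> where T : class
{
    private readonly List<T> items = new List<T>();

    public IQueryable<T> FindAll()
    {
        return items.AsQueryable();
    }

    public IQueryable<T> FindWhere(Expression<Func<T, bool>> predicate)
    {
        return items.AsQueryable().Where(predicate);
    }

    public void Add(T entity)
    {
        items.Add(entity);
    }

    public void Remove(T entity)
    {
        items.Remove(entity);
    }
}
```

Code that depends only on `IRepository<T>` can be given the in-memory version in a unit test and a database-backed version in production.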

4.2 Unit of Work

The Unit of Work pattern is also described by Martin Fowler in his book Patterns of Enterprise Application Architecture[5] and on his website[6]. Allen mentions that the unit of work pattern should be familiar to .NET developers, because it has been used in the ADO.NET DataSet class. The DataSet has the ability to handle update, delete and insert operations on database table rows. It is, however, tightly coupled to database logic. The goal is to isolate the specifics of data persistence. This is why Allen argues that the unit of work pattern is required.

The default behaviour in Entity Framework 4.0 is to create a class extending the ObjectContext class. The object context serves both as a repository for generated entities and as a unit of work. There is no interface defined for the object context, which is a problem when you want to achieve isolation and testability. Fortunately there exist several extensions for generating entities which can be used instead of the default code generator. Allen uses a template that generates POCO (Plain Old CLR Objects).

4.3 POCO

The POCO concept originates from the Java POJO classes (Plain Old Java Objects). The POCO object is independent of the data source and contains only data and business logic. This is known as Persistence Ignorance. According to Allen, POCO classes are easier to test than entities that include information about persistence.

Julie Lerman mentions two flavours of POCO entities on her blog[9], which contains the same information as her book Programming Entity Framework[10]. The first type is Data Transfer Objects (DTO), which are unable to notify the object context of any changes made to the entities. Changes are only checked for just before a commit is made on the context. Lazy loading is not possible with this POCO type.

If all the properties and associations of the POCO class are declared virtual, a second type of POCO entity is possible. When the context creates the POCO entity it actually creates a proxy class that overrides the members of the POCO class and provides feedback to the context when the POCO is manipulated. This makes it possible for the context to intercept access to an association that is not yet loaded. The association can then be lazily loaded on demand.
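The proxy mechanism can be imitated by hand to show why the members must be virtual. In the sketch below, `EmployeePoco`, `EmployeeProxy` and the `onChanged` callback are hypothetical; Entity Framework generates its proxy types at run time, and this hand-written subclass only mirrors the idea.

```csharp
using System;

// A plain POCO entity: no reference to any persistence framework.
public class EmployeePoco
{
    public virtual string Name { get; set; }
    public virtual decimal Salary { get; set; }
}

// Hand-written stand-in for a proxy that would normally be generated at run time.
public class EmployeeProxy : EmployeePoco
{
    private readonly Action<string> onChanged;

    public EmployeeProxy(Action<string> onChanged)
    {
        this.onChanged = onChanged;
    }

    // Overriding is only possible because the base property is virtual.
    public override decimal Salary
    {
        get { return base.Salary; }
        set
        {
            base.Salary = value;
            onChanged("Salary"); // report the changed property to the "context"
        }
    }
}
```

Application code keeps working against `EmployeePoco`, while the subclass reports every change to `Salary` through the callback. A non-virtual property could not be intercepted in this way.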

One interesting detail is that Lerman puts the generated POCO entities in a separate class library, which makes it possible to create different applications that are connected only through the POCO entities.

4.4 Mocking

One of the biggest hurdles when starting with unit testing is how to isolate a unit before testing it. Tim Mackinnon, Steve Freeman and Philip Craig used mock objects to isolate units in their paper[20]. According to them, what makes unit testing hard is that the units are tested from the outside.


Using mock objects it is possible to test code in isolation. Mock objects replace parts of the application code with dummy classes that emulate the real objects, but provide a much simpler implementation that can be set up with data relevant to the unit tests. If the mock objects become too complex, this is an indication that the application code itself is too complex and requires refactoring.

4.5 Inversion of Control

In his examples Allen creates a class that depends on a unit of work class, as in listing 4.1. If the class instead depends on an interface, he can create another implementation of the unit of work that has no database connection and just uses hard-coded in-memory data; this is known as a fake class. To be able to switch implementation, the creation of the unit of work object is moved out of the constructor, and the object is instead passed to the constructor as a parameter, as in listing 4.2.

Listing 4.1: The Controller class depends on the UnitOfWork class.

    class Controller
    {
        UnitOfWork unitOfWork;

        Controller()
        {
            this.unitOfWork = new UnitOfWork();
        }
    }

Listing 4.2: Inversion of control is used to break the dependency.

    class Controller
    {
        IUnitOfWork unitOfWork;

        Controller(IUnitOfWork uow)
        {
            this.unitOfWork = uow;
        }
    }

This is a very simple implementation of a pattern known as Dependency Injection. As Allen mentions, this is only a simple example; a real project would use a more sophisticated method to automate the process of dependency injection.
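To make the benefit concrete, the sketch below extends the Controller of listing 4.2 with a fake unit of work. The ApartmentNames member, the FakeUnitOfWork class and the CountApartmentsInBuilding method are assumptions made for illustration and are not taken from Allen's article.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical unit of work interface; the member name is an assumption.
public interface IUnitOfWork
{
    IEnumerable<string> ApartmentNames { get; }
}

// Fake class: hard-coded in-memory data, no database connection.
public class FakeUnitOfWork : IUnitOfWork
{
    public IEnumerable<string> ApartmentNames
    {
        get { return new[] { "A-101", "A-102", "B-201" }; }
    }
}

public class Controller
{
    private readonly IUnitOfWork unitOfWork;

    // The dependency is handed in from the outside, as in listing 4.2.
    public Controller(IUnitOfWork uow)
    {
        this.unitOfWork = uow;
    }

    // Hypothetical logic that only needs the interface.
    public int CountApartmentsInBuilding(string prefix)
    {
        return unitOfWork.ApartmentNames
            .Count(name => name.StartsWith(prefix, StringComparison.Ordinal));
    }
}
```

A test constructs the controller with new FakeUnitOfWork(), while production code passes an implementation that is backed by the database.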

When creating the data needed in the unit tests, Allen creates a class that initializes test data intended to be used across multiple test suites. This is a design pattern known as Object Mother, described by Schuh and Punke[18]. They show that it can be a very useful pattern for unit tests that require data closely resembling real data. However, as mentioned by Martin Fowler[4], it creates a strong coupling between tests that use the same test data. A change to the test data required by one test might affect other tests. The pattern still seems very useful, but it is slightly outside the scope of this thesis.

4.6 Unit Testing

For unit testing Lerman uses mock contexts that implement the context interface[9]. Instead of accessing a database, the mock context returns mock object sets that read their data from an internal list of POCO entities. Several mock contexts are created, for example one with valid data and one with invalid data. This approach is similar to the Object Mother pattern mentioned in section 4.5, and it suffers from the same drawback: the tests become strongly coupled through the shared test data.

The practical use of testability is the ability to unit test the code. R. Venkat Rajendran writes in a paper[19] about the impact of testing in general and the benefits and drawbacks of unit testing. One of the benefits is the ability to test one part of the code without having to rely on other parts being available. This makes it possible for several programmers to work on and create unit tests simultaneously. Unit testing also makes it possible to debug a very confined piece of code, and to test special cases whose state is very hard to set up for the whole program. The overall structure of the code is also improved when unit testing is enforced. Unit testing is the most cost-effective type of testing, because it occurs in the early stages of development.

According to Rajendran, one of the drawbacks of unit testing is that it is boring. The solution is to provide better tools that automate repetitive tasks. Another problem is that test cases are rarely documented in practice, which makes it hard to modify existing test cases. Because many stubs have to be created for a unit test to function, the test code is in many cases larger than the production code. Stub code can have bugs as well.

Some of these drawbacks can be resolved, for example by enforcing code conventions that create self-documenting code. Also, if the code has high testability the unit tests will be less complex, reducing the number of bugs in the test code. The effort of writing full-coverage unit tests will always be great, and a careful decision has to be made whether the program is important enough to justify such an effort.

Chapter 5

Result

This chapter describes the final implementation of the application. The module has a service-oriented interface, shown in listing 5.1. There is a method for calculating the rent for a set of apartments, given the id of a model, and a method for calculating the rent for all apartments. There are also events for receiving feedback about the calculation progress.

Listing 5.1: The calculation module interface.

    public interface IObjectCalculationService
    {
        event PumaCalculationService.ObjectService.ObjectCalculationService.CalculationProgressHandler CalculationProgress;
        event PumaCalculationService.ObjectService.ObjectCalculationService.CalculationEventHandler CalculationEvent;

        void CalculateObjects(IEnumerable<int> objectIDs, int modelID, string calculationName);
        void CalculateAllObjects(int modelID, string calculationName);
    }

5.1 Overview

A conceptual overview of the architecture can be seen in figure 5.1. The business logic is separated from the data access layer and only depends on the POCO entities. The generic query repository uses specifications and fetch strategies to fetch entities from the context. The resulting entities can then be used in the business logic module. The data access and business logic are wrapped in a service layer that sits between the calculation module as a whole and the service consumer, in this case a web application.

Figure 5.1: Conceptual overview of the system. (Diagram components: Service, Logic, POCO, GenericRepository, Specification, FetchStrategy and Context.)

5.2 Entity Data Model

The core of the data access layer is the Entity Data Model. It is semi-automatically generated from the current development database and each table is directly mapped to an entity object. The foreign key relations are also included as associations between entities. Because of legacy artefacts in the database some minor adjustments have to be made to the data model; for example, relations without foreign key constraints have to be added manually.

A T4 template[12] is used to generate the context. T4 templates are a combination of program code and a scripting language that is used to output program code. A mock context and a mock object set are also generated, to allow mocking of dependencies in the unit tests. Figure 5.2 shows how the real context and the mock context implement the same interface. This allows for unit tests that replace the real data access with mock data access.

Figure 5.2: The real and the mock context implement the same interface. (Class diagram elements: the interface IPumaModelContext with Entities() : IObjectSet<Entity>, the classes PumaModelContext and PumaModelContextMock, ObjectContext, the interface IObjectSet and MockObjectSet.)

5.3 POCO

The POCO entities were generated with the same template as the context. They are placed in a separate project that has no references to any other project. This makes it possible to write business logic that is not dependent on the data source. Although the POCO entities are generated from a database, this is a one-time operation. When the classes are in place, instances can be created at any time, without requiring a database connection.

5.4 Specification

A specification checks if an entity satisfies a certain condition. The condition is specified as a LINQ expression. The same expression is used both to check if an in-memory entity satisfies the specification and, in LINQ to Entities (section 3.3), to retrieve entities from the database. This eliminates duplicate code between accessing the in-memory model and accessing the database, and isolates the query expression so it can be unit tested.

The most important method in listing 5.2 is IsSatisfiedBy. It determines if an entity satisfies the LINQ expression in the specification. The Predicate property simply returns the internal expression. There are also methods to combine specifications using boolean logic.

Listing 5.2: The specification interface

    public interface ISpecification<T>
    {
        Expression<Func<T, bool>> Predicate { get; }

        bool IsSatisfiedBy(T entity);

        ISpecification<T> And(ISpecification<T> other);
        ISpecification<T> Or(ISpecification<T> other);
    }
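The thesis does not show the implementation behind this interface, so the following is only one possible sketch. The class name Specification and the helper ParameterReplacer are assumptions. IsSatisfiedBy compiles the expression for in-memory use, and And/Or build a new expression tree by rebinding the parameter of the second predicate instead of invoking it, since invoked sub-expressions are, as far as is known, not translated by LINQ to Entities in Entity Framework 4.0.

```csharp
using System;
using System.Linq.Expressions;

public interface ISpecification<T>
{
    Expression<Func<T, bool>> Predicate { get; }
    bool IsSatisfiedBy(T entity);
    ISpecification<T> And(ISpecification<T> other);
    ISpecification<T> Or(ISpecification<T> other);
}

// Hypothetical implementation; not the SpecificationBase used in the thesis.
public class Specification<T> : ISpecification<T>
{
    private readonly Expression<Func<T, bool>> predicate;
    private Func<T, bool> compiled; // cached delegate for in-memory checks

    public Specification(Expression<Func<T, bool>> predicate)
    {
        this.predicate = predicate;
    }

    public Expression<Func<T, bool>> Predicate { get { return predicate; } }

    public bool IsSatisfiedBy(T entity)
    {
        if (compiled == null) compiled = predicate.Compile();
        return compiled(entity);
    }

    public ISpecification<T> And(ISpecification<T> other)
    {
        return Combine(other, Expression.AndAlso);
    }

    public ISpecification<T> Or(ISpecification<T> other)
    {
        return Combine(other, Expression.OrElse);
    }

    private ISpecification<T> Combine(ISpecification<T> other,
        Func<Expression, Expression, BinaryExpression> op)
    {
        ParameterExpression parameter = predicate.Parameters[0];
        // Make the other predicate use the same parameter as this one.
        Expression otherBody =
            new ParameterReplacer(other.Predicate.Parameters[0], parameter)
                .Visit(other.Predicate.Body);
        return new Specification<T>(
            Expression.Lambda<Func<T, bool>>(op(predicate.Body, otherBody), parameter));
    }

    private sealed class ParameterReplacer : ExpressionVisitor
    {
        private readonly ParameterExpression source;
        private readonly ParameterExpression target;

        public ParameterReplacer(ParameterExpression source, ParameterExpression target)
        {
            this.source = source;
            this.target = target;
        }

        protected override Expression VisitParameter(ParameterExpression node)
        {
            return node == source ? target : base.VisitParameter(node);
        }
    }
}
```

For example, new Specification<int>(x => x > 0).And(new Specification<int>(x => x % 2 == 0)) yields a specification whose Predicate is a single lambda over one parameter, usable both in memory and as a query expression.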

5.5 FetchStrategy

FetchStrategy is a very simple class that lists the associated entities that should be loaded when the root entity is loaded. This is the same feature as Include in Entity Framework (section 3.4), but wrapped in its own class. The fetch strategy is used together with the specification when loading entities from the repository, as can be seen in listing 5.3.
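The members of the fetch strategy are not listed in the thesis. A minimal sketch, assuming an IncludePaths property and an Include method that are both hypothetical, could collect dotted paths of the kind accepted by the string-based Include of Entity Framework. The entity classes Owner, Building and Apartment are likewise made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

// Hypothetical entities used only for this example.
public class Owner { public string Name { get; set; } }
public class Building { public Owner Owner { get; set; } }
public class Apartment { public Building Building { get; set; } }

// Assumed shape of the interface; only its name occurs in listing 5.3.
public interface IFetchStrategy<T>
{
    IEnumerable<string> IncludePaths { get; }
    IFetchStrategy<T> Include(Expression<Func<T, object>> path);
}

public class FetchStrategy<T> : IFetchStrategy<T>
{
    private readonly List<string> paths = new List<string>();

    public IEnumerable<string> IncludePaths { get { return paths; } }

    public IFetchStrategy<T> Include(Expression<Func<T, object>> path)
    {
        Expression body = path.Body;
        // Value-typed members are wrapped in a conversion to object.
        if (body.NodeType == ExpressionType.Convert)
        {
            body = ((UnaryExpression)body).Operand;
        }

        // Walk a.Building.Owner from the outermost member inwards.
        Stack<string> parts = new Stack<string>();
        for (MemberExpression member = body as MemberExpression;
             member != null;
             member = member.Expression as MemberExpression)
        {
            parts.Push(member.Member.Name);
        }

        paths.Add(string.Join(".", parts)); // e.g. "Building.Owner"
        return this;
    }
}
```

A repository implementation could then pass each collected path to Include on the query before applying the specification.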

5.6 Repository

The repository is based on a generic repository created by Will Beattie[2]. The basic idea when loading an entity is to provide a specification of the same entity type. Only entities satisfying the specification will be loaded. Part of the interface of the repository can be found in listing 5.3. It contains methods to load a single entity matching a specification, to load all entities matching a specification and to check if any entity exists that matches the specification. In addition a FetchStrategy can be supplied, which determines if any associated entities should be loaded as well. This allows Entity Framework to load the associated entities joined in a single query, decreasing the number of queries required and thus increasing performance.

Listing 5.3: An excerpt from the generic repository interface

    public interface IGenericQueryRepository
    {
        T Load<T>(ISpecification<T> spec) where T : class;
        IEnumerable<T> LoadAll<T>(ISpecification<T> spec) where T : class;
        bool Matches<T>(ISpecification<T> spec) where T : class;

        T Load<T>(ISpecification<T> spec, IFetchStrategy<T> fetchStrategy)
            where T : class;
        ...
    }
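How the three query methods can be driven by a specification is sketched below with an in-memory stand-in. InMemoryQueryRepository, the reduced ISpecification interface, Apartment and MinRoomsSpecification are assumptions for illustration. The actual repository queries the object sets of the context, which expose the same IQueryable operations.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Reduced version of the specification interface, enough for querying.
public interface ISpecification<T>
{
    Expression<Func<T, bool>> Predicate { get; }
}

// Hypothetical in-memory repository; lists stand in for object sets.
public class InMemoryQueryRepository
{
    private readonly Dictionary<Type, object> sets = new Dictionary<Type, object>();

    public void Add<T>(params T[] entities) where T : class
    {
        object set;
        if (!sets.TryGetValue(typeof(T), out set))
        {
            set = new List<T>();
            sets[typeof(T)] = set;
        }
        ((List<T>)set).AddRange(entities);
    }

    public T Load<T>(ISpecification<T> spec) where T : class
    {
        return Query<T>().FirstOrDefault(spec.Predicate);
    }

    public IEnumerable<T> LoadAll<T>(ISpecification<T> spec) where T : class
    {
        return Query<T>().Where(spec.Predicate).ToList();
    }

    public bool Matches<T>(ISpecification<T> spec) where T : class
    {
        return Query<T>().Any(spec.Predicate);
    }

    private IQueryable<T> Query<T>() where T : class
    {
        object set;
        return sets.TryGetValue(typeof(T), out set)
            ? ((List<T>)set).AsQueryable()
            : Enumerable.Empty<T>().AsQueryable();
    }
}

// Hypothetical entity and specification for the example.
public class Apartment
{
    public int Rooms { get; set; }
}

public class MinRoomsSpecification : ISpecification<Apartment>
{
    private readonly int minRooms;

    public MinRoomsSpecification(int minRooms)
    {
        this.minRooms = minRooms;
    }

    public Expression<Func<Apartment, bool>> Predicate
    {
        get { return a => a.Rooms >= minRooms; }
    }
}
```

Since every method hands the unmodified expression to the query provider, the same specification object works against a list in a test and against the database in production.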

5.7 Calculation

The logic module handles all aspects of the rent calculation. The class dependency diagram can be seen in figure 5.3. It is important to note that all dependencies are actually dependencies on interfaces.

Another important point is that the module is not aware of any part of the data access modules. It uses the POCO entities as if they were in-memory object graphs.

Figure 5.3: The dependency between classes in the calculation module. (Classes in the diagram: AdjustmentValueCalculator, Calculator, CalcValueCalculator, DependencyCalculator, FormulaCalculator, ModelLayoutCalculator, ObjectCalculator, OPICalculator and RentCalculator.)

5.8 Dependencies

Each dependency between two classes is implemented using dependency injection. Instead of instantiating an object the usual way, as in listing 5.4, a factory method is used to instantiate the dependency, as in listing 5.5.

Listing 5.4: Ordinary object instantiation.

    public class FormulaCalculator : IFormulaCalculator
    {
        public decimal MyMethod()
        {
            IModelLayoutCalculator modelLayoutCalculator =
                new ModelLayoutCalculator();
            ...
        }
    }

Listing 5.5: Dependency injection using a factory lambda expression.

    public class FormulaCalculator : IFormulaCalculator
    {
        public Func<IModelLayoutCalculator> ModelLayoutCalculatorFactory =
            () => new ModelLayoutCalculator();

        public decimal MyMethod()
        {
            IModelLayoutCalculator modelLayoutCalculator =
                ModelLayoutCalculatorFactory();
            ...
        }
    }

The factory method in listing 5.5 may look complex if you are unused to the syntax, but it is simply a first-class function stored in the member variable ModelLayoutCalculatorFactory. The function is assigned a default value that uses the C# language feature of lambda expressions to create a method that takes no input (the empty parentheses) and returns a new instance of the ModelLayoutCalculator class. To invoke the function the variable name is used, followed by parentheses.

This implementation differs from the one found in the study in section 4.5. In this case the only reason for using dependency injection is to replace the real dependency with a mock object. In the actual application the dependencies are hard coded. The factory methods therefore allow a default dependency to be implemented, and this makes the classes easier to use, because the dependencies do not have to be passed to the constructors. If dependency injection is used to allow different implementations in the actual production code, this method is likely insufficient.

5.9 Data Access

The generic repository and specification are only one way to load the data. To measure the performance of this approach, four other loading methods have been implemented as a reference.

– Using LINQ to query directly against the object sets, as described in chapter 3.

– Using a SQL query string with ADO.NET.

– Using a stored procedure and calling it using ADO.NET.

– Using Entity Framework function import to create a strongly typed result object from the stored procedure.

5.10 Data Persistence

To save the result of the calculation to the database, instances of the POCO classes that are to be saved are created and added to the object context. The result is then persisted to the database by Entity Framework.

The test in section 6.4.3 showed that this method was highly inefficient and it had to be abandoned. Instead the SqlBulkCopy[13] class was used, which can efficiently copy data from any data source to a database.

5.11 Unit Tests

All these techniques come together in the unit tests. Because the classes are decoupled from both the data access layer and from each other, the unit tests become very simple to write. The framework used for unit testing is the Visual Studio Unit Testing Framework[15], which is built into Microsoft Visual Studio.

5.11.1 Testing Data Access

The generic repository only has to be tested once; it does not have to change when new entities are added. What remains is to test the specifications. The specification in listing 5.6 is only satisfied by apartments (objects) that are active. In this case an apartment is active if its isInActive attribute is null or false. There are three possible states for apartments:

1. isInActive = null should satisfy the specification.

2. isInActive = false should satisfy the specification.

3. isInActive = true should not satisfy the specification.

Each state can now be tested in a unit test. The only dependency the specification has is on POCO entities. Recall from section 5.3 that POCO entities have no dependencies at all.

Listing 5.6: Specification for an active apartment.

    public class ObjectIsActiveSpecification : SpecificationBase<PumaPOCO.Object>
    {
        public ObjectIsActiveSpecification()
        {
            predicate = obj => !obj.isInActive.HasValue || obj.isInActive.Value == false;
        }
    }

Listing 5.7: Unit testing the Specification for an active apartment.

    [TestMethod()]
    public void ObjectIsActiveSpecification_should_match_object_with_isInActive_null()
    {
        PumaPOCO.Object obj = new PumaPOCO.Object()
        {
            isInActive = null
        };
        ObjectIsActiveSpecification target = new ObjectIsActiveSpecification();

        bool expected = true;
        bool actual = target.IsSatisfiedBy(obj);

        Assert.AreEqual(expected, actual, "Object should satisfy specification");
    }
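Listing 5.7 covers the first of the three states. All three can be checked with the predicate of listing 5.6, as in the following self-contained sketch. ApartmentEntity is a hypothetical stand-in for PumaPOCO.Object, and ActiveSpecificationCheck is a helper invented for the example.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical stand-in for the generated POCO entity PumaPOCO.Object.
public class ApartmentEntity
{
    public bool? isInActive { get; set; }
}

public static class ActiveSpecificationCheck
{
    // Same condition as the predicate in listing 5.6.
    public static readonly Expression<Func<ApartmentEntity, bool>> Predicate =
        obj => !obj.isInActive.HasValue || obj.isInActive.Value == false;

    // Evaluates the predicate for an entity in the given state.
    public static bool IsSatisfiedBy(bool? isInActive)
    {
        ApartmentEntity entity = new ApartmentEntity { isInActive = isInActive };
        return Predicate.Compile()(entity);
    }
}
```

IsSatisfiedBy(null) and IsSatisfiedBy(false) are expected to return true, while IsSatisfiedBy(true) is expected to return false. In the real test suite each state would be its own test method, following the layout of listing 5.7.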

5.11.2 Test Data

Because the test cases are so isolated, the amount of test data required for each unit test is in most cases very small. Instances of POCO entities are created on the fly in the test method and passed as parameters to the method under test.

5.11.3 Mocking

Using the technique in section 5.8 it is possible to replace the real implementation of a dependency with a fake, or mock, object. One way of doing this is to implement the same interface by hand and replace the factory so that it returns the mock object. In this case an external library called Moq[8] is used instead. It makes it possible to easily implement an interface on the fly. By default each method returns the default value of its return type, for example null for all reference types. Specific methods can then be overridden to return any value. A short example can be seen in listing 5.8.

Listing 5.8: A mock implementation of ICalcValueCalculator is created and the IsObjectMatchingCalcValue method is overridden to always return true for any input.

    [TestMethod()]
    public void MyTest()
    {
        Mock<ICalcValueCalculator> calcValueCalculatorMock =
            new Mock<ICalcValueCalculator>();
        calcValueCalculatorMock.Setup(x => x.IsObjectMatchingCalcValue(
                It.IsAny<ICalculationObject>(),
                It.IsAny<CalcValue>())
            ).Returns(true);
    }

5.11.4 Example

An example of a test method from the project can be seen in listing 5.11. The test exercises the FormulaCalculator method GetFormulaCalculatedPointsForObject, whose interface can be seen in listing 5.9. The purpose of this class is to take an apartment (ICalculationObject), a model and a formula and calculate the score for the apartment.

Because each class is supposed to have only a single responsibility (chapter 4), this class only takes the score for each formula alias (A, B, C) and substitutes them into the formula to calculate the final score. The rest of the calculation is performed by another class, through an interface called IModelLayoutCalculator. The method that calculates the points for each alias is called GetPointsForRootModelLayouts. The interface can be seen in listing 5.10.

The first thing to do in the unit test in listing 5.11 is to set up the test data. Because POCO entities are used they can simply be created on the fly, and only the relevant data has to be initialized. For example, the model and object will never be read, so none of their fields have to be initialized. The formula is initialized to a + 2 × b.

The next step is to hard code a return value for the GetPointsForRootModelLayouts method, because the purpose of this test is to test the FormulaCalculator, not any other classes. Using Moq[8] a mock object is created and the method is set up to return the hard-coded value.

Using the inversion of control factory method, the FormulaCalculator is set up to use the mock object instead of the real implementation.

Now that everything is set up, the actual method under test can be called, and the returned value should be 1 + 2 × 2 = 5.

Listing 5.9: The interface IFormulaCalculator implemented by FormulaCalculator.

    public interface IFormulaCalculator
    {
        decimal GetFormulaCalculatedPointsForObject(ICalculationObject obj, ICalculationModel model, Formula formula);
    }

Listing 5.10: An excerpt from the interface IModelLayoutCalculator implemented by ModelLayoutCalculator.

    public interface IModelLayoutCalculator
    {
        Dictionary<string, decimal> GetPointsForRootModelLayouts(ICalculationObject obj, ICalculationModel model);
        ...
    }

Listing 5.11: A test method for the formula calculator.

    [TestMethod()]
    public void Should_calculate_the_points_bases_on_the_formula()
    {
        // Set up test data.
        ICalculationModel model = new CalculationModel();
        ICalculationObject obj = new CalculationObject();
        Formula formula = new Formula()
        {
            Formula1 = "a + 2 * b"
        };

        // Instead of the model layout calculator calculating the points,
        // the result is hard coded.
        Dictionary<string, decimal> points = new Dictionary<string, decimal>();
        points.Add("a", 1);
        points.Add("b", 2);

        // Create a mock of the model layout calculator that returns the
        // hard-coded points.
        Mock<IModelLayoutCalculator> mockModelLayoutCalculator =
            new Mock<IModelLayoutCalculator>();
        mockModelLayoutCalculator.Setup(m => m.GetPointsForRootModelLayouts(
                It.IsAny<ICalculationObject>(),
                It.IsAny<ICalculationModel>())
            ).Returns(points);

        // Use the inversion of control factory to make the formula calculator
        // use the mock object instead of the real object.
        FormulaCalculator target = new FormulaCalculator();
        target.ModelLayoutCalculatorFactory = () => mockModelLayoutCalculator.Object;

        decimal expected = 5;

        // Make the call to the method under test.
        decimal actual = target.GetFormulaCalculatedPointsForObject(obj, model, formula);

        // Assert that the returned value is the expected one.
        Assert.AreEqual(expected, actual, "The formula calculated points are incorrect.");
    }

Almost all unit tests are based on the layout of the test in listing 5.11. Sometimes not all steps are necessary, for example if a class has no dependencies. The following steps are used:

1. Create test data.

2. Create mock objects that return the test data.

3. Replace the real dependencies with mock objects.

4. Call the method that is to be tested.

5. Assert that the return value is the expected value.

Chapter 6

Performance

The purpose of the performance measurement is to determine how well the application performs when the amount of data is scaled up. The different loading methods in section 5.9 are compared.

6.1 Test Data

The test data sets are based on a customer database with 8723 apartments. The apartments were duplicated or removed to create differently sized databases. The number of apartments in each database is 100, 200, 500, 1000, 2000, 5000, 10000, 20000, 50000, 100000 or 200000. The numbers were chosen based on the size of existing customers' databases and the expected size of future customers.

6.2 Test Application

A test application was created to test the performance. The time to perform the calculation, including loading all data and storing the result in memory, was measured separately from the time to save the result to the database. The data persistence is the same for each test and therefore not interesting when comparing loading methods. However, to be able to compare the new calculator to the legacy one, the save time has to be measured as well.

The test application measures the number of apartments that were calculated each second. This is used to see if the calculator's performance changes over time.

The memory usage was also measured. A separate thread wakes up every second and polls the garbage collector for the amount of memory currently allocated. Before each poll the garbage collector releases all unreferenced memory.
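The test application itself is not listed. One way to implement such a polling thread is sketched below, where MemorySampler and its members are assumed names. GC.GetTotalMemory(true) asks the runtime to collect garbage before it reports the number of allocated bytes.

```csharp
using System;
using System.Threading;

// Hypothetical sampler; records the peak managed memory seen while running.
public sealed class MemorySampler : IDisposable
{
    private readonly Thread thread;
    private readonly ManualResetEvent stop = new ManualResetEvent(false);
    private long peakBytes;

    public MemorySampler(TimeSpan interval)
    {
        thread = new Thread(() =>
        {
            do
            {
                // true: force a garbage collection before measuring.
                long current = GC.GetTotalMemory(true);
                if (current > Interlocked.Read(ref peakBytes))
                {
                    Interlocked.Exchange(ref peakBytes, current);
                }
            }
            while (!stop.WaitOne(interval)); // sleep until the next poll or until stopped
        });
        thread.IsBackground = true;
        thread.Start();
    }

    public long PeakBytes
    {
        get { return Interlocked.Read(ref peakBytes); }
    }

    public double PeakMiB
    {
        get { return PeakBytes / (1024.0 * 1024.0); }
    }

    public void Dispose()
    {
        stop.Set();
        thread.Join();
        stop.Dispose();
    }
}
```

The measured code would run between creating a sampler with new MemorySampler(TimeSpan.FromSeconds(1)) and disposing it, after which PeakMiB holds the highest value observed.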

6.3 Execution

Each implementation is run ten times for each database. The first execution takes longer and is discarded, because Entity Framework performs some initialization the first time it is invoked. The mean of the remaining nine values is used as the result.
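The procedure can be sketched as follows, assuming a hypothetical Benchmark helper. The first timing acts as a warm-up and is skipped before the mean is taken.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

// Hypothetical helper that mirrors the measurement procedure described above.
public static class Benchmark
{
    // Mean of all values except the first one (the warm-up run).
    public static double MeanDiscardingFirst(IEnumerable<double> values)
    {
        return values.Skip(1).Average();
    }

    // Runs the action 'runs' times and returns the mean time in seconds,
    // excluding the first run.
    public static double MeanSecondsDiscardingFirst(Action action, int runs)
    {
        if (runs < 2)
        {
            throw new ArgumentOutOfRangeException("runs", "At least two runs are required.");
        }

        double[] seconds = new double[runs];
        for (int i = 0; i < runs; i++)
        {
            Stopwatch watch = Stopwatch.StartNew();
            action();
            watch.Stop();
            seconds[i] = watch.Elapsed.TotalSeconds;
        }
        return MeanDiscardingFirst(seconds);
    }
}
```

With runs set to 10, the result is the mean of nine timings, as in the procedure above.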

6.4 Result

6.4.1 Calculation time

The average number of calculated apartments per second is shown in figure 6.1. The function import, inline query and stored procedure methods perform equally, with a peak performance at about 10000 apartments. This means that the calculation time is not completely linear in the number of apartments, and for larger numbers of apartments the performance rapidly decreases. The reason for this is that Entity Framework performs some automatic linking of related entities that is not executed in constant time.

The LINQ and Specification methods also perform equally, but compared to the other methods their performance is far worse. They were also unable to calculate more than 50000 apartments, so no result is recorded for bigger data sets.

Figure 6.1: Comparison between the different data loading methods. (Plot of calculations per second (n/s), from 0 to 1000, against the number of apartments (n) on a logarithmic axis from 1 to 1000000. Series: Function Import, Inline Query, LINQ, Specification and Stored Procedure.)

6.4.2 Memory Usage

The memory usage of each method for different numbers of calculated apartments can be seen in figure 6.2. The specification method is the most memory intensive, with a peak memory usage of 361 MiB when calculating 50000 apartments. The LINQ method uses almost 200 MiB for the same calculation and the rest use only about 20 MiB. Even for 200000 apartments their memory usage is only 50 MiB.

Figure 6.2: Comparison between the memory usage of the different data loading methods. (Plot of peak memory (MiB) on a logarithmic axis from 2 to 512 against the number of apartments (n) on a logarithmic axis from 1 to 1000000. Series: Function Import, Inline Query, LINQ, Specification and Stored Procedure.)

6.4.3 Persistence

Early versions used the data persistence feature of Entity Framework to save the result of the calculation to the database. An early test run showed that for large amounts of data the time to save the result was greater than the time of the actual calculation. The result of the test run can be seen in figure 6.3. Saving the result of 200000 apartments took 35 minutes. The memory usage was also extremely high; figure 6.4 shows a peak memory usage of 963 MiB. The persistence step starts at about 1000 seconds and continues until the end of execution.

Figure 6.3: Performance of using the function import method for calculation and the Entity Framework persistence feature to save the result. (Plot of time in seconds (s) on a logarithmic axis from 1 to 10000 against the number of apartments (n) on a logarithmic axis from 1 to 1000000. Series: Calculation Time and Save Time.)

Figure 6.4: The memory usage when saving using Entity Framework. (Plot of memory usage (MiB), from 0 to 1200, against elapsed time (s), from 0 to 2500. Series: Function Import.)

The final program uses the SqlBulkCopy[13] class, and saving the same amount of data takes only 14 seconds with a memory usage of only a few kibibytes. This method is not as flexible, however, and the records persisted to the database are not automatically updated on the client side, but have to be loaded again manually.

6.4.4 Legacy Calculator

A comparison was made between the old calculator and one of the most efficient methods, function import. This comparison includes the time to persist the result to the database. The result can be seen in figure 6.5. The old calculator calculates only five apartments per second, while the new one using function import and the SqlBulkCopy class has a peak of 762 apartments per second.

Figure 6.5: Comparison between the function import method and the old calculator. (Plot of calculations per second (n/s), from 0 to 800, against the number of apartments (n) on a logarithmic axis from 1 to 10000. Series: Function Import and Old calculator.)

Chapter 7

Conclusions

Because of lack of time the external supervisor was not able to create a formal specification. This meant it took some time to actually figure out what the thesis was all about. Despite the slow start the work went on smoothly and the project was finished only one week behind schedule.

The resulting application is fully functional and it performs the same task as the old program but over a hundred times faster. A big improvement in performance was expected, but the result still exceeded the expectations. The module has over 200 unit tests. Only time can tell if it is easily maintained, but it has a greater chance than the legacy application.

The main goal of the thesis was to find a method to incorporate unit testing into the development cycle. It turned out that the main problem was not to write test cases, but to write code that is easy to test. By abstracting away the database access and adhering to the principles of observability, isolation and single responsibility, writing unit tests will be a lot more feasible in the future.

Because of the unit tests, some of the bugs introduced when adding new features to the program will be avoided. Smaller and less coupled classes will also make it possible to reuse tried and tested classes, avoiding the need to modify classes and risk introducing new bugs. What is missing is integration tests that make sure that the module as a whole still works after modifications have been carried out.

Another main topic was how to test data access code. This turned out to be the hardest part, and several approaches had to be completely abandoned: either it was too much effort to write the tests or the tests were useless. The final solution of using specifications is a good compromise and, at least in theory, the whole concept has a lot of potential.

The evaluation of Entity Framework 4.0 showed that almost all code can be automatically generated from the database, minimizing the effort needed to bring the database into object-oriented code. The performance, however, is very poor for large sets of data. Thankfully it is possible to optimize the bottlenecks by replacing them with stored procedures. It would have been interesting to compare Entity Framework with more mature ORM frameworks, most notably NHibernate[17].

The unit tests created are very useful, but there is also a need for integration tests to test the interactions between units. A big challenge here is to maintain test data that can be updated together with the application. This is another topic that would be interesting to explore.

7.1 Limitations

The main limitation of the module is that it is not yet integrated into the graphical user interface of the rest of the application. More issues will probably have to be considered when the module is integrated with user input. There is a feature to select only a subset of the apartments to be used in the calculation, but its performance does not compare to loading all apartments at once. A better way of creating this subset has to be found.

7.2 Future Work

Many things, like maintainability, cannot be evaluated before the module starts to expand. It is also not known how much effort is required to maintain the code and keep all test cases up to date.

Because unit tests only test each unit in isolation, the test suite will not detect errors that occur when units interact. It would be possible to create a suite of integration tests that test the service layer, because it has a well-defined interface. These tests require another database with test data that has to be maintained when the application changes.

Chapter 8

Acknowledgements

I would like to thank TRIMMA for the opportunity of doing this project, and my external supervisor Mattias Blom and the other employees at TRIMMA for their feedback. Thanks also to my internal supervisor at Umeå University, Jan-Erik Moström.

References

[1] Scott Allen. Testability and Entity Framework 4.0. http://msdn.microsoft.com/en-us/library/ff714955.aspx (visited 2012-05-21).

[2] Will Beattie. Specification Pattern, Entity Framework & LINQ. http://blog.willbeattie.net/2011/02/specification-pattern-entity-framework.html (visited 2012-06-01).

[3] SABO Sveriges Allmännyttiga Bostadsföretag. Sätt rätt hyra, handledning i systematisk hyressättning, 2010.

[4] Martin J. Fowler. ObjectMother. http://martinfowler.com/bliki/ObjectMother.html (visited 2012-05-22).

[5] Martin J. Fowler. Patterns of Enterprise Application Architecture. Addison-Wesley Professional, 2002.

[6] Edward Hieatt and Rob Mee. Repository. http://martinfowler.com/eaaCatalog/repository.html (visited 2012-05-22).

[7] Hyressättningsutredningen. SOU 2004:91 Reformerad hyressättning. Socialdepartementet, 09 2004.

[8] Clarius Consulting Labs. Moq. http://code.google.com/p/moq/ (visited 2012-06-05).

[9] Julia Lerman. Agile Entity Framework 4 Repository. http://thedatafarm.com/blog/data-access/agile-entity-framework-4-repository-part-1-model-and-poco-classes/ (visited 2012-05-23).

[10] Julia Lerman. Programming Entity Framework. O'Reilly Media, 2009.

[11] Robert C. Martin. The single responsibility principle. Principles of Object Oriented Design, 2002.

[12] Microsoft. Code Generation and T4 Text Templates. http://msdn.microsoft.com/en-us/library/bb126445.aspx (visited 2012-07-30).

[13] Microsoft. SqlBulkCopy Class. http://msdn.microsoft.com/en-us/library/system.data.sqlclient.sqlbulkcopy.aspx (visited 2012-06-25).

[14] Microsoft. The ADO.NET Entity Framework Overview. http://msdn.microsoft.com/en-us/library/aa697427(v=vs.80).aspx (visited 2012-06-07).

[15] Microsoft. Unit testing framework. http://msdn.microsoft.com/en-us/library/ms243147(v=vs.80).aspx (visited 2012-08-01).

[16] MyGeneration. The dOOdads .NET Architecture. http://www.mygenerationsoftware.com/portal/dOOdads/Overview/tabid/63/Default.aspx (visited 2012-06-07).

[17] NHibernate. NHibernate. http://nhforge.org/Default.aspx (visited 2012-08-01).

[18] Peter Schuh and Stephanie Punke. ObjectMother: easing test object creation in XP. XP Universe, 2003.

[19] R. Venkat Rajendran. White paper on unit testing. Deccanet Designs Ltd., 2002.

[20] Tim Mackinnon, Steve Freeman and Philip Craig. Endo-testing: Unit testing with mock objects. XP eXamined, 2000.