
Conceptual Data Model

Data Mining System For a Bank

Names: Lee Rock, Richard Olayele Awosika, Sharath Kc, Johnny Syllias

Student Numbers: 20038860, 20072896, 20069626, 20072774

Department: Department of Graduate Business

Course: MSc in Global Financial Information Systems

Module: Data Modelling and Analysis

Project Manager: Lee Rock

Presented To: Dr Aidan Duane

Assignment 1 of 3

Group: D

We firmly declare that this assignment was completed by our own accord and to the best of our abilities in accordance with the plagiarism regulations and in line with the standards of academia set out by Waterford Institute of Technology.


Contents

1.0 Introduction
1.1 Definitions
1.2 Overview
2.0 Strategy Employed
3.0 Business Case for Data Mining System
3.1 Purpose of the Data Mining System
3.2 Advantages of the Data Mining System to the User
3.3 Advantages of Data Mining System to the end consumers
3.4 Security Implications of the Data Mining System
3.5 Regulations and Protections of the Data Mining System
4.0 Review and Analysis
4.1 Buy to Let Mortgage
4.2 Home Owner Mortgage
4.3 Health Insurance
4.4 Personal Insurance
4.5 Auto Insurance
4.6 Personal Loan
4.7 Auto Loan
4.8 Educational Loan
4.9 Deposit Account
5.0 Conceptual Data Model
5.1 Business Assumptions
5.2 Database Rules
5.3 Conceptual Data Model Cardinality & Optionality Relationship Logic
5.3.1 Parent to Child Relationships
5.3.2 Child to Parent Relationships
5.3.3 Visual Entity Relationships
5.4 Entity Relationship Diagram
6.0 Conclusion

Figure 1 Active Users and New Registrations from IPSO Quarter Two Personal Online Banking Report (2013)
Figure 2 New Registrations vs Active Users of IPSO Quarter 2 Personal Online Banking Report (2013)
Figure 3 Auto Loan Relationship
Figure 4 Personal Loan Relationship
Figure 5 Educational Loan Relationship
Figure 6 Home Owner Mortgage Relationship
Figure 7 Buy To Let Mortgage Relationship
Figure 8 Auto Insurance Relationship
Figure 9 Personal Insurance Relationship
Figure 10 Health Insurance Relationship
Figure 11 Deposit Account Detail

Table 1 Parent to Child Relationships
Table 2 Child to Parent Relationships


1.0 Introduction

1.1 Definitions

Entity: A particular subject or object about which data is stored, for example a customer. Entities are seen as tables in a database.

Entity Type: A category of subject or object about which data is collected; for example, a customer is an entity while a set of customers is an entity type.

Attributes: The essential facts that are stored regarding each entity.

Record: A collection of fields and their values. Each record stores information, i.e. the attributes, on a particular entity within an entity type. Records are sometimes called rows.

Conceptual Data Model (CDM): A skeleton or high-level preliminary data model which is intended to show how a particular system may function.

Primary Key: An attribute or identifier that uniquely identifies a particular record. An atomic primary key has only one attribute, whereas a concatenated primary key has multiple attributes. A candidate primary key is an attribute (or set of attributes) which could be selected as a primary key.

Foreign Key: A minimal set of attributes in a table used to form a connection or a relationship between the original table and another table. The value of the primary key in one table is the value of the foreign key in the other table.

Relationship: The type of connection between one table and another.

Optionality: The minimum requirement in a relationship.

Cardinality: The maximum requirement in a relationship.

Degree: The type of entity relationship.

DMS: Data mining system, a system to enable the capture, storage, classification and clustering of information.
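The primary key and foreign key definitions above can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module with an in-memory database. The two table names come from this report, but the column names (Surname, AutoLoanID, Amount) and the sample rows are assumptions made purely for illustration and are not the attributes proposed in the ERD.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Parent table: ClientID is an atomic (single-attribute) primary key.
conn.execute("""
    CREATE TABLE GDClientDetail (
        ClientID INTEGER PRIMARY KEY,
        Surname  TEXT NOT NULL            -- hypothetical attribute
    )""")

# Child table: its ClientID column is a foreign key holding the value
# of the primary key of the related GDClientDetail record.
conn.execute("""
    CREATE TABLE GDAutoLoanDetail (
        AutoLoanID INTEGER PRIMARY KEY,   -- hypothetical attribute
        ClientID   INTEGER NOT NULL REFERENCES GDClientDetail(ClientID),
        Amount     REAL NOT NULL          -- hypothetical attribute
    )""")

conn.execute("INSERT INTO GDClientDetail VALUES (1, 'Murphy')")
conn.execute("INSERT INTO GDAutoLoanDetail VALUES (10, 1, 15000.0)")


def rejected(sql):
    """Return True if the database refuses the statement on integrity grounds."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


# A second client with the same ClientID breaks primary key uniqueness.
duplicate_key_rejected = rejected("INSERT INTO GDClientDetail VALUES (1, 'Byrne')")
# A loan pointing at a client that does not exist breaks the foreign key.
orphan_loan_rejected = rejected("INSERT INTO GDAutoLoanDetail VALUES (11, 99, 5000.0)")
```

Both inserts at the end are refused by the database, which is the behaviour the two definitions describe: a primary key value may appear only once, and a foreign key value must match an existing primary key.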


1.2 Overview

Data mining plays a huge part in today’s knowledge economy, with countless institutions looking to analyse huge amounts of data in order to make decisions that will ultimately profit the business. The Oracle website interprets data mining to be ‘the practice of automatically searching large stores of data to discover patterns and trends that go beyond simple analysis’. Put simply, data mining is a range of tools and techniques that allow an institution to interpret data in order to make decisions.

Banks sit on a goldmine of information, with terabytes of customer information being discarded after audits in line with data protection regulations. This data could be used to spot customer trends and to cluster clients so that further financial products can be marketed to these clusters. The current information systems that many banks and other financial institutions employ are unable to utilise data mining effectively, which results in lost opportunities in the market (Shrivastava and Bhadoriya, 2013; Domingo, 2003; Barja and Cerquides, 1998). Walmart gained a 20% increase in sales due to effective data mining (Domingo, 2003). Banks could exhibit similar increases in sales if they implemented systems that allow for the efficient capture, grouping and clustering of clients.

The main purpose of this Conceptual Data Model (CDM) report is to provide a framework for the construction of a relational database, further interpreted as a data mining system, to optimise the capture, classification and clustering of customers for the targeted marketing of a bank’s financial products and services. This concept will provide a high-level view of how the intended system will efficiently target existing customers in order to sell further financial products to clients, based on trend activity and on information provided through transacting with the bank.

This document also presents a description of the entities as well as the relationships among those entities. This report covers the introduction, the strategy employed, the business case, the review and analysis, and the cardinality and optionality relationship logic, and concludes with a brief summary of the CDM.

2.0 Strategy Employed

In developing the financial products to be deployed, we researched key products central to modern banking such as mortgages, insurance and loans. In developing the entity relationship diagram (ERD), we developed a central entity which captures attributes relevant to all the other product entities. This entity, labelled “GDClientDetail”, is the main connection table between the other entities. The client detail entity captures all relevant but general attributes which apply to the other product entities: buy to let mortgages, home owner mortgages, health insurance, personal insurance, auto insurance, personal loans, auto loans, educational loans and deposit accounts.

The DMS model will be developed to capture as much client information as possible in order to divide clients into market segments for the further marketing of financial products to these clients. This can be achieved by managing the data obtained from the following:

Using the information gathered in house: This is done by introducing the products of the database user to existing customers within the same database. An example is where a new customer purchases one of the mortgage products, e.g. the Home Owner Mortgage product. The information obtained on this new customer can be used to introduce auto insurance, health insurance and other products developed by the user of this database to this new customer. Information collected and stored will also provide us with the capability to spot trends and market niches in order to identify possible new revenue streams.

Selling information and rights to the database: This is done by selling the rights to the information in the database to external organisations where the transaction does not infringe data protection laws. For example, a report could be generated identifying that house sales in a particular area have increased. This information could be sold to external organisations such as Harvey Norman, Woodies etc.

In today’s digital economy many financial institutions are shifting from the traditional, manual way of interacting with customers and of gathering and storing customer and potential customer data to an efficient Online Banking System (OBS) as a way of capturing and analysing data. OBS has now become a core financial, marketing and management tool in modern banking. The quarter two (2013) online personal banking report from the Irish Payments Services Organisation (IPSO) and the Irish Banking Federation (IBF), which assessed the prevalence of Irish online banking, indicates that:

Some 2.2 million customers were active users of online banking during 2013, up 16.5% on 2012;
Customers accessed their accounts almost 205 million times during the year, up 12.4% year-on-year;
Customers made 42.6 million payments (including mobile phone top-ups and international payments) through online banking services during 2013;
Online banking payment volumes fell by 5.1% compared with 2012.

Figure 1 Active Users and New Registrations from IPSO Quarter Two Personal Online Banking Report (2013)

Figure 2 New Registrations vs Active Users of IPSO Quarter 2 Personal Online Banking Report (2013)


It is imperative that the proposed system is able to support and cater for online/mobile banking data collection as well as traditional methods such as manual entry by a bank employee.

3.0 Business Case for Data Mining System

3.1 Purpose of the Data Mining System

The Data Mining System (DMS) is an advanced Database Management System (DBMS) model which can be implemented within any banking system for effective utilisation of customer data. The prototype is designed in such a way that the minutest customer details are captured, monitored and analysed in order to propose a combination of new financial products and solutions suiting the customer. The intention behind the design is to create a win-win situation for providers, users and end consumers.

3.2 Advantages of the Data Mining System to the User

The DMS lets financial institutions refine their current back-end systems by providing a structured and organised framework.
It enables effective management and utilisation of large volumes of customer information.
The system is designed to fit in and interact flexibly with all types and categories of data.
It brings transparency and helps in reviewing existing customer profiles to find the missing links in customer information.
It is suitable for both centralised and franchise business structures.
It provides consistent updates at regular intervals.
The DMS facilitates easy installation, running and maintenance.
The platform is user friendly and enables customisation.
In the case of retail assets such as credit cards, loans and mortgages, the DMS will monitor customer credentials and generate a report on each customer's credit and debit history.
It helps financial institutions keep track of missed payments and trace the customer in order to recover the debt.
The system works out permutations and combinations of existing customer details to generate an exclusive, tailor-made financial product.
It increases the scope for upselling and cross-selling within the bank to generate more revenue.
It improves client stickiness, helping the bank to retain the relationship.
It helps in introducing and developing a new bouquet of financial products within the system.
It assists in the strategic positioning of the financial entity in the market as a single point of contact for all solutions.
It helps in integrating multiple other business setups with the financial system, thereby widening the scope of the current business.
It increases the inflow of business leads.

3.3 Advantages of Data Mining System to the end consumers

Customers have the convenience of getting all possible financial solutions in one place.
The DMS will be indirectly linked to the internet and mobile banking apps and will present the best possible solutions and packages suited to each customer on an interactive web page.
It generates email and SMS alerts to warn customers about missed and upcoming payments.
The DMS enables customers to link their frequently visited social media websites with their bank accounts to learn about the latest promotions and newsletters on current developments from their bank.

3.4 Security Implications of the Data Mining System

A DMS should be a highly secure platform, as it deals with general customer data and confidential financial transactions. The following security measures are built into the DMS:

An anti-virus application is built in to protect the platform from unwanted foreign bodies.
A multiple validation system is in place within the DMS whereby any major changes are confirmed by a validator within the bank and finally by an approver at the developer's end.
Any changes to customer-level data are communicated to the customer by email and SMS.
A security specialist will be assigned to each financial institution once the DMS is installed and will be responsible for monitoring the platform after every update and change.

3.5 Regulations and Protections of the Data Mining System

It is essential for a DBMS model to comply with all data protection norms so that the application can be launched legally in the market and gain universal acceptance. The DMS abides by:

The European Union’s Data Protection Directive of 1995.
The Irish Data Protection Acts 1988 and 2003 and the British Data Protection Act 1998.
The DMS should also aim to be certified against the ISO/IEC 27001 information security standard.

4.0 Review and Analysis

In developing the ERD, we considered several products that a financial institution is likely to sell. The purpose of these products is to capture relevant customer data in order to allow the marketing of further current and new financial products to existing clients. The following sections identify the most widely used financial products in the modern banking industry that we intend to incorporate into our model. The proposed hierarchical system should make it easy to add further products to the system.

4.1 Buy to Let Mortgage

This product is meant for first-time or subsequent purchasers of real estate. It gives buyers the opportunity to buy a house with a mortgage and ‘let’ the same property, so that the rent from tenants can be used to offset the owner's mortgage payments.

4.2 Home Owner Mortgage

This is a product that allows a customer to buy a property for personal use. This product is one of the fastest moving products under the mortgage category.


4.3 Health Insurance

This product falls under the general insurance category and is one of the essential products for all customers. The product gives complete health coverage including critical illness, cost of hospitalisation and personal accident coverage.

4.4 Personal Insurance

This is a unique product developed to cover life insurance, house insurance, personal property insurance etc.

4.5 Auto Insurance

Auto insurance is designed to cover all types of vehicles including two wheelers, four wheelers and heavy vehicles. This is a main area of revenue for banks, as auto insurance is mandatory for all vehicles and is renewed frequently.

4.6 Personal Loan

This product falls under one of the core areas of finance, which is lending. Personal loans can be used for a variety of reasons such as holidays, personal spending etc.

4.7 Auto Loan

These loans are given to buyers of any type of automobile, ranging from two wheelers to heavy vehicles. The financing for the vehicle is based on the vehicle model, the total cost of the vehicle and the percentage of down payment made by the customer.

4.8 Educational Loan

This product allows clients to borrow for educational purposes, for example to assist the individual financially in attending a third level institution.

4.9 Deposit Account

As a business rule, all clients who wish to avail of our financial services are required to activate and maintain at least one form of deposit account. This account allows us to identify clients' spending patterns where applicable.

5.0 Conceptual Data Model

The CDM for this bank is hierarchical, with the client detail at the top or in the centre of the matrix. The second level of entities holds the details of the various products that the bank already supplies. The third level of tables holds the payments and transactions for those products. From this structure we are able to group and cluster customers in order to sell internal financial products to clients; for example, customers who take out a loan can be queried to identify whether they are eligible for loan insurance. The database also gives us the possibility of selling information on customer spending patterns to companies.
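The kind of cross-selling query described above can be sketched as follows. This is a minimal illustration using Python's built-in sqlite3 module, not the report's actual design: the table names follow the report, but the column names and the three sample clients are assumptions, and "clients with an auto loan but no auto insurance policy" stands in for the loan insurance example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Level 1: client detail (column names are hypothetical).
    CREATE TABLE GDClientDetail (
        ClientID INTEGER PRIMARY KEY,
        Surname  TEXT
    );
    -- Level 2: product detail tables linked to the client.
    CREATE TABLE GDAutoLoanDetail (
        AutoLoanID INTEGER PRIMARY KEY,
        ClientID   INTEGER REFERENCES GDClientDetail(ClientID)
    );
    CREATE TABLE GDAutoInsuranceDetail (
        AutoInsuranceID INTEGER PRIMARY KEY,
        ClientID        INTEGER REFERENCES GDClientDetail(ClientID)
    );
    -- Invented sample data: Murphy and Byrne have auto loans,
    -- only Byrne also holds an auto insurance policy.
    INSERT INTO GDClientDetail VALUES (1, 'Murphy'), (2, 'Byrne'), (3, 'Kelly');
    INSERT INTO GDAutoLoanDetail VALUES (10, 1), (11, 2);
    INSERT INTO GDAutoInsuranceDetail VALUES (20, 2);
""")

# Clients who hold an auto loan but no auto insurance policy are
# candidates for a targeted auto insurance offer.
cross_sell = conn.execute("""
    SELECT c.ClientID, c.Surname
    FROM GDClientDetail AS c
    WHERE EXISTS (SELECT 1 FROM GDAutoLoanDetail AS l
                  WHERE l.ClientID = c.ClientID)
      AND NOT EXISTS (SELECT 1 FROM GDAutoInsuranceDetail AS i
                      WHERE i.ClientID = c.ClientID)
    ORDER BY c.ClientID
""").fetchall()
```

With the sample data, only client 1 is returned. The same pattern could be repeated for any pair of product detail tables, which is why every product table carries the client's key.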

5.1 Business Assumptions

Due to time and resource constraints, we have made the following assumptions for this bank in order to develop a functioning model.

The company is a commercial bank that only trades with individuals rather than companies.
The bank only trades in euro.
Clients must have at least one deposit account.
All customer costs, e.g. registration fees, interest etc., are included in the full instalment/payment price and will not be itemised separately.
There is no late payment option; if the client misses a payment they will pay on the next instalment date.

5.2 Database Rules

The following rules were implemented to ensure data integrity and consistency, in order to limit data redundancy and maximise the potential for a functioning model:

All entity names are prefixed with “GD” (case sensitive) before the actual entity name to clarify that these tables are owned by GFIS Data Modelling and Analysis Group D 2015, e.g. GDClientDetail.
All entity names must begin with a capital letter and have no spaces between the words in the name, e.g. GDBuyToLetMortgageDetail.
Any table that provides a description must be identified with the word “Detail” at the end of the name, e.g. GDAutoInsuranceDetail.
Any table which records payments or non-payments for products the bank provides must have “Transactions” at the end of its name, e.g. GDAutoInsuranceTransactions.
All primary keys must be labelled with “ID” (case sensitive) after the name, e.g. ClientID.
All attributes in tables, with the exception of the ClientDetail table, must have an abbreviation of the table name in capital letters before the actual attribute name, e.g. the category field in the GDDepositAccountDetail table is to be labelled GADCategory.
Fields, where applicable, will be validated by domains.
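The naming rules above lend themselves to an automated check. The sketch below is one possible way to express three of them in Python (the “GD” prefix, capitalised words with no spaces, and the “ID” suffix on primary keys). The function names and the regular expression are our own assumptions for illustration; the suffix and attribute abbreviation rules are not covered.

```python
import re

# "GD" followed by one or more capitalised words and nothing else,
# e.g. GDClientDetail or GDBuyToLetMortgageDetail (assumed pattern).
ENTITY_PATTERN = re.compile(r"^GD(?:[A-Z][a-z]+)+$")


def check_entity_name(name):
    """Return a list of naming-rule problems found in an entity name."""
    problems = []
    if not name.startswith("GD"):
        problems.append("missing 'GD' prefix")
    if " " in name:
        problems.append("contains spaces")
    if not ENTITY_PATTERN.match(name):
        problems.append("not in GDCapitalisedWords form")
    return problems


def check_primary_key(name):
    """Return True if a primary key name ends with a case-sensitive 'ID'."""
    return len(name) > 2 and name[0].isupper() and name.endswith("ID")


valid = check_entity_name("GDBuyToLetMortgageDetail")
invalid = check_entity_name("GD Client Detail")
```

An empty list means the name passed every check that was applied, so `valid` is empty while `invalid` lists the broken rules.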


5.3 Conceptual Data Model Cardinality & Optionality Relationship Logic

The following outlines and describes the relationships between the entities in the conceptual data model. The entities are based on a hierarchical structure, with the client detail table placed at the top, then the product detail tables and finally the payment and transaction tables. The product detail is linked to the client detail and the transaction tables are linked to the product detail. This format will allow the system to be mined via queries in order to spot trends between products, and will allow the company to group and cluster clients in order to sell further products to those clients. The entity relationships are described below in both table and graphical format. Each relationship is given a relationship number (RNO) so that the parent to child table, the child to parent table and the graphical relationship can be matched to one another. The following identifies the optionality, cardinality and degree of each relationship.

5.3.1 Parent to Child Relationships

| RNO | Primary Parent | Relationship | Secondary Parent | Relationship | Child |
|-----|----------------|--------------|------------------|--------------|-------|
| 1 | GDClientDetail | one to many | GDAutoLoanDetail | one to many | GDAutoLoanPayments |
| 2 | GDClientDetail | one to many | GDPersonalLoanDetail | one to many | GDPersonalLoanPayments |
| 3 | GDClientDetail | one to many | GDEducationalLoanDetail | one to many | GDEducationalLoanPayments |
| 4 | GDClientDetail | one to many | GDHomeOwnerMortgageDetail | one to many | GDHomeOwnerMortgagePayments |
| 5 | GDClientDetail | one to many | GDBuyToLetMortgageDetail | one to many | GDBuyToLetMortgagePayments |
| 6 | GDClientDetail | one to many | GDAutoInsuranceDetail | one to many | GDAutoInsurancePayments |
| 7 | GDClientDetail | one to many | GDPersonalInsuranceDetail | one to many | GDPersonalInsurancePayements |
| 8 | GDClientDetail | one to many | GDHealthInsuranceDetail | one to many | GDHealthInsurancePayments |
| 9 | GDClientDetail | one to many | GDDepositAccountDetail | one to many | GDDepositAccountTransactions |

Table 1 Parent to Child Relationships


5.3.2 Child to Parent Relationships

| RNO | Child | Relationship | Secondary Parent | Relationship | Primary Parent |
|-----|-------|--------------|------------------|--------------|----------------|
| 1 | GDAutoLoanPayments | many to one | GDAutoLoanDetail | many to one | GDClientDetail |
| 2 | GDPersonalLoanPayments | many to one | GDPersonalLoanDetail | many to one | GDClientDetail |
| 3 | GDEducationalLoanPayments | many to one | GDEducationalLoanDetail | many to one | GDClientDetail |
| 4 | GDHomeOwnerMortgagePayments | many to one | GDHomeOwnerMortgageDetail | many to one | GDClientDetail |
| 5 | GDBuyToLetMortgagePayments | many to one | GDBuyToLetMortgageDetail | many to one | GDClientDetail |
| 6 | GDAutoInsurancePayments | many to one | GDAutoInsuranceDetail | many to one | GDClientDetail |
| 7 | GDPersonalInsurancePayements | many to one | GDPersonalInsuranceDetail | many to one | GDClientDetail |
| 8 | GDHealthInsurancePayments | many to one | GDHealthInsuranceDetail | many to one | GDClientDetail |
| 9 | GDDepositAccountTransactions | many to one | GDDepositAccountDetail | many to one | GDClientDetail |

Table 2 Child to Parent Relationships
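Table 2 is the mirror image of Table 1: the column order is reversed and each “one to many” becomes “many to one”. A short sketch of that correspondence, written in Python with two rows taken from Table 1, is given below. The tuple layout and the `invert` helper are assumptions made for illustration only.

```python
# Reading a relationship from the opposite end swaps its two sides.
INVERSE = {
    "one to many": "many to one",
    "many to one": "one to many",
    "one to one": "one to one",
}

# (RNO, primary parent, relationship, secondary parent, relationship, child)
parent_to_child = [
    (1, "GDClientDetail", "one to many",
     "GDAutoLoanDetail", "one to many", "GDAutoLoanPayments"),
    (9, "GDClientDetail", "one to many",
     "GDDepositAccountDetail", "one to many", "GDDepositAccountTransactions"),
]


def invert(row):
    """Turn a parent to child row into the matching child to parent row."""
    rno, primary, first_rel, secondary, second_rel, child = row
    return (rno, child, INVERSE[second_rel],
            secondary, INVERSE[first_rel], primary)


child_to_parent = [invert(row) for row in parent_to_child]
```

Because the RNO is carried through unchanged, each derived row can be matched back to its source row and to the corresponding figure.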


5.3.3 Visual Entity Relationships

Auto Loan (1)

Figure 3 Auto Loan Relationship

Each client can have one or more auto loans; each auto loan can have one or more payments.

Each payment can only have one auto loan; each auto loan can only have one client.

Personal Loan (2)

Figure 4 Personal Loan Relationship

Each client can have one or more personal loans; each personal loan can have one or more payments.

Each payment can only have one personal loan; each personal loan can only have one client.

Educational Loan (3)


Figure 5 Educational Loan Relationship

Each client can have one or more educational loans; each educational loan can have one or more payments.

Each payment can only have one educational loan; each educational loan can only have one client.

Home Owner Mortgage (4)

Figure 6 Home Owner Mortgage Relationship

Each client can have one or more home owner mortgages; each home owner mortgage can have one or more payments.

Each payment can only have one home owner mortgage; each home owner mortgage can only have one client.

Buy To Let Mortgage (5)

Figure 7 Buy To Let Mortgage Relationship

Each client can have one or more buy to let mortgages; each buy to let mortgage can have one or more payments.

Each payment can only have one buy to let mortgage; each buy to let mortgage can only have one client.


Auto Insurance (6)

Figure 8 Auto Insurance Relationship

Each client can have one or more auto insurance policies; each auto insurance policy can have one or more payments.

Each payment can only have one auto insurance policy; each auto insurance policy can only have one client.

Personal Insurance (7)

Figure 9 Personal Insurance Relationship

Each client can have one or more personal insurance policies; each personal insurance policy can have one or more payments.

Each payment can only have one personal insurance policy; each personal insurance policy can only have one client.


Health Insurance (8)

Figure 10 Health Insurance Relationship

Each client can have one or more health insurance policies; each health insurance policy can have one or more payments.

Each payment can only have one health insurance policy; each health insurance policy can only have one client.

Deposit Account (9)

Figure 11 Deposit Account Detail

Each client must have at least one deposit account; each deposit account can have one or more transactions.

Each transaction can only have one deposit account; each deposit account must only have one client.
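The deposit account relationship is the only one that is mandatory on the client side: a client must have at least one deposit account. A foreign key alone cannot enforce a minimum of this kind, so one option is to detect violations with a query. The sketch below uses Python's built-in sqlite3 module; the column names and sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE GDClientDetail (
        ClientID INTEGER PRIMARY KEY,
        Surname  TEXT                      -- hypothetical attribute
    );
    CREATE TABLE GDDepositAccountDetail (
        DepositAccountID INTEGER PRIMARY KEY,   -- hypothetical attribute
        ClientID INTEGER NOT NULL REFERENCES GDClientDetail(ClientID)
    );
    -- Invented sample data: Murphy has two deposit accounts, Byrne has none.
    INSERT INTO GDClientDetail VALUES (1, 'Murphy'), (2, 'Byrne');
    INSERT INTO GDDepositAccountDetail VALUES (100, 1), (101, 1);
""")

# The LEFT JOIN keeps every client; COUNT ignores the NULLs produced
# for clients with no matching account, so a count of zero marks a
# client who breaks the "at least one deposit account" rule.
clients_without_account = conn.execute("""
    SELECT c.ClientID, c.Surname
    FROM GDClientDetail AS c
    LEFT JOIN GDDepositAccountDetail AS d ON d.ClientID = c.ClientID
    GROUP BY c.ClientID, c.Surname
    HAVING COUNT(d.DepositAccountID) = 0
    ORDER BY c.ClientID
""").fetchall()
```

Here the query flags client 2. In a working system such a check might be run before a client is allowed to take out any other product.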

5.4 Entity Relationship Diagram

The following entity relationship diagram shows the proposed hierarchical matrix for the entities and their relationships, along with each entity's proposed attributes.


6.0 Conclusion

The conceptual data model will provide a framework for the construction of a data mining system that allows for the capture, clustering and grouping of client information in order to sell further financial products. The conceptual data model provides a high-level synopsis of the team’s solution to meet the organisation’s need for a system which can facilitate the capture, classification and clustering of client information in order to generate marketing efficiencies and thereby potentially increase the sales and profits of the institution. With many banks beginning a shift towards data mining, it is only fitting that we keep up with the times and ensure the system allows for both modern and traditional banking methods. The conceptual data model will provide the skeleton for the next stage in the project, which is the logical data model.


References

Barja, M. L. and Cerquides, J. (1998) ‘Applications of Data Mining in Banking’, [Accessed Online 10th of October 2015], Available From: http://www.slideshare.net/Tommy96/data-mining-in-banking-ppt?related=1

Domingo, T. R. (2003) ‘Applying Data Mining to Banking’, [Accessed Online 10th of October 2015], Available From: http://www.rtdonline.com/BMA/BSM/4.html

Irish Payments Services Organisation Limited and Irish Banking Federation (2013) ‘Online Personal Banking Report Quarter 2’, [Accessed Online 10th of October 2015], Available From: www.bpfi.ie/wp-content/uploads/.../Online-bank-stats-Q213-FINAL.pdf

Oracle.com ‘What is Data Mining?’, [Accessed Online 10th of October 2015], Available From: http://docs.oracle.com/cd/B28359_01/datamine.111/b28129/process.htm

Shrivastava, P. and Bhadoriya, A. (2013) ‘The Use of Data Mining in Banking’, [Accessed Online 10th of October 2015], Available From: http://www.slideshare.net/arpitbhadoriya/use-of-data-mining-in-banking-sector