MODELLING BUSINESS INFORMATION Entity relationship and class modelling for business analysts Keith Gordon


MODELLING BUSINESS INFORMATION
Entity relationship and class modelling for business analysts

Keith Gordon

It is almost universally accepted that requirements documents produced by business analysts for new or enhanced IT systems should include a ‘data model’ to represent the information that has to be handled by the system.

Starting from first principles, this book will help business analysts to develop the skills required to construct data models through comprehensive explanations of entity relationship and class modelling, in line with the BCS Data Analysis syllabus. In addition to covering the topics in the syllabus, the book also includes significant extra information of interest including an overview of other modelling notations, information model quality, and taking a requirement model into database design.

• Explains why business analysts should model information
• Covers both entity relationship and class modelling in tandem from the basics
• Aligned with the BCS Data Analysis syllabus
• Goes beyond the syllabus to include several wider topics of interest

ABOUT THE AUTHOR

Keith Gordon is an independent consultant and lecturer specialising in data management and business analysis. He has spent over 50 years in technical, education and training environments as an engineer, computer consultant, data manager, business analyst and education and training manager.

A thoughtful, well-done text on how to do high-quality business analytical data modelling.
David Hay, Essential Strategies International, CEO

A terrific contribution to the field.
Alec Sharp, Senior Consultant, Clariteq

Provides an excellent grounding in the full range of topics related to information modelling.
Matthew West, Information Junction, Director

Computing; IT

ISBN 978-1-78017-353-5


Cover photo: iStock © nuwatphoto

Ebooks available

Paperback available


As the roles of Data and Business Analysts become more intertwined, this book is timely in its publication. Businesses often fail to recognise that information is a key resource and are confused by how it is presented or overwhelmed by its complexity during use. Keith brings to the forefront of the reader’s mind the importance of communicating and analysing the relationship between Business, Information, Systems and Data, and the value in developing models cooperatively, gaining ‘consensus, not perfection’ from stakeholders. Simple everyday examples and analogies are used to support the reader’s understanding and make the subject more relatable.

I enjoyed reading the book and completing the exercises. An excellent learning aid for Analysts who are new to modelling or need reminding of good practice.

Katie Walsh, Business Analyst and Mentor

Anyone interested in a thoughtful, well-done text on how to do high-quality business analytical data modelling should definitely proceed with this book.

David Hay, Essential Strategies International, CEO

Modelling Business Information provides an introduction to data modelling, to the nomenclature used by common modelling techniques, and to techniques for representing common patterns. This is a useful book for business analysts who are creating the information model as well as for business and IT users who need to understand a data model.

Keith W. Hare, JCC Consulting, Inc., Senior Consultant

Keith Gordon’s wonderfully compact yet thorough introduction to business-friendly information modelling is a terrific contribution to the field. Globally, there’s a surge of interest in data modelling as a powerful tool for improving communication, especially with professionals who used to think business-oriented entity relationship modelling didn’t need to be in their tool kits. Business analysts, Agile developers, data scientists, big data specialists, and other professionals will all benefit from Keith’s work.

Alec Sharp, Senior Consultant, Clariteq

Modelling Business Information, by Keith Gordon, is aimed at those who are new to business analysis or information modelling. In his writing, Keith draws on a wealth of experience in information management, both as a practitioner and as a lecturer with the Open University.

The first six chapters provide an accessible and clear foundation in the topic, covering the reasons for developing information models, the basic elements of entity-relationship diagrams, how to develop an information model from basic information requirements, and finally how to normalise existing data. I particularly like that it uses two graphical notations, the Barker-Ellis notation, noted for its readability, and the ubiquitous Unified Modelling Language notation, which helps to demonstrate that there are different notations that entity-relationship models can be developed in. This first part of the book also takes care to cover the syllabus for the Data Analysis certificate that is part of the scheme for the BCS Advanced International Diploma in Business Analysis.

The second part of the book covers a range of more advanced topics from naming conventions and yet more entity-relationship model notations, to considerations of quality in information models, corporate data models, modelling for business intelligence applications, and finally goes on to look at data and database topics including an overview of SQL, and moving to database design and optimisation.

Overall, the book provides an excellent grounding in the full range of topics related to information modelling.

Matthew West, Director, Information Junction


BCS, THE CHARTERED INSTITUTE FOR IT

BCS, The Chartered Institute for IT, champions the global IT profession and the interests of individuals engaged in that profession for the benefit of all. We promote wider social and economic progress through the advancement of information technology science and practice. We bring together industry, academics, practitioners and government to share knowledge, promote new thinking, inform the design of new curricula, shape public policy and inform the public.

Our vision is to be a world-class organisation for IT. Our 75,000-strong membership includes practitioners, businesses, academics and students in the UK and internationally. We deliver a range of professional development tools for practitioners and employees. A leading IT qualification body, we offer a range of widely recognised qualifications.

Further Information
BCS, The Chartered Institute for IT,
First Floor, Block D,
North Star House, North Star Avenue,
Swindon, SN2 1FA, UK.
T +44 (0) 1793 417 424
F +44 (0) 1793 417 444
www.bcs.org/contact

http://shop.bcs.org/


MODELLING BUSINESS INFORMATION
Entity relationship and class modelling for business analysts

Keith Gordon


© 2017 BCS Learning & Development Ltd

The right of Keith Gordon to be identified as author of this work has been asserted by him in accordance with sections 77 and 78 of the Copyright, Designs and Patents Act 1988.

All rights reserved. Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted by the Copyright Designs and Patents Act 1988, no part of this publication may be reproduced, stored or transmitted in any form or by any means, except with the prior permission in writing of the publisher, or in the case of reprographic reproduction, in accordance with the terms of the licences issued by the Copyright Licensing Agency. Enquiries for permission to reproduce material outside those terms should be directed to the publisher.

All trademarks, registered names etc. acknowledged in this publication are the property of their respective owners.

BCS and the BCS logo are the registered trademarks of the British Computer Society, charity number 292786 (BCS).

Published by BCS Learning & Development Ltd, a wholly owned subsidiary of BCS, The Chartered Institute for IT, First Floor, Block D, North Star House, North Star Avenue, Swindon, SN2 1FA, UK.
www.bcs.org

Paperback ISBN: 9781780173535
PDF ISBN-13: 9781780173542
EPUB ISBN-13: 9781780173559
Kindle ISBN-13: 9781780173566

British Cataloguing in Publication Data.
A CIP catalogue record for this book is available at the British Library.

Disclaimer:
The views expressed in this book are those of the authors and do not necessarily reflect the views of the Institute or BCS Learning & Development Ltd except where explicitly stated as such. Although every care has been taken by the authors and BCS Learning & Development Ltd in the preparation of the publication, no warranty is given by the authors or BCS Learning & Development Ltd as publisher as to the accuracy or completeness of the information contained within it and neither the authors nor BCS Learning & Development Ltd shall be responsible or liable for any loss or damage whatsoever arising by virtue of such information or any instructions or advice contained within this publication or by any of the aforementioned.

Typeset by Lapiz Digital Services, Chennai, India.


CONTENTS

List of figures and tables
About the Author
Foreword
Acknowledgements
Glossary
Introduction

PART 1 THE BASICS

1. WHY BUSINESS ANALYSTS SHOULD MODEL INFORMATION
   What is business analysis?
   Information and data
   The importance for a business analyst of understanding information needs
   The role of models in business analysis
   Data models and data
   Entity relationship modelling
   Class modelling
   Use of data models in business analysis
   What makes a good data model?
   Introducing data analysis

2. MODELLING THE THINGS OF INTEREST TO THE BUSINESS AND THE RELATIONSHIPS BETWEEN THEM
   Entities and objects
   Naming of entity types and object classes
   Introduction to relationships and associations
   Relationship notation in entity relationship models
   Association notation in UML class models
   Degrees of cardinality and optionality
   Multiple relationships and associations
   Recursive relationships and reflexive associations
   Exercises for Chapter 2

3. MODELLING MORE COMPLEX RELATIONSHIPS
   The problems with many-to-many relationships and associations
   Resolving entity relationship model many-to-many relationships
   Resolving class model many-to-many associations
   The ‘bill of materials’ structure
   Mutually exclusive relationships and associations
   Generalisation and specialisation in entity relationship models
   Generalisation and specialisation in class models
   Aggregation and composition
   Exercises for Chapter 3

4. DRAWING AND VALIDATING INFORMATION MODEL DIAGRAMS
   The model drawing process
   Identifying the entity types or the object classes
   Identifying the relationships or associations
   Drawing the initial diagram
   Validating the diagram
   Exercises for Chapter 4

5. RECORDING INFORMATION ABOUT THINGS
   Revisiting entity types, object classes, relationships and associations
   Introduction to attributes
   The naming of attributes
   Entity type, object class or attribute?
   Unique identifiers
   Domains
   The UML extended attribute notation
   Showing operations on class models
   Exercises for Chapter 5

6. RATIONALISING DATA USING NORMALISATION
   What is normalisation?
   The relational model of data
   The rules of normalisation
   Starting the normalisation process
   First normal form
   Second normal form
   Third normal form
   The third normal form data model
   Candidate keys, primary keys and alternate keys
   The relationship of normalisation to modelling
   Exercises for Chapter 6

PART 2 SUPPLEMENTARY MATERIAL

7. OTHER MODELLING NOTATIONS
   The IDEF1X notation
   The Information Engineering notation
   The Chen notation
   Comparison of the notations

8. THE NAMING OF ARTEFACTS ON INFORMATION MODELS
   The naming of entity types or object classes
   The naming of domains
   The naming of attributes
   The naming of relationships in Ellis-Barker entity relationship models
   The naming of associations on UML class models

9. INFORMATION MODEL QUALITY
   Genericity and specificity in models
   The nine characteristics of a good data model
   The six principles of high quality data models
   The five dimensions of data model quality
   The layout of models

10. CORPORATE INFORMATION AND DATA MODELS
   The problems
   Principles for the development of a corporate model

11. DATA AND DATABASES
   The data landscape
   Databases

12. BUSINESS INTELLIGENCE
   The data warehouse
   The multidimensional model of data
   Dimensional modelling

13. ADVANCES IN SQL (OR WHY BUSINESS ANALYSTS SHOULD NOT BE IN THE WEEDS)
   The basics of SQL
   New SQL data types
   The future
   Implications for business analysts and information modellers

14. TAKING A REQUIREMENTS INFORMATION MODEL INTO DATABASE DESIGN
   First-cut database design stage
   Optimised database design stage

APPENDICES
   Appendix A: Table of equivalences
   Appendix B: Bibliography
   Appendix C: Solutions to the exercises
   Index


LIST OF FIGURES AND TABLES

Figure 1.1 Three levels of system
Figure 1.2 The relationship between data and information
Figure 1.3 A rich picture
Figure 1.4 A business activity model
Figure 1.5 A business process model
Figure 1.6 A use case diagram
Figure 1.7 Requirements engineering in context
Figure 1.8 An example entity relationship model using the Ellis-Barker notation
Figure 1.9 An example of a UML class model
Figure 2.1 The vehicle hire company using Ellis-Barker notation
Figure 2.2 The vehicle hire company using UML class model notation
Figure 2.3 A relationship in an entity relationship model
Figure 2.4 An association in a UML class model
Figure 2.5 The use of role names
Figure 2.6 One-to-many (1:n) optional–mandatory relationship and association
Figure 2.7 One-to-many (1:n) mandatory–optional relationship and association
Figure 2.8 One-to-one (1:1) optional–mandatory relationship and association
Figure 2.9 Many-to-many (m:n) optional–optional relationship and association
Figure 2.10 Modelling the ‘one-way’ hire situation
Figure 2.11 Employee supervision
Figure 3.1 Employees and branches
Figure 3.2 Introducing the ASSIGNMENT entity type
Figure 3.3 Introducing the ASSIGNMENT object class
Figure 3.4 Introducing the ASSIGNMENT association class
Figure 3.5 Introducing products within products
Figure 3.6 The bill of materials structure in Ellis-Barker notation
Figure 3.7 The bill of materials structure in UML class model notation
Figure 3.8 Employee supervision in a matrix organisation
Figure 3.9 Employee supervision in a matrix organisation resolved
Figure 3.10 The vehicle hire company as shown in Figure 2.1
Figure 3.11 The introduction of an exclusive arc
Figure 3.12 The introduction of the {xor} constraint
Figure 3.13 An example of a supertype–subtype hierarchy
Figure 3.14 Alternative depiction of a supertype–subtype hierarchy
Figure 3.15 A UML superclass–subclass hierarchy
Figure 3.16 Alternative notation for a UML superclass–subclass hierarchy
Figure 3.17 A UML class model with multiple superclass–subclass hierarchies
Figure 3.18 Aggregation using Ellis-Barker notation
Figure 3.19 An example of the use of the aggregation symbol in a UML class model
Figure 3.20 An example of the use of the composition symbol in a UML class model
Figure 3.21 Composition using Ellis-Barker notation
Figure 4.1 The model drawing process
Figure 4.2 A ‘relationship matrix’
Figure 4.3 The initial Ellis-Barker entity relationship model
Figure 4.4 The initial UML class model
Figure 4.5 The first data navigation path
Figure 4.6 The second data navigation path
Figure 4.7 The revised Ellis-Barker entity relationship model
Figure 4.8 The revised UML class model
Figure 4.9 Partial high-level process map
Figure 4.10 Completed CRUD matrix
Figure 4.11 The final Ellis-Barker entity relationship model
Figure 4.12 The final UML class model
Figure 5.1 The previous models
Figure 5.2 Attribute types shown on an Ellis-Barker entity relationship model
Figure 5.3 Attributes shown on a UML class model
Figure 5.4 EMPLOYEE expanded (shown in Ellis-Barker entity relationship notation)
Figure 5.5 EMPLOYEE expanded (shown in UML class modelling notation)
Figure 5.6 Unique identifiers on an Ellis-Barker entity relationship model
Figure 5.7 The UML <<enumeration>> class
Figure 5.8 The UML extended attribute notation
Figure 5.9 The UML operations notation
Figure 6.1 Relational tables
Figure 6.2 The staff record form
Figure 6.3 Normalisation form completed to UNF
Figure 6.4 Normalisation form completed to 1NF
Figure 6.5 Normalisation form completed to 2NF
Figure 6.6 Normalisation form completed to 3NF
Figure 6.7 The third normal form data model
Figure 7.1 The model of the business scenario in Ellis-Barker notation
Figure 7.2 The model of the business scenario in UML class model notation
Figure 7.3 The model of the business scenario in IDEF1X notation
Figure 7.4 The model of the business scenario in Information Engineering notation
Figure 7.5 The model of the business scenario using Chen’s notation
Figure 7.6 Comparison of the relationship notations
Figure 9.1 An example of the replacement of roles by entity types
Figure 9.2 The generic to specific continuum
Figure 9.3 The cost-balance of flexible design
Figure 9.4 The five dimensions of data model quality
Figure 11.1 The data landscape
Figure 11.2 Example data arranged in tables and columns
Figure 11.3 The database chronology
Figure 11.4 Hierarchical database schema
Figure 11.5 Hierarchical database occurrences
Figure 11.6 Network database schema
Figure 11.7 Network database occurrences
Figure 12.1 A multidimensional data model
Figure 12.2 A typical ‘star’ schema for a data warehouse
Figure 12.3 A ‘snowflake’ schema
Figure 12.4 A ‘galaxy’ schema
Figure 13.1 The original ‘workshop’ model
Figure 13.2 The third normal form model
Figure 13.3 The final model

Table 4.1 Identified entity types or object classes
Table 8.1 Examples of formal attribute names
Table A.1 Table of equivalences


ABOUT THE AUTHOR

Keith Gordon was a professional soldier for 38 years, joining the Army straight from school at 16 and retiring on his 55th birthday. During his service with the Royal Armoured Corps, the Royal Corps of Signals and the Royal Army Educational Corps (now the Educational and Training Services Branch of the Adjutant General’s Corps) he gained a Higher National Certificate in Electrical, Electronic and Telecommunications Engineering, a Certificate in Education from the Institute of Education of the University of London, a Bachelor of Arts from the Open University and a Master of Science in Design of Information Systems from Cranfield Institute of Technology.

The Master of Science course, held at the Royal Military College of Science, was unclear about what sort of information system the students were supposed to be designing. Was it a business system to be used in the non-operational world of the military? Was it a command and control information system to be used on the battlefield? Was it a real-time system to be used in areas such as weapon control? Or was it a management information system?

The course did, however, cover some really useful stuff. On the technical side this included programming in Ada and Coral-66, which are languages designed for embedded and real-time systems. We also studied Soft Systems Methodology (the academic lead for the course had researched for his doctorate at Lancaster University under the supervision of Professor Peter Checkland) and we looked, in particular, at the work of Professor Brian Wilson specialising in the application of Soft Systems Methodology to the development of information systems. The Structured Systems Analysis and Design Method (SSADM) (now called ‘Business System Development’ and the impetus for the Business System Development scheme of BCS which includes the Business Analysis and Solution Development diplomas and used to include a Data Management diploma) also formed a substantial part of the course.

Following the Design of Information Systems course, Keith spent three years as a consultant in the Army School of Training Support, where he looked into and procured computer systems for use in education and training – computer-based training (CBT). This role was part researcher and part business analyst. The next two years were spent as the Senior Education Officer in the Army’s apprentice college for the training of apprentice soldier chefs.

In 1992, he was posted to the Ministry of Defence and joined a new team of four officers and a civil servant ‘doing data management’ for the Army. In 1995, he was promoted to Lieutenant Colonel and became the head of that team until he retired from the Army in 1998.


He is now an independent consultant and lecturer specialising in data management and business analysis. As well as developing and teaching commercial courses, he was for a number of years a tutor for the Open University, tutoring general computing and database courses in the undergraduate and postgraduate programmes.

He is a Chartered Member of BCS, The Chartered Institute for IT, a Member of the Chartered Institute of Personnel and Development and a Fellow of the Institution of Engineering and Technology.

He holds the Diploma in Business Systems Development specialising in data management from BCS – formerly the Information Systems Examination Board (ISEB) – and he is now an examiner for the Business Systems Development scheme.

He represents the UK within the international standards development community by being nominated by the British Standards Institution (BSI) to the international standards committee, ISO/IEC JTC1 SC32 WG2 (Information Technology – Data Management and Interchange – Metadata). In this role, he has contributed to the development of ISO/IEC 11179 (Metadata registries) and ISO/IEC 19763 (Metamodel framework for interoperability).

For a number of years, Keith was the secretary of the BCS Data Management Specialist Group and, as a founder member, was a committee member of the UK chapter of DAMA International, the worldwide association of data management professionals.


FOREWORD

Business analysts have a curiosity about the business environment. They are keen to understand how processes can be improved, how customers can be given better service and, ultimately, how their organisations can be successful. As analysts though, if we want to understand how things work and can be improved, we can’t look at processes alone. The other key dimension that underlies all of these aspects, and provides a firm foundation for the organisation’s work, is the information that makes the business operate. Information is the lifeblood of organisations and the people working within them.

Let’s think about what information offers:

• evidence for root cause analysis;
• a basis for decisions;
• measures for evaluating performance;
• tangible indications of opportunities;
• parameters for applying business rules.

Information can address all of these areas and more, and provide a means of challenging assumptions and opinions. Surely this is a much better approach than employing gut feel or inventing ideas to suit personal agendas.

However, if businesses require information to operate effectively, they need a clear understanding of their data. If processes are to be efficient and effective, decision-making is to be precise and customer service is to be of the highest standard, the data needs to be accurate, accessible and available.

Data is at the heart of business. It forms the basis for providing essential information, including:

• who our customers are and what work they have done with us;
• the nature and characteristics of our products and services;
• the details regarding our financial situation and staff.

If organisations understand the importance of data, and work with it effectively, they can succeed in today’s world of high expectations and intense competition. If organisations fail to acquire, record, manage and utilise data, then business failure will surely follow.


Aside from the effective operation of the enterprise, there are also the opportunities that data can clarify or make available to forward-looking, receptive organisations. Data can be interpreted to offer information about the changing nature of the business environment. For example, new service requirements, the demographic make-up of customers, areas where product customisation is desired; all of these can provide opportunities for the organisation to learn and grow. We often talk about the learning organisation, but to become one relies on the receipt of good feedback (the data) and acting upon it (the processes).

Over the last couple of decades though, it has felt as if data was a secondary dimension with process improvement taking centre stage. There has been a move to almost ignore data requirements within organisations and place the focus on the business processes and the customer experience. Those of us who have worked as data analysts, modellers and managers, have long feared the effect this approach could have, predicting that the impact of a reduced focus on data would eventually be recognised and hoping that it wouldn’t be at too high a cost. The advent of popular memes such as ‘big data’ has certainly brought data back to the forefront but there is also the issue of smaller, everyday data – the data that makes the wheels go around rather than completely reinventing the wheel. This data may not help us to predict the innovations of the future but without it the organisation can’t operate and will fail to identify where change is required.

A popular misconception has been that analysing data is ‘difficult’ and ‘technical’, and should be the responsibility of those working as software architects or developers rather than the business analysts engaging with business stakeholders. The reality that information and data reflect business requirements seems to have been lost somewhere along the way.

Yet, if organisations are to ‘learn’ and benefit from receiving informed feedback in order that they can respond and grow, they have to understand that data is important and should be handled with care. There needs to be an appreciation that the data reflects the operations, policies and rules of the organisation and while these may be embedded within software, they originate from people making decisions – including business managers and staff, external customers and suppliers, and regulatory agencies. In other words, data is not a technical domain, it is something everyone needs to appreciate, and the analysis of data needs to be conducted by those with business understanding. There should be people within the organisation who have the expertise and insight to elicit data requirements, analyse the structure and semantics of data, build clear models of the data and manage the data resource. We are in an insecure world where there is increasing recognition of the importance of data and the need to ensure data security and protection.

Which brings me to this book. Those of us who have long-lamented that we regularly encounter a limited data focus should congratulate Keith Gordon for providing such a comprehensive, clear and practical resource in this book. The topics covered take us through the process of eliciting, modelling and validating data. The key approaches to representing and understanding data are explained, including the often-overlooked topic of data normalisation. All in all, the book provides extensive guidance that the business analyst – and anyone else requiring an understanding of data analysis – needs if they are to work effectively with data. The book helps anyone new to the world of data to learn the techniques and principles behind successful data analysis. The breadth of the book also helps those with experience in data analysis to encounter new ideas, brush up and broaden their knowledge, and deepen their understanding.

Modelling Business Information encourages readers to understand that data is not just about modelling for the technical solution; it is concerned with understanding the organisation, the rules it applies, and the information it needs. In other words, data analysis is a business discipline and the work to understand data should be performed by those with a business mindset.

Organisations require business analysts who can help the business staff to articulate data requirements and ensure that information needs can be met. Tomorrow’s business world needs data to be collected, governed and analysed in order to be an effective resource for organisations. This book helps organisations to do this. You should read it and use it as a key business resource.

Debra Paul Managing Director, Assist Knowledge Development

June 2017


ACKNOWLEDGEMENTS

Understanding and documenting the information needs of the business are an essential part of data management, so my real introduction to modelling these concepts and things came when I joined the Army’s data management team. Among the people to whom I owe a debt of gratitude are my colleagues in that team, Ian Nielsen, Martin Richley, Duncan Broad and Tim Scarlett, as well as the consultants who helped us develop our ‘corporate data model’ and design the resulting database, David Gradwell, Ken Allen, Ron Segal (who is now in New Zealand), Elaine Senior and the wonderful Harry Ellis who has done so much to help the world know how to understand and document information requirements through his pioneering work on data modelling and data management. Later members of that team included, among others, Mark Thurlow, Lucy Finney and Peter Lawson.

Over the last 20 years or so I have had many interesting conversations about information and data modelling (yes, it is possible!) with, among others: Bob Walker and Gene Simaitis, both formerly of the Institute for Defense Analyses in Washington, DC; Mike Newton and Steven Self from the Open University; Hajime Horiuchi and Masao Okabe from Japan and Ray Gates from Canada, all colleagues in ISO/IEC JTC1 SC32 WG2; Matthew West, author of Developing High Quality Data Models, formerly of Shell and now also a colleague in ISO/IEC JTC1 SC32 WG2; David Hay from Houston, USA, author of many books including Data Model Patterns, Conventions of Thought, Requirements Analysis and UML and Data Modelling, a Reconciliation; and Alec Sharp from Canada, conference speaker and co-author of Workflow Modeling, Tools for Process Improvement and Application Development.

Special mention must go to David Beaumont from Stehle Associates, my constant sounding board for my ideas over the last 18 years.

Homing in on this book, I need to thank Terri Lydiard, a fellow BCS examiner, who reviewed an early draft of Chapter 1 and Keith Hare, of JCC Consulting in Granville, Ohio, USA and convenor of ISO/IEC JTC1/SC32/WG3, who reviewed Chapters 11 and 13. Both provided valuable comments which I have tried to incorporate.

Thanks are also due to Ian Borthwick and Rebecca Youé of BCS who have been responsible for getting this book into print.

Finally, a massive thank you to the back-up team at home, my wife, Vivienne. She has found it difficult to understand why I am not wandering around a golf course or sitting in an armchair by the fire instead of enjoying myself running around the world attending meetings, teaching, examining, writing books and generally getting involved in things because of my total inability to say, ‘No’. I have promised Vivienne that I will look up the word ‘Retirement’ in the dictionary one day – but not yet.


GLOSSARY

aggregation (Class modelling) A special form of association that specifies a whole–part relationship between an object class representing the aggregate (whole) and another object class representing the component part.

alternate key (Relational data analysis) A candidate key that is not selected to be the primary key for the relation.

artefact A diagram or supporting description providing a representation of the system of interest.

association (Class modelling) A business link between two object classes. The link is required in order to navigate from one class to another.

association class (Class modelling) An object class that has both association and object class properties.

attribute (Entity relationship modelling and class modelling) A named characteristic of an entity type or object class whose values serve to qualify, identify, classify, quantify or express the state of an instance of that entity type or object class.

big data A data set, or a collection of data sets, with characteristics (for example, volume, velocity, variety, variability, veracity) that for a particular problem domain at a given point in time cannot be efficiently processed using current/existing/established/traditional technologies and techniques in order to extract value.

candidate key (Relational data analysis) An attribute, or a set of attributes, that provides the ability to uniquely identify a tuple in a relation without referring to any other data, such that no two tuples in a relation can have the same value, or set of values, for their candidate keys.

cardinality (Entity relationship modelling and class modelling) The degree of occurrence indicated on a relationship between two entity types or an association between two object classes. The cardinality reflects part of the business rules for a relationship or association.

CASE Acronym for computer-aided software engineering – a combination of software tools that assist computer development staff to engineer and maintain software systems, normally within the framework of a structured method.

Chen notation (Entity relationship modelling) The original entity–relationship modelling notation.


class model (Class modelling) A technique from the Unified Modeling Language (UML). A class model describes, using graphics and documentation, the classes in a system and their associations with each other. Within business analysis the classes are limited to the things of significance about which information needs to be held in support of business operations.

composite identifier (Entity relationship modelling) A unique identifier formed from a combination of attributes.

composite key (Relational data analysis) A candidate key comprising more than one attribute.

composition (Class modelling) A form of aggregation which requires that a part instance be included in at most one composite at a time, and that the composite object is responsible for the creation and destruction of the parts.

column The logical structure within a table of a relational database management system (RDBMS) that corresponds to the attribute in the relational model of data.

conceptual data model A detailed model that captures the overall structure of organisational data while being independent of any database management system or other implementation consideration – it is normally represented using entity types, relationships and attributes with additional business rules and constraints that define how the data is to be used.

corporate data model A conceptual data model whose scope extends beyond one application system.

data A reinterpretable representation of information in a formalised manner suitable for communication, interpretation or processing.

data analysis The process of understanding and documenting in a data model the information (or data) requirements of a business or business area; data analysis is a part of business analysis.

data mining The process of finding significant, previously unknown and potentially valuable knowledge hidden in data.

data model (i) An abstract, self-contained logical definition of the data structures and associated operators that make up the abstract machine with which users interact (such as the relational model of data). (ii) A model of the persistent data of some enterprise (such as an entity–relationship model or class model of the data required to support a business or business area).

data modelling The task of developing a data model that represents the persistent data of some enterprise.

data type A constraint on a data value that specifies its intrinsic nature, such as numeric, alphanumeric or date.

data warehouse A specialised database containing consolidated historical data drawn from a number of existing databases to support strategic decision-making.


database (i) An organised way of keeping records in a computer system. (ii) A collection of data files under the control of a database management system.

database management system (DBMS) A software application that is used to create, maintain and provide controlled access to databases.

described domain (Entity relationship modelling and class modelling) A domain that is specified by a description or specification, such as a rule, a procedure or a range (i.e. interval); a domain that is not enumerated.

domain (Entity relationship modelling and class modelling) A named pool (or set) of values from which an instance of an attribute must take its value; a domain provides a set of business validation rules, format constraints and other properties for one or more attributes.

Ellis-Barker notation (Entity relationship modelling) A modelling notation designed by Harry Ellis and Richard Barker while working at the consultancy company CACI with business users in mind so as to reduce interactions with those users. This notation was later used by the Oracle Corporation and by the UK Government’s Central Computer and Telecommunications Agency (CCTA) for its Structured Systems Analysis and Design Method (SSADM).

entity (Entity relationship modelling) A named thing of significance about which information needs to be held in support of business operations.

entity occurrence (Entity relationship modelling) A single instance of an entity within an entity type.

entity relationship model (Entity relationship modelling) A data model based on entity types and their attributes and relationships.

entity subtype (Entity relationship modelling) A subset of the instances of an entity type, known as the supertype, that share common attributes or relationships distinct from other subsets of the supertype.

entity type (Entity relationship modelling) An element of a data model that represents a set of characteristics common to a collection of entities that are instances of the type.

enumerated domain (Entity relationship modelling and class modelling) A domain that is specified by a list of all its permitted values.

first normal form (1NF or FNF) (Relational data analysis) A relation is in first normal form if all the values taken by the attributes of that relation are atomic or scalar values – the attributes are single-valued or, alternatively, there are no repeating groups of attributes.

foreign key (Relational data analysis) One or more attributes in a relation that implement a many-to-one relationship that the relation has with another relation or with itself (the reference); the values of the foreign key in a tuple must match the values of the primary key in one of the tuples in the referenced relation.


hierarchic identifier (Entity relationship modelling) A unique identifier where at least one element of the identifier is a relationship; a hierarchic identifier may be either a combination of relationships or a combination of attribute(s) and relationship(s).

hierarchic key (Relational data analysis) A candidate key comprising more than one attribute where part of the candidate key is a foreign key.

IDEF1X (Entity relationship modelling) An entity relationship modelling notation from the family of ICAM (Integrated Computer-Aided Manufacturing) Definition Languages (IDEF) used by the US Federal Government.

information (i) Something communicated to a person. (ii) Knowledge concerning objects, such as facts, events, things, processes or ideas, including concepts, which have a particular meaning within a certain context.

Information Engineering notation (Entity relationship modelling) An entity relationship modelling notation that is one of the techniques used in Information Engineering, a methodology developed by James Martin and Clive Finkelstein in the late 1970s.

master data management The authoritative, reliable foundation for data used across many applications and constituencies with the goal to provide a single version of the truth.

metadata Data about data – that is, data describing the structure, content or use of some other data.

multiplicity (Class modelling) A statement, consisting of a lower-bound (or minimum) and upper-bound (or maximum) of the form ‘minimum..maximum’, of the number of elements that may exist in a collection; when applied to an association it represents the cardinality and optionality of the association and when applied to an attribute it represents the optionality of the attribute.

multimedia data Data representing documents, audio (sound), still images (pictures) and moving images (video).

normal form (Relational data analysis) A state of a relation that can be determined by applying simple rules regarding dependencies to that relation.

normalisation (Relational data analysis) Another name for relational data analysis.

object (Class modelling) A construct within a system for which a set of attributes and operations can be specified; an instance of a particular object class.

object class (Class modelling) A definition of a set of objects that share the same attributes, operations and associations.

object-orientation A software-development strategy based on the concept that systems should be built from a collection of reusable components called objects that encompass both data and functionality.


object subclass (Class modelling) A subset of the instances of an object class, known as the superclass, that share common attributes and associations distinct from other subsets of the superclass.

ODMG Abbreviation for the Object Data Management Group, a body that has produced a specification for object-oriented databases.

OLAP Acronym for online analytical processing – a set of techniques that can be applied to data to support strategic decision-making.

OLTP Abbreviation for online transactional processing – data processing that supports operational procedures.

operation (Class modelling) A set of actions performed on the data within an object.

optionality (Entity relationship modelling and class modelling) The ability of an instance of an entity type or object class to exist without being linked to an instance of the related entity type or object class. The optionality reflects part of the business rules for a relationship or association.

permitted value (Entity relationship modelling and class modelling) One of the explicit set of values that comprise an enumerated domain.

primary key (Relational data analysis) The candidate key that is selected to enforce uniqueness of tuples in a relation.

RDBMS Abbreviation for relational database management system – a database management system whose logical constructs are derived from the relational model of data. Most relational database management systems available are based on the SQL database language and have the table as their principal logical construct.

relation (Relational data analysis) The basic structure in the relational model of data – formally a set of tuples, but informally visualised as a table with rows and columns.

relational data analysis A technique of transforming complex data structures into simple, stable data structures that obey the rules of relational data design, leading to increased flexibility and reduced data duplication and redundancy – also known as normalisation.

relational model of data A model of data that has the relation as its main logical construct.

relationship (Entity relationship modelling) A named set of characteristics common to a collection of connections between instances of two or more entity types, or between instances of one entity type and other instances of the same entity type.

schema A description of the overall structure of a database expressed in a data definition language (such as the data definition component of SQL).

second normal form (2NF or SNF) (Relational data analysis) A relation is in second normal form if it is in first normal form and every non-key attribute is fully dependent on the primary key – there are no part-key dependencies.


simple identifier (Entity relationship modelling) A unique identifier formed from a single attribute.

simple key (Relational data analysis) A candidate key comprising just one attribute.

SQL Originally, SQL stood for structured query language. Now, the letters SQL have no meaning attributed to them. SQL is the database language defined in the ISO/IEC 9075 set of international standards, the latest edition of which was published in 2016. The language contains the constructs necessary for data definition, data querying and data manipulation. Most vendors of relational database management systems use a version of SQL that approximates to that specified in the standards.

structured data Data that has a high level of organisation in that it conforms to specified data types and relationships and is managed by technology that allows for querying and reporting, such as data within relational databases and spreadsheets.

surrogate identifier (Entity relationship modelling) An artificial (i.e. not real world) unique identifier formed from an attribute or a combination of attributes that are either system-generated or allocated by a user.

table The logical structure used by a relational database management system (RDBMS) that corresponds to the relation in the relational model of data – the table is the main structure in SQL.

third normal form (3NF or TNF) (Relational data analysis) A relation is in third normal form if it is in second normal form and no transitive dependencies exist.

tuple (Relational data analysis) A construct in the relational model of data that is equivalent to a row in a table or an occurrence of an entity – it contains all the attribute values for each instance represented by the relation.

Unified Modeling Language (UML) A set of diagramming notations for systems analysis and design based on object-oriented concepts.

unique identifier (Entity relationship modelling) An attribute, a combination of attributes, a combination of relationships or a combination of attribute(s) and relationship(s) that provides the ability for each entity to be uniquely identifiable so that each instance of an entity type is distinctly identifiable from all other instances of that entity type.

unstructured data Computerised information which does not have a data structure that is easily readable by a machine, including audio, video and unstructured text such as the body of a word-processed document – effectively this is the same as multimedia data.

validation rule (Entity relationship modelling and class modelling) A statement of the validation that may be applied to a described domain; this statement may be a reference to a data type to be applied to attributes, a range of values, or a ‘format mask’, or any other expression that constrains the domain.
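
The candidate key entry above can be made concrete with a small sketch. The Python below is only an illustration: the function name `uniquely_identifies` and the sample employee data are assumptions, not taken from the book. It checks whether a chosen set of attributes takes distinct values across a sample of tuples. Passing the check on a sample does not prove that the attributes form a candidate key, since that depends on the business rules that hold for every permitted state of the relation.

```python
from typing import Any, Mapping, Sequence


def uniquely_identifies(
    tuples: Sequence[Mapping[str, Any]], attributes: Sequence[str]
) -> bool:
    """Return True if no two tuples share the same values for `attributes`."""
    seen = set()
    for row in tuples:
        # Project the tuple onto the chosen attributes.
        key = tuple(row[name] for name in attributes)
        if key in seen:
            return False  # two tuples agree on every chosen attribute
        seen.add(key)
    return True


# Hypothetical sample relation, used for illustration only.
employees = [
    {"staff_no": 1, "name": "Sue", "department": "Production"},
    {"staff_no": 2, "name": "John", "department": "Sales"},
    {"staff_no": 3, "name": "Sue", "department": "Sales"},
]

print(uniquely_identifies(employees, ["staff_no"]))            # True
print(uniquely_identifies(employees, ["name"]))                # False
print(uniquely_identifies(employees, ["name", "department"]))  # True
```

In this sample, `staff_no` behaves like a simple key and the pair of `name` and `department` behaves like a composite key, while `name` alone fails because two tuples share the value "Sue".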


INTRODUCTION

In my previous book,1 I looked at how information, and its cousin, data, should be managed as an enterprise-wide resource. In this book, I am looking at the role of business analysts in understanding and documenting the information that needs to be recorded in an information system or its supporting information technology (IT) system to meet the needs of the business for the storage and retrieval of information.

The first part of the book (Part 1, The Basics, Chapters 1 to 6) covers the requirements for the Data Analysis certificate that is part of the scheme for the BCS Advanced International Diploma in Business Analysis.2

The book will, therefore, be of immediate interest for anybody who is studying for this certificate. It should also be of interest to all business analysts as I have tried to set out how an entity relationship model (also known as an entity relationship diagram) or a UML class model (also known as a class diagram) can help a business analyst understand the information needs of a particular business area and then help communicate that understanding, both to the business users and, finally, to the systems developers.

The second part of the book (Part 2, Supplementary Material, Chapters 7 to 14) provides extra information that I believe should be of interest to business analysts.

These chapters are followed by three appendices. Appendix A provides a table to show the equivalence between the concepts used in the various parts of the book. Appendix B provides a bibliography and Appendix C provides solutions to the exercises introduced in Part 1.

1 Principles of Data Management: Facilitating Information Sharing, Second Edition (BCS, 2013).

2 http://certifications.bcs.org/category/18428


PART 1: THE BASICS

The first part of the book (Chapters 1 to 6), which provides a general introduction to entity relationship modelling and UML class modelling, covers the requirements for the Data Analysis certificate that is part of the scheme for the BCS Advanced International Diploma in Business Analysis.

Chapter 1, Why business analysts should model information, provides an introduction to business analysis, systems, information, data and modelling and why these topics come together within the development of requirements for an IT system. The notations used within the book, the Ellis-Barker entity relationship notation and the UML class diagram notation, are introduced. The chapter finishes with a discussion of data analysis.

Chapter 2, Modelling the things of interest to the business and the relationships between them, introduces the basic modelling concept of the entity to represent something of interest to the business about which information needs to be recorded and the related concept of the entity type, the representation of a group of entity occurrences with common characteristics. The relationships between entity types are also introduced. Alongside the introduction of these concepts, the comparable concepts of object, object class and association are also introduced.

Chapter 3, Modelling more complex relationships, explores some of the more complex relationships that can exist between entity types or object classes. The topics covered are the resolution of many-to-many relationships and associations (including the oddity known as the ‘Bill of Materials’ structure), mutually exclusive relationships and associations, which leads to generalisation and specialisation, and, finally, a quick look at aggregation and composition.

Chapter 4, Drawing and validating information model diagrams, introduces a process for drawing an information model diagram. It then considers two techniques for validating an information model – the data navigation path and the Create-Read-Update-Delete (CRUD) matrix.

Chapter 5, Recording information about things, introduces the related concepts of the attribute, the unique identifier and the domain, and their representation on both entity relationship models and class models. The object-oriented concept of the operation is also introduced at the end of the chapter.

Chapter 6, Rationalising data using normalisation, involves a change of direction as we look at the process of relational data analysis (or normalisation). We need to look at the theory of the relational model of data – the ‘model’ that underpins all of the database management systems that use the SQL database language. Having understood the theory, we then look at the process of normalisation and the production of a ‘third normal form’ model.

These chapters should be read sequentially, from Chapter 1 through to Chapter 6. Revision exercises are provided at the end of Chapters 2 to 6.


1 WHY BUSINESS ANALYSTS SHOULD MODEL INFORMATION

This chapter provides an introduction to business analysis, systems, information, data and modelling and why these topics come together within the development of requirements for an IT system. The chapter finishes with a discussion of data analysis.

WHAT IS BUSINESS ANALYSIS?

Business analysis is a discipline that has been evolving for about 20 years. Its main purpose is to ensure that there is alignment between business needs and business change solutions. Many of these business change solutions involve the development of new – or the enhancement of existing – information technology (commonly abbreviated to IT) systems.

There is no fixed route to becoming a business analyst. Some business analysts have a strong information technology background and have developed an understanding of business in general and their business organisation in particular. Other business analysts have a strong business background and, where a solution involving information technology is concerned, they need to have obtained an understanding both of the capabilities provided by information technology and of how an information technology system is developed.

The word ‘system’ appears in the two preceding paragraphs because it is important for the business analyst to grasp hold of ‘systems thinking’. Whether the proposed business change solution involves the use of information technology or not, the business analyst is working with or specifying the requirements for systems. These systems may be business systems, information systems or information technology systems.

So, what is a system?

Professor Michael C. Jackson of the University of Hull has defined a system as

a complex whole the functioning of which depends on its parts and the interactions between those parts.3

Using this definition, the term ‘system’ can be applied to a hard, designed system such as a central heating system or to a soft, or human activity, system such as a business organisation.

A central heating system consists of a boiler, radiators, pipes and, importantly, a thermostat to keep the whole system under control. This is a ‘complex whole’, the functioning of which depends on all of those parts working together.

3 Systems Thinking: Creative Holism for Managers (2003), page 3, Michael C. Jackson.


A business, whether in the private, public or not-for-profit sectors of the economy, consists of people (employees, suppliers and customers), organisations (headquarters, branches, departments) and processes (including ordering and receiving goods and selling goods). All businesses require information to manage their people, organisations and processes and, for most businesses these days, there is information technology providing support for the management of that information. So, a business can be seen as another ‘complex whole’, the functioning of which depends on the parts, the people, the organisations, the processes, the information and the technology, working together to achieve the goals of the business. There will also be checks and balances to ensure that the business remains effective and efficient – the equivalent of the thermostat in the central heating system.

Any business can, therefore, be considered as a system – a business system. Any system can have a number of subsystems, so a business system can also have subsystems.

One of the important subsystems of a business system is the information system, or set of information systems, which supports the business by managing the business’s information. I define an information system as:

a system that gets the right information to the right person in the right place at the right time.

We need, therefore, to think about information systemically. If there is a requirement for Sue in production to receive details of an important sales order as soon as it arrives in the business, then we need to arrange for that to happen. It could be that the arrangement is for the salesman, John, who has just completed the sale, to walk along the corridor to production to tell Sue about the sales order. The right information (details of the sales order) is being delivered to the right person (Sue) in the right place (the production department) at the right time (immediately) without the use of any technology. We have a technology-free information system! Yes, that is possible, but that information system still needs defining, developing, implementing and maintaining.

Most modern businesses require the information system to be supported by an information technology system – a collection of hardware, software and networks that function together to store and retrieve information.

The business analyst needs to think in terms of three levels of system: the business system itself; the subsystem that handles the information for the business (the information system); and the sub-subsystem that provides the technology to support the information system (the IT system). This is shown diagrammatically in Figure 1.1.

Figure 1.1 Three levels of system

Diagram: the IT system shown within the information system, which is shown within the business system.


WHY BUSINESS ANALYSTS SHOULD MODEL INFORMATION

Information is a key business resource in all businesses, even if the senior managers of most businesses fail to recognise that fact.

INFORMATION AND DATA

In my previous book, I explained the relationship between information and data. I repeat that explanation here, slightly edited, because this book is about modelling business information with a view to storing that information as data within an information system or an information technology system.

An often-heard definition of information is that it is ‘data placed in context’. This implies that some information is the result of the translation of some data using some processing activity, and some communication protocol, into an agreed format that is identifiable to the user. In other words, if data has some meaning attributed to it, it becomes information.

For example, what do the figures ‘190267’ represent? Presented as ‘19/02/67’, it would probably make sense to assume that they represent a date. Presented on a screen with other details of an employee of a company, such as name and address, in a field that is labelled ‘Date of Birth’ the meaning becomes obvious. Similarly, presented as ‘190267 metres’, it immediately becomes obvious that this is a long distance between two places but, for this to really make sense, the start point and the end point have to be specified as well as, perhaps, a number of intermediate points specifying the route.
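The point about ‘190267’ can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the day-month-year layout and the choice of the 1900s for the year are assumptions made for the example, not something the figures themselves tell us.

```python
from datetime import date

RAW = "190267"  # the bare figures from the text: data with no context


def as_date_of_birth(raw: str) -> date:
    """Interpret six digits as DDMMYY (assumed layout, assumed 1900s year)."""
    day, month, year = int(raw[0:2]), int(raw[2:4]), int(raw[4:6])
    return date(1900 + year, month, day)


def as_distance_km(raw: str) -> float:
    """Interpret the same digits as a distance in metres, reported in km."""
    return int(raw) / 1000


# The same data yields different information under different contexts.
print("Date of Birth:", as_date_of_birth(RAW).isoformat())
print("Distance (km):", as_distance_km(RAW))
```

Neither function is the ‘correct’ reading of the figures. Which one applies depends entirely on the context supplied with the data, such as the field label on the screen.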

While these examples demonstrate the relationship between data and information, they do not provide a clear definition of either data or information.

There are many definitions of data available in dictionaries and textbooks, but the essence of most of these definitions is the understanding that data is ‘facts, events, transactions and similar that have been recorded’. Furthermore, the definition of information is usually based on this definition of data. Information is seen as data in context or data that has been processed and communicated so that it can be used by its recipient.

The idea that data is a set of recorded facts is found in many books on computing. However, this concept of data as recorded facts is used beyond the computing and information systems communities. It is, for example, also the concept used by statisticians. Indeed, the definition of data given in Webster’s 1828 Dictionary – published well before the introduction of computers – is ‘things given, or admitted; quantities, principles or facts given, known, or admitted, by which to find things or results unknown’.4

However, starting the development of our definitions by looking at data first appears to be starting at the wrong point. It is information that is important to the business, and it is there that our definitions, and our discussion about the relationship between information and data, should really start.

We start by considering the everyday usage of information – something communicated to a person – and, with that, we can find a definition of data that is relevant to business analysts. That definition is found in ISO/IEC 2382-1:1993 (Information Technology – Vocabulary – Part 1: Fundamental Terms) stating that data is

a re-interpretable representation of information in a formalised manner suitable for communication, interpretation or processing.

4 See www.webstersdictionary1828.com

There is a note attached to this definition in the ISO/IEC standard which states that data can be processed by human or automatic means; so, this definition covers all forms of data but, importantly, includes data held in information systems used to support the activities of an organisation at all levels: operational, managerial and strategic.

Figure 1.2 The relationship between data and information

Diagram: knowledge about objects, etc. is the subject of information; the representation of information produces data for storage and processing; the interpretation of data produces information.

Figure 1.2 provides an overview of the relationship between data and information in the context of an information technology system. The user of the system extracts the required information from their overall knowledge and inputs the information into the system. As it enters the system, it is converted into data so that it can be stored and processed. When another system user requires that information to be retrieved, the data is interpreted – that is, it has meaning applied to it – so that it can be of use to the user.

THE IMPORTANCE FOR A BUSINESS ANALYST OF UNDERSTANDING INFORMATION NEEDS

Understanding the information needed by the business, and its representation, data, is vitally important if we are to develop effective information systems and the information technology systems to support them. In the BCS publication Business Analysis, now in its third edition, the need for a business analyst to take a holistic view of the business is stressed, where the holistic view is defined as encompassing people, organisations, processes, information and technology (my emphasis). Yet, as an examiner, I am constantly coming across practising or aspiring business analysts who believe that understanding and documenting the information needed by the business is nothing to do with them. There appears to be a view that if the processes are sorted out the information will look after itself. I think this is wrong. We are, after all, concerned with information systems or information technology systems, not processing systems.

THE ROLE OF MODELS IN BUSINESS ANALYSIS

Models in various forms play an important role in business analysis. As described later in the chapter, models help the business analyst understand and communicate requirements. Modelling is an essential competence for a business analyst.

When trying to understand how a business is currently running, we could draw a rich picture (see Figure 1.3).

Figure 1.3 A rich picture

Drawing with the labels: Managing Director, Finance Director, Sales Director, IT/IS Department, Old Order Processing System, New CRM System, Call Centre, Sales Force, Manufacturing, External Suppliers and Customers. Annotations: ‘?’, ‘£ ££’, ‘Trouble brewing’, ‘We are the experts’ and ‘Where’s our order?’.

When trying to understand what a business should be doing, we can draw a business activity model (see Figure 1.4). Both of these valuable techniques are derived from Peter Checkland’s Soft Systems Methodology.5

5 Checkland, P. (1981) Systems Thinking, Systems Practice. John Wiley & Sons, Chichester, UK provides a useful insight into the use of Soft Systems Methodology. For a shorter read try Checkland, P., Poulter, J. (2006) Learning for Action: A Short Definitive Account of Soft Systems Methodology and its use for Practitioners, Teachers and Students, John Wiley & Sons, Chichester, UK.


Figure 1.4 A business activity model

Diagram with the activities: Define target customers; Manage branch network; Decide seasonal fashion range; Procure stock; Distribute stock to branches; Sell fashion; Manage staff; Define sales targets; Monitor sales volume; Monitor profit margins; Monitor staff performance; Take control action.

When trying to understand the current business processes we will probably draw a set of ‘as-is’ business process models (see Figure 1.5). We will then draw a series of ‘to-be’ business process models and discuss those with the business.

Figure 1.5 A business process model

Swimlane diagram. Lanes: Salesman, Sales Ledger, Sales Administration, Warehouse. Activities: Receive order, Record Order, Check customer status, Fulfil Order, Despatch goods, Despatch invoice, Return order to client. Decision guards: [status satisfactory] and [status not satisfactory].


As we move on to consider the functionality that an information technology system has to support, we might start drawing a use case diagram (see Figure 1.6).

Figure 1.6 A use case diagram

Diagram of the system ‘Sales Order Fulfilment’. Use cases: Receive order, Record order, Check customer status, Despatch goods, Despatch invoice, Record payment. Actors: Salesman, Sales Administrator, Warehouse Team, Sales Ledger. One <<include>> relationship is shown.

All of these models have two main roles.

The first of these roles is to help the analyst understand the situation they are analysing as they carry out the analysis.

Secondly, once developed, they help the analyst communicate that understanding back to the business (and, in the process, maybe demonstrate their misunderstanding, leading to a correction to the model) and, in the case of an information technology system development or enhancement, forward to the developers of that system.

When an information technology system is being developed, the process of eliciting the requirements from the business, analysing those requirements to ensure that they are good quality requirements, validating the requirements with the business, managing and documenting the requirements is known as ‘requirements engineering’. This is shown in context in Figure 1.7.

Figure 1.7 Requirements engineering in context

Diagram: Requirements Engineering sits between Business Information Requirements and System Development. Annotations: ‘Models used here to communicate with the business’ and ‘Models used here to communicate with the developers’.

As you can see, we can think of requirements engineering as the filling in a sandwich, with the business and its requirements on one side and the system developers on the other.


When eliciting requirements from the business, models can be used in workshops or other interactions with the business to aid understanding. It can sometimes be useful to sketch out a model on a whiteboard or flipchart while talking to the users. Models are vital when you are asking the business to validate a set of requirements – ‘a picture paints a thousand words’.

Models are also essential when passing, or discussing, requirements to, or with, those who will have responsibility for the development of the system. It is not just that ‘IT people’ like to see models. Models will often express ideas and requirements much more clearly than is possible in text alone.

The trick is, of course, to use the same models when validating requirements with the business and when passing those requirements to the developers. ‘Business people’ do not have the time to learn complicated modelling notations and syntax. For them the models must be easy to understand. On the other hand, the developers want the models to be complete, clear, concise and unambiguous, and this can lead to the use of a complex set of notational elements. It can also, in some circumstances, lead to the use of some particular technical constructs that, from a business perspective, are unnecessary.

If we want to use the same models to communicate with the business and to communicate with the developers, we need to use modelling notations and conventions that are both easy for the business to understand and sufficiently detailed to completely, concisely, clearly and unambiguously convey the requirements to the system developers.

DATA MODELS AND DATA

We have seen a range of models that are in the business analyst’s toolkit. None of those we have seen so far truly helps us to understand what information a system needs to hold to enable it to carry out its functions. These information requirements are modelled with a model called, confusingly, a data model.

In fact, data models can have two roles.

Firstly, they can be used to specify what data (or information) the business needs recorded within the system. Here the model is a complete, concise and unambiguous statement of the information requirements of the business for the system under consideration. This model is the responsibility of the business analyst.

Secondly, they can also be used to specify how the data is to be stored and organised within the system, so that it can be retrieved and analysed. Here the model is a specification for the design of the database of the system. This model (or, more probably, a set of models) is the responsibility of the database designer within the system development team.

The what model is a conceptual model, which is also known as a Computer-Independent Model (CIM). It represents the things that are of interest to the business about which information needs to be recorded, the specific information about these things that the business needs recorded, and any business relationships that exist between those things of interest. A model at this level can be considered as encapsulating the rules of the business, for example:

• ‘people or organisations cannot be recorded as customers until they have placed an order’;

• ‘customers can place many orders’; and

• ‘customers must be allocated a customer number and we must record their name and address’.

The system development team then develop a series of how data models that lead to the design of the system’s database. The first model the developers will produce will be a logical model (often known as a Logical Data Model (LDM) or a Platform-Independent Model (PIM)) and the final model will be a model that represents the actual physical design of the database (known as a Physical Data Model (PDM) or a Platform-Specific Model (PSM)).
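To make the difference between the what and the how models concrete, here is a minimal sketch of one possible physical design for the customer and order rules quoted above. It assumes a relational DBMS (SQLite, through Python's standard `sqlite3` module). The table names, column names and data types are hypothetical choices made for this illustration; they do not come from the conceptual model, which deliberately says nothing about them.

```python
import sqlite3

# One possible 'how': the mandatory customer number, name and address
# become NOT NULL columns, and 'customers can place many orders' becomes
# a foreign key from the order table back to the customer table.
DDL = """
CREATE TABLE customer (
    customer_number INTEGER PRIMARY KEY,
    name            TEXT NOT NULL,
    address         TEXT NOT NULL
);
CREATE TABLE customer_order (
    order_number    INTEGER PRIMARY KEY,
    order_date      TEXT NOT NULL,
    customer_number INTEGER NOT NULL REFERENCES customer (customer_number)
);
"""


def build_database() -> sqlite3.Connection:
    """Create an in-memory database with foreign key checking switched on."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(DDL)
    return conn


conn = build_database()
conn.execute("INSERT INTO customer VALUES (1, 'Sue', '1 High Street')")
conn.execute("INSERT INTO customer_order VALUES (100, '2017-01-31', 1)")
```

Note that this simple design does not enforce the rule that a customer must already have placed an order. Decisions about how, or whether, such a rule is enforced in the database belong to the how models, which is one reason the business analyst's what model should state the rule without assuming an implementation.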

ENTITY RELATIONSHIP MODELLING

Data models have been around for well over 50 years. The earliest data modelling notation that I know of was for Bachman Diagrams. This notation was developed by Charles Bachman, one of the early database management system pioneers, to show the structure of a required database in the days before the advent of the modern ‘relational’ database.

Entity relationship modelling, until recently the most common form of data modelling, was first introduced by Peter Chen.6 There are, however, many ‘flavours’ of entity relationship model. In this book, I am going to stick to just one of these entity relationship modelling notations: the notation developed by Harry Ellis and Richard Barker in the early 1980s when they worked for CACI, a UK-based consultancy company. Unsurprisingly, we will refer to this as the Ellis-Barker notation. I will, however, discuss some other common notations in Chapter 7.

The Ellis-Barker notation was specifically designed with business users in mind. In fact, in Richard Barker’s own words, they were ‘striving for even greater accuracy in systems analysis, while minimising redundant interactions with the users’.7 The notation was used by the Oracle Corporation in its computer-aided software engineering (CASE) tool and was later adopted, in a truncated form, by the UK Government’s Central Computer and Telecommunications Agency (CCTA) for its Structured Systems Analysis and Design Method (SSADM). Richard Barker described the use of this notation in a book8 he wrote while working for the Oracle Corporation.

An example of an entity relationship model drawn using the Ellis-Barker notation is shown in Figure 1.8.

This model shows that the business concerned is interested in recording information about its products, its customers and the orders placed by those customers. Each of those orders has a number of ‘order lines’ (the items on the order) and a number of statuses.

6 Chen, P.P.S. (1976) The Entity–Relationship Model: Toward a Unified View of Data, ACM Transactions on Database Systems.

7 In the Foreword to Hay, D.C. (1996) Data Model Patterns: Conventions of Thought. Dorset House.

8 Barker, R. (1990) CASE*Method: Entity Relationship Modelling, Addison-Wesley.

Figure 1.8 An example entity relationship model using the Ellis-Barker notation

Diagram with five entity types, with each attribute marked (m):

PRODUCT: code, designation, retail cost
ORDER LINE: number, ordered quantity
CUSTOMER: number, name, address
ORDER: number, date
ORDER STATUS: designation, effective date

Relationship names: part of / comprised of; order for / subject of; placed by / placer of; for / described with.

There are a number of advantages to using the Ellis-Barker notation within business analysis. Specifically, the notation:

• was designed for use with business users – it names things in a way that the business will understand;

• avoids the use of technical components that have no relevance to the business user and, if included (as in most other notations), would confuse the business user;

• provides a limited, consistent set of symbols;

• with its supporting documentation, provides a complete, concise, clear and unambiguous statement of the information requirements.

In addition, the Ellis-Barker notation is well known in the United Kingdom, having been adopted for use within the Structured Systems Analysis and Design Method (SSADM). Although no specific notation is mentioned in the syllabus for the BCS Data Analysis certificate, it is, I believe, the notation that the originators of that syllabus had in mind for entity relationship modelling.

CLASS MODELLING

In the late 1980s and early 1990s there were a number of disparate modelling initiatives devised to cope with the design of systems based on the object-oriented programming paradigm. In the mid-1990s three proponents of their own notations, Grady Booch, Ivar Jacobson and James Rumbaugh (known as the three amigos) came together to create the Unified Modeling Language (UML)™. In 1997 UML was adopted as a standard by the Object Management Group (OMG),9 and in 2005 UML was also published by the International Organization for Standardization (ISO)10 as an approved ISO standard. The current version of the UML standard includes 13 diagram types, but we are only interested in one of those – the ‘class diagram’, which is used to model information and data. We will refer to the models developed using this diagramming notation as ‘class models’.

9 See www.omg.org/

10 See www.iso.org/

An example of a UML class model is shown in Figure 1.9. This example uses the same business scenario as that shown in Figure 1.8 using the Ellis-Barker notation.

Figure 1.9 An example of a UML class model

Diagram with five classes, with each attribute shown with the multiplicity [1..1]:

CUSTOMER: number, name, address
ORDER: number, date
ORDER LINE: number, ordered quantity
PRODUCT: code, designation, retail cost
ORDER STATUS: designation, effective date

Association names: places; part of; order for; describes. The association ends carry the multiplicities 1..1, 1..* and 0..*.
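For readers who find code easier to follow than diagrams, the class model in Figure 1.9 can be sketched as Python dataclasses. This is an illustration only. The assignment of multiplicities to association ends is my reading of the figure and of the business rules quoted earlier in the chapter (a customer places 1..* orders, an order has 1..* order lines and 0..* statuses, and an order line is for exactly one product), and the Python types and the `multiplicity_errors` helper are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date
from decimal import Decimal


@dataclass
class Product:
    code: str
    designation: str
    retail_cost: Decimal


@dataclass
class OrderLine:
    number: int
    ordered_quantity: int
    product: Product  # 'order for' exactly one product (assumed 1..1)


@dataclass
class OrderStatus:
    designation: str
    effective_date: date


@dataclass
class Order:
    number: int
    order_date: date
    lines: list[OrderLine] = field(default_factory=list)       # assumed 1..*
    statuses: list[OrderStatus] = field(default_factory=list)  # assumed 0..*


@dataclass
class Customer:
    number: int
    name: str
    address: str
    orders: list[Order] = field(default_factory=list)  # assumed 1..*


def multiplicity_errors(customer: Customer) -> list[str]:
    """List every place where a lower bound of 1 is not satisfied."""
    errors = []
    if len(customer.orders) < 1:
        errors.append(f"CUSTOMER {customer.number} places no ORDER (needs 1..*)")
    for order in customer.orders:
        if len(order.lines) < 1:
            errors.append(f"ORDER {order.number} has no ORDER LINE (needs 1..*)")
    return errors
```

The mandatory [1..1] attributes appear as constructor arguments without defaults, while the lower bounds on the associations have to be checked separately, because Python lists cannot express ‘at least one’ on their own.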

USE OF DATA MODELS IN BUSINESS ANALYSIS

Because this book is aimed at business analysts the focus will be on Computer-Independent (or conceptual) Models – the models where the ‘what’, the information or data that is required, is specified. There will, however, be some consideration of the other, ‘how’, models.

The shape and form of the ‘how’ models will be determined by the database management system that is to be used to manage the database. There are many types of database management systems available and that means that the business analyst should not assume any particular type of database when developing the ‘what’ model as part of requirements engineering. The job of the business analyst when modelling is to concentrate on developing a model that represents the information that the business needs to be recorded; the business analyst should not be including anything in that model that depends on a specific implementation.

The example models at Figures 1.8 and 1.9 might give the impression that information or data modelling is a simple task leading to relatively simple databases. But these are just simple examples. In practice, information or data models can be very large – I have seen some models that completely cover the walls of a six-person office. Developing an information or data model is not a trivial task. Having said that, information or data modelling should be part of the ‘consultancy toolkit’ of all business analysts.

Developing an information or data model does not mean that we are just documenting in ‘boxes and lines’ what we find in the business. We need to apply the business analyst’s enquiring mind to analyse what we find. As with all requirements elicitation, as we develop the model we are going to find that there are unanswered questions that need to be answered. Through asking questions of the business about the model we can gradually refine the model so that the end result is useful to the business, and allows the needs of the business and its systems to be met. Developing an information or data model is an iterative process that helps us to understand the information that the business needs to record. This process also helps us uncover the business rules and, more importantly, any exceptions to the rules that need to be handled.

WHAT MAKES A GOOD DATA MODEL?

The simple answer to this is that a good model should express the totality of the information requirements of the business clearly, concisely and unambiguously. As with any set of requirements, all of the requirements should be included, there should be no overlapping or conflicting requirements, and no requirements should be hidden within other requirements.

When modelling the processes of the business, the business analyst will think in terms of two sets of models: the ‘as-is’ models and the ‘to-be’ models. The ‘as-is’ models are the result of pure analysis: the documentation of the current situation using boxes and lines. When developing the ‘to-be’ models the business analyst is straying from pure analysis into the field of synthesis or design, albeit using the same set of box and line constructs as for the ‘as-is’ models. The ‘as-is’ models are developed to help us come up with the ‘to-be’ models. The final form of the ‘to-be’ models (the designs) will depend very much on the experience and creativity of the business analyst who is doing the modelling.

When modelling information or data the same is true, although we do not normally produce an ‘as-is’ information or data model. The only time we would produce an ‘as-is’ model is when we are reverse-engineering an existing database or carrying out relational data analysis on existing documents, reports or input screens. Even so, the purpose of the reverse-engineering or relational data analysis is to influence a model of the information requirements to be met by a future information system (or set of information systems). Such a model is a ‘to-be’ model.

All information models that are the output of the requirements engineering process can, therefore, be considered as the start of the design process for the future information system or systems. The experience and creativity of the modeller will impact not only on the model itself but also on the design of any future database developed from the model.

We will look at the subject of data model quality in more detail in Chapter 9.

INTRODUCING DATA ANALYSIS

‘Data analysis’ is a difficult term because it has two meanings.

One meaning of the term is the analysis of an existing collection of data to find patterns, trends or hidden information. The result is some insight that will be useful to the business.


The other meaning is the analysis of a business domain to understand the information or data that needs to be recorded in an information system to meet business needs. The result is a data model that may lead to a database design.

While some business analysts may be involved in analysing data under the first of those meanings, especially when that data could be used in strategic decision-making or as input into a business case, it is the second meaning that is of interest to us in this book.

Business analysis helps us understand business requirements. Some of these requirements are information (or data) requirements. Data analysis helps us understand those information requirements, probably the most important of the overall business requirements for an information system. Data analysis is not separate from business analysis; it is an essential part of business analysis.

Like many other analysis or modelling activities, data analysis and modelling can be approached from a top-down perspective or from a bottom-up perspective.

The top-down approach involves starting with a blank sheet of paper (or, preferably, a clean whiteboard) and using an appropriate set of requirements elicitation techniques (interviews, workshops, observation, among others) to find out what information the business needs to achieve its goals. This then forms the basis of the information model, whether it is an Ellis-Barker entity relationship model or a UML class model. We will look at the development of Ellis-Barker entity relationship models and UML class models in Chapters 2 to 5.

The bottom-up approach involves looking at existing ‘data sources’ – which could be existing databases, screens and reports for existing information systems or, in a paper-based system, the documents and records that are maintained – to build a model that represents the known information requirements.

Business analysts should use both approaches and then compare the results. They will seldom match. Some detail, absolutely vital to the business, may have been missed when looking at things from the top down. Some new requirements not handled by the current system will probably have been missed when looking at things from the bottom up. Almost certainly, the analyst will need to develop a model that is a composite of the top-down model and the bottom-up model.
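Comparing the two models amounts to finding what each contains that the other lacks. The sketch below shows the idea with Python sets. The model contents are invented for illustration: the INVOICE entity type and the ‘credit limit’ attribute are hypothetical, and `only_in` is a hypothetical helper name.

```python
# Each model is reduced to: entity type name -> set of attribute names.
top_down = {
    "CUSTOMER": {"number", "name", "address"},
    "ORDER": {"number", "date"},
    "PRODUCT": {"code", "designation", "retail cost"},
}
bottom_up = {
    "CUSTOMER": {"number", "name", "address", "credit limit"},
    "ORDER": {"number", "date"},
    "INVOICE": {"number", "amount"},
}


def only_in(model_a: dict, model_b: dict) -> dict:
    """Return the entity types and attributes found in model_a but not model_b."""
    differences = {}
    for entity, attributes in model_a.items():
        missing = attributes - model_b.get(entity, set())
        if missing:
            differences[entity] = missing
    return differences


# Detail missed when working top down, and requirements missed bottom up.
missed_top_down = only_in(bottom_up, top_down)
missed_bottom_up = only_in(top_down, bottom_up)
print(missed_top_down)
print(missed_bottom_up)
```

In practice the comparison is rarely this mechanical, because the two models often use different names for the same thing, so each difference is a question to take back to the business rather than an automatic addition to the composite model.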

When taking a bottom-up approach, we can use a formal technique called relational data analysis (which is also called normalisation) or we can just informally use our intuition. Most data modellers will use the formal technique when they start data modelling and then move to do things informally when they are more experienced. We will look at relational data analysis in Chapter 6.
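As a taste of what the bottom-up approach involves, the sketch below takes rows from a hypothetical flat order report and separates the repeated facts into customers, orders and order lines. It illustrates the general idea of removing repetition only; it is not the formal step-by-step technique, and the report layout, field names and `split_rows` helper are all assumptions made for the example.

```python
# A flat 'report': customer details repeat on every line of every order.
flat_rows = [
    {"order_number": 1, "customer_number": 7, "customer_name": "Sue",
     "product_code": "P1", "quantity": 2},
    {"order_number": 1, "customer_number": 7, "customer_name": "Sue",
     "product_code": "P2", "quantity": 1},
    {"order_number": 2, "customer_number": 8, "customer_name": "John",
     "product_code": "P1", "quantity": 5},
]


def split_rows(rows):
    """Separate the flat rows so that each fact is recorded only once."""
    customers, orders, order_lines = {}, {}, []
    for row in rows:
        # Facts about the customer depend only on the customer number.
        customers[row["customer_number"]] = {"name": row["customer_name"]}
        # Facts about the order depend only on the order number.
        orders[row["order_number"]] = {"customer_number": row["customer_number"]}
        # Quantity depends on the order and the product together.
        order_lines.append({
            "order_number": row["order_number"],
            "product_code": row["product_code"],
            "quantity": row["quantity"],
        })
    return customers, orders, order_lines


customers, orders, order_lines = split_rows(flat_rows)
```

The three resulting collections correspond to candidate entity types, which can then be compared with the model produced from the top down.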


INDEX

Page numbers in italics refer to figures or diagrams

1NF (first normal form) xix, 84, 86–9

2NF (second normal form) xxi, 84, 89–90

3NF (third normal form) xxii, 84, 90–4

aggregation xvii, 46–8, 136

alternate keys xvii, 94

artefacts xvii

‘as-is’ models 14

association classes xvii, 35–6

associations

introduction to xvii, 19

identifying 53

multiple 27–8

mutually exclusive 39–41

naming 22–3, 112–13

notation 22–3

reflexive 29

associative entity types 34, 38

attribute occurrences 66–7

attribute types 66–7

attributes

introduction to xvii, 66–9

naming 69, 110–11

UML extended attribute notation 75–7

when to model concepts as 69–71

Bachman Diagrams 11

Barker, Richard 11, 100

bibliography 159–61

big data xvii, 129–30

bill of materials structure 36–9

binary large object data type 145

Booch, Grady 12

‘boxes within boxes’ depictions 41, 43

Boyce-Codd normal form 84

business activity models 8

business analysis

and data analysis 14–15

definitions 3–5

information needs 6–7

role of models 7–10

use of models 13

business intelligence

data warehouses 139–40

dimensional modelling 141–3

multidimensional data models 140–1

business process models 8

business rules enforcement 117

business systems 4–5

candidate keys xvii, 95

cardinality

overview xvii, 24–7

in model drawing 54–6

problems with 32–9

CASE

see computer-aided software engineering (CASE)

character large object data type 145

Checkland, Peter 7–8

Chen notation xvii, 104, 106

class models xviii, 12–13, 22–3

class terms 111

Codd, Edgar F. 81, 84

columns xviii, 82, 154–5

communication 9–10, 117–18

completeness 116, 120–1

composite identifiers xviii, 73–4

composite keys xviii

composition xviii, 46–8

computer-aided software engineering (CASE) xvii

Computer-Independent Models (CIM) 10–11, 13, 19

conceptual data models xviii, 154–5

constraints 40–1, 77

corporate models xviii, 123–6

correctness 120–1

CRUD matrix validation 59–62

data

big data 129–30

and data models 10–11

definition xviii

and information 5–6

master data 130

metadata 130

semi-structured data 129

structured data 127–9

types xviii, 145–51

unstructured data 129

data analysis xviii, 14–15

data clustering 155–6

data landscape 127–30

data mining xviii

data models

and data 10–11

definitions xviii

requirements of 13–14

third normal form (3NF or TNF) 94

what is a good model? 13–14


data navigation paths 57–9

data reusability 117

data types

binary large object type 145

character large object type 145

datalink type 150

definition xviii

distinct type 145–6

multiset type 148

row type 149

structured type 146–8

variable array type 149

XML type 150–1

future developments 150–1

data warehouses xviii, 139–40

database design

first cut stage 154–5

optimised design stage 155–6

role of models 10–11

database management system (DBMS) xix

databases

chronology 131

definitions xix

hierarchical 130–3

network 133, 134–5

NoSQL 137–8

object-oriented 136–7

relational (SQL) 133–6

datalink data type 150

DBMS (database management system) xix

denormalisation 156

derived attributes 77

described domains xix, 75

diagrams

drawing 54–6

validating 56–62

dimensional modelling 141–3

distinct data type 145–6

document stores 138

domains xix, 74–5, 110

drawing models

see model drawing

elegance 117

Ellis, Harry 11, 99

Ellis-Barker notation xix, 11–12, 99–100

employee supervision 38–9

encapsulation 136

enquiry access path

see data navigation paths

enterprise awareness dimension 120–1

entities xix, 16–18

entity occurrences xix, 17–18

entity relationship modelling xix, 11–12, 20–2

entity subtypes xix, 41–3, 109

entity types
    overview xix, 17–18, 65–6
    or attributes? 69–72
    identifying 51–2
    naming 18–19, 108–10
    subtypes xix, 41–3, 109
    supertypes 41–3

enumerated data types 77

enumerated domains xix, 75

equivalences, table of 158

European Process Industry STEP Technical Liaison Executive (EPISTLE) 118–20

exclusive arcs 40

exclusive-or constraint 40–1

exercises
    end of chapter 30–1, 48–9, 63–4, 79–80, 96
    solutions to 162–71

extensible record stores 138

final models 153

first cut database design stage 154–5

first normal form (1NF or FNF) xix, 84, 86–9

flexibility 117

flexible design 119–20

foreign keys xix, 82, 89

full attribute names 111

galaxy schema 143

generalisation 43–5

generic data models 114–15, 119–20

genericity 114–16

graph databases 138

Gregory, William 120–1

hierarchic identifiers xx, 74

hierarchic keys xx

hierarchical databases 130–3

IDEF1X notation xx, 100–4

indexes 156

information xx, 4–6

Information Engineering notation xx, 104, 105

information needs 6–7, 9–10, 13–14

information systems 4–5

information technology systems 4

inheritance 136

input parameters 78

integration 118

IT systems 4

Jacobson, Ivar 12

key-value stores 138

layout of models, quality of models 121–2

link entities 34

Logical Data Model (LDM) 11, 94

lower bounds 23

mandatory relationships/associations
    see optionality

many-to-many associations/relationships 27, 32–7
    see also cardinality

many-to-one relationships
    see cardinality

master data 130

master data management xx

matrix organisation 38–9

metadata xx, 130

model drawing
    process of 50–1
    identifying entity types or object classes 51–2
    identifying relationships or associations 53
    drawing the initial diagram 54–6
    validating the diagram 56–62

models
    in business analysis 7, 13
    communication through 9–10

modifier terms 111

multidimensional data models 140–1

multimedia data xx

multiple associations/relationships 27–8

multiplicity xx, 23, 68–9

multiset data type 148

mutually exclusive associations/relationships 39–41

naming
    associations 22–3, 112–13
    associative entity types 38
    attributes 110–11
    conventions 108–13
    domains 110
    entity types 108–10
    object classes 38, 108–10
    of object classes and entity types 18–19
    relationships 20–2, 112
    tables and columns 154–5

network databases 133, 134–5

non-redundancy 116

normal form xx

normalisation
    what is it? xx, 81–2
    relational model of data 82–4
    rules of 84
    start of process 85–6
    first normal form (1NF or FNF) 86–9
    second normal form (2NF or SNF) 89–90
    third normal form (3NF or TNF) 90–4
    third normal form data model 94
    relationship to modelling 95
    alternate, candidate and primary keys 94

normalisation forms 86, 87–8, 91, 93

NoSQL databases 137–8

notations
    Chen xvii, 104, 106
    comparison between 107
    Ellis-Barker xix, 11–12, 99–100
    IDEF1X xx, 100–4
    Information Engineering xx, 104, 105
    need for clarity 10
    relationships 20–2
    UML class model notation 12–13, 22–3, 75–9, 101

object classes
    overview xx, 17–18, 65–6
    or attributes? 69–72
    identifying 51–2
    naming 18–19, 38, 108–10

Object Data Management Group (ODMG) xxi, 137

Object Definition Language (ODL) 137

Object Query Language (OQL) 137

object subclasses xxi, 43–5, 109

objectives, conflicting 118

object-orientation xx

object-oriented databases 136–7

objects
    behaviour of 77–8
    definition xx
    and entities 16–18
    in UML 77

one-to-many associations/relationships
    see cardinality

one-to-one associations/relationships
    see cardinality

online analytical processing xxi, 141

online transactional processing (OLTP) xxi

operations xxi, 77–9

optimised database design stage 155–6

optionality xxi, 24–7, 54–6

optional–mandatory relationships 26

optional–optional relationships 27

partial high-level process maps 60

permitted values xxi, 75

Physical Data Model (PDM) 11

‘pig’s ear’ associations/relationships 29

Platform-Independent Model (PIM)
    see Logical Data Model (LDM)

Platform-Specific Model (PSM) 11

polymorphism 136

primary keys xxi, 82, 94

prime terms 111

primitive data types 77

product dimensions 140–1

products within products 36–7

quality of models
    characteristics of good data models 116–18
    five dimensions of data model quality 120–1
    genericity or specificity 114–16
    layout of models 121–2
    principles of high quality data models 118–20

query access path
    see data navigation paths

recursive relationships 29, 109

reflexive associations 29, 109

Reingruber, Michael 120–1

relational (SQL) databases 133–6

relational data analysis xxi, 14, 15, 84

relational data analysis process 56

relational database management system (RDBMS) xxi

relational model of data xxi, 82–4

relations xxi, 81, 82–4

relationships
    overview xxi, 65–6
    identifying 53
    introduction to 19
    multiple 27–8
    mutually exclusive 39–41
    naming 20–2, 112
    notation 20–2
    recursive 29

repeating groups 86

representation terms 111

requirements engineering 9–10

reverse-engineering 14

rich picture 7

role names 23

row data type 149

Rumbaugh, James 12

schema xxi, 141–3

second normal form (2NF or SNF) xxi, 84, 89–90

semantic dimension 120–1

semi-structured data 129

simple identifiers xxii, 73

simple keys xxii

Simsion, Graeme 116–18

single valued attributes 68

SNF (second normal form) xxi, 84, 89–90

snowflake schema 143

Soft Systems Methodology 7–8

solutions to exercises 162–71

specialisation
    in class models 43–5
    in entity relationship models 41–3

specificity, quality of models 114–16

SQL
    basics xxii, 144–5
    implications 151–3
    new data types 145–51

SQL (relational) databases 133–6

stability 117

standards
    overview 6–7
    ISO/IEC 2382-1 1993 (Information Technology – Vocabulary – Part 1: Fundamental Terms) 6–7
    ISO/IEC 9075 (SQL) 144, 151
    Unified Modeling Language (UML) 12–13

star schema 141–2

storage 137

store dimensions 140–1

structured data xxii, 77, 127–9, 146–8

subclasses xxi, 43–5, 109

subtypes 41–3, 109

superclasses 43–5

supertypes 41–3

surrogate identifiers xxii, 74

syntactic dimension 120–1

systems 3–5

tables xxii, 82–4, 154–5

third normal form (3NF or TNF) xxii, 84, 90–4

third normal form data model 94, 152

time dimensions 140–1

‘to-be’ models 14

top-down approaches 15

transitive dependence 92

tuples xxii, 82–4

UML class model notation 12–13, 22–3, 75–9, 101

Unified Modeling Language (UML) xxii, 12–13

unique identifiers xxii, 72–4

un-normalised form (UNF) 86

unstructured data xxii, 129

upper bounds 23

use case diagrams 9

validation xxii, 56–62

value 129

values (attribute occurrences) 66–7

variability 129

variable array data type 149

variety 129

velocity 129

veracity 129

volume 129

West, Matthew 118–20

whole–part relationships 46

wide-column stores 138

Witt, Graham 116–18

workshop models 151

XML data type 150–1

{xor} constraints 40–1

MODELLING BUSINESS INFORMATION
Entity relationship and class modelling for business analysts

Keith Gordon

It is almost universally accepted that requirements documents for new or enhanced IT systems by business analysts should include a ‘data model’ to represent the information that has to be handled by the system.

Starting from first principles, this book will help business analysts to develop the skills required to construct data models through comprehensive explanations of entity relationship and class modelling, in line with the BCS Data Analysis syllabus. In addition to covering the topics in the syllabus, the book also includes significant extra information of interest including an overview of other modelling notations, information model quality, and taking a requirement model into database design.

• Explains why business analysts should model information
• Covers both entity relationship and class modelling in tandem from the basics
• Aligned with the BCS Data Analysis syllabus
• Goes beyond the syllabus to include several wider topics of interest

ABOUT THE AUTHOR
Keith Gordon is an independent consultant and lecturer specialising in data management and business analysis. He has spent over 50 years in technical, education and training environments as an engineer, computer consultant, data manager, business analyst and education and training manager.

A thoughtful, well-done text on how to do high-quality business analytical data modelling.
David Hay, Essential Strategies International, CEO

A terrific contribution to the field.
Alec Sharp, Senior Consultant, Clariteq

Provides an excellent grounding in the full range of topics related to information modelling.
Matthew West, Information Junction, Director

Computing; IT

ISBN 978-1-78017-353-5

Cover photo: iStock © nuwatphoto

Ebooks available

Paperback available
