© Allstate Insurance Company Proprietary and Confidential
Be the Master of Your Domain
Doug Stacey Information Architect
Allstate Insurance Co.
A Domain based approach to
Enterprise Codes Management
Proprietary and Confidential October 24, 2012 2
Agenda
• Allstate Insurance at a Glance
• Introduction
• Codes Management Topologies
• Domain Meta-Model
• Codified Metadata Usage
• Deployment Models
• Opportunities
• Appendix
Allstate Insurance at a Glance
• The Allstate Corporation is the nation’s largest publicly held personal lines insurer
serving approximately 16 million households.
• Allstate is reinventing protection and retirement to help customers insure what
they have today and better prepare for tomorrow through its Allstate, Encompass,
Esurance and Answer Financial brand names.
• Consumers access insurance products (auto, home, life and retirement) and
services through Allstate agencies, independent agencies, and Allstate exclusive
financial representatives in the U.S. and Canada, as well as via www.allstate.com
and 1-800 Allstate®.
• Allstate is widely known through the “You’re In Good Hands With Allstate®”
slogan.
• The Allstate Foundation, Allstate employees, and agency owners are committed to
local communities, and the corporation provided $28 million in 2011 to thousands
of nonprofit organizations and important causes across the United States.
Team Structure
Information Analysts (Data Analysts)
Codes Analysts
Repository Administrators
Development Team
Team Awards & Recognitions
Wilshire Award of high recognition for metadata best practices - 2002.
Wilshire Award for best metadata implementation for warehouses - 2003.
Introduction
What is Enterprise Codes Management?
• It is the processes and tools used to manage and maintain codified metadata and make this information available to the different applications in the enterprise.
Benefits of an Enterprise Codes Management System:
• Consistent metadata across disparate systems
• Ability to research codified metadata across the enterprise
• Ability to provide codes that support applications' runtime requirements
• Ability to generate relationships and data translations across multiple applications
Codes Management Topologies
Application Centric
• Applications manage the
codes and assign values to
the codes.
• Applications (can) assign
multiple values to the same
code.
Enterprise Centric
• Enterprise manages the
discrete values and assigns
codes per usage scenario.
• Codes explain how the value
is used within the context of a
scenario or an application.
Application Centric Topology
• Applications manage the codes and assign values to the codes.
• Features:
• Easier for a single application to manage data
• Supports code-to-code hierarchies and code-to-code relationships
• Supports multiple values or locales per code
• Opportunities:
• Support identifying shared values
• Automatically support code-to-code translations or conversions
Enterprise Centric Topology
• Enterprise manages the discrete values and assigns codes per usage
scenario.
• Features:
• Supports identifying a master set of values and then assigning codes per
usage scenario
• Easily supports code-to-code translations or conversions based on shared
business values
• Supports value-to-value hierarchies and value-to-value relationships
• Supports multiple locales per value
• Opportunities:
• Easily assign the same value to different codes in one usage scenario
Domain Meta-Model
Generic Domain
• Simple: Enumerated, Non-Enumerated, Range
• Derived or Complex: Valid Combination, Calculated, Sub-Set
Domain: “A representation of the set of possible values that may be taken by a particular attribute”
Domain Meta-Model Enumerated Domains
State: Illinois, Ohio, Michigan, Texas, California, Florida
Gender Type: Male, Female, Other, Unknown
• Enumerated domains have a finite number of elements.
• It is very common to have a static list of elements.
• Data is verified by looking for an exact match with any of the elements in the domain.
• Example : State, Gender, Policy Lines, Products
Sample Domain Data: Gender Domain
Enterprise Centric (one row per business value; the code columns on the slide are ACORD, Warehouse, Investments and Claims)
Value     Codes
Male      1, M, 002
Female    2, F, 001
Unknown   3, U
Other     O, 003
Application Centric (one code table per application; the tables on the slide are labelled ACORD, Warehouse, Investments and Claims)
Code  Value
1     Male
2     Female
3     Unknown

Code  Value
M     Male
F     Female
U     Unknown
O     Other

Code  Value
002   MALE
001   FEMALE
003   NOT IDENTIFIED

Code  Value
M     Male
F     Female
U     Unknown
O     Others
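The difference between the two topologies can be sketched in a few lines of Python. This is only an illustration: the scenario names (`scenario_a`, `scenario_b`, `scenario_c`) are placeholders, because the extracted slide does not make clear which code set belongs to which system. The master structure is enterprise centric; each application-centric code table can then be derived from it.

```python
# Enterprise-centric sketch: one master list of business values, with a code
# assigned per usage scenario. Scenario names are placeholders (assumption).
GENDER_DOMAIN = {
    "Male":    {"scenario_a": "1", "scenario_b": "M", "scenario_c": "002"},
    "Female":  {"scenario_a": "2", "scenario_b": "F", "scenario_c": "001"},
    "Unknown": {"scenario_a": "3", "scenario_b": "U"},
    "Other":   {"scenario_b": "O", "scenario_c": "003"},
}


def code_set(domain, scenario):
    """Derive the application-centric view (code -> value) for one scenario."""
    return {codes[scenario]: value
            for value, codes in domain.items()
            if scenario in codes}


def shared_values(domain, scenario_1, scenario_2):
    """List the business values that both scenarios are able to represent."""
    return [value for value, codes in domain.items()
            if scenario_1 in codes and scenario_2 in codes]
```

Because every code hangs off a shared business value, questions such as "which values can both systems express?" reduce to a simple lookup.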
Sample Statistics
[Chart: Code Sets Per Domain Distribution. X axis: Code Sets Count (1, 2, 3-10, 11-20, >20). Y axis: Count of Domains (0 to 2000).]
[Chart: Business Values Distribution. X axis: Business Values Count (1-10, 11-100, 100-1000, >1000). Y axis: Count of Domains (0 to 2000).]
[Chart: Enterprise Domains Reuse per Usage Scenario. X axis: Usage Scenario (or Systems) Count (1, 2-3, 4-5, 6-10, 11-25, >25). Y axis: Count of Domains (0 to 4000).]
• These stats show a distribution of code sets and values in a sample implementation consisting of approximately 3500 enumerated domains.
• These stats are for informational purposes and do not reflect actual data.
• Different implementations have different stats.
Codified Metadata Usage
• Scenario 1 : Allowed Values Lists
• Scenario 2 : Reporting & Display
• Scenario 3 : Validation
• Scenario 4 : Possible Combination
• Scenario 5 : Rule Based Calculations
• Scenario 6 : Translation and Conversion
Codified Metadata Usage
Usage Scenario 1: Allowed Values List
Codes tables are used to present a list of all allowed values from which the user
makes a selection (e.g. drop-downs, lists, checkboxes).
Codes Table
Domain    Code  Value
Domain 1  001   Type 1
Domain 1  002   Type 2
Domain 1  003   Type 3
Domain 2  IL    Illinois
Domain 2  MI    Michigan
Domain 2  TX    Texas
Domain 2  NY    New York
Domain 2  FL    Florida
Domain 2  CA    California
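A minimal sketch of this scenario in Python, assuming the codes table is held as a list of (domain, code, value) rows. The function name `allowed_values` is hypothetical; a real application would read the rows from the codes repository.

```python
# Rows of the codes table as (domain, code, value), taken from the slide.
CODES_TABLE = [
    ("Domain 1", "001", "Type 1"),
    ("Domain 1", "002", "Type 2"),
    ("Domain 1", "003", "Type 3"),
    ("Domain 2", "IL", "Illinois"),
    ("Domain 2", "MI", "Michigan"),
    ("Domain 2", "TX", "Texas"),
    ("Domain 2", "NY", "New York"),
    ("Domain 2", "FL", "Florida"),
    ("Domain 2", "CA", "California"),
]


def allowed_values(codes_table, domain):
    """Return the (code, value) pairs used to populate a selection list."""
    return [(code, value) for d, code, value in codes_table if d == domain]
```

A drop-down for "Domain 1" would then show the values while submitting the codes.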
Codified Metadata Usage
Usage Scenario 2: Reporting & Display
Codes tables are used to show the value of a code in a readable format (e.g. code
ZX243 could be displayed in a report or on the screen as “Item Type 243”).
Codes Table
Domain       Code  Value
State Code   001   Illinois
State Code   002   New York
State Code   003   …
Gender Type  M     Male
Gender Type  F     Female
Gender Type  U     Unknown
Gender Type  0     Other

Stored data
State Code  Gender Code  Name
001         M            Person 1
001         M            Person 2
002         M            Person 3
002         F            Person 4

Convert Using State Code and Convert Using Gender Type

Displayed data
State     Gender  Name
Illinois  Male    Person 1
Illinois  Male    Person 2
New York  Male    Person 3
New York  Female  Person 4
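The conversion on this slide amounts to a keyed lookup. The sketch below assumes the codes table is loaded into a dictionary keyed by (domain, code); the helper names are hypothetical, and falling back to the raw code when no value is registered is a choice made for this example only.

```python
# Subset of the codes table from the slide, keyed by (domain, code).
CODES = {
    ("State Code", "001"): "Illinois",
    ("State Code", "002"): "New York",
    ("Gender Type", "M"): "Male",
    ("Gender Type", "F"): "Female",
    ("Gender Type", "U"): "Unknown",
}


def decode(domain, code):
    """Return the readable value, or the raw code if none is registered."""
    return CODES.get((domain, code), code)


def to_display(row):
    """Convert one stored row into its report/display form."""
    return {
        "Name": row["Name"],
        "State": decode("State Code", row["State Code"]),
        "Gender": decode("Gender Type", row["Gender Code"]),
    }


rows = [
    {"Name": "Person 1", "State Code": "001", "Gender Code": "M"},
    {"Name": "Person 2", "State Code": "001", "Gender Code": "M"},
    {"Name": "Person 3", "State Code": "002", "Gender Code": "M"},
    {"Name": "Person 4", "State Code": "002", "Gender Code": "F"},
]
report = [to_display(row) for row in rows]
```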
Codified Metadata Usage
Usage Scenario 3: Validation
Codes tables are used to validate whether a code exists.
This operation is very useful for cleansing a feed before saving it to the warehouse. It is also useful in ETL processing to accept or reject messages.
Data to validate
State Code  Gender Code  Name
001         M            Person 1
001         M            Person 2
002         M            Person 3
002         F            Person 4

Validate Using State Code and Validate Using Gender Type

Codes Table
Domain       Code  Value
State Code   001   Illinois
State Code   002   New York
State Code   003   …
Gender Type  M     Male
Gender Type  F     Female
Gender Type  U     Unknown
Gender Type  0     Other
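As a rough sketch of feed cleansing, assume each domain's valid codes are loaded into a set and each feed column is mapped to the domain that governs it. Only a subset of the slide's codes is included, and the rejected row ("Person 5") is invented purely to show the reject path.

```python
# Valid codes per domain (subset of the slide's codes table).
VALID_CODES = {
    "State Code": {"001", "002"},
    "Gender Type": {"M", "F", "U"},
}
# Which domain validates which feed column (assumed mapping).
COLUMN_DOMAINS = {"State Code": "State Code", "Gender Code": "Gender Type"}


def invalid_codes(row):
    """Return the (column, code) pairs that have no exact match in their domain."""
    return [(column, row[column])
            for column, domain in COLUMN_DOMAINS.items()
            if row[column] not in VALID_CODES[domain]]


def cleanse(feed):
    """Split a feed into accepted and rejected rows."""
    accepted, rejected = [], []
    for row in feed:
        (rejected if invalid_codes(row) else accepted).append(row)
    return accepted, rejected


feed = [
    {"Name": "Person 1", "State Code": "001", "Gender Code": "M"},
    {"Name": "Person 4", "State Code": "002", "Gender Code": "F"},
    {"Name": "Person 5", "State Code": "999", "Gender Code": "M"},  # hypothetical bad row
]
accepted, rejected = cleanse(feed)
```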
Codified Metadata Usage
Usage Scenario 4: Possible Combination
Codes tables are used to present the valid combinations of codes from several
domains.
Codes Table
Domain        Code  Value
Product Code  0001  Prod 1
Product Code  0002  Prod 2
Product Code  0003  Prod 3
Product Code  0004  Prod 4
Color         1     White
Color         2     Black
Color         3     Other
State Code    001   Illinois
State Code    002   New York
State Code    003   …

Database Table or Data Feed
State  Product  Color
001    0001     1
001    0001     2
002    0001     1
002    0002     1
002    0002     2

Validate Row

Valid Combinations Table
State  Product  Color
001    0001     1
001    0001     2
002    0001     1
002    0002     1
002    0002     2
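Row validation against a valid combinations table can be sketched as set membership on the whole tuple. The rows below are a best reading of the flattened slide table, so treat the specific combinations as illustrative.

```python
# (state code, product code, color code) combinations considered valid.
# The rows are a best reading of the slide and should be treated as examples.
VALID_COMBINATIONS = {
    ("001", "0001", "1"),
    ("001", "0001", "2"),
    ("002", "0001", "1"),
    ("002", "0002", "1"),
    ("002", "0002", "2"),
}


def is_valid_combination(state, product, color):
    """A row is valid only if the complete combination is listed."""
    return (state, product, color) in VALID_COMBINATIONS
```

Note that each code can be valid on its own while the combination is not: state 001 and product 0002 both exist, yet no row pairs them.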
Codified Metadata Usage
Usage Scenario 5: Rule Based Calculations
Codes tables are used to generate a value based on multiple input values
(e.g. the state code along with the product code determines the discount).
Codes Table (Complex Code Table)
Example: Parameter 1 (State Code) = 001 and Parameter 2 (Product Code) = 0002 give Output Value (Discount) = 0.25

State  Product  Discount
001    0001     .15
001    0002     .25
001    0003     .35
002    0001     .35
002    0002     .25
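A rule-based calculation of this kind can be sketched as a lookup keyed by all input parameters. The table contents follow the slide's example; the function name and the choice to return `None` when no rule matches are assumptions of this sketch.

```python
from decimal import Decimal

# (state code, product code) -> discount, following the slide's complex code table.
DISCOUNTS = {
    ("001", "0001"): Decimal("0.15"),
    ("001", "0002"): Decimal("0.25"),
    ("001", "0003"): Decimal("0.35"),
    ("002", "0001"): Decimal("0.35"),
    ("002", "0002"): Decimal("0.25"),
}


def discount_for(state_code, product_code, default=None):
    """Return the discount for the two input parameters, or `default` if no rule exists."""
    return DISCOUNTS.get((state_code, product_code), default)
```

For the slide's example, state 001 with product 0002 yields a discount of 0.25.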
Codified Metadata Usage
Usage Scenario 6: Translations and Conversions
Codes tables are used to translate a code from an input feed to another code in an output stream.
This functionality is critical to the success of any messaging system or ETL operation between multiple systems.
Codes Table (Code Translation Table)
Domain       Source Code  Value     Target Code
State Code   001          Illinois  IL
State Code   002          New York  NY
State Code   003          …         MI
Gender Type  M            Male      m
Gender Type  F            Female    f
Gender Type  U            Unknown   u
Gender Type  0            Other     u

Example: Source System State Code 001 translates to Target System State Code IL
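One way to obtain such a translation table, assuming both systems' code sets are tied to shared business values as in the enterprise-centric topology, is to join the source codes to the target codes through the value. The function name `build_translation` is hypothetical.

```python
# Source system: code -> business value. Target system: business value -> code.
SOURCE_STATE_CODES = {"001": "Illinois", "002": "New York"}
TARGET_STATE_CODES = {"Illinois": "IL", "New York": "NY"}


def build_translation(source_codes, target_codes):
    """Map each source code to the target code that shares its business value."""
    return {source: target_codes[value]
            for source, value in source_codes.items()
            if value in target_codes}


STATE_TRANSLATION = build_translation(SOURCE_STATE_CODES, TARGET_STATE_CODES)


def translate(source_code):
    """Translate one code; raises KeyError if the code has no counterpart."""
    return STATE_TRANSLATION[source_code]
```

Source codes whose value has no counterpart in the target system are simply left out of the table, so a caller can detect and route them explicitly.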
Deployment Models Push Model : Individual Table Deployment
• More suitable for database applications
• Practical only for a small number of tables
• Uses one table for every domain or codes set
• Simple to build and develop
• Small number of rows in each table
• Applications can do a simple join to the table to consume the data
• Metadata tables in multiple locations
• Very difficult to manage
[Diagram: Codes Management Staff, Metadata Codes Repository, Application A (Dom A, Dom B, Dom C) with Application A Users, Application B (Dom B, Dom D) with Application B Users]
Deployment Models Pull Model : Single Enterprise Table Model
• Suitable for database side applications
• Uses one table for all codes used by the application (or for all usage scenarios and applications)
• The table's primary key includes the domain and the usage scenario
• The number of rows in the table may grow to a significant number
• Moderately easy to join to the table
• Relatively easy to manage
[Diagram: Codes Management Staff, Metadata Codes Repository, Enterprise Codes Database (Enterprise Codes Table), App A with Application A Users, App B with Application B Users]
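A single enterprise table might look like the following sketch, which uses an in-memory SQLite database. The table and column names are assumptions; only the idea of keying on domain and usage scenario, and joining with those as filters, comes from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE enterprise_codes (
    domain         TEXT NOT NULL,
    usage_scenario TEXT NOT NULL,
    code           TEXT NOT NULL,
    value          TEXT NOT NULL,
    PRIMARY KEY (domain, usage_scenario, code)
);
CREATE TABLE person (name TEXT, state_code TEXT);
""")
conn.executemany(
    "INSERT INTO enterprise_codes VALUES (?, ?, ?, ?)",
    [
        ("State Code", "App A", "001", "Illinois"),
        ("State Code", "App A", "002", "New York"),
        ("State Code", "App B", "IL", "Illinois"),
    ],
)
conn.executemany(
    "INSERT INTO person VALUES (?, ?)",
    [("Person 1", "001"), ("Person 3", "002")],
)

# The join must filter on domain and usage scenario as well as on the code.
rows = conn.execute("""
    SELECT p.name, c.value
    FROM person AS p
    JOIN enterprise_codes AS c
      ON c.code = p.state_code
     AND c.domain = 'State Code'
     AND c.usage_scenario = 'App A'
    ORDER BY p.name
""").fetchall()
```

The extra join conditions are what make this "moderately easy" rather than a simple join: every consumer has to name its domain and usage scenario.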
Deployment Models Service-Oriented and/or Web Services
• Strategically positioned SOA architecture
• Suitable for applications capable of consuming services
• Layer on top of the enterprise tables
• Provides additional layers of permissions and makes access easier to track and validate
• Ability to log and meter transactions and access
• Supports caching to improve performance
[Diagram: Codes Management Staff, Metadata Codes Repository, Enterprise Codes Database (Enterprise Codes Table), Codes Services, App A with Application A Users, App B with Application B Users]
Process & Implementation Opportunities
• Convince existing applications to adopt the enterprise centric
model
• Integrate established systems into the enterprise centric model
• Determine how to stay current with codes
• Determine how, and by whom, the content of the release
versions is managed
• Determine when to re-use and when to create new values
Appendix
Domain Meta-Model Simple Domains
• Simple domains are the basic building blocks of the domain meta-model.
• The types that make up the simple domains can be used as independent units or
combined to form a complex domain.
Domain Meta-Model Non-Enumerated Domains
Coverage Description: note 1, note 2, …
First Name: George, Bill, Jimmy, …
• Non-Enumerated domains do not have a finite number of elements.
• Data is verified by looking for features of the content, such as data length,
numeric/alpha/alphanumeric, etc.
• Example : First Name, Coverage Description, Premium Amount, Social Security
Numbers
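Feature-based checks for non-enumerated domains could be sketched as follows. The maximum length, the allowed punctuation and the SSN layout (3-2-4 digits with hyphens) are assumptions chosen for illustration, not rules from the presentation.

```python
import re

# Assumed display format for a Social Security Number: 3-2-4 digits.
SSN_PATTERN = re.compile(r"\d{3}-\d{2}-\d{4}")


def is_valid_first_name(text, max_length=50):
    """Check length and character class; there is no list to match against."""
    return (0 < len(text) <= max_length
            and all(ch.isalpha() or ch in " -'" for ch in text))


def is_valid_ssn_format(text):
    """Check only the shape of the value, not whether the number was issued."""
    return SSN_PATTERN.fullmatch(text) is not None
```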
Domain Meta-Model Range Domains
Percentage Points: .1, .2, …
Effective Date: 1/1/2000, 1/2/2000, 1/3/2000, …
• Range domains are special domains that exhibit features of both enumerated
and non-enumerated domains.
• They might not have a finite number of elements, but their contents follow a pattern
(e.g. any date from 1/1/2008 until now).
• Data is verified by looking for a valid value within a range of values.
• Example : Policy Effective Date, Percentage Points
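A range check can be sketched as a pair of bounds. The 1/1/2008 start date follows the slide's example; the 0 to 1 bounds for percentage points and the function names are assumptions.

```python
from datetime import date


def in_effective_date_range(value, start=date(2008, 1, 1), end=None):
    """Valid if the date falls between `start` and `end` (today by default)."""
    end = end or date.today()
    return start <= value <= end


def is_percentage_point(value, low=0.0, high=1.0):
    """Valid if the value falls inside the assumed inclusive bounds."""
    return low <= value <= high
```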
Domain Meta-Model Complex or Derived Domains
• Complex domains derive their contents based on contents from one or more domains.
• Each item will be composed from many sub items, which themselves might be either
simple or complex in nature.
• Complex domain models define relationships across domains.
Domain Meta-Model Calculated Domains
Region State Multiplier
1 MI 1
1 FL 1
2 IL 2
3 CA 2
• Calculated domains derive their contents from one or more enumerated or derived
domains.
• For all states in region 1, the multiplier is 1; otherwise it is 2.
• Region State Multiplier
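The calculated domain on this slide can be reproduced from the region assignment and the stated rule. This is a sketch; the names `REGION_BY_STATE` and `multiplier` are hypothetical.

```python
# Enumerated input: which region each state belongs to (from the slide).
REGION_BY_STATE = {"MI": 1, "FL": 1, "IL": 2, "CA": 3}


def multiplier(state):
    """Rule from the slide: region 1 gets a multiplier of 1, every other region gets 2."""
    return 1 if REGION_BY_STATE[state] == 1 else 2


# The calculated domain is derived from the rule rather than maintained by hand.
CALCULATED_DOMAIN = [
    (REGION_BY_STATE[state], state, multiplier(state))
    for state in REGION_BY_STATE
]
```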
Domain Meta-Model Valid Combination Domains
State  Year   Policy Line
IL     < 5    1
IL     5 - 7  1
IL     5 - 7  2
IL     > 7    1
IL     > 7    2
IL     > 7    5
MI     < 5    2
MI     5 - 7  2
• Valid combination domains define rows of valid metadata combinations. The contents
are composed from one or more enumerated, range or other derived domains.
• From the example above: in Illinois, if you have had a clean driving record for less than 5
years, then you are eligible for line 1 only. If your record is between 5 and 7 years, then
you are eligible for lines 1 and 2, and if your record is more than 7 years, then you are
eligible for lines 1, 2 or 5.
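A valid combination domain that mixes an enumerated domain (state) with a range domain (years) might be evaluated as below. The sketch assumes whole years, so "less than 5" becomes 0 to 4 and "more than 7" becomes 8 and up; those boundary choices are an interpretation, not something the slide states.

```python
import math

# (state, min years, max years, policy line), both bounds inclusive.
# Whole-year bounds are an assumption about how the slide's ranges are meant.
VALID_COMBINATIONS = [
    ("IL", 0, 4, 1),
    ("IL", 5, 7, 1),
    ("IL", 5, 7, 2),
    ("IL", 8, math.inf, 1),
    ("IL", 8, math.inf, 2),
    ("IL", 8, math.inf, 5),
    ("MI", 0, 4, 2),
    ("MI", 5, 7, 2),
]


def eligible_lines(state, clean_years):
    """Return the policy lines valid for a state and a clean-record length."""
    return sorted(line
                  for s, low, high, line in VALID_COMBINATIONS
                  if s == state and low <= clean_years <= high)
```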
Domain Meta-Model Sub-Set Domains
• A subset domain is a special case of a derived domain where the contents are a
subset of the values from the original domain.
• In the example on the previous slide, if we need only the items for year range (5-7)
then this would be a subset of the original domain.
State Year Policy Line
IL 5 - 7 1
IL 5 – 7 2
MI 5 – 7 2
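Deriving a subset domain is a filter over the original domain's rows. In this sketch the year ranges are kept as plain labels, and the "> 7" label follows the slide's prose description of the Illinois rows; the helper name `subset` is hypothetical.

```python
# Rows of the original valid combination domain: (state, year range, policy line).
ORIGINAL_DOMAIN = [
    ("IL", "< 5", 1),
    ("IL", "5-7", 1),
    ("IL", "5-7", 2),
    ("IL", "> 7", 1),
    ("IL", "> 7", 2),
    ("IL", "> 7", 5),
    ("MI", "< 5", 2),
    ("MI", "5-7", 2),
]


def subset(domain_rows, predicate):
    """A subset domain keeps only the rows of the original that satisfy the predicate."""
    return [row for row in domain_rows if predicate(row)]


YEARS_5_TO_7 = subset(ORIGINAL_DOMAIN, lambda row: row[1] == "5-7")
```

Because the subset is computed from the original, it cannot drift out of sync with it.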
Questions
Doug Stacey