Watch Your TPA: A Practical Introduction to Actuarial Data Quality Management
Aleksey Popelyukhin, Ph.D.
Vice President, Information Systems & Technology
COMMERCIAL RISK
Epigraph
“Dear Cardmember, the 1997 Year End Summary of your account regretfully contained an error: we discovered that one or more of your transactions were “double counted” – please accept our sincerest apologies for the error and for any inconvenience it may have caused you.”
The largest US credit card issuer
Main Logic
Data Issues are everywhere: the need for Quality Testing
Internal Solution: the Data Quality Shield
External Solution: certification of TPAs' systems
Data Integrity tests and Time-Variant Business Rules
Main source of T-VBRs: Actuarial Methods' Assumptions!
Data Problems surround us
Data Elements' (un)availability
  Counts (claim details)
  Large Losses' historical evaluations
  Premiums, Recoveries, etc.
(Ever-changing) Industry Statistics
  NCCI corrects posted WC LDFs every year
TPAs' monthly summaries (Loss Runs)
  Yet to see a single TPA without a problem
Typical Errors
1. Absence of required fields
  Needed for policy conditions checks:
    Location (when the deductible differs by Location)
    Report Date (when coverage is “claims-made”)
  Needed for actuarial analysis:
    Closed Date (for Berquist-Sherman adjustments)
  Needed for unique record identification:
    Coverage Type (if the same accident is covered by both WC and Employer's Liability)
More Errors
2. Duplicates (“double counting”)
  Examples:
    True duplicates (same Claim ID)
    Duplicate files (same accident, multiple Claim IDs)
    Missing key fields (Claim Suffix or Coverage Type)
  Detection: SQL aggregation query with a HAVING clause (sketched below)
  Correction: flagging the duplicate file as VOIDED
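A minimal sketch of both duplicate checks, assuming a hypothetical loss-run table Claims(ClaimID, AccidentDate, Claimant, ...); all names are illustrative:

    -- True duplicates: the same Claim ID appears on more than one record.
    SELECT ClaimID, COUNT(*) AS NumRecords
    FROM Claims
    GROUP BY ClaimID
    HAVING COUNT(*) > 1;

    -- Duplicate files: the same accident/claimant pair carries multiple Claim IDs.
    SELECT AccidentDate, Claimant, COUNT(DISTINCT ClaimID) AS NumClaimIDs
    FROM Claims
    GROUP BY AccidentDate, Claimant
    HAVING COUNT(DISTINCT ClaimID) > 1;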
More Errors
3. Unidentified “Occurrences”
  SQL query with a User-Defined Function
4. Recoveries (SIF, S&S)
  Outer join with pre-aggregated sub-queries (sketched below)
5. Redundant fields' consistency
  Examples:
    Closed claims with an empty Closed Date
    Incurred less than Paid + Outstanding
  Correction: SQL UPDATE query (sketched below)
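A sketch of items 4 and 5, under assumed schemas: the Claims table from above plus a hypothetical Recoveries(ClaimID, Amount) detail table:

    -- 4. Attach pre-aggregated recoveries with an outer join,
    --    so claims without any recoveries are not dropped.
    SELECT c.ClaimID, c.Paid, COALESCE(r.TotalRecovered, 0) AS TotalRecovered
    FROM Claims AS c
    LEFT JOIN (SELECT ClaimID, SUM(Amount) AS TotalRecovered
               FROM Recoveries
               GROUP BY ClaimID) AS r
      ON c.ClaimID = r.ClaimID;

    -- 5. Restore the redundancy Incurred = Paid + Outstanding.
    --    Which field is authoritative is a business decision;
    --    this sketch assumes Paid + Outstanding wins.
    UPDATE Claims
    SET Incurred = Paid + Outstanding
    WHERE Incurred < Paid + Outstanding;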
More Errors
6. Dummy Records
  Fake claims for unallocated outsourced expenses (these “claims” distort XS amounts calculations)
  Unidentified subtotals
7. Y2K and other date-related issues
  8 out of 43 TPAs still not Y2K compliant
  NULL implementations (a detection sketch follows):
    01/01/01
    0 or another “magic” number or date
    1/0/1900
    11/01/1901
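A sketch of one such check against the Claims table assumed above; the suspect values below are illustrative stand-ins for NULL, not an exhaustive list:

    -- Flag "magic" placeholder dates that some systems use instead of NULL.
    SELECT ClaimID, ReportDate
    FROM Claims
    WHERE ReportDate IN (DATE '2001-01-01', DATE '1900-01-01', DATE '1901-11-01')
       OR ReportDate < DATE '1950-01-01';  -- implausibly old for a current loss run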
More Errors
8. Disappearing claims
  SQL query with a NOT IN sub-query (sketched below)
9. Non-monotonic gross losses
  Self-join SQL query (sketched below)
10. Consistent field definitions
  Statutory Page 14 Data or ISO statistical plan
  Reasonable expectations:
    Recoveries are negative
    Accident Date <= Report Date <= Closed Date (if any) <= Evaluation Date
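Minimal sketches of both checks, assuming a hypothetical evaluation-history table LossRun(EvalDate, ClaimID, GrossIncurred); the evaluation dates are illustrative:

    -- 8. Disappearing claims: present at the prior evaluation, gone now.
    SELECT ClaimID
    FROM LossRun
    WHERE EvalDate = DATE '1997-12-31'
      AND ClaimID NOT IN (SELECT ClaimID
                          FROM LossRun
                          WHERE EvalDate = DATE '1998-12-31');

    -- 9. Non-monotonic gross losses: gross incurred should never
    --    decrease from one evaluation to a later one.
    SELECT a.ClaimID, a.EvalDate AS EarlierEval, b.EvalDate AS LaterEval,
           a.GrossIncurred AS EarlierGross, b.GrossIncurred AS LaterGross
    FROM LossRun AS a
    JOIN LossRun AS b
      ON a.ClaimID = b.ClaimID AND a.EvalDate < b.EvalDate
    WHERE b.GrossIncurred < a.GrossIncurred;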
More and More and More Errors...
11. Online access & digital exchange
  Incomplete downloads
  Improper filtering
12. Human errors on data entry
  Manual data entry can easily defeat any (even the most sophisticated) validation system
13. Error propagation
  Errors propagate to the future
  Corrections should propagate to the past
Data Quality defined
Quality data has to satisfy the following characteristics:
Accuracy (violated in 6, 8, 9, 10, 12): the measure of the degree of agreement between a data value and a source assumed to be correct.
Completeness (1, 2, 3, 7, 8, 11): the degree to which values are present in the attributes that require them.
Consistency (5, 13): the requirement that data be free from variation or contradiction and satisfy a set of constraints.
Timeliness (4): the extent to which a data item or multiple items are provided at the time required or specified (the degree to which stored values are up to date).
Uniqueness (1, 2): the need for precise identification of each data record.
Validity (1, 2, 3, 6, 7, 9): the property of maintained data to satisfy the acceptance requirements of classification criteria, and the ability of the data values to pass tests for acceptability, producing desired results.
Addressing Data Quality issues
External Solution: certification of Data Sources (TPAs), covering
1. Data collection and entry (validation)
2. Data storage (database structure; see the sketch below)
3. Data manipulations (triggers)
4. Data exchange (report generators)
Internal Solution: the Data Quality Shield
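A sketch of what steps 1 and 2 can look like when validation is pushed into the database structure itself; the table and column names are hypothetical, and the constraints encode static business rules from the error catalog above:

    CREATE TABLE Claims (
        ClaimID      VARCHAR(20)   NOT NULL,
        CoverageType VARCHAR(10)   NOT NULL,   -- part of the key (error 1)
        AccidentDate DATE          NOT NULL,
        ReportDate   DATE          NOT NULL,
        ClosedDate   DATE,                     -- NULL, never a "magic" date (error 7)
        Paid         DECIMAL(12,2) NOT NULL DEFAULT 0,
        Outstanding  DECIMAL(12,2) NOT NULL DEFAULT 0,
        Incurred     DECIMAL(12,2) NOT NULL,
        PRIMARY KEY (ClaimID, CoverageType),   -- uniqueness (error 2)
        CHECK (AccidentDate <= ReportDate),    -- date ordering rule (error 10)
        CHECK (ClosedDate IS NULL OR ReportDate <= ClosedDate),
        CHECK (Incurred >= Paid + Outstanding) -- redundant-field consistency (error 5)
    );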
Data Quality Shield: definition
The Data Quality Shield is an integrated set of standardized routines, optimized for every external data source and comprising pre-load data filters and translators along with post-load data analysis tools, statistical diagnostics, and quality alarms.
This type of integration addresses two specific features of actuarial data: multiple external sources of data (TPAs) and the time-variant nature of the intended applications (actuarial methods).
Data Quality Shield: purpose
Establish standards (discovering and enforcing business rules, including time-variant business rules)
Validate input (checking that data values satisfy data definitions)
Eliminate redundant data
Resolve data conflicts (determining which piece of redundant but non-matching data is the correct one)
Propagate corrections and adjustments to prior evaluations of time-variant data
Data Quality Shield: possible interface
[Screenshot: the Data Quality Shield interface]
© 1998, Aleksey Popelyukhin. The screen design cannot be used in commercial packages without written permission.
Data Quality Shield: impossible without actuaries
Actuaries are the last line of defense: even with FDA certification of food quality, one should not give up one's immune system
Actuaries are well positioned for the discovery of Insurance Data Business Rules
Actuaries are best positioned for the discovery of Time-Variant Business Rules
Typical Data Errors found in TPAs' Loss Runs can be sharply divided into two major categories:
Violations of static business rules (those that need only a single Loss Run to be identified and fixed), and
Violations of time-variant business rules (those that track changes over time and need multiple Loss Runs for identification).
Actuarial assumptions testing: a necessary part of the Actuarial Process
Formulae don't work if their assumptions are violated:
1 + x + x^2 + x^3 + ... = 1/(1-x), evaluated at x = 2, produces nonsense (1 + 2 + 4 + 8 + ... = -1) because x = 2 is outside the domain of convergence |x| < 1 (made explicit below).
The Chain-Ladder algorithm, for example, produces similar nonsense if a significant diagonal effect is present or if columns of factors correlate.
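For completeness, the standard partial-sum identity behind the domain restriction (elementary calculus, not from the slides):

    \sum_{k=0}^{n} x^{k} = \frac{1 - x^{n+1}}{1 - x}
    \;\longrightarrow\; \frac{1}{1 - x} \quad (n \to \infty),
    \qquad \text{valid only when } |x| < 1.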
Actuarial assumptions testing: main source of Time-Variant Business Rules
Assumptions testing fails if:
Real loss emergence differs from the hypothetical one
Analyzed data contain significant Data Errors:
  non-monotonic number of Claims or amount of Losses
  unexpected seasonality effects
  outliers
Actuarial assumptions testing: outliers
Every regression and every hypothetical distribution may generate outliers
Outliers are indicators of potential Data problems
Ideally, every outlier has to be investigated
Current technology makes it easy to
  identify outliers (Excel, statistical packages; a flagging sketch follows)
  perform detailed analysis (OLAP with drill-down)
Outlier elimination is an iterative process
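A sketch of simple three-sigma flagging, assuming a hypothetical LinkRatios(AccidentYear, DevAge, Factor) table; AVG and STDDEV_SAMP are common SQL aggregates:

    -- Flag development factors more than three standard deviations
    -- from their column (development age) mean.
    SELECT r.AccidentYear, r.DevAge, r.Factor
    FROM LinkRatios AS r
    JOIN (SELECT DevAge,
                 AVG(Factor)         AS MeanF,
                 STDDEV_SAMP(Factor) AS SdF
          FROM LinkRatios
          GROUP BY DevAge) AS s
      ON r.DevAge = s.DevAge
    WHERE ABS(r.Factor - s.MeanF) > 3 * s.SdF;  -- flag for investigation, not auto-deletion

Because elimination is iterative, the means and standard deviations should be recomputed after each round of confirmed corrections.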
Conclusion
Actuarial Data Quality Testing is an integral part of the overall Actuarial Process
Actuaries are best positioned for the discovery of business rules, both static and time-variant
Actuarial Assumptions Testing breakthroughs will make Data Quality Testing as sophisticated as Actuarial Analysis itself
Technology breakthroughs will allow actuaries to perform Data Quality Testing themselves, without delegating it to other professionals
Epilogue
“In going over the 1998 year-end summary, you will notice that the category detail, ‘at a glance’, and ‘in detail’ sections refer to the year 1999 in error. This summary actually reflects all the activity in your account for the calendar year 1998. We apologize for any confusion this may cause.”
The largest US credit card issuer
Recommended Reading
1. Thomas C. Redman, Data Quality for the Information Age (Artech House, 1995)
2. Richard Marr, ed., Insurance Data Quality (IDMA, 1995)
3. Thomas Mack, Measuring the Variability of Chain Ladder Reserve Estimates (CAS, 1993)
4. Gary G. Venter, Checking Assumptions of Age-to-Age Factors (CLRS, 1994)
5. Ben Zehnwirth, Probabilistic Development Factor Models with Applications to Loss Reserve Variability, Prediction Intervals, and Risk Based Capital
The Whole Picture
Watch Your TPA: A Practical Introduction to Actuarial Data Quality Management, 1997
On Hierarchy of Actuarial Objects: Data Processing from the Actuarial Point of View, 1998
Let Me See: Visualization and Presentation of Actuarial Results, 1999
The Big Picture: Actuarial Process From the Data Management Point of View, 1996