The Use of Data in Commercial Lines P&C Insurance
Toronto Data Mining Forum, May 7, 2014
Stéphane McGee, BSc, FCIA, FCAS
Maki Dahchour, Ph.D., ACIA, ACAS
Adam Scarth, B.Comm., FCIA, FCAS
Agenda
An introduction to P&C Insurance
Specific examples in Commercial Automobile Insurance
Use of Data in Commercial Lines Insurance
Questions
Construction & Contracting
Manufacturing & Resources
Consumer & Business Services
Health, Education & Social Services
Transportation & Logistics
Actuaries use:
Statistics
Probability
Economics
Business Knowledge
To solve business problems.
Some Examples of Actuarial Problems:
Pricing New Products
Price Classification
Modeling Catastrophes
Graphing Size of Loss Distributions
Establishing Loss Reserves
Solvency Monitoring
Rate Adequacy Studies
Trending and Development of Losses
Projecting Future Rate Needs
Interaction with Other Departments
Underwriters
Actuarial
Finance
Claims
Frequency: Number of Claims per Risk Insured
Severity: Average Cost of each Claim
GL / TPL: General Liability / Third Party Liability
BI: Liability for Bodily Injuries
PD: Liability for Physical Damages
IBC Code: 4-digit code representing an industry
FSA: Forward Sortation Area: First 3 characters of a postal code
IBNR: Incurred But Not Reported
Glossary
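The frequency and severity definitions in the glossary can be illustrated with a short calculation. This is a minimal sketch: the portfolio figures are invented for illustration, and "pure premium" (expected loss per risk insured, the product of frequency and severity) is a standard actuarial term that does not appear on the slide.

```python
# Hypothetical portfolio figures, for illustration only.
earned_exposure = 2_000        # risks insured (e.g., car-years)
claim_count = 90               # number of claims
incurred_losses = 540_000.0    # total cost of those claims

frequency = claim_count / earned_exposure   # claims per risk insured
severity = incurred_losses / claim_count    # average cost of each claim
pure_premium = frequency * severity         # expected loss per risk insured

print(f"frequency={frequency:.3f}, severity={severity:.0f}, "
      f"pure premium={pure_premium:.2f}")
```

The product of frequency and severity equals total losses divided by exposure, which is why the two components are often modelled separately and then combined.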
What are we insuring?
• Injuries to drivers, passengers, and third parties
• Damage to vehicles
• Damage to physical structures
• Loss to contents inside the vehicle (cargo)
• Liability incurred by the operation of the company
Commercial Lines Auto
What drives risk/premiums?
• Characteristics of the driver(s)
• Types of vehicles
• Distance and locations travelled
• Cargo carried
• Characteristics of the fleet/operations
• Coverage options
Commercial Lines Auto
The Good
• Mandatory data reporting
• Lots of event data
• Non-fleet follows filed rates
The Bad
• Unstructured data collected for fleets
• Subjective rating variables
• Results are volatile
The Ugly
• Larger fleets self-insure lower layers
• Fraud!
Commercial Lines Auto
What does our team do?
• Pricing analytics
• Rate filings
• Large account pricing
• Financial planning
• Communication
Commercial Lines Auto
• 2013 Towers Watson predictive modeling survey
Industry trend
Source: Towers Watson
- Why is analytics lagging in P&C commercial lines?
• Focus on personal lines (urgent priority for some companies, more volume, better data quality)
• Data issues (see next slide)
• Lack of trust: P&C commercial lines pricing is mostly underwriter (UW) driven
- Which P&C commercial lines characteristics allow for more analytics?
• Almost no regulation (in Canada)
• Derived prices are advisory prices
• Impact of soft markets (analytics can help a company grow in such an environment)
Current State
• Volume
• Quality
• Dimensionality
• Loss development
• Exposure base
Commercial lines Data and Modelling challenges
Volume: e.g., GL is a low-frequency, high-severity line of business
Difficult to fit a distribution
Some solutions:
• Use more years of data
• Group sub-coverages/perils (e.g., analyze TPL combined instead of analyzing BI and PD separately)
• Incorporate competitor/industry rates (as an offset where there is not enough exposure)
• Model validation: avoid using a holdout sample; rely on consistency over time and judgment instead
Data volume
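The "consistency over time" check mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the presenters' method: the records, the segment labels "A" and "B", and the helper `relativity_by_year` are all hypothetical. The idea is to compute a segment's frequency relativity against a base segment separately for each year and see whether it points in the same direction every year.

```python
from collections import defaultdict

# Hypothetical records: (year, segment, claim_count, exposure). Illustrative only.
records = [
    (2010, "A", 12, 400.0), (2010, "B", 9, 450.0),
    (2011, "A", 15, 420.0), (2011, "B", 8, 440.0),
    (2012, "A", 10, 380.0), (2012, "B", 7, 460.0),
    (2013, "A", 14, 410.0), (2013, "B", 10, 455.0),
]

def relativity_by_year(rows, segment, base):
    """Return {year: frequency(segment) / frequency(base)}."""
    claims = defaultdict(float)
    exposure = defaultdict(float)
    for year, seg, n, e in rows:
        claims[(year, seg)] += n
        exposure[(year, seg)] += e
    years = sorted({year for year, _, _, _ in rows})
    result = {}
    for y in years:
        freq_segment = claims[(y, segment)] / exposure[(y, segment)]
        freq_base = claims[(y, base)] / exposure[(y, base)]
        result[y] = freq_segment / freq_base
    return result

relativities = relativity_by_year(records, "A", "B")

# The effect is "consistent" here if it is on the same side of 1.0 in every year.
values = list(relativities.values())
consistent = all(r > 1 for r in values) or all(r < 1 for r in values)

for year, r in relativities.items():
    print(year, round(r, 2))
print("consistent over time:", consistent)
```

A relativity that flips direction from year to year would be a signal to lean more heavily on judgment before using the variable.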
Quality: examples
• Null records, especially for attributes not used in pricing
• Difficulty in linking losses with policy attributes
• Changes in variable definitions over time (e.g., IBC codes)
Some solutions:
• Replace null records (imputation); various techniques exist
• Distinguish between null and zero values (e.g., credit)
• Focus more on variables used in rating and underwriting
• Work with IT and Claims departments to improve loss data files
Data quality
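The first two solutions above (imputation, and keeping null distinct from zero) can be sketched together. Median imputation is only one of the various techniques and is chosen here as an assumption; the credit values are invented.

```python
from statistics import median

# Hypothetical credit values: None means "not collected",
# while 0 is a genuine recorded value and must not be treated as missing.
credit_scores = [620, None, 0, 710, None, 580]

# Test for None explicitly; a truthiness test would wrongly drop the real zero.
observed = [s for s in credit_scores if s is not None]
fill_value = median(observed)

# Keep a missing-value flag next to each imputed value so a model can
# still tell imputed records apart from observed ones.
imputed = [(fill_value if s is None else s, s is None) for s in credit_scores]

print(fill_value)
print(imputed)
```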
Dimensionality: a variable with a large number of levels (e.g., IBC codes, FSA, car codes)
Difficult to use as is as a predictor in a model
Solutions:
• Grouping: can rely on UW’s knowledge
• Dimension reduction: use proxies of the variable in question (e.g., for IBC codes, could use sector, major class, commercial credit, average size of businesses in the sector, hazard grade, whether the business is B2B or B2C, relative risk ranking of the IBC code by the company, etc.)
Dimensionality
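The proxy approach to dimension reduction can be sketched as a lookup that replaces each high-cardinality code with a few low-cardinality attributes. Everything in the table below is hypothetical: the 4-digit codes are invented placeholders rather than real IBC codes, and the hazard grades and B2B flags are illustrative values only.

```python
# Hypothetical lookup: 4-digit code -> proxy attributes.
# Codes, grades and flags are invented for illustration.
IBC_PROXIES = {
    "9001": {"sector": "Construction & Contracting", "hazard_grade": 4, "b2b": True},
    "9002": {"sector": "Construction & Contracting", "hazard_grade": 3, "b2b": True},
    "9003": {"sector": "Transportation & Logistics", "hazard_grade": 4, "b2b": True},
    "9004": {"sector": "Consumer & Business Services", "hazard_grade": 1, "b2b": False},
}
UNKNOWN = {"sector": "Unknown", "hazard_grade": None, "b2b": None}

def to_proxies(ibc_code):
    """Map a code to its proxy attributes, with an explicit bucket for unmapped codes."""
    return IBC_PROXIES.get(ibc_code, UNKNOWN)

policy_codes = ["9001", "9003", "9004", "9002", "1234"]
features = [to_proxies(code) for code in policy_codes]
distinct_sectors = {f["sector"] for f in features}

print(len(set(policy_codes)), "codes ->", len(distinct_sectors), "sectors")
```

A model then uses the proxy columns as predictors instead of the raw code, so the number of fitted levels stays small even as new codes appear.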
Loss development: can be important for long-tailed lines of business (e.g., GL)
Some solutions:
• Use an existing IBNR allocation to bring losses to their ultimate values
• When dealing with undeveloped losses, use “year” as a predictor in the model and an acceptable number of years of data
Loss development
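Bringing losses to ultimate with an existing IBNR allocation can be sketched as below. This assumes IBNR is available by accident year and is spread over claims in proportion to reported losses; that proportional rule is an assumption made for illustration, and all amounts are invented.

```python
# Hypothetical reported losses and allocated IBNR by accident year.
reported = {2010: 1_000_000.0, 2011: 900_000.0, 2012: 800_000.0}
ibnr = {2010: 50_000.0, 2011: 180_000.0, 2012: 400_000.0}

# Factor that scales reported losses to ultimate: (reported + IBNR) / reported.
to_ultimate = {year: (reported[year] + ibnr[year]) / reported[year]
               for year in reported}

def ultimate_loss(accident_year, reported_loss):
    """Scale a single reported loss to its estimated ultimate value."""
    return reported_loss * to_ultimate[accident_year]

for year, factor in to_ultimate.items():
    print(year, round(factor, 3))
print(round(ultimate_loss(2012, 20_000.0), 2))
```

The most recent year carries the largest factor, which is the pattern expected for a long-tailed line where recent claims are the least developed.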
Two issues:
• Multiple exposure bases (e.g., GL: sales/payroll/area): which exposure base should be used as the weight in the model?
• A linear relationship is assumed between the weight in a model and the response variable:
– Auto: car-year (true)
– GL: payroll/sales (questionable)
Some solutions:
• Model each component separately
• Use the dominant exposure base and a conversion formula for the others
• Use earned years as the weight in the model and the exposure base as a predictor
Exposure base
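One way to probe the linearity assumption is to treat the exposure base as a predictor and look at the slope of log(claims per earned year) against log(exposure). A slope near 1 supports using the exposure base directly as the weight; a slope well below 1 suggests it belongs in the model as a predictor instead. The sketch below uses invented data, deliberately generated so that claims grow with the square root of payroll; a real analysis would fit this inside the pricing model rather than with a standalone regression.

```python
import math

# Hypothetical accounts: (payroll, claims per earned year), generated so that
# claims grow with the square root of payroll rather than linearly.
accounts = [(p, 0.002 * math.sqrt(p)) for p in (1e5, 4e5, 1.6e6, 6.4e6)]

x = [math.log(payroll) for payroll, _ in accounts]
y = [math.log(freq) for _, freq in accounts]
x_bar = sum(x) / len(x)
y_bar = sum(y) / len(y)

# Ordinary least-squares slope of log(frequency) on log(payroll).
slope = (sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
         / sum((a - x_bar) ** 2 for a in x))

print(round(slope, 3))  # a value near 1 would support the linear assumption
```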