GeneralJaideep Srivastava Computer Science & Engineering
[email protected]
Customer segmentation Customer loyalty Customer retention
Analytical CRM architecture Data warehouse Dimensional data
modeling On-line analytical processing (OLAP)
Data mining Amazon.com: case study in building customer loyalty
Analytics behind e- marketing Yodlee.com: case study in web
business intelligence Privacy issues Conclusion
2/25/2008 © Jaideep Srivastava 3
Technology Trends Internet growth
Faster than any other infrastructure Data collection
Rapid drop in storage costs Dramatic improvement in resolution and
rate of data collection ‘probes’
Data analytics Increasing deployment of warehouses Major leap
forward in data mining technologies and tools
Becoming possible to really understand what your customers want –
even at the individual level!!
2/25/2008 © Jaideep Srivastava 4
Millions of users
Radio TV Cable Internet
2/25/2008 © Jaideep Srivastava 5
Product Marketing – 75 years ago
• Production – a la Adam Smith • You can have any color as long as
its black – Ford Motor Co.
2/25/2008 © Jaideep Srivastava 6
2/25/2008 © Jaideep Srivastava 7
New approach to marketing TO: Finding products that are right for
each customer
TURN the process through 90 degrees
FROM: Finding customers that are right for each product
Products: 1 2 3 4 5 ….. To achieve this we need to align
around
•Organization and culture •Business processes and skill
•Measurement and incentives •Information management
•Technology
2/25/2008 © Jaideep Srivastava 8
Mass production Cheap to produce Efficient to produce Uniform
features/quality ‘one size fits all’ approach Optimize production
cost
Customization Expensive to produce Inefficient to produce
Customized features ‘tailor made’ approach Optimize customer
satisfaction
Mass customization Cheap & efficient to produce Customized
features ‘tailor made’ approach Optimize production cost &
customer satisfaction
2/25/2008 9
2/25/2008 © Jaideep Srivastava 11
Customer care & support Incident
assignment/escalation/tracking/reporting Problem
management/resolution Order management/promise fulfillment
Warranty/contract management
Field service support Work orders, dispatching Real time
information transfer to field personnel via mobile
technologies
2/25/2008 © Jaideep Srivastava 12
Contact management profiles and history Account management
including activities Order entry Proposal generation
Sales management Pipeline analysis, e.g. forecasting Sales cycle
analysis Territory alignment Roll-up and drill-down reporting
2/25/2008 13
2/25/2008 © Jaideep Srivastava 14
CRM = software + support services European CRM expenditure = $1.2B
+ $3.0B = $4.2B*
UK marketing service industry growing at 17.4% to $7.7B
CRM
*Hewson Consulting October 2000
2/25/2008 © Jaideep Srivastava 15
But - satisfaction is declining
2/25/2008 © Jaideep Srivastava 16
2/25/2008 © Jaideep Srivastava 17
Increasing customer resistance
98% of customer solicitations are irrelevant 82% of individuals
would like to block all marketing access to their own data Campaign
hit rates and customer loyalty indicators are declining
2/25/2008 © Jaideep Srivastava 18
Consequently
The ‘best’ customers are being over communicated to Today’s less
valuable customers are not being developed into tomorrow’s ‘best’
customers The business potential of the customer base is not being
maximized
2/25/2008 © Jaideep Srivastava 19
Analytics = OLAP, Statistical analysis, data mining, etc.
2/25/2008 © Jaideep Srivastava 20
Customer segmentation
2/25/2008 © Jaideep Srivastava 21
Customer segmentation Purpose of segmentation is to identify groups
of customers with similar needs and behavior patterns, so that they
be offered more tightly focused
Products Services Communications
Segments should be Identifiable Quantifiable Addressable Of
sufficient size to be worth addressing
Two approaches to segmentation cluster common characteristics, and
then map out behavior patterns Separate out behavior patterns, then
identify segment characteristics
2/25/2008 © Jaideep Srivastava 22
High
2/25/2008 © Jaideep Srivastava 23
1200 1000 800 600 400 200
0 -200 -400 -600 -800
-1000 -1200
Who are these customers; what do they look like?
Middle 60%, either side of break even. What can we do about
these?
Are these worth keeping? Can we service them with a lower cost
channel? What can we do to make this segment profitable?
Should the focus be on retaining wallet share from segments 8 – 10?
Or, on gaining from segments 1 – 4?
Profit
Deciles
2/25/2008 © Jaideep Srivastava 26
Relationship intensity and defection odds
Evidence suggests that customer ‘lock in’ occurs once 4 or more
products are purchased
Odds of not defecting
2/25/2008 © Jaideep Srivastava 27
A difference of opinion …
32%
2%
Customers are happy with our customer service
We research customer service needs and wants as part of our
customer service improvement
Customer service needs no improvement
Customer service today is better than ever
2/25/2008 © Jaideep Srivastava 28
43%
7%
We want to develop a relationship with our customers
We want to form and develop a relationship with our suppliers
The relationship now is stronger than 12 months ago
2/25/2008 © Jaideep Srivastava 29
Actions which build relationship warmth
•No-fault service •“Have a nice day” •Targeted sales
Customer relationship profitability
2/25/2008 © Jaideep Srivastava 30
Loyalty is built through a virtuous circle of new customer
experience
Virtuous circle of customer experience
Superlative Customer serviceProvides legitimacy
Innovative new products
2/25/2008 © Jaideep Srivastava 31
TIME
2/25/2008 © Jaideep Srivastava 32
•Behavioral Propensity Model based Campaigns generate New Customers
•Selective score-based phone follow-up more than doubles response
•“Event-driven”(Trans. Vol. & Value) Campaigns to stimulate
initial usage of credit-card. •Propensity model + “Event-driven”
Customer Retention program identifies likely non-renewers 3 months
prior to renewal, and kicks in usage stimulation program •Different
offers (“Frequent User Club” versus Premium) being tested
Impact: Over 100% improvement in both Acquisition and Retention.
New market opened up.
2/25/2008 © Jaideep Srivastava 33
Using Negative Events to drive Positive Sales
Event = “ATM request for cash” is rejected due to lack of
funds.
For credit-worthy customers, unsecured personal loan is offered by
mail or phone the next day!
30% acceptance rate of product offered.
Impact: Significant cross-sales of additional product
Significant reduction in negative reactions
2/25/2008 34
Inbound Call Centre
• IMPERSONAL
Impact!Impact!
• PERSONALISED
2/25/2008 © Jaideep Srivastava 38
Monitor &
Integrator
Data Sources Tools
2/25/2008 © Jaideep Srivastava 40
Data Warehouse A decision support database that is maintained
separately from the organization’s operational database A data
warehouse is a
subject-oriented, integrated, time-varying, non-volatile
collection of data that is used primarily in organizational
decision making.
2/25/2008 © Jaideep Srivastava 41
Data Warehouse - Subject Oriented subject oriented: oriented to the
major subject areas of the corporation that have been defined in
the data model.
E.g. for an insurance company: customer, product, transaction or
activity, policy, claim, account, and etc.
operational DB and applications may be organized differently
E.g. based on type of insurance's: auto, life, medical, fire,
...
2/25/2008 © Jaideep Srivastava 42
Data Warehouse - Integrated
There is no consistency in encoding, naming conventions, … among
different data sources When data is moved to the warehouse, it is
converted.
2/25/2008 © Jaideep Srivastava 43
Data Warehouse - Non-Volatile
Operational data is regularly accessed and manipulated a record at
a time and update is done to data in the operational environment.
Warehouse Data is loaded and accessed. Update of data does not
occur in the data warehouse environment.
2/25/2008 © Jaideep Srivastava 44
Data Warehouse - Time Variance The time horizon for the data
warehouse is significantly longer than that of operational systems.
Operational database contain current value data. Data warehouse
data is nothing more than a sophisticated series of snapshots,
taken as of some moment in time. The key structure of operational
data may or may not contain some element if time. The key structure
of the data warehouse always contains some element of time.
2/25/2008 © Jaideep Srivastava 45
Data Sources Data sources are often the operational systems,
providing the lowest level of data. Data sources are designed for
operational use, not for decision support, and the data reflect
this fact. Multiple data sources are often from different systems
run on a wide range of hardware and much of the software is built
in-house or highly customized. Multiple data sources introduce a
large number of issues - - semantic conflicts.
2/25/2008 © Jaideep Srivastava 46
Data Cleaning Important to warehouse clean data (operational data
from multiple sources are often dirty). Three classes of
tools
Data migration: allows simple data transformation Data Scrubbing:
uses domain-specific knowledge to scrub data Data auditing:
discovers rules and relationships by scanning data (detect
outliers).
2/25/2008 © Jaideep Srivastava 47
Load and Refresh Loading the warehouse includes some other
processing tasks: checking integrity constraints, sorting,
summarizing, build indxes, etc. Refreshing a warehouse means
propagating updates on source data to the data stored in the
warehouse
when to refresh determined by usage, types of data source,
etc.
how to refresh data shipping: using triggers to update snapshot log
table and propagate the updated data to the warehouse transaction
shipping: shipping the updates in the transaction log
2/25/2008 © Jaideep Srivastava 48
Monitor detect changes to an information source that are of
interest to the warehouse
define triggers in a full-functionality DBMS examine the updates in
the log file write programs for legacy systems
propagate the change in a generic form to the integrator
2/25/2008 © Jaideep Srivastava 49
Integrator receive changes from the monitors
make the data conform to the conceptual schema used by the
warehouse
integrate the changes into the warehouse merge the data with
existing data already present resolve possible update
anomalies
2/25/2008 © Jaideep Srivastava 50
Metadata Repository Administrative metadata
source database and their contents gateway descriptions warehouse
schema, view and derived data definitions dimensions and
hierarchies pre-defined queries and reports data mart locations and
contents data partitions data extraction, cleansing, transformation
rules, defaults data refresh and purge rules user profiles, user
groups security: user authorization, access control
2/25/2008 © Jaideep Srivastava 51
Metadata Repository Business data
Operational metadata data lineage: history of migrated data and
sequence of transformations applied currency of data: active,
archived, purged Monitoring information: warehouse usage
statistics, error reports, audit trails
2/25/2008 © Jaideep Srivastava 52
Data Marts A data mart (departmental data warehouse) is a
specialized system that brings together the data needed for a
department or related applications. Data marts can be implemented
within the data warehouse by creating special, application-specific
views. Data marts can also be implemented as materialized views
departmental subsets that focus on selected subjects. Data marts
may use different data representations and include their own OLAP
engines
2/25/2008 © Jaideep Srivastava 53
Other Tools User interface that allows users to interact with the
warehouse
query and reporting tools analysis tools data mining tools
2/25/2008 54
Star schema: A single object (fact table) in the middle
connected to a number of objects (dimension tables)
Snowflake schema: A refinement of star schema where
the dimensional hierarchy is represented explicitly by
normalizing the dimension tables.
tables.
Product
Store
Product
AIR-EXPRESS
2/25/2008 © Jaideep Srivastava 59
Summary Tables Data warehouse may store some selected summary data,
the pre-aggregated data. Summary data can store as separate fact
tables sharing the same dimension tables with the base fact table.
Summary data can be encoded in the original fact table and
dimension tables.
DateID ProdID Sales 0 1 1000 1 1 20000 1 2 40000 3 1 300000
id level date month year 0 1 1 1 1998 1 2 NULL 1 1998 2 2 NULL 2
1998 3 3 NULL NULL 1998
2/25/2008 © Jaideep Srivastava 60
Sales volume as a function of product, time, and geography
Pr od
uc t
Reg ion
Industry Country Year
Category Region Quarter
2/25/2008 © Jaideep Srivastava 61
A Sample Data Cube Total annual sales of TV in China.Date
Pro du
India
Japan
sum
total sales volume last year by product category by region
Roll down, drill down, drill through: go from higher level summary
to lower level summary or detailed data
For a particular product category, find the detailed sales data for
each salesperson by date
Slice and dice: select and project Sales of beverages in the West
over the last 6 months
Pivot: reorient cube
FROM SALES
(date, product, customer), (date,product),(date, customer),
(product, customer), (date), (product) (customer)
2/25/2008 © Jaideep Srivastava 65
RData cube can be viewed as a lattice of cuboids
The bottom- most cuboid is the base cube.
The top most cuboid contains only one cell.
2/25/2008 © Jaideep Srivastava 66
Cube Computation -- Array Based Algorithm
An MOLAP approach: the base cuboid is stored as a multidimensional
array Read in a number of cells to compute partial cuboids
B {ABC} {AB} {AC} {BC}
{A} {B} {C} { }
exploits services of relational engine effectively provides
additional OLAP services
design tools for DSS schema performance analysis tool to pick
aggregates to materialize
SQL comes in the way of sequential processing and columnar
aggregation Some queries are hard to formulate and can often be
time consuming to execute
2/25/2008 © Jaideep Srivastava 68
ROLAP versus MOLAP MOLAP
Front-end multidimensional queries map to server capabilities in a
straightforward way
Direct addressing abilities
2/25/2008 69
Data Mining
© Jaideep Srivastava
unknown and potentially useful) information from data in large
databases
Alternative names and their “inside stories”: Data mining: a
misnomer? Knowledge discovery in databases (KDD: SIGKDD), knowledge
extraction, data archeology, data dredging, information harvesting,
business intelligence, etc.
What is not data mining? (Deductive) query processing. Expert
systems or small ML/statistical programs
2/25/2008 © Jaideep Srivastava 71
Examples of Interesting Knowledge
Association rules 98% of people who purchase diapers also buy
beer
Classification People with age less than 25 and salary > 40k
drive sports cars
Similar time sequences Stocks of companies A and B perform
similarly
Outlier Detection Residential customers for telecom company with
businesses at home
© Jaideep Srivastava
Motivation: “Necessity is the Mother of Invention”
Data explosion problem: Automated data collection tools and mature
database technology lead to tremendous amounts of data stored in
databases, data warehouses and other information
repositories.
We are drowning in data, but starving for knowledge!
Data warehousing and data mining : On-line analytical
processing
Extraction of interesting knowledge (rules, regularities, patterns,
constraints) from data in large databases.
© Jaideep Srivastava
Data Mining and Business Intelligence Increasing potential to
support business decisions End User
Business Analyst
Data Analyst
Data Warehouses / Data Marts
© Jaideep Srivastava
Statistics
2/25/2008 75
Data mining: the core of knowledge discovery process.
Data Cleaning
Data Integration
Data Warehouse
Task-relevant Data
Steps of a KDD Process Learning the application domain:
relevant prior knowledge and goals of application Creating a target
data set: data selection Data cleaning and preprocessing: (may take
60% of effort!) Data reduction and projection:
Find useful features, dimensionality/variable reduction, invariant
representation.
Choosing functions of data mining summarization, classification,
regression, association, clustering.
Choosing the mining algorithm(s) Data mining: search for patterns
of interest Interpretation: analysis of results.
visualization, transformation, removing redundant patterns, etc.
Use of discovered knowledge.:
2/25/2008 78
© Jaideep Srivastava
Summarization (characterization), comparison, association,
classification, clustering, trend, deviation and pattern analysis,
etc. Mining knowledge at different abstraction levels: primitive
level, high level, multiple-level, etc.
Databases to be mined: Relational, transactional, object-oriented,
object- relational, active, spatial, time-series, text, multi-
media, heterogeneous, legacy, etc.
Techniques adopted: Database-oriented, data warehouse (OLAP),
machine learning, statistics, visualization, neural network,
etc.
© Jaideep Srivastava
Predictive data mining
Different views, different classifications: Kinds of knowledge to
be discovered,
Kinds of databases to be mined, and
Kinds of techniques adopted.
Generalize, summarize, and possibly contrast data characteristics,
e.g., dry vs. wet regions.
Association:
finding rules like “inside(x, city) near(x, highway)”.
Classification and Prediction:
Classify data based on the values in a classifying attribute, e.g.,
classify countries based on climate, or classify cars based on gas
mileage.
Predict some unknown or missing attribute values based on other
information.
© Jaideep Srivastava
Clustering:
Group data to form new classes, e.g., cluster houses to find
distribution patterns.
Time-series analysis: Trend and deviation analysis: Find and
characterize evolution trend, sequential patterns, similar
sequences, and deviation data, e.g., stock analysis.
Similarity-based pattern-directed analysis: Find and characterize
user-specified patterns in large databases. Cyclicity/periodicity
analysis: Find segment-wise or total cycles or periodic behaviours
in time-related data.
Other pattern-directed or statistical analysis:
© Jaideep Srivastava
Relational databases
Data warehouses
Transactional databases
Spatial databases
Heterogeneous and legacy databases
Are All the “Discovered” Patterns Interesting?
A data mining system/query may generate thousands of patterns, not
all of them are interesting.
Suggested approach: Query-based, focused mining
Interestingness measures: A pattern is interesting if it is easily
understood by humans valid on new or test data with some degree of
certainty. potentially useful novel, or validates some hypothesis
that a user seeks to confirm
Objective vs. subjective interestingness measures: Objective: based
on statistics and structures of patterns, e.g., support,
confidence, etc. Subjective: based on user’s beliefs in the data,
e.g., unexpectedness, novelty, etc.
© Jaideep Srivastava
Can It Find All and Only Interesting Patterns?
Find all the interesting patterns: Completeness. Can a data mining
system find all the interesting patterns?
Search for only interesting patterns: Optimization. Can a data
mining system find only the interesting patterns? Approaches
First general all the patterns and then filter out the
uninteresting ones. Generate only the interesting patterns ---
mining query optimization
© Jaideep Srivastava
Requirements and Challenges in Data Mining
Mining methodology issues Mining different kinds of knowledge in
databases. Interactive mining of knowledge at multiple levels of
abstraction. Incorporation of background knowledge Data mining
query languages and ad-hoc data mining. Expression and
visualization of data mining results. Handling noise and incomplete
data Pattern evaluation: the interestingness problem.
Performance issues: Efficiency and scalability of data mining
algorithms. Parallel, distributed and incremental mining
methods.
© Jaideep Srivastava
Requirements/Challenges in Data Mining (Cont.)
Issues relating to the variety of data types: Handling relational
and complex types of data Mining information from heterogeneous
databases and global information systems.
Issues related to applications and social impacts: Application of
discovered knowledge.
Domain-specific data mining tools Intelligent query answering
Process control and decision making.
Integration of the discovered knowledge with existing knowledge: A
knowledge fusion problem. Protection of data security and
integrity.
2/25/2008 88
2/25/2008 © Jaideep Srivastava 89
Need CreationNeed Creation anticipate/stimulate
Post purchase experiencePost purchase experience add value
2/25/2008 © Jaideep Srivastava 90
anticipate/stimulateNeed CreationNeed Creation
2/25/2008 © Jaideep Srivastava 92
2/25/2008 © Jaideep Srivastava 93
••11--click purchaseclick purchase ••‘‘slippery check out
counterslippery check out counter’’ vs. vs. ‘‘sticky aislessticky
aisles’’
2/25/2008 © Jaideep Srivastava 95
2/25/2008 © Jaideep Srivastava 96
2/25/2008 © Jaideep Srivastava 97
Why is loyalty important Amazon’s ‘customer lifetime value’ model
(for book buyers
Average $50 for first time purchase Average $40 per visit
thereafter Average of one visit per 2 months Assume customer will
be active for 10 years – not validated yet
‘4 buys and you are hooked’ empirical law Use Alexa data to bring
back ‘prodigal sons’ (and daughters)
2/25/2008 © Jaideep Srivastava 98
Time
Internet Marketing Insight – Jeff Bezos
Role of Advertisement – get customer to the store Customer
experience – get customer to buy
Brick & mortar stores Getting customer to store is the hard
part Shopping cart abandonment is not common, since the overhead of
going to another store is very high – especially in Minnesota
winters!
Marketing expenses 80% for advertisement; 20% for customer
experience
The 80-20 rule is reversed for on-line stores – Jeff Bezos
2/25/2008 © Jaideep Srivastava 101
Remarks on Amazon.com A very innovative company – the poster child
for e-commerce Is pushing the envelope in personalization Customers
love it Will it make money – we’re all waiting to see
A company of the future, with a product of the past, in a market of
the present
2/25/2008 102
looney.cs.umn.edu han - [09/Aug/1996:09:53:52 -0500] "GET
mobasher/courses/cs5106/cs5106l1.html HTTP/1.0" 200 mega.cs.umn.edu
njain - [09/Aug/1996:09:53:52 -0500] "GET / HTTP/1.0" 200 3291
mega.cs.umn.edu njain - [09/Aug/1996:09:53:53 -0500] "GET
/images/backgnds/paper.gif HTTP/1.0" 200 3014 mega.cs.umn.edu njain
- [09/Aug/1996:09:54:12 -0500] "GET /cgi-bin/Count.cgi?df=CS
home.dat\&dd=C\&ft=1 HTTP mega.cs.umn.edu njain -
[09/Aug/1996:09:54:18 -0500] "GET advisor HTTP/1.0" 302
mega.cs.umn.edu njain - [09/Aug/1996:09:54:19 -0500] "GET advisor/
HTTP/1.0" 200 487 looney.cs.umn.edu han - [09/Aug/1996:09:54:28
-0500] "GET mobasher/courses/cs5106/cs5106l2.html HTTP/1.0"
200
. . . . . . . . .
Access Log Format IP address userid time method url protocol status
size
mega.cs.umn.edu njain 09/Aug/1996:09:54:31
advisor/csci-faq.html
Other Server Logs: referrer logs, agent logs Application server
logs: business event logging
2/25/2008 © Jaideep Srivastava 104
of reaching final state •Maximize expected
sales from each visit Enter store
Browse catalog
Select items
Complete purchase
cross-sell promotions
up-sell promotions
‘sticky’ states
• Shopping pipeline modeled as state transition diagram •
Sensitivity analysis of state transition probabilities • Promotion
opportunities identified • E-metrics and ROI used to measure
effectiveness
2/25/2008 © Jaideep Srivastava 105
dollars spent in past quarter
1 2 3 4 65
1500
1000
500
7
H M
2/25/2008 © Jaideep Srivastava 106
frequency
tenure
monetary recency
• modeled customers in a 4-dim space • used PCA to determine
relative weights
of each dimension • Composite Score = w1*recency + w2*frequency
+
w3*monetary + w4*tenure
2/25/2008 © Jaideep Srivastava 107
Score
… … … … …
72%
Cust M
Cust H
• Cust M => frequent visitor but low spender => potential for
acquiring higher wallet share => focus on improving
relationship
• Cust H => infrequent visitor but heavy spender => focus on
sustaining relationship
2/25/2008 108
2/25/2008 © Jaideep Srivastava 109
Current Situation: Consumer Confusion
“It takes me two hours to get to all my accounts”
“I can’t look at my assets across accounts”
“I can’t remember all my user IDs and passwords”
“I want the web to work for me, not the other way around”
“This is overwhelming……I need some help”
“Make it easier for me!”
2/25/2008 © Jaideep Srivastava 110
Solution – Personal Information Aggregation
2/25/2008 © Jaideep Srivastava 111
Business Intelligence Benefits to Corporation
‘Tip-of-the-iceberg’ analysis for a brokerage house Lifestyle
preference analysis of banking customers for a survey
‘True-wallet-share’ analysis for a credit card organization Dynamic
targeting for banner advertisements, e-mail campaigns, etc.
2/25/2008 © Jaideep Srivastava 113
Asset Based Tiers
Number of Users
> $25M 9
• This brokerage house treated customers with net worth > $1M as
‘high net worth’ (HNW) customers with specialized services
• Almost none of the customers in the green region had > $1M
with this brokerage
2/25/2008 © Jaideep Srivastava 114
- 53% have at least one online banking account
- 51% have an online credit card account -- higher than
Yodlee users as a whole
- 31% also have an E*Trade account, and 11% also have a
Schwab account
preference for users as a whole
- The most popular credit card is American Express
Financial Preferences
25% make travel reservations online -- fewer than users as a
whole
- Expedia is more popular as an on- line travel site than
Travelocity
- 49% have a frequent flier account -- higher than users as a
whole
-The favorite frequent flier programs are United, Delta, American,
in that order
- Half as many of co-brand users shop on Ebay than users as a
whole
Lifestyle Preferences
‘True-Wallet-Share’ Analysis for a Credit Card Organization
Analysis of credit card balance habits of user base • There are1386
people, each of which carries a total balance between $1000 and
$2000 on all credit cards that (s)he owns • 292 of these 1386
people own discover cards, and carry an average balance of $174.55
• 540 of these 1386 people own AmEx cards, with an average balance
of $988.97 • 323 of these 1386 people carry one or more Visa, with
an average Visa network balance of $1018.50
Range Total Users Discover American Express
Mastercard Visa Other Average Total< $100 462 4.13
(73)
$100 - $200 232 -12.61 (39) 120.17 (66) 0 89.95
(40)
167.10 (156) 149.44
$200 - $500 643 36.97 (107) 253.77 (207) 0 218.93 (135) 272.42
(421) 342.99 $500 - $1000 968 75.57 (182) 571.09 (378) 0 597.83
(217) 623.36 (593) 893.47
$1000 - $2000
(1)
957.69
(1)
3648.40
(3)
Business Implications of True Wallet Share Analysis
A credit card offeror knows exactly how much money customers
holding its cards spend (every month) on its card vs. that on the
competition’s cards Offeror can target users falling within various
segments for specific customer acquisition, retention, etc.
purposes Detailed profile and history information of these users
can be used for precision targeting and customer messaging through
various channels including ad serving, e-mail campaigns,
promotions, etc. If transaction level detail information of these
users is analyzed, it can be determined exactly which credit cards
are being used by aggregation users as a whole for what kind of
lifestyle activity, e.g. travel, entertainment, shopping,
groceries, etc; this can help partner decide which market segments
to focus on
2/25/2008 © Jaideep Srivastava 117
Business Implications (contd.) The analysis above, if carried out
at an individual user level detail, can be used to target
individual customers with specific promotions, etc. Transaction
level detail can be classified into charges to specific
organizations, department stores, airlines, etc. This will identify
the top organizations that aggregation users spend money at, either
on the partner’s card or on a competing network. This would be
useful in determining which organizations to partner with for
customer retention, and acquisition, respectively All of these
analyses if performed periodically, and tracked over time, can
provide valuable insight into the evolving credit balance
distribution and usage behavior at the user population or
individual user level
2/25/2008 © Jaideep Srivastava 118
2/25/2008 © Jaideep Srivastava 122
2/25/2008 © Jaideep Srivastava 123
2/25/2008 © Jaideep Srivastava 124
Problem: Tired of mistreatment by financial institutions …
You have tons of money in your investment portfolio But you are
over-worked and slipped a couple of credit card payment deadlines –
after all you are busy managing your investment portfolio Credit
card institution treats you like a deadbeat
2/25/2008 © Jaideep Srivastava 125
Solution Why not let the credit card institution know what your
investment portfolio balance is? Impress them Perhaps even
authorize credit card company to transfer funds from your
investment account to cover the payment? Or maybe not
2/25/2008 © Jaideep Srivastava 126
So, what’s the catch… Shopping example
Allow the vendor to collect detailed information about you and
build an accurate profile Junk mail is only a nuisance for the
receiver, but an expense for the sender! – the sender wants to
avoid it more than the receiver!!
Credit card example Allow the credit card company and investment
company to share your information
Multiple online accounts example Hand over your account names and
passwords to aggregation service Sounds scary – but over 1.5
million people have done this in about 18 months’ time!!
2/25/2008 © Jaideep Srivastava 127
let’s now talk about privacy …
Merriam Webster definition a: the quality or state of being apart
from company or observation b : freedom from unauthorized
intrusion
Justice Oliver Wendell Holmes “the right to be left alone”
Operational definition Collection and analysis of personal data
beyond some limit
2/25/2008 © Jaideep Srivastava 128
Public Attitude Towards Privacy A (self-professed) non scientific
study carried out by a USA Today reporter Asked 10 people the
following two questions
Are you concerned about privacy? 8 said YES If I buy you a Big Mac,
can I keep the wrapper (to get fingerprints)? 8 said YES
ACM E-Commerce 2001 paper [Spiekermann et al] Most people willing
to answer fairly personal questions to anthropomorphic web-bot,
even though not relevant to the task at hand Different privacy
policies had no impact on behavior Study carried out in Europe,
where privacy consciousness is (presumably) higher
2/25/2008 © Jaideep Srivastava 129
Public Attitude (contd.) Amazon.com (and practically every
commercial site) uses cookies to identify and track visitors
97.6% of Amazon.com customers accepted cookies
Airline frequent flier programs with cross promotions
We willingly agree to be tracked Get upset if the tracking
fails!
Over 1.5 million people have trusted the aggregation service
(called Yodlee) with the names and passwords of their financial
accounts in less than 18 months Adoption rate has been over 3 times
the most optimistic projections
Medical data is (perhaps) an exception to this
2/25/2008 © Jaideep Srivastava 130
What people really want Some people will not share any kind of
private data at any cost – the ‘paranoids’ Some people will share
any data for returns – the ‘Jerry Springerites’ The vast majority
in the middle wants
a reasonable level of comfort that private data about them will NOT
be misused Tangible and compelling benefits in return for sharing
their private data – Big Mac example, frequent flier programs
2/25/2008 © Jaideep Srivastava 131
Remarks on Privacy Is it ‘much ado about nothing’?
If indeed data collection was outlawed, and thus personalization
impossible, wouldn’t the public lose – faced with generic,
undifferentiated products/services? Given the public’s attitude
about privacy (as shown in their actions), are privacy advocates
barking up the wrong tree? Is it just a matter of time or
generational issue, e.g. adoption of credit cards
Where do we stand? Current position - loss of your privacy may be
beneficial for you Emerging position (post September 11th ) - loss
of your privacy will be beneficial for everyone Critical emerging
debate - is privacy a right or a privilege?
2/25/2008 © Jaideep Srivastava 132
Concluding Remarks Internet is a high bandwidth, low latency,
negligible cost, interactive channel to the customer Very high
adoption rates for this channel Processing speeds and storage
capacities continuing to increase while costs continue to fall Data
analytics technology has grown rapidly Customer facing applications
are ready for a paradigm shift Innovative companies have moved
ahead Privacy is an issue, but not much of a concern
Data Analytics forCustomer Facing Applications
Presentation Outline
Technology Trends
New approach to marketing
Customer Facing Applications
Customer Facing Applications
Customer Facing Applications
Customer Facing Applications
Companies are spending mega-budgets on CRM
But - satisfaction is declining
Increasing customer resistance
Customer segmentation
Relationship intensity anddefection odds
A difference of opinion …
Increasing propensity to buy over a customer life cycle
Loyalty is built through a virtuous circle of new customer
experience
Lifetime Impact of Customer Loyalty
Analytical CRM Architecture
Analytical CRM Loop
Example of Star Schema
Example of Snowflake Schema
ROLAP versus MOLAP
ROLAP versus MOLAP
Data Mining and Business Intelligence
Data Mining: Confluence of Multiple Disciplines
The Data Mining Process
Data Mining – Some Issues to Consider
Three Schemes in Classification
Data Mining: Classification Schemes
Are All the “Discovered” Patterns Interesting?
Can It Find All and Only Interesting Patterns?
Requirements and Challenges in Data Mining
Requirements/Challenges in Data Mining (Cont.)
Amazon.com: Case study in building customer loyalty
Why is loyalty important
Remarks on Amazon.com
The Analytics Behinde-Marketing
Shopping Pipeline Analysis
Data Driven Customer Segmentation Model
Customer Score Interpretation
Aggregation Service Model
‘Tip-of-the-Iceberg’ Analysis for a Brokerage House
Household Lifestyle Preference Analysis for a Survey
‘True-Wallet-Share’ Analysis for a Credit Card Organization
Business Implications ofTrue Wallet Share Analysis
Business Implications (contd.)
Targeted Ad Serving
Solution
let’s now talk about privacy …
Public Attitude Towards Privacy