Copyright © 2014, SAS Institute Inc. All rights reserved.
Turning Data into Value
The Power of Social Network Analysis Explained
Yannic Hulot – Advisor General – Director, ISI, FPS Finance
Julie Coyette – Senior Consultant in Analytics – SAS
Tweet about it!
#SFBL14
http://en.wikipedia.org/wiki/Six_Degrees_of_Kevin_Bacon
Six Degrees of Kevin Bacon
Six degrees of separation
Six degrees of separation is the theory that everyone and everything is six or fewer steps away, by way of introduction, from any other person in the world, so that a chain of "a friend of a friend" statements can be made to connect any two people in a maximum of six steps.
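The "friend of a friend" chain can be measured with a breadth-first search. The sketch below is illustrative only: the friendship data and the function name `degrees_of_separation` are hypothetical, and the network is assumed to be undirected with every link listed in both directions.

```python
from collections import deque

def degrees_of_separation(friends, source, target):
    """Return the number of 'friend of a friend' steps from source to target,
    or None if the two people are not connected at all."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        person, steps = queue.popleft()
        for friend in friends.get(person, ()):
            if friend == target:
                return steps + 1
            if friend not in seen:
                seen.add(friend)
                queue.append((friend, steps + 1))
    return None

# Hypothetical friendship network (each link listed both ways).
friends = {
    "Ann": ["Bob"],
    "Bob": ["Ann", "Cem"],
    "Cem": ["Bob", "Dee"],
    "Dee": ["Cem"],
    "Eve": [],
}
print(degrees_of_separation(friends, "Ann", "Dee"))  # 3
print(degrees_of_separation(friends, "Ann", "Eve"))  # None
```

Breadth-first search visits people in order of distance, so the first time the target is reached the step count is the shortest possible chain.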
A real Network
From Super Clusters… To Relevant Networks!
Agenda = 4 Questions
1. Why should you care about Network Analysis?
• Improved Fraud Detection (Government, Insurance, Banking)
• Better Churn Model (Telco)
2. What are the key concepts in Network Analysis?
3. How to extract relevant networks from the super cluster?
4. How is it used by FPS Finance to detect complex and organized Tax Fraud?
What is SNA?
Social Network Analysis =
Data analysis from social network sites like Facebook and Twitter?
A data mining technique that explores the patterns between people, companies (or other entities) in a network or group.
Used by Google to rank web pages (Google PageRank)
Applications of Network Analysis
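As a rough illustration of the page-ranking idea, here is a minimal power-iteration sketch of PageRank. The link data is hypothetical, the damping factor of 0.85 is the conventional textbook value rather than anything stated in the slides, and every linked page is assumed to appear as a key of `links`.

```python
def pagerank(links, damping=0.85, iterations=50):
    """links maps each page to the list of pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page gets a small baseline share of rank.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outgoing in links.items():
            if outgoing:
                share = damping * rank[page] / len(outgoing)
                for dest in outgoing:
                    new_rank[dest] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                for dest in pages:
                    new_rank[dest] += damping * rank[page] / n
        rank = new_rank
    return rank

# Hypothetical web of four pages.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(links)
print(max(ranks, key=ranks.get))  # C
```

Page C ranks highest because three of the four pages link to it, and page D ranks lowest because nothing links to it.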
Used by GPS to find the shortest path
Applications of Network Analysis
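Route finding of this kind is commonly done with Dijkstra's algorithm. The sketch below assumes a small hypothetical road network in which `roads[a][b]` is a non-negative travel cost; it is not a description of how any particular GPS product works.

```python
import heapq

def shortest_path(roads, start, goal):
    """Dijkstra's algorithm; returns (total cost, list of nodes on the path)."""
    best = {start: 0}
    previous = {}
    heap = [(0, start)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == goal:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry, a cheaper route was already found
        for neighbor, weight in roads.get(node, {}).items():
            new_cost = cost + weight
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                previous[neighbor] = node
                heapq.heappush(heap, (new_cost, neighbor))
    if goal not in best:
        return None, []
    path = [goal]
    while path[-1] != start:
        path.append(previous[path[-1]])
    return best[goal], path[::-1]

# Hypothetical one-way road network with travel times in minutes.
roads = {
    "Home": {"A": 4, "B": 1},
    "B": {"A": 2, "Office": 7},
    "A": {"Office": 3},
    "Office": {},
}
cost, path = shortest_path(roads, "Home", "Office")
print(cost, path)  # 6 ['Home', 'B', 'A', 'Office']
```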
Used in Biology
Applications of Network Analysis
In the News…
Applications of Network Analysis
Used by telcos to prevent churn
Applications of Network Analysis
SNA to improve churn models
How? Attributes derived from SNA may be used alone or as input to classical predictive models to help improve their accuracy.
Fact 1a: Customers are influenced by friends within the network and by friends of friends.
Fact 1b: Incorporating the impact of higher-order connections leads to improved predictors and profits.
Fact 2: Network models can detect types of churners that traditional models miss.
Fact 3: Customers with a canceller in their network churn at three times the rate.
http://blogs.sas.com/content/sascom/2011/10/25/using-social-network-analysis-to-predict-churn-in-telco/
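A minimal sketch of what "attributes derived from SNA" could look like in practice: for each customer, count the direct contacts and the friends of friends who have already churned. The call records, customer ids and feature names are hypothetical; the resulting columns would simply be added to the inputs of an existing churn model.

```python
def churn_network_features(calls, churned):
    """For each customer, count direct contacts and friends of friends who churned."""
    neighbors = {}
    for a, b in calls:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    features = {}
    for customer, direct in neighbors.items():
        # Second-order contacts: neighbors of neighbors, excluding the
        # customer and the direct contacts themselves.
        second = set()
        for friend in direct:
            second |= neighbors[friend]
        second -= direct | {customer}
        features[customer] = {
            "degree": len(direct),
            "churned_friends": len(direct & churned),
            "churned_friends_of_friends": len(second & churned),
        }
    return features

# Hypothetical call pairs and set of customers who already cancelled.
calls = [("c1", "c2"), ("c2", "c3"), ("c3", "c4"), ("c1", "c5")]
churned = {"c3", "c5"}
features = churn_network_features(calls, churned)
print(features["c1"])
# {'degree': 2, 'churned_friends': 1, 'churned_friends_of_friends': 1}
```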
Used by governments, banks and insurers to detect fraud
Applications of Network Analysis
SNA for Fraud Detection – Hybrid Approach
Network score
• Automated Business Rules
• Anomaly Detection
• Predictive Modeling
• Text Mining
• Database Searches
• Social Network Analysis
Why SAS SNA for Fraud Detection?
More prioritized Fraud, Waste and Abuse cases identified
• Including both previously undetected entities and networks, and extensions to already identified cases
Reduction in false positive rates
• Hybrid approach reduces false positives by up to 10 times or more compared with traditional rules-based approaches
Improved analyst / investigation efficiency
• Each alert takes 1/3 to 1/2 of the time to investigate, thanks to data aggregation and visualization
• Provides alert logic and a suggested path to initiate the investigation
Significant increase in ROI per analyst / investigator
Key Concepts in Network Analysis
What are Networks?
NODES
LINKS
What are Networks? More than nodes and links…
Individuals
Fuzzy Match
Address
Link
Transaction
High risk
Low risk
Foreign Company
Local Company
Link
Ownership
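One way to hold a network that is "more than nodes and links" is to give every node and link a type and every node a risk flag. The sketch below is a hypothetical data structure: the class names and the reading of the slide labels (individuals, addresses and companies as node types; transaction, ownership and fuzzy match as link types) are assumptions, not taken from any SAS product.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Node:
    node_id: str
    kind: str          # e.g. "individual", "address", "local_company", "foreign_company"
    risk: str = "low"  # "low" or "high"

@dataclass(frozen=True)
class Link:
    source: str
    target: str
    kind: str          # e.g. "transaction", "ownership", "fuzzy_match"

@dataclass
class Network:
    nodes: dict = field(default_factory=dict)
    links: list = field(default_factory=list)

    def add_node(self, node):
        self.nodes[node.node_id] = node

    def add_link(self, link):
        # Only accept links between nodes that are already in the network.
        if link.source not in self.nodes or link.target not in self.nodes:
            raise KeyError("both endpoints must be added before the link")
        self.links.append(link)

    def neighbors(self, node_id, link_kind=None):
        """Nodes linked to node_id, optionally restricted to one link type."""
        found = set()
        for link in self.links:
            if link_kind is not None and link.kind != link_kind:
                continue
            if link.source == node_id:
                found.add(link.target)
            elif link.target == node_id:
                found.add(link.source)
        return found

# Hypothetical example: an individual owns a local company that trades
# with a high-risk foreign company.
net = Network()
net.add_node(Node("p1", "individual"))
net.add_node(Node("c1", "local_company"))
net.add_node(Node("c2", "foreign_company", risk="high"))
net.add_link(Link("p1", "c1", "ownership"))
net.add_link(Link("c1", "c2", "transaction"))
print(sorted(net.neighbors("c1")))                           # ['c2', 'p1']
print(sorted(net.neighbors("c1", link_kind="transaction")))  # ['c2']
```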
How to extract relevant networks from the super cluster?
SAS® Social Network Analysis
Network Analytics
Network Scoring
Business Rules
Analytics
Anomaly Detection
Predictive Modeling
Network Analytics – Community Detection
One of the most important objectives in network analysis is the detection of cohesive, self-contained structures known as Communities. These are defined intuitively as groups of nodes that are more tightly connected to each other than they are to the rest of the network.
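"More tightly connected to each other than to the rest of the network" is commonly quantified with modularity, the score that the Louvain algorithm greedily maximizes. The sketch below computes modularity for a given partition of an undirected, unweighted graph; the edge list and community labels are hypothetical, and the graph is assumed to have at least one edge and no self-loops.

```python
def modularity(edges, community):
    """Modularity Q of a partition.

    edges: list of (u, v) pairs; community: dict mapping node -> community label.
    Q = sum over communities of (internal edges / m) - (total degree / 2m)^2.
    """
    m = len(edges)
    degree = {}
    internal = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
        if community[u] == community[v]:
            internal[community[u]] = internal.get(community[u], 0) + 1
    total_degree = {}
    for node, d in degree.items():
        c = community[node]
        total_degree[c] = total_degree.get(c, 0) + d
    return sum(
        internal.get(c, 0) / m - (total_degree[c] / (2 * m)) ** 2
        for c in total_degree
    )

# Two triangles joined by a single bridge edge (3, 4).
edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
two_groups = {1: "a", 2: "a", 3: "a", 4: "b", 5: "b", 6: "b"}
one_group = {n: "a" for n in range(1, 7)}
print(round(modularity(edges, two_groups), 4))  # 0.3571
print(modularity(edges, one_group))             # 0.0
```

Splitting the graph at the bridge scores higher than lumping everything together, which matches the intuitive definition of a community.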
Claim Network
• Claims
• Insured
• Address
• Employer
• Account Number
Network Analytics – Community Detection
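A claim network of this kind can be sketched by linking every claim to the attribute values it carries and then grouping claims that end up connected. The example below uses a union-find structure to find connected groups, which is a simpler technique than community detection; the claim records and field names are hypothetical.

```python
from collections import defaultdict

def claim_networks(claims):
    """Group claims that share an insured person, address, employer or account number."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    for claim_id, attributes in claims.items():
        for field_name, value in attributes.items():
            # Attribute nodes are keyed by (field, value) so that an address
            # never collides with an employer that has the same spelling.
            union(("claim", claim_id), (field_name, value))

    groups = defaultdict(set)
    for claim_id in claims:
        groups[find(("claim", claim_id))].add(claim_id)
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])

# Hypothetical claims: CL1 and CL2 share an address, CL2 and CL3 share an account.
claims = {
    "CL1": {"insured": "P1", "address": "1 Main St", "account": "ACC-01"},
    "CL2": {"insured": "P2", "address": "1 Main St", "account": "ACC-02"},
    "CL3": {"insured": "P3", "address": "9 Oak Ave", "account": "ACC-02"},
    "CL4": {"insured": "P4", "address": "5 Elm Rd", "account": "ACC-09"},
}
print(claim_networks(claims))  # [['CL1', 'CL2', 'CL3'], ['CL4']]
```

CL1 and CL3 have nothing in common directly, yet they land in the same network through CL2, which is the kind of indirect relationship a claim-by-claim review would miss.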
Louvain Algorithm for Community Detection
Network Analytics – Centrality Scores
The centrality of a node or link gives some indication of its relative importance within a graph.
Many different types of centrality metrics are used to better understand levels of prominence.
Great measures to improve your marketing campaigns!
Target customers with high centrality scores and let them spread the message by word of mouth…
Prioritize whom to contact (strong influencers, followers)
Source: http://www.forteconsultancy.wordpress.com
Network Analytics – Centrality Scores
Degree: Number of links connected to a node, i.e. the number of direct relationships that an entity has.
Betweenness: Counts the number of times a particular node (or link) occurs on shortest paths between other nodes.
Closeness: Reciprocal of the average shortest-path distance to all other nodes.
Eigenvector (PageRank): Extension of degree centrality in which centrality points are awarded for each neighbor. A node's score is proportional to the sum of the scores of all nodes connected to it.
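The four scores can be computed by hand on a toy graph. The sketch below is a plain-Python illustration for an undirected, unweighted, connected graph given as an adjacency dictionary; the graph and function names are hypothetical. Betweenness uses Brandes' algorithm, and the eigenvector score uses power iteration on the adjacency matrix plus the identity, a common shift that leaves the ranking unchanged while helping convergence.

```python
from collections import deque

def degree_centrality(graph):
    return {node: len(nbs) for node, nbs in graph.items()}

def closeness_centrality(graph):
    """Reciprocal of the average shortest-path distance to the reachable nodes."""
    result = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nb in graph[node]:
                if nb not in dist:
                    dist[nb] = dist[node] + 1
                    queue.append(nb)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total else 0.0
    return result

def betweenness_centrality(graph):
    """Brandes' algorithm: shortest paths through each node, per unordered pair."""
    score = {node: 0.0 for node in graph}
    for source in graph:
        order = []
        preds = {node: [] for node in graph}
        paths = {node: 0 for node in graph}
        paths[source] = 1
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            order.append(node)
            for nb in graph[node]:
                if nb not in dist:
                    dist[nb] = dist[node] + 1
                    queue.append(nb)
                if dist[nb] == dist[node] + 1:
                    paths[nb] += paths[node]
                    preds[nb].append(node)
        delta = {node: 0.0 for node in graph}
        for node in reversed(order):
            for p in preds[node]:
                delta[p] += paths[p] / paths[node] * (1 + delta[node])
            if node != source:
                score[node] += delta[node]
    # Each undirected pair was counted once from each endpoint.
    return {node: value / 2 for node, value in score.items()}

def eigenvector_centrality(graph, iterations=100):
    """Power iteration; scores are scaled so that the largest equals 1."""
    score = {node: 1.0 for node in graph}
    for _ in range(iterations):
        new = {n: score[n] + sum(score[nb] for nb in graph[n]) for n in graph}
        norm = max(new.values())
        score = {n: value / norm for n, value in new.items()}
    return score

# Hypothetical graph: triangle A-B-C, plus a chain A-D-E.
graph = {
    "A": ["B", "C", "D"],
    "B": ["A", "C"],
    "C": ["A", "B"],
    "D": ["A", "E"],
    "E": ["D"],
}
print(degree_centrality(graph))       # {'A': 3, 'B': 2, 'C': 2, 'D': 2, 'E': 1}
print(betweenness_centrality(graph))  # {'A': 4.0, 'B': 0.0, 'C': 0.0, 'D': 3.0, 'E': 0.0}
print(closeness_centrality(graph)["A"])  # 0.8
```

Node D illustrates why the measures differ: its degree is no higher than B's or C's, but it sits on every shortest path to E and therefore has high betweenness.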
Network Analytics – Centrality Scores
Closeness = ?
Betweenness = ?
Degree = ?
Eigenvectors = ?
Source: http://en.wikipedia.org/wiki/Centrality
Network Analytics – Centrality Scores
Closeness = B
Betweenness = C
Degree = A
Eigenvectors = D
Source: http://en.wikipedia.org/wiki/Centrality
Network Analytics – Centrality Scores (Illustration)
“HOW FPS FINANCE PROACTIVELY PREVENTS FRAUD WITH BIG DATA TOOLS”
SAS FORUM 2014
Yannic HULOT
Inspection Spéciale des Impôts (Special Tax Inspectorate)
Director
Big Data in the FPS Finance
1. A major challenge
2. Seeing the invisible
3. New data incoming
4. Risk: sitting on a mountain of gold
5. Time is running out
6. No capacity to manage
DATA FRAUD FOOTPRINTS
Data management
Data manipulation
Matching
Cross-checking
…..
Predictive models
Networks
The Fraud Framework
An integrated solution: speed, swiftness, power
Special Tax Inspectorate and SAS tools
1. Making your analysis decisive
2. Giving your inspectors ammunition before the investigation begins
Fraud Propensity
Signals
5%
95%
The mass of data was a problem.
It now becomes a solution.
SAS® Social Network Analysis
Network Analytics
Network Scoring
Business Rules
Analytics
Anomaly Detection
Predictive Modeling
Fuzzy Matching
Reducing the Super Cluster
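Fuzzy matching decides when two records that are spelled differently should be treated as the same entity and therefore linked. The sketch below uses `difflib.SequenceMatcher` from the Python standard library; the record names, the normalization rule and the 0.85 threshold are illustrative assumptions, not the matching rules used in the project described here.

```python
from difflib import SequenceMatcher

def normalize(name):
    """Lowercase, replace punctuation with spaces, and collapse repeated spaces."""
    kept = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower())
    return " ".join(kept.split())

def fuzzy_links(names, threshold=0.85):
    """Return (id_a, id_b, similarity) for record pairs that look alike."""
    ids = sorted(names)
    links = []
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            ratio = SequenceMatcher(None, normalize(names[a]), normalize(names[b])).ratio()
            if ratio >= threshold:
                links.append((a, b, round(ratio, 2)))
    return links

# Hypothetical company records from two different sources.
names = {
    "r1": "Dupont Consulting S.A.",
    "r2": "DUPONT CONSULTING SA",
    "r3": "Janssens Transport",
}
links = fuzzy_links(names)
print([(a, b) for a, b, _ in links])  # [('r1', 'r2')]
```

The threshold controls the trade-off: a lower value links more records and grows the clusters, while a higher value keeps networks smaller at the cost of missing some true matches. Comparing every pair is quadratic, so a real dataset would need some form of pre-grouping before comparison.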
Why is it so efficient inside the STS?
• Mechanisms and schemes = networks
• Criminal organisations = networks
• Not a narrow view
• Taking everything into account without knowing everything
• Inter-agency
• International contexts
Twitter Contest – Tweet to win prizes!
SAS Forum
5. We live in a very connected world. With the increasing popularity of Facebook and Twitter, what is the maximum distance between any two individuals in the world?
A. 6
B. 3
C. 4.74
Tweet your answer:
Example: @spicyanalytics 3C (start of your tweet, question number, your answer)
Prizes to win:
1st prize: a ticket for Analytics 2015
2nd prize: a book by Prof. Bart Baesens: “Analytics in a Big Data World”
3rd to 30th prize: chocolates with pepper
Winners will be contacted after the Forum!
Thank You!
See You Next Year!