Upload
others
View
7
Download
0
Embed Size (px)
Citation preview
developer.oracle.com/code
SimplifiedandfastFraudDetectionwithjustSQL
developer.oracle.com/code
About me…
Klaus ThielenConsulting Member of Technical StaffRAC Development
Copyright © 2017 Oracle and/or its affiliates. All rights reserved. |
Agenda
Findingpatternsinyourdata
Lookingforfraudulenttransfers
Dealingwithnewbusinessrequirements
Combiningdifferenttypesofanalytics:Mash-Ups
Summary
1
2
3
4
5
developer.oracle.com/code
FindingPatternsinYourData
Slide - 4
Copyright © 2017 Oracle and/or its affiliates. All rights reserved. |
FindingPatternsInYourDataBasicConceptsandLanguages
• Lotsoflanguagesnativelysupport“patternmatching”
Source: wikipedia
Copyright © 2017 Oracle and/or its affiliates. All rights reserved. |
FindingPatternsInYourDataBasicConceptsandLanguages
• Lotsoflanguagesnativelysupport“patternmatching”• AndsodoesSQL:
– Row-levelregularexpressionprocessing– “NEW”:Patternrecognitioninsequencesofevents
• Sequenceisastreamofrows• Eventequalsarowwithinthestream• “Inter-row”patternrecognition
Source: wikipedia
Copyright © 2017 Oracle and/or its affiliates. All rights reserved. |
PatternRecognitionInSequencesofRows
ProvidenativeSQLlanguageconstruct
Withintuitiveprocessing
Copyright © 2017 Oracle and/or its affiliates. All rights reserved. |
PatternRecognitionInSequencesofRowsSQL- anewlanguageforpatternmatching
ProvidenativeSQLlanguageconstruct• NewSQLconstructMATCH_RECOGNIZE
– AddedaspartoftheANSI-2016SQLstandard
Withintuitiveprocessing
Copyright © 2017 Oracle and/or its affiliates. All rights reserved. |
PatternRecognitionInSequencesofRowsSQL- anewlanguageforpatternmatching
ProvidenativeSQLlanguageconstruct• NewSQLconstructMATCH_RECOGNIZE
– AddedaspartoftheANSI-2016SQLstandard
Withintuitiveprocessing• Threelogicalconcepts:
– Logicallypartitionandorderthedata(Streamsofevents)– Definepatternusingregularexpressionandpatternvariables(WhatpatternamIlookingfor?)
• Eachpatternvariableisdefinedusingconditionsonrowsandaggregates• Regularexpressionismatchedagainstasequenceofrows
– Definemeasuresrepresentingthereturnedpatterninformation (WhatdoIwanttoknow?)
developer.oracle.com/code
SearchingforFraud
Slide - 10
Copyright © 2017 Oracle and/or its affiliates. All rights reserved. |
Demoisonhttp://livesql.oracle.com
Copyright © 2017 Oracle and/or its affiliates. All rights reserved. |
BusinessProblem:FindingSuspiciousMoneyTransfers• Suspiciousmoneytransferpatternforanaccountis:
– 3ormoresmall(<2K)moneytransferswithin30days– Largetransfer(>=1M)within10daysoflastsmalltransfer
• Report– Account– Dateoffirstsmalltransfer– Dateoflastlargetransfer– Amountoflargetransfer
Copyright © 2017 Oracle and/or its affiliates. All rights reserved. |
BusinessProblem:FindSuspiciousMoneyTransfers• DatastoredinJSONlogs..
Copyright © 2017 Oracle and/or its affiliates. All rights reserved. |
BusinessProblem:FindSuspiciousMoneyTransfers• DatastoredinJSONlogs.. • ..Processedrelational
– withnativeJSONcapabilities
TIME_IDUSER_ID EVENT AMOUNT
1/1/2012 John Deposit 1,000,000
1/2/2012 John Transfer 1,000
1/5/2012 John Withdrawal 2,000
1/10/2012 John Transfer 1,500
1/20/2012 John Transfer 1,200
1/25/2012 John Deposit 1,200,000
1/27/2012 John Transfer 1,000,000
2/2/2012 John Deposit 500,000
Copyright © 2017 Oracle and/or its affiliates. All rights reserved. |
BusinessProblem:FindingSuspiciousMoneyTransfers
TIME_ID USER ID EVENT AMOUNT1/1/2012 John Deposit 1,000,0001/2/2012 John Transfer 1,0001/5/2012 John Withdrawal 2,0001/10/2012 John Transfer 1,5001/20/2012 John Transfer 1,2001/25/2012 John Deposit 1,200,0001/27/2012 John Transfer 1,000,0002/2/2012 John Deposit 500,000
Three small transfers within 30 days
Copyright © 2017 Oracle and/or its affiliates. All rights reserved. |
TIME_ID USER ID EVENT AMOUNT1/1/2012 John Deposit 1,000,0001/2/2012 John Transfer 1,0001/5/2012 John Withdrawal 2,0001/10/2012 John Transfer 1,5001/20/2012 John Transfer 1,2001/25/2012 John Deposit 1,200,0001/27/2012 John Transfer 1,000,0002/2/2012 John Deposit 500,000
BusinessProblem:FindingSuspiciousMoneyTransfers
Three small transfers within 30 days
Large transfer within 10 days of last small transfer
Copyright © 2017 Oracle and/or its affiliates. All rights reserved. |
SELECT . . . FROM (SELECT … FROM json_transactions WHERE event = 'transfer')
MATCH_RECOGNIZE (
DeclarativePatternMatchingusingSQLNewsyntaxfordiscoveringpatternsusingSQL:findingsuspiciousmoneytransfers…
InputintothepatternmatchingprocessdeterminedbyFROMclause
MATCH_RECOGNIZE() –indicatesstartofpatternmatchingdefinition
Copyright © 2017 Oracle and/or its affiliates. All rights reserved. |
SELECT . . . FROM (SELECT … FROM json_transactions WHERE event = 'transfer')
MATCH_RECOGNIZE (PARTITION BY user_id ORDER BY time_id
DeclarativePatternMatchingusingSQLSTEP1:StreamsofEventsNeedtogroupandorderthedatatodefinethestreamsofevents assequenceofrows…
SetthePARTITIONBYandORDERBYclauses
Copyright © 2017 Oracle and/or its affiliates. All rights reserved. |
SELECT . . . FROM (SELECT … FROM json_transactions WHERE event = 'transfer')
MATCH_RECOGNIZE (PARTITION BY user_id ORDER BY time_id
PATTERN ( X{3,} )
DeclarativePatternMatchingusingSQLSTEP2:PatternOccurences
DescribehowtosearchfortheVARIABLEX –
Searchingforthreeormoreoccurrencesofsmalltransfers
Copyright © 2017 Oracle and/or its affiliates. All rights reserved. |
SELECT . . . FROM (SELECT … FROM json_transactions WHERE event = 'transfer')
MATCH_RECOGNIZE (PARTITION BY user_id ORDER BY time_id
PATTERN ( X{3,} Y )
DeclarativePatternMatchingusingSQLSTEP2:PatternOccurences
DescribehowtosearchfortheVARIABLEY –
Searchingforonlyoneinstanceoflargetransfer
Copyright © 2017 Oracle and/or its affiliates. All rights reserved. |
UsingRegularExpressionsToControlPatternMatching• Concatenation:nooperator• Quantifiers:
– * 0ormorematches– + 1ormorematches– ? 0or1match– {n} exactlynmatches– {n,} normorematches– {n,m} betweennandm(inclusive)matches– {,m} between0anm(inclusive)matches– Reluctantquantifier– anadditional?
Copyright © 2017 Oracle and/or its affiliates. All rights reserved. |
SELECT . . . FROM (SELECT … FROM json_transactions WHERE event ='transfer')
MATCH_RECOGNIZE (PARTITION BY user_id ORDER BY time_id
PATTERN ( X{3,} Y)DEFINE
X as (amount < 2000) ANDLAST(time_id) – FIRST(time_id) < 30,
DeclarativePatternMatchingusingSQLSTEP3:DefineEventsDEFINE eachvariableintermsofwhattosearchfor:
ForamatchonvariableXwearesearchingforsmalltransfersofamountslessthan2Kandallthreetransfersmustoccurwithina30daywindow
Copyright © 2017 Oracle and/or its affiliates. All rights reserved. |
SELECT . . . FROM (SELECT … FROM json_transactions WHERE event = 'transfer')
MATCH_RECOGNIZE (PARTITION BY user_id ORDER BY time_id
PATTERN ( X{3,} Y)DEFINE
X as (amount < 2000) ANDLAST(time_id) – FIRST(time_id) < 30,
Y as (amount >= 1000000
DeclarativePatternMatchingusingSQLSTEP3:DefineEventsDEFINE eachvariableintermsofwhattosearchfor:
ForamatchonvariableYwearesearchingforalargetransferofmorethan1M
Copyright © 2017 Oracle and/or its affiliates. All rights reserved. |
SELECT . . . FROM (SELECT … FROM json_transactions WHERE event = 'transfer')
MATCH_RECOGNIZE (PARTITION BY user_id ORDER BY time_id
PATTERN ( X{3,} Y)DEFINE
X as (amount < 2000) ANDLAST(time_id) – FIRST(time_id) < 30,
Y as (amount >= 1000000 AND time_id – LAST(x.time_id) < 10 ))
DeclarativePatternMatchingusingSQLSTEP3:DefineEventsDEFINE eachvariableintermsofwhattosearchfor:
ForamatchonvariableYwearesearchingforalargetransferofmorethan10K……andthelargetransfermustoccurwithin10daysoflastsmalltransfer
Copyright © 2017 Oracle and/or its affiliates. All rights reserved. |
SELECT . . . FROM (SELECT … FROM json_transactions WHERE event = 'transfer')
MATCH_RECOGNIZE (PARTITION BY user_id ORDER BY time_idMEASURES FIRST(x.time_id) AS first_t,
LAST(y.time_id) AS last_t, y.amount amount
PATTERN ( X{3,} Y)DEFINE
X as (amount < 2000) ANDLAST(time_id) – FIRST(time_id) < 30,
Y as (amount >= 1000000 AND time_id – LAST(x.time_id) < 10 ))
DeclarativePatternMatchingusingSQLSTEP4:RequiredresultDefinetheMEASURES–Listthecolumnstobereturned- Calculatedcolumns- Existingcolumns
Reportaccount,dateoffirstsmalltransfer,dateoflastlargetransfer,sumoflargetransfer
Copyright © 2017 Oracle and/or its affiliates. All rights reserved. |
SELECT . . . FROM (SELECT … FROM json_transactions WHERE event = 'transfer')
MATCH_RECOGNIZE (PARTITION BY userid ORDER BY timeMEASURES FIRST(x.time_id) AS first_t,
LAST(y.time_ID) AS last_t, y.amount amount
ONE ROW PER MATCHPATTERN ( X{3,} Y)DEFINE
X as (amount < 2000) ANDLAST(time_id) – FIRST(time_id) < 30,
Y as (amount >= 1000000 AND time_id – LAST(x.time_id) < 10 ))
DeclarativePatternMatchingusingSQLSTEP3:Requiredresult
ControltheoutputintermsofROWsreturned
Outputoneroweachtimewefindamatchtoourpattern
Copyright © 2017 Oracle and/or its affiliates. All rights reserved. |
ControllingtheOutput:Summaryvs.Detailed• Which rows to return
– ONE ROW PER MATCH– ALL ROWS PER MATCH – ALL ROWS PER MATCH WITH UNMATCHED ROWS
• After match SKIP option :– SKIP PAST LAST ROW– SKIP TO NEXT ROW– SKIP TO <VARIABLE>– SKIP TO FIRST(<VARIABLE>)– SKIP TO LAST (<VARIABLE>)
Copyright © 2017 Oracle and/or its affiliates. All rights reserved. |
SELECT user_id, first_t, last_t, amountFROM (SELECT … FROM json_transactions WHERE event = 'transfer')
MATCH_RECOGNIZE (PARTITION BY user_id ORDER BY time_idMEASURES FIRST(x.time_id) AS first_t,
LAST(y.time_id) AS last_t, y.amount AS amount
ONE ROW PER MATCHPATTERN ( X{3,} Y)DEFINE
X as (amount < 2000) ANDLAST(time_id) – FIRST(time_id) < 30,
Y as (amount >= 1000000 AND time_id – LAST(x.time_id) < 10 ))
DeclarativePatternMatchingusingSQLFinallylistcolumnstoreturnbythepatternmatchingprocesstothecallingapplication
Copyright © 2017 Oracle and/or its affiliates. All rights reserved. | 29
Livedemonstration– SimpleFraudDetection
developer.oracle.com/code
NewRequirementsManagingnewbusinessrequirements,quicklyandefficientlywithnoapplicationcodechanges
Slide - 30
Copyright © 2017 Oracle and/or its affiliates. All rights reserved. |
RefiningThe“SuspiciousMoneyTransfers”PatternDealingWithNewBusinessRequirements
• Additionalrequirement:– Smalltransfersdonotgotothesameaccounttwiceinarow– Totalsumofsmalltransfersmustbelessthan20K
TIME_ID USER ID EVENT TRANSFER_TO AMOUNT1/1/2012 John Deposit - 1,000,0001/2/2012 John Transfer Bob 1,0001/5/2012 John Withdrawal - 2,0001/10/2012 John Transfer Allen 1,5001/20/2012 John Transfer Tim 1,2001/25/2012 John Deposit 1,200,0001/27/2012 John Transfer Tim 1,000,0002/2/2012 John Deposit - 500,000
Copyright © 2017 Oracle and/or its affiliates. All rights reserved. |
RefiningThe“SuspiciousMoneyTransfers”Pattern
TIME_IDUSER ID
EVENT TRANSFER_TO AMOUNT
1/1/2012 John Deposit - 1,000,0001/2/2012 John Transfer Bob 1,0001/5/2012 John Withdrawal - 2,0001/10/2012 John Transfer Allen 1,5001/20/2012 John Transfer Tim 1,2001/25/2012 John Deposit 1,200,0001/27/2012 John Transfer Tim 1,000,0002/2/2012 John Deposit - 500,000
Three small transfers within 30 days
to different acct and total sum < 20K
Copyright © 2017 Oracle and/or its affiliates. All rights reserved. |
RefiningThe“SuspiciousMoneyTransfers”Pattern
TIME_IDUSER ID
EVENT TRANSFER_TO AMOUNT
1/1/2012 John Deposit - 1,000,0001/2/2012 John Transfer Bob 1,0001/5/2012 John Withdrawal - 2,0001/10/2012 John Transfer Allen 1,5001/20/2012 John Transfer Tim 1,2001/25/2012 John Deposit 1,200,0001/27/2012 John Transfer Tim 1,000,0002/2/2012 John Deposit - 500,000
Three small transfers within 30 days
to different acct and total sum < 20K
Large transfer within 10 days of last small transfer
Copyright © 2017 Oracle and/or its affiliates. All rights reserved. |
SELECT user_id, first_t, last_t, amountFROM (SELECT … FROM json_transactions WHERE event = 'transfer')
MATCH_RECOGNIZE (PARTITION BY userid ORDER BY timeMEASURES FIRST(x.time_id) AS first_t,
LAST(y.time_id) AS last_t, y.amount AS amount
ONE ROW PER MATCHPATTERN (X{3,} Y)DEFINE
X as (amount < 2000) AND LAST(time_id) – FIRST(time_id) < 30
SQLPatternMatchinginaction
ModifythepatternvariablesDEFINE- Checkthetransferaccount
Copyright © 2017 Oracle and/or its affiliates. All rights reserved. |
SELECT user_id, first_t, last_t, amountFROM (SELECT … FROM json_transactions WHERE event = 'transfer')
MATCH_RECOGNIZE (PARTITION BY userid ORDER BY timeMEASURES FIRST(x.time_id) AS first_t,
LAST(y.time_id) AS last_t, y.amount AS amount
ONE ROW PER MATCHPATTERN (X{3,} Y)DEFINE
X as (amount < 2000) ANDPREV(transfer_id) <> transfer_id ANDLAST(time_id) – FIRST(time_id) < 30
SQLPatternMatchinginaction
ModifythepatternvariablesDEFINE- Checkthetransferaccount
Copyright © 2017 Oracle and/or its affiliates. All rights reserved. |
SELECT user_id, first_t, last_t, amountFROM (SELECT … FROM json_transactions WHERE event = 'transfer')
MATCH_RECOGNIZE (PARTITION BY userid ORDER BY timeMEASURES FIRST(x.time_id) AS first_t,
LAST(y.time_id) AS last_t, y.amount AS amount
ONE ROW PER MATCHPATTERN (X{3,} Y)DEFINE
X as (amount < 2000) AND PREV(transfer_id) <> transfer_id ANDLAST(time_id) – FIRST(time_id) < 30,
Y as (amount >= 1000000 AND time_id – LAST(x.time_id) < 10 ANDSUM(x.amount) < 20000 ));
SQLPatternMatchinginaction
ModifythepatternvariablesDEFINE
- Checkthetotalofthesmalltransfersislessthan20K
Copyright © 2017 Oracle and/or its affiliates. All rights reserved. | 37
LiveDemonstration– NewRequirements
developer.oracle.com/code
CombiningAnalyticsWhypatternmatchingwithSQLissopowerful:
ANALYTICALMASH-UPS
Slide - 38
Copyright © 2017 Oracle and/or its affiliates. All rights reserved. |
FraudDetectionMash-UpUsingSQLKeybenefitsofSQL– easytocombinedifferenttypesofanalytics
+Real-timestreamingdata+Real-timegeo-coding+SQLPatternMatching+SpatialAnalytics
=Real-timefrauddetection
http://oracle-big-data.blogspot.com/2013/11/sql-analytical-mash-ups-deliver-real.html
Copyright © 2017 Oracle and/or its affiliates. All rights reserved. |
SQLPatternMatchingvs.JavaFraudDetection:Fewerlinesofcoderequired….
0
100
200
300
400
500
600
700
MapReduce Java SQL Pattern Matching
Lines of Code
Copyright © 2017 Oracle and/or its affiliates. All rights reserved. |
0
100
200
300
400
500
600
700
MapReduce Java SQL Pattern Matching
Lines of Code
0
10
20
30
40
50
60
70
80
MapReduce Java SQL Pattern Matching
Execution Time (secs)
SQLPatternMatchingvs.JavaFraudDetection:Fewerlinesofcoderequired….
developer.oracle.com/code
Summary
Slide - 42
Copyright © 2017 Oracle and/or its affiliates. All rights reserved. |
Summary– YouCanNow….
• ConstructaMATCH_RECOGNIZEstatement
• Buildsearchcriteriausingpatternvariables
• Checkifyourpatternmatchingreally isworkingcorrectly
• Usebuilt-inmeasurestogetadditionalinformation
• UnderstandthepowerandvalueofSQLpatternmatching
• GoanduseSQLPatternMatchingtoyouradvantage!
✔
✔
✔
✔
✔
✔
Copyright © 2017 Oracle and/or its affiliates. All rights reserved. |
SQLPatternMatchingWheretogetmoreinformation
• AnalyticalSQLHomePage onOTNwithlinksto:– Training+OracleByExample– Podcastsforpatternmatching– Whitepapers– SamplescriptsandsimpletutorialsforpatternmatchingonliveSQL
• DataWarehouseandSQLAnalyticsblog– http://oracle-big-data.blogspot.co.uk/
• Dropmealine… [email protected] orsendaeMail [email protected]
Copyright © 2017 Oracle and/or its affiliates. All rights reserved. | 45
NoLiveDemonstration..Q&A
Copyright © 2017 Oracle and/or its affiliates. All rights reserved. |