Making the Most of Customer Data
Srinath Perera
Director, Research, WSO2 Inc.
Visiting Faculty, University of Moratuwa
Member, Apache Software Foundation
Research Scientist, Lanka Software Foundation
About WSO2
§ Global enterprise, founded in 2005 by acknowledged leaders in XML, web services technologies, standards and open source
§ Provides only open source platform-as-a-service for private, public and hybrid cloud deployments
§ All WSO2 products are 100% open source and released under the Apache License Version 2.0.
§ Is an Active Member of OASIS, Cloud Security Alliance, OSGi Alliance, AMQP Working Group, OpenID Foundation and W3C.
๏ Driven by Innovation
๏ Launched first open source API Management solution in 2012
๏ Launched App Factory in 2Q 2013
๏ Launched Enterprise Store and first open source Mobile solution in 4Q 2013
What WSO2 delivers
Business Model
Outline
§ Connected Business and Big Data analytics
§ Why use Analytics?
§ Big Data Technologies from WSO2
o BAM: batch analytics
o CEP: real-time analytics
o Lambda Architecture to combine them
§ From your business to insights
o Understand the Customers
o Targeted Marketing
o Understand Competition and Market
o Optimize Operations
o Predict Outcomes
Adaptive Connected Business
Connected Business
Be Adaptive
§ Capture business activity (identified by messages, transaction execution, and data state changes) and store data points for future analytics
§ Deliver automated notifications to stakeholders and systems based on business activity, stakeholder accountability, and authority.
§ Automatically adapt business process execution based on events and current conditions
Big Picture
Why Analytics?
§ Because there is room for improvement, and you do not know where or how!
§ A few areas
o Understand customers
o Understand the market and competition
o Efficient marketing
o Optimize your operations
o Predict outcomes
Understand the Customers
§ Not all customers are equal (the 80/20 rule). They:
o Bring in different amounts of revenue
o Need different things
o Live in different areas
o Use your service at different times
o Respond to different things
Marketing
§ The old broadcast model of marketing
o People are getting better at ignoring it
o People hate it when you knock on their door
o Most eyeballs are on the internet
§ Market to people who are interested. The key is finding out who is interested.
Understand the Market and Competition
§ What if we could:
o Know what the market thinks (follow social feeds)?
o Know what customers like and dislike?
o Know who is unhappy (e.g. find and react to churn)?
o Know which subset of customers likes our products?
The World is Inefficient
§ About 50% of cooked food is wasted
§ About 30% of vegetables and fruits are wasted
§ About 5% of revenue is lost to fraud on average, and 22% of cases are > 1M
§ Most energy (e.g. lighting, mechanical) is wasted
§ So much time is lost waiting for things, cleaning up messes, and finding things
Big Data Technologies
Collecting Data
§ Data is collected at sensors and sent to the big data system via events or flat files
§ Event streams: events are named by their content or originator
• Get data through
– Point to Point
– Event Bus
• E.g. Data Bridge: a Thrift-based transport we built that handles about 400K events/sec
Making Sense of Data
§ Basic Analytics
o To know (what happened?)
o Statistics (min, max, average, histogram, …) + visualizations
o Interactive drill down
§ Advanced Analytics
o To explain (why?): data mining, classification, building models, clustering
o To forecast: regression, neural networks, decision models
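The basic statistics named above (min, max, average, histogram) can be sketched in a few lines. This is only an illustration: `BasicStats` and its methods are hypothetical names, not part of any WSO2 product, and the input is assumed to be a non-empty array.

```java
import java.util.Arrays;

// Hypothetical helper for illustration; assumes a non-empty input array.
final class BasicStats {
    static double min(double[] v) { return Arrays.stream(v).min().getAsDouble(); }
    static double max(double[] v) { return Arrays.stream(v).max().getAsDouble(); }
    static double mean(double[] v) { return Arrays.stream(v).average().getAsDouble(); }

    // Equal-width histogram over [min, max]; the maximum value falls into the last bin.
    static int[] histogram(double[] v, int bins) {
        double lo = min(v);
        double width = (max(v) - lo) / bins;
        int[] counts = new int[bins];
        for (double x : v) {
            // When all values are equal the width is 0, so everything goes to bin 0.
            int i = width == 0 ? 0 : (int) ((x - lo) / width);
            counts[Math.min(i, bins - 1)]++;
        }
        return counts;
    }
}
```

The histogram counts are the kind of series a dashboard chart would then plot.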
Dashboards and the Last Mile
§ Presenting information
o To end users
o To decision makers
o To scientists
§ Interactive exploration
§ Sending alerts
http://www.flickr.com/photos/stevefaeembra/3604686097/
Big Data Architecture
Data Collection
• Can receive events via SOAP, HTTP, JMS, ..
• WSO2 Events is a highly optimized transport (400K events/sec)
• Default agents are provided, and you can write custom agents.
// Create an agent and an asynchronous publisher for the receiver endpoint
Agent agent = new Agent(agentConfiguration);
AsyncDataPublisher publisher = new AsyncDataPublisher(
    "tcp://localhost:7612", .. );

// Define the stream and its payload attributes
StreamDefinition definition =
    new StreamDefinition(STREAM_NAME, VERSION);
definition.addPayloadData("sid", STRING);
...
publisher.addStreamDefinition(definition);
...

// Build an event and publish it to the stream
Event event = new Event();
event.setPayloadData(eventData);
publisher.publish(STREAM_NAME, VERSION, event);
Business Activity Monitor
BAM Hive Query
Find how much time was spent in each cell.
CREATE EXTERNAL TABLE IF NOT EXISTS PlayStream …
select sid, ceiling((y+33000)*7/10000 + x/10000) as cell, count(sid)
from PlayStream
GROUP BY sid, ceiling((y+33000)*7/10000 + x/10000);
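To make the grouping in the Hive query above concrete, here is a small in-memory sketch of the same aggregation. `CellCounts` and the `{sid, x, y}` row layout are assumptions made for illustration. The event count stands in for time spent only if events arrive at a fixed sampling rate, which is also an assumption.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory equivalent of the GROUP BY in the Hive query.
final class CellCounts {
    // Same cell expression as the query; Hive's "/" yields a double, so use double arithmetic.
    static long cell(double x, double y) {
        return (long) Math.ceil((y + 33000) * 7 / 10000 + x / 10000);
    }

    // events: rows of {sid, x, y}. Returns event counts keyed by "sid:cell".
    static Map<String, Integer> countBySidAndCell(double[][] events) {
        Map<String, Integer> counts = new HashMap<>();
        for (double[] e : events) {
            String key = (long) e[0] + ":" + cell(e[1], e[2]);
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }
}
```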
Complex Event Processor
CEP Query
define partition sidPrt by PlayStream.sid, LocBySecStream.sid

Calculate the mean location of each player every second:
from PlayStream#window.timeBatch(1sec)
select sid, avg(x) as xMean, avg(y) as yMean, avg(z) as zMean
insert into LocBySecStream
partition by sidPrt

Detect a run of more than 10m:
from every e1 = LocBySecStream -> e2 = LocBySecStream [e1.yMean + 10000 < yMean or yMean + 10000 < e1.yMean]
within 2sec
select e1.sid
insert into LongAdvStream
partition by sidPrt ;
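The logic of the two CEP queries above can be approximated offline for a single player. This is a rough sketch and not how the CEP engine evaluates them: `RunDetector` is a hypothetical name, only the y coordinate is tracked, and the `within 2sec` pattern is simplified to comparing the means of two consecutive seconds.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical batch approximation of the two CEP queries, for one player.
final class RunDetector {
    // Mean y per one-second bucket, keyed by the second (timestamp / 1000).
    static TreeMap<Long, Double> meanYPerSecond(long[] tsMillis, double[] y) {
        TreeMap<Long, double[]> acc = new TreeMap<>(); // second -> {sum, count}
        for (int i = 0; i < tsMillis.length; i++) {
            double[] a = acc.computeIfAbsent(tsMillis[i] / 1000, k -> new double[2]);
            a[0] += y[i];
            a[1]++;
        }
        TreeMap<Long, Double> means = new TreeMap<>();
        for (Map.Entry<Long, double[]> e : acc.entrySet()) {
            means.put(e.getKey(), e.getValue()[0] / e.getValue()[1]);
        }
        return means;
    }

    // Seconds whose mean differs from the previous second's mean by more than the threshold.
    static List<Long> longAdvances(TreeMap<Long, Double> means, double threshold) {
        List<Long> hits = new ArrayList<>();
        Long prevSec = null;
        double prevMean = 0;
        for (Map.Entry<Long, Double> cur : means.entrySet()) {
            if (prevSec != null && cur.getKey() - prevSec == 1
                    && Math.abs(cur.getValue() - prevMean) > threshold) {
                hits.add(cur.getKey());
            }
            prevSec = cur.getKey();
            prevMean = cur.getValue();
        }
        return hits;
    }
}
```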
Lambda Architecture
Applying Big Data Technologies
Understand the Customers
§ Process transaction logs using Hive
o Build a profile for each customer
o Identify the key 20% that bring in most of the revenue
o Identify which features and feature combinations they like most
o Find how they reached you
How? Can be done via basic analytics (Hive and basic stats)
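Finding the customers who bring in most of the revenue comes down to a sum per customer followed by a ranked cumulative total. The sketch below does this in memory; `TopCustomers`, the parallel-array input, and the `share` parameter are hypothetical choices for illustration. In practice the per-customer sum would come from a Hive query over the transaction logs.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: which customers account for a given share of revenue?
final class TopCustomers {
    // Total revenue per customer from parallel arrays of customer id and amount.
    static Map<String, Double> revenueByCustomer(String[] customer, double[] amount) {
        Map<String, Double> revenue = new HashMap<>();
        for (int i = 0; i < customer.length; i++) {
            revenue.merge(customer[i], amount[i], Double::sum);
        }
        return revenue;
    }

    // Smallest set of top-ranked customers whose combined revenue reaches `share` of the total.
    static List<String> topByRevenueShare(Map<String, Double> revenue, double share) {
        List<Map.Entry<String, Double>> ranked = new ArrayList<>(revenue.entrySet());
        ranked.sort((a, b) -> Double.compare(b.getValue(), a.getValue()));
        double total = 0;
        for (double r : revenue.values()) {
            total += r;
        }
        List<String> top = new ArrayList<>();
        double running = 0;
        for (Map.Entry<String, Double> e : ranked) {
            if (running >= share * total) {
                break;
            }
            top.add(e.getKey());
            running += e.getValue();
        }
        return top;
    }
}
```

Comparing the size of the returned list with the total number of customers shows how concentrated the revenue is.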
Build a Profile for Customers
§ Get them to register (gets you basic demographics)
§ Track what they like, what they view, and what they buy
§ Track how often they buy and where they live (from client IP)
§ Follow their social feeds, gauge sentiment, find what they like
How? > 50% via basic analytics; the rest needs some NLP, finding similar items, classification, etc.
Targeted Marketing
1. Know your stats: know the Leads => Sales conversion rate and the details of the pipeline.
2. Analyze user profiles and target your activities (e.g. based on location, interests, etc.)
3. Tag campaigns and track their effect (Google ads, workshops, events, email campaigns, even TV or newspaper ads)
4. Find how activities affect Leads => Sales.
5. Use the data for predictive modeling
How? 1-4 with basic analytics + activity monitoring. #5 with advanced analytics
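Steps 1, 3 and 4 above reduce to counting leads and sales per campaign tag. A minimal sketch follows, assuming each lead carries one campaign tag and a flag saying whether it became a sale. `CampaignFunnel` and this input layout are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: Leads => Sales conversion rate per campaign tag.
final class CampaignFunnel {
    // campaign[i] is the tag of lead i; converted[i] says whether lead i became a sale.
    static Map<String, Double> conversionRateByCampaign(String[] campaign, boolean[] converted) {
        Map<String, int[]> acc = new TreeMap<>(); // tag -> {leads, sales}
        for (int i = 0; i < campaign.length; i++) {
            int[] a = acc.computeIfAbsent(campaign[i], k -> new int[2]);
            a[0]++;
            if (converted[i]) {
                a[1]++;
            }
        }
        Map<String, Double> rates = new TreeMap<>();
        for (Map.Entry<String, int[]> e : acc.entrySet()) {
            rates.put(e.getKey(), (double) e.getValue()[1] / e.getValue()[0]);
        }
        return rates;
    }
}
```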
Understand the Market and Competition
§ Know who your current customers are and where the opportunities are. Find the risks (e.g. predict churn)
§ Find which leads are most effective at conversion
§ Which sequences of actions do users perform often? Maybe package them as a new product
§ Track social feeds for what users are saying. Track sentiment. Convert complaints into praise by acting fast.
How? 20% basic analytics and the rest advanced analytics
Optimize Operations
§ Instrument your operations pipeline. Know what happens and where resources are spent
o Manufacturing pipeline
o Sales pipeline
o Marketing pipeline
§ Do predictive maintenance
§ Optimize your IT infrastructure
§ Look out for fraud! (often costs > 30%)
How? 40% basic analytics and the rest advanced analytics
Operation Dashboard
§ Real-time view of your business
§ Visualizations that show the bottom line at a glance
§ KPIs, thresholds and alerts
§ Drill down when there are problems (see Webinar “Gaining Operational Intelligence with WSO2 BAM”)
§ Different views for different roles
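The "KPIs, thresholds and alerts" item above amounts to comparing each KPI value against an allowed range. A minimal sketch, assuming a lower and upper bound per KPI; `KpiAlerts` and the message format are hypothetical and not taken from WSO2 BAM.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical threshold check for dashboard KPIs.
final class KpiAlerts {
    // Returns one message for every KPI whose value falls outside [low, high].
    static List<String> check(String[] names, double[] values, double[] low, double[] high) {
        List<String> alerts = new ArrayList<>();
        for (int i = 0; i < names.length; i++) {
            if (values[i] < low[i]) {
                alerts.add(names[i] + " below threshold");
            } else if (values[i] > high[i]) {
                alerts.add(names[i] + " above threshold");
            }
        }
        return alerts;
    }
}
```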
Predict Outcomes
§ Plan operations and look for risks
§ Use old data to predict outcomes. Fine-tune and improve the models
§ Do what-if analysis and use it to drive your decisions
§ Try to find predictions for key external factors (e.g. oil and manufacturing companies invest in weather forecasts)
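As one simple instance of "use old data to predict outcomes", the sketch below fits a straight line to past observations with ordinary least squares and extrapolates from it. The technique is chosen here only for illustration and the deck does not prescribe it. `TrendModel` is a hypothetical name, and the input is assumed to contain at least two distinct x values.

```java
// Hypothetical example: ordinary least squares fit of y = intercept + slope * x.
final class TrendModel {
    // Returns {intercept, slope}; assumes at least two distinct x values.
    static double[] fit(double[] x, double[] y) {
        int n = x.length;
        double sumX = 0, sumY = 0;
        for (int i = 0; i < n; i++) {
            sumX += x[i];
            sumY += y[i];
        }
        double meanX = sumX / n, meanY = sumY / n;
        double sxy = 0, sxx = 0;
        for (int i = 0; i < n; i++) {
            sxy += (x[i] - meanX) * (y[i] - meanY);
            sxx += (x[i] - meanX) * (x[i] - meanX);
        }
        double slope = sxy / sxx;
        return new double[] { meanY - slope * meanX, slope };
    }

    // Predicted y for a new x, using a model returned by fit().
    static double predict(double[] model, double x) {
        return model[0] + model[1] * x;
    }
}
```

Calling `predict` with different hypothetical inputs gives a crude form of what-if analysis.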
Conclusion
§ Analytics are important to your business
o Because there is a lot of room for improvement, but you do not know where.
§ The Big Data platform
§ Applying Big Data technologies
o Understand the Customers
o Targeted Marketing
o Understand Competition and Market
o Optimize Operations
o Predict Outcomes
Questions?