WSO2 Data Analytics Server 3.0.0 Product Release Webinar
Inosh Goonewardena, Associate Technical Lead
WSO2 Analytics Platform
Data Processing Pipeline
Introducing WSO2 Data Analytics Server
● Fully open source solution for building systems and applications that collect and analyze data and communicate the results
● Embodies the WSO2 Analytics Platform by combining batch, real-time, interactive and predictive analytics capabilities
● High performance data capture framework
● Highly available and scalable by design
Advantages of DAS 3.0 over WSO2 BAM 2.5.0
● Complete rewrite from the ground up, with performance and extensibility as core values
● Faster analytics powered by Apache Spark, 10x - 100x speedup
● Rich indexing support, with near real-time text search
● Pluggable data store support, from lightweight embedded RDBMS to highly scalable HBase/HDFS
● Revamped Analytics Dashboard with wizard-based gadget generation
WSO2 DAS Architecture
Collecting Data
Data Model
{
'name': 'stream.name',
'version': '1.0.0',
'nickName': 'stream nickname',
'description': 'description of the stream',
'metaData':[
{'name':'meta_data_1','type':'STRING'}
],
'correlationData':[
{'name':'correlation_data_1','type':'STRING'}
],
'payloadData':[
{'name':'payload_data_1','type':'BOOL'},
{'name':'payload_data_2','type':'LONG'}
]
}
● Published data conforms to a strongly typed data stream
● One API for Batch and Real-time Analytics.
● Asynchronous and non-blocking nature enables extremely fast writes.
● Supports multiple transport adapters for data collection
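To make "strongly typed data stream" concrete, here is a minimal sketch of checking an event against the stream definition from the Data Model slide. It is plain Python, not the DAS publisher API: `validate_event`, `PYTHON_TYPES` and the event layout are hypothetical names chosen for illustration, and the mapping from stream types to Python types is an assumption.

```python
# Illustrative only: these names are hypothetical and not part of the WSO2 DAS API.
STREAM_DEFINITION = {
    "name": "stream.name",
    "version": "1.0.0",
    "metaData": [{"name": "meta_data_1", "type": "STRING"}],
    "correlationData": [{"name": "correlation_data_1", "type": "STRING"}],
    "payloadData": [
        {"name": "payload_data_1", "type": "BOOL"},
        {"name": "payload_data_2", "type": "LONG"},
    ],
}

# Assumed mapping from the stream's type names to Python types.
PYTHON_TYPES = {"STRING": str, "BOOL": bool, "LONG": int}


def validate_event(definition, event):
    """Return a list of type errors; an empty list means the event conforms."""
    errors = []
    for section in ("metaData", "correlationData", "payloadData"):
        attributes = definition.get(section, [])
        values = event.get(section, [])
        if len(values) != len(attributes):
            errors.append(
                f"{section}: expected {len(attributes)} values, got {len(values)}"
            )
            continue
        for attribute, value in zip(attributes, values):
            expected = PYTHON_TYPES[attribute["type"]]
            # bool is a subclass of int in Python, so reject it explicitly for LONG.
            if expected is int and isinstance(value, bool):
                conforms = False
            else:
                conforms = isinstance(value, expected)
            if not conforms:
                errors.append(
                    f"{section}.{attribute['name']}: expected {attribute['type']}"
                )
    return errors


good_event = {"metaData": ["web"], "correlationData": ["req-1"], "payloadData": [True, 42]}
bad_event = {"metaData": ["web"], "correlationData": ["req-1"], "payloadData": ["yes", 42]}

print(validate_event(STREAM_DEFINITION, good_event))
print(validate_event(STREAM_DEFINITION, bad_event))
```

A receiver that performs this kind of check up front can reject malformed events before they reach either the batch or the real-time path.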
Data Receiver
Highly Pluggable Event Receiver Architecture
Data Persistence
● Data Abstraction Layer to enable pluggable data connectors
○ RDBMS, Cassandra and HBase/HDFS connectors are offered; custom connectors can easily be written
● Analytics Table
○ The data persistence entity in WSO2 Data Analytics Server
○ Provides a backend data source agnostic way of storing and retrieving data
○ Allows applications to be written so that they do not depend on a specific data source API, e.g. JDBC (RDBMS), Cassandra APIs etc.
○ WSO2 DAS provides a standard REST API for accessing the Analytics Tables
Data Persistence
● Analytics Record Stores
○ An Analytics Record Store houses a specific set of Analytics Tables
○ The Analytics Record Stores used for storing incoming events and for storing query processing output are configurable
○ Single Analytics Table namespace; the target record store is given only at table creation time
○ Useful for creating Analytics Tables whose data is stored in multiple target databases
● Analytics File System
○ The location where the indexing data is stored
○ Multiple implementations provided out of the box (OOTB), or custom implementations can be written
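The idea of a single table namespace on top of pluggable record stores can be sketched as follows. This is a simplified illustration in Python, not the DAS Data Abstraction Layer itself: `RecordStore`, `InMemoryRecordStore`, `AnalyticsDataService` and the store names are all hypothetical, and the in-memory store stands in for an RDBMS, Cassandra or HBase connector.

```python
from abc import ABC, abstractmethod


class RecordStore(ABC):
    """Hypothetical connector interface; a real one would wrap a database."""

    @abstractmethod
    def put(self, table, record):
        ...

    @abstractmethod
    def get(self, table):
        ...


class InMemoryRecordStore(RecordStore):
    """Stand-in backend that keeps records in a dictionary."""

    def __init__(self):
        self._tables = {}

    def put(self, table, record):
        self._tables.setdefault(table, []).append(record)

    def get(self, table):
        return list(self._tables.get(table, []))


class AnalyticsDataService:
    """One table namespace; the target record store is chosen at table creation."""

    def __init__(self, record_stores):
        self._record_stores = record_stores
        self._table_to_store = {}

    def create_table(self, table, record_store_name):
        if table in self._table_to_store:
            raise ValueError(f"table {table!r} already exists")
        if record_store_name not in self._record_stores:
            raise KeyError(f"unknown record store {record_store_name!r}")
        self._table_to_store[table] = record_store_name

    def put(self, table, record):
        self._store_for(table).put(table, record)

    def get(self, table):
        return self._store_for(table).get(table)

    def _store_for(self, table):
        return self._record_stores[self._table_to_store[table]]


# Illustrative store names: one store for incoming events, one for query output.
event_store = InMemoryRecordStore()
processed_store = InMemoryRecordStore()
service = AnalyticsDataService(
    {"EVENT_STORE": event_store, "PROCESSED_DATA_STORE": processed_store}
)
service.create_table("RAW_EVENTS", "EVENT_STORE")
service.create_table("SUMMARY", "PROCESSED_DATA_STORE")
service.put("RAW_EVENTS", {"product": "laptop"})
service.put("SUMMARY", {"product": "laptop", "count": 1})
```

Callers only ever name the table; which backend holds the data is decided once, when the table is created, which is what lets one application spread its tables across several target databases.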
Analyzing Data
Batch Analytics
Batch Analytics - Overview
● Powered by Apache Spark for 10x-100x higher performance than Hadoop
● Parallel, distributed with optimized in-memory processing
● Scalable script-based analytics written using an easy-to-learn, SQL-like query language powered by Spark SQL
● Interactive built-in web interface for ad-hoc query execution
● Scheduled query script execution support with high-availability and failover
● Run Spark on a single node, on a Carbon server cluster with embedded Spark, or connect to an external Spark cluster
Batch Analytics - Spark SQL
create temporary table product_data using CarbonAnalytics
    options (schema …)
create temporary table products using CarbonAnalytics
    options (schema …)
insert into products select product_name from product_data
    group by …
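The summarization pattern in the Spark SQL script (aggregate a raw event table into a summary table with `insert into … select … group by`) can be tried out locally with the sketch below. It deliberately uses Python's built-in `sqlite3` module as a stand-in, not Spark SQL or the `CarbonAnalytics` provider. The slide elides the schemas and the grouping clause, so the `amount` and `total` columns and the `GROUP BY product_name` here are invented purely for illustration.

```python
import sqlite3

# SQLite stand-in for the two analytics tables; the columns are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product_data (product_name TEXT, amount INTEGER)")
conn.execute("CREATE TABLE products (product_name TEXT, total INTEGER)")

# Raw events, as they might arrive from a data publisher.
conn.executemany(
    "INSERT INTO product_data VALUES (?, ?)",
    [("laptop", 2), ("phone", 1), ("laptop", 3)],
)

# Batch step: aggregate the raw table into the summary table.
conn.execute(
    "INSERT INTO products "
    "SELECT product_name, SUM(amount) FROM product_data GROUP BY product_name"
)

rows = conn.execute(
    "SELECT product_name, total FROM products ORDER BY product_name"
).fetchall()
print(rows)
```

In DAS the same two-step shape applies: the raw table is fed by incoming events, and a scheduled script rewrites the summary table that dashboards read from.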
Batch Analytics - Interactive Console
Batch Analytics - Spark Scripts
Interactive Analytics
● Full text data indexing support powered by Apache Lucene
● Drill down search support
● Distributed data indexing
○ Designed to support scalability
● Near real-time data indexing and retrieval
○ Data indexed immediately as received
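The core idea behind full text indexing with near real-time retrieval is an inverted index that is updated as each record arrives. The following toy sketch shows that idea in plain Python; it is not Lucene and not the DAS indexing API, and `TinyIndex` with its whitespace tokenizer is a hypothetical simplification.

```python
from collections import defaultdict


class TinyIndex:
    """Toy inverted index: term -> set of record ids (illustrative only)."""

    def __init__(self):
        self._postings = defaultdict(set)
        self._records = {}

    def add(self, record_id, text):
        # Index the record as soon as it is received, so it is searchable at once.
        self._records[record_id] = text
        for term in text.lower().split():
            self._postings[term].add(record_id)

    def search(self, query):
        """Return ids of records containing every term in the query."""
        terms = query.lower().split()
        if not terms:
            return []
        matches = set.intersection(
            *(self._postings.get(term, set()) for term in terms)
        )
        return sorted(matches)


index = TinyIndex()
index.add(1, "Login failed for user alice")
index.add(2, "Login succeeded for user bob")

print(index.search("login failed"))
print(index.search("login user"))
```

A production index adds analyzers, scoring, facets for drill-down and sharding across nodes, but the lookup path is the same: terms map to record ids, and a query intersects those sets.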
Real-time Analytics
What is Real-time Analytics?
● Gather data from multiple sources
● Correlate data streams over time
● Find interesting occurrences
● And notify
● All in real-time
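As a rough illustration of "correlate over time, find interesting occurrences, and notify", the sketch below keeps a sliding time window per user and raises an alert when too many failed logins fall inside it. This is plain Python rather than a Siddhi execution plan; `FailedLoginDetector`, its thresholds and the event fields are hypothetical, and events are assumed to arrive in timestamp order.

```python
from collections import deque


class FailedLoginDetector:
    """Alert when `threshold` failures for one user fall within `window_seconds`."""

    def __init__(self, threshold=3, window_seconds=60):
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._failures = {}  # user -> deque of failure timestamps

    def on_event(self, timestamp, user, success):
        """Process one event; return an alert message or None."""
        if success:
            return None
        window = self._failures.setdefault(user, deque())
        window.append(timestamp)
        # Expire failures that have slid out of the time window.
        while window and timestamp - window[0] > self.window_seconds:
            window.popleft()
        if len(window) >= self.threshold:
            return (
                f"ALERT: {len(window)} failed logins for {user} "
                f"within {self.window_seconds}s"
            )
        return None


detector = FailedLoginDetector(threshold=3, window_seconds=60)
stream = [
    (0, "alice", False),
    (10, "bob", False),
    (20, "alice", False),
    (30, "alice", False),
    (200, "alice", False),
]
alerts = [detector.on_event(ts, user, ok) for ts, user, ok in stream]
print(alerts)
```

Each event is handled as it arrives and only the current window is kept in memory, which is what distinguishes this style from running a batch query over stored data.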
Predictive Analytics (upcoming)
What is Predictive Analytics?
● Extract, pre-process, and explore data
● Create models, tune algorithms and make predictions
● Integrate for better intelligence
Communicating Results
Dashboards
● “Overall idea” at a glance (e.g. a car dashboard)
● Support for personalization: you can build your own dashboard
● The entry point for drill-down
● Building a custom dashboard
○ Dashboard via Google Gadgets and content via HTML5 + JavaScript
○ Leverages WSO2 User Engagement Server to build a dashboard
○ Uses charting libraries like Vega and D3.js
Dashboards: Gadget Generation Wizard
● Start with data in tabular format
● Map each column to a dimension in your plot, such as X, Y, color, point size, etc.
● Also do drill downs
● Create a chart with a few clicks
Alerts
● Detecting conditions can be done via CEP queries
● “Last mile” is key
○ Email
○ SMS
○ Push notifications to a UI
○ Pager
○ Trigger a physical alarm
APIs
● With mobile apps, most data is exposed and shared as APIs (REST/JSON) to end users
● Analytics results can be exposed through APIs
○ REST API
○ JavaScript API
What can WSO2 DAS do for you?
Common Use Cases of WSO2 DAS
● KPI Statistics
○ Application Statistics Monitoring
○ Network / Service Statistics
○ Sensor Data Aggregation
● Solving Optimization Problems
○ Urban Planning
○ Revenue Distribution Analysis
● Activity Monitoring
○ Tracking Message Flows
● HL7 Data Exploration
○ ESB HL7 Transport Interfaced with DAS
● Log Analysis
○ Application / System Logs
● Sports
○ Real-time Analysis of Player Performance
○ Real-time Match Analysis
● Geo-Spatial
○ Traffic Monitoring and Alerting
○ Geo-fencing
● Anomaly Detection
○ Fraud Detection
○ Network Intrusion Detection
○ Server Health Monitoring
API Statistics
HTTP Monitoring
Activity Monitoring
Activity monitoring tracks events from multiple nodes in a flow to understand a specific activity
● Example:
○ A client initiates a web service request that travels through multiple ESBs and application servers and then returns. This flow is uniquely identified and visualized in DAS
● Used for tracing messages and finding performance hotspots in the flow
● Implemented with a correlation-ID-based mechanism using Interactive Analytics
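A correlation-ID-based mechanism boils down to grouping events from all nodes by a shared identifier and ordering them in time. The sketch below shows that grouping and a simple hotspot check in plain Python; the event fields (`correlation_id`, `node`, `timestamp_ms`) and both helper functions are hypothetical and do not reflect the actual DAS implementation, which the slides say is built on Interactive Analytics.

```python
from collections import defaultdict


def group_by_activity(events):
    """Group events by correlation id and order each flow by timestamp."""
    flows = defaultdict(list)
    for event in events:
        flows[event["correlation_id"]].append(event)
    for flow in flows.values():
        flow.sort(key=lambda e: e["timestamp_ms"])
    return dict(flows)


def slowest_hop(flow):
    """Return (from_node, to_node, elapsed_ms) for the largest gap, or None."""
    hops = [
        (a["node"], b["node"], b["timestamp_ms"] - a["timestamp_ms"])
        for a, b in zip(flow, flow[1:])
    ]
    return max(hops, key=lambda hop: hop[2]) if hops else None


# Events as they might arrive, interleaved, from several nodes.
events = [
    {"correlation_id": "req-1", "node": "esb-1", "timestamp_ms": 100},
    {"correlation_id": "req-2", "node": "esb-1", "timestamp_ms": 105},
    {"correlation_id": "req-1", "node": "app-server", "timestamp_ms": 180},
    {"correlation_id": "req-1", "node": "esb-2", "timestamp_ms": 120},
]

flows = group_by_activity(events)
print([e["node"] for e in flows["req-1"]])
print(slowest_hop(flows["req-1"]))
```

With an indexed correlation ID field, the grouping step becomes a search for all events sharing one ID, and the gaps between consecutive events point at the slow segment of the flow.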
Fraud Detection
● Built for detecting credit card fraud
● The rules are extensible with customized Siddhi execution plans for any type of fraud detection
● Currently leverages Real-time and Interactive Analytics features
Source: multichannelmerchant.com
Log Analysis
● Distributed indexing and searching of any type of logs stored in the system
● Notification support with real-time event processing features
● Application / server health prediction with Machine Learning
● Utilizes Interactive Analytics, Real-time Analytics and Machine Learning features
Source: www.retrospective.centeractive.com
Urban Route Planning
Product Demonstration
Questions?