Building A Massive Stream Computing Platform For Flexible Applications
Tianjian Chen
Zhengrui Man
Hao Li
Xin Sun
Raymond K. Wong
Zhiwei Yu
June 2014, IEEE BigData Congress
Highlights
• Applications Design the System Themselves
• Complete Modularization Strategy
• Extremely Simple Stream Model
[Diagram: LBS ads push topology. OP1 (User Filter) → OP2 (Ads Recalling) → OP3 (Ads Ranking) → OP4 (Push Controller) → Mobile Devices. Inputs: Location Logging API, User Preference Database, and Ads POI Database. The LBS Ads Service issues LBS Ads Queries over DRPC. Caption: "Can I do this?"]
• Location Based Ads Push System
• Co-Serving With Online Services
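The topology above can be sketched as a chain of stream operators. The sketch below is illustrative only: the `Operator` class, its `link`/`process` methods, and the event fields (`user`, `opt_in`, `ads`) are hypothetical stand-ins, not Vortex's real API.

```python
# Minimal sketch of the LBS ads push pipeline (operator names from the slide;
# the Operator API and event fields are assumptions for illustration).

class Operator:
    """A stream operator: applies a function to each tuple and forwards
    the result to its downstream operators."""
    def __init__(self, name, fn):
        self.name, self.fn, self.downstream = name, fn, []

    def link(self, other):
        """Wire this operator to a downstream operator; returns it for chaining."""
        self.downstream.append(other)
        return other

    def process(self, item):
        out = self.fn(item)
        if out is None:           # None means "filtered out" or "terminal"
            return
        for op in self.downstream:
            op.process(out)

pushed = []  # stands in for delivery to mobile devices

# OP1..OP4 as in the slide diagram.
user_filter     = Operator("UserFilter",
                           lambda loc: loc if loc["opt_in"] else None)
ads_recalling   = Operator("AdsRecalling",
                           lambda loc: {**loc, "ads": ["a1", "a2"]})
ads_ranking     = Operator("AdsRanking",
                           lambda rec: {**rec, "ads": sorted(rec["ads"])})
push_controller = Operator("PushController",
                           lambda top: pushed.append((top["user"], top["ads"][0])))

user_filter.link(ads_recalling).link(ads_ranking).link(push_controller)

# Feed one location event (hypothetical fields) from the Location Logging API.
user_filter.process({"user": "u1", "opt_in": True})
```

An opted-out user is dropped at OP1 and never reaches the push controller, which is the point of putting the filter first.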
[Diagram: user preference tracking topology with five operators (OP1–OP5): Log Filter, Data Join, Feature Extraction, Web Crawling, and Model Update. Inputs: Logging API and Web Page Cache; the User Model is kept in a Redis Cluster. Caption: "Can I do this?"]
• User Preference Tracking System
[Diagram: a Stream Application on the Vortex Platform running alongside a Map-Reduce Application on the Hadoop Platform, both serving Online Web Services. Caption: "Can I co-operate it with M/R?"]
5 Independent Sub-Systems
• Layer 1, Computing Resources: Universal Resource Manager
• Layer 2, Deployment Automation: Nukua Automation System
• Layer 3, Data Transmission: Spinal DMQ
• Layer 4, Topology Representation: Stream Computing Core
• Layer 5, Stream Application: DRPC Service Interface
Message Queuing Configuration
[Diagram: OP1's operator cluster writes over downlinks into the Spinal DMQ Cluster (the sub-links of OP1's downstream); OP2's operator cluster reads over uplinks from the sub-links of OP2's upstream. The operator clusters exchange messages only through the DMQ cluster.]
Message Passing Configuration
[Diagram: the same OP1 → OP2 topology with direct links: OP1's downlinks connect straight to the sub-links of OP2's upstream, with no DMQ cluster in between.]
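The two transmission configurations can be contrasted with a small sketch. Both classes below are illustrative assumptions: a real Spinal DMQ sub-link is a persistent, distributed queue, not an in-memory `deque`, and the method names `downlink`/`uplink` simply mirror the slide labels.

```python
# Sketch of the two link configurations from the slides (hypothetical classes).
from collections import deque

class QueuedLink:
    """Message-queuing configuration: the upstream operator's downlink
    writes into a DMQ sub-link; the downstream operator's uplink polls
    it. Producer and consumer are decoupled, and the queue absorbs
    rate mismatches."""
    def __init__(self):
        self.sublink = deque()   # stands in for a Spinal DMQ sub-link

    def downlink(self, msg):     # called by the upstream operator
        self.sublink.append(msg)

    def uplink(self):            # polled by the downstream operator
        return self.sublink.popleft() if self.sublink else None

class DirectLink:
    """Message-passing configuration: the upstream operator pushes each
    message straight into the downstream handler. Lower latency, but no
    buffering between operator clusters."""
    def __init__(self, handler):
        self.handler = handler

    def downlink(self, msg):
        self.handler(msg)        # delivered immediately, no queue

# Usage: a queued link buffers until polled; a direct link delivers at once.
q = QueuedLink()
q.downlink("m1")
q.downlink("m2")

received = []
d = DirectLink(received.append)
d.downlink("m3")
```

Making the configuration a per-link choice is what lets one topology mix buffered edges (for bursty sources) with direct edges (for latency-critical hops).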
Traditional Stream Model
[Diagram: a buffer of slots indexed 0, 1, 2, …, n, with every consumer tracking its own position in the stream.]
• Independent Consumer Status
• High Index Overhead
• High Snapshot Overhead
Vortex Stream Model
[Diagram: a buffer of slots 0 … n bounded only by a Head pointer and a Tail pointer.]
• Unified Status
• Minimal Index Overhead
• Minimal Snapshot Overhead
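The overhead difference between the two models can be sketched as follows. The data structures are illustrative assumptions, not Vortex internals; the point is only that the traditional model's snapshot grows with the number of consumers, while the head/tail model's snapshot is two integers regardless.

```python
# Sketch contrasting the two stream models (illustrative data structures).

class TraditionalStream:
    """Each consumer keeps an independent index into the stream, so a
    snapshot must record every consumer's position."""
    def __init__(self):
        self.buf, self.index = [], {}      # consumer name -> next offset

    def append(self, item):
        self.buf.append(item)

    def read(self, consumer):
        i = self.index.get(consumer, 0)
        if i >= len(self.buf):
            return None
        self.index[consumer] = i + 1
        return self.buf[i]

    def snapshot(self):
        return dict(self.index)            # one entry per consumer

class VortexStream:
    """A single Head/Tail pair is the entire status: slots before Head
    are consumed, slots up to Tail are produced. A snapshot is just
    the two pointers."""
    def __init__(self):
        self.buf, self.head, self.tail = [], 0, 0

    def append(self, item):
        self.buf.append(item)
        self.tail += 1

    def read(self):
        if self.head >= self.tail:
            return None
        item = self.buf[self.head]
        self.head += 1
        return item

    def snapshot(self):
        return (self.head, self.tail)      # constant size
```

With two consumers the traditional snapshot already holds two offsets; the Vortex snapshot stays `(head, tail)` no matter how many operators read the stream, which is why its index and snapshot overheads are minimal.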
Lessons Learned
• Highly Configurable System For Flexible Applications
• Big Data Requires Everything Simple & Reliable