#hashtag
The trending topics of trending topics
Luke Young Rhee
Impact
Discovery
YouTube
The Pipeline
Tweet
{"text": "RT @Pozzzzzzz
…
"id": 5469180, "time": "2016:09:30T...", "entities": {"user_mentions":
... "hashtags": [
{"text": "yum", "indices": [32, 35] },
{"text": "beer", "indices": [32, 36] }
] },
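As a minimal sketch of how the hashtags can be pulled out of a tweet like the one above: the slide's JSON is truncated, so the `raw` string below is a hypothetical, completed stand-in (its `text` value and empty `user_mentions` list are assumptions), and `extract_hashtags` is an illustrative helper name, not part of the original pipeline.

```python
import json

# Hypothetical, completed stand-in for the truncated tweet on the slide.
# The "text" value and the empty "user_mentions" list are assumptions.
raw = '''
{"text": "RT @someone: sample text #yum #beer",
 "id": 5469180,
 "entities": {"user_mentions": [],
              "hashtags": [{"text": "yum", "indices": [32, 35]},
                           {"text": "beer", "indices": [32, 36]}]}}
'''


def extract_hashtags(tweet):
    """Return the hashtag texts from a tweet's entities, in order."""
    entities = tweet.get("entities", {})
    return [tag["text"] for tag in entities.get("hashtags", [])]


tweet = json.loads(raw)
hashtags = extract_hashtags(tweet)
print(hashtags)
```

Using `.get` with defaults keeps the helper from failing on tweets that carry no `entities` block at all.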
The Pipeline
kafka-connect-twitter
● Kafka Connect
● Data formatting
The Pipeline
Kafka Streams
● Process
● Filter
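The process and filter steps can be illustrated in plain Python. This is a stand-in for the logic only, not Kafka Streams code (Kafka Streams itself is a Java library). The slides do not say what is filtered, so dropping tweets without hashtags is an assumption, and `process` and the sample records are hypothetical.

```python
def process(tweets):
    """Yield slim {'time', 'hashtags'} records for tweets that have hashtags."""
    for tweet in tweets:
        tags = [t["text"] for t in tweet.get("entities", {}).get("hashtags", [])]
        if not tags:
            # Filter (assumed rule): skip tweets that carry no hashtags.
            continue
        # Process: keep only the fields the later queries need.
        yield {"time": tweet.get("time"), "hashtags": tags}


# Hypothetical sample input; "t1" and "t2" are placeholder timestamps.
sample = [
    {"time": "t1", "entities": {"hashtags": [{"text": "beer"}, {"text": "yum"}]}},
    {"time": "t2", "entities": {"hashtags": []}},
]
records = list(process(sample))
print(records)
```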
The Pipeline
Druid
● Low latency
  ○ Ingestion
  ○ Analytics
● Scalable
The Pipeline
Query: Table
time  hashtags
...   ["beer", "yum"]
...   ["beer", "Lagunitas"]
...   ["cats", "cute", "notCrazy"]
The Pipeline
Query: Filter
time  hashtags
...   ["beer", "yum"]
...   ["beer", "Lagunitas"]
...   ["cats", "cute", "notCrazy"]
hashtags = "beer"
...   "beer"
...   "yum"
...   "beer"
...   "Lagunitas"
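The filter step shown here can be sketched in plain Python: a row is kept when any of its hashtags equals the filter value, and every hashtag of a kept row is then emitted on its own line. This mirrors the slide's rows and is only an illustration of the behaviour shown, not Druid code; `filter_and_explode` is a hypothetical name.

```python
# Rows copied from the slide; the time column is elided there and omitted here.
rows = [
    {"hashtags": ["beer", "yum"]},
    {"hashtags": ["beer", "Lagunitas"]},
    {"hashtags": ["cats", "cute", "notCrazy"]},
]


def filter_and_explode(rows, value):
    """Keep rows containing `value`, then emit each hashtag of those rows."""
    matched = [row for row in rows if value in row["hashtags"]]
    return [tag for row in matched for tag in row["hashtags"]]


exploded = filter_and_explode(rows, "beer")
print(exploded)  # ['beer', 'yum', 'beer', 'Lagunitas']
```

Because whole rows are matched, co-occurring hashtags such as "yum" and "Lagunitas" survive the filter, which is what makes the later counts describe topics that trend alongside "beer".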
The Pipeline
Query: Count
time  hashtags     count
...   "beer"       2
...   "yum"        1
...   "Lagunitas"  1
The Pipeline
Query: Count + TopN
time  hashtags     count
...   "beer"       2
...   "yum"        1
...   "Lagunitas"  1
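The count and TopN steps amount to tallying the exploded hashtags and keeping the most frequent ones. A minimal sketch with the standard library, using the four values from the filter slide; `top_n` is a hypothetical helper, and this is not how Druid computes the result internally.

```python
from collections import Counter

# Exploded hashtags from the rows that matched hashtags = "beer".
exploded = ["beer", "yum", "beer", "Lagunitas"]


def top_n(values, n):
    """Return the n most frequent values as (value, count) pairs."""
    return Counter(values).most_common(n)


top = top_n(exploded, 3)
for tag, count in top:
    print(tag, count)
```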
The Pipeline
Kafka Connector
Kafka Streams
Kafka Indexing Service
pydruid
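For the last stage, the filter, count, and TopN steps can be written as a single Druid native topN query. The sketch below builds that query as plain JSON rather than calling pydruid, to avoid assuming pydruid's exact signatures. The datasource name `twitter_hashtags`, the interval, and the threshold are hypothetical values, not taken from the talk.

```python
import json


def build_topn_query(datasource, interval, hashtag, threshold):
    """Build a Druid native topN query for hashtags co-occurring with `hashtag`."""
    return {
        "queryType": "topN",
        "dataSource": datasource,
        "intervals": [interval],
        "granularity": "all",
        "dimension": "hashtags",
        # Selector filter: keep rows whose hashtags dimension contains the value.
        "filter": {"type": "selector", "dimension": "hashtags", "value": hashtag},
        # Count aggregator: counts Druid rows, named "count" for ranking below.
        "aggregations": [{"type": "count", "name": "count"}],
        "metric": "count",
        "threshold": threshold,
    }


# Hypothetical datasource name and interval, for illustration only.
query = build_topn_query("twitter_hashtags", "2016-09-30/2016-10-01", "beer", 10)
payload = json.dumps(query, indent=2)
print(payload)
```

A payload like this would typically be sent with an HTTP POST to a Druid broker's `/druid/v2` endpoint. If rollup is enabled at ingestion, a count aggregator counts stored rows rather than original tweets, so the numbers may differ from raw event counts.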
Challenges
Kafka Streams / Serdes
+ Druid
Druid Cluster
Challenges
Kafka Streams / Serdes
+ Druid
Ingest: 1% of Twitter, ~1k-2k/min
Query: ~ 92.6k rows
Success!
Thanks!
Luke Young Rhee
University of California, Irvine
MS Mathematics
Nintex, Irvine
Test Analyst
Enjoy being near the ocean and getting lost in new cities