Berlin Apache Flink Meetup #7
Community Update
May 2015
Robert Metzger
Committer and PMC member
@rmetzger_
Apache Flink is an open source platform for scalable batch and stream data processing.
Apache Flink is …
flink.apache.org
• The core of Flink is a distributed streaming dataflow engine.
• Executing dataflows in parallel on clusters
• Providing a reliable foundation for various workloads
• DataSet and DataStream programming abstractions are the foundation for user programs and higher layers
One engine for many use cases
Real time streaming topologies
Machine Learning at scale
Graph Analysis
Long batch pipelines
What happened?
• Zeppelin on Flink pull request opened at the Zeppelin project
• Community agreed on a list of issues to fix for 0.9 (it's coming closer)
• Static Code Analysis pull request opened
• Gelly roadmap + upcoming blog post
• Gelly Scala API is in progress …
• Stockholm and Bay Area Meetup groups started
Now in master (0.9-SNAPSHOT)
• Reworked streaming fault tolerance with KafkaSources (allowing exactly-once processing in Flink). New state backend (file system)
• Pipelines in Flink ML (similar to scikit-learn) + a lot of activity in ML
• Batch / Streaming mode switch
• Stability improvements
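For readers unfamiliar with the scikit-learn style of pipeline mentioned above, here is a minimal sketch of the general idea in plain Python: transformers are fitted and applied in order, and their output feeds a final predictor. All class names (MeanCenterer, ThresholdClassifier, Pipeline) are hypothetical and chosen for illustration; this is not the Flink ML API and makes no claim about how Flink ML implements its pipelines.

```python
# Illustrative sketch only: hypothetical classes, not the Flink ML API.


class MeanCenterer:
    """Transformer: learns the mean in fit() and subtracts it in transform()."""

    def fit(self, values):
        self.mean = sum(values) / len(values)
        return self

    def transform(self, values):
        return [v - self.mean for v in values]


class ThresholdClassifier:
    """Predictor: labels a value 1 if it lies above the learned threshold."""

    def fit(self, values, labels):
        positives = [v for v, y in zip(values, labels) if y == 1]
        negatives = [v for v, y in zip(values, labels) if y == 0]
        # Place the threshold halfway between the two class means.
        pos_mean = sum(positives) / len(positives)
        neg_mean = sum(negatives) / len(negatives)
        self.threshold = (pos_mean + neg_mean) / 2
        return self

    def predict(self, values):
        return [1 if v > self.threshold else 0 for v in values]


class Pipeline:
    """Chains transformers in order and ends with a single predictor."""

    def __init__(self, transformers, predictor):
        self.transformers = transformers
        self.predictor = predictor

    def fit(self, values, labels):
        # Each transformer is fitted on the output of the previous one.
        for transformer in self.transformers:
            values = transformer.fit(values).transform(values)
        self.predictor.fit(values, labels)
        return self

    def predict(self, values):
        # New data passes through the already fitted transformers.
        for transformer in self.transformers:
            values = transformer.transform(values)
        return self.predictor.predict(values)


pipeline = Pipeline([MeanCenterer()], ThresholdClassifier())
pipeline.fit([1.0, 2.0, 9.0, 10.0], [0, 0, 1, 1])
predictions = pipeline.predict([3.0, 8.0])
print(predictions)
```

The point of the design is that the whole chain exposes the same fit/predict interface as a single model, so preprocessing steps can be swapped without touching the calling code.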
Articles and Meetups
• Juggling with Bits and Bytes [1]
• Apache Flink @ Strata & Hadoop World London (Slides) [2]
• Real-time stream processing: The next step for Apache Flink [3]
Meetup in Budapest
Meetup in Stockholm
[1] http://flink.apache.org/news/2015/05/11/Juggling-with-Bits-and-Bytes.html
[2] http://www.slideshare.net/stephanewen1/apache-flink-strata-hadoop-world-london
[3] http://data-artisans.com/stream-processing-with-flink.html (also posted on: http://blog.confluent.io/2015/05/06/real-time-stream-processing-the-next-step-for-apache-flink/)