Apache Flink supports a broad ecosystem and works seamlessly with
many other data processing projects and frameworks.
Connectors provide code for interfacing with various third-party systems.
Currently these systems are supported:
To run an application that uses one of these connectors, additional third-party components usually need to be installed and started, e.g. the servers for the message queues. Further instructions can be found in the corresponding subsections.
This is a list of third-party packages (i.e., libraries, system extensions, or examples) built on Flink. The Flink community collects links to these packages but does not maintain them. They therefore do not belong to the Apache Flink project, and the community cannot provide support for them. Is your project missing? Please let us know on the user/dev mailing list.
Apache Zeppelin
Apache Zeppelin (incubator) is a web-based notebook that enables interactive data analytics and can be used with Flink as an execution engine (alongside other engines). See also Jim Dowling’s Flink Forward talk about Zeppelin on Flink.
Apache Mahout
Apache Mahout is a machine learning library that will feature Flink as an execution engine soon. Check out Sebastian Schelter’s Flink Forward talk about the Mahout-Samsara DSL.
Cascading
Cascading enables a user to build complex workflows easily on Flink and other execution engines. Cascading on Flink is built by dataArtisans and Driven, Inc. See Fabian Hueske’s Flink Forward talk for more details.
Apache Beam (incubating)
Apache Beam (incubating) is an open source, unified programming model that you can use to create a data processing pipeline. Flink is one of the back-ends supported by the Beam programming model.
GRADOOP
GRADOOP enables scalable graph analytics on top of Flink and is developed at Leipzig University. Check out Martin Junghanns’ Flink Forward talk.
BigPetStore
BigPetStore is a benchmarking suite that includes a data generator and will be available for Flink soon. See Suneel Marthi’s Flink Forward talk for a preview.
FastR
FastR is an implementation of the R language in Java. FastR Flink executes R workloads on top of Flink.
Apache SAMOA
Apache SAMOA (incubating) is a streaming ML library that will feature Flink as an execution engine soon. Albert Bifet introduced SAMOA on Flink in his Flink Forward talk.
Python Examples on Flink
A collection of examples using Apache Flink’s Python API.
WordCount Example in Clojure
A small WordCount example showing how to write a Flink program in Clojure.
Anomaly Detection and Prediction in Flink
flink-htm is a library for anomaly detection and prediction in Apache Flink. The algorithms are based on Hierarchical Temporal Memory (HTM) as implemented by the Numenta Platform for Intelligent Computing (NuPIC).
Apache Ignite
Apache Ignite is a high-performance, integrated, and distributed in-memory platform for computing and transacting on large-scale data sets in real time. See the Flink sink streaming connector for injecting data into an Ignite cache.