Data collection is rampant in 2016, with the number of sensors and information-gathering tools set to multiply exponentially over the next few years. However, with all this real-time data driving innovation, the challenge for those in charge of deployment is how to analyse events as they happen and how to spot important patterns in the data. Complex Event Processing (CEP) aims to provide a solution by continuously matching incoming events against a predefined pattern.
In contrast to standard relational database management systems, which execute queries on stored data, CEP works by executing data on a stored query, discarding any irrelevant data as it goes. Apache Flink, a Hadoop MapReduce alternative written in Scala and Java (its name comes from the German word for speedy or nimble), is aiming to set the standard for CEP in the JVM world with FlinkCEP, a CEP library newly released in Flink 1.0. To get started, all you need is a Flink program in place.
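The idea of running data against a stored query can be illustrated with a minimal sketch in plain Java. It deliberately does not use the FlinkCEP API: the `Event` type, the `SequenceMatcher` class and the temperature thresholds are all hypothetical names and values chosen for illustration. The stored "query" is a two-step pattern (a warning reading immediately followed by a critical reading), and the only state kept is the current partial match; every other event is discarded as soon as it has been checked.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

public class StoredQuerySketch {

    // Hypothetical event type: one temperature reading from a sensor.
    static final class Event {
        final String sensor;
        final int temperature;

        Event(String sensor, int temperature) {
            this.sensor = sensor;
            this.temperature = temperature;
        }
    }

    // The stored query: an event matching `first`, immediately
    // followed by an event matching `second`.
    static final class SequenceMatcher {
        private final Predicate<Event> first;
        private final Predicate<Event> second;
        private final List<String> matches = new ArrayList<>();
        private Event pending; // the only retained state: a partial match

        SequenceMatcher(Predicate<Event> first, Predicate<Event> second) {
            this.first = first;
            this.second = second;
        }

        // Called once per incoming event; the event is not stored
        // unless it starts a new partial match.
        void onEvent(Event e) {
            if (pending != null && second.test(e)) {
                matches.add(pending.temperature + "->" + e.temperature);
            }
            // The current event may itself begin the next partial match;
            // otherwise it is dropped.
            pending = first.test(e) ? e : null;
        }

        List<String> matches() {
            return matches;
        }
    }

    public static void main(String[] args) {
        SequenceMatcher matcher = new SequenceMatcher(
                e -> e.temperature >= 80,    // assumed "warning" threshold
                e -> e.temperature >= 100);  // assumed "critical" threshold

        int[] readings = {70, 85, 105, 60, 90, 95, 120};
        for (int t : readings) {
            matcher.onEvent(new Event("sensor-1", t));
        }
        System.out.println(matcher.matches()); // prints [85->105, 95->120]
    }
}
```

A real CEP library adds much more on top of this, such as time windows, looser ordering between events and distributed state, but the inversion is the same: the pattern stays put and the data flows past it.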
Project committer Till Rohrmann writes that, with its “true streaming nature” (i.e. data elements are “pipelined” through streaming programs as they arrive) and its capacity for low-latency, high-throughput stream processing, Flink is “a natural fit for CEP workloads.”
Flink is often compared to Spark, its barnstorming fellow top-level Apache project, and there are indeed a number of commonalities between the two data processing tools in terms of APIs and components. However, Flink is said to be capable of handling much bigger data workloads and is giving Spark a run for its money in arenas like FinTech. The other key differentiator is that Spark processes data in batches, whilst Flink streams it in real time. The low-latency tool started life as an academic open source project called Stratosphere, but was renamed to avoid a potential naming conflict when it joined the Apache Software Foundation, where it has rapidly gained fans.
Whilst it may be a while before Flink reaches anything like the rapid proliferation of Spark, watch this space. Check out this session from Flink Forward, recorded in Berlin in 2015, for a full look at how the two compare.