Real-time analytics with Spark and Cassandra

by Tim Berglund 

Friday, 10 June 2016 16:30 @ Room 2

Apache Cassandra is a leading open-source distributed database capable of amazing feats of scale, but its data model requires a bit of planning for it to perform well. Of course, the nature of ad-hoc data exploration and analysis requires that we be able to ask questions we hadn’t planned on asking—and get an answer fast. Enter Apache Spark.

Spark is a distributed computation framework optimized to work in memory and heavily influenced by concepts from functional programming languages. It’s exactly what a Cassandra cluster needs to deliver real-time, ad-hoc querying of operational data at scale.

In this talk, we’ll explore Spark and see how it works together with Cassandra to deliver a powerful open-source big data analytics solution.