August 28, 2017 - Richard Tomlinson | Big Data Ecosystem

Breaking the Barrier for Business Analysts on Streaming Data

The digitization of everything and rise of the Internet of Things are undoubtedly the driving forces behind the increased demand for new types of data-streaming applications and analytics on fast-moving data. Organizations recognize that new opportunities to innovate or provide higher levels of customer service exist from jumping on data as soon as it is created or made available at the source.

For example, through a connected car, an auto manufacturer could alert a customer that an engine part is about to fail, or a retailer could generate a product promotion in real time based on an individual consumer’s last five minutes of browsing activity. Financial services organizations can make faster decisions about asset allocation or fraudulent activity “in the moment” and doctors can make instant decisions about procedures for their patients based on digital devices they are wearing. Opportunities to advance organizations and individuals across all industries and all walks of life are almost boundless.

So that’s the promise, but how do organizations enable such innovation in practice? The key is to give access to the data to the people who know how to use it, i.e., business analysts and users on the front lines.

It sounds easy enough. Businesses have been doing this for years with slow-moving information via ETL tools, data warehouses, and business intelligence technology. The problem is that these architectures and tools were never designed for fast, continuously moving data. Hence, we are now seeing the emergence of distributed, fast data processing frameworks and open source technology that attempt to address some of the needs for data in motion.

One such technology that is currently white hot is Apache Kafka™, a distributed streaming platform that lets you publish and subscribe to streams of records (or topics), store those streams of records in a fault-tolerant way, and process the streams as soon as they occur. Thousands of top organizations are adopting Kafka as the standard for capturing, processing, and providing access to billions of data points in real time.
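To make the publish, subscribe, and store idea concrete, here is a minimal pure-Python sketch. It is not the Kafka API: the `InMemoryTopic` class, the subscriber names, and the engine-temperature records are all hypothetical, and the sketch assumes a single process with no fault tolerance. It only illustrates that records are appended to a stored log and that each subscriber reads from its own position.

```python
from collections import defaultdict


class InMemoryTopic:
    """Toy stand-in for a topic (hypothetical, not the Kafka API)."""

    def __init__(self):
        self._log = []                    # records stay in the log after being read
        self._offsets = defaultdict(int)  # next unread position per subscriber

    def publish(self, record):
        """Append a record to the end of the log."""
        self._log.append(record)

    def poll(self, subscriber):
        """Return the records this subscriber has not seen yet."""
        start = self._offsets[subscriber]
        self._offsets[subscriber] = len(self._log)
        return self._log[start:]


topic = InMemoryTopic()
topic.publish({"car": "A1", "engine_temp": 92})
topic.publish({"car": "B7", "engine_temp": 131})

# One subscriber processes records as they arrive and flags hot engines.
alerts = [r["car"] for r in topic.poll("alerting") if r["engine_temp"] > 120]

# A second subscriber reads the same stored records independently.
dashboard_first = topic.poll("dashboard")

topic.publish({"car": "A1", "engine_temp": 125})
dashboard_second = topic.poll("dashboard")  # only the newly published record
```

Because the log is kept rather than consumed, the alerting and dashboard readers never interfere with each other, which is the property the paragraph above describes.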

Even with Kafka in the mix, providing business access to data directly in a Kafka topic via SQL-based tools is still a challenge. All too frequently, organizations will implement complex multi-stage architectures in which data from Kafka is first moved into an intermediate data store or processing framework such as Apache Spark™ or Apache Kudu before it is available for querying via SQL-based BI tools. Every additional component and step in a fast-moving data architecture adds maintenance overhead and cost, creates multiple versions of data, and ultimately slows down immediate business user access.

Thankfully, this very challenge is addressed by KSQL, a brand-new offering from Confluent, the company founded by the original team that built Kafka. KSQL allows continuous, interactive queries on Kafka topics via SQL, which eliminates the need for intermediate data stores and gives SQL tools immediate access to the data.
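The difference between a continuous query and a traditional one can be sketched in a few lines of Python. This is not KSQL syntax and makes no claim about how KSQL is implemented: the `continuous_count` function, the 60-second window, and the page-view events are assumptions chosen for illustration. The point is only that the query emits an updated result for every arriving event instead of running once over data at rest.

```python
from collections import Counter


def continuous_count(events, window_seconds):
    """Yield (window_start, key, running_count) each time an event arrives.

    `events` is an iterable of (timestamp_in_seconds, key) pairs. Counts are
    grouped into fixed, non-overlapping windows of `window_seconds`.
    """
    counts = Counter()
    for timestamp, key in events:
        window_start = timestamp - (timestamp % window_seconds)
        counts[(window_start, key)] += 1
        yield window_start, key, counts[(window_start, key)]


# Hypothetical browsing events: (seconds since start, product category).
page_views = [(0, "shoes"), (12, "shoes"), (45, "hats"), (61, "shoes")]

results = list(continuous_count(page_views, window_seconds=60))
```

Here the list of events stands in for an unbounded stream. With a live source, the generator would keep yielding updated counts for as long as events arrive, which is what lets a dashboard change as soon as the data does.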

Arcadia Data is excited to partner with Confluent to become the first visual analytics software vendor to support their newly announced KSQL. With KSQL enabled inside Arcadia Enterprise, business users will gain direct access to real-time data in Kafka topics through an intuitive, drag-and-drop, web-based UI, and will be able to build innovative visual analytic applications in exactly the same way they would on more traditional data sources.

Through the upcoming Arcadia Enterprise and KSQL integration, organizations can:

  • Enable self-service data discovery and intuitive drag-and-drop visual analytic, dashboard, and application composition on data directly in Kafka topics.
  • Perform fine-grained, time-series-based analysis on fast-moving data to understand what is happening right now and compare this to any moment in the past.
  • Blend streaming data in Kafka with static data at rest, all within a single user interface to create unified analytical applications.
  • Experience earlier detection of anomalies and exceptions in data via real-time alerts and dynamic dashboards that react and change as soon as the data does.

The combination of Arcadia Enterprise and KSQL (which is available to download and try) puts the power of fast data directly in the hands of the business. At the same time, it dramatically reduces the complexity of the underlying streaming architecture without sacrificing the ability to scale out to billions of real-time events. Arcadia Enterprise also provides visual analytics with interactive performance on massive volumes of historical data, without the need to move or extract the data from the cluster. Together, these capabilities give organizations a new class of visual analytics for big data that will drive innovation and take performance to the next level.

