Chapter 7:
Selecting a Next-Generation Business Intelligence Platform

Recommendations

Business intelligence has kept pace with many advances in both hardware and software technologies. However, because they have traditionally relied on data extracts and dedicated infrastructure, legacy business intelligence solutions limit users' ability to explore the full range of new data sources and types now available. This is particularly true of streaming and other time-sensitive data, because of the long lead times commonly associated with ETL.

With the arrival of Hadoop, Spark, Apache Kafka, and other open-source solutions, the cost of storing and processing data has plummeted, opening vast new opportunities for innovation and insight. Many tools have emerged to tap into these data sources, but most still require data translation, transformation, integration, ETL, and intermediary platforms between the Hadoop store and the analytic front end. As a result, they provide only incremental improvements over legacy BI approaches.

The concept of self-service has changed the user experience. Business users are more emboldened than ever to take care of their own data management needs, and they have little patience for long delays in getting at their data. Given that the shortage of data scientists is unlikely to abate in the near future, a solution that involves throwing more people at the problem is not attractive. A far more practical solution is to satisfy users by giving them self-service access to analytics and visualization, with the power to drill down into the back-end data stores.

An Example Technology to Consider

Arcadia Data is one of a new breed of visual analytics companies that are taking an entirely different approach to BI by cutting out intermediate steps and enabling users to query big data sources directly with a full range of high-performance analytic and visualization options.

Arcadia Enterprise is a native visual analytics platform for big data that is designed from the ground up to work in concert with Hadoop, cloud environments, and other modern data platforms. This eliminates much of the ETL required to “fit” data into legacy BI tools, along with the many related complications and failure points that we have outlined here, including delays caused by data preparation, data duplication, version conflicts, security vulnerabilities, and more. The Arcadia Data approach minimizes risk by never copying or moving data out of the core data lake, hub, or platform.

Arcadia Enterprise works with popular big data technologies in use today, such as Hadoop, S3, Hive, Cloudera Manager, Ambari, Apache Sentry, and Apache Ranger, rather than saddling customers with additional layers of tools. Arcadia Data is also committed to making its extensive library of visualizations available through web and mobile browsers. This removes barriers between users and their data, eliminates client licensing costs, and delivers major operational efficiency benefits.

One of the biggest limitations of conventional BI is the difficulty of scaling the analytics server cost-effectively. Because Arcadia Data runs natively on the data platform itself, it scales transparently and linearly with Hadoop and cloud environments.

Until now, BI users have not had the ability to access real-time data, much less integrate it with batch sources. Arcadia Enterprise provides the means to do this with drag-and-drop simplicity, opening up a world of new applications in the process. For example, factory managers in an instrumented IoT environment can match live data streams from sensors on the floor against historical data showing averages and thresholds. Managers can see in real time when equipment is malfunctioning and replace it without downtime. Similarly, web server administrators can monitor clickstream data and adjust resources in real time to minimize performance delays.
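To make the factory example concrete, the following is a minimal Python sketch of the underlying idea: comparing live sensor readings against averages and thresholds derived from historical data. It is purely illustrative and is not Arcadia Enterprise's API or implementation. The function names, the sensor IDs, the sample readings, and the choice of a band of three standard deviations around the historical average are all assumptions made for the example.

```python
from statistics import mean, stdev


def build_baselines(history, k=3.0):
    """Derive (average, lower, upper) per sensor from historical readings.

    The band of k standard deviations around the average is an assumed
    threshold rule, chosen only for illustration.
    """
    baselines = {}
    for sensor_id, readings in history.items():
        avg = mean(readings)
        spread = stdev(readings)
        baselines[sensor_id] = (avg, avg - k * spread, avg + k * spread)
    return baselines


def flag_anomalies(live_stream, baselines):
    """Yield (sensor_id, value) for live readings outside the historical band."""
    for sensor_id, value in live_stream:
        band = baselines.get(sensor_id)
        if band is None:
            continue  # no history for this sensor, so nothing to compare against
        _, lower, upper = band
        if value < lower or value > upper:
            yield sensor_id, value


# Hypothetical batch (historical) data and a hypothetical live stream.
history = {
    "press-1": [70.0, 71.0, 69.0, 70.0, 70.0],
    "press-2": [40.0, 42.0, 41.0, 39.0, 38.0],
}
live = [("press-1", 70.5), ("press-2", 55.0), ("press-1", 90.0)]

alerts = list(flag_anomalies(live, build_baselines(history)))
for sensor_id, value in alerts:
    print(f"ALERT {sensor_id}: reading {value} is outside its historical range")
```

In a production setting the historical baselines would come from the batch store and the live readings from a streaming source; here both are in-memory stand-ins so the comparison logic can be read on its own.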