BI on Hadoop

Arcadia Data provides visual analytics and BI native to Apache Hadoop® and the cloud.

Why use BI on Hadoop?

Data is coming from every direction, but less than a third of companies turn their big data into insight. Apache Hadoop enables agility in addressing the volume, velocity, and variety of big data. Hadoop, and more specifically the Hadoop Distributed File System (HDFS) and HDFS-compliant storage systems such as Amazon S3 and MapR-FS, lets you store data in its raw format at large scale without first pre-processing it to fit a fixed schema. Structure is applied when the data is read (schema-on-read) rather than when it is written. Hadoop also has an ecosystem of open source projects that provide processing engines for data transformation and analysis. Organizations want analytic agility so that end users can make the most of new, large-scale data within Hadoop.
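To make schema-on-read concrete, here is a minimal sketch in Python. It assumes raw events were stored as JSON lines exactly as they arrived; the sample records, the `read_with_schema` helper, and the `SPEND_SCHEMA` mapping are hypothetical names used only for illustration and are not part of Hadoop or any Arcadia Data product.

```python
import json

# Raw events stored exactly as they arrived; no schema was enforced at write time.
# (Hypothetical sample data: note that the records do not share the same fields.)
RAW_LINES = [
    '{"user": "ana", "amount": "19.50", "channel": "web"}',
    '{"user": "ben", "amount": "5", "device": "ios"}',
    '{"user": "ana", "amount": "0.50"}',
]

def read_with_schema(lines, schema):
    """Apply a schema (field name -> converter) while reading raw records."""
    for line in lines:
        record = json.loads(line)
        # Keep only the fields the schema asks for; missing fields become None.
        yield {
            field: (convert(record[field]) if field in record else None)
            for field, convert in schema.items()
        }

# The schema is chosen by the reader, at query time, for this one analysis.
SPEND_SCHEMA = {"user": str, "amount": float}

rows = list(read_with_schema(RAW_LINES, SPEND_SCHEMA))

total_by_user = {}
for row in rows:
    total_by_user[row["user"]] = total_by_user.get(row["user"], 0.0) + row["amount"]
```

A different analysis could read the same raw lines with a different schema (for example, one that picks out `channel` or `device`) without reloading or restructuring the stored data.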

The Traditional Approach

Given that many companies already use business intelligence (BI) tools designed for their relational database-centric data warehouses, their first inclination is to apply those same BI tools to Hadoop. Can you get data warehouse BI tools to work with Hadoop? Yes, with enough experts, money, and time, you can treat Hadoop just like any other database. So why is this a bad idea?

Traditional BI tools were architected for data warehouses and integration with a relational database. Because parallel relational database technologies do not allow other software to run on the database nodes themselves, these BI tools use a two-tier architecture, with a separate tier for cube or aggregation data management and visual rendering. This architecture is highly optimized for a data warehouse, but it requires loading data in two places, securing data in two places, and optimizing data for processing performance in two places, all of which can be complex and costly. Because this approach delivers less than acceptable results on Hadoop time and again, more companies are adopting a new BI standard for Hadoop.

The New Standard for BI on Hadoop

Arcadia Enterprise leads a new class of BI architected specifically for Hadoop. Arcadia Enterprise runs directly within the data lake cluster. The query engine is fully distributed (massively parallel) and runs on each data node in the cluster, so query capacity scales with the cluster itself. You can perform analytics where the data lives, eliminating data summarization and movement to a separate BI-specific cluster.
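The general idea of running a query on each data node can be sketched in a few lines of Python. This is only an illustration of distributed aggregation, not Arcadia Enterprise's actual implementation: the node names, sample rows, and the `local_aggregate` and `merge_partials` functions are all hypothetical.

```python
from collections import Counter

# Hypothetical partitions: each list stands in for the rows stored on one data node.
NODE_PARTITIONS = {
    "node-1": [("web", 120), ("mobile", 30)],
    "node-2": [("web", 80), ("store", 50)],
    "node-3": [("mobile", 70), ("store", 25), ("web", 10)],
}

def local_aggregate(rows):
    """Runs where the data lives: sum revenue per channel for one node's rows."""
    partial = Counter()
    for channel, revenue in rows:
        partial[channel] += revenue
    return partial

def merge_partials(partials):
    """Combine the small per-node summaries into the final answer."""
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds values for matching keys
    return total

# In a real cluster the local step would run in parallel on every node;
# here it runs sequentially for simplicity.
partials = [local_aggregate(rows) for rows in NODE_PARTITIONS.values()]
revenue_by_channel = dict(merge_partials(partials))
```

Only the small per-node summaries are combined at the end; the raw rows never leave the node that stores them, which is the point of performing analytics where the data lives.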

BI platforms that are native to Hadoop provide unified security, seamless semantic modeling, and query processing optimizations. They use a single-tier architecture, unlike traditional BI tools, which run in a two-tier architecture. The result is reduced administration, improved self-service, and high speed and scale.

Search-Based BI on Hadoop

Native architectures for BI on Hadoop enable additional benefits, including search-based BI and analytics. With Arcadia Enterprise, business users can query their data through a simple, Google-like search interface and receive instant visual answers and insights. Traditional BI and analytics tools require users to know in advance which dataset to query and which fields to pick. Arcadia Enterprise lets users type natural language questions and get answers back, encouraging more organizations to embrace modern BI on Hadoop and reap the benefits of its speed and agility.

Still Want to Use Your Existing BI Tool?

If you want to use your existing BI tool for your BI on Hadoop deployment, we can enable that too. Don't move your Hadoop data to a separate BI server, and don't accept compromises on scale when connecting directly to Hadoop. Arcadia Data accelerates your BI tools to let you run production dashboards for thousands of users on huge volumes of data.