Accelerating Tableau on Apache Hadoop®

Arcadia Data accelerates Tableau (and other BI tools) connected to Hadoop.

Why Accelerate Tableau on Hadoop?

There are two primary models for applying a business intelligence tool such as Tableau to Hadoop. You can extract your Hadoop data to Tableau Server, and then run your analytics from there (a slow process). Or you can connect Tableau directly to Hadoop and let a Hadoop query engine handle the backend processing (slow results). Both models have significant limitations.

A better alternative is to let Arcadia Enterprise accelerate your dashboards on data residing in Hadoop so that end users get responsive applications and can quickly drill into data to get insights. Arcadia Enterprise was architected to seamlessly provide dashboard acceleration so IT teams don’t have to make a huge time investment in data movement and performance analysis.

The Challenges of Tableau on Hadoop

The challenges of applying Tableau to Hadoop quickly become obvious. In the Tableau Server scenario described above, you compromise on scale, because it’s impractical to continually move large volumes of data to a BI server. Hadoop query engines also have scale challenges, particularly with regard to latency and concurrency. As you add more users, response times increase dramatically, leaving you with non-responsive dashboards.
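To make the concurrency point concrete, here is a deliberately simplified sketch in Python. It is a toy model, not a measurement of Tableau, Hadoop, or any particular query engine. It assumes a fixed number of query slots, a fixed service time per query, and that all users submit a query at the same moment. The names simulated_latencies, mean_latency, slots, and service_seconds are hypothetical and chosen only for illustration.

```python
def simulated_latencies(users, slots, service_seconds):
    """Completion time per user when `users` queries arrive at once.

    Toy assumption: the engine runs at most `slots` queries at a time,
    each taking exactly `service_seconds`, so queries finish in waves.
    """
    return [(i // slots + 1) * service_seconds for i in range(users)]


def mean_latency(users, slots, service_seconds):
    """Average completion time across all users in the toy model."""
    latencies = simulated_latencies(users, slots, service_seconds)
    return sum(latencies) / len(latencies)


if __name__ == "__main__":
    # With 4 slots and 2-second queries, the average wait grows as users are added.
    for users in (4, 8, 40):
        print(users, "users ->", mean_latency(users, 4, 2.0), "seconds on average")
```

In this model, average response time grows roughly in proportion to the number of concurrent users once the available slots are saturated. Real engines behave in more complicated ways, but this is the general pattern behind dashboards that slow down as users are added.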

Like most BI tools, Tableau was architected for data warehouses and integration with a relational database. Because relational parallel database technologies do not allow other software to run on the database nodes themselves, traditional BI tools have a two-tier architecture: the database on one tier, and a dedicated BI server for cube or aggregation data management and visual rendering on the other. This architecture is highly optimized for a data warehouse. But it requires loading data in two places, securing data in two places, and optimizing data for processing performance in two places, all of which can be complex and costly. And the dedicated BI server has limitations on how much data it can manage and how much user load it can sustain.

Accelerating Tableau on Hadoop with Arcadia Enterprise

Real-time BI and customer insight are among the top initiatives for organizations, according to recent research by The Enterprise Strategy Group. But data lakes have not delivered their promised value because traditional BI architectures struggle with high volume and concurrency.

By installing Arcadia Enterprise on your Hadoop cluster, and pointing Tableau at Arcadia Enterprise rather than a BI server, you eliminate the need for the traditional two-tier BI architecture. Arcadia Enterprise provides a distributed and parallel in-cluster BI execution engine so you can do computations where the data resides, with no data movement. The analytics and BI platform provides scale, performance, concurrency, unified security, and reliability. Moreover, it allows end users to analyze granular data directly and quickly without requiring upfront data modeling, ETL, data structuring or “cubing.” With Arcadia Enterprise, you do not need a separate BI server, and you do not need to build data cubes or extracts before analyzing the data and developing BI reports with Tableau. You can explore raw data immediately, with no need to submit a ticket to IT to build a cube first. Arcadia Enterprise and Analytical Views™ provide unrivaled capabilities for fast Tableau on massive data for thousands of users.

Arcadia Enterprise gives you direct access to the data that’s sitting in the nodes of Hadoop or other scale-out modern data platforms – with no data movement – through the Tableau interface with which you are already familiar. You can gain cutting-edge insight directly from your data.

The Result: High-Performance Distributed, Parallel BI. No ETL, No Extraction, No Data Movement.

Third-party tests have shown that Arcadia Enterprise can run BI dashboards on Hadoop 21 to 88 times faster. Data remains in its native form, with no upfront modeling, so there are no new administrative burdens. That translates to blazing-fast performance of Tableau on Hadoop.

With faster queries, users can explore their data at the speed of thought to develop and deliver more informed BI insights. The speed and agility inherent in this model lead to greater adoption of BI throughout organizations.