September 19, 2017 - Richard Tomlinson | Big Data Ecosystem

Announcing Arcadia Enterprise 4.2 — Machine-Assisted Insights for Business Analysts


With every new version of Arcadia Enterprise, we seek to double down on our big data capabilities, and our next release is by no means an exception. In our latest release, we are adding a whole host of features while at the same time investing in multiple strategic themes that line up with both our customers’ needs and the direction of the market in general.

All of these capabilities, however, have one thing in common—the goal of simplifying big data analytics for business users and analysts. These capabilities drive what we refer to as “machine-assisted insights,” to lighten the load from a BI user standpoint when performing advanced analytics.

Some of our newest marquee features include:

  • Easier composition of visuals, dashboards, and apps on streaming data
  • Instant visual recommendations for different combinations of dataset attributes
  • Visualizations on data without flattening complex, nested schemas
  • File browsing for cloud storage repositories like Amazon S3 to visualize data without moving or importing it

Let’s take a look at some of these capabilities and the benefits they provide in more detail.

Visual Analytics on Streaming Data for Everyone

The rise of the Internet of Things is driving the momentum for new types of data streaming applications and visual analytics on fast-moving data. It is widely recognized that new opportunities to innovate or provide higher levels of customer service can be enabled by harnessing data at the moment it is created. Identifying the need for a repair as soon as a part fails in a connected car or flagging the need for an urgent surgical procedure based on anomalies detected in data from a wearable device are two such examples.

The key to enabling innovation on fast-moving data, however, is to provide access to those who know how to use it, i.e., business analysts and users on the front lines. They should be able to work with data in motion in exactly the same way they would data at rest. They should be able to see new records or events as they arrive, aggregate this data based on any of its attributes, create visuals and dashboards, and set alerts and thresholds to trigger actions when anomalies are detected exactly as they would on slow-moving or static data. Traditional BI tools were never designed for streaming data. When the data is in a stream even before it has been loaded in a database or an extract, the mechanism to investigate even basic information is absent in BI tools.

In our next release, we have made significant improvements to how Arcadia Enterprise works on fast-moving data from both a frontend and a backend perspective. At the UI layer, we have created intuitive filter widgets that allow the user to analyze very fine-grained time series data, e.g., a rolling filter for the last five minutes or five seconds. We have improved the transitions and animations inside our visuals so that, as data changes, the visuals react and change smoothly as time passes. We will also provide the ability to pause a dashboard that is updating in real time, freeze the analysis at a specific point in time, take some action, then un-pause and let the visuals continue to change once more.
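To make the idea of a rolling time filter concrete, here is a minimal sketch in Python. It is an illustration only, not Arcadia's implementation: the `rolling_filter` function, the event layout, and the sample values are all hypothetical.

```python
from datetime import datetime, timedelta


def rolling_filter(events, window, now):
    """Keep only events whose timestamp falls within `window` before `now`.

    Hypothetical helper for illustration; not part of any Arcadia API.
    """
    cutoff = now - window
    return [e for e in events if cutoff < e["ts"] <= now]


# Made-up sample events arriving from a stream.
now = datetime(2017, 9, 19, 12, 0, 0)
events = [
    {"ts": now - timedelta(minutes=10), "sensor": "a", "value": 1.0},
    {"ts": now - timedelta(minutes=3), "sensor": "a", "value": 2.0},
    {"ts": now - timedelta(seconds=2), "sensor": "b", "value": 3.0},
]

# The same filter applied with two different window sizes.
last_five_minutes = rolling_filter(events, timedelta(minutes=5), now)
last_five_seconds = rolling_filter(events, timedelta(seconds=5), now)
```

In a live dashboard, `now` would advance continuously, so the set of visible events changes as old records drop out of the window and new ones arrive.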

Select a filter

At the backend, we have improved our integration with Apache Solr, a popular engine used for text-heavy data such as log files, in addition to already providing support for data in Apache Kudu, Spark, HBase, MongoDB, and many more. Most recently, we announced a unique partnership with our friends at Confluent that will allow us to provide first-class support for their new KSQL offering, ultimately enabling business users in ArcViz to build real-time visuals directly on streaming topics in Apache Kafka.

Instant Visuals via Recommendations

In the world of big data, the notion of getting anything instantly is a rare concept, especially when it comes to data visualization. In our next release, Arcadia Enterprise has made investments to help business users get results faster via a brand new concept in the product that we are calling “Instant Visuals.”

Instant Visuals are a part of our overall strategy to help accelerate business agility and also expose unanticipated patterns by offering smart, machine-assisted insights. This helps users throughout the essential but click-heavy process of data discovery and the development of visual analytics, dashboards, and big data apps. For example, business users may want to visualize patterns related to traffic on their website by various demographic factors such as age, income level, education level, and so on. They understand the data attributes (metrics, dimensions, etc.) they wish to analyze, but they don’t know the best way to represent these attributes visually to show patterns in the data.

Instant Visuals allow users to simply pick the data attributes of interest, then click a button to instantly see a side-by-side representation of a wide variety of alternative ways to visualize the data. They can then pick the option that most closely meets their needs and add that visual to their dashboard. This drastically reduces the time taken to manually curate data visualizations and exposes options they may never have thought of in the first place. Instant Visuals not only expose alternative visual types but also present alternative style options and color palettes for users to simply pick and use, versus taking up the time clicking through all the possible combinations themselves.

Select data fields, then one click shows which visuals best represent your data.

We have also made our Instant Visuals smart. Recommended visual types presented to the user are based on a number of factors, such as the number of fields (metrics and dimensions), data types (string, date, numeric, geocode, etc.), the actual values in the data (low vs. high cardinality, sparsity, data quality, etc.), and how other users of the system are using the datasets (users who look at attribute x also look at attribute y, etc.).
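As a rough illustration of how factors like field counts, data types, and cardinality could drive a recommendation, consider the toy rule set below. It is a hypothetical sketch, not the logic Arcadia Enterprise uses: the `recommend_visuals` function, the field dictionary layout, the chart names, and the cardinality threshold are all assumptions made for this example.

```python
def recommend_visuals(fields):
    """Suggest chart types from simple field metadata.

    Each field is a dict with 'name', 'role' ('dimension' or 'metric'),
    and 'dtype'; dimensions also carry a 'cardinality' count.
    Toy heuristic for illustration only.
    """
    dims = [f for f in fields if f["role"] == "dimension"]
    metrics = [f for f in fields if f["role"] == "metric"]
    suggestions = []

    if len(dims) == 1 and len(metrics) == 1:
        dim = dims[0]
        if dim["dtype"] == "date":
            suggestions.append("line")       # trends over time
        elif dim["dtype"] == "geocode":
            suggestions.append("map")        # geographic data
        elif dim["cardinality"] <= 8:        # assumed threshold
            suggestions += ["bar", "pie"]    # few categories
        else:
            suggestions.append("bar")        # many categories
    elif len(dims) == 0 and len(metrics) == 2:
        suggestions.append("scatter")        # metric vs. metric
    elif len(dims) == 2 and len(metrics) == 1:
        suggestions.append("heatmap")        # two dimensions, one metric

    suggestions.append("table")              # always available as a fallback
    return suggestions


# Example: website visits by age band (made-up metadata).
picks = recommend_visuals([
    {"name": "age_band", "role": "dimension", "dtype": "string", "cardinality": 6},
    {"name": "visits", "role": "metric", "dtype": "numeric"},
])
```

A real recommender would also weigh sparsity, data quality, and usage patterns from other users, which this sketch leaves out.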

Visualize Big Data in Complex Schemas Immediately. No Flattening Gymnastics Required.

Big data is extremely diverse in structure and doesn’t always fit neatly into the rows and columns found in traditional database schemas. For example, log files and event data generated by machines, or JSON data generated by REST APIs typically organize data in more efficient but complex nested structures or in raw text format that has no structure at all.

To accommodate this diversity, we are seeing more organizations adopt schemas that contain complex data types including ARRAY, MAP, and STRUCT. These complex types make it easy to model and represent nested and unstructured data naturally without first needing to use ETL tools to split apart the data into normalized tables or explode the data into giant monolithic data sets.

Traditional BI tools don’t recognize data in ARRAY, MAP, and STRUCT formats, and as a result, they rely on the inefficient approaches mentioned above to either change the underlying structure of the data via ETL, or create flattened views to represent the naturally nested data in extremely unnatural ways. Both approaches kill agility and time to insight because they involve the intervention of technical resources and external tools before the data can be analyzed, and they can significantly degrade SQL query performance at run time.
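The cost of flattening is easy to see with a small example. The Python sketch below uses a made-up order record containing a STRUCT, an ARRAY, and a MAP, and a hypothetical `flatten` function that produces one row per combination of nested values, as a naive flattened view would.

```python
# A made-up nested record; the comments name the equivalent complex type.
order = {
    "order_id": 1,
    "customer": {"name": "Ada", "city": "London"},  # STRUCT
    "items": [                                      # ARRAY of STRUCT
        {"sku": "A1", "qty": 2},
        {"sku": "B7", "qty": 1},
    ],
    "tags": {"channel": "web", "promo": "none"},    # MAP
}


def flatten(record):
    """Naively flatten one nested record into rows (items x tags)."""
    rows = []
    for item in record["items"]:
        for tag_key, tag_value in record["tags"].items():
            rows.append({
                "order_id": record["order_id"],
                "customer_name": record["customer"]["name"],
                "customer_city": record["customer"]["city"],
                "sku": item["sku"],
                "qty": item["qty"],
                "tag_key": tag_key,
                "tag_value": tag_value,
            })
    return rows


rows = flatten(order)

# One record became several rows, and a naive SUM over the
# flattened rows double-counts the quantities.
nested_qty = sum(item["qty"] for item in order["items"])
flattened_qty = sum(row["qty"] for row in rows)
```

Here a single order turns into four rows, and summing `qty` over the flattened rows gives twice the true total, which is the kind of pitfall that flattened views force analysts to work around.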

It is for these reasons that Arcadia Enterprise is excited to offer first-class support for ARRAY, MAP, and STRUCT complex data types in our next release. Not only will we recognize complex types and allow the creation of Arcadia datasets with no upfront data gymnastics required, but we will also expose these complex schemas within the UI in an incredibly intuitive and user-friendly way so business users can operate on the data elements the exact same way they would basic data types. To the user, the experience is completely transparent, but behind the scenes, Arcadia takes care of the complexity of generating the correct SQL to ensure the nested data is queried in the most efficient way possible.

Self-Service Visuals for Big Data in Cloud Storage

More and more organizations are leveraging cloud-based object storage systems like Amazon S3 as a key part of their big data architecture, due to better flexibility and elasticity to easily increase and decrease capacity, significantly reduced storage costs, improved data protection, and enhanced searchability.

In our last release, we added support for connecting and directly querying tables built on files located in S3 buckets. We are very excited about this, as it removes the need for customers who are all-in on the cloud to first create a pipeline to move their data and keep it fresh in an alternative storage mechanism such as HDFS before exposing it to SQL query engines for business analytics.

For our next release, we build on the momentum around cloud storage and will allow non-technical users, via an intuitive UI, to browse data in their S3 repositories, select files of interest, and simply hit a button to create a new table in Arcadia. That way they can start visualizing the data right away, without first involving technical resources or data preparation tools.

This capability furthers the notion of business self-service in Arcadia Enterprise, and aligns with the growing trend among organizations that want to empower their business users and increase agility by allowing them to bring new data sets into the big data environment on their own.

A Whole Host of Enterprise Features for Visual Analytics on Big Data

In addition to the strategic capabilities just mentioned, we have also included many other new enterprise features, enhancements, and bug fixes, some of which include:

  • Native execution engine fallback (e.g., if possible, a query can fall back to Impala vs. ArcEngine)
  • Improved smart acceleration (Analytical View support for more complex, multi-table data models)
  • Logging of usage stats to a database table for easy reporting on ArcViz and ArcEngine activity
  • Integration with inotify in HDFS to automate the process of metadata updates as data changes

We are proud of what we were able to achieve with this new release of Arcadia Enterprise, and we feel that our customers and prospects are going to love what we will deliver. Please reach out to us to learn more about what Arcadia Enterprise can do for your organization today.
