Chapter 7:
Selecting a Next-Generation Business Intelligence Platform

Big data has enabled new applications and created new consumers for business intelligence and analytics. “Where in the past BI was limited by rigid data definitions, highly constrained data types, and the cost of delivering speed and scale, big data is now eliminating those constraints,” wrote Enterprise Strategy Group Senior Analyst Nik Rouda. A new class of applications “will empower a new generation of business data consumers, much broader than just the technical specialist pool of data scientists, DBAs, and analysts.”

One big question is, “What should these next-generation solutions provide?” We suggest the following:

  • They will reduce and in some cases eliminate ETL steps and intermediary data stages.
  • They will natively support Hadoop, cloud, NoSQL, and other modern data platforms, which will reduce the creation of disparate, task-specific data silos.
  • They will offer rich visualization features that are extensible to accommodate third-party visualization engines such as maps.
  • They will empower more business users to create their own BI content, curate streaming and stationary data feeds, provision applications, and foster collaboration with peers.

Greater scrutiny will be required because the market is rapidly evolving, with many new entrants taking advantage of big data platforms to approach BI in new ways. Each has different strengths and weaknesses. At the same time, established vendors such as Oracle are retrofitting their BI platforms to support Hadoop natively. This trend appears to be unstoppable; in the words of Forrester Research’s Boris Evelson, “It’s a question of when, not if.”

Here are some additional capabilities to consider:

Self-service for business users.

If offering self-service BI is important to your organization, look for the ability to browse data structures and sources at a fine level of granularity, create semantic relationships across multiple sources, and set hierarchies and logical data sets.

All users should be able to assemble their own dashboards with the ability to drill down to raw data and resume and pivot through multiple data levels easily. End users should also be able to create and publish visualizations to anyone with a browser. “Citizen data scientists” should have the ability to author their own BI content, curate data, create calculated measures, provision applications, and easily collaborate with others.

Data visualization features.

Most BI packages provide basic visualizations such as bar and line graphs and pie charts. Modern BI systems should provide support for sophisticated visualizations like funnel charts, network graphs, dendrograms, packed bubble charts, heat maps, geographic map displays, and interactive legends. Extensibility is important to accommodate new visualizations. For example, the D3.js JavaScript library enables developers to take advantage of the full capabilities of modern browsers without being tied to a proprietary framework.

If many of your users are non-technical, pay extra attention to ease of use and to the strategies that were applied to maximize it. Users should be able to build a chart very quickly, and certainly without writing any code. If you want a system that incorporates user experience (UX) expertise and best practices, look for products with leading browser-based innovations. HTML5 is extremely powerful for deploying analytics across a large user base, so look for technologies that use it for rich, responsive interfaces. Also look for technologies that incorporate principles from Material Design. Initially announced in June 2014 at the Google I/O conference, Material Design is a design language developed by Google that aims to unify the user experience across its own products and across other platforms, such as iOS and modern web browsers. Material Design aims to improve the overall digital experience for end users and make their analytical activities more intuitive.

Advanced analytics support.

Advanced analytics goes beyond looking at historical data, and often deals with real-time responses. A common use of advanced analytics is on time-series data for a variety of analytical tasks, such as calculating averages, identifying variances, and making predictions. It is used in cybersecurity, network management, and website traffic analysis where streaming and historical data can be combined to identify outliers and exceptions.
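
As a rough illustration of the time-series tasks described above, the following Python sketch computes a trailing average and standard deviation and flags values that deviate sharply from them. The function name `flag_outliers`, the window size, the threshold, and the sample traffic figures are all assumptions made for this example; a production system would likely use more robust statistics.

```python
from statistics import mean, pstdev


def flag_outliers(values, window=5, threshold=3.0):
    """Return the indices of values that deviate from the mean of the
    preceding `window` values by more than `threshold` standard deviations.

    Hypothetical helper for illustration only.
    """
    outliers = []
    for i in range(window, len(values)):
        history = values[i - window:i]      # trailing window, excludes values[i]
        mu = mean(history)                  # moving average
        sigma = pstdev(history)             # spread of the recent history
        # Skip perfectly flat windows to avoid dividing the signal by zero spread.
        if sigma > 0 and abs(values[i] - mu) > threshold * sigma:
            outliers.append(i)
    return outliers


# Illustrative (made-up) requests-per-minute readings with one spike.
requests_per_minute = [100, 102, 98, 101, 99, 100, 250, 101, 97, 103]
print(flag_outliers(requests_per_minute))  # [6]
```

The same pattern applies whether the readings arrive as a stream or are read back from historical storage, which is what allows the two to be combined.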

In a security scenario, real-time analytics can trigger alerts based on unusual activity from a particular node or IP address. Once such activity is discovered, an analysis can be conducted across users, endpoints, and networks within a specified time window to look for correlations and patterns.
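
The alerting idea in this scenario can be sketched as a sliding-window counter per IP address. This is a minimal sketch under stated assumptions: the function name `detect_bursts`, the 60-second window, the limit of three events, and the sample events are all hypothetical, and "unusual activity" is reduced here to a simple event count.

```python
from collections import defaultdict, deque


def detect_bursts(events, window_seconds=60, max_events=3):
    """Return (timestamp, ip) pairs at which an IP exceeded `max_events`
    within the trailing `window_seconds`.

    `events` is an iterable of (timestamp_seconds, ip), sorted by timestamp.
    Hypothetical helper for illustration only.
    """
    recent = defaultdict(deque)   # ip -> timestamps still inside the window
    alerts = []
    for ts, ip in events:
        timestamps = recent[ip]
        timestamps.append(ts)
        # Drop timestamps that have aged out of the window.
        while ts - timestamps[0] >= window_seconds:
            timestamps.popleft()
        if len(timestamps) > max_events:
            alerts.append((ts, ip))
    return alerts


# Illustrative (made-up) event stream.
events = [
    (0, "10.0.0.5"), (10, "10.0.0.5"), (15, "10.0.0.9"),
    (20, "10.0.0.5"), (30, "10.0.0.5"), (200, "10.0.0.5"),
]
print(detect_bursts(events))  # [(30, '10.0.0.5')]
```

Each alert carries a timestamp, which gives the follow-up analysis a natural time window in which to look for correlated activity.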

In an e-commerce setting, advanced analytics can be used to create heat maps that show the most active areas on a website and to predict preventable actions such as cart abandonment. It can also drive recommendations that not only give customers ideas about what they might want to purchase but also help the seller increase sales.

Derived data is another particularly powerful form of advanced analytics. It applies the output of one set of calculations to the input of another in a single pass, thus cutting down on query overhead and enabling much richer derived visualizations to be created.
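
One way to picture derived data is a single loop in which the result of a first calculation immediately becomes the input to a second. The sketch below is an interpretation and not any vendor's implementation: the function name `derive_metrics`, the choice of metrics (average order value and its day-over-day change), and the sample figures are assumptions made for illustration.

```python
def derive_metrics(rows):
    """Compute two chained metrics in one pass over `rows`.

    `rows` is an iterable of (day, revenue, orders).
    Hypothetical helper for illustration only.
    """
    results = []
    previous_aov = None
    for day, revenue, orders in rows:
        # First calculation: average order value (AOV) for the day.
        aov = revenue / orders
        # Second calculation, fed by the first: change in AOV from the prior day.
        change = None if previous_aov is None else aov - previous_aov
        results.append({"day": day, "aov": aov, "aov_change": change})
        previous_aov = aov
    return results


# Illustrative (made-up) daily figures.
daily = [("Mon", 1000.0, 20), ("Tue", 1200.0, 20), ("Wed", 900.0, 30)]
for row in derive_metrics(daily):
    print(row)
```

Because both metrics come out of the same pass, the data does not need to be queried a second time to produce the derived value.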
