Democratizing Big Data
As mentioned in the introduction of this book, Gartner and industry experts maintain there are far too few data scientists to meet the current demand, and those who are available are expensive. This sets the stage for the emergence of “citizen data scientists” leveraging powerful big data analytics tools. This trend is part of the larger movement many call “democratizing big data”: enabling more and more business users to quickly access the data they need to perform analyses and enterprise reporting of their own making, independent of IT.
However, for organizations to move closer to true “self-service BI,” their data must be freed from its traditional silos. Decision makers, influencers, and even observers need a unified view of such data, and getting that has traditionally involved a number of highly labor-intensive, and therefore costly, processes. One potential solution that organizations have explored is dumping data into a single, large “data lake” that holds tidal volumes of raw big data in its native format until requested.
Leave the Data Where It Is:
A Case Study of Data Democratization in Action
To truly “democratize” data, it is essential to leave it where it naturally resides, such as in the Hadoop platform. All that is then needed is an intelligent layer above these many and varied data types that can transparently integrate all the data. Ideally a visualization tool, this top-down “smart” layer gives everyone what they need: a unified view of all data regardless of its source.
The democratization of big data and its ensuing benefits are illustrated by MarketShare, a fast-growing provider of marketing analytics technology to major brands. Acquired by Neustar in late 2015, MarketShare helps marketers make better decisions more rapidly, offering both decision analytics and prescriptive recommendations to help clients optimize marketing spending. At the heart of MarketShare’s value proposition are data-driven recommendations leveraging big data analysis.
After moving from MySQL to Hadoop, MarketShare soon realized it wanted to provide dynamic rather than predefined reporting capabilities, supplied by tools that more of its business analysts could use. The set of “small data” tools it was using simply did not work on big data. It took MarketShare a full day and a half to develop customer-specific data sets and then transform and load them into an Oracle database. Analysts then had to produce one-off reports manually and embed them into a cloud application for analysis, another day-and-a-half process. The result was static, predefined reports instead of the highly dynamic reporting MarketShare wanted and needed.
The solution was a native visual analytics and BI platform for big data, which gave business analysts the ability to drill down into the raw data on individual customer interactions. Overall, this solution eliminated labor-intensive and time-consuming data extraction and data movement, allowing analysts to point directly to data stored in highly elastic cloud platforms like Amazon S3 for very fast ad hoc visualizations. Business analysts could now create sophisticated reports on the fly simply by selecting client-specific parameters. This is vastly different from MarketShare’s previous process, in which analysts waited and waited for data to be moved into the relational DBMS. As a result, reporting time and effort have been slashed from two full-time equivalents for three days to one full-time equivalent for a half-day.
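To put the reported savings in perspective, the before-and-after figures can be converted into person-days. The short sketch below is only an illustration of that arithmetic. It assumes that "full-time equivalents" multiplied by elapsed days is a fair measure of effort, and the function and variable names are hypothetical rather than anything used by MarketShare.

```python
# Illustrative arithmetic only: effort is approximated as FTEs x days,
# which is an assumption, not a figure stated in the case study.

def person_days(ftes: float, days: float) -> float:
    """Return total effort in person-days for a given staffing level."""
    return ftes * days

# Figures quoted in the case study
before = person_days(ftes=2, days=3)    # two FTEs for three days
after = person_days(ftes=1, days=0.5)   # one FTE for half a day

reduction_factor = before / after
percent_saved = 100 * (1 - after / before)

print(f"Before: {before} person-days, after: {after} person-days")
print(f"Reduction factor: {reduction_factor:.0f}x")
print(f"Effort saved: {percent_saved:.1f}%")
```

Under that assumption, the effort per reporting cycle drops from 6 person-days to 0.5 person-days, roughly a twelvefold reduction.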
Democratization To-Do List
The MarketShare experience also illustrates the value proposition of an analytics-as-a-service solution, which can deliver timely, relevant, and insightful big data analytics without the heavy up-front cost of investing in infrastructure. Essentially, all that is needed is big data, something nearly all organizations have plenty of.
Thus, in the interest of fostering this democratization of data and realizing the fuller potential of big data as a competitive tool used by non-IT business professionals, organizations should consider the following:
- Deploy solutions proven to enable access to new sources and new kinds of data, most notably big data in its many forms and from its many sources.
- Expand and leverage new analytics capabilities, in particular ones that can uncover new insights such as those derived from predictive and prescriptive analytics. Traditional BI tools are good at deriving insights from events that have already occurred, such as transaction data. Predictive analytics delivers insights into events and behaviors that have not yet occurred, giving organizations an opportunity to respond proactively.
- Push relentlessly to expand the pool of business analysts who leverage big data through advanced analytics tools, such as data visualization. The resulting “multiplier effect” can yield newfound business value from the one thing organizations have plenty of: data.