February 7, 2018 - Dale Kim | Big Data Ecosystem

BI and Analytics Meet Business Transformation

In this second blog post in our series that’s based on the ebook Modern Business Intelligence: Leading the Way to Big Data Success, we explore BI and analytics in greater detail, along with our views on the present and future of enterprise reporting and the accompanying rise in self-service BI reporting.

It’s an exciting time for any organization that wants to leverage data to drive business operations. Self-service BI is becoming increasingly important when it comes to leveraging data agility to gain a competitive advantage, and there are some exciting new self-service technologies out there that can help you stay ahead of the game, whether you’re a business analyst, data scientist, or data/application architect.

If you missed the first post, which provided an overview of the big data ecosystem with an emphasis on Apache Hadoop and Apache Spark, you can read it here.

What’s the Difference between BI and Analytics?

When you think about it, both BI and analytics have the same high-level purpose: to help organizations take advantage of their data to improve their decision making. But they go about achieving that goal differently. In general, BI is about what happened in the past, and what’s happening now. It’s mostly about assessing past work. Analytics is more about what might happen in the future. It’s a broad discipline that looks for trends and patterns that point to specific responsive actions.

Why does this difference matter? The problem in the past was that companies wanted the benefits of both BI and analytics, so they often were forced to purchase multiple technologies from different vendors. And now with the increased amount of unstructured data being created from various sources, organizations are again pursuing multiple technologies as they rethink how to store, analyze, and exploit this information in a cost-effective manner.

A Brief History of BI

The term “business intelligence” was actually first used way back in 1865 by Richard Miller Devens in his book Cyclopedia of Commercial and Business Anecdotes. It wasn’t until nearly a century later that it was adopted by the technical community, when Hans Peter Luhn, an IBM researcher, published “A Business Intelligence System” in 1958. Even then, it took until the 1980s for BI to gain visibility. That’s when Bill Inmon and Ralph Kimball came up with the concept of a “data warehouse,” which brought in data from multiple sources and stored the results in a central “system of record.”

Here’s a visual timeline of recent BI/analytics events:

[Image: BI/analytics timeline]

If I may deviate a bit from the narrative in the ebook, I want to call out that one of the most interesting events in this timeline is the “disappearance” of Hadoop. The “Strata + Hadoop World” and “Hadoop Summit” conference names are gone, recently rebranded to be more data-centric. Now we have the Strata Data Conference (which forces you to pronounce “data” to rhyme with “Strata,” lest you use the less mellifluous “Strata DAY-tuh”) and the DataWorks Summit. The software vendors previously known as “the major Hadoop vendors” are putting less emphasis on Hadoop, at least in their marketing materials. Even the Hadoop-bashing trend of the last few years has led to less discussion about Hadoop.

Hadoop has not enjoyed the same explosive growth that relational databases saw in the 1990s, but that doesn’t mean it will suffer the fate of, say, object-oriented databases. If you’re looking for a data platform that provides scale and flexibility, especially in a cost-effective way, Hadoop is still the way to go. And as Hadoop moves beyond its awkward tween years (it is, after all, almost 13 years old) and progresses into its teens, it will continue to provide value for big data use cases.

The Present and Future of Enterprise Reporting

So now we come to the present day, where organizations are struggling with traditional data warehouses and enterprise data hubs that can’t support streaming or unstructured data in its native format unless IT goes through the time, hassle, and expense of preparing that data for the structured warehouse. The solution? Companies are turning to Hadoop to store these emerging data formats, as well as “less frequently used” warehouse data, more economically.
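To make that idea concrete, here is a minimal sketch of what this pattern can look like in practice: semi-structured data lands on Hadoop in its native format, offloaded warehouse extracts sit alongside it, and both are queried together with Spark SQL. This is purely illustrative; the paths, table names, and columns are hypothetical, not from the case studies in the ebook.

```python
# Illustrative sketch only: querying native-format data and offloaded
# warehouse extracts together on Hadoop with Spark SQL.
# All paths, view names, and column names are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("warehouse-offload-example").getOrCreate()

# Semi-structured clickstream events land in HDFS as raw JSON;
# Spark infers a schema at read time instead of requiring upfront ETL.
events = spark.read.json("hdfs:///data/raw/clickstream/2018/02/")

# "Less frequently used" warehouse data offloaded to Hadoop as Parquet.
orders = spark.read.parquet("hdfs:///data/warehouse/orders_archive/")

# Both sources can then be analyzed together with ordinary SQL.
events.createOrReplaceTempView("events")
orders.createOrReplaceTempView("orders")

customer_activity = spark.sql("""
    SELECT o.customer_id,
           COUNT(e.event_id)  AS recent_events,
           SUM(o.order_total) AS lifetime_spend
    FROM orders o
    LEFT JOIN events e ON e.customer_id = o.customer_id
    GROUP BY o.customer_id
""")
customer_activity.show(10)
```

The point of the sketch is the economics: the raw data never has to be reshaped to fit a warehouse schema before anyone can query it.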

Although traditional RDBMS systems will wane over time as adoption of Hadoop, Spark, and other newer technologies increases, RDBMS/EDW and Hadoop each have unique strengths. RDBMS systems represent the past and the present, while big data analytics on Hadoop represents the present and the future, driven by the massive growth of valuable data that businesses collect. Hadoop was designed to analyze data at a granular level while handling enormous volumes.

Self-Service BI Is Transforming Enterprise Reporting

The concept of self-service BI is the new star of the BI era, giving users the ability to analyze mission-critical enterprise data with minimal or no IT intervention. Gone are the days of waiting for weeks or even months for a new report.

Self-service BI will continue to experience phenomenal growth over the next few years. Gartner predicts that by 2020, self-service BI will comprise 80% of all enterprise reporting. Keep in mind that self-service BI should not be confused with self-sufficient BI. IT still has a role to play—they must provide trusted data to users, and for those organizations not in the cloud, they must still maintain and update the BI systems when needed.

Case Study: Self-Service BI Environment that Enabled Agility and Collaboration

Take the case of an IT vendor that produces and sells application-integrated data storage solutions. Their customer deployments generate a large volume of data, which is continually analyzed for the benefit of both the customer base and the vendor itself. For example, this data can be used to quickly identify a failed drive, so the vendor can address the problem right away, providing a positive customer experience and avoiding customer data loss.

The IT company collected hundreds of millions of data points from tens of thousands of servers every single day. However, the data collection wasn’t the problem; they used Hadoop to cost-effectively scale as the amount of data grew. The key challenge was that they wanted to offer the data to their end users in a way that was granular, comprehensive, consistent, and quickly accessible across business teams.
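As a rough illustration of what analyzing that kind of telemetry on Hadoop might look like underneath the BI layer, here is a short PySpark sketch that flags failed drives by customer from daily device metrics. It is an assumption-laden example in the spirit of the case study, not the vendor’s actual pipeline; the paths, column names, dates, and status values are all hypothetical.

```python
# Illustrative sketch only: flagging failed drives from daily device
# telemetry stored on Hadoop. Paths, columns, and values are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("drive-health-example").getOrCreate()

# Tens of thousands of servers report drive-level metrics each day;
# here they are assumed to be stored as Parquet, partitioned by date.
telemetry = spark.read.parquet("hdfs:///data/telemetry/drive_metrics/")

# Find drives reporting a failed status today, grouped by customer and
# server so support teams can reach out before data is lost.
failed_drives = (
    telemetry
    .filter(F.col("report_date") == "2018-02-07")
    .filter(F.col("drive_status") == "FAILED")
    .groupBy("customer_id", "server_id")
    .agg(F.count("drive_id").alias("failed_drive_count"))
    .orderBy(F.desc("failed_drive_count"))
)

failed_drives.show(20)
```

In the case study, of course, the goal was to avoid requiring business users to write jobs like this at all, which is where the in-cluster analytics layer described next comes in.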

The answer was the deployment of an in-cluster analytics solution that was architected for Hadoop. By using in-cluster analytics, the IT company was able to quickly identify potential sales opportunities, underutilized and poorly provisioned systems, historical and real-time details on equipment reliability, customer usage patterns, and more. BI users across the organization could easily create interactive data applications with no coding required and less IT intervention, resulting in increased agility and collaboration across teams.

The Rebirth of BI

BI and analytics continue to be important capabilities in organizations around the world. BI is enjoying a rebirth, thanks to the data explosion that big data, social media, the Internet of Things (IoT), and other sources are bringing to organizations. With the increased amount of unstructured data being created from various sources, organizations have begun to rethink how to store, analyze, and exploit this information, whether it’s stored on-premises or in the cloud.

In the next blog post in this series, we’ll talk about the concept of the citizen data scientist: someone with more advanced analytical skills than a typical business analyst, but who isn’t a formally trained data scientist. We’ll also discuss the concept of Hadoop as a platform game changer, and the growing importance of data visualization when it comes to expanding big data analytics use to business analysts. Be sure to check out the entire ebook here, either online or as a PDF.

