May 1, 2018 - Dan Woods - Evolved Media | Big Data Ecosystem

Why Going Data-Native Can Save Your Data Lake

This post was guest-written by Dan Woods, CEO of Evolved Media and Chief Analyst of Early Adopter Research. Dan develops ideas about technology products based on a broad technical understanding. By writing as an analyst for Forbes and working with Evolved Media’s clients, he sees the magic in technology and why it matters to IT buyers.

It is often hard to grasp the deeper implications of new technology and ideas right away. It takes time to see how an innovation can best be used across many contexts.

Data lakes fall into this category. Businesses have undervalued the success and utility of data lakes because they have judged them by the practices used to support data warehouses. In fact, data lakes represent a paradigm shift in the kind of data that is stored and how it can be analyzed; they are not just a flashier version of data warehouses. To get the most out of them, companies need to recognize this and act accordingly.

There are many ways to talk about what’s possible in a data lake, but one of the simpler ones is to describe a data-native approach. A data-native approach means storing, processing, and analyzing big data inside the data lake itself, so that the full detail of the data can have an impact on your business.

Going Native

Since companies began adopting data lakes, the data they have stored there has mostly offered a distilled, low-resolution version of the business. Frankly, there’s nothing wrong with such a picture: it can be incredibly powerful and add value to the business. But if that is the degree of resolution you want, there is no reason to use a data lake for storage. In that scenario, you are underutilizing the data lake.

A data-native approach gets the most out of a data lake because it starts from the understanding that big data can and should provide a high-resolution picture. It expands the types of data we can use beyond what is usually distilled into data warehouses to include a lot of new data in its raw, original form. A data-native approach uses this data to create a high-resolution view of the business, and it also applies analytics and machine learning to extract the signal from the incredible granularity of big data.
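To make the difference in resolution concrete, here is a minimal sketch in Python. The event data, field layout, and function names are hypothetical and chosen only for illustration; they do not come from any particular product. It shows how a distilled daily total can hide a pattern that the raw, granular events still contain.

```python
from collections import defaultdict

# Hypothetical raw events as they might land in a data lake:
# (day, hour of day, order value). All values are made up.
raw_events = [
    ("2018-05-01", 9, 20.0),
    ("2018-05-01", 9, 25.0),
    ("2018-05-01", 14, 30.0),
    ("2018-05-02", 9, 5.0),
    ("2018-05-02", 14, 35.0),
    ("2018-05-02", 14, 35.0),
]


def daily_totals(events):
    """Distilled, low-resolution view: one total per day."""
    totals = defaultdict(float)
    for day, _hour, value in events:
        totals[day] += value
    return dict(totals)


def hourly_totals(events):
    """Higher-resolution view: one total per (day, hour)."""
    totals = defaultdict(float)
    for day, hour, value in events:
        totals[(day, hour)] += value
    return dict(totals)


if __name__ == "__main__":
    # Both days add up to the same daily total...
    print(daily_totals(raw_events))
    # ...but the hourly view shows that morning sales fell sharply on day two.
    print(hourly_totals(raw_events))
```

In this toy data, the two days look identical at the daily level, while the hourly view reveals the morning drop on the second day. That hidden detail is the kind of signal a distilled picture can lose.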

Additionally, a data-native approach recognizes that, in production, batch processing will continue to be a great way to operationalize signal capture. But to discover those signals in the first place, you need an interactive environment in which to explore and analyze a big data world. You also need a way to incorporate streaming data that arrives in real time.

Here is how the power of a data-native approach can ensure companies get the full value out of what data lakes make possible.

Data-native as a Savior for Data Lakes

In a data-native approach, the processing and storage of data occur within the data lake so that meaning from the data can be more easily extracted and operationalized. This greatly aids a company’s ability to derive value from big data.

In a data-native approach:

  • You analyze the data in place. Taking big data from the data lake and using small data tools on it is like taking a photo of a 3D model: you lose resolution.
  • You empower business users. Non-technical users can give a business-oriented meaning to the underlying data; that’s called semantic modeling.
  • You operationalize the data. You get faster, more agile business impact by combining deep data from the data lake with business context, which enables timely action. Better dashboards aren’t just used at the end of the process, but throughout it, and are designed with the end user in mind.
  • You view data across all endpoints and users. This lets you see your data from every angle, including who is using it, across all networks and endpoints, down to individual log files.

Data lakes enable companies to live in a big data world. They empower users to ask big data questions rather than small data ones.

A data-native approach lays the foundation for this type of exploration by allowing data lakes to be more dynamic and to avoid becoming data swamps. With a data-native approach, data can flow in and out of the data lake and remain fresh and relevant to the business. Giving a wide array of users, including non-technical business analysts, direct access to the data and the ability to manage it themselves means more value can be extracted from it. Users can also add their own domain expertise to the data in a way that informs others, which is the semantic modeling described above.
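As a rough illustration of semantic modeling, the sketch below maps raw field names to business-friendly terms. The field names, the mapping, and the helper function are all hypothetical; real semantic layers are considerably richer, so treat this only as a way to picture the idea.

```python
# Hypothetical semantic model: business terms mapped to raw field names.
# A business analyst, not IT, would maintain this mapping.
SEMANTIC_MODEL = {
    "Customer": "cust_id",
    "Order Value (USD)": "ord_amt",
    "Region": "rgn_cd",
}


def apply_semantic_model(record, model):
    """Return a view of a raw record keyed by business terms.

    Raw fields that the model does not mention are left out of the view;
    the raw record itself is not modified.
    """
    return {
        business_name: record[raw_field]
        for business_name, raw_field in model.items()
        if raw_field in record
    }


# A made-up raw record as it might sit in the data lake.
raw_record = {"cust_id": "C-1001", "ord_amt": 42.5, "rgn_cd": "EMEA", "src_sys": "web"}
business_view = apply_semantic_model(raw_record, SEMANTIC_MODEL)

if __name__ == "__main__":
    print(business_view)
```

The raw data stays untouched in place; the model only adds a layer of business meaning on top of it, which other users can then reuse.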

A data-native approach provides companies with the ability to improve their:

  • Data discovery and organization, by allowing users to get to the data themselves, without IT intermediation, and in real time, because data storage and analysis occur in the same place
  • Interactions and explorations of big data
  • Configuration, operations, and administration of data
  • Ability to use the resulting insights in real time and in support of applications

The ultimate goal of a data-native approach is to ensure companies can operationalize the insights gleaned from their data in a simpler, faster fashion. By making data lakes more user-centric, a data-native approach empowers users to act on what they find in the data. With a data-native approach, companies have an analytics platform that is easy to navigate and allows users to act on data in real time. It is therefore the best way to avoid failure and maximize the potential of your data lake.

To learn more, download the white paper, “Saving the Data Lake.”

