Learn about the latest big data analytics and BI trends in Apache Hadoop, the cloud, data lakes, IoT analytics, data visualization, and more as you browse through these insightful posts.

July 19, 2018 - John Thuma

The Data Science Iron Triangle – Modern BI and Machine Learning

Originally posted here. The New Iron Triangle: It is cliché to discuss IT/business solutions as people, process, and technology. Some call it the “golden triangle,” but in this blog, we refer to it as the iron triangle. Since the 1960s, technology has disrupted business through the advent of computing and information management. These systems replaced…

July 18, 2018 - John Thuma

Data Has Time Value: Winners Exploit Data Streaming Now! Not Later!

Originally posted on Medium. Before I dig into Confluent KSQL, Apache Kafka, and Spark Streaming, let’s first take a look at what ‘streaming’ is and why it is so valuable. Data streaming is the continuous generation of lightweight messages, typically kilobytes in size, from potentially many different data sources. It can be from a variety of…

July 17, 2018 - Steve Wooledge

Three Surprises about Data Lakes, Hadoop, and the Cloud

If you’ve been paying attention to trends around Apache Hadoop, data lakes, big data analytics, and the cloud, you’ve probably noticed the see-saw hype around each of these. In 2012, there was no end in sight to what Hadoop could do, and organizations were beginning to build data lakes to augment or replace data warehouses…

July 10, 2018 - Dale Kim

What’s the Difference between Hadoop and a Data Lake

I recently participated in a webinar hosted by DBTA titled Unlocking the Power of the Data Lake, where one of the audience members asked, “Will data lakes be replacing Hadoop in the future?” I think the three speakers sufficiently answered the question on the webinar, but considering that many others might have similar questions, I…

July 3, 2018 - Paul Lashmet

Hypothetical to Actionable: From CCAR to CRE Market Factors

Introduction: DFAST and CCAR require bank holding companies to report, in detail, how they would respond to hypothetical market scenarios that represent macroeconomic shocks like a housing meltdown or a stock market crash. The data used by each company to predict losses and create a response plan must be actual data, not approximated. Using the…

June 19, 2018 - John Thuma

Five Classes of Use Cases for the Data Lake

A data lake was initially described as a storage system that held a very large amount of data in its original format until required. Originally the term data lake was synonymous with Apache Hadoop. Apache Hadoop enabled organizations to both store and compute data on commodity hardware. Apache Hadoop was a place where you had…

June 14, 2018 - John Thuma

Superheroes? Or Just the Best Women in Tech!

Superheroes are supernatural characters, many of whom have superhuman powers like flight, x-ray vision, or indestructibility. Some are mortals with loads of resources that enable them to create armored suits, amazing vehicles, and powered gadgets that give them superhuman capabilities. They are usually the protagonists in the story, and their goal is to protect the…

June 12, 2018 - Paul Lashmet

Alternative Data Strategy: How and Why

An alternative data strategy is a collaborative, iterative, and exploratory process that is driven by domain subject matter expertise. That is our take from the research survey that we commissioned from Greenwich Associates to understand how buy-side portfolio managers, chief investment officers, and fund managers “Put Alternative Data to Use.” Our primary focus was to…

June 5, 2018 - Dale Kim

Democratizing Big Data: The Power of a Unified View of Data as a Competitive Tool

Industry experts and data-driven corporations around the world know there are far too few data scientists to meet the current demand, and those who are available are expensive. As a result, we’re seeing the emergence of power users referred to as “citizen data scientists” who are able to leverage powerful big data analytics tools. This…

May 31, 2018 - Paul Lashmet

Cross-Functional Trade Surveillance: The Keystone to a Holistic Trade Surveillance Strategy

This article was originally posted on the Cloudera VISION site. Criminals don’t refer to a playbook of best practices to execute a crime. They are creative in their thinking and collaborative in their efforts (including with parties who may not know they are complicit) to obtain their objectives and avoid getting caught. Trade surveillance in the financial…

May 29, 2018 - Dale Kim

Simplifying Big Data Analytics Acceleration

In the blog post titled Beyond the Cube: Embrace Analytical Views, we discussed how analytical views represent a new way to accelerate queries in a production environment. The next blog in the series, A Closer Look at Query Acceleration with Analytical Views, discussed analytical views in more detail, including how to set them up. In…