Learn about the latest big data analytics and BI trends in data lakes, IoT analytics, data visualization, and more as you browse through these insightful posts.


August 9, 2018 - John Thuma

What is Apache Solr

This blog was first published on Medium. A LITTLE HISTORY: Developed by Yonik Seeley in 2004, Solr was an in-house project at CNET Networks to provide search capabilities for its company website. CNET Networks then donated it to the Apache Software Foundation in 2006. In 2009, Yonik Seeley joined Lucidworks, which provided commercial support, training, and consulting…

August 7, 2018 - Richard Tomlinson

Typical Cloud BI Deployment Patterns with Arcadia Data

An increasing number of our customers and prospects are asking us what options we provide for cloud BI with Arcadia Enterprise. Some customers are just experimenting, others are moving their test and dev environments off premises, and a few brave souls are all in on cloud for their enterprise production applications. The key driver for…

August 1, 2018 - Dale Kim

Is the Big Data Bully Impairing Your Analytics?

Do you know what big data is? Do you excel in big data analytics? What if I told you that you don’t actually know what big data is about? You might think you do, but you don’t. Or maybe you DO know. Interestingly, I’ve found that many people have a misconception about big data. Think…

July 31, 2018 - Paul Lashmet

Keys to a Successful Alternative Data Strategy

Asset management firms increasingly leverage alternative data to enhance their investment strategies and gain an informational edge over the rest of the market. It is becoming the normal course of business to use new types of data in alternative ways, and soon “alternative data” will simply be referred to as “data.” The ability for one firm…

July 26, 2018 - Paul Lashmet

Consolidated Audit Trail: Outside Looking In

The primary purpose of the Consolidated Audit Trail (CAT), a rule under the Securities Exchange Act, is to arm regulators with the data they need to effectively conduct market surveillance and investigations into suspicious trading activities across all national exchanges. The difference between this and current trade reporting regimes is that it covers more…

July 26, 2018 - John Thuma

FNU: For Non-Unicorns: What is Apache Spark?

This blog was first published on Medium. WARNING: This is not for the high-tech unicorns, you mythical beasts who sparkle SQL and Java and make code bloom wherever you go. This is for the regular person who wants to understand Apache Spark at a pedestrian level. There are many resources online that help you take a…

July 24, 2018 - John Thuma

Three Ways Apache Kudu Supports BI on Apache Hadoop

This article was first published on Medium. Apache Kudu is a columnar storage system developed for the Apache Hadoop ecosystem. Kudu runs on commodity hardware, is horizontally scalable, and supports highly available operation. Apache Kudu has a tight integration with Apache Impala, providing an alternative to using HDFS with Apache Parquet. Before Kudu, existing formats…

July 20, 2018 - Shant Hovsepian

Five Things Soccer Analytics Teaches Us About Data Lakes

This blog was first published on Forbes. With the World Cup upon us, it’s an apt time to draw inspiration from soccer. In 1950, Charles Reep, an accountant, attended every game of the Swindon Town soccer team’s season, tracking events and recording statistics. He analyzed his data and concluded that long passes were the most effective way…

July 19, 2018 - John Thuma

The Data Science Iron Triangle – Modern BI and Machine Learning

Originally posted here. The New Iron Triangle: It is cliché to discuss IT/business solutions as people, process, and technology. Some call it the “golden triangle,” but in this blog, we refer to it as the iron triangle. Since the 1960s, technology has disrupted business through the advent of computing and information management. These systems replaced…