If there’s one thing that’s getting more press than Big Data, it’s data security. After all, the FBI and Apple didn’t lock horns about Hadoop. Let’s put the question directly: are Big Data and data security on a collision course?
At the very least, as Cloudera VP Tim Stevens put it, “Security is a key inflection point on the road to faster, broader Hadoop adoption.” So in this week’s segment of the Hadooponomics Podcast, host James Haight speaks with Eddie Garcia, Chief Security Architect at Cloudera. The first observation? Many of the fastest-growing use cases and industries are those where security concerns loom large: healthcare, financial services, telecom, and of course the public sector. In every one of these, the increasing volume of data creates more points where it can be accessed. Says Garcia:
As you start to collect all this data … the value of the data together is much more than the individual pieces. So if you think about the siloed systems, and maybe have their own security controls, maybe something with a database and a data warehouse, and this other repository, or maybe just going through the logs and throwing away. But now you’re actually taking all of these data sets, putting them together, and as a collection they’re even much more valuable.
So were we better off with siloed systems, now that Big Data has a Big Target painted on its back? Silos actually cut both ways. The balkanization of small data systems may have created security through obscurity, but perhaps we just didn’t know how bad things were. Again, Garcia observes:
We see organizations that in the past would take over 200 days to realize that they have been breached. We need to apply the tools that understand these threats more practically, real-time, and when you can’t really catch them in the action, be able to notice it within hours or days. And that requires a lot of data aggregation.
Clearly, there’s room for improving the platform. The leverage available in the consolidated landscape of the data lake means that you need not double your security team as you double your data. Does that mean that Big Data is its own worst enemy when it comes to security? Or is there another way? Machine learning?
Actually, the upside is that the platform technologies that make up the Hadoop stack include powerful new tools that make it easier for humans to find anomalies and act on them faster.
We’re far from getting to a point where a human doesn’t have to look at it. There’s still gonna be false positives, there’s certain things that still are going to require someone to take a look at. But being able to very quickly, within minutes or hours, being able to put the most relevant, important, and at-risk information on their monitor, the better we’re gonna be at early detecting and preventing these large types of security breaches that we’ve seen in the past.
All that data aggregation? What makes it work best is when you can put it in front of a human, and make it easier to visualize, to dig through large volumes of information, and to do so securely. Turns out that humans may not just be what causes collisions; keeping humans in the loop may be the best way to keep big data secure.