June 11, 2017 - Dale Kim | Big Data Ecosystem

Big Data Security Is Important, Right?

We all know that big data security is important, but we each have different interpretations of what “important” means. That’s not good. We all should aspire to a higher standard of data security, and I encourage you to be more diligent when it comes to protecting data. We have to be in this together because, after all, it’s not just “your” data you’re dealing with. If you work at a big consumer-oriented company, you are likely dealing with “my” data (like my credit card numbers) as well. So yes, I want you to protect it. But I acknowledge that security is hard, and that’s why data security is often a struggle.

Let me call out three reasons why data security continues to be so problematic.

Data security is burdensome, so people try to get around it.
Security necessarily requires extra work that we don’t want to do. We see examples of this all the time, particularly in news reports of “123456” and “password” as the most common passwords at large Internet sites. People who use these simple passwords are so inconvenienced by complex ones that they are essentially willing to push security out of the way. In fairness, perhaps these folks have done a risk analysis and determined that the liberation they gain from a simple password outweighs the risk of getting hacked (of course, I say that tongue-in-cheek). Either way, security is abandoned in these situations.

Many think of security as a longer-term problem, not an immediate one.
They know it’s important, and they know they must implement it, but just not now. This means that systems will get deployed with no legitimate security strategy. This is related to the point above in which security is viewed as a burden. The difference here is that the system implementers, not end users, are avoiding the burden. Far too frequently we read about computer breaches in which proper security was not in place. And oftentimes the problem was as simple as unchanged default passwords. Compliance frameworks like the Payment Card Industry Data Security Standard (PCI-DSS) specifically require changing default passwords, and that’s only one small part of ensuring your data is protected. A complete plan must be in place from the very beginning as a priority.
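As a rough illustration of how small the "unchanged default password" check can be, here is a minimal Python sketch. Everything in it is an assumption made for the example: the weak-password list, the account names, and the idea that the check runs against candidate credentials at deployment time (a real system would store only hashed passwords and would use a much larger, maintained list).

```python
# Hypothetical list of vendor defaults and very common passwords.
# This tiny set is for illustration only.
KNOWN_WEAK_PASSWORDS = {"123456", "password", "admin", "changeme"}


def find_weak_accounts(accounts):
    """Return sorted names of accounts whose password is a known weak value."""
    return sorted(
        name
        for name, password in accounts.items()
        # Compare case-insensitively so "Password" is caught as well.
        if password.lower() in KNOWN_WEAK_PASSWORDS
    )


# Made-up inventory of accounts about to be deployed.
accounts = {
    "db_admin": "admin",
    "etl_service": "T7#qv!92Lm",
    "dashboard": "Password",
}

print(find_weak_accounts(accounts))
```

A check like this could run as a gate before a system goes live, so that the "just not now" decision is at least a visible one. It covers only one small requirement and is no substitute for a complete security plan.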

Data security involves so many moving parts that it is difficult to address all of them.
Even if you think you have all your bases covered, there are other issues to consider. For example, security often requires 100 percent participation from the user base, so if a few users follow lax security practices, the whole system could be put at risk. Protecting data often entails protecting all aspects of your environment, including physical security. A common tactic among attackers is to fool authorized users into providing access. If you’ve ever watched an action movie where the protagonists break into a secure environment dressed as janitors, firefighters, or other authorized personnel, you might dismiss that as Hollywood fiction, but it does happen. And even if you think you have the right controls in place, there are always opportunities for human error, especially in complex environments.

An interesting point about the three issues above is that they all pertain to people and process, not technology. This means that in many cases, it’s not the technology that’s failing; it’s the users. How can we address this? We’re not going to fully solve security challenges instantly, but one high-level goal is to make security easier to use and implement. For ease of use, one example technology is single sign-on (SSO). SSO makes security easier by reducing the number of secure passwords you have to track, while also trying to ensure no major gaps are introduced. For ease of implementation, leveraging a unified security model in a data platform is a great approach. If you can reduce the number of distinct security models in your deployment, you also reduce the risk of error.

In a big data analytics environment, Arcadia Data takes a great approach. Our native visual analytics architecture enables in-cluster analytics, so you can easily integrate with the security model of your data platform. So not only do you get powerful, easy-to-use, and extensible visualizations that help you to build sophisticated analytical applications, but you also get a platform that enables unified security, higher scale/performance/concurrency, and greater agility. We recently announced our Apache Ranger integration, and you can read more about how an in-cluster architecture reduces the complexity of securing big data.

If you’re going to the DataWorks Summit in San Jose, be sure to listen to Arcadia Data CTO Shant Hovsepian on a panel about Apache Ranger and Apache Atlas on Wednesday, June 14, at 2:10 pm PT in room 211. Also be sure to stop by our booth. We’re between the T-shirt printing station and “The Cube” (the place where they record interviews). See you there!

