Rarely will you come across anyone in the business world who belittles the value of data (politicians are a different story). That’s what has made ‘big data’ such a catchy term. The importance of data science, however, starts with the science, even before data or data scientists enter the picture.
Episode 9 of the Hadooponomics Podcast features “data scientist” Dr. Bob Hayes, founder of Business Over Broadway and Chief Research Officer at AnalyticsWeek. In his recently published study on what it means to be a data scientist, Hayes explores who data scientists really are, what they do, and how to understand how they do it.
First, the research questions whether data scientists are the magical solution to the pursuit of data value. Instead, extracting value from data relies on three distinct skills: (a) expertise in the domain being evaluated; (b) the ability to use technology to manipulate data; and (c) a grasp of basic statistics. And just because you can count these on three fingers doesn’t mean there’s a large population of professionals who possess all three:
“To find somebody… who’s proficient in skills in those three areas is akin to finding a unicorn. They just don’t exist. We didn’t find any one person in our sample of over 1,000 respondents who said they were experts in all three areas.”
The sample data revealed no unicorns. These limitations are not unique to data scientists; in general, it’s rare to find anyone who has more than two of the three key skills. And here’s the second interesting part of Hayes’ observation:
“I think data science is more of a team sport. [B]igger teams had better outcomes in their data science projects. When you consider, if you have more people on the team you have different perspectives, you have diverse skills bringing to bear on your problem.”
In other words, the data skills shortage is not a lack of people with any one of these skills, but a lack of individuals who have all of them. Hayes himself does not claim to be that person, either.
“I call myself a scientist, not a data scientist, because in reality, science has always included data because we have to specify using data.”
Data science may be catchy, but it’s the scientific mindset that unlocks the data: hypothesis, testing, and repetition. This is the lifestyle of data + science. Exposing more people to data, and teaching them data fundamentals, goes a long way:
“Even your front line staff needs to know something about statistics if they want to make sense of the reports they’re given… basic stat concepts around sampling error, predictive analytics, measurement quality …the kinds of questions to ask somebody when somebody gives you a report. Where’d the data come from? What are the quality of the metrics? Why did you choose those metrics? … [I]t’s about the process of researching how the data are generated or collected and then analyzed.”
There remains one yawning gap, and it shows up in an even more basic statistic: the pressing shortage of women in the field. Across data science practitioners, only about 25 percent are female. Even modern digital-native companies like Facebook have only about 30 percent female data scientists. You can’t fix the skills shortage if you’re only drawing from half the pool.
The best way to eliminate talent barriers in the workplace is to empower a broad range of existing staff, so that people across roles understand the basics of analytics and statistics. Doing so sparks an interest in the science and lifestyle of data without relying exclusively on an elusive quest for a high-cost analyst unicorn.
To hear more of Bob’s commentary, check out The Hadooponomics Podcast, Episode 9 – “Building an effective team of data scientists.” The Hadooponomics Podcast series is produced by Blue Hill Research in partnership with Arcadia Data. You can listen to prior episodes here.