With Big Data Comes Big Responsibility – Hadooponomics Podcast Episode 8

For nearly 2,500 years, doctors have been swearing, and not just for the reasons you or I might. The Hippocratic Oath, created by the ancient Greeks, required a new physician to swear (in its original form, by a number of healing gods) to uphold specific ethical standards. The principle most often associated with it is “First, do no harm.” Parts of the oath are still used today in most medical schools.

In this week’s Hadooponomics podcast, Emer Coleman, Chair of the Open Data Governance Board in Ireland and Director at UK-based startup Transport API, makes the case that data scientists, engineers and developers should think twice about the ethics of data engineering and science, in the same way we would expect from our physicians.

“Just like we would not put a medical doctor out into the world without having had his education or her education in ethics, we need to [do] the same for software engineers. As much as they are looking at their technology, I would want them to be understanding the socioeconomic consequences of what their work is doing because we have some of the best brains in the technology community.”

So, why would you worry about the ethics of analytics and algorithms? Coleman encourages us to think about some of our modern-day tech monoliths and the amount of data, often personal, they hold in data centers and Hadoop clusters. Google has 1.17 billion users and a body of information about our society that’s without precedent. Facebook, with its user base of 1.65 billion, recently came under fire for studying users’ emotions by manipulating their newsfeeds without their knowledge. Seem arcane? This is an election year: according to research published in the Proceedings of the National Academy of Sciences, results displayed by a search engine can affect the electoral preferences of undecided voters by up to 20 percent. Big data enables us to glean a multitude of new insights about users and business, and we probably shouldn’t merely hope that this power is used for good.

The broader social impact of data science is more important than we may realize. As algorithms increasingly influence our lives, we must accept that the data-heavy services and apps we use and work on every day can affect what large swaths of society do and feel. It seems reasonable to expect engineers to have some sort of ethical guidelines to work from.

While it’s hard to know what comes out of using the data we are all working to create and analyze, Coleman points out that we can certainly take a good hard look at what goes into it.

“I don’t think technology should be indulged in some special way, that what, it’s so special that it’s only for men? And because the world is now being created by technology and by software, that needs to represent the voices of both sexes.”

She talks at length in the podcast about the need for diversity among data scientists, because different viewpoints are also manifest in the code and algorithms we use every day.

This kind of diversity is about more than equal opportunity; it’s about avoiding monoculture. The best answers come from testing ideas against multiple perspectives:

“When we look at [big data] from a gender perspective, we know that the world has largely been coded by men … If you have the majority of developers who are coding very important parts of our infrastructure, yet they come at that from a sole perspective, it’s a kind of a scary thing, isn’t it?”

Does this mean data scientists are what RedMonk analyst Stephen O’Grady called “The New Kingmakers”? This, too, is a diversity problem. The fewer people who have access to data, the more likely it is that key new insights will go undiscovered. Data scientists have a critical role to play in enabling collaborative access to data, which, after all, is created by an ever-larger swath of the population, inside companies and in their marketplaces.

The upside is that more access to data can increase its value. The natural diversity that emerges with a larger population of users opens new perspectives and exposes new opportunities for innovation.

Gender diversity is one of the most basic places in which we in the tech community can challenge our own thinking. We have a lot of power at our disposal.

