Using Data to Understand Correlation between Police Encounters and Their Reasons

Published on January 11, 2017

In an era of media manipulation, propaganda news, and fabricated and sensational stories, there’s nothing you should take without a pinch of salt. The more sensitive a subject, the higher the chances of it being distorted and sensationalized. The only reports you can afford to believe are the ones backed by data, because data adds a level of credibility to a story. Having said that, data can also be misinterpreted, especially when it can potentially fuel a debate.

One of the most concerning stories of 2015 was the increasing number of deaths of criminal suspects in police encounters, or the ‘death by cop epidemic’ as the media refers to it. The police are accused of using excessive force when dealing with suspects and of being “trigger happy.” The police, on the other hand, claim that most of these fatalities were justified. They say that they open fire only under extreme circumstances, and they point to the number of police officers who lost their lives in such encounters.

Questions naturally arise: Were those who lost their lives in law enforcement encounters involved in unfortunate accidents? Did those who were shot pose a threat to the police, or were they actually innocent?

Focusing on these questions, I created an app using Arcadia Instant with the data set “The Counted” provided by The Guardian. The dataset contains details of people who died in encounters with US law enforcement.

Data Prep and Loading

For a clearer analysis, I adjusted and added a couple additional data fields as noted below.


  • Age Groups: The raw dataset records the age of each person who died. To visualize which age groups are comparatively more likely to be involved in encounters, it makes sense to group the ages into brackets. The formula used to create the age groups is:


(case
  when [Age] in ("1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15") then "Below 16"
  when [Age] in ("16", "17", "18", "19", "20") then "16-20"
  when [Age] in ("21", "22", "23", "24", "25") then "21-25"
  when [Age] in ("26", "27", "28", "29", "30") then "26-30"
  when [Age] in ("31", "32", "33", "34", "35") then "31-35"
  when [Age] in ("36", "37", "38", "39", "40") then "36-40"
  when [Age] in ("41", "42", "43", "44", "45") then "41-45"
  when [Age] in ("46", "47", "48", "49", "50") then "46-50"
  when [Age] in ("51", "52", "53", "54", "55") then "51-55"
  when [Age] in ("56", "57", "58", "59", "60") then "56-60"
  when [Age] in ("61", "62", "63", "64", "65") then "61-65"
  when [Age] in ("66", "67", "68", "69", "70") then "66-70"
  else "70+"
end)
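For readers who want to reproduce the bracketing outside Arcadia Instant, the same grouping can be sketched in Python with arithmetic instead of an enumerated list. This is an illustration, not the expression used in the app. The function name `age_group` is hypothetical, and the sketch makes two assumptions that differ from the expression above: values that cannot be parsed as a whole number get their own "Unknown" label instead of falling through to "70+", and an age of 0 is counted as "Below 16".

```python
def age_group(age):
    """Map a raw age value (string or int) to a five-year bracket label."""
    try:
        years = int(str(age).strip())
    except ValueError:
        # Assumption: unparseable ages get their own label. The case
        # expression in the article would send them to "70+" via `else`.
        return "Unknown"
    if years < 0:
        return "Unknown"
    if years <= 15:
        return "Below 16"
    if years > 70:
        return "70+"
    # Brackets start at 16 and are five years wide: 16-20, 21-25, ..., 66-70.
    lower = 16 + ((years - 16) // 5) * 5
    return f"{lower}-{lower + 4}"
```

Computing the bracket from the number keeps the labels and the boundaries in sync, so a label cannot drift away from the range it describes.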


  • Carrying Weapon: The armed field in the raw dataset contains details of the weapon carried, if any, by the deceased. To analyze whether the police used guns only against suspects armed with a lethal weapon, I grouped the values into Lethal Weapon, Non-Lethal Firearm, Vehicle, Unarmed or Unknown, and Other. The expression used for this categorization is:


(case
  when [armed] in ('No', 'Disputed', 'Unknown') then "Unarmed or Unknown"
  when [armed] in ('Firearm', 'Knife') then "Lethal Weapon"
  when [armed] in ('Vehicle') then "Vehicle"
  when [armed] in ('Non-lethal firearm') then "Non-Lethal Firearm"
  else "Other"
end)
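The weapon grouping is a plain lookup with a fallback, which can be sketched in Python as a dictionary. The names `WEAPON_CATEGORIES` and `weapon_category` are hypothetical. The raw values and category labels are taken from the expression above, and the match is assumed to be exact and case-sensitive.

```python
# Raw values of the `armed` field mapped to the categories used in the article.
WEAPON_CATEGORIES = {
    "No": "Unarmed or Unknown",
    "Disputed": "Unarmed or Unknown",
    "Unknown": "Unarmed or Unknown",
    "Firearm": "Lethal Weapon",
    "Knife": "Lethal Weapon",
    "Vehicle": "Vehicle",
    "Non-lethal firearm": "Non-Lethal Firearm",
}


def weapon_category(armed):
    """Return the category for a raw `armed` value; anything unlisted is "Other"."""
    return WEAPON_CATEGORIES.get(armed, "Other")
```

Note that "Unarmed or Unknown" combines three different raw values, so a count for that category includes cases where the weapon status is disputed or simply not recorded.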

Visualizing the Data

To answer the above questions, I created a correlation visual and applied the Age Groups filter to dig further for more insights.

Correlation Chart: The chart shows a correlation between the causes of fatalities and whether the deceased was armed, unarmed, or was in a vehicle. The following fields were used on the Dimensions, Measures, and Filters shelves.

For all age groups, the correlation chart looks like the following. Hovering over a point in the chart shows a tooltip with the numbers and percentages for both the source and the target.
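The kind of numbers such a tooltip reports can be approximated with a simple cross-tabulation. The sketch below is an assumption about what is being computed, not a description of Arcadia Instant internals: it counts each (cause, weapon category) pair and expresses that count as a percentage of the source total (the cause) and of the target total (the weapon category). The function name `flow_shares` and the sample records are hypothetical and are not taken from The Guardian's data.

```python
from collections import Counter


def flow_shares(records):
    """Count (cause, category) pairs and relate each count to both totals.

    `records` is a list of (cause, category) tuples.
    """
    pair_counts = Counter(records)
    source_totals = Counter(cause for cause, _ in records)
    target_totals = Counter(category for _, category in records)
    return {
        (cause, category): {
            "count": n,
            "pct_of_source": 100 * n / source_totals[cause],
            "pct_of_target": 100 * n / target_totals[category],
        }
        for (cause, category), n in pair_counts.items()
    }


# Hypothetical records, for illustration only.
sample = [
    ("Gunshot", "Lethal Weapon"),
    ("Gunshot", "Lethal Weapon"),
    ("Gunshot", "Unarmed or Unknown"),
    ("Taser", "Unarmed or Unknown"),
]
shares = flow_shares(sample)
```

The two percentages answer different questions. In the sample, one third of the gunshot deaths fall into Unarmed or Unknown, while half of the Unarmed or Unknown deaths were gunshot deaths, so it matters which denominator a quoted percentage uses.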

The visual makes it clear that the cause of most fatalities was gunshot. The other causes, such as Taser, Death in custody, and Struck by vehicle, took comparatively fewer lives. Since the data includes only encounters with fatalities rather than all encounters, it makes sense that a gunshot is more likely to cause a death than any of the other causes listed here.

The right side of the chart shows that of all the deceased, the majority carried weapons and most of those were lethal weapons. This, to a great extent, supports the claims of legal authorities that they use force only when faced with life-threatening situations.

To evaluate the other side of the story, the Unarmed or Unknown category needs to be analyzed. Hovering over Unarmed or Unknown shows that 59% of the deceased who died of a gunshot fall into this category. This figure lends weight to the allegation that innocent people are losing their lives in law enforcement encounters, although the category also includes cases where the weapon status is disputed or unknown.

Filtering by Age Groups

To explore the data further and visualize the numbers by age groups, those age groups can be selected from the filter.

Selecting the age groups between 0 and 30, the visual shows the following data:

Hovering over Unarmed or Unknown, the tooltip shows the figures for the deceased who were shot and were 30 years old or younger. As the chart shows, 20% were unarmed. Similarly, with the age groups from 51 upward selected, the tooltip shows 11%.
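The effect of the age filter can be sketched as a filter followed by a share calculation. Everything in this example is an assumption made for illustration: the function name `share_in_category`, the dictionary keys `age_group`, `cause`, and `armed_category`, and the sample records, which are not drawn from the dataset.

```python
def share_in_category(records, age_groups, cause, category):
    """Percentage of deaths from `cause`, within the selected age groups,
    whose weapon category equals `category`. Returns None if nothing matches."""
    selected = [
        r for r in records
        if r["age_group"] in age_groups and r["cause"] == cause
    ]
    if not selected:
        return None
    hits = sum(1 for r in selected if r["armed_category"] == category)
    return 100 * hits / len(selected)


# Hypothetical records, for illustration only.
sample = [
    {"age_group": "21-25", "cause": "Gunshot", "armed_category": "Lethal Weapon"},
    {"age_group": "26-30", "cause": "Gunshot", "armed_category": "Unarmed or Unknown"},
    {"age_group": "16-20", "cause": "Gunshot", "armed_category": "Lethal Weapon"},
    {"age_group": "26-30", "cause": "Gunshot", "armed_category": "Lethal Weapon"},
    {"age_group": "51-55", "cause": "Gunshot", "armed_category": "Unarmed or Unknown"},
    {"age_group": "21-25", "cause": "Taser", "armed_category": "Unarmed or Unknown"},
]
under_31 = {"Below 16", "16-20", "21-25", "26-30"}
pct = share_in_category(sample, under_31, "Gunshot", "Unarmed or Unknown")
```

Because the denominator is restricted to the selected age groups, the percentage changes with the filter even when the underlying counts for other groups stay the same.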

Even after accounting for the unfortunate accidents in which innocent people are caught up in a law enforcement encounter and are shot by either an escaping suspect or the police, the issue remains emotionally charged. Where a death is not an accident, the police are held accountable for the tragedy.

It is often challenging to analyze and visualize the correlation between two variables within data. The correlation chart in Arcadia Instant is a highly effective way to visualize that correlation. It helped us see that if a person is carrying a lethal weapon, there’s a high probability that they will be met with comparable force. Without the Correlation Chart in Arcadia Instant, it would not be so easy to draw this conclusion with just a glance at the data. You can try your hand at making similar visuals with Arcadia Instant, downloadable here.

Arcadia Instant, Release 4.2.1
Copyright © 2018, Arcadia Data Inc. All rights reserved.
Category: Business Analysts, How To