How to Use a Variety of Visuals to Show Different Trends

Published on November 2, 2016

In a legal system premised on the presumption of innocence, it’s hard to think of a more fundamental issue than the death of alleged criminal suspects at the hands of police officers. Confrontations between the authorities and citizens feed directly into urgent political dialogue, particularly of late in minority communities in the US facing longstanding socioeconomic challenges.

Without question, each death is a tragedy; controversy erupts around questions of whether any one death was avoidable. We need to remember that people are on both sides of the equation, and mistakes, tragically, are not impossible.

To this volatile mix, add the question of patterns: are there similarities between such incidents that highlight systemic bias, or a fixable problem? Are there situations in which law enforcement erred on the side of racial bias or used excessive force? Are there factors causing these violent encounters – race and ethnicity or economic factors like unemployment rate, education or personal income?

With these questions in mind, I created an app using Arcadia Instant with the Police Killings dataset. This allowed me to visualize the information needed to come to some conclusions. To make the analysis clearer, I adjusted certain data fields and added new fields as noted below.

Data Prep and Loading

The raw dataset contains details of fatalities in police encounters – location and date of the incident, ethnicity and age of the deceased, cause of death, and whether the deceased was armed or not. It also includes fields like tract population, median income, poverty, and unemployment rate. The new fields added to the dataset are:

  • Population: The dataset only had the tract-level population where the incidents took place; for a more complete picture, statewide population was added from an external source.
  • Incidents/Million: To show how many incidents occur per one million people, a calculated field was added: the number of incidents in each state divided by the state population in millions.
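The Incidents/Million calculation can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the expression used in Arcadia Instant; the function name and the sample figures are hypothetical and are not taken from the dataset.

```python
def incidents_per_million(incidents: int, population: int) -> float:
    """Return the number of incidents per one million residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    # Express the population in millions, then divide the incident count by it.
    return incidents / (population / 1_000_000)


# Hypothetical figures for illustration only (not from the dataset):
# 10 incidents in a state of 4 million people.
rate = incidents_per_million(10, 4_000_000)
print(rate)  # 2.5
```

Normalizing by population this way is what lets a small state be compared with a large one on the same scale.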

A few additional attributes were added in Arcadia Instant:

  • Age groups: The deceased were segmented into different groups e.g. “16-20”, “21-25”,… based on their age using the following expression:

(case when [age] in ("16", "17", "18", "19", "20") then "16-20" when [age] in ("21", "22", "23", "24", "25") then "21-25" when [age] in ("26", "27", "28", "29", "30") then "26-30" when [age] in ("31", "32", "33", "34", "35") then "31-35" when [age] in ("36", "37", "38", "39", "40") then "36-40" when [age] in ("41", "42", "43", "44", "45") then "41-45" when [age] in ("46", "47", "48", "49", "50") then "46-50" when [age] in ("51", "52", "53", "54", "55") then "51-55" when [age] in ("56", "57", "58", "59", "60") then "56-60" when [age] in ("61", "62", "63", "64", "65") then "61-65" when [age] in ("66", "67", "68", "69", "70") then "66-70" else "70+" end).

  • Population type: The population was classified into broad buckets based on ethnicity – White/Caucasian, Minority, Unknown or No Record for all people using the following expression:

(case when [raceethnicity] in ("Asian/Pacific Islander", "Black", "Native American", "Hispanic/Latino") then "Minority" when [raceethnicity] in ("Unknown") then "Unknown" when [raceethnicity] in ("White") then "White" else "No Record" end).

  • Population percentage: The field, which uses statewide population data from an external source, shows the percentage share of each of the four population types. In order to show the percentages in Arcadia Instant, the values were entered in decimal format. When this field is displayed as a percentage it is visible in the appropriate format. The expression used is:

(case when [Population Type] in ("White") then 0.61 when [Population Type] in ("Minority") then 0.37 when [Population Type] in ("Unknown") then 0.02 else 0 end).

  • Armed/Unarmed: This field groups the incidents on the basis of whether the deceased was armed and if the weapon they used was lethal. The expression used to create this field was:

(case when [armed] in ("No") then "Not Armed" when [armed] in ("Non-lethal firearm") then "Non-lethal firearm" when [armed] in ("Disputed", "Unknown") then "Unknown" when [armed] in ("Firearm", "Knife") then "Lethal Weapon" when [armed] in ("Vehicle") then "Vehicle" when [armed] in ("Other") then "Other" else null end).

  • MonthNumber: In this field, the months were converted into their numeric values. This was done to sort the months chronologically rather than alphabetically.
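For readers who prepare the data outside Arcadia Instant, the age bucketing and month conversion above could be approximated in Python as follows. This is a rough sketch under stated assumptions: the function names are hypothetical, ages are assumed to be integers already, month names are assumed to be full English names, and the "Under 16" bucket is my own addition (the original expression sends every unlisted value to "70+").

```python
MONTH_NAMES = (
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
)
# Map each month name to its calendar position (January -> 1).
MONTH_NUMBER = {name: index for index, name in enumerate(MONTH_NAMES, start=1)}


def age_group(age: int) -> str:
    """Map an age to a five-year bucket such as "16-20" or "66-70"."""
    if age < 16:
        return "Under 16"  # assumption: not a bucket in the original expression
    if age > 70:
        return "70+"
    # Buckets start at 16 and are five years wide: 16-20, 21-25, ..., 66-70.
    low = 16 + ((age - 16) // 5) * 5
    return f"{low}-{low + 4}"


def month_number(month_name: str) -> int:
    """Return 1 for January through 12 for December."""
    return MONTH_NUMBER[month_name]


print(age_group(33))  # 31-35
print(sorted(["March", "January", "February"], key=month_number))
```

Computing the bucket arithmetically avoids listing every age by hand, which is where a mislabeled bucket is easiest to introduce.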

Visualizing the Complete Story

Using this data, a number of visuals were created and stitched together into a single dashboard shown below. Within the dashboard, it is also possible to drill down to the state level.

The visual components of the application include:

Police Incidents: Shows a tabular summary of the total number of incidents, cities, and law enforcement agencies involved.

Incidents/Million by State: The visual, a State Map, shows the number of encounters per one million people in each state, colored by the density of incidents. Hovering over the states shows that although California has the highest overall number of incidents (74), the number of incidents per one million people is highest in Oklahoma.

Cause of Death: The visual, a Horizontal Bar Graph, shows the various causes of death in the incidents and whether the deceased was armed. We can easily see that most were armed; of the people who were shot, 260 were carrying a lethal weapon.

Unemployment Rate, Personal Income, and College Rate: The Line Chart below, a Trends Comparison visual, shows a comparison of economic factors like Unemployment Rate, Personal Income (median) and College Rate (25+ year old population with a B.A. or higher) with the incidents over a period of five months.

Population Percent by Type: The Donut Chart below shows the percentage of minority vs. Caucasian population in the United States for the year 2015. It shows that all minorities accounted for 37% of the US population.

Incidents by Population Type: This donut chart shows the number of incidents for each population type. According to this dataset, Caucasians account for about half of the fatalities; 236 fatalities out of a total 467.

Age Groups: The number of fatalities involving different age groups is shown in the bar chart below. It’s evident that the largest number of fatalities is in the 31-35 age group, followed closely by the younger 26-30 group and the older 36-40 group.

Now let’s see how these visuals shed light on the questions we pondered in the beginning. Comparing the composition of the population and the number of incidents for each population type, this is what we see:

Almost half of the incidents (46.25%) involved minorities (including Black, Asian/Pacific Islander, Native American, and Hispanic/Latino people), while these groups account for only 37% of the US population. Dividing each group's share of incidents by its share of the population gives about 1.25 for minorities (46.25% / 37%) and about 0.83 for Caucasians (236 of 467 incidents, or roughly 50.5%, against 61% of the population). Relative to population, incidents involving minorities therefore occur about 1.5 times as often as incidents involving Caucasians. We have to keep in mind that this data covers only a short five-month period and is limited, but a certain racial bent is evident. We also need to keep in mind that the cause of that bent is not clear based on this data alone.
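The comparison above can be reproduced with a short Python sketch. The figures are the ones quoted in this article (46.25% of incidents and 37% of the population for minorities; 236 of 467 incidents and 61% of the population for Caucasians); the function name is hypothetical and the code only illustrates the arithmetic.

```python
def representation_index(incident_share: float, population_share: float) -> float:
    """Share of incidents divided by share of population (1.0 means proportional)."""
    return incident_share / population_share


# Figures quoted in the article.
minority_index = representation_index(0.4625, 0.37)
white_index = representation_index(236 / 467, 0.61)

# How much more often incidents involve minorities, relative to population.
relative_rate = minority_index / white_index

print(round(minority_index, 2))  # 1.25
print(round(white_index, 2))     # 0.83
print(round(relative_rate, 2))   # 1.51
```

An index above 1.0 means a group appears in the incidents more often than its population share alone would suggest; as noted above, it says nothing about the cause.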

We can also look into how the deceased were killed:

We see that most people were killed by gunshots. It’s also evident that the majority of them were carrying weapons and many of them were carrying lethal weapons, which leads us to believe that most were killed because they were perceived as a threat.

Comparing the incidents to economic factors like unemployment rate, income, and education, we see the following trends:

Looking at the graphs above once again, one cannot definitively say whether these factors affect the number of violent encounters or not. The theory was that the crime rate increases with unemployment and decreases with higher incomes or a higher level of education. With the limited data used here, we cannot conclude anything concrete.

From the data, we cannot make definitive comments on questions such as the extent to which economic factors contribute to crimes and subsequently to fatalities in police encounters. There may be instances of racial bias, but the data available do not provide a clear indication that this is so. We can infer that, setting aside rare accidents, US police officers use their weapons primarily under extreme circumstances when they are faced with life-threatening situations. As with many issues so emotionally charged, there are no simple answers.

Still, using data to better understand the issues at hand is a good exercise to attempt. If you’d like to try your hand at something similar or build on this application, give Arcadia Instant a try.