Using drill downs on Big Data to learn about the streets of NYC

Published on July 27, 2016

In 2015, pedestrian-related accidents occurred at a rate of almost 25 per hour in New York City, with about six people hurt (or worse) every hour. I wanted to find answers to the following questions:

  • What are the major causes of motor vehicle accidents?
  • What are the most accident-prone areas/roads in NYC?
  • Is there any particular time of day when more accidents occur?
  • Do pedestrians face a higher risk of injury or death in accidents than cyclists or motorists?
  • Are any particular types of motor vehicles involved in accidents more often than others?

Data preparation and loading

I used the dataset provided by the NYPD here to see what answers I could find. I also created a couple of new data fields for better analysis and visualization. The dataset had several separate fields for the causes and vehicle types involved in each accident, so I combined them: one field now contains all the vehicle types, causes are grouped into broader categories, and the time is divided into periods of the day. I also renamed the fields for readability. The attributes I created include:

  • Time Period: This groups the time of accidents into periods of a day.
  • Cause of Accident: This groups different causes into broader categories like Driver’s Fault or Instrument Failure to figure out what causes the most accidents.
  • Accident Type: This classifies accidents into Fatal, Non-fatal, and No Injuries based on the number of injuries and deaths.
  • Vehicle Type: This groups all the vehicles into broad vehicle types.
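To make the derived attributes above concrete, here is a minimal Python sketch of how such fields could be computed for one record. It is not the preparation actually used for the app: the hour boundaries, the period labels, the raw cause strings, and the function and key names are all assumptions chosen for illustration.

```python
from datetime import time

# NOTE: every boundary, label, and mapping below is an illustrative assumption;
# the post does not state the actual rules used to build these fields.

def time_period(t: time) -> str:
    """Bucket a time of day into a coarse period (hypothetical cut-offs)."""
    if 5 <= t.hour < 11:
        return "Morning"
    if 11 <= t.hour < 16:
        return "Noon"
    if 16 <= t.hour < 21:
        return "Evening"
    return "Night"

def accident_type(persons_injured: int, persons_killed: int) -> str:
    """Classify an accident from its injury and death counts."""
    if persons_killed > 0:
        return "Fatal"
    if persons_injured > 0:
        return "Non-fatal"
    return "No Injuries"

# Hypothetical raw cause strings mapped to broader categories.
CAUSE_CATEGORIES = {
    "Driver Inattention/Distraction": "Driver's Fault",
    "Failure to Yield Right-of-Way": "Driver's Fault",
    "Brakes Defective": "Instrument Failure",
}

def cause_category(raw_cause: str) -> str:
    """Look up the broad category; anything unmapped falls into 'Other'."""
    return CAUSE_CATEGORIES.get(raw_cause, "Other")

# A made-up record, only to show the functions working together.
record = {"time": time(18, 30), "injured": 2, "killed": 0, "cause": "Brakes Defective"}
derived = {
    "Time Period": time_period(record["time"]),
    "Accident Type": accident_type(record["injured"], record["killed"]),
    "Cause of Accident": cause_category(record["cause"]),
}
print(derived)
```

Vehicle Type could be grouped the same way as Cause of Accident, with a lookup table from raw vehicle codes to broad types.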

Using Arcadia Instant, I created an application to visualize all of this information in a single view. Now I can see data for all of NYC or for one particular borough. One cool feature of Arcadia Instant is dynamic selection of the dimension or measure shown in the visuals: you set a parameter in the application, and the visuals update to show whichever measure that parameter selects.
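The idea of a parameter that picks the measure can be illustrated outside the tool. The sketch below is plain Python and is not Arcadia Instant's API: the `MEASURES` table, the `measure_by_borough` function, the row keys, and the toy rows are all hypothetical. Only the option names ‘Accidents’ and ‘Pedestrians Fatalities’ come from the app's menu.

```python
from collections import defaultdict

# Each menu option maps to a function that aggregates a group of rows.
# Row keys ("borough", "pedestrians_killed") are assumed names.
MEASURES = {
    "Accidents": lambda rows: len(rows),
    "Pedestrians Fatalities": lambda rows: sum(r["pedestrians_killed"] for r in rows),
}
DEFAULT_MEASURE = "Accidents"

def measure_by_borough(rows, selected=DEFAULT_MEASURE):
    """Group rows by borough and apply whichever measure the parameter names."""
    measure = MEASURES[selected]
    groups = defaultdict(list)
    for row in rows:
        groups[row["borough"]].append(row)
    return {borough: measure(group) for borough, group in groups.items()}

# Toy rows, not real data.
rows = [
    {"borough": "Brooklyn", "pedestrians_killed": 1},
    {"borough": "Brooklyn", "pedestrians_killed": 0},
    {"borough": "Queens", "pedestrians_killed": 0},
]

print(measure_by_borough(rows))                            # default measure
print(measure_by_borough(rows, "Pedestrians Fatalities"))  # parameter changed
```

Changing the `selected` argument plays the role of choosing a different option in the menu: the grouping stays the same and only the aggregated value changes.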

Note that the default selection is ‘Accidents’ in the dropdown menu at the top of the app. The initial app looks like this:

1-app

The app consists of the following visuals:

  1. Summary: The three tables at the top show summary numbers for 2015. The numbers refresh when you click on a particular borough within the Borough – Accidents bubble chart. We’ll start by comparing the number of injuries and fatalities among pedestrians and motorists. Injuries: 10,081 pedestrians and 36,978 motorists. Fatalities: 133 pedestrians and 95 motorists. The disparity shows that accidents are far deadlier for pedestrians: there are about 1.3 pedestrian deaths for every 100 pedestrian injuries, versus about 0.26 for motorists, roughly a fivefold difference.

2-summary
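The pedestrian versus motorist comparison can be checked with a few lines of arithmetic. This sketch uses only the four summary figures quoted above; the metric "deaths per 100 injuries" and the function name are my own framing, not something the app reports.

```python
# Summary figures for 2015 as quoted in the post.
injuries = {"Pedestrians": 10081, "Motorists": 36978}
fatalities = {"Pedestrians": 133, "Motorists": 95}

def deaths_per_100_injuries(group: str) -> float:
    """Fatalities relative to injuries, scaled to 100 injuries."""
    return 100 * fatalities[group] / injuries[group]

pedestrians = deaths_per_100_injuries("Pedestrians")
motorists = deaths_per_100_injuries("Motorists")

print(round(pedestrians, 2))              # 1.32
print(round(motorists, 2))                # 0.26
print(round(pedestrians / motorists, 1))  # 5.1
```

Motorists are injured far more often in absolute terms, but relative to injuries, deaths are about five times as frequent among pedestrians.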

  2. Borough – Accidents: This Packed Bubble Chart shows a comparison of the number of accidents in each borough. A variety of metrics can be selected from the dropdown menu. Clicking on a particular borough refreshes the data within all the visuals for that particular borough. Hovering over a borough shows the exact number of accidents in a tooltip. The visual below shows that Brooklyn has the highest number of accidents, closely followed by Manhattan and Queens.

3-bubble-chart

4-bubble-chart-2

  3. Most Dangerous Streets: The bar chart below shows the streets with the highest number of accidents. This is the default metric and can be changed from the dropdown menu. The highest number of accidents occurs on Broadway. Could it be that those bright Broadway lights are distracting drivers?

5-bar-chart

  4. Time of Day: This bar chart shows the number of accidents for different time periods during the day. The number of accidents is highest in the evening. Perhaps drowsy driving, or impatience to get home in time for the latest episode of a favorite television show, increases the chance of an accident.

6-bar-chart-2

  5. Accident Trend: Next, I looked at whether the number of accidents varies with the time of year. In the line chart below, there is a slight peak in October, which might be due to roads made slippery by the weather or the increased number of pumpkins on the road.

7-line-chart

  6. Vehicles Involved in Accidents: This Horizontal Bar chart shows the types of vehicles involved in the collisions. The chart makes it quite clear that most were passenger vehicles, which makes sense given how many more of them are on the roads than other types of vehicles. The chart also distinguishes the type of accident by color. In most accidents, no one is injured; in the rest, most of the injuries are non-fatal.

8-horizontal-bar-graph

  7. Cause of Death: This Donut chart shows the factors that led to accidents. ‘Driver’s Negligence’ is the biggest one, and it includes both driving under the influence and ignoring traffic rules.

9-pie-chart

Now let’s dig deeper into the visuals by clicking on Brooklyn in the bubble chart.

The visuals refresh and show that Atlantic Avenue is the most accident-prone street; the largest number of accidents occurs in the evening, closely followed by noon. A large number of pedestrians were killed in these accidents in Brooklyn, almost three times the number of motorists killed.

10-refresh-1
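Clicking a borough amounts to filtering the data and recomputing every visual on the filtered rows. The sketch below shows that drill-down pattern in plain Python for the street ranking. The function names, row keys, and toy rows are hypothetical and say nothing about how Arcadia Instant implements it; the counts are made up and are not the real figures.

```python
from collections import Counter

def drill_down(rows, borough=None):
    """Keep all rows when no borough is selected, otherwise only that borough's rows."""
    return [r for r in rows if borough is None or r["borough"] == borough]

def top_streets(rows, n=3):
    """Rank streets by the number of accident rows recorded on them."""
    return Counter(r["street"] for r in rows).most_common(n)

# Toy rows, not real data: each row stands for one accident.
rows = [
    {"borough": "Manhattan", "street": "Broadway"},
    {"borough": "Manhattan", "street": "Broadway"},
    {"borough": "Manhattan", "street": "Broadway"},
    {"borough": "Brooklyn", "street": "Atlantic Avenue"},
    {"borough": "Brooklyn", "street": "Atlantic Avenue"},
    {"borough": "Brooklyn", "street": "Linden Boulevard"},
]

print(top_streets(drill_down(rows), n=1))              # city-wide view
print(top_streets(drill_down(rows, "Brooklyn"), n=1))  # after clicking Brooklyn
```

The same filtered rows would feed the other visuals (time of day, vehicle type, cause), which is why a single click refreshes the whole app.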

Now we switch to data on pedestrian fatalities in Brooklyn. With the option ‘Pedestrians Fatalities’ selected from the dropdown and Brooklyn clicked, the visuals show that the largest number of accidents that killed pedestrians happened on Linden Boulevard. The evidence indicates that Brooklynites might want to be careful walking along that road, and city planners may want to investigate why there are so many fatalities there.

11-refresh-2

The data shows that although the number of accidents is very high, most do not involve fatalities. The visuals also suggest that motorists must take more responsibility, not only for themselves but also for the safety of others. The number of deaths, 243, might not appear large compared with the number of accidents, but absolutely nothing can replace a human life! For me, this number is still too large. Be careful out there.

Bringing these visuals to life in Arcadia Instant made the journey of discovery easier. Want to try your hand at it? Go ahead and download Arcadia Instant here.