Visualizing Data by Date, Time, and Region

Published on December 2, 2015

Arcadia Instant supports most of the date/time functions supported in Impala, SQLite, and Hive.

To demonstrate how date formatting works in Arcadia, let’s work with a sample app. We have a dataset SFPD Reported Incidents in 2014 [Data source]. The dataset shows information about the offenses reported by the San Francisco Police Department in the year 2014.

In this post, I’ll show you how to take the data from the Northern, Southern, and Central regions, and visualize:

  • Which region has the highest incidence of theft reported;
  • What has been the theft trend quarter over quarter;
  • During which hours of a day more incidents of theft occurred, and
  • When more incidents of theft occurred—over the weekdays or weekends.

Prerequisites

  1. Import the Dataset

Import the dataset SFPD Reported Incidents in 2014 to Arcadia Instant.

  1. Create Segments

After importing the dataset and before creating the app, create segments defining

  1. Quarters of a year
  • Q1 – January, February, March
  • Q2 – April, May, June
  • Q3 – July, August, September
  • Q4 – October, November, December
  1. Periods of the day
  • Morning – 5 AM to 11 AM
  • Afternoon – 12 PM to 4 PM
  • Evening – 5 PM to 9 PM
  • Night – 10 PM to 4 AM
  1. Weekdays and weekends
  • Weekdays – Monday to Friday
  • Weekends – Saturday and Sunday
  1. Business and non-business hours
  • Business hours – 8 AM to 6 PM
  • Non-business hours – 7 PM to 7 AM

The new segments show on the left navigation bar.

rtaimage

  1. Edit Attributes 

To show data by hours, you have to convert time into hours. You can do so by editing attributes of the dataset after importing it.

  1. Edit attributes of the dataset SFPD Reported Incidents 2014 using the following Expression in the Column View window. Be sure to look for the ‘Validation Successful!’ message.
strftime(‘%H’, [time])

rtaimage01

This expression converts time into hours, and you get a new dimension with the hour value.

Create an App

To turn the information into a visualization, we’ll next create an app comprised of the following visuals:

  1. A bar chart showing the number of theft incidents in the three regions quarter over quarter
  2. A multiple line chart showing the number of theft incidents in the morning, afternoon, evening and night in each region
  3. A multiple line chart showing the number of theft incidents by hour and region
  4. A multiple line chart showing the number of theft incidents during business and non-business hours
  5. A cross tabulation chart showing the average number of incidents over the weekdays and weekends
  1. Quarter over quarter comparison

Use a trellised grouped bar chart to show the incidents of theft per region in each quarter. To build the chart, populate the shelves with the following attributes:

rtaimage02

  1. From Segments, select Q1, Q2, Q3, and Q4.
  2. On the Filters shelf, from pddistrict, select Central, Northern, and Southern; from Category, select Larceny/Theft.
  3. On the X shelf, for date, use the following expression in the Enter/Edit Expression window:

This expression converts Months from number to string, e.g. 1 = January and so on.

case when strftime(‘%m’, [date]) = ’12’ then ‘December’ when strftime(‘%m’, [date]) = ’11’ then ‘November’ when strftime(‘%m’, [date]) = ’10’ then ‘October’ when strftime(‘%m’, [date]) = ’09’ then ‘September’ when strftime(‘%m’, [date]) = ’08’ then ‘August’ when strftime(‘%m’, [date]) = ’07’ then ‘July’ when strftime(‘%m’, [date]) = ’06’ then ‘June’ when strftime(‘%m’, [date]) = ’05’ then ‘May’ when strftime(‘%m’, [date]) = ’04’ then ‘April’ when strftime(‘%m’, [date]) = ’03’ then ‘March’ when strftime(‘%m’, [date]) = ’02’ then ‘February’ when strftime(‘%m’, [date]) = ’01’ then ‘January’ else null end as ‘Month’

The following chart appears. It shows that the Southern region has a higher number of reported incidents of theft and these incidents occur more in the last two quarters than in the first two. That means theft incidents increase in winters.

rtaimage03

  1. Comparison by periods of a day

Use a trellised multiple line chart to visualize the incidents of theft in the morning, afternoon, evening and night in each region. While building the chart, populate the shelves with the following attributes:

rtaimage04

  1. From Segments, select Morning, Afternoon, Evening, and Night.
  2. On the Filters shelf, from pddistrict, select Central, Northern, and Southern; from Category, select Larceny/Theft.

The following chart appears. It shows that maximum incidents of theft occurred in the evening, particularly between 6 PM and 8 PM, while the number of theft incidents was lowest between 3 AM and 4 AM. This trend shows that criminals are most active in the evening and least active after midnight.

rtaimage05

  1. Comparison between business and non-business hours

Use a trellised multiple line chart to visualize the incidents during the business and non-business hours in each district. While building the chart, populate the shelves with the following attributes:

rtaimage06

  1. From Segments, select Working hours and Non-working hours.
  2. On the Filters shelf, from pddistrict, select Central, Northern, and Southern; from Category, select Larceny/Theft.

The following visual appears. It shows that the number of theft incidents is higher during the non-business hours, especially in the evening hours.

rtaimage07

  1. Theft incidents per hour

Use an areas chart to show the theft incidents per hour in each district. While building the chart, populate the shelves with the following attributes:

rtaimage08

  1. On the Filters shelf, from pddistrict, select Central, Northern, and Southern; from Category, select Larceny/Theft.

The following visual appears. It shows that the number of theft incidents is highest in the evenings and lowest in the early mornings. This reinforces the fact that evenings are the favorite working hours of thieves.

rtaimage09

  1. Theft incidents over the weekdays and weekends

Use a cross tabulation chart to showing the incidents during the weekdays and the weekends in each district. While building the chart. While building the chart, populate the shelves with the following attributes:

rtaimage10

  1. From Segments, select Weekdays and Weekends.
  2. On Segments, click the down arrow icon and change the default aggregate from Sum to Count.
  3. On the Filters shelf, from pddistrict, select Central, Northern, and Southern; from Category, select Larceny/Theft.
  4. Place time on the Measures shelf, click the down arrow icon and enter the following expression in the Enter/Edit Expression window.

The following expression converts the total number of theft incidents to the average number over the weekends and weekdays, so we can have a fair comparison of theft trends between weekdays and weekends.

case when  [dayofweek] = ‘Friday’ or [dayofweek] = ‘Monday’ or [dayofweek] = ‘Tuesday’ or [dayofweek] = ‘Wednesday’ or [dayofweek] = ‘Thursday’ then count(incidntnum)/5 else count(incidntnum)/2 end as ‘Average Incidents’

The following table appears. It shows that the average number of theft incidents across all three regions is higher over the weekends than that over the weekdays. The data tells us that thieves work harder over the weekends.

rtaimage11

Use app control filters and enable click behavior on visuals to see information about the other incident categories the other districts.