How to Prepare and Import Data

Published on November 4, 2015

The previous “Basics” section should have given you a fairly good handle on the elements of the Arcadia interface. The next sections will cover things more quickly and assume you are familiar with the major elements of the interface.

There are some interesting default visuals available in Arcadia that are not common to most visual analytics tools: a calendar heat map, correlation flow chart, radial chart, chord chart, and extension.

  • Extension lets you add almost anything by writing custom HTML. The most common use is to add explanatory text, a few useful links, and images, but the possibilities are vast. You could embed a video or even a Tableau visualization using its embed code. You could also potentially work in visuals built with D3.js, PHP, or other web technologies.
  • Correlation Flow charts and Chord charts are similar in that they let you look at movement between dimensions. Chord charts are more visually interesting, but they are often harder to read and interpret than Correlation Flow charts.
  • Radial charts are a good way to contrast a small group across many measures. You’ll contrast the activity levels (measured in five different ways) of four Internet forums that use the Stackexchange platform.
  • You’ll also look at the activity levels throughout the year in a Calendar heat map. While you could make a calendar heat map fairly easily in most visual analytics tools, it is interesting to see this as a default visual.

New Data: Stackexchange

Stackexchange is a network of forums specializing in asking questions and getting correct answers. The progenitor was the well-known programming forum Stack Overflow. In 2011, Stackexchange launched 33 additional sites, and the network has expanded since then.

Stackexchange makes an interesting dataset to explore because it contains member usage data on a transactional level. Anyone can access this data through the Stackexchange Data Explorer. There you can write a query against the database, save your queries (this happens automatically if you give a query a title and run it, but saved queries are easiest to find if you create an account), and export results to a csv (even if you don’t have an account).

In this tutorial, you’ll look at data for three sites that are good resources for professionals: Cross Validated, Geographic Information Systems, and Database Administrators.

The tutorial also includes a smaller site: Role-Playing Games. This is a hobby site that provides more of a contrast, as behavior and activity levels are notably different there.

You can download a csv of the data here.

The queries used to collect this data can be accessed here.

StackExchange User Engagement Query

Stackexchange Daily Activity Posts Comments Users


Importing New Data

Since Arcadia provides direct connections to entire databases (making a “new connection” will connect all the databases and tables on a particular server), importing new data may be fairly uncommon. However, there are always cases where augmenting data with some outside information is helpful.

You will import two csv files.

  1. Go to “Data->Import Data” (click “Samples” on the sidebar if it is not already selected, as you’ll bring this data into the existing Samples SQLite connection).
  2. This will bring up a new dialogue. Choose the StackUserShift.csv (that you’ve downloaded here) and click get data.
  3. This will open the data import screen, which lets you review and rename your new data connection. Here you can double-check that all your fields are coming in as expected and that their datatypes are correct, and you can also change their names. If you prefer field names to start with a capital, you can change that here. Rename the “Tablename” from the current name to “quarterly_stackexchange_engagement_shifts.”
  4. You can click “Apply Changes” to ensure that your changes are accepted. Once you are done reviewing this, click the blue “Confirm Import” button.
  5. You’ll get a popup informing you that you still need to create a dataset from the data connection menu. This is your next step.
  6. In “Data->Samples->Connection Explorer,” find “quarterly_stackexchange_engagement_shifts” and click “New Dataset.”
  7. Enter a Title for your new dataset, “Stackexchange Shifts,” and click “Create.”
  8. You’ll now see it at the top of the Datasets screen, and it will be visible throughout the interface.
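
If you would like to check field names and datatypes before opening the import screen in step 3, a short script can preview the csv. The following is only a minimal sketch in Python and is not part of Arcadia: the function names `guess_type` and `preview_fields` are hypothetical, the three type labels are a simplification, and the sample column names are made up for illustration, since the real columns depend on the file you downloaded.

```python
import csv
import io
from itertools import islice


def guess_type(values):
    """Guess a simple column type (INTEGER, REAL, or TEXT) from string values."""
    non_empty = [v for v in values if v.strip() != ""]
    if not non_empty:
        return "TEXT"
    # Try the strictest type first; fall back to TEXT if nothing parses.
    for type_name, caster in (("INTEGER", int), ("REAL", float)):
        try:
            for v in non_empty:
                caster(v)
            return type_name
        except ValueError:
            continue
    return "TEXT"


def preview_fields(csv_file, sample_rows=100):
    """Return {field_name: guessed_type} based on the first sample_rows rows."""
    reader = csv.DictReader(csv_file)
    rows = list(islice(reader, sample_rows))
    return {
        name: guess_type([row[name] or "" for row in rows])
        for name in (reader.fieldnames or [])
    }


# Illustration with made-up data (hypothetical columns, not the real file):
sample = io.StringIO(
    "site,quarter,users\n"
    "Cross Validated,2015Q1,120\n"
    "Role-Playing Games,2015Q1,45.5\n"
)
preview = preview_fields(sample)

# With the real file, you would instead run something like:
# with open("StackUserShift.csv", newline="") as f:
#     print(preview_fields(f))
```

A column that comes back as TEXT when you expected a number is worth a closer look on the import screen before you click “Confirm Import.”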

Follow the same steps to import the StackActivity.csv (which you can download from here). Name the data connection “Stackexchange_Activity” and the dataset “Stackexchange Daily Activity.” You may also want to rename the fields with the first letter capitalized and change “date_activity” to “Activity Date.”
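
If the file has many fields, it can help to work out the new names ahead of time. The sketch below shows one way to express the renaming rule in Python, assuming you just want a list of names to type into the import screen; `display_name` is a hypothetical helper, and every field name other than “date_activity” is made up for illustration.

```python
def display_name(field, overrides=None):
    """Return a display name: use an explicit override, else capitalize the first letter."""
    overrides = overrides or {}
    if field in overrides:
        return overrides[field]
    # Only the first character changes; the rest of the name is left as-is.
    return field[:1].upper() + field[1:]


# "date_activity" gets the explicit name from the tutorial;
# "posts" and "comments" are hypothetical field names.
overrides = {"date_activity": "Activity Date"}
fields = ["date_activity", "posts", "comments"]
renamed = [display_name(f, overrides) for f in fields]
```

Here `renamed` would hold “Activity Date,” “Posts,” and “Comments,” which you can then enter by hand on the import screen.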

Creating Interesting Visuals is continued here – Creating a Radial Chart