This tutorial was written by Katherine Walden, Digital Liberal Arts Specialist at Grinnell College.
This tutorial was reviewed by Sarah Purcell (L.F. Parker Professor of History) and Gina Donovan (Instructional Technologist) at Grinnell College, and edited by Papa Ampim-Darko, a student research assistant at Grinnell College.
This tutorial incorporates elements of the Calling Bullshit curriculum developed by Carl Bergstrom and Jevin West at the University of Washington.
Visualizing Data (Excel and Tableau) is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
Data Visualization with Microsoft Excel
We’ll use the same 1870 Federal Census Grinnell Township file to build preliminary data visualizations in Microsoft Excel. Navigate to https://sarahjpurcell.sites.grinnell.edu/digital_methods/files/1870_Federal_Census_Grinnell_Township.xlsx in a browser if you need to download the file again.
1-Start in cell A1 and select all cells in the sheet that contain data. Click Insert-> PivotChart.
6-The default Value and bar chart is showing us a count of the number of data points represented in each field.
9-Make this change for both fields.
11-You can right click on various parts of the bar chart to customize colors and labels.
12-Experiment with other PivotChart functions and other data fields to generate different types of visualizations.
What types of visualizations were you able to generate in Excel using PivotChart? How could those visualizations shape or impact your understanding of the data? Did you generate any visualizations that were confusing or misleading? Alternatively, did you generate any visualizations that were unexpected or illuminating?
Data Visualization in Tableau
Tableau is a software company founded in Silicon Valley in 2003. Developed by researchers affiliated with Stanford University’s Computer Science Department, Tableau is a commercialized application of academic research. Represented as DATA on the New York Stock Exchange after a 2013 initial public offering, Tableau reported $877 million in revenue in the 2017 fiscal year. Most often deployed in business environments, Tableau Desktop is a subscription-based data analysis and visualization software. Tableau Server and Tableau Online offer subscription-based web-publishing options for making data and interactive visualizations available on the web. Tableau Public offers limited Tableau Desktop functionality with some options for uploading visualizations through the Tableau Public website.
13-Open Tableau by clicking on the Desktop icon, or searching in the Start menu.
14-Click on Microsoft Excel and navigate to the 1870 Federal Census Grinnell Township file saved to your Desktop.
16-Tableau’s Data Source previews the structure of the data we loaded from the Excel file. Tableau determines what type of data is contained in each field (integer number values, dates, geographic spatial information, strings of letters or characters, etc.).
17-If we wanted to analyze or visualize data from multiple tables, Tableau’s Data Source tab offers some functionality for joining tables and building a database structure.
How is the Federal census data in Tableau presented or organized differently than the same information in Excel? What similarities or differences do you notice between the two user interfaces for building data visualizations?
19-Tableau distinguishes between Dimensions (data fields that cannot be aggregated) and Measures (data fields that can be aggregated–or have mathematical operations performed on them).
20-To visualize the distribution of male householder occupations, we need a graph that counts the number of names represented for each distinct occupation term.
22-Tableau’s default visualization is a table that shows where last names and occupations intersect in the data.
23-To visualize the distribution of occupations, we need to calculate the count of Last Name data points for each Occupation.
25-Select Measure and Count to convert the Last Name field to a calculation for each Occupation category.
27-Move your cursor over the chart to see the interactive data points.
28-Tableau allows chart customization with the Marks panel. For example, dragging the “Occupation of male householder” field from Dimensions to the Color icon in Marks colors the bars according to the different Occupation categories.
30-Experiment with other data fields and calculations to generate different types of visualizations. You can add new worksheets or duplicate an existing worksheet to build multiple visualizations.
What types of visualizations were you able to generate in Tableau? How were those visualizations similar or different than what you generated in Excel?
How could those visualizations shape or impact your understanding of the data? Did you generate any visualizations that were confusing or misleading? Alternatively, did you generate any visualizations that were unexpected or illuminating?
Uploading to Tableau Public
One of Tableau’s features is that it allows users to upload interactive data visualizations to the web and embed them in websites and other online spaces.
If you have additional time, create a free profile on Tableau Public’s website.
Click File-> Save to Tableau Public, and use your login credentials to save your Tableau workbook to Tableau Public’s website.
Your Tableau Public online profile can host your interactive visualizations, and also gives you the option to share, download, and embed the interactive content.
A wide range of data analysis and visualization programs are available, depending on what types of data you are wanting to analyze and what the end goal is for your data analysis and visualization. Software programs like Tableau, Plotly, and Microsoft Power BI are subscription-driven software programs that provide a range of interactive data visualization options. Such software programs have a low barrier to entry and allow a user with minimal technical expertise to quickly generate powerful, interactive visualizations.
However, many digital scholars, including the authors of Exploring Big Historical Data, encourage using open-source or freely-available software and programs for historical analysis whenever possible, particularly in education or research applications of digital tools. An open-source ethos can make code and data available via platforms like GitHub, because the tools used to analyze and visualize the data are not proprietary and are freely available.
Final reflection questions:
- How did you understand or think about data analysis and visualization before starting this tutorial? How has your understanding or thinking changed or shifted by the end of the tutorial?
- How would you describe different types of data or data structures to someone who was learning about them for the first time? What questions do you have about data types and structures as they were presented in this tutorial?
- Thinking about Blevins’s sample data and the Iowa Township data, how did thinking about the source and structure for the data shape your understanding of it?
- How would you compare your experience analyzing and visualizing data in Microsoft Excel and Tableau? What similarities and differences did you notice between the two programs?
- What other thoughts, questions, or ideas about data analysis and visualization did you have while going through this tutorial? How can you think more critically about the data visualizations you encounter on the internet or in journalistic publications?