This project is the result of nearly quitting my doctoral program. I moved to the other side of the world, finished the comprehensive essays required by my program, and selected a dissertation topic that was academically rigorous and utterly boring. I started teaching with the UB-SIM program, loved my job and the students, and thought, “Maybe I don’t need the PhD after all.”
My advisor wisely encouraged me to take a leave of absence instead of quitting, at which point I realized my students were doing exciting things with the digital tools and social media I used in the classroom. So I leaned into their work and thought, What are these digital tools doing to their study and understanding of history? Thus began “Hashtag History.”
Creating this project required learning a variety of new methods and tools and using familiar ones in new ways. It would not have been possible without the support of a broad network of scholars and experts who generously shared their time and expertise. The digital nature of this dissertation was also encouraged and supported by my advisors and program. Their flexibility and openness to a digital dissertation were essential to completing the project.
New Methods & Tools
IRB Proposal & Approval
Before I could even begin data collection for this project, I needed the approval of an Institutional Review Board (IRB) to collect data from human subjects. Acquiring public data from social media does not require IRB approval, and I ultimately did not include any private data in “Hashtag History” because only a handful of students provided consent. However, I originally intended to use students’ privately submitted discussion reflections as well as their publicly produced tweets and blog posts.
Seeking IRB approval proved difficult partly because historians don’t usually need IRB approval (our subjects are typically long dead) and partly because I was a member of two institutions with vastly different requirements for IRB approval. Because my intended participants were students at the University at Buffalo (UB) but I was a graduate student at Drew University, the two institutions could not agree on whose approval I needed to seek. It took nearly eight months to sort out.
Lois Levy, the IRB consultant at Drew University, and Christian Marks, the director of UB’s Social and Behavioral Research Support Program, offered their guidance and helped me resolve the IRB process. Lois gave me a crash course in how to speak the language of the IRB and patiently read multiple drafts of my proposal to Drew’s IRB committee. Chris responded to my queries with lengthy emails and a late-night Skype conversation with me that finally clarified who I needed approval from (Drew only). His expertise was invaluable in forwarding the progress of this dissertation.
Once I received IRB approval from Drew, data collection and analysis for this project involved web scraping, text mining, and data visualization. Ian Milligan, from
Initial Analysis & Visualization
To analyze and visualize the collected data, I initially used Voyant (an open-source, free, web-based analysis package, also introduced by Ian) and Tableau (which offers a free student license and web-based version). Both tools served me well in my initial explorations of the data but fell short in a few significant ways. Voyant is useful for in-the-moment analysis of data, but saving datasets and visualizations is difficult.
Tableau produces beautiful visuals and is handy for embedding dynamic visualizations in which users can view the data points that make up a given column or point on a graph. The stories feature is an asset too. However, Tableau proved challenging to navigate when I wanted to join datasets or work with tidy data. The resources and tutorials available for researchers working with text instead of numeric data are limited, and I was often unable to locate solutions for the issues I encountered.
In addition to Voyant and Tableau, I originally used coding to analyze my data. Coding is a best practice in the social sciences and initially offered a way to label my data without learning an intensive programming language, like R or Python. I also knew I could draw on my colleagues’ expertise to navigate this methodology since I work with educators and researchers from the fields of sociology, psychology, and communications.
Lacey Stein, colleague, friend, and fellow dissertator, was especially helpful for navigating the coding process. She pointed me to the introductory resources cited in the Methodology, notably Destination Dissertation by Sonja Foss and William Waters and “Thematic Analysis” by Virginia Braun and Victoria Clarke. I did not employ much coding in the final version of the dissertation. Relying only on my sense of which codes applied to which data seemed uncomfortably subjective, and I was unable to find a knowledgeable collaborator to help me verify the codes. Additionally, coding works best for relatively small datasets; the method was not feasible for working with the entire Twitter dataset of 11,454 tweets.
Final Analysis & Visualizations in R
In the end, I abandoned Voyant, Tableau, and coding in favor of R. Carlos Yordan at Drew University initially recommended R as the best tool for my dataset. I also reached out to Paige Morgan, the Digital Humanities Librarian at the University of Miami whom I met through DHSI. Her thoughtful reply to my email assured me I didn’t need to learn everything in order to do some things in R.
R allowed me to import, analyze, and visualize data with one tool instead of three tools/methods, so in roughly six weeks I learned the basics of the R packages tidyverse and tidytext. Tutorials from The Programming Historian as well as foundational texts by Hadley Wickham, Garrett Grolemund, Julia Silge, and David Robinson were my R “bibles” for the remainder of the project. I also utilized Tyler Rinker’s sentimentr package in R and his introduction to the package on GitHub.
Most of the code for analysis and visualization in the final R script for “Hashtag History” is copied or adapted from the following works:
- Hadley Wickham and Garrett Grolemund, R for Data Science
- Julia Silge and David Robinson, Text Mining with R: A Tidy Approach
- Taylor Arnold and Lauren Tilton, Humanities Data in R
- Taylor Arnold and Lauren Tilton, “Basic Text Processing in R”
- Nabeel Siddiqui, “Data Wrangling and Management in R”
- Winston Chang, R Cookbook
- The Comprehensive R Archive Network (CRAN)
Web accessibility was not part of my plan for “Hashtag History” until June 2017. I signed up for the Web Accessibility course offered by George Williams and Erin Templeton at the Digital Humanities Summer Institute (DHSI) partly because other programming courses were already full and partly because knowing something about web accessibility seemed like a generally good thing. During the week-long course, George and Erin led participants through a series of discussions and practical exercises on web accessibility.
Their instruction and guidance shaped the web accessibility features of “Hashtag History.” Tools like the Wave Chrome Extension helped me troubleshoot accessibility issues on the site. The WordPress Accessibility plug-in provided the widget enabling high contrast and larger text in my sidebar/footer. Resources from WebAIM aided my creation of alt text, allowed me to check color contrast, and guided the structure of my headings on each page of this site.
Guest speakers and members of the class were likewise influential in the development of my thinking regarding web accessibility. Rick Godden discussed universal design and its limits with us. All participants, and notably Shawna Ross, Rebecca Parker, Gia Alexander, Alisa Beer, Vange Heiliger, Abi Lemak, and Mia Warren, gently and persistently challenged my assumptions and rhetoric surrounding disability in a variety of growth-inspiring ways.
I stored most of my data in Google Drive and double-checked my R calculations using Google Sheets. I have been using Google Docs, Sheets, Slides, and Forms for my classes since 2014, so these familiar tools provided a safeguard for my data storage and analysis. Drive also enabled easy sharing of resources with my committee.
For citation management, I used the open-source software Zotero. Desktop and web apps enable users to easily sort resources into topic-specific collections. The Chrome extension mines necessary citation information from most sources on the web, including Amazon and most academic databases, and a Google Docs add-on allows users to insert citations directly from Zotero.
Finally, I chose WordPress to present the dissertation. I have used WordPress for personal and class websites since 2014, so the platform offered a familiar bit of tech in the midst of learning a variety of new tools. WordPress sites are open-source and editable; all themes and plug-ins can be freely changed and adapted by users. WordPress also boasts accessibility-ready themes and accessibility plug-ins that help ensure websites conform to the W3C POUR standards for web accessibility. In addition, WordPress is supported by Reclaim Hosting, which focuses on users’ control over their digital identity by offering affordable domains and web-hosting to students, educators, and institutions.
Institutional & Committee Support
Edward Baring, along with Jonathan Rose, Wyatt Evans, Andrew Bonamici and Guy Dobson, constructed Drew University’s new Digital Dissertation Guidelines. The faculty and librarians involved in creating the guidelines included fellow dissertator Jessica Brandt and me in conversations about the final document to ensure the proposed standards were feasible for our digital projects. Jessica penned the sentences about web accessibility in Drew’s guidelines. I am grateful for her solidarity in this aspect of the dissertation process especially.
Wyatt Evans said yes to this project before it was fully formed. Gamin Bartle and Rick Mikulski stayed on as readers even after they departed Drew for William Paterson University and Portland State University, respectively. Every step of the way, these advisors have offered critical insights regarding the arguments and format of this project. And unlike most committees, all three have provided reviews and revisions for each portion of the writing. Their advice, encouragement, and commitment to seeing this project through have been invaluable.
As is clear from the sections above, “Hashtag History” is the result of the practical and academic support of scholars and peers from myriad institutions and locations. A few sundry acknowledgments are in order:
- Jeffrey McClurken welcomed me warmly and made me feel like I belonged at the AHA digital drop-in session in 2017.
- Jessica Otis shared a workflow for network analysis of Twitter data after we chatted at DHSI. I didn’t end up using her document, but this was still a remarkably generous act.
- Maren Wood and Jennifer Polk (co-founders of Beyond the Professoriate) offer wisdom and guidance to Ph.D. candidates and new graduates alike. Their advice helped me discern a career path and learn how to talk about my dissertation in a way that forwards my goals.
- The staff and faculty of the UB-SIM program have been immensely supportive of this project. Jorge Arditi told me to stop making excuses and just spend two hours a day writing. Kris D’Amuro bought me coffee and let me share early visualizations. Jess Covert gave me advice about quantitative data. Marissa Bell, Jess Gilbert, and Watoii Rabii are/were working on dissertations simultaneously; their empathy and friendship in the midst of an often arduous process are much appreciated. Katie Fassbinder, our Assistant Administrative Director, told me it was okay to be anti-social and just get this thing done. Kevin McKelvey, our program director, defended his dissertation in early March 2019 and offers tireless encouragement to all the writers of dissertations who work for him.
My students get the penultimate nod. The Spring 2017 batch created the data for “Hashtag History.” Their generosity, openness, and trust made the project possible. The Fall 2017 to Spring 2019 cohorts provided new insights through their participation in class and patiently waited for email replies from their scattered, tired instructor. Some of them even asked me what I was researching, which was a remarkably brave and thoughtful thing to do.
The final thanks go to Mark Lempke, my partner in all things. He has supported this project and process in too many ways to name.