Exploring trends, technologies, e-resource management, and digital services in libraries.

Fundamentals of Text Mining: Curating, Preparing, Analyzing, and Visualizing Textual Data

March 15 – 25, 2021

This is a self-paced course with live office hours.

Course Materials Released: March 15

Live Office Hours: March 18 & 25

AUDIENCE LEVEL: Foundational


As academic libraries continually shift to keep up with changing research and pedagogical needs, many are looking at the digital humanities as an opportunity for closer collaboration with faculty and other campus stakeholders. Text mining fosters a natural partnership between library staff, faculty and students by facilitating a research workflow that promotes close and distant reading, project management skills, and critical thinking.

The purpose of this workshop is to familiarize attendees with the basic workflow, terms and output a student or new researcher would encounter when undertaking a text mining project. We will introduce text mining by covering what it is, what’s possible, and how it is being used for research and instruction. In addition to a discussion of the theories and methodologies in the field, participants will get hands-on practice with the major components of a text mining project.

Participants will build mini-projects in order to familiarize themselves with the fundamental steps of text mining:

  1. the curation of a textual dataset,
  2. digital literacies and critical thinking skills,
  3. ideating, developing, and interpreting research questions,
  4. the cleaning and preparation of that data,
  5. computational analysis and visualization of results.

To accomplish these tasks, we will provide a sample dataset, and will also include a list of primary sources (found on the Web and in the library) where participants can procure their own datasets. Using Voyant, one of the most popular text mining and visualization tools among digital humanists today, attendees will work independently to generate visualizations from the texts in their datasets and answer questions based on their results.
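For readers curious about what the cleaning and analysis steps involve under the hood, the following is a minimal Python sketch. It is not part of the workshop materials and does not use Voyant; the stopword list, the function names, and the two sample sentences are all made up for illustration. It shows one simple way to clean a small set of texts and count the most frequent terms, which is roughly the kind of data a word-frequency visualization is built from.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch);
# real text mining tools ship much longer, language-specific lists.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "on", "is", "it"}


def clean(text):
    """Lowercase the text, split it into word tokens, and drop stopwords."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def top_terms(documents, n=5):
    """Return the n most frequent cleaned terms across all documents."""
    counts = Counter()
    for doc in documents:
        counts.update(clean(doc))
    return counts.most_common(n)


# Made-up sample "dataset" of two short documents.
sample = [
    "The river flows north to the sea.",
    "Boats travel on the river.",
]

if __name__ == "__main__":
    print(top_terms(sample, 3))
```

Tools such as Voyant perform comparable steps through a web interface, so no programming is required to follow the workshop.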

Course history: This workshop was first offered at the DLF conference in 2019 and has been adapted for a fully online/asynchronous format. We’d like to acknowledge the work of Dr. Wendy Perla Kurtz in developing the content, which also draws on a text-mining workshop offered by the University of Nevada, Las Vegas.

This workshop is an introductory session covering the basics of text mining; no previous background is required.

Following the workshop, participants will: 

  • Understand the nature of text mining in the humanities and social sciences.
  • Source relevant textual data for text mining and curate the material to optimize analysis results.
  • Interpret the results of a variety of visualization outputs and choose the output best suited to answer a research question.


Sarah Ketchley
University of Washington/Gale, a Cengage Company

Sarah Ketchley is an Egyptologist and art history scholar in the Department of Near Eastern Languages & Civilization at the University of Washington. She teaches introductory and graduate-level classes in digital humanities through NELC and Informatics, and directs a student DH internship program working to create digital editions of primary source material related to Nile travel in the 19th century. Sarah is also a Digital Humanities Specialist at Gale.

Lindsey Gervais
Gale, a Cengage Company

Lindsey Gervais is a Digital Learning Manager at Gale, where she assists in the learning design and development of Gale’s Digital Scholarship Program. With a doctoral background and research recognition in the field of Cognition, Instruction, and Learning Technology, Lindsey is helping to elevate the instructional framework of Gale’s Digital Scholar Lab. She is a graduate of UCONN and taught Educational Psychology and Research Practicum to undergraduate and graduate students for 9 years.

Maggie Waligora
Gale, a Cengage Company

Maggie Waligora (MLIS) is the product owner for the Gale Digital Scholar Lab. She holds a Master’s in Library and Information Science with a concentration in digital curation and preservation from Wayne State University. When she’s not in the office leading a project, she can be found taking long walks with her partner Lee and their two dogs (Walter and Charlie), volunteering for causes she is passionate about, listening to audiobooks in the comfort of her home, or catching up with friends.


We recommend using a desktop computer or laptop for this workshop.


  • Module 1: Introduction to Text Mining and your datasets
  • Module 2: Digital Literacies and Critical Thinking Skills, Ideating, Developing, and Interpreting Research Questions
  • Module 3: Text cleaning with Lexos
  • Module 4: Text analysis with Voyant