Instructors: Christian Casey
Duration: both weeks


Overview

Summary

This course introduces participants to computational philology through the study of manuscripts, text corpora, online databases, available tools, and custom-built software. It begins with traditional philological methods and gradually introduces workflows, quantitative analysis, and online publication of digital research.

Intended Outcomes

Upon completion of the course, participants will be able to…

  • articulate traditional philological methods and explain how digital approaches extend and formalize them
  • structure manuscript and linguistic data in digital form
  • build small digital corpora and generate concordances, word lists, and aligned variant texts
  • perform basic quantitative analyses of language and script (frequency distributions, correlation, diachronic comparison, simple information-theoretic measures)
  • apply exploratory visualization techniques (e.g., clustering, dimensionality reduction) to textual and paleographic data
  • formulate computationally tractable research questions and design small experiments to test philological hypotheses
  • critically interpret computational results and understand the limits of quantitative methods in historical research
  • design and publish a small born-digital philological project online using sustainable, lightweight infrastructure

Prerequisites

This course will teach all required digital techniques, including a hands-on introduction to widely used digital tools and basic programming concepts. No prior programming experience is required. All programming activities will be based on structured templates and guided examples. Where appropriate, we will also demonstrate and critically discuss AI-assisted coding tools. Rather than treating these tools as black boxes, students will learn how they can support structured experimentation while the researcher retains full intellectual control over methods and results. Students who arrive with programming experience will be offered additional or extended in-class activities.

Datasets and Materials

All datasets used in this course are publicly available online or will be provided to the students by the instructor.

Technical Requirements

Students must bring a laptop for writing code. A tablet with a physical keyboard may be used if it allows code to be written efficiently, but a laptop is preferred.

Final Project

Students will develop one final project over the duration of the course and publish it online at the end of the two weeks. Possible projects include (but are not limited to):

  • Concordance
  • Variant alignment
  • Morphological tagging demonstration
  • Paleographic clustering visualization
  • Diachronic frequency comparison

Course Schedule

Week 1: Philological Foundations

Day 1: Traditional Philology as Method

  • What is philology and how do (did) we do it?
  • Textual criticism and the material transmission of texts
  • Textual analysis at the level of words, variants, and meaning
  • Palaeography and scribal practice
  • Orthography and variation
  • Phonology as reconstruction and interpretation
  • The birth of linguistics and The Schism
  • Project check-in: identifying possible areas for final project focus

Day 2: Textual Criticism

  • Philosophical underpinnings of textual reconstruction
  • Alternative views (e.g., synoptic text editions and non-stemmatic models)
  • Variant alignment as structured comparison
  • Bioinformatics approaches to textual transmission
  • Visualizing cladograms and transmission trees
  • Hands-on exploration of a small aligned text dataset
  • Project check-in: how variant alignment could become a small digital project
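
To give a sense of what "variant alignment as structured comparison" can look like in practice, here is a minimal sketch using only the Python standard library. It is an illustration, not the course's own template: the two witness readings are invented, the function name `align_witnesses` is hypothetical, and the general-purpose `difflib` matcher stands in for the dedicated collation tools that real projects would likely use.

```python
from difflib import SequenceMatcher

def align_witnesses(a, b):
    """Align two tokenized witnesses.

    Returns rows of (tag, tokens_a, tokens_b), where tag is one of
    'equal', 'replace', 'insert', or 'delete' (difflib's opcode names).
    """
    # autojunk=False: do not let difflib discard frequent tokens as noise
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    return [
        (tag, a[i1:i2], b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
    ]

# Invented example readings (assumption: whitespace tokenization is adequate)
witness_a = "in the beginning was the word".split()
witness_b = "in the beginning there was a word".split()

rows = align_witnesses(witness_a, witness_b)
for tag, tokens_a, tokens_b in rows:
    print(f"{tag:8} {' '.join(tokens_a):20} | {' '.join(tokens_b)}")
```

Each row is one aligned unit: shared text is tagged `equal`, while `insert`, `delete`, and `replace` rows mark the places where the witnesses diverge and where a critical apparatus entry would be needed.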

Day 3: Text as Data

  • What is a word?
  • Lemmatizers and parsers (conceptual overview and demonstration)
  • Generating concordances and word lists automatically
  • Linking lexical data across corpora
  • Building small digital corpora from structured text
  • Hands-on creation of a miniature text-analysis pipeline
  • Project check-in: drafting the data component of each student’s final project
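
As a rough idea of the scale of a "miniature text-analysis pipeline", the sketch below builds a word list and a keyword-in-context (KWIC) concordance from a single invented sentence. The function names are hypothetical, and the tokenizer makes a deliberately naive assumption (a word is any run of letters or digits, lowercased), which is exactly the kind of decision the question "What is a word?" puts under scrutiny.

```python
import re
from collections import Counter

def tokenize(text):
    """Naive tokenizer: lowercase, then take runs of word characters."""
    return re.findall(r"\w+", text.lower())

def word_list(tokens):
    """Return (word, count) pairs, most frequent first."""
    return Counter(tokens).most_common()

def kwic(tokens, keyword, window=2):
    """Return (left context, keyword, right context) for every occurrence."""
    lines = []
    for i, token in enumerate(tokens):
        if token == keyword:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            lines.append((left, token, right))
    return lines

# Invented sample text, for illustration only
sample = "The scribe copied the text, and the text changed."
tokens = tokenize(sample)
freqs = word_list(tokens)
hits = kwic(tokens, "text")

for left, keyword, right in hits:
    print(f"{left:>15} [{keyword}] {right}")
```

Swapping `tokenize` for a lemmatizer-backed version would turn the same pipeline into a lemma concordance without changing the other two functions.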

Day 4: Script as Data

  • Graph vs. grapheme
  • Digitizing manuscripts and defining annotation units
  • Sign segmentation and overlap problems
  • Solutions to common digitization challenges
  • HTR overview: what exists and what remains difficult
  • Demonstration of paleographic clustering and sign comparison
  • Project check-in: identifying possible script-based project directions

Day 5: Sound as Data

  • Phone vs. phoneme
  • Working with audio data and phonetic features
  • Working with online phonetic resources and structured datasets
  • Analyzing phonetic features computationally
  • Connecting modern phonetic data to ancient languages
  • Discussion: how sound can be modeled and measured historically
  • Project check-in: short informal proposal discussion

Week 2: Modeling, Analyzing, Publishing

Day 6: Measuring Script and Language

  • Information theory foundations in accessible form
  • Frequency distributions and statistical analysis
  • Measuring complexity and variation
  • Finding meaningful relationships vs. spurious patterns
  • Guided exercise applying basic quantitative analysis
  • Project check-in: refining chosen method and dataset
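
One of the simplest information-theoretic measures that fits this day's themes is the Shannon entropy of a frequency distribution, H = -Σ p(x) log2 p(x), measured in bits per symbol. The sketch below is an assumption about what a basic exercise might look like, not the course's actual material; the function name `shannon_entropy` and the sample strings are hypothetical.

```python
import math
from collections import Counter

def shannon_entropy(symbols):
    """Shannon entropy (bits per symbol) of the observed frequency distribution.

    `symbols` can be any iterable of hashable items: characters, signs, words.
    """
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum(
        (count / total) * math.log2(count / total)
        for count in counts.values()
    )

# Invented toy inputs: a repetitive sequence and a more varied one
uniform_two = "abab"      # two symbols, equally frequent
single_sign = "aaaa"      # one symbol only
print(shannon_entropy(uniform_two))
print(shannon_entropy(single_sign))
```

Entropy computed from a small sample is only an estimate, and it is sensitive to sample size and to how symbols are defined, which connects directly to the distinction between meaningful relationships and spurious patterns.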

Day 7: Visualization

  • Making philological analysis legible and interpretable
  • Dimensionality reduction (conceptual overview)
  • Choosing the right approach for different questions
  • Using visualization tools
  • Interpreting clusters and outliers critically
  • Project check-in: designing visual component of final project

Day 8: Research Questions

  • Asking tractable questions in computational philology
  • Formulating reasonable, testable hypotheses
  • Designing experiments to test philological questions
  • Understanding limitations and methodological risks
  • Structuring a clear research narrative for a digital project
  • Project check-in: outline and technical checklist

Day 9: Sustainability, Accessibility, Infrastructure

  • Designing born-digital publications for philological work
  • Making materials accessible online (format, structure, documentation)
  • Why digital projects die and how to avoid it
  • Sustainability in digital publishing (lightweight infrastructure, versioning, archiving)
  • Introduction to the GitHub Pages template for the final project
  • Project check-in: building and preparing for publication

Day 10: Project Workshop & Publication

  • Complete the final project
  • Publish online using GitHub Pages
  • Present results to the class
  • Analyze and critique together
  • Discussion of next steps: expanding toy models into research projects
