- DHbox - An environment for digital humanities computational work that can be deployed quickly and easily from the cloud. Ready-to-go configurations of Omeka, NLTK, IPython, R Studio, and Mallet are included. (free)
- Digital Research Tools (DiRT) Directory - Aggregates information about digital research tools for scholarly use. DiRT makes it easy to find and compare resources for tasks such as text mining and data visualization.
- Seeing Speech - Provides ultrasound tongue imaging (UTI) video of speech, magnetic resonance imaging (MRI) video of speech and 2D midsagittal head animations based on MRI and UTI data.
- TokenX - A text visualization, analysis, and play tool. Created by University of Nebraska-Lincoln. (free)
- FLEx (FieldWorks Language Explorer) by SIL (Summer Institute of Linguistics) - Helps in compiling dictionaries and links dictionary entries to text documents in order to facilitate annotation. Its predecessor, "Toolbox", is discontinued but still widely used in the language documentation community.
- FileMaker - A relational database that is very useful in collecting different types of information (phonological, inflectional, semantic) for lexemes. (It is a proprietary program; it is available on our department student computers and students are not expected to buy it.)
- CMDI Maker - A tool for compiling metadata for recordings.
- HandBrake, Avidemux, and Audacity for video and/or audio conversion and editing.
Phonetic Analysis & Annotation
- MALLET (MAchine Learning for LanguagE Toolkit) - A collection of tools for document classification, sequence tagging, and topic modeling. There is also an add-on toolkit (Graphical Models in MALLET) for visualization. (open-source, free)
A freeware corpus analysis toolkit for concordancing and text analysis (works with Mac OS & Windows)
- A text concordancing tool for Mac OS that allows you to analyze your own collection of text files.
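For readers new to concordancing, the basic idea behind tools like these can be sketched in a few lines of Python. This is a minimal illustration only, not how either tool is implemented; the function name `kwic`, the sample sentence, and the simple regex tokenizer are all assumptions made for the example.

```python
import re

def kwic(text, keyword, width=3):
    """Return keyword-in-context lines: up to `width` tokens on each side of every hit."""
    # Simple tokenizer (an assumption): runs of word characters and apostrophes.
    tokens = re.findall(r"[\w']+", text)
    lines = []
    for i, tok in enumerate(tokens):
        if tok.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{tok}] {right}".strip())
    return lines

# Hypothetical sample text; replace with the contents of your own text files.
sample = "The cat sat on the mat. The dog chased the cat around the yard."
for line in kwic(sample, "cat"):
    print(line)
```

Each printed line shows one occurrence of the search word in brackets with its neighboring words, which is the core display a concordancer provides.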
Crossref Text and Data Mining for Researchers
- Designed to allow researchers to easily harvest full-text documents from all participating publishers regardless of their business model (e.g., open access, subscription). Provides step-by-step instructions.
- GloVe: Global Vectors for Word Representation - An unsupervised learning algorithm for obtaining vector representations for words. Training is performed on aggregated global word-word co-occurrence statistics from a corpus, and the resulting representations showcase interesting linear substructures of the word vector space.
- Juxta - An open-source tool for comparing and collating multiple witnesses to a single textual work; add or remove witnesses to a comparison set, switch the base text at will. Once you’ve collated a comparison, Juxta also offers several kinds of analytic visualizations.
- WordHoard - An application for the close reading and scholarly analysis of deeply tagged texts. WordHoard contains the entire canon of Early Greek epic in the original and in translation, as well as all of Chaucer, Shakespeare, and Spenser.
- A Text Analysis Environment for Humanities Scholars. A collection of text analysis tools targeted at humanities scholars that includes side-by-side comparison, grammatical search, and document/sentence/word-set features.
- Bookworm - Created by Harvard. A tool for visualizing trends in repositories of digitized texts. Uses metadata and books collected by the Open Library, and gives a useful, intuitive overview of the contents of the library as a whole.
- Voyant - An easy-to-use, free text analysis tool. Upload text and Voyant will automatically determine word frequencies and collocates and display them graphically.
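Several of the tools above report word frequencies and collocates (words that appear near a given word). The following Python sketch shows what those two counts mean in practice. It is not the method used by any of the tools listed; the function names, the two-word window, and the sample sentence are assumptions chosen for illustration.

```python
import re
from collections import Counter

def tokenize(text):
    # Lowercase the text and keep runs of letters and apostrophes (a simplifying assumption).
    return re.findall(r"[a-z']+", text.lower())

def word_frequencies(tokens):
    """Count how often each word occurs."""
    return Counter(tokens)

def collocates(tokens, keyword, window=2):
    """Count words that appear within `window` tokens of each occurrence of `keyword`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == keyword:
            start = max(0, i - window)
            neighbors = tokens[start:i] + tokens[i + 1:i + 1 + window]
            counts.update(neighbors)
    return counts

# Hypothetical sample text; replace with your own document.
sample = "the whale swam and the whale dived while the ship followed the whale"
tokens = tokenize(sample)
print(word_frequencies(tokens).most_common(3))
print(collocates(tokens, "whale").most_common(3))
```

In this sample, "the" occurs four times overall and appears three times within two words of "whale". Graphical tools build their charts and word clouds from counts of this kind.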
Working With Webpages
- import.io - Instantly turn web pages into data
- TAPoR - A collection of text analysis tools, hosted by the University of Alberta, that provides XML, HTML, and plain text analysis. Upload documents to extract common words, determine collocates, separate HTML tags, and extract XML-tagged information.