
HathiTrust at the University of Rochester

What Is HathiTrust?

HathiTrust Digital Library is a collection of over 15 million volumes of searchable books, journals, and government documents, including 5.8 million available as full-text online.

In March 2015, the University of Rochester Libraries joined more than 100 academic and research libraries as a partner in HathiTrust. For more information about the HathiTrust community, see Partnership Community.

For more detail about the kinds of content in HathiTrust, and to visualize the collection by call number, language, or date, see Statistics and Visualizations.

What Can I Do in HathiTrust?

Anyone can search and view the full-text content in HathiTrust. Patrons at member institutions who log in get added features. Below is a brief summary of what you can do with and without login. Here is where you can log in.

Feature                                                      Without login   With UR login
Search full-text of all volumes                              Yes             Yes
View full-text of non-copyright volumes                      Yes             Yes
Download single page of non-copyright volumes (PDF image)    Yes             Yes
Download full volume of non-copyright volumes (PDF image)    No              Yes
Search within collections                                    Yes             Yes
Create and save your own collections                         No              Yes
Chart courtesy of Syracuse University Libraries

Research Datasets

The HTRC Extracted Features Dataset (1.0) contains page-level features for 13.7 million public-domain and in-copyright volumes, including:

  • part-of-speech tagged term token counts
  • header/footer identification
  • marginal character counts
  • line information for each page, such as number of lines with text and a count of characters for starting and ending lines

Find out more at the HTRC Extracted Features Dataset webpage, which includes full documentation, a sample dataset, and links for downloading the data.
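As a rough illustration of how page-level features like these might be used, the Python sketch below sums part-of-speech tagged token counts across the pages of one volume. The record is a small hand-made stand-in, and the field names (features, pages, body, tokenPosCount) and the helper volume_term_counts are assumptions made for this example; check the dataset documentation for the actual file layout before relying on them.

```python
from collections import Counter

# Hand-made stand-in for one volume record. The field names and values are
# illustrative assumptions, not data copied from the real dataset.
sample_volume = {
    "id": "example.0001",
    "features": {
        "pages": [
            {
                "seq": "00000001",
                "body": {
                    "lineCount": 2,
                    "tokenPosCount": {
                        "library": {"NN": 2},
                        "opens": {"VBZ": 1},
                    },
                },
            },
            {
                "seq": "00000002",
                "body": {
                    "lineCount": 1,
                    "tokenPosCount": {
                        "library": {"NN": 1},
                        "Open": {"JJ": 1},
                    },
                },
            },
        ]
    },
}


def volume_term_counts(volume, section="body", lowercase=True):
    """Sum per-page, per-POS token counts into one Counter for the volume."""
    totals = Counter()
    for page in volume["features"]["pages"]:
        # A page may lack the requested section; treat it as empty.
        pos_counts = page.get(section, {}).get("tokenPosCount", {})
        for token, by_pos in pos_counts.items():
            key = token.lower() if lowercase else token
            totals[key] += sum(by_pos.values())
    return totals


counts = volume_term_counts(sample_volume)
print(counts.most_common(3))
```

Passing lowercase=False keeps "Open" and "open" as separate terms, which matters if you want case-sensitive counts.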

The HathiTrust Research Center links to additional tools, datasets, and information about workshops.


HathiTrust Research Center's Bookworm tool charts trends in word use from 1500 to 2015 in hundreds of thousands of texts in HathiTrust. Filters are available for subject classification, fiction/non-fiction, genre, language, format, page and word counts, and publication information. Controls let you choose date ranges, metrics, and case sensitivity.

Bookworms based on other text collections are available at