Datasets at the Library of Congress: A Research Guide

Dataset Repositories

Harris & Ewing, photographer. Gathering meteorological data from stratosphere. 1935. Library of Congress Prints & Photographs Division.

Datasets are listed here as potential sources for data science or machine learning projects. Time series are available for most economic, business, census, and demographic statistics. For additional sources of datasets, see also the Business Reference Services guide on Data sets (BeOnline).

The Library of Congress makes these two datasets freely available to researchers and analysts.

  • By the People Data Sets
    Transcription data created from completed By the People campaigns is available here in bulk as zipped .csv files.
  • Web Archive Datasets
    The Library of Congress Web Archives provides derivative datasets for users to download, re-use, and explore.
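
As a minimal sketch of how a bulk download of zipped .csv files might be loaded for analysis, the Python snippet below reads every .csv member of a zip archive using only the standard library. The function name `read_zipped_csvs` is hypothetical, and the UTF-8 encoding is an assumption; check the documentation that accompanies each dataset for its actual layout and encoding.

```python
import csv
import io
import zipfile


def read_zipped_csvs(zip_source):
    """Return {member_name: list of row dicts} for every .csv file in a zip archive.

    zip_source may be a file path or a binary file-like object.
    """
    tables = {}
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".csv"):
                continue  # skip README files and other non-CSV members
            with archive.open(name) as raw:
                # Assumption: files are UTF-8; "utf-8-sig" also drops a leading BOM.
                text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
                tables[name] = list(csv.DictReader(text))
    return tables
```

Calling `read_zipped_csvs("campaign.zip")` (a placeholder file name) would return one list of rows per .csv file, with each row keyed by the column headers found in that file.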

The open dataset finders ICPSR and Kaggle assist users in identifying free datasets for practice projects.

An Application Programming Interface (API) is a set of communication protocols, functions, and commands that programmers use to develop software or facilitate interaction between systems. API analysis allows researchers to gain insight into database usage and performance. Following are a number of guides that provide lists of APIs.
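
To make the idea of an API concrete, here is a minimal sketch of how a program might request data from a web API that returns JSON. The endpoint `https://api.example.org/v1/datasets` and the parameters `q` and `format` are placeholders, not a real service; each API listed in the guides documents its own URLs, parameters, and access rules.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_url(base_url, params):
    """Append URL-encoded query parameters to a base endpoint."""
    return f"{base_url}?{urlencode(params)}"


def fetch_json(base_url, params, timeout=30):
    """Send a GET request and decode the JSON body of the response."""
    with urlopen(build_url(base_url, params), timeout=timeout) as response:
        return json.load(response)


# Hypothetical endpoint and parameters, shown only to illustrate the request shape.
EXAMPLE_URL = build_url(
    "https://api.example.org/v1/datasets",
    {"q": "census", "format": "json"},
)
```

Only `build_url` runs when the snippet is loaded; `fetch_json` makes a network request and should be pointed at a real, documented endpoint before use.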

Historical statistical data provides market information over time, and includes statistics on commodities, prices, and wages. 

The following links provide datasets of demographic information.

Governments are increasingly supplying public data to support accountability, efficiency, and transparency, and to strengthen the role of citizens.

Following are sites with images for data projects.

These sites provide datasets associated with select publications.

The home of the U.S. Government's open data is Data.gov. While datasets can easily be located there by topic, direct access points to U.S. Government datasets are provided here. A Google search on the agency name and "datasets" or "data" or "statistics" will provide you with results for agency websites of interest.