Images of historic newspaper pages, as well as uncorrected page text, are displayed through your web browser. However, Chronicling America also contains high-resolution images (JPEG2000) and enhanced text (PDF) that may require special viewers or instructions for accessing them.
When you select an image of a newspaper page, you are presented with three image displays at the top left.
At the bottom left of the display, there is an option for downloading the full newspaper page.
For a side-by-side view of the image and its corresponding OCR text, select the Image w/Text button at the top left of the screen.
Although errors in the OCR text may be common, OCR is still a powerful tool for making text-based items accessible to searching. For example, important concept words often appear more than once within an article. Therefore, if OCR misreads one instance of a key word in a passage, but correctly reads the second instance, the passage will still be found in a full-text search.
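The effect described above can be illustrated with a small sketch. It is not how Chronicling America's search is implemented; it only assumes a plain case-insensitive substring match, and the passages, the misreading "harvcst", and the function name `find_passages` are invented for illustration.

```python
def find_passages(passages, term):
    """Return the passages in which `term` appears at least once (case-insensitive)."""
    needle = term.lower()
    return [p for p in passages if needle in p.lower()]


# Hypothetical OCR output: the first "harvest" was misread as "harvcst",
# but the second instance in the same passage was read correctly.
passages = [
    "The harvcst was poor this year. Farmers blame the late frost for the harvest.",
    "The council met on Tuesday to discuss the new bridge.",
]

matches = find_passages(passages, "harvest")
for passage in matches:
    print(passage)
```

Because the key word occurs twice in the first passage, one correct reading is enough for the search to return it. A passage whose only instance of the word was misread would be missed.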
Images of the full newspaper page from Chronicling America can be downloaded.
Name | File Format | Description or Use | Required Software |
---|---|---|---|
OCR (ALTO) = Optical Character Recognition encoded using the Analyzed Layout and Text Object XML Schema | .XML | Does not contain images. Contains the text found on the page images, with the coordinates of its textual boundaries. | Internet browsers, text editors |
PDF (Portable Document Format) | .PDF | Medium to high quality. Contains the image with the associated OCR text embedded. | Browsers with built-in PDF viewers, free and commercial PDF viewers, Adobe Acrobat |
JPEG | .JPEG or .JPG | Low to medium quality. Pay attention to the file size of the download. Lower display size equals lower quality. | Any image viewer |
JPEG 2000 | .JP2 | Highest quality available as a download on Chronicling America. | Free and commercial image viewers and image editing programs (IrfanView, Adobe Photoshop, etc.) |
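
The OCR (ALTO) download listed in the table is plain XML, so it can also be read with a short script instead of a browser or text editor. The following is a minimal sketch, assuming the file follows the usual ALTO layout in which each word is a `String` element carrying `CONTENT`, `HPOS`, `VPOS`, `WIDTH`, and `HEIGHT` attributes. The sample document and the function name `extract_words` are made up for illustration; a real download may use a different ALTO version or namespace, which is why the sketch ignores the namespace.

```python
import xml.etree.ElementTree as ET

# Invented sample in the shape of an ALTO file; not taken from a real page.
SAMPLE_ALTO = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v2#">
  <Layout><Page><PrintSpace><TextBlock><TextLine>
    <String CONTENT="Daily" HPOS="100" VPOS="200" WIDTH="80" HEIGHT="30"/>
    <SP/>
    <String CONTENT="Tribune" HPOS="190" VPOS="200" WIDTH="120" HEIGHT="30"/>
  </TextLine></TextBlock></PrintSpace></Page></Layout>
</alto>"""


def extract_words(alto_xml):
    """Return each recognized word with the coordinates of its bounding box."""
    root = ET.fromstring(alto_xml)
    words = []
    for elem in root.iter():
        # Drop any "{namespace}" prefix so the tag name alone is compared.
        if elem.tag.rsplit("}", 1)[-1] == "String":
            words.append({
                "text": elem.get("CONTENT", ""),
                "hpos": float(elem.get("HPOS", 0)),
                "vpos": float(elem.get("VPOS", 0)),
                "width": float(elem.get("WIDTH", 0)),
                "height": float(elem.get("HEIGHT", 0)),
            })
    return words


words = extract_words(SAMPLE_ALTO)
print(" ".join(w["text"] for w in words))
```

For a downloaded file, `ET.parse(path).getroot()` can replace `ET.fromstring`. The coordinate values are reported in whatever measurement unit the file itself declares, so the sketch does not assume pixels.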