Optical Character Recognition (OCR) refers to the process of converting visual materials (such as scanned documents) into text that can be read by a computer.
OCR technology is essential for non-visual users, particularly those who use screen readers (software that converts text to audio or braille). However, while OCR has come a long way, it’s not perfect. Humans still need to review the output for accuracy, particularly when digitizing documents at scale.
Below, we’ll explain the basics. First, a quick note: In digital accessibility conversations, OCR may also refer to the U.S. Department of Education’s Office for Civil Rights.
If you’ve received an OCR complaint, that has nothing to do with optical character recognition, which is the focus of this article. To learn about the other OCR, read: What Is an OCR Web Accessibility Complaint?
If text is presented as a flattened (pre-rendered) image, it’s less robust. Screen readers cannot announce the content, and people with visual impairments may be unable to access it.
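As a rough illustration, the sketch below scans a page's HTML for images that have no alternative text, which is one way to find flattened content that may need a text version. It uses only Python's standard library. The function name `find_images_without_alt` and the sample markup are hypothetical, and an empty `alt` attribute is legitimate for purely decorative images, so treat the results as candidates for review rather than confirmed failures.

```python
from html.parser import HTMLParser


class ImageAltAudit(HTMLParser):
    """Collect the src of every <img> whose alt attribute is missing or blank."""

    def __init__(self):
        super().__init__()
        self.missing_alt = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        alt = attributes.get("alt")
        # A missing alt, or one containing only whitespace, gives a
        # screen reader nothing to announce.
        if alt is None or not alt.strip():
            self.missing_alt.append(attributes.get("src") or "(no src)")


def find_images_without_alt(html):
    """Return the src values of images that lack alternative text."""
    audit = ImageAltAudit()
    audit.feed(html)
    audit.close()
    return audit.missing_alt


# Hypothetical sample page: two of the three images have no usable alt text.
sample = (
    '<p><img src="scan.png">'
    '<img src="logo.png" alt="Company logo">'
    '<img src="flyer.jpg" alt=" "/></p>'
)
print(find_images_without_alt(sample))
```

A check like this only finds images without text alternatives. It cannot tell whether an image actually contains text, so a person still needs to look at each flagged file.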
By creating a text version of a visual document, you improve the experience for a wide range of users.
These days, automated text recognition tools are fairly reliable. Artificial intelligence (A.I.) tools have improved the accuracy of OCR significantly, as A.I. can use the context of the surrounding text to interpret words that are blurry or heavily stylized.
But to fulfill the requirements of the Web Content Accessibility Guidelines (WCAG), a text alternative must serve the same purpose as the original content, so recognition errors matter. That means that humans need to review OCR output carefully, particularly when working with PDFs and other web-delivered documents.
Related: Examples of Text Alternatives to Non-Text Content
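One way to focus human review is to check the least certain words first. Many OCR engines report a confidence score for each recognized word, and the sketch below assumes you already have that output as a list of (word, confidence) pairs with scores between 0 and 1. The function name `flag_for_review`, the 0.90 threshold, and the sample data are all hypothetical. Your tool's output format and a sensible cutoff will differ.

```python
def flag_for_review(words, threshold=0.90):
    """Return (position, word, confidence) for each word below the threshold.

    `words` is a list of (word, confidence) pairs. This input format is an
    assumption for illustration, not the output of any specific OCR tool.
    """
    return [
        (position, word, confidence)
        for position, (word, confidence) in enumerate(words)
        if confidence < threshold
    ]


# Hypothetical OCR output for a scanned heading.
ocr_words = [
    ("Annual", 0.99),
    ("Rep0rt", 0.62),
    ("2023", 0.97),
    ("Surnmary", 0.71),
]

for position, word, confidence in flag_for_review(ocr_words):
    print(f"word {position}: {word!r} (confidence {confidence:.2f})")
```

A high confidence score does not guarantee that a word is correct, so this kind of triage helps prioritize a reviewer's time but does not replace a full read-through.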
Many applications have built-in OCR. Adobe Acrobat, the most popular software for building PDF web documents, includes a powerful OCR text converter that preserves the text and formatting of the original document.
Of course, the output depends on the quality of the source image. When using OCR features, keep these tips in mind:
To learn about the best practices of digital accessibility for web documents, read: 7 Basic Steps to Making PDFs More Accessible.
Whether you’re digitizing documents for internal or public use, you need to consider the needs and preferences of your entire audience. The materials must be reasonably accessible for people with disabilities; otherwise, they may not be compliant with the Americans with Disabilities Act (ADA), Section 508 of the Rehabilitation Act, and other non-discrimination laws.
If you’re digitizing a single form, you can use WCAG to follow the best practices of inclusive design. But at scale, this becomes more difficult: An accessibility partner can help you improve compliance and provide all users with a better experience.
The Bureau of Internet Accessibility provides PDF accessibility remediation services, along with website audits, 24/7 accessibility support, and training resources. To learn more, send us a message and connect with an expert.