Optical character recognition (OCR) refers to a computer's ability to recognize printed letters, numerals, or symbols (optical characters) as discrete entities rather than as simply an image containing lines, curves, and shading. Useful for document management, form processing, and a host of other commercial applications, this powerful tool allows businesses to convert paper documents into electronic files that can then be manipulated and retrieved at will. Although the earliest OCR devices debuted in the late 1950s, it was in the 1980s that OCR technology first reached the mass market, spurred by the increasing power of personal computer systems.
The OCR process is simple in theory. When a printed page of text is scanned, the scanner delivers an image of the text to OCR software stored in the attached computer. The software then attempts to identify each letter of each word in the image in order to convert it to an editable text document or to process the information in whatever format is needed.
Companies often use OCR to reduce human data entry, as in bill processing, and for a wide range of other applications that save time and improve accuracy. Newer uses under development have included noncontact scanning from a distance (for instance, scanning license plate numbers) and recognition of handwriting as opposed to printed text. Likewise, OCR is increasingly used in conjunction with bar coding and other forms of automatic identification systems.
One of the problems with OCR when it was first developed was that the computer was frequently baffled by what the human eye and mind readily accept. For example, the letter "e" might be interpreted as the letter "o." Early OCR software programs achieved accuracy rates of just over 90 percent. Though seemingly high, this rate was measured on a character-by-character basis, so in practice an error occurred in roughly one of every ten characters, that is, in about every second or third word. As a result, the document would have to be carefully proofread and corrected by a typist, who would use the original paper document as a guide.
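The step from a per-character rate to a per-word rate can be checked with a little arithmetic. The sketch below assumes, purely for illustration, that character errors are independent of one another and that typical words run three to seven letters; the function name is hypothetical.

```python
def word_error_probability(char_accuracy: float, word_length: int) -> float:
    """Chance that at least one character in a word is misread.

    Assumes each character is misread independently of the others,
    which is a simplification made only for this illustration.
    """
    return 1.0 - char_accuracy ** word_length


# With 90 percent character accuracy, check a few typical word lengths.
for length in (3, 5, 7):
    p = word_error_probability(0.90, length)
    print(f"{length}-letter word: {p:.0%} chance of at least one error")
```

Under these assumptions a five-letter word has about a 41 percent chance of containing an error, which is consistent with a mistake in roughly every second or third word.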
What usually confused the OCR software were imperfections on the printed page like stains, extraneous marks, fading, and blurring. Letters had to be crisply printed. Unusual fonts were impossible for the OCR to understand and duplicate. Strikeovers also confused OCR programs. Even slightly blurry, shiny type from thermal fax paper could throw the software into fits.
Modern scanners for commercial purposes can, under ideal conditions, achieve accuracy rates in the range of one error per tens of thousands of characters, or more than 99.99 percent accuracy. Still, a single article of several pages may contain over 10,000 characters, so in large scanning projects even this rate will allow many errors to pass if other quality control methods aren't practiced.
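The scale of the problem is easy to estimate. The sketch below takes the 99.99 percent and 10,000-character figures from the text; the project size of 5,000 articles is a made-up number chosen only to illustrate a large job.

```python
def expected_errors(total_chars: int, char_accuracy: float) -> float:
    """Average number of misread characters at a given accuracy rate."""
    return total_chars * (1.0 - char_accuracy)


CHARS_PER_ARTICLE = 10_000   # figure from the text
ARTICLES_IN_PROJECT = 5_000  # hypothetical project size

per_article = expected_errors(CHARS_PER_ARTICLE, 0.9999)
per_project = expected_errors(CHARS_PER_ARTICLE * ARTICLES_IN_PROJECT, 0.9999)

print(f"Expected errors per article: {per_article:.1f}")
print(f"Expected errors per project: {per_project:.0f}")
```

At this rate a single article averages about one error, while the hypothetical project would contain several thousand, which is why additional quality control still matters.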
To maximize accuracy, OCR software developers try to find a happy medium between making recognition of characters too strict and too flexible. If recognition is too strict, the software makes mistakes when a letter collects a little bit of dirt or has a slightly broken letter form. If it is too flexible, the software makes mistakes as well, because it tries to interpret anything it sees as a letter. Uncommon or highly stylized typefaces only compound recognition problems. Also, too many variations of typefaces on one page can confuse the software as it looks through its programming for something that it recognizes. When companies have control over the typeface, such as that on their own forms, they can print with consistent fonts or even special OCR fonts that help maximize accuracy. OCR fonts tend to be more angular or square in shape and have slightly wider spacing between letters (kerning) to reduce the likelihood of misreading.
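The trade-off between strict and flexible recognition can be illustrated with a toy template matcher. This is a minimal sketch, not a description of how any particular OCR product works: the 5-by-5 bitmaps, the two-letter template set, and the `max_mismatches` tolerance are all assumptions invented for the example.

```python
# Hypothetical 5x5 templates; '#' is a dark pixel, '.' is a light one.
TEMPLATES = {
    "L": ["#....",
          "#....",
          "#....",
          "#....",
          "#####"],
    "T": ["#####",
          "..#..",
          "..#..",
          "..#..",
          "..#.."],
}


def distance(a, b):
    """Count the pixels that differ between two same-sized bitmaps."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def recognize(glyph, max_mismatches):
    """Return the closest template's letter, or None if nothing is close enough."""
    best_letter, best_dist = min(
        ((letter, distance(glyph, tmpl)) for letter, tmpl in TEMPLATES.items()),
        key=lambda pair: pair[1],
    )
    return best_letter if best_dist <= max_mismatches else None


# An "L" with one speck of dirt on it.
dirty_l = ["#....",
           "#..#.",
           "#....",
           "#....",
           "#####"]

# A smudge that is not a letter at all.
smudge = [".....",
          ".##..",
          ".###.",
          "..##.",
          "....."]

print(recognize(dirty_l, max_mismatches=0))   # too strict: the dirty "L" is rejected
print(recognize(dirty_l, max_mismatches=3))   # moderate: the dirty "L" is accepted
print(recognize(smudge, max_mismatches=3))    # moderate: the smudge is rejected
print(recognize(smudge, max_mismatches=12))   # too flexible: the smudge is read as a letter
```

With a tolerance of zero the speck of dirt is enough to defeat recognition, while a very loose tolerance turns a meaningless smudge into a "T". The moderate setting handles both cases correctly, which is the happy medium the paragraph describes.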
The biggest problem, however, remains with "degraded" text, such as a slight bit of dirt on the paper that causes the OCR software to interpret a lower case "h" as a "b." Faxes remain a problem because faxed text frequently has poor resolution and thus confuses OCR software. Research and development continues to focus on ways to improve OCR performance under such circumstances.
On the plus side, OCR developers are working to train their software to handle potential problems. This often involves interpreting characters in their broader context, e.g., a word or sentence, rather than solely on an individual basis, to achieve greater accuracy. Faster computers with greater memory capacity have enabled such complex processing. Some OCR programs are designed to recognize correct grammar and common spellings so they automatically highlight words that they have copied, but that they also find questionable. The software, in effect, tells its user that it may have made a mistake, but it does not know what to do about it. In the end the machine may turn control of the scanning back over to the human to make the final decision about how to handle what it considers a problem.
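The idea of flagging questionable words for a human to review can be sketched with a simple word-list check. Everything here is an illustrative assumption: the tiny `LEXICON`, the `review` function, and the use of Python's standard `difflib` module to propose near matches. The suggestions are only hints; the final decision is still left to the person reviewing the text.

```python
import difflib

# Tiny hypothetical word list; a real system would use a full dictionary.
LEXICON = {"the", "scanner", "reads", "each", "page", "of", "text"}


def review(ocr_words):
    """Return (word, suggestions) pairs for words not found in the lexicon."""
    flagged = []
    for word in ocr_words:
        key = word.lower()
        if key not in LEXICON:
            # Offer up to three similar known words as hints for the reviewer.
            suggestions = difflib.get_close_matches(
                key, sorted(LEXICON), n=3, cutoff=0.6
            )
            flagged.append((word, suggestions))
    return flagged


# Simulated OCR output in which some letters "e" were read as "o".
ocr_output = "Tho scanner roads oach page of text".split()

for word, suggestions in review(ocr_output):
    print(f"Questionable: {word!r}; possible intended words: {suggestions}")
```

Only the words missing from the lexicon are highlighted, so the reviewer can concentrate on a handful of doubtful spots instead of proofreading the whole page.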
An effective OCR system depends heavily on both the physical scanner and the software used to interpret the scanner's input. A good scanner can be undermined by weak software and vice versa. Entry-level and mid-level scanners and their software can be obtained readily through retail channels; specialized and high-power systems may require contacting a regional distributor or even custom manufacturing and programming.
The most common type of scanner is the flatbed. Flatbed scanners look and act much like photocopy machines: the page to be scanned is placed flat on the scanner's glass. They generally copy a single page at a time. Advanced scanners have feeding systems for scanning large batches of documents without requiring a human operator to switch pages. These scanners may digitize dozens of pages per minute. Drum scanners are high-end devices for capturing fine details, and therefore are often used more for graphic images than for OCR. Instead of bringing the page to the scanner, a handheld scanner allows the scanner to go to the page. Software allows wide images to be "stitched" together from two passes of a handheld over a large image. Generally, however, handheld devices are not as effective at full-page scanning as flatbeds. Finally, one of the newest types of scanners looks like a pen. It allows the user to select certain lines of type in a book for scanning. This type of scanner connects to a computer printer port without the addition of any other computer boards, which may be necessary with other types of scanners.
[ Clint Johnson ]