Optical character recognition (OCR) software, generally used with a scanner, converts typed or printed paper documents into digital “images” or files whose text can be edited. The finished product can then be used with desktop publishing, word processing or text editing programs. OCR software is not new to the world of digital imaging. It has seen a resurgence in popularity, however, as products offer more formatting options and better support for smudged or otherwise “imperfect” documents. The key to using these products successfully is to first understand basic OCR terminology.
Portable document format (PDF)
A portable document format (PDF) file can include both graphics and text. This type of document is commonly used for file sharing. Some OCR software is designed for creating PDF files.
Dots per inch (DPI)
Dots per inch (DPI) measures the resolution of an image and so indicates its print quality. The term also refers to the output resolution of a printer. In OCR software, DPI refers to the number of “dots” captured per linear inch of the scanned page. The more dots per inch, the more detail is captured and the better the print quality. OCR software commonly scans documents at 200 to 300 DPI.
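As a rough illustration of what DPI means in practice, the short Python sketch below works out how many pixels a scan of a given page size would contain. The function name `scan_size_in_pixels` and the US Letter page size are assumptions chosen for the example, not part of any particular OCR product.

```python
def scan_size_in_pixels(width_in, height_in, dpi):
    """Return (width, height) in pixels for a page scanned at the given DPI."""
    # Each linear inch of paper is sampled into `dpi` dots.
    return round(width_in * dpi), round(height_in * dpi)


# A US Letter page (8.5 x 11 inches) at two scan resolutions.
for dpi in (200, 300):
    width_px, height_px = scan_size_in_pixels(8.5, 11, dpi)
    print(f"{dpi} DPI: {width_px} x {height_px} pixels")
```

Raising the resolution from 200 to 300 DPI multiplies the pixel count by 2.25, which is why higher-DPI scans capture more detail but produce larger files.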
Pixel
The term pixel combines the words picture (“pix”) and element. In OCR software, it generally refers to the smallest unit of information present in a digital image. The more pixels an image contains, the more closely the final image will resemble the original. Pixels are typically organized in a two-dimensional grid and may be depicted as dots, squares or rectangles. Arranging them in this fashion allows an operation to be carried out across the whole image by applying the same procedure to each individual pixel.
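The idea of applying one procedure to every pixel can be sketched in a few lines of Python. The example below assumes a grayscale image stored as a list of rows, with each pixel a value from 0 (black) to 255 (white), and turns it into a black-and-white image by comparing each pixel to a cutoff. The function name `binarize` and the cutoff of 128 are illustrative assumptions; real OCR products may prepare images differently.

```python
def binarize(image, threshold=128):
    """Map every grayscale pixel to 0 (black) or 255 (white)."""
    # The same comparison is applied to each pixel in each row.
    return [[255 if pixel >= threshold else 0 for pixel in row] for row in image]


# A tiny 2 x 3 grayscale image (hypothetical values).
gray = [
    [12, 200, 130],
    [90, 127, 255],
]
print(binarize(gray))
```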
Bitmap
A bitmap is the most basic type of image. Bitmap images are a grid of squares that together create an image. These “bits” can be various colors but they are uniform in size. Generally, OCR software can convert the characters in a bitmap image into text that can be imported into a word processing program.
Raster image
In OCR software, a raster image is an image made up of a grid of pixels. Two of the most common raster image formats are GIF (Graphics Interchange Format) and JPEG (Joint Photographic Experts Group).
Vector image
A vector image is a group of connected lines, curves, points or shapes that create an object. These images are defined by mathematical equations rather than pixels, so they can be scaled to any size without losing quality.
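To make the difference between an image defined by an equation and one defined by pixels concrete, the Python sketch below describes a circle mathematically and then turns it into a grid of pixels at whatever size is requested. The function name `rasterize_circle` and the text rendering are assumptions made for the example.

```python
def rasterize_circle(size):
    """Render a circle, defined by its equation, onto a size x size pixel grid."""
    center = (size - 1) / 2  # middle of the grid
    radius = size / 2        # the circle spans the full grid
    grid = []
    for y in range(size):
        row = []
        for x in range(size):
            # A pixel is "on" if it lies inside the circle: dx^2 + dy^2 <= r^2.
            inside = (x - center) ** 2 + (y - center) ** 2 <= radius ** 2
            row.append(1 if inside else 0)
        grid.append(row)
    return grid


# The same mathematical description can be rendered at any resolution.
for size in (4, 8):
    print(f"{size} x {size}:")
    for row in rasterize_circle(size):
        print("".join("#" if pixel else "." for pixel in row))
```

The equation stays the same at every size; only the number of pixels used to display it changes.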