Document scanning converts documents to a digital image representation for electronic storage or distribution. The types of documents scanned by government agencies include tax forms, patent documents, office correspondence, mail pieces, engineering drawings, microfilm, archived historical papers, and fingerprint cards. Increasingly, the resulting digital images serve as input to further automated processing, including conversion to a full-text-searchable representation via optical character recognition (OCR) of machine-printed or handwritten text, postal zone identification, raster-to-vector conversion, and fingerprint matching. These diverse document images may be bi-tonal, gray scale, or color. Spatial sampling frequencies range from about 200 pixels per inch to over 1,000. The quality of the digital images can have a major effect on the accuracy and speed of any subsequent automated processing, as well as on any human-based processing that may be required. During imaging system design there is therefore a need to specify the criteria by which image quality will be judged and, prior to system acceptance, to measure the quality of the images produced. Unfortunately, there are few, if any, agreed-upon techniques for measuring document image quality objectively. In the output images, it is difficult to distinguish degradation caused by the poor quality of the input paper or microfilm from degradation caused by the scanning system. We propose several document image quality criteria and have developed techniques for measuring them. These criteria include spatial resolution, geometric image accuracy (distortion), gray scale resolution and linearity, and temporal and spatial uniformity. Measuring these criteria requires scanning one or more test targets and performing computer-based analyses of the resulting test target images.
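As an illustration of what a computer-based analysis of a test target image might look like, the following minimal sketch covers two of the criteria named above: gray scale linearity and spatial uniformity. It is not the measurement procedure developed in this work. The function names, the use of a least-squares line fit, the tile-based uniformity measure, and the sample values are all assumptions made for the example.

```python
from statistics import mean

def linearity_fit(reference, measured):
    """Fit measured = slope * reference + intercept by least squares.

    reference: known reflectances of the gray patches on a test target
               (hypothetical values supplied by the target maker).
    measured:  mean pixel value the scanner produced for each patch.
    Returns (slope, intercept, max_abs_residual); a large residual
    indicates a departure from a linear gray scale response.
    """
    mx, my = mean(reference), mean(measured)
    sxx = sum((x - mx) ** 2 for x in reference)
    sxy = sum((x - mx) * (y - my) for x, y in zip(reference, measured))
    slope = sxy / sxx
    intercept = my - slope * mx
    max_res = max(abs(y - (slope * x + intercept))
                  for x, y in zip(reference, measured))
    return slope, intercept, max_res

def spatial_nonuniformity(image, block=2):
    """Relative spread of tile means over a scanned flat-field target.

    image: list of rows of pixel values from a nominally uniform patch.
    block: tile edge length in pixels (an assumed, illustrative choice).
    Returns (max tile mean - min tile mean) / mean of tile means;
    0.0 means the scanner response is uniform across the field.
    """
    rows, cols = len(image), len(image[0])
    tile_means = []
    for r in range(0, rows - block + 1, block):
        for c in range(0, cols - block + 1, block):
            tile = [image[r + i][c + j]
                    for i in range(block) for j in range(block)]
            tile_means.append(mean(tile))
    return (max(tile_means) - min(tile_means)) / mean(tile_means)

if __name__ == "__main__":
    # Made-up patch data: five gray patches and their scanned values.
    patches = [0.0, 0.25, 0.5, 0.75, 1.0]
    scanned = [10, 70, 130, 190, 250]
    print(linearity_fit(patches, scanned))

    # Made-up flat field whose right half reads slightly brighter.
    flat = [[100, 100, 110, 110] for _ in range(4)]
    print(spatial_nonuniformity(flat))
```

Running the same analysis on scans of the same target taken at different times would give a comparable, equally rough indication of temporal uniformity.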