Message-Id: <199507102107.QAA15991@library.wustl.edu>
Date: Mon, 10 Jul 1995 17:04:46 -0400
From: Judi Zidar <jzidar@NALUSDA.GOV>
Subject: Re: OCR and fonts
To: Multiple recipients of list IMAGELIB

Wordscan Plus [Calera/Caere Corp., 1-800-224-0660] handles serifs with no problem. In fact, when Calera used to be Palantir, their software seemed to "prefer" fonts with serifs. It claims to handle sizes down to 8 points, but it actually does a little better than that.

As for sub/superscripts -- if we output to a word-processing format that handles them, sometimes the subs/sups are fine, sometimes they're missed altogether, and sometimes they just end up as part of the running text. It depends on how good the source materials are, and on other unknown, mysterious factors. Boosting the scanning resolution above 300 dpi definitely helps, though.

I haven't tried the latest version, 4.0; my experience is based on 2.0 and 3.0.

--Judi

%  Judith A. Zidar, Coordinator           %  Internet: jzidar@nalusda.gov  %
%  Natl. Agric. Text Digitizing Program   %  Phone: (301) 504-6813         %
%  National Agricultural Library, USDA    %  Fax: (301) 504-7473           %
%  10301 Baltimore Blvd. - Rm. 013        %                                %
%  Beltsville, MD 20705-2351              %                                %

On Mon, 10 Jul 1995, Bob Rosenberg wrote:
> I'm trying to find OCR software that likes Times Roman (or related serif
> fonts). The software I first tried, which is three or four years old, loves
> the sans-serif font in one of our publications, but it just gags on the font
> we have used in our book edition, which is in two sizes. The software
> particularly dislikes the almost-9pt font size in our footnotes. I believe
> it automatically scans text at 300 dpi, which leads me to place the blame on
> the software's recognition capability, since it ought to be able to read
> 9-pt type at that resolution. Has anyone found a package that reads a
> standard serif font without difficulty? The software also brings
> superscripts down into the line and has a few other format problems, but I
> have laid that problem to age and am hoping that newer software will
> preserve format better. (I know that is the boast of the new Xerox OCR
> product.)
>
> Thanks.
>
> Bob Rosenberg
>
>
> Robert Rosenberg
> Thomas Edison Papers
> Rutgers University
> New Brunswick, NJ 08903
> rarosenb@gandalf.rutgers.edu
>
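
The resolution argument in the quoted message (that 9-pt type ought to be readable at 300 dpi) can be checked with a little arithmetic. The sketch below is only an illustration, not anything taken from either OCR package: it assumes the 72-per-inch desktop-publishing point, and the x-height ratio is a made-up round figure, since real typefaces vary. The function names are hypothetical.

```python
POINTS_PER_INCH = 72.0  # assumption: the 72-per-inch desktop-publishing point

# Hypothetical figure for illustration only: x-height as a fraction of the
# nominal point size. Real typefaces differ, so treat this as a placeholder.
ASSUMED_X_HEIGHT_RATIO = 0.45


def nominal_height_px(point_size, dpi):
    """Pixels spanned by the nominal type size at a given scan resolution."""
    return point_size / POINTS_PER_INCH * dpi


def x_height_px(point_size, dpi, ratio=ASSUMED_X_HEIGHT_RATIO):
    """Rough pixel height of lowercase letters, under the assumed ratio."""
    return nominal_height_px(point_size, dpi) * ratio


if __name__ == "__main__":
    # Compare footnote-sized type at a few common scanning resolutions.
    for dpi in (300, 400, 600):
        for size in (8, 9):
            print(
                f"{size} pt at {dpi} dpi: "
                f"{nominal_height_px(size, dpi):.1f} px nominal, "
                f"about {x_height_px(size, dpi):.1f} px x-height"
            )
```

Under these assumptions, 9-pt type spans 37.5 pixels of nominal height at 300 dpi, and doubling the resolution doubles that figure, which is consistent with the observation that scanning above 300 dpi helps with small type and superscripts.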