JL, 13 June 2007

Digitizing Text, Part II

If you've OCR'd all the paper you can, the pile that's left needs a different solution. This is the do-it-yourself method for multi-page documents.

Unless you want to type it all by hand, your other solution is scanning and saving it as graphics. If it's only a one-page document you can use Transcript to type it from the graphic, or just type it from the original. If you have a 10-page or 40-page document that you want to scan and compile into just one, it takes a bit more. I don't want to be sharing documents that look like they've been through a tornado, so I put in the extra effort. We're creating a whole generation of digitized history and it's a matter of respect, I guess.

First, the scanning. I take all the pages and scan them in order into my photo editor. That happens to be Adobe Photoshop Elements, although I'm sure any photo editor will have a similar option. In Adobe it starts with File/Import Scanner (turn the scanner on first), and they automatically hook up.

After scanning, I straighten and crop, tidy things up a bit with the Eraser tool, and resize each page. Any or all of these steps might be appropriate for cleaning up your pages before joining them together.

Straightening is a no-brainer. Pages are easier to read if they're not tilted this way and that. With a little experience, you'll know how many degrees a page is tilted and needs fixing. In the meantime just guess. It's how you'll learn. Sometimes the auto-straightener works, sometimes it doesn't, so you may need to straighten by hand.

I find the easiest
way to crop pages for a consistent result is to use a "Fixed Aspect Ratio" such as 8 x 10. Experiment to find a proportion that makes sense for that particular set of pages. For instance, if there's too much white, crop some of it off. It doesn't need a 3-inch margin all the way around.
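
For anyone who would rather work out the crop numbers than drag a box, here is a minimal sketch of the arithmetic behind a fixed-aspect-ratio crop. It is not how Photoshop Elements does it internally; the function name `centered_crop_box` and the choice to center the crop on the page are assumptions made for illustration. In the editor you would slide the box to wherever the text actually sits.

```python
def centered_crop_box(width, height, ratio_w=8, ratio_h=10):
    """Return (left, top, right, bottom) for the largest centered crop
    with the given aspect ratio, in pixels.

    Hypothetical helper: centering is an assumption, since a real crop
    would follow where the text sits on the page.
    """
    if min(width, height, ratio_w, ratio_h) <= 0:
        raise ValueError("sizes and ratio must be positive")

    # Compare width/height with ratio_w/ratio_h using integers only.
    if width * ratio_h > height * ratio_w:
        # Scan is too wide for the ratio: keep full height, trim the sides.
        new_w = height * ratio_w // ratio_h
        new_h = height
    else:
        # Scan is too tall (or exact): keep full width, trim top and bottom.
        new_w = width
        new_h = width * ratio_h // ratio_w

    left = (width - new_w) // 2
    top = (height - new_h) // 2
    return (left, top, left + new_w, top + new_h)


if __name__ == "__main__":
    # A letter-size page scanned at 300 dpi is 2550 x 3300 pixels.
    print(centered_crop_box(2550, 3300))
```

Using the same ratio on every page is what keeps the set consistent; the exact ratio matters less than sticking to it.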

Regardless of what else you've done, the pages still have to be resized to the same pixel width. Having your pages all the same width means that when you string them together in the next step they'll all line up. It also makes them easier to read, because you don't have to zoom in and out on different size pages with different size text. I find that for the purposes of creating a .pdf, a width of about 1500 pixels gives a good size. Experiment with this yourself. Sometimes you won't have enough pixels to go to 1500, and you don't want your text pixelated any more than you would want a photograph of your grandmother looking that way.

Once your pages are all
prepared, the next step is to join them together. Before I had a pdf editor I would insert each page (graphics file) into my word-processor, fit each one to the page and then save it as a pdf. Sometimes, for no known reason, the pages would go out of order or disappear while I was trying to arrange them and I'd have to start all over again. With some patience it's do-able though, and if you don't have a pdf editor it's a way to go. You may be more adept with your word-processor than I am.
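
Whichever way you join the pages, the common width from the resizing step can be worked out ahead of time. The sketch below shows one way to do it: aim for about 1500 pixels, but drop to the width of the narrowest page if any page has fewer pixels than that, so nothing gets blown up and pixelated. The function names and the "never enlarge" rule are assumptions for illustration, not a feature of any particular photo editor.

```python
def common_width(page_sizes, preferred=1500):
    """Pick one width for every page: the preferred width, or the
    narrowest page's width if that is smaller (so no page is enlarged).

    page_sizes is a list of (width, height) tuples in pixels.
    """
    if not page_sizes:
        raise ValueError("need at least one page")
    narrowest = min(width for width, _ in page_sizes)
    return min(preferred, narrowest)


def scaled_size(width, height, target_width):
    """Return (new_width, new_height), keeping the page's proportions."""
    return (target_width, round(height * target_width / width))


if __name__ == "__main__":
    # Hypothetical scan sizes; the last page was scanned at low resolution.
    pages = [(2550, 3300), (2480, 3508), (1200, 1600)]
    width = common_width(pages)
    for w, h in pages:
        print((w, h), "->", scaled_size(w, h, width))
```

With these sample sizes the low-resolution page sets the width for the whole set. If that makes the other pages too small to read comfortably, rescanning the one poor page is probably the better fix.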

I find my pdf editor much easier. I click the "create a new pdf from multiple files" button, browse for the files, click "Go" and it gets done. If pages need to be added, deleted or re-arranged, that's all possible too at the click of a button. When you're finished, this document receives a number and goes into your digital Source Library, ready for future reference or emailing to a lucky recipient.
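
If you have neither a pdf editor nor much patience for the word-processor route, a short script is a third option. This is only a sketch under stated assumptions: it uses the third-party Pillow library (not anything mentioned above), the file names are made up, and the helper names `natural_key`, `in_page_order` and `build_pdf` are hypothetical. The sorting step is there because file lists often put "page10" ahead of "page2", which is one way pages end up out of order.

```python
import re


def natural_key(filename):
    """Sort key that treats runs of digits as numbers,
    so 'page10.png' comes after 'page2.png'."""
    return [int(part) if part.isdecimal() else part.lower()
            for part in re.split(r"(\d+)", filename)]


def in_page_order(filenames):
    """Return the file names sorted into page order."""
    return sorted(filenames, key=natural_key)


def build_pdf(image_paths, output_path):
    """Join page images into one multi-page pdf.

    Assumes the third-party Pillow library is installed; it is imported
    here so the sorting helpers above work without it.
    """
    from PIL import Image

    ordered = in_page_order(image_paths)
    if not ordered:
        raise ValueError("no pages to join")

    # PDF output needs RGB (or grayscale) pages, not palette or alpha modes.
    pages = [Image.open(path).convert("RGB") for path in ordered]
    first, rest = pages[0], pages[1:]
    first.save(output_path, save_all=True, append_images=rest)


if __name__ == "__main__":
    # Made-up file names, listed the way a folder view might show them.
    print(in_page_order(["page10.png", "page2.png", "page1.png"]))
```

Calling `build_pdf(["page1.png", "page2.png"], "document.pdf")` would write the pages in order to a single file. Adding, deleting or re-arranging pages then comes down to editing the list and running it again.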