How to OCR images and printed documents using Office 2007 on Vista
By Jonathan Schlaffer
Recently we ran into a problem: one of our writers lacked a scanner but needed to get the text out of a document, a process known as OCR (Optical Character Recognition). There weren’t many good freeware options, and the few paid ones we found were fairly lacking. All he needed, it turned out, was access to a computer running Vista with Office 2007 installed, which is not hard to find in our parts.
I’m the one who eventually discovered how to do it relatively painlessly. An existing tutorial had you do several steps manually in Office Word 2007, but that was too hard and too time-consuming, so I found an easier option. It’s really too bad Word doesn’t have this built in.
You will need Microsoft Office 2007 with OneNote installed. OneNote has OCR built right in, and it is accurate, almost scarily so. Save the image containing the text you want in a standard format. I stuck with JPEG; other formats might work, but I wouldn’t wander too far off the path.
The image with the text you want can come from anywhere: a picture you found online, a scanned document or a digital camera. Insert the image into a new OneNote document, right-click the image and select “Copy Text from Picture.” Depending on the amount of text, this could take a while. Once it has finished, open another new document in OneNote and select Edit > Paste, which pastes the text gathered from the image into the new document. That was easy enough, wasn’t it?
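Text pasted from an OCR pass may keep the hard line breaks and end-of-line hyphens of the original page. If that happens, a small script can tidy it up after you save the pasted text. The following is only a sketch in Python, not part of OneNote or Office; the function name `tidy_ocr_text` is made up for illustration, and it assumes paragraphs are separated by blank lines.

```python
import re


def tidy_ocr_text(raw: str) -> str:
    """Unwrap hard line breaks in pasted OCR text (hypothetical helper)."""
    # Normalise Windows and old Mac line endings to "\n".
    text = raw.replace("\r\n", "\n").replace("\r", "\n")
    # Rejoin words split by a hyphen at the end of a line:
    # "docu-\nment" becomes "document".
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Assumption: a blank line marks a paragraph boundary.
    paragraphs = re.split(r"\n\s*\n", text)
    # Inside each paragraph, collapse every run of whitespace
    # (including the remaining line breaks) into one space.
    cleaned = [" ".join(p.split()) for p in paragraphs]
    # Drop empty paragraphs and separate the rest with one blank line.
    return "\n\n".join(p for p in cleaned if p)


if __name__ == "__main__":
    sample = "Optical charac-\r\nter recognition\r\nis handy.\r\n\r\nSecond para-\r\ngraph."
    print(tidy_ocr_text(sample))
```

Note that the hyphen rule is blunt: a genuinely hyphenated word that happens to break across two lines (for example “well-” followed by “known”) will be joined without its hyphen, so proofread the result.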

August 2nd, 2007
What a lifesaver!!! Well done! I had a PDF which I opened in CS3 and saved all the pages as .BMPs. Then I followed your instructions exactly and voilà: all seven pages were now in text! You are so right when you state that Word ’07 should have this as “standard equipment”, but hey, nobody asked us, right?
THANK YOU SO MUCH!!
August 2nd, 2007
I’m glad the post helped you.
August 4th, 2007
It’s also very easy to extract text from an image using Office 2007. You can utilise its OCR facility through Microsoft Office Document Imaging.
Read about it here:
http://vasudevg.blogspot.com/2007/08/ocr-feature-in-office-2007.html
March 13th, 2008
Hi,
Yes, thanks for sharing your experience!
You cannot imagine how much time we have saved by reading text directly from the image!
April 18th, 2008
Hi,
I have no idea why… but it didn’t work for me… when I followed the instructions, nothing happened when I selected “Copy Text from Picture” and nothing happened when I pasted into a new OneNote page.
Do you know why this happened?
Thanks
October 5th, 2008
Yes, I’ve got the same problem as Lew.
October 14th, 2008
I knew that Microsoft Office Document Imaging could OCR images in Office 2003, but page layout was not maintained there. Does OneNote maintain the page layout, table/paragraph formatting, etc.?
In any case, there’s a free web service called OCR Terminal (www.ocrterminal.com) that can do this.