
July 29, 2007

How to OCR images and printed documents using Office 2007 on Vista

By Jonathan Schlaffer





Recently we became interested in a problem: one of our writers lacked a scanner but needed to get text out of a document, a task known as “OCR” or Optical Character Recognition.  There weren’t many good freeware options, and the few paid ones were fairly lacking.  All he needed was access to a computer running Vista with Office 2007 installed, which is not hard to find in our parts.

I’m the one who eventually discovered how to do it relatively painlessly.  An existing tutorial has you do some things manually in Office Word 2007, but that was too hard and too time-consuming, so I found an easier option.  It’s really too bad Word doesn’t have this built in.

You will need Microsoft Office 2007 with OneNote installed.  OneNote has OCR abilities built right in, and it is accurate to the point of being scary.  All you need to do is save the image containing the text you want in any standard format.  I stuck with JPEG; others might work, but I wouldn’t wander too far off the path.
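If you are not sure what format an image file really is (file extensions can lie), a quick check of its first few bytes will tell you before you insert it into OneNote.  The following is only a minimal Python sketch, not part of the OneNote workflow itself; the helper names `sniff_image_format` and `image_format_of` are made up for illustration, and the list covers just a handful of common formats.

```python
# Hypothetical helper (not part of Office or OneNote): guess an image's
# format from the well-known signature bytes at the start of the file.
SIGNATURES = [
    (b"\xff\xd8\xff", "JPEG"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
    (b"BM", "BMP"),
    (b"II*\x00", "TIFF"),   # little-endian TIFF
    (b"MM\x00*", "TIFF"),   # big-endian TIFF
]


def sniff_image_format(header):
    """Return a format name for the given leading bytes, or None if unknown."""
    for magic, name in SIGNATURES:
        if header.startswith(magic):
            return name
    return None


def image_format_of(path):
    """Read the first bytes of the file at `path` and guess its format."""
    with open(path, "rb") as f:
        return sniff_image_format(f.read(16))
```

For example, calling `image_format_of` on a file saved from a digital camera should normally return `"JPEG"`; a result of `None` suggests re-saving the image as a JPEG first.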

The image with the text you want can come from anywhere: something you found online, a scanned document or a digital camera photo.  Insert the image into a new OneNote document, right-click the image and select “Copy Text from Picture.”  Depending on the amount of text, this could take a while.  Once it has finished, open another new document in OneNote and select Edit > Paste, which pastes the text gathered from the image into the new document.  That was easy enough, wasn’t it?



    7 Responses to “How to OCR images and printed documents using Office 2007 on Vista”

    1. Jon Cazne:

      What a lifesaver!!! Well done! I had a PDF which I opened in CS3 and saved all the pages as .BMP files. Then I followed your instructions exactly and voilà: all seven pages were now in text! You are so right when you state that Word ‘07 should have this as “standard equipment”, but hey, nobody asked us, right?

      THANK YOU SO MUCH!!

    2. Jonathan:

      I’m glad the post helped you.

    3. vasudev:

      It’s also very easy to extract text from an image using Office 2007. You can use its OCR facility through Microsoft Office Document Imaging.
      Read about it here:
      http://vasudevg.blogspot.com/2007/08/ocr-feature-in-office-2007.html

    4. Toni:

      Hi,

      Yes, Thanks for sharing your experience!

      You cannot imagine how much time we have saved reading text directly from the image!

      :-)

    5. Lew:

      Hi,

      I have no idea why, but it didn’t work for me. When I followed the instructions, nothing happened when I selected “Copy Text from Picture” and nothing happened when I pasted onto a new OneNote page.
      Do you know why this happened?

      Thanks

    6. Adam:

      Yes, I’ve got the same problem as Lew.

    7. mk:

      I knew that Microsoft Office Document Imaging could OCR images in Office 2003, but page layout was not maintained there. Does OneNote maintain the page layout, table and paragraph formatting, etc.?

      In any case, there’s a free web service called OCR Terminal (www.ocrterminal.com) that can do this.
