How to OCR images and printed documents using Office 2007 on Vista

July 29, 2007

Recently we became interested in a problem: one of our writers lacked a scanner but needed to get the text out of a document, a task known as OCR (Optical Character Recognition).  There weren’t many good freeware options, and the few paid ones were fairly lacking.  All he needed was access to a computer running Vista with Office 2007 installed, which is not hard to find in our parts.

I’m the one who eventually discovered how to do it relatively painlessly.  One tutorial would have you do some of the work manually in Office Word 2007, but that was too hard and too time-consuming, so I found an easier option.  It’s really too bad Word doesn’t have this built in.

You will need Microsoft Office 2007 with OneNote installed.  OneNote has OCR abilities built right in, and it is accurate to the point of being scary.  All you need to do is save the image containing the text you want in a standard format.  I stuck with JPEG; other formats might work, but I wouldn’t wander too far off the path.

The image with the text you want can come from anywhere: an image you found online, a scanned document or a digital camera.  Insert the image into a new OneNote document, right-click the image and select “Copy Text from Picture.”  Depending on the amount of text, this could take a while.  Once it has finished, open another new document in OneNote and select Edit > Paste, which pastes the text gathered from the image into the new document.  That was easy enough, wasn’t it?
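Text pasted out of an OCR pass usually keeps the line breaks of the original page. If that bothers you, a few lines of Python can reflow it into paragraphs. This is only a minimal sketch and not part of OneNote: the function name `tidy_ocr_text` is made up for illustration, and it assumes paragraphs are separated by blank lines and that a hyphen at the end of a line always marks a split word.

```python
import re


def tidy_ocr_text(raw: str) -> str:
    """Reflow OCR-pasted text into paragraphs (hypothetical helper)."""
    # Normalize Windows and old Mac line endings to "\n".
    text = raw.replace("\r\n", "\n").replace("\r", "\n")
    # Rejoin words split by a hyphen at a line end: "docu-\nment" -> "document".
    # Assumption: this also merges genuine hyphenated compounds that happen
    # to fall at a line end.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Treat one or more blank lines as a paragraph break.
    paragraphs = re.split(r"\n\s*\n", text)
    cleaned = []
    for paragraph in paragraphs:
        # Collapse remaining line breaks and runs of spaces into single spaces.
        flat = " ".join(paragraph.split())
        if flat:
            cleaned.append(flat)
    return "\n\n".join(cleaned)


if __name__ == "__main__":
    sample = "Optical char-\nacter  recognition\nworks.\r\n\r\nSecond   paragraph."
    print(tidy_ocr_text(sample))
```

Paste the OneNote output into a string (or read it from a text file) and pass it through the function before dropping it into Word. It will not bring back tables or page layout, only tidy the plain text.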

23 Responses to “How to OCR images and printed documents using Office 2007 on Vista”

  1. Jon Cazne:

    What a lifesaver!!! Well done! I had a PDF which I opened in CS3 and saved all the pages as .BMP’s. Then I followed your instructions exactly and voila’; all seven pages were now in text! You are so right when you state that Word ’07 should have this as “standard equipment”, but hey; nobody asked us right?

    THANK YOU SO MUCH!!

  2. Jonathan:

    I’m glad the post helped you.

  3. vasudev:

    Its also very easy to extract text from image using Office 2007. You can utilise its OCR facility using Microsoft Office Document Imaging.
    Check about this here:
    http://vasudevg.blogspot.com/2007/08/ocr-feature-in-office-2007.html

  4. Toni:

    Hi,

    Yes, Thanks for sharing your experience!

    You can not imagine how much time have we saved reading text directly from the image!

    :-)

  5. Lew:

    Hi,

    I have no idea why… but it didn’t work for me… when I followed the instruction nothing happened when I select “Copy Text from Picture” and nothing happened when I pasted it on a new OneNote page.
    Do you know why this happened???

    Thanks

  6. Adam:

    Yes I’ve got the same problem as Lew

  7. mk:

    I knew that Microsoft Office Document Imaging could OCR images in Office 2003. Page layout was not maintained here. Does One Note maintain the page layout, table/ paragraph formatting etc?

    In any case, there’s a free web service called OCR Terminal (www.ocrterminal.com) that can do this.

  8. Mohammed:

    Great!! thanks a lot for that post, it was really helpful and saved me a lot of time to search for a good OCR progs

  9. kevin:

    the result is bad, it can hardly identify the words

  10. Heather:

    Awesome tip, thank you! Saved us a few hours of work.
    notes – used a PDF, was not able to select page, had to select what i needed from the page manually and then copy/paste that into one note. then it rendered in a usable format. it wasn’t pretty, but it works!

  11. Mikey:

    thanks a lot that made my work very easy….
    I was forgetting my own styl of doin that..

    its when you scan something from a scanner then you can select save as tiff or word convertible doc file and then open it with word and convert it …something that sort i dont remember now i compeleted my project of 100 pages of engineering,,,,,again thanks a lot.

  12. Sascha:

    Absolutely super! You are the wizard of all wizards. This works great – thank you so much!!!!!

  13. R C Klostermeyer:

    One Note indeed converts the “image” to text and in fact you can paste it directly to Word as well.

    But…the formatting is lost however you do it and you are left with a mess to reformat. Almost as easy to type it in manually.

    There must be something missing here.

  14. SAFWAT:

    man i cud kiss u rite now
    this works very good

  15. Yutie:

    I couldn’t use it. I had tried right click on the image but it didn’t have “copy text from picture”.
    Is there any other version of microsoft 2007 that doestn’t have this OCR like i do?
    I’ve just wonder could there be any options to do this?

  16. Yutie:

    That was my misunderstanding. I can do it now.
    Thank you for your information. It have helped me a lot for doing my study project. :)

  17. misaki:

    You can Using Open Office, this convenient way to save a word document to pdf files. Abode company have the same service. But some word 2003 ,2000 can be convert to pdf files. Second way: You can Using Online Converter, Search for in the internet.
    Click ‘browse’ and select the file on your computer.
    Click ‘Convert to PDF’ and wait for it to process.
    They will send the link to your PDF file to your e-mail inbox. Maybe sometimes this way waste long time to get E-MAIL. It’s not convenient.
    There are also many website can convert, You can try them too. But I chose it only because its fast, secure and more importantly supports upto 20 MB of file size to convert to PDF which others don’t.

  18. Dan:

    Thanks for the tip. Good if you just want the text.

    You said “Options…a few paid ones but they were fairly lacking.”

    I have used Omnipage and TextBridge for years and they work very well. Omnipage will create a copy of the page including images, that has editable text. Text Bridge just captures the text and is generally the same layout as the original.

  19. James McNamara:

    Many thanks for that nugget……you’re a lifesaver !!

  20. Greg:

    Thank you, thank you, thank you!!!!

  21. Jeet Kumar Budha Magar:

    This is actually much easier. I always wondered how I could extract texts from scanned documents. Thanks

  22. Torben:

    Above all the above programmes I would prefer Abby Finereader. It’s much better, more intuitive and also available in a free version with full functionality.

  23. Kevin West:

    Thank you for this concise and useful information.
    It worked perfectly well.
    Actually better than some OCR software I have used.

Copyright © 2014 Blorge.com NS