Simon Willison’s Weblog

tesseract-ocr. Open source OCR, sponsored by Google. I just sat in on a talk on this at OSCON and the complexity of the problem is pretty incredible.

1 comment

  1. Here is a free online OCR service based on the Tesseract OCR engine. It recognizes multilingual text from a scanned document or photo, handles image files in different formats, can process 29 languages (Bulgarian, Catalan, Czech, Danish, Dutch, English, Finnish, French, German, Greek, Hungarian, Indonesian, Italian, Latvian, Lithuanian, Norwegian, Polish, Portuguese, Romanian, Russian, Serbian, Slovak, Slovene, Spanish, Swedish, Tagalog, Turkish, Ukrainian, Vietnamese), and supports layout analysis, which means that it can interpret multi-column text.

    Alex - 29th October 2009 19:37
