Simon Willison’s Weblog


4 items tagged “documentcloud”


Our search for the best OCR tool in 2023, and what we found. DocumentCloud’s Sanjin Ibrahimovic reviews the best options for OCR. Tesseract scores highly for easily machine-readable text; newcomer docTR is great for ease of use but still not great at handwriting. Amazon Textract is great for everything except non-Latin languages, while Google Cloud Vision is great at pretty much everything except ease of use. Azure AI Document Intelligence sounds worth considering as well.

# 31st October 2023, 7:21 pm / documentcloud, ocr


Backbone.js. As should be expected for a DocumentCloud project, Backbone is a concise, elegant and educational take on the JavaScript MVC pattern. Depends on Underscore.js and plays well with jQuery.

# 13th October 2010, 5:23 pm / documentcloud, javascript, jquery, mvc, underscore, backbone, recovered


The annotated grammar for CoffeeScript, a new language that compiles to JavaScript, developed by DocumentCloud’s Jeremy Ashkenas. The linked page is generated using Jeremy’s Docco tool for literate programming, also written in CoffeeScript. CoffeeScript itself is implemented in CoffeeScript, using a bootstrap compiler originally written in Ruby.

# 8th March 2010, 7:27 pm / coffeescript, compilers, docco, documentcloud, javascript, jeremy-ashkenas, literateprogramming, programming, ruby, selfhosting


Underscore.js. A new library of functional programming primitives for JavaScript: each, map, all, any, inject, detect, etc. Unlike some similar libraries, this one doesn’t extend the built-in objects, instead opting to bind the new functions to the underscore symbol. A jQuery-style noConflict() option is available if even that is too much namespace pollution for you.

# 28th October 2009, 5:08 pm / documentcloud, functional, javascript, jquery, noconflict, underscore
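
To make the namespace idea in the Underscore.js entry concrete, here is a minimal TypeScript sketch of the general pattern: helpers hang off a single object bound to `_` rather than being added to built-in prototypes, and a `noConflict()` call hands the name back to its previous owner. This is not Underscore's actual source. The `Scope` record standing in for the global object, the `install` function, and the `Toolkit` interface are all hypothetical names invented for illustration; only the helper names (each, map, all, any, inject, detect) and `noConflict` come from the entry above.

```typescript
// Hypothetical stand-in for the global object, so the sketch runs anywhere.
type Scope = Record<string, unknown>;

function each<T>(items: T[], fn: (item: T, index: number) => void): void {
  for (let i = 0; i < items.length; i++) fn(items[i], i);
}

function map<T, U>(items: T[], fn: (item: T, index: number) => U): U[] {
  const out: U[] = [];
  for (let i = 0; i < items.length; i++) out.push(fn(items[i], i));
  return out;
}

function all<T>(items: T[], pred: (item: T) => boolean): boolean {
  for (const item of items) if (!pred(item)) return false;
  return true;
}

function any<T>(items: T[], pred: (item: T) => boolean): boolean {
  for (const item of items) if (pred(item)) return true;
  return false;
}

function inject<T, A>(items: T[], fn: (acc: A, item: T) => A, initial: A): A {
  let acc = initial;
  for (const item of items) acc = fn(acc, item);
  return acc;
}

function detect<T>(items: T[], pred: (item: T) => boolean): T | undefined {
  for (const item of items) if (pred(item)) return item;
  return undefined;
}

// The helpers live on one object; built-ins such as Array.prototype are untouched.
interface Toolkit {
  each: typeof each;
  map: typeof map;
  all: typeof all;
  any: typeof any;
  inject: typeof inject;
  detect: typeof detect;
  noConflict(): Toolkit;
}

// Bind the toolkit to "_" in the given scope, remembering what was there before.
function install(scope: Scope): Toolkit {
  const previous = scope["_"];
  const api: Toolkit = {
    each, map, all, any, inject, detect,
    noConflict() {
      scope["_"] = previous; // give the name back to its earlier owner
      return api;            // caller keeps the toolkit under a name of its choice
    },
  };
  scope["_"] = api;
  return api;
}

// Usage: another library already owns "_" in this scope.
const scope: Scope = { _: "some other library" };
install(scope);
const u = (scope["_"] as Toolkit).noConflict();

const doubled = u.map([1, 2, 3], (n) => n * 2);            // [2, 4, 6]
const total = u.inject([1, 2, 3], (sum, n) => sum + n, 0); // 6
```

After `noConflict()` runs, `scope["_"]` holds its earlier value again and the helpers remain reachable through `u`, which is the trade the entry describes: one shared symbol by default, zero if you ask for it.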