Simon Willison’s Weblog


9th November 2021

TIL: Using Tesseract.js to OCR every image on a page. Pasting the code from this TIL into a DevTools console should load [Tesseract.js](https://github.com/naptha/tesseract.js) from a CDN, loop through every image loaded by that page (every PNG, GIF, JPG or JPEG), run OCR on each one and output the results to the DevTools console.
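Here's a rough sketch of how a snippet like that could look. It is not the code from the TIL itself. The CDN URL and pinned version are assumptions, `isOcrCandidate`, `loadScript` and `ocrAllImages` are hypothetical helper names, and it only looks at `<img>` elements (via `document.images`), not CSS backgrounds. It uses `Tesseract.recognize(image, "eng")`, which resolves to an object with the recognized text at `data.text`.

```javascript
// Assumed CDN URL and version; swap in whichever Tesseract.js build you prefer.
const TESSERACT_CDN =
  "https://cdn.jsdelivr.net/npm/tesseract.js@5/dist/tesseract.min.js";

// Hypothetical helper: true if the URL ends in .png, .gif, .jpg or .jpeg,
// optionally followed by a query string or fragment.
function isOcrCandidate(url) {
  return /\.(png|gif|jpe?g)(\?.*)?(#.*)?$/i.test(url);
}

// Hypothetical helper: inject a <script> tag and wait for it to load.
function loadScript(src) {
  return new Promise((resolve, reject) => {
    const s = document.createElement("script");
    s.src = src;
    s.onload = resolve;
    s.onerror = () => reject(new Error("Failed to load " + src));
    document.head.appendChild(s);
  });
}

async function ocrAllImages() {
  // Only load the library if the page does not already have it.
  if (typeof Tesseract === "undefined") {
    await loadScript(TESSERACT_CDN);
  }
  // Collect unique image URLs with a matching extension.
  const urls = [
    ...new Set([...document.images].map((img) => img.src).filter(isOcrCandidate)),
  ];
  for (const url of urls) {
    try {
      const { data: { text } } = await Tesseract.recognize(url, "eng");
      console.log(url, "\n", text.trim());
    } catch (err) {
      // Cross-origin images without CORS headers may fail here.
      console.warn("OCR failed for", url, err);
    }
  }
}

// Only run in a browser context, such as the DevTools console.
if (typeof document !== "undefined") {
  ocrAllImages();
}
```

Pages with a strict Content Security Policy may refuse to load the script from the CDN, and images served from other origins may not be readable, so expect this to work on some pages and not others.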


This is a beat by Simon Willison, posted on 9th November 2021.
