textract-cli. This is my other OCR project from yesterday: I built the thinnest possible CLI wrapper around Amazon Textract, out of frustration at how hard that tool is to use on an ad-hoc basis.
It only works with JPEGs and PNGs (not PDFs) up to 5MB in size, reflecting limitations in Textract’s synchronous API: it can handle PDFs amazingly well but you have to upload them to an S3 bucket yet and I decided to keep the scope tight for the first version of this tool.
Assuming you’ve configured AWS credentials already, this is all you need to know:
pipx install textract-cli
textract-cli image.jpeg > output.txt
Recent articles
- Olmo 3 is a fully open LLM - 22nd November 2025
- Nano Banana Pro aka gemini-3-pro-image-preview is the best available image generation model - 20th November 2025
- How I automate my Substack newsletter with content from my blog - 19th November 2025