Simon Willison’s Weblog

Subscribe

AI for data engineers with Simon Willison. I recorded an episode last week with Claire Giordano for the Talking Postgres podcast. The topic was "AI for data engineers" but we ended up covering an enjoyable range of different topics.

  • How I got started programming with a Commodore 64 - the tape drive for which inspired the name Datasette
  • Selfish motivations for TILs (force me to write up my notes) and open source (help me never have to solve the same problem twice)
  • LLMs have been good at SQL for a couple of years now. Here's how I used them for a complex PostgreSQL query that extracted alt text from my blog's images using regular expressions
  • Structured data extraction as the most economically valuable application of LLMs for data work
  • 2025 has been the year of tool calling a loop ("agentic" if you like)
  • Thoughts on running MCPs securely - read-only database access, think about sandboxes, use PostgreSQL permissions, watch out for the lethal trifecta
  • Jargon guide: Agents, MCP, RAG, Tokens
  • How to get started learning to prompt: play with the models and "bring AI to the table" even for tasks that you don't think it can handle
  • "It's always a good day if you see a pelican"

Monthly briefing

Sponsor me for $10/month and get a curated email digest of the month's most important LLM developments.

Pay me to send you less!

Sponsor & subscribe