Simon Willison’s Weblog

Subscribe
Atom feed

Entries

Filters: Sorted by date

Experimenting with audio input and output for the OpenAI Chat Completion API

Visit Experimenting with audio input and output for the OpenAI Chat Completion API

OpenAI promised this at DevDay a few weeks ago and now it’s here: their Chat Completion API can now accept audio as input and return it as output. OpenAI still recommend their WebSocket-based Realtime API for audio tasks, but the Chat Completion API is a whole lot easier to write code against.

[... 1,555 words]

Video scraping: extracting JSON data from a 35 second screen capture for less than 1/10th of a cent

Visit Video scraping: extracting JSON data from a 35 second screen capture for less than 1/10th of a cent

The other day I found myself needing to add up some numeric values that were scattered across twelve different emails.

[... 1,294 words]

ChatGPT will happily write you a thinly disguised horoscope

Visit ChatGPT will happily write you a thinly disguised horoscope

There’s a meme floating around at the moment where you ask ChatGPT the following and it appears to offer deep insight into your personality:

[... 1,236 words]

OpenAI DevDay: Let’s build developer tools, not digital God

Visit OpenAI DevDay: Let’s build developer tools, not digital God

I had a fun time live blogging OpenAI DevDay yesterday—I’ve now shared notes about the live blogging system I threw other in a hurry on the day (with assistance from Claude and GPT-4o). Now that the smoke has settled a little, here are my impressions from the event.

[... 2,090 words]

OpenAI DevDay 2024 live blog

Visit OpenAI DevDay 2024 live blog

I’m at OpenAI DevDay in San Francisco, and I’m trying something new: a live blog, where this entry will be updated with new notes during the event.

[... 68 words]

Weeknotes: Three podcasts, two trips and a new plugin system

I fell behind a bit on my weeknotes. Here’s most of what I’ve been doing in September.

[... 693 words]

NotebookLM’s automatically generated podcasts are surprisingly effective

Visit NotebookLM's automatically generated podcasts are surprisingly effective

Audio Overview is a fun new feature of Google’s NotebookLM which is getting a lot of attention right now. It generates a one-off custom podcast against content you provide, where two AI hosts start up a “deep dive” discussion about the collected content. These last around ten minutes and are very podcast, with an astonishingly convincing audio back-and-forth conversation.

[... 1,489 words]

Themes from DjangoCon US 2024

Visit Themes from DjangoCon US 2024

I just arrived home from a trip to Durham, North Carolina for DjangoCon US 2024. I’ve already written about my talk where I announced a new plugin system for Django; here are my notes on some of the other themes that resonated with me during the conference.

[... 1,470 words]

DJP: A plugin system for Django

Visit DJP: A plugin system for Django

DJP is a new plugin mechanism for Django, built on top of Pluggy. I announced the first version of DJP during my talk yesterday at DjangoCon US 2024, How to design and implement extensible software with plugins. I’ll post a full write-up of that talk once the video becomes available—this post describes DJP and how to use what I’ve built so far.

[... 1,664 words]

Notes on using LLMs for code

Visit Notes on using LLMs for code

I was recently the guest on TWIML—the This Week in Machine Learning & AI podcast. Our episode is titled Supercharging Developer Productivity with ChatGPT and Claude with Simon Willison, and the focus of the conversation was the ways in which I use LLM tools in my day-to-day work as a software developer and product engineer.

[... 861 words]

Things I’ve learned serving on the board of the Python Software Foundation

Two years ago I was elected to the board of directors for the Python Software Foundation—the PSF. I recently returned from the annual PSF board retreat (this one was in Lisbon, Portugal) and this feels like a good opportunity to write up some of the things I’ve learned along the way.

[... 2,702 words]

Notes on OpenAI’s new o1 chain-of-thought models

OpenAI released two major new preview models today: o1-preview and o1-mini (that mini one is not a preview)—previously rumored as having the codename “strawberry”. There’s a lot to understand about these models—they’re not as simple as the next step up from GPT-4o, instead introducing some major trade-offs in terms of cost and performance in exchange for improved “reasoning” capabilities.

[... 1,568 words]

Notes from my appearance on the Software Misadventures Podcast

Visit Notes from my appearance on the Software Misadventures Podcast

I was a guest on Ronak Nathani and Guang Yang’s Software Misadventures Podcast, which interviews seasoned software engineers about their careers so far and their misadventures along the way. Here’s the episode: LLMs are like your weird, over-confident intern | Simon Willison (Datasette).

[... 1,740 words]

Teresa T is name of the whale in Pillar Point Harbor near Half Moon Bay

Visit Teresa T is name of the whale in Pillar Point Harbor near Half Moon Bay

There is a young humpback whale in the harbor at Pillar Point, just north of Half Moon Bay, California right now. Their name is Teresa T and they were first spotted on Thursday afternoon.

[... 254 words]

Calling LLMs from client-side JavaScript, converting PDFs to HTML + weeknotes

Visit Calling LLMs from client-side JavaScript, converting PDFs to HTML + weeknotes

I’ve been having a bunch of fun taking advantage of CORS-enabled LLM APIs to build client-side JavaScript applications that access LLMs directly. I also span up a new Datasette plugin for advanced permission management.

[... 2,050 words]

Building a tool showing how Gemini Pro can return bounding boxes for objects in images

Visit Building a tool showing how Gemini Pro can return bounding boxes for objects in images

I was browsing through Google’s Gemini documentation while researching how different multi-model LLM APIs work when I stumbled across this note in the vision documentation:

[... 1,792 words]

Claude’s API now supports CORS requests, enabling client-side applications

Visit Claude's API now supports CORS requests, enabling client-side applications

Anthropic have enabled CORS support for their JSON APIs, which means it’s now possible to call the Claude LLMs directly from a user’s browser.

[... 625 words]

Optimizing Datasette (and other weeknotes)

Visit Optimizing Datasette (and other weeknotes)

I’ve been working with Alex Garcia on an experiment involving using Datasette to explore FEC contributions. We currently have a 11GB SQLite database—trivial for SQLite to handle, but at the upper end of what I’ve comfortably explored with Datasette in the past.

[... 2,069 words]

django-http-debug, a new Django app mostly written by Claude

Visit django-http-debug, a new Django app mostly written by Claude

Yesterday I finally developed something I’ve been casually thinking about building for a long time: django-http-debug. It’s a reusable Django app—something you can pip install into any Django project—which provides tools for quickly setting up a URL that returns a canned HTTP response and logs the full details of any incoming request to a database table.

[... 2,692 words]

Weeknotes: a staging environment, a Datasette alpha and a bunch of new LLMs

My big achievement for the last two weeks was finally wrapping up work on the Datasette Cloud staging environment. I also shipped a new Datasette 1.0 alpha and added support to the LLM ecosystem for a bunch of newly released models.

[... 1,465 words]

Datasette 1.0a14: The annotated release notes

Visit Datasette 1.0a14: The annotated release notes

Released today: Datasette 1.0a14. This alpha includes significant contributions from Alex Garcia, including some backwards-incompatible changes in the run-up to the 1.0 release.

[... 1,424 words]

Weeknotes: GPT-4o mini, LLM 0.15, sqlite-utils 3.37 and building a staging environment

Upgrades to LLM to support the latest models, and a whole bunch of invisible work building out a staging environment for Datasette Cloud.

[... 730 words]

Imitation Intelligence, my keynote for PyCon US 2024

Visit Imitation Intelligence, my keynote for PyCon US 2024

I gave an invited keynote at PyCon US 2024 in Pittsburgh this year. My goal was to say some interesting things about AI—specifically about Large Language Models—both to help catch people up who may not have been paying close attention, but also to give people who were paying close attention some new things to think about.

[... 10,624 words]

Give people something to link to so they can talk about your features and ideas

If you have a project, an idea, a product feature, or anything else that you want other people to understand and have conversations about... give them something to link to!

[... 685 words]

Weeknotes: a livestream, a surprise keynote and progress on Datasette Cloud billing

Visit Weeknotes: a livestream, a surprise keynote and progress on Datasette Cloud billing

My first YouTube livestream with Val Town, a keynote at the AI Engineer World’s Fair and some work integrating Stripe with Datasette Cloud. Plus a bunch of upgrades to my blog.

[... 1,124 words]

Open challenges for AI engineering

Visit Open challenges for AI engineering

I gave the opening keynote at the AI Engineer World’s Fair yesterday. I was a late addition to the schedule: OpenAI pulled out of their slot at the last minute, and I was invited to put together a 20 minute talk with just under 24 hours notice!

[... 5,640 words]

Building search-based RAG using Claude, Datasette and Val Town

Visit Building search-based RAG using Claude, Datasette and Val Town

Retrieval Augmented Generation (RAG) is a technique for adding extra “knowledge” to systems built on LLMs, allowing them to answer questions against custom information not included in their training data. A common way to implement this is to take a question from a user, translate that into a set of search queries, run those against a search engine and then feed the results back into the LLM to generate an answer.

[... 3,372 words]

Weeknotes: Datasette Studio and a whole lot of blogging

Visit Weeknotes: Datasette Studio and a whole lot of blogging

I’m still spinning back up after my trip back to the UK, so actual time spent building things has been less than I’d like. I presented an hour long workshop on command-line LLM usage, wrote five full blog entries (since my last weeknotes) and I’ve also been leaning more into short-form link blogging—a lot more prominent on this site now since my homepage redesign last week.

[... 736 words]

Language models on the command-line

Visit Language models on the command-line

I gave a talk about accessing Large Language Models from the command-line last week as part of the Mastering LLMs: A Conference For Developers & Data Scientists six week long online conference. The talk focused on my LLM Python command-line utility and ways you can use it (and its plugins) to explore LLMs and use them for useful tasks.

[... 4,992 words]

A homepage redesign for my blog’s 22nd birthday

Visit A homepage redesign for my blog's 22nd birthday

This blog is 22 years old today! I wrote up a whole bunch of higlights for the 20th birthday a couple of years ago. Today I’m celebrating with something a bit smaller: I finally redesigned the homepage.

[... 314 words]