Simon Willison’s Weblog

Items tagged projects in 2023


Weeknotes: symbex, LLM prompt templates, a bit of a break

I had a holiday to the UK for a family wedding anniversary and mostly took the time off... except for building symbex, which became one of those projects that kept on inspiring new features.

[... 1120 words]

Symbex: search Python code for functions and classes, then pipe them into a LLM

I just released a new Python CLI tool called Symbex. It’s a search tool, loosely inspired by ripgrep, which lets you search Python code for functions and classes by name or wildcard, then see just the source code of those matching entities.

[... 1183 words]

LLM 0.4. I released a major update to my LLM CLI tool today—version 0.4, which adds conversation mode and prompt templates so you can store and re-use interesting prompts, plus a whole bunch of other large and small improvements.

I also released 0.4.1 with some minor fixes and the ability to install the tool using Homebrew: brew install simonw/llm/llm # 17th June 2023, 10:58 pm

Understanding GPT tokenizers

Large language models such as GPT-3/4, LLaMA and PaLM work in terms of tokens. They take text, convert it into tokens (integers), then predict which tokens should come next.

[... 1570 words]
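To make the "text becomes integers" step concrete, here is a toy sketch. Real GPT tokenizers use byte pair encoding with a vocabulary of many thousands of entries; the tiny `TOY_VOCAB` below and the greedy longest-match strategy are assumptions chosen only to keep the example short.

```python
# A made-up vocabulary mapping text pieces to integer token IDs
TOY_VOCAB = {"The": 0, " dog": 1, " eat": 2, "s": 3, " the": 4, " apple": 5}


def encode(text: str, vocab: dict[str, int]) -> list[int]:
    """Greedy longest-match encoding: at each position, take the
    longest vocabulary entry that matches the upcoming text."""
    max_len = max(len(piece) for piece in vocab)
    tokens = []
    i = 0
    while i < len(text):
        for length in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + length]
            if piece in vocab:
                tokens.append(vocab[piece])
                i += length
                break
        else:
            raise ValueError(f"no token covers {text[i]!r}")
    return tokens


def decode(tokens: list[int], vocab: dict[str, int]) -> str:
    """Map token IDs back to their text pieces and join them."""
    reverse = {token_id: piece for piece, token_id in vocab.items()}
    return "".join(reverse[t] for t in tokens)


ids = encode("The dog eats the apples", TOY_VOCAB)
print(ids)                     # [0, 1, 2, 3, 4, 5, 3]
print(decode(ids, TOY_VOCAB))  # The dog eats the apples
```

Note how the leading space is part of tokens such as " dog", and how "eats" splits into two tokens. A language model then works on sequences of these integers rather than on raw characters.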

Weeknotes: Parquet in Datasette Lite, various talks, more LLM hacking

I’ve fallen a bit behind on my weeknotes. Here’s a catchup for the last few weeks.

[... 769 words]

Trogon (via) The latest project from the Textualize/Rich crew, Trogon provides a Python decorator—@tui—which, when applied to a Click CLI application, adds a new interactive TUI mode which introspects the available subcommands and their options and creates a full Text User Interface—with keyboard and mouse support—for assembling invocations of those various commands.

I just shipped sqlite-utils 3.32 with support for this—it uses an optional dependency, so you’ll need to run “sqlite-utils install trogon” and then “sqlite-utils tui” to try it out. # 21st May 2023, 9:39 pm

llm, ttok and strip-tags—CLI tools for working with ChatGPT and other LLMs

I’ve been building out a small suite of command-line tools for working with ChatGPT, GPT-4 and potentially other language models in the future.

[... 1317 words]

Weeknotes: sqlite-utils 3.31, download-esm, Python in a sandbox

A couple of speaking appearances last week—one planned, one unplanned. Plus sqlite-utils 3.31, download-esm and a new TIL.

[... 608 words]

download-esm: a tool for downloading ECMAScript modules

I’ve built a new CLI tool, download-esm, which takes the name of an npm package and will attempt to download the ECMAScript module version of that package, plus all of its dependencies, directly from the jsDelivr CDN—and then rewrite all of the import statements to point to those local copies.

[... 1240 words]

Weeknotes: Miscellaneous research into Rye, ChatGPT Code Interpreter and openai-to-sqlite

I gave myself some time off stressing about my core responsibilities this week after PyCon, which meant allowing myself to be distracted by some miscellaneous research projects.

[... 891 words]

Enriching data with GPT3.5 and SQLite SQL functions

I shipped openai-to-sqlite 0.3 yesterday with a fun new feature: you can now use the command-line tool to enrich data in a SQLite database by running values through an OpenAI model and saving the results, all in a single SQL query.

[... 1219 words]
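The general technique of enriching rows through a SQL function can be sketched with Python's built-in `sqlite3` module, which lets you register a Python callable as a SQL function. This is not how openai-to-sqlite is implemented: the `classify` function name, the `messages` table, and the `fake_model` stand-in (used here instead of a real OpenAI API call) are all assumptions for illustration.

```python
import sqlite3


def fake_model(text: str) -> str:
    """Stand-in for a language model call; returns a crude label."""
    return "positive" if "love" in text.lower() else "neutral"


def enrich(conn: sqlite3.Connection) -> list[tuple[str, str]]:
    # Register the Python callable as a one-argument SQL function
    conn.create_function("classify", 1, fake_model)
    # A single SQL statement runs every value through the function
    # and saves the result back into the table
    conn.execute("update messages set sentiment = classify(body)")
    return conn.execute(
        "select body, sentiment from messages order by id"
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "create table messages (id integer primary key, body text, sentiment text)"
)
conn.executemany(
    "insert into messages (body) values (?)",
    [("I love this tool",), ("It runs on Tuesdays",)],
)
rows = enrich(conn)
for body, sentiment in rows:
    print(f"{sentiment}: {body}")
```

Swapping `fake_model` for a function that calls a hosted model would follow the same shape, with the caveat that each row then costs an API request.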

GPT-3 token encoder and decoder. I built an Observable notebook with an interface to encode, decode and search through GPT-3 tokens, building on top of a notebook by EJ Fox and Ian Johnson. # 27th April 2023, 11:48 pm

sqlite-history: tracking changes to SQLite tables using triggers (also weeknotes)

In between blogging about ChatGPT rhetoric, micro-benchmarking with ChatGPT Code Interpreter and Why prompt injection is an even bigger problem now, I managed to ship the beginnings of a new project: sqlite-history.

[... 1680 words]
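For readers unfamiliar with the trigger-based approach named in the title, here is a minimal sketch of the general pattern: triggers copy each change into a separate history table. The `people` and `people_history` tables and their columns are hypothetical and are not the schema that sqlite-history itself uses.

```python
import sqlite3

SCHEMA = """
create table people (id integer primary key, name text);

-- Hypothetical history table: one row per change to people
create table people_history (
    id integer,
    name text,
    operation text,
    changed_at text default (datetime('now'))
);

create trigger people_insert after insert on people begin
    insert into people_history (id, name, operation)
    values (new.id, new.name, 'insert');
end;

create trigger people_update after update on people begin
    insert into people_history (id, name, operation)
    values (new.id, new.name, 'update');
end;

create trigger people_delete after delete on people begin
    insert into people_history (id, name, operation)
    values (old.id, old.name, 'delete');
end;
"""


def demo() -> list[tuple[int, str, str]]:
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("insert into people (id, name) values (1, 'Ada')")
    conn.execute("update people set name = 'Ada Lovelace' where id = 1")
    conn.execute("delete from people where id = 1")
    # rowid reflects the order in which the triggers fired
    return conn.execute(
        "select id, name, operation from people_history order by rowid"
    ).fetchall()


for row in demo():
    print(row)
```

Because the triggers live in the database itself, changes are recorded no matter which application or tool writes to the table.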

image-to-jpeg (via) I built a little JavaScript app that accepts an image, then displays that image as a JPEG with a slider to control the quality setting, plus a copy and paste textarea to copy out that image with a data-uri. I didn’t actually write a single line of code for this: I got ChatGPT/GPT-4 to generate the entire thing with some prompts (transcript in the via link). # 5th April 2023, 10:10 pm

Weeknotes: A new llm CLI tool, plus automating my weeknotes and newsletter

I started publishing weeknotes in 2019 partly as a way to hold myself accountable but mainly as a way to encourage myself to write more.

[... 830 words]

Semi-automating a Substack newsletter with an Observable notebook

I recently started sending out a weekly-ish email newsletter consisting of content from my blog. I’ve mostly automated that, using an Observable Notebook to generate the HTML. Here’s how that system works.

[... 2520 words]

AI-enhanced development makes me more ambitious with my projects

The thing I’m most excited about in our weird new AI-enhanced reality is the way it allows me to be more ambitious with my projects.

[... 3334 words]

I built a ChatGPT plugin to answer questions about data hosted in Datasette

Yesterday OpenAI announced support for ChatGPT plugins. It’s now possible to teach ChatGPT how to make calls out to external APIs and use the responses to help generate further answers in the current conversation.

[... 1801 words]

Weeknotes: AI won’t slow down, a new newsletter and a huge Datasette refactor

I’m a few weeks behind on my weeknotes, but it’s not through lack of attention to my blog. AI just keeps getting weirder and more interesting.

[... 1255 words]

apple-notes-to-sqlite (via) With the help of ChatGPT I finally figured out just enough AppleScript to automate the export of my notes to a SQLite database. AppleScript is a notoriously read-only language, which it turns out makes it a killer app for LLM-assisted coding. # 9th March 2023, 6:04 am

Weeknotes: A bunch of things I learned this week, plus datasette-explain

The Datasette table view refactor, JSON redesign and ?_extra= continues this week, mainly in this ongoing pull request and this tracking issue.

[... 1528 words]

datasette-scraper, Big Local News and other weeknotes

In addition to exploring the new MusicCaps training and evaluation data I’ve been working on the big Datasette JSON refactor, and getting excited about a Datasette project that I didn’t work on at all.

[... 1744 words]

Examples of sites built using Datasette (via) I gave the examples page on the Datasette website a significant upgrade today: it now includes screenshots (taken using shot-scraper) of six projects chosen to illustrate the variety of problems Datasette can be used to tackle. # 29th January 2023, 3:40 am

Exploring MusicCaps, the evaluation data released to accompany Google’s MusicLM text-to-music model

Google Research just released MusicLM: Generating Music From Text. It’s a new generative AI model that takes a descriptive prompt and produces a “high-fidelity” music track. Here’s the paper (and a more readable version using arXiv Vanity).

[... 1323 words]

How to implement Q&A against your documentation with GPT3, embeddings and Datasette

If you’ve spent any time with GPT-3 or ChatGPT, you’ve likely thought about how useful it would be if you could point them at a specific, current collection of text or documentation and have them use that as part of their input for answering questions.

[... 3491 words]