Simon Willison’s Weblog

Subscribe

Items in 2022

Filters: Year: 2022 × Sorted by date


2022 in projects and blogging

In lieu of my regular weeknotes (I took two weeks off for the holidays) here’s a look back at 2022, mainly in terms of projects and things I’ve written about.

[... 1154 words]

Draw SVG rope using JavaScript (via) Delightful interactive tutorial by Stanko Tadić showing how to render an illustration of a rope using SVG, starting with a path. The way the tutorial is presented is outstanding. # 31st December 2022, 5:31 pm

Reverse Prompt Engineering for Fun and (no) Profit (via) swyx pulls off some impressive prompt leak attacks to reverse engineer the new AI features that just got added to Notion. He concludes that “Prompts are like clientside JavaScript. They are shipped as part of the product, but can be reverse engineered easily, and the meaningful security attack surface area is exactly the same.” # 28th December 2022, 8:56 pm

Detailed comment on HN describing how Second Life works these days. “There are about 27,500 live regions today, each with its own simulator program, always on even if nobody is using it. Each simulator program takes about one CPU and under 4GB on a server.” # 24th December 2022, 11:57 pm

Speech-to-text with Whisper: How I Use It & Why. Sumana Harihareswara’s in-depth review of Whisper, the shockingly effective open source text-to-speech transcription model release by OpenAI a few months ago. Includes an extremely thoughtful section considering the ethics of using this model—some of the most insightful short-form writing I’ve seen on AI model ethics generally. # 22nd December 2022, 9:49 pm

A 4.2GiB file isn’t a heist of every single artwork on the Internet, and those who think it is are the ones undervaluing their own contributions and creativity. It’s an amazing summary of what we know about art, and everyone should be able to use it to learn, grow, and create.

Danny O'Brien # 22nd December 2022, 9:47 pm

However, with millions of new active users rushing into Mastodon, I’m forced to reevaluate that. I think I may have become too focused on what I saw of as the limits of a federated setup (putting yourself into someone else’s fiefdom), without recognizing that if it started to take off (as it has), it would become easier and easier for people to set up their own instances, allowing those who are concerned about setting up in someone else’s garden the freedom to set up their own plot of land.

Mike Masnick # 22nd December 2022, 10:56 am

Boring Python: code quality. James Bennett provides an opinionated guide to setting up Python tools for linting, code formatting and and other code quality concerns. Of particular interest to me is his section on packaging checks, which introduces a whole bunch of new-to-me tools that can help avoid accidentally shipping broken packages to PyPI. # 20th December 2022, 7:55 pm

Weeknotes: Datasette 0.63.3, datasette-ripgrep

We’re back in the UK to see family over Christmas (our first trip back since 2019). Here are a few notes from the past couple of weeks.

[... 801 words]

TL;DR: To serve users at the 75th percentile (P75) of devices and networks, we can now afford ~150KiB of HTML/CSS/fonts and ~300-350KiB of JavaScript (gzipped). This is a slight improvement on last year’s budgets, thanks to device and network improvements. [... This is] what we should be aiming to send over the wire per page in 2023 to reach interactivity in less than 5 seconds on first load

Alex Russell # 20th December 2022, 9:54 am

Datasette 1.0a2: Upserts and finely grained permissions

I’ve released the third alpha of Datasette 1.0. The 1.0a2 release introduces upsert support to the new JSON API and makes some major improvements to the Datasette permissions system.

[... 2844 words]

Over-engineering Secret Santa with Python cryptography and Datasette

We’re doing a family Secret Santa this year, and we needed a way to randomly assign people to each other without anyone knowing who was assigned to who.

[... 2044 words]

Playing with ActivityPub (via) Tom MacWright describes his attempts to build the simplest possible ActivityPub publication—for a static site powered by Jekyll, where he used Netlify functions to handle incoming subscriptions (storing them in PlanetScale via their Deno API library) and wrote a script which loops through and notifies all of his subscriptions every time he publishes something new. # 10th December 2022, 12:58 am

Data-driven performance optimization with Rust and Miri (via) Useful guide to some Rust performance optimization tools. Miri can be used to dump out a detailed JSON profile of a program which can then be opened and explored using the Chrome browser’s performance tool. # 9th December 2022, 5:19 pm

Introducing sqlite-loadable-rs: A framework for building SQLite Extensions in Rust. Alex Garcia has built a new Rust library for creating SQLite extensions—initially supporting custom scalar functions, virtual tables and table functions and with more types of extension coming soon. This looks very easy to use, partly because the documentation and examples are already delightfully thorough, especially for an initial release. # 7th December 2022, 11:08 pm

talk.wasm (via) “Talk with an Artificial Intelligence in your browser”. Absolutely stunning demo which loads the Whisper speech recognition model (75MB) and a GPT-2 model (240MB) and executes them both in your browser via WebAssembly, then uses the Web Speech API to talk back to you. The result is a full speak-with-an-AI interface running entirely client-side. GPT-2 sadly mostly generates gibberish but the fact that this works at all is pretty astonishing. # 7th December 2022, 10:52 pm

I Taught ChatGPT to Invent a Language (via) Dylan Black talks ChatGPT through the process of inventing a new language, with its own grammar. Really fun example of what happens when someone with a deep understanding of both the capabilities of language models and some other field (in this case linguistics) can achieve with an extended prompting session. # 6th December 2022, 7:30 pm

Understanding a Protocol. Andrew’s latest notes on how ActivityPub and Mastodon work under the hood, based on his extensive development work building out Takahē. # 6th December 2022, 12:50 am

The primary problem is that while the answers which ChatGPT produces have a high rate of being incorrect, they typically look like they might be good and the answers are very easy to produce. There are also many people trying out ChatGPT to create answers, without the expertise or willingness to verify that the answer is correct prior to posting. Because such answers are so easy to produce, a large number of people are posting a lot of answers. The volume of these answers (thousands) and the fact that the answers often require a detailed read by someone with at least some subject matter expertise in order to determine that the answer is actually bad has effectively swamped our volunteer-based quality curation infrastructure.

StackOverflow Temporary policy: ChatGPT is banned # 6th December 2022, 12:16 am

Weeknotes: datasette-ephemeral-tables, datasette-export

Most of what I’ve been working on for the past week and a half is already documented:

[... 603 words]

AI assisted learning: Learning Rust with ChatGPT, Copilot and Advent of Code

I’m using this year’s Advent of Code to learn Rust—with the assistance of GitHub Copilot and OpenAI’s new ChatGPT.

[... 2661 words]

Building A Virtual Machine inside ChatGPT (via) Jonas Degrave presents a remarkable example of a creative use of ChatGPT: he prompts it to behave as a if it was a Linux shell, then runs increasingly complex sequences of commands against it and gets back surprisingly realistic results. By the end of the article he’s getting it to hallucinate responses to curl API requests run against imagined API versions of itself. # 5th December 2022, 1:43 am

A new AI game: Give me ideas for crimes to do

Less than a week ago OpenAI unleashed ChatGPT on the world, and it kicked off what feels like a seismic shift in many people’s understand of the capabilities of large language models.

[... 1069 words]

Datasette’s new JSON write API: The first alpha of Datasette 1.0

This week I published the first alpha release of Datasette 1.0, with a significant new feature: Datasette core now includes a JSON API for creating and dropping tables and inserting, updating and deleting data.

[... 2817 words]

three.js examples: webgl_postprocessing_pixel (via) Neat new example for three.js that uses a pixel-shader postprocessor to apply an isometric pixel-art feel to a 3D scene. # 1st December 2022, 9:57 pm

People are complex, and they get energy in complex ways. Some managers get energy from writing some software. That’s great, particularly if you avoid writing software with strict dependencies. Some managers get energy from coaching others. That’s great. Some get energy from doing exploratory work. Others get energy from optimizing existing systems. That’s great, too. Some get energy from speaking at conferences. Great. Some get energy from cleaning up internal wiki’s. You get the idea: that’s great. All these things are great, not because managers should or shouldn’t program/speak at conferences/clean up wiki’s/etc, but because folks will accomplish more if you let them do some energizing work, even if that work itself isn’t very important.

Will Larson # 1st December 2022, 6:35 pm

Scaling Mastodon: The Compendium (via) Hazel Weakly’s collection of notes on scaling Mastodon, covering PostgreSQL, Sidekiq, Redis, object storage and more. # 29th November 2022, 5:46 am

Stable Diffusion 2.0 and the Importance of Negative Prompts for Good Results. Stable Diffusion 2.0 is out, and it’s a very different model from 1.4/1.5. It’s trained using a new text encoder (OpenCLIP, in place of OpenAI’s CLIP) which means a lot of the old tricks—notably using “Greg Rutkowski” to get high quality fantasy art—no longer work. What DOES work, incredibly well, is negative prompting—saying things like “cyberpunk forest by Salvador Dali” but negative on “trees, green”. Max Woolf explores negative prompting in depth in this article, including how to combine it with textual inversion. # 29th November 2022, 1:22 am

If posts in a social media app do not have URLs that can be linked to and viewed in an unauthenticated browser, or if there is no way to make a new post from a browser, then that program is not a part of the World Wide Web in any meaningful way.

Consign that app to oblivion.

JWZ # 28th November 2022, 6:22 am

Coping strategies for the serial project hoarder

I gave a talk at DjangoCon US 2022 in San Diego last month about productivity on personal projects, titled “Massively increase your productivity on personal projects with comprehensive documentation and automated tests”.

[... 3863 words]