Simon Willison’s Weblog

Subscribe

Blogmarks in Feb, 2024

Filters: Type: blogmark × Year: 2024 × Month: Feb × Sorted by date


Datasette 1.0a12. Another alpha release, this time with a new query_actions() plugin hook, a new design for the table, database and query actions menus, a “does not contain” table filter and a fix for a minor bug with the JavaScript makeColumnActions() plugin mechanism. # 29th February 2024, 11:56 pm

GGUF, the long way around (via) Vicki Boykis dives deep into the GGUF format used by llama.cpp, after starting with a detailed description of how PyTorch models work and how they are traditionally persisted using Python pickle.

Pickle lead to safetensors, a format that avoided the security problems with downloading and running untrusted pickle files.

Llama.cpp introduced GGML, which popularized 16-bit (as opposed to 32-bit) quantization and bundled metadata and tensor data in a single file.

GGUF fixed some design flaws in GGML and is the default format used by Llama.cpp today. # 29th February 2024, 9:39 pm

The Zen of Python, Unix, and LLMs. Here’s the YouTube recording of my 1.5 hour conversation with Hugo Bowne-Anderson yesterday.

I fed a Whisper transcript to Google Gemini Pro 1.5 and asked it for the themes from our conversation, and it said we talked about “Python’s success and versatility, the rise and potential of LLMs, data sharing and ethics in the age of LLMs, Unix philosophy and its influence on software development and the future of programming and human-computer interaction”. # 29th February 2024, 9:04 pm

Testcontainers (via) Not sure how I missed this: Testcontainers is a family of testing libraries (for Python, Go, JavaScript, Ruby, Rust and a bunch more) that make it trivial to spin up a service such as PostgreSQL or Redis in a container for the duration of your tests and then spin it back down again.

The Python example code is delightful:

redis = DockerContainer(“redis:5.0.3-alpine”).with_exposed_ports(6379)
redis.start()
wait_for_logs(redis, “Ready to accept connections”)

I much prefer integration-style tests over unit tests, and I like to make sure any of my projects that depend on PostgreSQL or similar can run their tests against a real running instance. I’ve invested heavily in spinning up Varnish or Elasticsearch ephemeral instances in the past—Testcontainers look like they could save me a lot of time.

The open source project started in 2015, span off a company called AtomicJar in 2021 and was acquired by Docker in December 2023. # 28th February 2024, 2:41 am

The Zen of Python, Unix, and LLMs with Simon Willison (via) I’m participating in a live online fireside chat with Hugo Bowne-Anderson tomorrow afternoon (3pm Pacific / 6pm Eastern / 11pm GMT) talking about LLMs, Datasette, my open source process, applying the Unix pipes philosophy to LLMs and a whole lot more. It’s free to register. # 27th February 2024, 11:11 pm

All you need is Wide Events, not “Metrics, Logs and Traces” (via) I’ve heard great things about Meta’s internal observability platform Scuba, here’s an explanation from ex-Meta engineer Ivan Burmistrov describing the value it provides and comparing it to the widely used OpenTelemetry stack. # 27th February 2024, 10:57 pm

Mistral Large. Mistral Medium only came out two months ago, and now it’s followed by Mistral Large. Like Medium, this new model is currently only available via their API. It scores well on benchmarks (though not quite as well as GPT-4) but the really exciting feature is function support, clearly based on OpenAI’s own function design.

Functions are now supported via the Mistral API for both Mistral Large and the new Mistral Small, described as follows: “Mistral Small, optimised for latency and cost. Mistral Small outperforms Mixtral 8x7B and has lower latency, which makes it a refined intermediary solution between our open-weight offering and our flagship model.” # 26th February 2024, 11:23 pm

dclient 0.3. dclient is my CLI utility for working with remote Datasette instances—in particular for authenticating with them and then running both read-only SQL queries and inserting data using the new Datasette write JSON API. I just picked up work on the project again after a six month gap—the insert command can now be used to constantly stream data directly to hosted Datasette instances such as Datasette Cloud. # 25th February 2024, 8:06 pm

Upside down table trick with CSS (via) I was complaining how hard it is to build a horizontally scrollable table with a scrollbar at the top rather than the bottom and RGBCube on Lobste.rs suggested rotating the container 180 degrees and then the table contents and headers 180 back again... and it totally works! Demo in this CodePen. # 24th February 2024, 9 pm

How to make self-hosted maps that work everywhere and cost next to nothing. Chris Amico provides a detailed roundup of the state of web mapping in 2024. It’s never been easier to entirely host your own mapping infrastructure, thanks to OpenStreetMap, Overture, MBTiles, PMTiles, Maplibre and a whole ecosystem of other fine open source projects.

I like Protomaps creator Brandon Liu’s description of this: “post-scarcity web mapping”. # 24th February 2024, 4:19 am

Does Offering ChatGPT a Tip Cause it to Generate Better Text? An Analysis (via) Max Woolf:“I have a strong hunch that tipping does in fact work to improve the output quality of LLMs and its conformance to constraints, but it’s very hard to prove objectively. [...] Let’s do a more statistical, data-driven approach to finally resolve the debate.” # 23rd February 2024, 5:42 pm

Bloom Filters, explained by Sam Rose. Beautifully designed explanation of bloom filters, complete with interactive demos that illustrate exactly how they work. # 23rd February 2024, 3:59 pm

PGlite (via) PostgreSQL compiled for WebAssembly and turned into a very neat JavaScript library. Previous attempts at running PostgreSQL in WASM have worked by bundling a full Linux virtual machine—PGlite just bundles a compiled PostgreSQL itself, which brings the size down to an impressive 3.7MB gzipped. # 23rd February 2024, 3:56 pm

Okay, Color Spaces (via) Fantastic interactive explanation of how color spaces work by Eric Portis. # 22nd February 2024, 11:38 pm

JavaScript Bloat in 2024 (via) Depressing review of the state of page bloat in 2024 by Nikita Prokopov. Some of these are pretty shocking: 12MB for a Booking.com search, 9MB for a Google search, 20MB for Gmail(!), 31MB for LinkedIn. No wonder the modern web can feel sludgy even on my M2 MacBook Pro. # 22nd February 2024, 11:31 pm

Gemma: Introducing new state-of-the-art open models. Google get in on the openly licensed LLM game: Gemma comes in two sizes, 2B and 7B, trained on 2 trillion and 6 trillion tokens respectively. The terms of use “permit responsible commercial usage”. In the benchmarks it appears to compare favorably to Mistral and Llama 2.

Something that caught my eye in the terms: “Google may update Gemma from time to time, and you must make reasonable efforts to use the latest version of Gemma.”

One of the biggest benefits of running your own model is that it can protect you from model updates that break your carefully tested prompts, so I’m not thrilled by that particular clause.

UPDATE: It turns out that clause isn’t uncommon—the phrase “You shall undertake reasonable efforts to use the latest version of the Model” is present in both the Stable Diffusion and BigScience Open RAIL-M licenses. # 21st February 2024, 4:22 pm

Let’s build the GPT Tokenizer. When Andrej Karpathy left OpenAI last week a lot of people expressed hope that he would be increasing his output of educational YouTube videos.

Here’s an in-depth 2 hour dive into how tokenizers work and how to build one from scratch, published this morning.

The section towards the end, “revisiting and explaining the quirks of LLM tokenization”, helps explain a number of different LLM weaknesses—inability to reverse strings, confusion over arithmetic and even a note on why YAML can work better than JSON when providing data to LLMs (the same data can be represented in less tokens). # 20th February 2024, 6:02 pm

htmz (via) Astonishingly clever browser platform hack by Lean Rada.

Add this to a page:

<iframe hidden name=htmz onload="setTimeout(() => document.querySelector( this.contentWindow.location.hash || null)?.replaceWith( ...this.contentDocument.body.childNodes ))"></iframe>

Then elsewhere add a link like this:

<a href="/flower.html#my-element" target=htmz>Flower</a>

Clicking that link will fetch content from /flower.html and replace the element with ID of my-element with that content. # 20th February 2024, 1:21 am

aiolimiter. I found myself wanting an asyncio rate limiter for Python today—so I could send POSTs to an API endpoint no more than once every 10 seconds. This library worked out really well—it has a very neat design and lets you set up rate limits for things like “no more than 50 items every 10 seconds”, implemented using the leaky bucket algorithm. # 20th February 2024, 1:15 am

ActivityPub Server in a Single PHP File (via) Terence Eden: “Any computer program can be designed to run from a single file if you architect it wrong enough!”

I love this as a clear, easy-to-follow example of the core implementation details of the ActivityPub protocol—and a reminder that often a single PHP file is all you need. # 19th February 2024, 12:20 am

datasette-studio. I’ve been thinking for a while that it might be interesting to have a version of Datasette that comes bundled with a set of useful plugins, aimed at expanding Datasette’s default functionality to cover things like importing data and editing schemas.

This morning I built the very first experimental preview of what that could look like. Install it using pipx:

pipx install datasette-studio

I recommend pipx because it will ensure datasette-studio gets its own isolated environment, independent of any other Datasette installations you might have.

Now running “datasette-studio” instead of “datasette” will get you the version with the bundled plugins.

The implementation of this is fun—it’s a single pyproject.toml file defining the dependencies and setting up the datasette-studio CLI hook, which is enough to provide the full set of functionality.

Is this a good idea? I don’t know yet, but it’s certainly an interesting initial experiment. # 18th February 2024, 8:38 pm

Datasette 1.0a10. The only changes in this alpha release concern the way Datasette handles database transactions. The database.execute_write_fn() internal method used to leave functions to implement transactions on their own—it now defaults to wrapping them in a transaction unless they opt out with the new transaction=False parameter.

In implementing this I found several places inside Datasette—in particular parts of the JSON write API—which had not been handling transactions correctly. Those are all now fixed. # 18th February 2024, 5:10 am

Representation Engineering: Mistral-7B on Acid (via) Theia Vogel provides a delightfully clear explanation (and worked examples) of control vectors—a relatively recent technique for influencing the behaviour of an LLM by applying vectors to the hidden states that are evaluated during model inference.

These vectors are surprisingly easy to both create and apply. Build a small set of contrasting prompt pairs—“Act extremely happy” v.s. “Act extremely sad” for example (with a tiny bit of additional boilerplate), then run a bunch of those prompts and collect the hidden layer states. Then use “single-component PCA” on those states to get a control vector representing the difference.

The examples Theia provides, using control vectors to make Mistral 7B more or less honest, trippy, lazy, creative and more, are very convincing. # 18th February 2024, 3:49 am

wddbfs – Mount a sqlite database as a filesystem. Ingenious hack from Adam Obeng. Install this Python tool and run it against a SQLite database:

wddbfs --anonymous --db-path path/to/content.db

Then tell the macOS Finder to connect to Go -> Connect to Server -> http://127.0.0.1:8080/ (connect as guest)—connecting via WebDAV.

/Volumes/127.0.0.1/content.db will now be a folder full of CSV, TSV, JSON and JSONL files—one of each format for every table.

This means you can open data from SQLite directly in any application that supports that format, and you can even run CLI commands such as grep, ripgrep or jq directly against the data!

Adam used WebDAV because “Despite how clunky it is, this seems to be the best way to implement a filesystem given that getting FUSE support is not straightforward”. What a neat trick. # 18th February 2024, 3:31 am

Paying people to work on open source is good actually. In which Jacob expands his widely quoted (including here) pithy toot about how quick people are to pick holes in paid open source contributor situations into a satisfyingly comprehensive rant. This is absolutely worth your time—there’s so much I could quote from here, but I’m going to go with this:

“Many, many more people should be getting paid to write free software, but for that to happen we’re going to have to be okay accepting impure or imperfect mechanisms.” # 17th February 2024, 1:42 am

Datasette 1.0a9. A new Datasette alpha release today. This adds basic alter table support API support, so you can request Datasette modify a table to add new columns needed for JSON objects submitted to the insert, upsert or update APIs.

It also makes some permission changes—fixing a minor bug with upsert permissions, and introducing a new rule where every permission plugin gets consulted for a permission check, with just one refusal vetoing that check. # 16th February 2024, 11:20 pm

llmc.sh (via) Adam Montgomery wrote this a neat wrapper around my LLM CLI utility: it adds a “llmc” zsh function which you can ask for shell commands (llmc ’use ripgrep to find files matching otter’) which outputs the command, an explanation of the command and then copies the command to your clipboard for you to paste and execute if it looks like the right thing. # 16th February 2024, 6:19 pm

uv: Python packaging in Rust (via) “uv is an extremely fast Python package installer and resolver, written in Rust, and designed as a drop-in replacement for pip and pip-tools workflows.”

From Charlie Marsh and Astral, the team behind Ruff, who describe it as a milestone in their pursuit of a “Cargo for Python”.

Also in this announcement: Astral are taking over stewardship of Armin Ronacher’s Rye packaging tool, another Rust project.

uv is reported to be 8-10x faster than regular pip, increasing to 80-115x faster with a warm global module cache thanks to copy-on-write and hard links on supported filesystems—which saves on disk space too.

It also has a --resolution=lowest option for installing the lowest available version of dependencies—extremely useful for testing, I’ve been wanting this for my own projects for a while.

Also included: “uv venv”—a fast tool for creating new virtual environments with no dependency on Python itself. # 15th February 2024, 7:57 pm

Val Town Newsletter 15 (via) I really like how Val Town founder Steve Krouse now accompanies their “what’s new” newsletter with a video tour of the new features. I’m seriously considering imitating this for my own projects. # 15th February 2024, 4:26 pm

Our next-generation model: Gemini 1.5 (via) The big news here is about context length: Gemini 1.5 (a Mixture-of-Experts model) will do 128,000 tokens in general release, available in limited preview with a 1 million token context and has shown promising research results with 10 million tokens!

1 million tokens is 700,000 words or around 7 novels—also described in the blog post as an hour of video or 11 hours of audio. # 15th February 2024, 4:17 pm