Simon Willison’s Weblog

Subscribe

Wednesday, 16th October 2024

A common misconception about Transformers is to believe that they're a sequence-processing architecture. They're not.

They're a set-processing architecture. Transformers are 100% order-agnostic (which was the big innovation compared to RNNs, back in late 2016 -- you compute the full matrix of pairwise token interactions instead of processing one token at a time).

The way you add order awareness in a Transformer is at the feature level. You literally add to your token embeddings a position embedding / encoding that corresponds to its place in a sequence. The architecture itself just treats the input tokens as a set.

François Chollet

# 1:18 am / llms, ai, generative-ai, francois-chollet

Un Ministral, des Ministraux (via) Two new models from Mistral: Ministral 3B and Ministral 8B - joining Mixtral, Pixtral, Codestral and Mathstral as weird naming variants on the Mistral theme.

These models set a new frontier in knowledge, commonsense, reasoning, function-calling, and efficiency in the sub-10B category, and can be used or tuned to a variety of uses, from orchestrating agentic workflows to creating specialist task workers. Both models support up to 128k context length (currently 32k on vLLM) and Ministral 8B has a special interleaved sliding-window attention pattern for faster and memory-efficient inference.

Mistral's own benchmarks look impressive, but it's hard to get excited about small on-device models with a non-commercial Mistral Research License (for the 8B) and a contact-us-for-pricing Mistral Commercial License (for the 8B and 3B), given the existence of the extremely high quality Llama 3.1 and 3.2 series of models.

These new models are also available through Mistral's la Plateforme API, priced at $0.1/million tokens (input and output) for the 8B and $0.04/million tokens for the 3B.

The latest release of my llm-mistral plugin for LLM adds aliases for the new models. Previously you could access them like this:

llm mistral refresh # To fetch new models
llm -m mistral/ministral-3b-latest "a poem about pelicans at the park"
llm -m mistral/ministral-8b-latest "a poem about a pelican in french"

With the latest plugin version you can do this:

llm install -U llm-mistral
llm -m ministral-8b "a poem about a pelican in french"

$ llm -m ministral-8b 'a poem about a pelican in french' - returns:  Bien sûr, voici un poème sur une pelican en français :  ---  Un pelican, sage et majestueux, Sur les mers bleues, il se promène. Avec ses ailes déployées, Il survole les flots, léger et serein.  Ses grands becs jaunes, un joyau, Attirent les poissons qui s'éloignent. Avec grâce, il plonge, s'entraîne, Dans l'eau profonde, il trouve son chemin.  Pelican, roi des cieux marins, Dans la lumière du soleil levant, Il mène sa danse, son ballet, Un spectacle de force et de beauté.  Sous le ciel infini, il navigue, Porté par les vents, par les courants. Pelican, symbole de la mer, Un gardien des profondeurs, un prince.  ---  J'espère que ce poème vous plaît

# 3:40 pm / mistral, llms, ai, generative-ai, llm

[red-knot] type inference/checking test framework (via) Ruff maintainer Carl Meyer recently landed an interesting new design for a testing framework. It's based on Markdown, and could be described as a form of "literate testing" - the testing equivalent of Donald Knuth's literate programming.

A markdown test file is a suite of tests, each test can contain one or more Python files, with optionally specified path/name. The test writes all files to an in-memory file system, runs red-knot, and matches the resulting diagnostics against Type: and Error: assertions embedded in the Python source as comments.

Test suites are Markdown documents with embedded fenced blocks that look like this:

```py
reveal_type(1.0) # revealed: float
```

Tests can optionally include a path= specifier, which can provide neater messages when reporting test failures:

```py path=branches_unify_to_non_union_type.py
def could_raise_returns_str() -> str:
    return 'foo'
...
```

A larger example test suite can be browsed in the red_knot_python_semantic/resources/mdtest directory.

This document on control flow for exception handlers (from this PR) is the best example I've found of detailed prose documentation to accompany the tests.

The system is implemented in Rust, but it's easy to imagine an alternative version of this idea written in Python as a pytest plugin. This feels like an evolution of the old Python doctest idea, except that tests are embedded directly in Markdown rather than being embedded in Python code docstrings.

... and it looks like such plugins exist already. Here are two that I've found so far:

I tried pytest-markdown-docs by creating a doc.md file like this:

# Hello test doc

```py
assert 1 + 2 == 3
```

But this fails:

```py
assert 1 + 2 == 4
```

And then running it with uvx like this:

uvx --with pytest-markdown-docs pytest --markdown-docs

I got one pass and one fail:

_______ docstring for /private/tmp/doc.md __________
Error in code block:
```
10   assert 1 + 2 == 4
11   
```
Traceback (most recent call last):
  File "/private/tmp/tt/doc.md", line 10, in <module>
    assert 1 + 2 == 4
AssertionError

============= short test summary info ==============
FAILED doc.md::/private/tmp/doc.md
=========== 1 failed, 1 passed in 0.02s ============

I also just learned that the venerable Python doctest standard library module has the ability to run tests in documentation files too, with doctest.testfile("example.txt"): "The file content is treated as if it were a single giant docstring; the file doesn’t need to contain a Python program!"

# 8:43 pm / testing, rust, python, astral, markdown, ruff, pytest, uv

Supercharge the One Person Framework with SQLite: Rails World 2024 (via) Stephen Margheim shares an annotated transcript of the YouTube video of his recent talk at this year's Rails World conference in Toronto.

The Rails community is leaning hard into SQLite right now. Stephen's talk is some of the most effective evangelism I've seen anywhere for SQLite as a production database for web applications, highlighting several new changes in Rails 8:

... there are two additions coming with Rails 8 that merit closer consideration. Because these changes make Rails 8 the first version of Rails (and, as far as I know, the first version of any web framework) that provides a fully production-ready SQLite experience out-of-the-box.

Those changes: Ensure SQLite transaction default to IMMEDIATE mode to avoid "database is locked" errors when a deferred transaction attempts to upgrade itself with a write lock (discussed here previously, and added to Datasette 1.0a14 in August) and SQLite non-GVL-blocking, fair retry interval busy handler - a lower-level change that ensures SQLite's busy handler doesn't hold Ruby's Global VM Lock (the Ruby version of Python's GIL) while a thread is waiting on a SQLite lock.

The rest of the talk makes a passionate and convincing case for SQLite as an option for production deployments, in line with the Rails goal of being a One Person Framework - "a toolkit so powerful that it allows a single individual to create modern applications upon which they might build a competitive business".

Animated slide. The text Single-machine SQLite-only deployments can't serve production workloads is stamped with a big red Myth stamp

Back in April Stephen published SQLite on Rails: The how and why of optimal performance describing some of these challenges in more detail (including the best explanation I've seen anywhere of BEGIN IMMEDIATE TRANSACTION) and promising:

Unfortunately, running SQLite on Rails out-of-the-box isn’t viable today. But, with a bit of tweaking and fine-tuning, you can ship a very performant, resilient Rails application with SQLite. And my personal goal for Rails 8 is to make the out-of-the-box experience fully production-ready.

It looks like he achieved that goal!

# 10:24 pm / scaling, rails, sqlite, ruby, gil

2025 DSF Board Nominations. The Django Software Foundation board elections are coming up. There are four positions open, seven directors total. Terms last two years, and the deadline for submitting a nomination is October 25th (the date of the election has not yet been decided).

Several community members have shared "DSF initiatives I'd like to see" documents to inspire people who may be considering running for the board:

  • Sarah Boyce (current Django Fellow) wants a marketing strategy, better community docs, more automation and a refresh of the Django survey.
  • Tim Schilling wants one big sponsor, more community recognition and a focus on working groups.
  • Carlton Gibson wants an Executive Director, an updated website and better integration of the community into that website.
  • Jacob Kaplan-Moss wants effectively all of the above.

There's also a useful FAQ on the Django forum by Thibaud Colas.

# 11:01 pm / django, dsf, jacob-kaplan-moss

files-to-prompt 0.4. New release of my files-to-prompt tool adding an option for filtering just for files with a specific extension.

The following command will output Claude XML-style markup for all Python and Markdown files in the current directory, and copy that to the macOS clipboard ready to be pasted into an LLM:

files-to-prompt . -e py -e md -c | pbcopy

# 11:29 pm / projects, python, llms