Simon Willison’s Weblog


Tuesday, 9th January 2024 A page for every blog. I hadn’t checked in on Phil Gyford’s blog directory since it first launched in November 2022. I’m delighted to see that it’s thriving—2,117 blogs have now been carefully curated, and the latest feature is a page for each blog showing its categories, description, an activity graph and the most recent posts syndicated via RSS/Atom. # 10:15 pm

The Eight Golden Rules of Interface Design (via) By HCI researcher Ben Shneiderman. I particularly like number 4, “Design dialogs to yield closure”, which encourages feedback at the completion of a group of actions that “gives users the satisfaction of accomplishment, a sense of relief.” # 9:37 pm

WikiChat: Stopping the Hallucination of Large Language Model Chatbots by Few-Shot Grounding on Wikipedia. This paper describes a really interesting LLM system that runs Retrieval Augmented Generation against Wikipedia to help answer questions, but includes a second step where facts in the answer are fact-checked against Wikipedia again before returning an answer to the user. They claim “97.3% factual accuracy of its claims in simulated conversation” on a GPT-4 backed version, and also see good results when backed by LLaMA 7B.

The implementation is mainly through prompt engineering, and detailed examples of the prompts they used are included at the end of the paper. # 9:30 pm

Python 3.13 gets a JIT. “In late December 2023 (Christmas Day to be precise), CPython core developer Brandt Bucher submitted a little pull-request to the Python 3.13 branch adding a JIT compiler.”

Anthony Shaw does a deep dive into this new experimental JIT, explaining how it differs from other JITs. It’s an implementation of a copy-and-patch JIT, an idea that only emerged in 2021. This makes it architecturally much simpler than a traditional JIT, allowing it to compile faster and take advantage of existing LLVM tools on different architectures.

So far it’s providing a 2-9% performance improvement, but the real impact will be from the many future optimizations it enables. # 9:25 pm

What I should have said about the term Artificial Intelligence

With the benefit of hindsight, I did a bad job with my post, It’s OK to call it Artificial Intelligence a few days ago.

[... 376 words]

Mixtral of Experts. The Mixtral paper is out, exactly a month after the release of the Mixtral 8x7B model itself. Thanks to the paper I now have a reasonable understanding of how a mixture of experts model works: each layer has 8 available blocks, but a router model selects two out of those eight for each token passing through that layer and combines their output. “As a result, each token has access to 47B parameters, but only uses 13B active parameters during inference.”

The Mixtral token context size is an impressive 32k, and it compares extremely well against the much larger Llama 70B across a whole array of benchmarks.

Unsurprising but disappointing: there’s nothing in the paper at all about what it was trained on. # 4:03 am