Simon Willison’s Weblog

Subscribe

Sunday, 17th March 2024

Grok-1 code and model weights release (via) xAI have released their Grok-1 model under an Apache 2 license (for both weights and code). It’s distributed as a 318.24G torrent file and likely requires 320GB of VRAM to run, so needs some very hefty hardware.

The accompanying blog post (via link) says “Trained from scratch by xAI using a custom training stack on top of JAX and Rust in October 2023”, and describes it as a “314B parameter Mixture-of-Experts model with 25% of the weights active on a given token”.

Very little information on what it was actually trained on, all we know is that it was “a large amount of text data, not fine-tuned for any particular task”. # 8:20 pm

Add ETag header for static responses. I’ve been procrastinating on adding better caching headers for static assets (JavaScript and CSS) served by Datasette for several years, because I’ve been wanting to implement the perfect solution that sets far-future cache headers on every asset and ensures the URLs change when they are updated.

Agustin Bacigalup just submitted the best kind of pull request: he observed that adding ETag support for static assets would side-step the complexity while adding much of the benefit, and implemented it along with tests.

It’s a substantial performance improvement for any Datasette instance with a number of JavaScript plugins... like the ones we are building on Datasette Cloud. I’m just annoyed we didn’t ship something like this sooner! # 7:25 pm

How does SQLite store data? Michal Pitr explores the design of the SQLite on-disk file format, as part of building an educational implementation of SQLite from scratch in Go. # 6:47 pm