Simon Willison’s Weblog


Grok-1 code and model weights release: xAI have released their Grok-1 model under an Apache 2 license (for both weights and code). It’s distributed as a 318.24GB torrent and likely requires around 320GB of VRAM to run, so it needs some very hefty hardware.

The accompanying blog post says “Trained from scratch by xAI using a custom training stack on top of JAX and Rust in October 2023”, and describes it as a “314B parameter Mixture-of-Experts model with 25% of the weights active on a given token”.
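
Those numbers invite some quick arithmetic. Here's a rough back-of-the-envelope sketch in Python, using only the figures quoted above. It assumes the torrent size is in decimal gigabytes and that the download is almost entirely weights, neither of which is confirmed here, and the function names are mine, purely for illustration:

```python
# Back-of-the-envelope arithmetic from the figures quoted in this post.
TOTAL_PARAMS = 314e9      # "314B parameter" (xAI blog post)
ACTIVE_FRACTION = 0.25    # "25% of the weights active on a given token"
TORRENT_BYTES = 318.24e9  # ASSUMPTION: the torrent size is decimal gigabytes


def active_params(total: float, fraction: float) -> float:
    """Parameters used for a single token in a Mixture-of-Experts model."""
    return total * fraction


def bytes_per_param(size_bytes: float, total: float) -> float:
    """Average storage per parameter, ASSUMING the download is nearly all weights."""
    return size_bytes / total


if __name__ == "__main__":
    active = active_params(TOTAL_PARAMS, ACTIVE_FRACTION)
    per_param = bytes_per_param(TORRENT_BYTES, TOTAL_PARAMS)
    print(f"Active parameters per token: ~{active / 1e9:.1f}B")  # ~78.5B
    print(f"Bytes per parameter: ~{per_param:.2f}")              # ~1.01
```

So roughly 78.5B parameters are active for any given token, and the download works out to about one byte per parameter. That would be consistent with 8-bit weights, though that's my inference rather than something stated here. Only a quarter of the weights are used per token, but which ones are used changes from token to token, so the whole set presumably still has to be loaded, which fits the 320GB VRAM guess.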

There’s very little information on what it was actually trained on: all we know is that it was “a large amount of text data, not fine-tuned for any particular task”.