Simon Willison’s Weblog

LLM 0.13: The annotated release notes

26th January 2024

I just released LLM 0.13, the latest version of my LLM command-line tool for working with Large Language Models—both via APIs and running models locally using plugins.

Here are the annotated release notes for the new version.

  • Added support for new OpenAI embedding models: 3-small and 3-large and three variants of those with different dimension sizes, 3-small-512, 3-large-256 and 3-large-1024. See OpenAI embedding models for details. #394

The original inspiration for shipping a new release was OpenAI’s announcement of new models yesterday: New embedding models and API updates.

I wrote a guide to embeddings in Embeddings: What they are and why they matter. Until recently the only available OpenAI embedding model was ada-002—released in December 2022 and now feeling a little long in the tooth.

The new 3-small model is similar to ada-002 but massively less expensive (a fifth of the price) and with higher benchmark scores.

3-large has even higher benchmark scores, but also produces much bigger vectors. Where ada-002 and 3-small produce 1536-dimensional vectors, 3-large produces 3072 dimensions!

Each dimension corresponds to a floating point number in the array of numbers produced when you embed a piece of content. The more numbers, the more storage space needed for those vectors and the longer any cosine-similarity calculations will take against them.
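
To make that concrete, here's a minimal sketch of a cosine similarity calculation using NumPy (NumPy is my choice for the illustration, not something LLM requires):

import numpy as np

def cosine_similarity(a, b):
    # Dot product of the vectors divided by the product of their magnitudes
    a, b = np.asarray(a), np.asarray(b)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# 3-large vectors have 3072 dimensions, double ada-002's 1536: twice the
# storage, and twice the arithmetic for every comparison like this one
v1 = np.random.rand(3072)
v2 = np.random.rand(3072)
print(cosine_similarity(v1, v2))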

Here’s where things get really interesting though: since people often want to trade quality for smaller vector size, OpenAI now support a way of having their models return much smaller vectors.

LLM doesn’t yet have a mechanism for passing options to embedding models (unlike language models which can take -o setting value options), but I still wanted to make the new smaller sizes available.

That’s why I included 3-small-512, 3-large-256 and 3-large-1024: those are variants of the core models hard-coded to the specified vector size.

In the future I’d like to support options for embedding models, but this is a useful stop-gap.
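
In the meantime, the variants can be used anywhere a model name is accepted. Here's a sketch using LLM's Python embedding API, where embed() returns a plain Python list of floats:

import llm

# One of the new fixed-dimension variants
model = llm.get_embedding_model("3-small-512")
vector = model.embed("a happy dog")
print(len(vector))  # 512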

  • The default gpt-4-turbo model alias now points to gpt-4-turbo-preview, which uses the most recent OpenAI GPT-4 turbo model (currently gpt-4-0125-preview). #396

Also announced yesterday—gpt-4-0125-preview is the latest version of the GPT-4 model which, according to OpenAI, “completes tasks like code generation more thoroughly than the previous preview model and is intended to reduce cases of ‘laziness’ where the model doesn’t complete a task”.

This is technically a breaking change—the gpt-4-turbo LLM alias used to point to the older model, but now points to OpenAI’s gpt-4-turbo-preview alias which in turn points to the latest model.

  • New OpenAI model aliases gpt-4-1106-preview and gpt-4-0125-preview.

These aliases let you call those models explicitly:

llm -m gpt-4-0125-preview 'Write a lot of code without being lazy'

  • OpenAI models now support a -o json_object 1 option which will cause their output to be returned as a valid JSON object. #373

This is a fun feature, which uses an OpenAI option that claims to guarantee valid JSON output.

Weirdly you have to include the word “json” in your prompt when using this or OpenAI will return an error!

llm -m gpt-4-turbo \
  '3 names and short bios for pet pelicans in JSON' \
  -o json_object 1

That returned the following for me just now:

{
  "pelicans": [
    {
      "name": "Gus",
      "bio": "Gus is a curious young pelican with an insatiable appetite for adventure. He's known amongst the dockworkers for playfully snatching sunglasses. Gus spends his days exploring the marina and is particularly fond of performing aerial tricks for treats."
    },
    {
      "name": "Sophie",
      "bio": "Sophie is a graceful pelican with a gentle demeanor. She's become somewhat of a local celebrity at the beach, often seen meticulously preening her feathers or posing patiently for tourists' photos. Sophie has a special spot where she likes to watch the sunset each evening."
    },
    {
      "name": "Captain Beaky",
      "bio": "Captain Beaky is the unofficial overseer of the bay, with a stern yet endearing presence. As a seasoned veteran of the coastal skies, he enjoys leading his flock on fishing expeditions and is always the first to spot the fishing boats returning to the harbor. He's respected by both his pelican peers and the fishermen alike."
    }
  ]
}

The JSON schema it uses is entirely made up. You can prompt it with an example schema and it will probably stick to it.
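
Because the output is guaranteed to be valid JSON you can feed it straight to json.loads(). Here's a sketch of the same call through LLM's Python API, assuming the option is passed as a keyword argument the way other model options are:

import json
import llm

model = llm.get_model("gpt-4-turbo")
response = model.prompt(
    # The word "json" must appear in the prompt or OpenAI returns an error
    "3 names and short bios for pet pelicans in JSON",
    json_object=True,
)
data = json.loads(response.text())
print(data)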

  • New plugins since the last release include llm-mistral, llm-gemini, llm-ollama and llm-bedrock-meta.

I wrote the first two, but llm-ollama is by Sergey Alexandrov and llm-bedrock-meta is by Fabian Labat. My plugin writing tutorial is starting to pay off!

  • The keys.json file for storing API keys is now created with 600 file permissions. #351

A neat suggestion from Christopher Bare.
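
For anyone curious, the standard Python pattern for this looks something like the following sketch (LLM's exact implementation may differ):

import os

# Write the file, then restrict it to owner read/write only (0o600)
with open("keys.json", "w") as fp:
    fp.write("{}")
os.chmod("keys.json", 0o600)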

  • Documented a pattern for installing plugins that depend on PyTorch using the Homebrew version of LLM, despite Homebrew using Python 3.12 when PyTorch have not yet released a stable package for that Python version. #397

LLM is packaged for Homebrew. The Homebrew package upgraded to Python 3.12 a while ago, which caused surprising problems because it turned out PyTorch—a dependency of some LLM plugins—doesn’t have a stable build out for 3.12 yet.

Christian Bush shared a workaround in an LLM issue thread, which I’ve now added to the documentation.

  • Underlying OpenAI Python library has been upgraded to >1.0. It is possible this could cause compatibility issues with LLM plugins that also depend on that library. #325

This was the bulk of the work. OpenAI released their 1.0 Python library a couple of months ago and it had a large number of breaking changes compared to the previous release.

At the time I pinned LLM to the previous version to paper over the breakage, but this meant you could not install LLM in the same environment as some other library that needed the more recent OpenAI version.

There were a lot of changes! You can find a blow-by-blow account of the upgrade in my pull request that bundled the work.
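
To give a flavor of those changes, the biggest one was the move from module-level functions to a client object (this is the openai library's own API, shown here for illustration):

# Before: openai < 1.0
import openai
completion = openai.ChatCompletion.create(
    model="gpt-4", messages=[{"role": "user", "content": "Hi"}]
)

# After: openai >= 1.0
from openai import OpenAI
client = OpenAI()
completion = client.chat.completions.create(
    model="gpt-4", messages=[{"role": "user", "content": "Hi"}]
)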

  • Arrow keys now work inside the llm chat command. #376

The recipe for doing this is so weird:

import readline
readline.parse_and_bind("\\e[D: backward-char")
readline.parse_and_bind("\\e[C: forward-char")

I asked on Mastodon if anyone knows of a less obscure solution, but it looks like that might be the best we can do!
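
Those two strings bind the ANSI escape sequences terminals send for the left and right arrow keys to readline's cursor movement commands. A minimal way to try it yourself:

import readline

# Without these bindings, some readline/libedit builds echo the raw
# escape sequences instead of moving the cursor
readline.parse_and_bind("\\e[D: backward-char")
readline.parse_and_bind("\\e[C: forward-char")

while True:
    line = input("> ")  # arrow keys now move within the line being edited
    if line in ("quit", "exit"):
        break
    print("You said:", line)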

  • LLM_OPENAI_SHOW_RESPONSES=1 environment variable now outputs much more detailed information about the HTTP request and response made to OpenAI (and OpenAI-compatible) APIs. #404

This feature worked prior to the OpenAI >1.0 upgrade by tapping into some requests internals. OpenAI dropped requests in favor of httpx, so I had to rebuild this feature from scratch.

I ended up getting a TIL out of it: Logging OpenAI API requests and responses using HTTPX.
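
LLM's actual implementation is covered in that TIL. For a simpler taste of the same idea, httpx supports event hooks that fire on every request and response (a sketch, not LLM's exact code):

import httpx

def log_request(request):
    print(f"Request: {request.method} {request.url}")

def log_response(response):
    print(f"Response: {response.status_code} for {response.request.url}")

client = httpx.Client(
    event_hooks={"request": [log_request], "response": [log_response]}
)
client.get("https://api.openai.com/v1/models")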

  • Dropped support for Python 3.7.

I wanted to stop seeing a pkg_resources related warning, which meant switching to Python 3.8’s importlib.metadata. Python 3.7 hit end-of-life back in June 2023, so I think this is an OK change to make.
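
The swap itself is small. For example, looking up an installed package's version (an illustration, not necessarily the exact call LLM changed):

# Before: pkg_resources, which triggers a deprecation warning
import pkg_resources
version = pkg_resources.get_distribution("llm").version

# After: importlib.metadata, in the standard library since Python 3.8
from importlib import metadata
version = metadata.version("llm")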