LLM 0.13: The annotated release notes
26th January 2024
I just released LLM 0.13, the latest version of my LLM command-line tool for working with Large Language Models—both via APIs and running models locally using plugins.
Here are the annotated release notes for the new version.
- Added support for new OpenAI embedding models: 3-small and 3-large and three variants of those with different dimension sizes, 3-small-512, 3-large-256 and 3-large-1024. See OpenAI embedding models for details. #394
The original inspiration for shipping a new release was OpenAI’s announcement of new models yesterday: New embedding models and API updates.
I wrote a guide to embeddings in Embeddings: What they are and why they matter. Until recently the only available OpenAI embedding model was ada-002, released in December 2022 and now feeling a little long in the tooth.
The new 3-small model is similar to ada-002 but massively less expensive (a fifth of the price) and with higher benchmark scores.
3-large has even higher benchmark scores, but also produces much bigger vectors. Where ada-002 and 3-small produce 1536-dimensional vectors, 3-large produces 3072 dimensions!
Each dimension corresponds to a floating point number in the array of numbers produced when you embed a piece of content. The more numbers, the more storage space needed for those vectors and the longer any cosine-similarity calculations will take against them.
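To make that trade-off concrete, here is a minimal sketch in plain Python. It assumes each number is stored as a 32-bit float (4 bytes), which is an assumption for illustration; `cosine_similarity` and `storage_bytes` are hypothetical helper names, not part of LLM.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two vectors with the same number of dimensions."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def storage_bytes(dimensions, bytes_per_float=4):
    """Bytes needed for one vector, assuming 32-bit floats (an assumption)."""
    return dimensions * bytes_per_float


# Compare the vector sizes mentioned in this post.
for dims in (256, 512, 1024, 1536, 3072):
    print(dims, "dimensions:", storage_bytes(dims), "bytes per vector")
```

The similarity calculation loops over every dimension, so its cost grows linearly with vector size, just as the storage does.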
Here’s where things get really interesting though: since people often want to trade quality for smaller vector size, OpenAI now support a way of having their models return much smaller vectors.
LLM doesn’t yet have a mechanism for passing options to embedding models (unlike language models, which can take -o setting value options), but I still wanted to make the new smaller sizes available.
That’s why I included 3-small-512, 3-large-256 and 3-large-1024: those are variants of the core models hard-coded to the specified vector size.
In the future I’d like to support options for embedding models, but this is a useful stop-gap.
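One way to picture the hard-coded variants is as a lookup table from model ID to a fixed size. The sketch below is illustrative only and is not LLM's actual implementation: `EMBEDDING_VARIANTS` and `request_kwargs` are hypothetical names, and the code only builds the arguments for an embeddings request (using OpenAI's `dimensions` parameter) without making any network call.

```python
# Hypothetical table: LLM model ID -> (OpenAI model name, fixed dimensions or None).
EMBEDDING_VARIANTS = {
    "3-small": ("text-embedding-3-small", None),
    "3-large": ("text-embedding-3-large", None),
    "3-small-512": ("text-embedding-3-small", 512),
    "3-large-256": ("text-embedding-3-large", 256),
    "3-large-1024": ("text-embedding-3-large", 1024),
}


def request_kwargs(model_id, texts):
    """Build keyword arguments for an embeddings request (no API call is made)."""
    openai_model, dimensions = EMBEDDING_VARIANTS[model_id]
    kwargs = {"model": openai_model, "input": list(texts)}
    # Only the size-specific variants pin the vector size.
    if dimensions is not None:
        kwargs["dimensions"] = dimensions
    return kwargs


print(request_kwargs("3-large-256", ["a pelican"]))
```

With a table like this, adding another fixed size is a one-line change, which is roughly why hard-coded variants work as a stop-gap until real options exist.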
- The default gpt-4-turbo model alias now points to gpt-4-turbo-preview, which uses the most recent OpenAI GPT-4 turbo model (currently gpt-4-0125-preview). #396
Also announced yesterday: gpt-4-0125-preview is the latest version of the GPT-4 model which, according to OpenAI, “completes tasks like code generation more thoroughly than the previous preview model and is intended to reduce cases of ‘laziness’ where the model doesn’t complete a task”.
This is technically a breaking change: the gpt-4-turbo LLM alias used to point to the older model, but now points to OpenAI’s gpt-4-turbo-preview alias, which in turn points to the latest model.
- New OpenAI model aliases gpt-4-1106-preview and gpt-4-0125-preview.
These aliases let you call those models explicitly:
llm -m gpt-4-0125-preview 'Write a lot of code without being lazy'
- OpenAI models now support a -o json_object 1 option which will cause their output to be returned as a valid JSON object. #373
This is a fun feature, which uses an OpenAI option that claims to guarantee valid JSON output.
Weirdly you have to include the word “json” in your prompt when using this or OpenAI will return an error!
llm -m gpt-4-turbo \
'3 names and short bios for pet pelicans in JSON' \
-o json_object 1
That returned the following for me just now:
{
  "pelicans": [
    {
      "name": "Gus",
      "bio": "Gus is a curious young pelican with an insatiable appetite for adventure. He's known amongst the dockworkers for playfully snatching sunglasses. Gus spends his days exploring the marina and is particularly fond of performing aerial tricks for treats."
    },
    {
      "name": "Sophie",
      "bio": "Sophie is a graceful pelican with a gentle demeanor. She's become somewhat of a local celebrity at the beach, often seen meticulously preening her feathers or posing patiently for tourists' photos. Sophie has a special spot where she likes to watch the sunset each evening."
    },
    {
      "name": "Captain Beaky",
      "bio": "Captain Beaky is the unofficial overseer of the bay, with a stern yet endearing presence. As a seasoned veteran of the coastal skies, he enjoys leading his flock on fishing expeditions and is always the first to spot the fishing boats returning to the harbor. He's respected by both his pelican peers and the fishermen alike."
    }
  ]
}
The JSON schema it uses is entirely made up. You can prompt it with an example schema and it will probably stick to it.
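Since "probably" is doing some work there, it can help to check the response against the example you supplied. Here is a minimal sketch of that idea: `build_prompt` and `matches_example` are hypothetical helpers, and the check only compares keys and basic types rather than performing real JSON Schema validation.

```python
import json

# The example shape we want the model to follow (illustrative placeholder values).
EXAMPLE = {"pelicans": [{"name": "...", "bio": "..."}]}


def build_prompt(task, example):
    """Append an example of the desired JSON shape to the prompt."""
    return f"{task}\nReturn JSON shaped like this example:\n{json.dumps(example)}"


def _same_shape(value, template):
    """Recursively compare dictionary keys, list items and leaf types."""
    if isinstance(template, dict):
        return (
            isinstance(value, dict)
            and set(value) == set(template)
            and all(_same_shape(value[key], template[key]) for key in template)
        )
    if isinstance(template, list):
        if not isinstance(value, list):
            return False
        if not template:
            return True
        return all(_same_shape(item, template[0]) for item in value)
    return isinstance(value, type(template))


def matches_example(response_text, example):
    """True if the response parses as JSON and has the same shape as the example."""
    try:
        data = json.loads(response_text)
    except json.JSONDecodeError:
        return False
    return _same_shape(data, example)


print(build_prompt("3 names and short bios for pet pelicans", EXAMPLE))
```

The string from `build_prompt` already contains the word "JSON", so it can be passed as the prompt alongside `-o json_object 1`, and the model's output can then be handed to `matches_example`.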
- New plugins since the last release include llm-mistral, llm-gemini, llm-ollama and llm-bedrock-meta.
I wrote the first two, but llm-ollama is by Sergey Alexandrov and llm-bedrock-meta is by Fabian Labat. My plugin writing tutorial is starting to pay off!
- The keys.json file for storing API keys is now created with 600 file permissions. #351
A neat suggestion from Christopher Bare.
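For anyone unfamiliar with the pattern, here is a rough sketch of creating a file that only its owner can read or write on a POSIX system. This is not necessarily how LLM does it: `write_private_json` and `permission_bits` are hypothetical helpers, and the key value is a made-up placeholder.

```python
import json
import os
import stat
import tempfile


def write_private_json(path, data):
    """Create (or overwrite) a JSON file readable and writable only by its owner."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as fp:
        json.dump(data, fp, indent=2)
    # The mode given to os.open only applies to newly created files,
    # so tighten permissions explicitly in case the file already existed.
    os.chmod(path, 0o600)


def permission_bits(path):
    """Return just the permission bits of a file, e.g. 0o600."""
    return stat.S_IMODE(os.stat(path).st_mode)


# Demonstrate in a throwaway directory rather than touching any real keys file.
with tempfile.TemporaryDirectory() as directory:
    demo_path = os.path.join(directory, "keys.json")
    write_private_json(demo_path, {"openai": "placeholder-key"})
    print(oct(permission_bits(demo_path)))
```

Passing the mode to `os.open` means the file never exists with looser permissions, even briefly, which is the main advantage over writing it first and calling `chmod` afterwards.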
LLM is packaged for Homebrew. The Homebrew package upgraded to Python 3.12 a while ago, which caused surprising problems because it turned out PyTorch—a dependency of some LLM plugins—doesn’t have a stable build out for 3.12 yet.
Christian Bush shared a workaround in an LLM issue thread, which I’ve now added to the documentation.
- Underlying OpenAI Python library has been upgraded to >1.0. It is possible this could cause compatibility issues with LLM plugins that also depend on that library. #325
This was the bulk of the work. OpenAI released their 1.0 Python library a couple of months ago and it had a large number of breaking changes compared to the previous release.
At the time I pinned LLM to the previous version to paper over the breaks, but this meant you could not install LLM in the same environment as some other library that needed the more recent OpenAI version.
There were a lot of changes! You can find a blow-by-blow account of the upgrade in my pull request that bundled the work.
- Arrow keys now work inside the llm chat command. #376
The recipe for doing this is so weird:
import readline
readline.parse_and_bind("\\e[D: backward-char")
readline.parse_and_bind("\\e[C: forward-char")
I asked on Mastodon if anyone knows of a less obscure solution, but it looks like that might be the best we can do!
- LLM_OPENAI_SHOW_RESPONSES=1 environment variable now outputs much more detailed information about the HTTP request and response made to OpenAI (and OpenAI-compatible) APIs. #404
This feature worked prior to the OpenAI >1.0 upgrade by tapping in to some requests internals. OpenAI dropped requests for httpx so I had to rebuild this feature from scratch.
I ended up getting a TIL out of it: Logging OpenAI API requests and responses using HTTPX.
- Dropped support for Python 3.7.
I wanted to stop seeing a pkg_resources-related warning, which meant switching to Python 3.8’s importlib.metadata. Python 3.7 hit end-of-life for support back in June 2023 so I think this is an OK change to make.
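As an illustration of the kind of lookup importlib.metadata handles without pkg_resources, here is a small sketch that reads an installed package's version. `installed_version` is a hypothetical helper, and this is not necessarily the call LLM itself needed to replace.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string for a package, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints a version string if llm is installed in this environment, otherwise None.
print(installed_version("llm"))
```

Because importlib.metadata only joined the standard library in Python 3.8, code like this is what makes dropping 3.7 necessary.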