December in LLMs has been a lot

20th December 2024

I had big plans for December: for one thing, I was hoping to get to an actual RC of Datasette 1.0, in preparation for a full release in January. Instead, I’ve found myself distracted by a constant barrage of new LLM releases.

On December 4th Amazon introduced the Amazon Nova family of multi-modal models—clearly priced to compete with the excellent and inexpensive Gemini 1.5 series from Google. I got those working with LLM via a new llm-bedrock plugin.

The next big release was Llama 3.3 70B-Instruct, on December 6th. Meta claimed that this 70B model was comparable in quality to their much larger 405B model, and those claims seem to hold weight.

I wrote about how I can now run a GPT-4 class model on my laptop—the same laptop that was running a GPT-3 class model just 20 months ago.

Llama 3.3 70B has started showing up from API providers now, including super-fast hosted versions from both Groq (276 tokens/second) and Cerebras (a quite frankly absurd 2,200 tokens/second). If you haven’t tried Val Town’s Cerebras Coder demo you really should.
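To put those speeds in perspective, here is a rough back-of-the-envelope sketch of how long each provider would take to stream a response. The 1,000-token response length is a made-up figure for illustration, and the calculation assumes a steady output rate with no network or queueing latency, so treat the results as a ballpark only.

```python
def seconds_to_generate(num_tokens: int, tokens_per_second: float) -> float:
    """Time to stream num_tokens at a steady output rate, ignoring latency."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


# Output speeds quoted in the paragraph above (tokens/second).
SPEEDS = {"Groq": 276, "Cerebras": 2200}

# Hypothetical response length, chosen only for illustration.
RESPONSE_TOKENS = 1000

for provider, speed in SPEEDS.items():
    elapsed = seconds_to_generate(RESPONSE_TOKENS, speed)
    print(f"{provider}: {elapsed:.2f}s for {RESPONSE_TOKENS} tokens")
```

Under those assumptions, a response that takes a few seconds at Groq's quoted rate finishes in under half a second at Cerebras's.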

I think the huge gains in model efficiency are one of the defining stories of LLMs in 2024. It’s not just the local models that have benefited: the price of proprietary hosted LLMs has dropped through the floor, a result of both competition between vendors and the increasing efficiency of the models themselves.

Last year the running joke was that every time Google put out a new Gemini release OpenAI would ship something more impressive that same day to undermine them.

The tides have turned! This month Google shipped four updates that took the wind out of OpenAI’s sails.

The first was gemini-exp-1206 on December 6th, an experimental model that jumped straight to the top of some of the leaderboards. Was this our first glimpse of Gemini 2.0?

That was followed by Gemini 2.0 Flash on December 11th, the first official release in Google’s Gemini 2.0 series. The streaming support was particularly impressive, with https://aistudio.google.com/live demonstrating streaming audio and webcam communication with the multi-modal LLM a full day before OpenAI released their own streaming camera/audio features in an update to ChatGPT.

Then this morning Google shipped Gemini 2.0 Flash “Thinking mode”, their version of the inference scaling technique pioneered by OpenAI’s o1. I did not expect Gemini to ship a version of that before 2024 had even ended.

OpenAI have one day left in their 12 Days of OpenAI event. Previous highlights have included the full o1 model (an upgrade from o1-preview) and o1-pro, Sora (upstaged a week later by Google’s Veo 2), Canvas (with a confusing second way to run Python), Advanced Voice with video streaming and Santa and a very cool new WebRTC streaming API, ChatGPT Projects (pretty much a direct lift of the similar Claude feature) and the 1-800-CHATGPT phone line.

Tomorrow is the last day. I’m not going to try to predict what they’ll launch, but I imagine it will be something notable to close out the year.

Update: They announced benchmarks for their new o3 model. I live-blogged their announcement here.

Blog entries

Releases

llm-gemini 0.8—2024-12-19
LLM plugin to access Google’s Gemini family of models
datasette-enrichments-slow 0.1—2024-12-18
An enrichment on a slow loop to help debug progress bars
llm-anthropic 0.11—2024-12-17
LLM access to models by Anthropic, including the Claude series
llm-openrouter 0.3—2024-12-08
LLM plugin for models hosted by OpenRouter
prompts-js 0.0.4—2024-12-08
async alternatives to browser alert() and prompt() and confirm()
datasette-enrichments-llm 0.1a0—2024-12-05
Enrich data by prompting LLMs
llm 0.19.1—2024-12-05
Access large language models from the command-line
llm-bedrock 0.4—2024-12-04
Run prompts against models hosted on AWS Bedrock
datasette-queries 0.1a0—2024-12-03
Save SQL queries in Datasette
datasette-llm-usage 0.1a0—2024-12-02
Track usage of LLM tokens in a SQLite table
llm-mistral 0.9—2024-12-02
LLM plugin providing access to Mistral models using the Mistral API
llm-claude-3 0.10—2024-12-02
LLM plugin for interacting with the Claude 3 family of models
datasette 0.65.1—2024-11-29
An open source multi-tool for exploring and publishing data
sqlite-utils-ask 0.2—2024-11-24
Ask questions of your data with LLM assistance
sqlite-utils 3.38—2024-11-23
Python CLI utility and library for manipulating SQLite databases

TILs

Fixes for datetime UTC warnings in Python—2024-12-12
Publishing a simple client-side JavaScript package to npm with GitHub Actions—2024-12-08
GitHub OAuth for a static site using Cloudflare Workers—2024-11-29

Posted 20th December 2024 at 6:30 am

Simon Willison’s Weblog