December in LLMs has been a lot
20th December 2024
I had big plans for December: for one thing, I was hoping to get to an actual RC of Datasette 1.0, in preparation for a full release in January. Instead, I’ve found myself distracted by a constant barrage of new LLM releases.
On December 4th Amazon introduced the Amazon Nova family of multi-modal models—clearly priced to compete with the excellent and inexpensive Gemini 1.5 series from Google. I got those working with LLM via a new llm-bedrock plugin.
The next big release was Llama 3.3 70B-Instruct, on December 6th. Meta claimed that this 70B model was comparable in quality to their much larger 405B model, and those claims seem to hold up.
I wrote about how I can now run a GPT-4 class model on my laptop—the same laptop that was running a GPT-3 class model just 20 months ago.
Llama 3.3 70B is now showing up from API providers, including super-fast hosted versions from both Groq (276 tokens/second) and Cerebras (a quite frankly absurd 2,200 tokens/second). If you haven’t tried Val Town’s Cerebras Coder demo, you really should.
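To put those throughput figures in perspective, here’s a quick back-of-envelope calculation using the reported speeds. The 1,000-token response length is an illustrative assumption, not a number from either provider:

```python
# Time to generate a 1,000-token response at each provider's reported
# throughput for Llama 3.3 70B (response length is illustrative).
response_tokens = 1000

groq_tps = 276       # Groq's reported tokens/second
cerebras_tps = 2200  # Cerebras's reported tokens/second

groq_seconds = response_tokens / groq_tps
cerebras_seconds = response_tokens / cerebras_tps

print(f"Groq:     {groq_seconds:.2f}s")      # ~3.62s
print(f"Cerebras: {cerebras_seconds:.2f}s")  # ~0.45s
```

At Cerebras speeds a substantial response lands in under half a second, which is why demos built on it feel effectively instant.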
I think the huge gains in model efficiency are one of the defining stories of LLMs in 2024. It’s not just the local models that have benefited: the price of proprietary hosted LLMs has dropped through the floor, a result of both competition between vendors and the increasing efficiency of the models themselves.
Last year the running joke was that every time Google put out a new Gemini release OpenAI would ship something more impressive that same day to undermine them.
The tides have turned! This month Google shipped four updates that took the wind out of OpenAI’s sails.
The first was gemini-exp-1206 on December 6th, an experimental model that jumped straight to the top of some of the leaderboards. Was this our first glimpse of Gemini 2.0?
That was followed by Gemini 2.0 Flash on December 11th, the first official release in Google’s Gemini 2.0 series. The streaming support was particularly impressive, with https://aistudio.google.com/live demonstrating streaming audio and webcam communication with the multi-modal LLM a full day before OpenAI released their own streaming camera/audio features in an update to ChatGPT.
Then this morning Google shipped Gemini 2.0 Flash “Thinking mode”, their version of the inference scaling technique pioneered by OpenAI’s o1. I did not expect Gemini to ship a version of that before 2024 had even ended.
OpenAI have one day left in their 12 Days of OpenAI event. Previous highlights have included the full o1 model (an upgrade from o1-preview) and o1-pro, Sora (upstaged a week later by Google’s Veo 2), Canvas (with a confusing second way to run Python), Advanced Voice with video streaming, Santa mode and a very cool new WebRTC streaming API, ChatGPT Projects (pretty much a direct lift of the similar Claude feature), and the 1-800-CHATGPT phone line.
Tomorrow is the last day. I’m not going to try to predict what they’ll launch, but I imagine it will be something notable to close out the year.
Blog entries
- Gemini 2.0 Flash “Thinking mode”
- Building Python tools with a one-shot prompt using uv run and Claude Projects
- Gemini 2.0 Flash: An outstanding multi-modal LLM with a sci-fi streaming mode
- ChatGPT Canvas can make API requests now, but it’s complicated
- I can now run a GPT-4 class model on my laptop
- Prompts.js
- First impressions of the new Amazon Nova LLMs (via a new llm-bedrock plugin)
- Storing times for human events
- Ask questions of SQLite databases and CSV/JSON files in your terminal
Releases
- llm-gemini 0.8—2024-12-19
  LLM plugin to access Google’s Gemini family of models
- datasette-enrichments-slow 0.1—2024-12-18
  An enrichment on a slow loop to help debug progress bars
- llm-anthropic 0.11—2024-12-17
  LLM access to models by Anthropic, including the Claude series
- llm-openrouter 0.3—2024-12-08
  LLM plugin for models hosted by OpenRouter
- prompts-js 0.0.4—2024-12-08
  async alternatives to browser alert() and prompt() and confirm()
- datasette-enrichments-llm 0.1a0—2024-12-05
  Enrich data by prompting LLMs
- llm 0.19.1—2024-12-05
  Access large language models from the command-line
- llm-bedrock 0.4—2024-12-04
  Run prompts against models hosted on AWS Bedrock
- datasette-queries 0.1a0—2024-12-03
  Save SQL queries in Datasette
- datasette-llm-usage 0.1a0—2024-12-02
  Track usage of LLM tokens in a SQLite table
- llm-mistral 0.9—2024-12-02
  LLM plugin providing access to Mistral models using the Mistral API
- llm-claude-3 0.10—2024-12-02
  LLM plugin for interacting with the Claude 3 family of models
- datasette 0.65.1—2024-11-29
  An open source multi-tool for exploring and publishing data
- sqlite-utils-ask 0.2—2024-11-24
  Ask questions of your data with LLM assistance
- sqlite-utils 3.38—2024-11-23
  Python CLI utility and library for manipulating SQLite databases