Weeknotes: asynchronous LLMs, synchronous embeddings, and I kind of started a podcast
22nd November 2024
These past few weeks I’ve been bringing Datasette and LLM together and distracting myself with a new sort-of-podcast crossed with a live streaming experiment.
- Project: interviewing people about their projects
- Datasette Public Office Hours
- Async LLM
- Various embedding models
- Blog entries
- Releases
- TILs
Project: interviewing people about their projects
My response to the recent US election was to stress-code, and then to stress-podcast. On the morning after the election I started a video series called Project (I guess you could call it a “vlog”?) where I interview people about their interesting data projects. The first episode was with Rajiv Sinclair talking about his project VERDAD, tracking misinformation on US broadcast radio. The second was with Philip James talking about Civic Band, his project to scrape and search PDF meeting minutes and agendas from US local municipalities.
I was a guest on another podcast-like thing too: an Ars Technica Live session with Benj Edwards, which I wrote about in Notes from Bing Chat—Our First Encounter With Manipulative AI.
Datasette Public Office Hours
I also started a new thing with Alex Garcia called Datasette Public Office Hours, which we plan to run approximately once every two weeks as a live-streamed Friday conversation about Datasette and related projects. I wrote up our first session in Visualizing local election results with Datasette, Observable and MapLibre GL. The Civic Band interview was part of our second session—I still need to write about the rest of that session about sqlite-vec, embeddings and some future Datasette AI features, but you can watch the full video on YouTube.
Async LLM
I need to write this up in full, but last weekend I quietly released LLM 0.18 with a huge new feature: plugins can now provide asynchronous versions of their models, ready to be used with Python’s asyncio. I built this for Datasette, which is built entirely around ASGI and needs to be able to run LLM models asynchronously to enable all sorts of interesting AI features.
LLM provides async OpenAI models, and I’ve also released versions of the llm-gemini, llm-claude-3 and llm-mistral plugins that enable async models as well.
Here’s the documentation, but the short version is that you can now do this:
    import llm

    model = llm.get_async_model("claude-3.5-sonnet")
    async for chunk in model.prompt(
        "Five surprising names for a pet pelican"
    ):
        print(chunk, end="", flush=True)
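One reason async models matter for a server like Datasette is that several prompts can stream at the same time without blocking each other. Here is a rough sketch of that pattern. Note that `fake_prompt` is a hypothetical stand-in that only mimics the chunk-by-chunk iteration shown above; it is not part of LLM. With a real model you would iterate over `model.prompt(...)` instead.

```python
import asyncio


async def fake_prompt(text, delay=0.01):
    """Hypothetical stand-in for an async model response: yields chunks with a pause."""
    for word in text.split():
        await asyncio.sleep(delay)  # simulates waiting on the network
        yield word + " "


async def collect(text):
    """Consume a streaming response chunk by chunk and return the joined text."""
    chunks = []
    async for chunk in fake_prompt(text):
        chunks.append(chunk)
    return "".join(chunks).strip()


async def run_concurrently(prompts):
    """Run every prompt at the same time; results come back in input order."""
    return await asyncio.gather(*(collect(p) for p in prompts))


if __name__ == "__main__":
    results = asyncio.run(run_concurrently(["Percy the pelican", "Captain Gulp"]))
    print(results)
```

Because each `await` hands control back to the event loop, the two responses are interleaved rather than run one after the other, which is the property an ASGI application relies on.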
I’ve also been working on adding token accounting to LLM, to keep track of how many input and output tokens a prompt has used across multiple different models. I have an alpha release with that but it’s not yet fully stable.
I need this for both Datasette and Datasette Cloud: I want to track token usage and grant users a free daily allowance of tokens that gets cut off once they’ve exhausted it. That’s an active project right now. More on that once it’s ready to ship in a release.
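To make the daily allowance idea concrete, here is a minimal in-memory sketch. Everything in it is hypothetical: the `DailyTokenAllowance` class, its method names and the 1,000 token limit are invented for illustration and are not how LLM, Datasette or Datasette Cloud implement this.

```python
from collections import defaultdict
from datetime import date


class DailyTokenAllowance:
    """Hypothetical tracker: each user gets `limit` tokens per calendar day."""

    def __init__(self, limit):
        self.limit = limit
        # Maps (user_id, day) -> tokens used so far on that day
        self.used = defaultdict(int)

    def remaining(self, user_id, day=None):
        """Tokens left for this user on the given day (never negative)."""
        day = day or date.today()
        return max(0, self.limit - self.used[(user_id, day)])

    def record(self, user_id, input_tokens, output_tokens, day=None):
        """Add the tokens a prompt consumed to the user's total for the day."""
        day = day or date.today()
        self.used[(user_id, day)] += input_tokens + output_tokens

    def allowed(self, user_id, day=None):
        """A user may start a new prompt only while some allowance is left."""
        return self.remaining(user_id, day) > 0


allowance = DailyTokenAllowance(limit=1000)
day = date(2024, 11, 22)
allowance.record("alice", input_tokens=600, output_tokens=300, day=day)
print(allowance.remaining("alice", day=day))  # 100
```

Token counts are only known once a prompt has finished, so in this sketch a user can overshoot the limit on their final prompt of the day. A real system would also need persistent storage rather than a dictionary.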
Various embedding models
LLM doesn’t yet offer asynchronous embeddings (see issue #628) but I’ve found myself hacking on a few different embeddings plugins anyway:
- llm-gguf now supports embedding models distributed as GGUF files. This means you can use the excitingly small (just 30.8MB) mxbai-embed-xsmall-v1 with LLM.
- llm-nomic-api-embed added support for the Nomic Embed Vision models. These work like CLIP in that you can embed both images and text in the same space, allowing you to do similarity search of a text string against a collection of images.
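Searching for a text string against a collection of images comes down to comparing vectors that live in the same space. The following sketch shows the idea using cosine similarity in plain Python. The three-dimensional vectors and file names are made up for illustration (real embeddings have hundreds of dimensions), and the helper names are my own, not part of LLM or the Nomic API.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(query_vector, collection, n=3):
    """Rank items in `collection` (a dict of id -> vector) by similarity to the query."""
    scored = [
        (cosine_similarity(query_vector, vector), item_id)
        for item_id, vector in collection.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [(item_id, score) for score, item_id in scored[:n]]


# Made-up vectors standing in for image embeddings
image_vectors = {
    "pelican.jpg": [0.9, 0.1, 0.0],
    "bicycle.jpg": [0.0, 0.2, 0.9],
    "seagull.jpg": [0.7, 0.3, 0.1],
}
# Pretend this is the embedding of the text "a large seabird"
text_vector = [1.0, 0.0, 0.0]

for item_id, score in most_similar(text_vector, image_vectors, n=2):
    print(item_id, round(score, 3))
```

With these invented numbers the two bird images rank above the bicycle, because their vectors point in nearly the same direction as the query.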
Blog entries
- Notes from Bing Chat—Our First Encounter With Manipulative AI
- Project: Civic Band—scraping and searching PDF meeting minutes from hundreds of municipalities
- Qwen2.5-Coder-32B is an LLM that can code well that runs on my Mac
- Visualizing local election results with Datasette, Observable and MapLibre GL
- Project: VERDAD—tracking misinformation in radio broadcasts using Gemini 1.5
- Claude 3.5 Haiku
Releases
- llm-gemini 0.4.2—2024-11-22
  LLM plugin to access Google’s Gemini family of models
- llm-nomic-api-embed 0.3—2024-11-21
  Create embeddings for LLM using the Nomic API
- llm-gguf 0.2—2024-11-21
  Run models distributed as GGUF files using LLM
- llm 0.19a2—2024-11-21
  Access large language models from the command-line
- llm-mistral 0.9a0—2024-11-20
  LLM plugin providing access to Mistral models using the Mistral API
- llm-claude-3 0.10a0—2024-11-20
  LLM plugin for interacting with the Claude 3 family of models
- asgi-csrf 0.11—2024-11-15
  ASGI middleware for protecting against CSRF attacks
- sqlite-utils 3.38a0—2024-11-08
  Python CLI utility and library for manipulating SQLite databases
- asgi-proxy-lib 0.2a0—2024-11-06
  An ASGI function for proxying to a backend over HTTP
- llm-lambda-labs 0.1a0—2024-11-04
  Run prompts against LLMs hosted by lambdalabs.com
- llm-groq-whisper 0.1a0—2024-11-01
  Transcribe audio using the Groq.com Whisper API
TILs