Simon Willison’s Weblog

Subscribe
Atom feed for llm Random

562 posts tagged “llm”

LLM is my command-line tool for running prompts against Large Language Models.

2024

First impressions of the new Amazon Nova LLMs (via a new llm-bedrock plugin)

Visit First impressions of the new Amazon Nova LLMs (via a new llm-bedrock plugin)

Amazon released three new Large Language Models yesterday at their AWS re:Invent conference. The new model family is called Amazon Nova and comes in three sizes: Micro, Lite and Pro.

[... 2,385 words]

Release llm-bedrock 0.4 — Run prompts against models hosted on AWS Bedrock
Release llm-bedrock 0.3.1 — Run prompts against models hosted on AWS Bedrock
Release llm-bedrock 0.3 — Run prompts against models hosted on AWS Bedrock
Release llm-bedrock 0.2.1 — Run prompts against models hosted on AWS Bedrock
Release llm-bedrock 0.2 — Run prompts against models hosted on AWS Bedrock
Release llm-bedrock 0.1a0 — Run prompts against models hosted on AWS Bedrock

datasette-queries. I released the first alpha of a new plugin to replace the crusty old datasette-saved-queries. This one adds a new UI element to the top of the query results page with an expandable form for saving the query as a new canned query:

Animated demo. I start on the table page, run a search, click View and edit SQL, then on the SQL query page open a Save query dialog, click a Suggest title and description button, wait for that to suggest something and click save.

It's my first plugin to depend on LLM and datasette-llm-usage - it uses GPT-4o mini to power an optional "Suggest title and description" button, labeled with the becoming-standard ✨ sparkles emoji to indicate an LLM-powered feature.

I intend to expand this to work across multiple models as I continue to iterate on llm-datasette-usage to better support those kinds of patterns.

For the moment though each suggested title and description call costs about 250 input tokens and 50 output tokens, which against GPT-4o mini adds up to 0.0067 cents.

# 3rd December 2024, 11:59 pm / projects, releases, datasette, plugins, llm, generative-ai, openai, ai, llms

datasette-llm-usage. I released the first alpha of a Datasette plugin to help track LLM usage by other plugins, with the goal of supporting token allowances - both for things like free public apps that stop working after a daily allowance, plus free previews of AI features for paid-account-based projects such as Datasette Cloud.

It's using the usage features I added in LLM 0.19.

The alpha doesn't do much yet - it will start getting interesting once I upgrade other plugins to depend on it.

Design notes so far in issue #1.

# 2nd December 2024, 9:33 pm / llm, datasette-cloud, plugins, ai, llms, datasette, generative-ai, projects, releases

PydanticAI (via) New project from Pydantic, which they describe as an "Agent Framework / shim to use Pydantic with LLMs".

I asked which agent definition they are using and it's the "system prompt with bundled tools" one. To their credit, they explain that in their documentation:

The Agent has full API documentation, but conceptually you can think of an agent as a container for:

  • A system prompt — a set of instructions for the LLM written by the developer
  • One or more retrieval tool — functions that the LLM may call to get information while generating a response
  • An optional structured result type — the structured datatype the LLM must return at the end of a run

Given how many other existing tools already lean on Pydantic to help define JSON schemas for talking to LLMs this is an interesting complementary direction for Pydantic to take.

There's some overlap here with my own LLM project, which I still hope to add a function calling / tools abstraction to in the future.

# 2nd December 2024, 9:08 pm / llm, python, generative-ai, llms, pydantic, llm-tool-use, ai-agents, agent-definitions

Release datasette-llm-usage 0.1a0 — Track usage of LLM tokens in a SQLite table
Release llm-mistral 0.9 — LLM plugin providing access to Mistral models using the Mistral API
Release llm-gemini 0.5 — LLM plugin to access Google's Gemini family of models
Release llm-claude-3 0.10 — LLM plugin for interacting with the Claude 3 family of models

LLM 0.19. I just released version 0.19 of LLM, my Python library and CLI utility for working with Large Language Models.

I released 0.18 a couple of weeks ago adding support for calling models from Python asyncio code. 0.19 improves on that, and also adds a new mechanism for models to report their token usage.

LLM can log those usage numbers to a SQLite database, or make then available to custom Python code.

My eventual goal with these features is to implement token accounting as a Datasette plugin so I can offer AI features in my SaaS platform without worrying about customers spending unlimited LLM tokens.

Those 0.19 release notes in full:

  • Tokens used by a response are now logged to new input_tokens and output_tokens integer columns and a token_details JSON string column, for the default OpenAI models and models from other plugins that implement this feature. #610
  • llm prompt now takes a -u/--usage flag to display token usage at the end of the response.
  • llm logs -u/--usage shows token usage information for logged responses.
  • llm prompt ... --async responses are now logged to the database. #641
  • llm.get_models() and llm.get_async_models() functions, documented here. #640
  • response.usage() and async response await response.usage() methods, returning a Usage(input=2, output=1, details=None) dataclass. #644
  • response.on_done(callback) and await response.on_done(callback) methods for specifying a callback to be executed when a response has completed, documented here. #653
  • Fix for bug running llm chat on Windows 11. Thanks, Sukhbinder Singh. #495

I also released three new plugin versions that add support for the new usage tracking feature: llm-gemini 0.5, llm-claude-3 0.10 and llm-mistral 0.9.

# 1st December 2024, 11:59 pm / llm, releasenotes, generative-ai, projects, ai, llms, releases, cli

Release llm 0.19 — Access large language models from the command-line

QwQ: Reflect Deeply on the Boundaries of the Unknown. Brand new openly licensed (Apache 2) model from Alibaba Cloud's Qwen team, this time clearly inspired by OpenAI's work on reasoning in o1.

I love the flowery language they use to introduce the new model:

Through deep exploration and countless trials, we discovered something profound: when given time to ponder, to question, and to reflect, the model’s understanding of mathematics and programming blossoms like a flower opening to the sun. Just as a student grows wiser by carefully examining their work and learning from mistakes, our model achieves deeper insight through patient, thoughtful analysis.

It's already available through Ollama as a 20GB download. I initially ran it like this:

ollama run qwq

This downloaded the model and started an interactive chat session. I tried the classic "how many rs in strawberry?" and got this lengthy but correct answer, which concluded:

Wait, but maybe I miscounted. Let's list them: 1. s 2. t 3. r 4. a 5. w 6. b 7. e 8. r 9. r 10. y Yes, definitely three "r"s. So, the word "strawberry" contains three "r"s.

Then I switched to using LLM and the llm-ollama plugin. I tried prompting it for Python that imports CSV into SQLite:

Write a Python function import_csv(conn, url, table_name) which acceopts a connection to a SQLite databse and a URL to a CSV file and the name of a table - it then creates that table with the right columns and imports the CSV data from that URL

It thought through the different steps in detail and produced some decent looking code.

Finally, I tried this:

llm -m qwq 'Generate an SVG of a pelican riding a bicycle'

For some reason it answered in Simplified Chinese. It opened with this:

生成一个SVG图像,内容是一只鹈鹕骑着一辆自行车。这听起来挺有趣的!我需要先了解一下什么是SVG,以及如何创建这样的图像。

Which translates (using Google Translate) to:

Generate an SVG image of a pelican riding a bicycle. This sounds interesting! I need to first understand what SVG is and how to create an image like this.

It then produced a lengthy essay discussing the many aspects that go into constructing a pelican on a bicycle - full transcript here. After a full 227 seconds of constant output it produced this as the final result.

You can tell which bit is the bicycle and which bit is the pelican. It's quite elegant.

I think that's pretty good!

# 27th November 2024, 11:59 pm / llm, ollama, generative-ai, ai, qwen, llms, local-llms, pelican-riding-a-bicycle, llm-reasoning, llm-release, ai-in-china

Ask questions of SQLite databases and CSV/JSON files in your terminal

Visit Ask questions of SQLite databases and CSV/JSON files in your terminal

I built a new plugin for my sqlite-utils CLI tool that lets you ask human-language questions directly of SQLite databases and CSV/JSON files on your computer.

[... 723 words]

Release sqlite-utils-ask 0.2 — Ask questions of your data with LLM assistance

Weeknotes: asynchronous LLMs, synchronous embeddings, and I kind of started a podcast

These past few weeks I’ve been bringing Datasette and LLM together and distracting myself with a new sort-of-podcast crossed with a live streaming experiment.

[... 896 words]

Say hello to gemini-exp-1121. Google Gemini's Logan Kilpatrick on Twitter:

Say hello to gemini-exp-1121! Our latest experimental gemini model, with:

  • significant gains on coding performance
  • stronger reasoning capabilities
  • improved visual understanding

Available on Google AI Studio and the Gemini API right now

The 1121 in the name is a release date of the 21st November. This comes fast on the heels of last week's gemini-exp-1114.

Both of these new experimental Gemini models have seen moments at the top of the Chatbot Arena. gemini-exp-1114 took the top spot a few days ago, and then lost it to a new OpenAI model called "ChatGPT-4o-latest (2024-11-20)"... only for the new gemini-exp-1121 to hold the top spot right now.

(These model names are all so, so bad.)

I released llm-gemini 0.4.2 with support for the new model - this should have been 0.5 but I already have a 0.5a0 alpha that depends on an unreleased feature in LLM core.

I tried my pelican benchmark:

llm -m gemini-exp-1121 'Generate an SVG of a pelican riding a bicycle'
Not great at all, description follows

Since Gemini is a multi-modal vision model, I had it describe the image it had created back to me (by feeding it a PNG render):

llm -m gemini-exp-1121 describe -a pelican.png

And got this description, which is pretty great:

The image shows a simple, stylized drawing of an insect, possibly a bee or an ant, on a vehicle. The insect is composed of a large yellow circle for the body and a smaller yellow circle for the head. It has a black dot for an eye, a small orange oval for a beak or mouth, and thin black lines for antennae and legs. The insect is positioned on top of a simple black and white vehicle with two black wheels. The drawing is abstract and geometric, using basic shapes and a limited color palette of black, white, yellow, and orange.

Update: Logan confirmed on Twitter that these models currently only have a 32,000 token input, significantly less than the rest of the Gemini family.

# 22nd November 2024, 6:14 am / vision-llms, gemini, llm, google, generative-ai, ai, llms, logan-kilpatrick, pelican-riding-a-bicycle, llm-release, chatbot-arena

Release llm-gemini 0.4.2 — LLM plugin to access Google's Gemini family of models
Release llm-nomic-api-embed 0.3 — Create embeddings for LLM using the Nomic API
Release llm-nomic-api-embed 0.2 — Create embeddings for LLM using the Nomic API

llm-gguf 0.2, now with embeddings. This new release of my llm-gguf plugin - which provides support for locally hosted GGUF LLMs - adds a new feature: it now supports embedding models distributed as GGUFs as well.

This means you can use models like the bafflingly small (30.8MB in its smallest quantization) mxbai-embed-xsmall-v1 with LLM like this:

llm install llm-gguf
llm gguf download-embed-model \
  'https://huggingface.co/mixedbread-ai/mxbai-embed-xsmall-v1/resolve/main/gguf/mxbai-embed-xsmall-v1-q8_0.gguf'

Then to embed a string:

llm embed -m gguf/mxbai-embed-xsmall-v1-q8_0 -c 'hello'

The LLM docs have extensive coverage of things you can then do with this model, like embedding every row in a CSV file / file in a directory / record in a SQLite database table and running similarity and semantic search against them.

Under the hood this takes advantage of the create_embedding() method provided by the llama-cpp-python wrapper around llama.cpp.

# 21st November 2024, 7:24 am / llm, generative-ai, projects, ai, embeddings, llama-cpp

Release llm-gguf 0.2 — Run models distributed as GGUF files using LLM
Release llm 0.19a2 — Access large language models from the command-line
Release llm-mistral 0.9a0 — LLM plugin providing access to Mistral models using the Mistral API
Release llm 0.19a1 — Access large language models from the command-line
Release llm-gemini 0.5a0 — LLM plugin to access Google's Gemini family of models