Entries in 2024

Things we learned about LLMs in 2024

A lot has happened in the world of Large Language Models over the course of 2024. Here’s a review of things we figured out about the field in the past twelve months, plus my attempt at identifying key themes and pivotal moments.

[... 7,490 words]

6:07 pm / 31st December 2024 / google, ai, openai, generative-ai, local-llms, llms, anthropic, gemini, meta, llm-reasoning, long-context, ai-energy-usage, coding-agents

Trying out QvQ—Qwen’s new visual reasoning model

I thought we were done for major model releases in 2024, but apparently not: Alibaba’s Qwen team just dropped the ~~Apache 2.0 licensed~~ Qwen licensed (the license changed) QvQ-72B-Preview, “an experimental research model focusing on enhancing visual reasoning capabilities”.

[... 1,838 words]

8:49 pm / 24th December 2024 / python, ai, generative-ai, local-llms, llms, hugging-face, vision-llms, uv, qwen, mlx, llm-reasoning, llm-release, prince-canuma

My approach to running a link blog

I started running a basic link blog on this domain back in November 2003—publishing links (which I called “blogmarks”) with a title, URL, short snippet of commentary and a “via” link where appropriate.

[... 1,510 words]

6:37 pm / 22nd December 2024 / blogging, django, django-admin, john-gruber

Live blog: the 12th day of OpenAI—“Early evals for OpenAI o3”

It’s the final day of OpenAI’s 12 Days of OpenAI launch series, and since I built a live blogging system a couple of months ago I’ve decided to roll it out again to provide live commentary during the half hour event, which kicks off at 10am San Francisco time.

[... 76 words]

5:40 pm / 20th December 2024 / ai, openai, prompt-injection, generative-ai, llms, o1, llm-reasoning, o3

December in LLMs has been a lot

I had big plans for December: for one thing, I was hoping to get to an actual RC of Datasette 1.0, in preparation for a full release in January. Instead, I’ve found myself distracted by a constant barrage of new LLM releases.

[... 901 words]

6:30 am / 20th December 2024 / google, ai, weeknotes, openai, generative-ai, chatgpt, llms, gemini, o1, llm-reasoning

Gemini 2.0 Flash “Thinking mode”

Those new model releases just keep on flowing. Today it’s Google’s snappily named gemini-2.0-flash-thinking-exp, their first entrant into the o1-style inference scaling class of models. I wrote about a great essay on the significance of these just this morning.

[... 569 words]

11:59 pm / 19th December 2024 / google, ai, generative-ai, llms, llm, gemini, o1, pelican-riding-a-bicycle, llm-reasoning, llm-release

Building Python tools with a one-shot prompt using uv run and Claude Projects

I’ve written a lot about how I’ve been using Claude to build one-shot HTML+JavaScript applications via Claude Artifacts. I recently started using a similar pattern to create one-shot Python utilities, using a custom Claude Project combined with the dependency management capabilities of uv.

[... 899 words]

7 am / 19th December 2024 / aws, cli, python, s3, ai, prompt-engineering, generative-ai, llms, ai-assisted-programming, claude, claude-artifacts, uv

Gemini 2.0 Flash: An outstanding multi-modal LLM with a sci-fi streaming mode

Huge announcement from Google this morning: Introducing Gemini 2.0: our new AI model for the agentic era. There’s a ton of stuff in there (including updates on Project Astra and the new Project Mariner), but the most interesting pieces are the things we can start using today, built around the brand new Gemini 2.0 Flash model. The developer blog post has more of the technical details, and the Gemini 2.0 Cookbook is useful for understanding the API via Python code examples.

[... 1,740 words]

8:16 pm / 11th December 2024 / google, ai, generative-ai, llms, gemini, vision-llms, multi-modal-output, llm-release

ChatGPT Canvas can make API requests now, but it’s complicated

Today’s 12 Days of OpenAI release concerned ChatGPT Canvas, a new ChatGPT feature that enables ChatGPT to pop open a side panel with a shared editor in it where you can collaborate with ChatGPT on editing a document or writing code.

[... 1,116 words]

9:49 pm / 10th December 2024 / python, security, usability, ai, webassembly, pyodide, openai, prompt-injection, generative-ai, chatgpt, llms, claude-artifacts, cors

I can now run a GPT-4 class model on my laptop

Meta’s new Llama 3.3 70B is a genuinely GPT-4 class Large Language Model that runs on my laptop.

[... 2,905 words]

3:08 pm / 9th December 2024 / python, ai, generative-ai, llama, gpt-4, local-llms, llms, ai-assisted-programming, llm, meta, uv, mlx, ollama, pelican-riding-a-bicycle

Prompts.js

I’ve been putting the new o1 model from OpenAI through its paces, in particular for code. I’m very impressed—it feels like it’s giving me a similar code quality to Claude 3.5 Sonnet, at least for Python and JavaScript and Bash... but it’s returning output noticeably faster.

[... 1,119 words]

8:35 pm / 7th December 2024 / javascript, projects, releases, npm, openai, llms, ai-assisted-programming, llm, gemini, claude-3-5-sonnet, o1

First impressions of the new Amazon Nova LLMs (via a new llm-bedrock plugin)

Amazon released three new Large Language Models yesterday at their AWS re:Invent conference. The new model family is called Amazon Nova and comes in three sizes: Micro, Lite and Pro.

[... 2,385 words]

3:50 pm / 4th December 2024 / amazon, projects, releases, ai, openai, generative-ai, llms, llm, anthropic, gemini, vision-llms, llm-pricing, multi-modal-output, llm-release

Storing times for human events

I’ve worked on various event websites in the past, and one of the unintuitively difficult problems that inevitably comes up is the best way to store the time that an event is happening. Based on that past experience, here’s my current recommendation.

[... 1,676 words]

8:45 pm / 27th November 2024 / databases, events, time, timezones

Ask questions of SQLite databases and CSV/JSON files in your terminal

I built a new plugin for my sqlite-utils CLI tool that lets you ask human-language questions directly of SQLite databases and CSV/JSON files on your computer.

[... 723 words]

1:33 am / 25th November 2024 / cli, plugins, projects, sqlite, ai, sqlite-utils, generative-ai, llms, ai-assisted-programming, llm

Weeknotes: asynchronous LLMs, synchronous embeddings, and I kind of started a podcast

These past few weeks I’ve been bringing Datasette and LLM together and distracting myself with a new sort-of-podcast crossed with a live streaming experiment.

[... 896 words]

10:35 pm / 22nd November 2024 / podcasts, projects, datasette, weeknotes, embeddings, llm

Notes from Bing Chat—Our First Encounter With Manipulative AI

I participated in an Ars Live conversation with Benj Edwards of Ars Technica today, talking about that wild period of LLM history last year when Microsoft launched Bing Chat and it instantly started misbehaving, gaslighting and defaming people.

[... 438 words]

10:41 pm / 19th November 2024 / arstechnica, bing, ethics, microsoft, podcasts, my-talks, ai, openai, generative-ai, gpt-4, llms, benj-edwards, podcast-appearances, ai-ethics, ai-assisted-search, ai-personality

Project: Civic Band—scraping and searching PDF meeting minutes from hundreds of municipalities

I interviewed Philip James about Civic Band, his “slowly growing collection of databases of the minutes from civic governments”. Philip demonstrated the site and talked through his pipeline for scraping and indexing meeting minutes from many different local government authorities around the USA.

[... 762 words]

10:14 pm / 16th November 2024 / data-journalism, political-hacking, politics, sqlite, datasette, datasette-public-office-hours

Qwen2.5-Coder-32B is an LLM that can code well that runs on my Mac

There’s a whole lot of buzz around the new Qwen2.5-Coder Series of open source (Apache 2.0 licensed) LLM releases from Alibaba’s Qwen research team. On first impression it looks like the buzz is well deserved.

[... 697 words]

11:37 pm / 12th November 2024 / open-source, ai, generative-ai, local-llms, llms, ai-assisted-programming, llm, uv, qwen, mlx, ollama, pelican-riding-a-bicycle, paul-gauthier, llm-release

Visualizing local election results with Datasette, Observable and MapLibre GL

Alex Garcia and I hosted the first Datasette Open Office Hours on Friday—a live-streamed video session where we hacked on a project together and took questions and tips from community members on Discord.

[... 3,390 words]

11:32 pm / 9th November 2024 / geospatial, gis, mapping, politics, projects, datasette, datasette-cloud, alex-garcia, datasette-public-office-hours

Project: VERDAD—tracking misinformation in radio broadcasts using Gemini 1.5

I’m starting a new interview series called Project. The idea is to interview people who are building interesting data projects and talk about what they’ve built, how they built it, and what they learned along the way.

[... 1,025 words]

6:41 pm / 7th November 2024 / data-journalism, youtube, ai, prompt-engineering, generative-ai, llms, gemini

Claude 3.5 Haiku

Anthropic released Claude 3.5 Haiku today, a few days later than expected (they said it would be out by the end of October).

[... 502 words]

7:34 pm / 4th November 2024 / ai, openai, generative-ai, llms, llm, anthropic, claude, gemini, llm-pricing, llm-release

W̶e̶e̶k̶n̶o̶t̶e̶s̶ Monthnotes for October

I try to publish weeknotes at least once every two weeks. It’s been four since the last entry, so I guess this one counts as monthnotes instead.

[... 797 words]

4:20 am / 30th October 2024 / weeknotes, llms, llm

You can now run prompts against images, audio and video in your terminal using LLM

I released LLM 0.17 last night, the latest version of my combined CLI tool and Python library for interacting with hundreds of different Large Language Models such as GPT-4o, Llama, Claude and Gemini.

[... 1,399 words]

3:09 pm / 29th October 2024 / cli, projects, ai, openai, generative-ai, local-llms, llms, llm, anthropic, claude, mistral, gemini, vision-llms, llm-pricing

Run a prompt to generate and execute jq programs using llm-jq

llm-jq is a brand new plugin for LLM which lets you pipe JSON directly into the llm jq command along with a human-language description of how you’d like to manipulate that JSON and have a jq program generated and executed for you on the fly.

[... 417 words]

4:26 am / 27th October 2024 / cli, plugins, projects, thomas-ptacek, ai, jq, prompt-engineering, generative-ai, llms, ai-assisted-programming, llm

Notes on the new Claude analysis JavaScript code execution tool

Anthropic released a new feature for their Claude.ai consumer-facing chat bot interface today which they’re calling “the analysis tool”.

[... 918 words]

8:22 pm / 24th October 2024 / javascript, webworkers, ai, prompt-engineering, generative-ai, llms, ai-assisted-programming, anthropic, claude, code-interpreter, alex-albert, llm-tool-use, claude-artifacts, coding-agents

Initial explorations of Anthropic’s new Computer Use capability

Two big announcements from Anthropic today: a new Claude 3.5 Sonnet model and a new API mode that they are calling computer use.

[... 1,569 words]

5:38 pm / 22nd October 2024 / ai, docker, prompt-engineering, prompt-injection, generative-ai, llms, anthropic, claude, llm-tool-use, claude-3-5-sonnet, ai-agents

Everything I built with Claude Artifacts this week

I’m a huge fan of Claude’s Artifacts feature, which lets you prompt Claude to create an interactive Single Page App (using HTML, CSS and JavaScript) and then view the result directly in the Claude interface, iterating on it further with the bot and then, if you like, copying out the resulting code.

[... 2,273 words]

2:32 pm / 21st October 2024 / javascript, projects, tools, ai, pyodide, generative-ai, llms, ai-assisted-programming, anthropic, claude, claude-artifacts, claude-3-5-sonnet

Running Llama 3.2 Vision and Phi-3.5 Vision on a Mac with mistral.rs

mistral.rs is an LLM inference library written in Rust by Eric Buehler. Today I figured out how to use it to run the Llama 3.2 Vision and Phi-3.5 Vision models on my Mac.

[... 1,231 words]

4:14 pm / 19th October 2024 / microsoft, python, ai, rust, generative-ai, llama, local-llms, llms, mistral, phi, vision-llms, meta

Experimenting with audio input and output for the OpenAI Chat Completion API

OpenAI promised this at DevDay a few weeks ago and now it’s here: their Chat Completion API can now accept audio as input and return it as output. OpenAI still recommend their WebSocket-based Realtime API for audio tasks, but the Chat Completion API is a whole lot easier to write code against.

[... 1,555 words]

3:17 pm / 18th October 2024 / audio, projects, ai, openai, generative-ai, gpt-4, llms, ai-assisted-programming, claude, llm-pricing

Video scraping: extracting JSON data from a 35 second screen capture for less than 1/10th of a cent

The other day I found myself needing to add up some numeric values that were scattered across twelve different emails.

[... 1,294 words]

12:32 pm / 17th October 2024 / data-journalism, gmail, google, scraping, ai, generative-ai, llms, ai-assisted-programming, claude, gemini, vision-llms, claude-artifacts, claude-3-5-sonnet

Simon Willison’s Weblog