Simon Willison’s Weblog

Subscribe

Blogmarks in Jun, 2023

Filters: Type: blogmark × Year: 2023 × Month: Jun × Sorted by date


Databricks Signs Definitive Agreement to Acquire MosaicML, a Leading Generative AI Platform. MosaicML are the team behind MPT-7B and MPT-30B, two of the most impressive openly licensed LLMs. They just got acquired by Databricks for $1.3 billion dollars. # 30th June 2023, 1:43 am

abacaj/mpt-30B-inference. MPT-30B, released last week, is an extremely capable Apache 2 licensed open source language model. This repo shows how it can be run on a CPU, using the ctransformers Python library based on GGML. Following the instructions in the README got me a working MPT-30B model on my M2 MacBook Pro. The model is a 19GB download and it takes a few seconds to start spitting out tokens, but it works as advertised. # 29th June 2023, 3:27 am

Status of Python Versions (via) Very clear and useful page showing the exact status of different Python versions. 3.7 reaches end of life today (no more security updates), while 3.11 will continue to be supported until October 2027. # 27th June 2023, 2:01 pm

Building Search DSLs with Django (via) Neat tutorial by Dan Lamanna: how to build a GitHub-style search feature—supporting modifiers like “is:open author:danlamanna”—using PyParsing and the Django ORM. # 19th June 2023, 8:30 am

LLM 0.4. I released a major update to my LLM CLI tool today—version 0.4, which adds conversation mode and prompt templates so you can store and re-use interesting prompts, plus a whole bunch of other large and small improvements.

I also released 0.4.1 with some minor fixes and the ability to install the tool using Hombrew: brew install simonw/llm/llm # 17th June 2023, 10:58 pm

sqlean.py: Python’s sqlite3 with extensions. Anton Zhiyanov built a new Python package which bundles a fresh, compiled copy of SQLite with his SQLean family of C extensions built right in. Installing it gets you the latest SQLite—3.42.0—with nearly 200 additional functions, including things like define() and eval(), fileio_read() and fileio_write(), percentile_95() and uuid4() and many more. “import sqlean as sqlite3” works as a drop-in replacement for the module from the standard library. # 17th June 2023, 10:42 pm

When Zeppelins Ruled The Earth (via) 15 years ago I put together a talk about the history of Zeppelins which I presented a bunch of different times in various different configurations. As far as I know there are no existing videos of it, but I found an MP3 recording today and decided to splice it together with the slides to create a video of the 6m47s version I gave at the Skillswap on Speed lightning talks event in Brighton on the 28th October 2008.

Notes on how I edited the video together using iMovie in the via link. # 15th June 2023, 8:16 pm

Example of OpenAI function calling API to extract data from LAPD newsroom articles (via) Fascinating code example from Kyle McDonald. The OpenAI functions mechanism is intended to drive custom function calls, but I hadn’t quite appreciated how useful it can be ignoring the function calls entirely. Kyle instead uses it to define a schema for data he wants to extract from a news article, then uses the gpt-3.5-turbo-0613 to get back that exact set of extracted data as JSON. # 14th June 2023, 8:57 pm

Emergency Pod: OpenAI’s new Functions API, 75% Price Drop, 4x Context Length (via) I participated in a Twitter Spaces conversation last night about the new OpenAI functions mechanism. The recording has now been turned into a Latent Space podcast, and swyx has accompanied the recording with a detailed write-up of the different topics we covered. # 14th June 2023, 7:23 pm

Llama encoder and decoder. I forked my GPT tokenizer Observable notebook to create a similar tool for exploring the tokenization scheme used by the Llama family of LLMs, using the new llama-tokenizer-js JavaScript library. # 13th June 2023, 10:37 pm

OpenAI: Function calling and other API updates. Huge set of announcements from OpenAI today. A bunch of price reductions, but the things that most excite me are the new gpt-3.5-turbo-16k model which offers a 16,000 token context limit (4x the existing 3.5 turbo model) at a price of $0.003 per 1K input tokens and $0.004 per 1K output tokens—1/10th the price of GPT-4 8k.

The other big new feature: functions! You can now send JSON schema defining one or more functions to GPT 3.5 and GPT-4—those models will then return a blob of JSON describing a function they want you to call (if they determine that one should be called). Your code executes the function and passes the results back to the model to continue the execution flow.

This is effectively an implementation of the ReAct pattern, with models that have been fine-tuned to execute it.

They acknowledge the risk of prompt injection (though not by name) in the post: “We are working to mitigate these and other risks. Developers can protect their applications by only consuming information from trusted tools and by including user confirmation steps before performing actions with real-world impact, such as sending an email, posting online, or making a purchase.” # 13th June 2023, 5:34 pm

simpleaichat (via) Max Woolf released his own Python package for building against the GPT-3.5 and GPT-4 APIs (and potentially other LLMs in the future).

It’s a very clean piece of API design with some useful additional features: there’s an AsyncAIChat subclass that works with Python asyncio, and the library includes a mechanism for registering custom functions that can then be called by the LLM as tools.

One trick I haven’t seen before: it uses a combination of max_tokens: 1 and a ChatGPT logit_bias to ensure that answers to one of its default prompts are restricted to just numerals between 0 and 9. This is described in the PROMPTS.md file. # 8th June 2023, 9:06 pm

Examples of weird GPT-4 behavior for the string “ davidjl”. GPT-4, when told to repeat or otherwise process the string “ davidjl” (note the leading space character), treats it as “jndl” or “jspb” or “JDL” instead. It turns out “ davidjl” has its own single token in the tokenizer: token ID 23282, presumably dating back to the GPT-2 days.

Riley Goodside refers to these as “glitch tokens”.

This token might refer to Reddit user davidjl123 who ranks top of the league for the old /r/counting subreddit, with 163,477 posts there which presumably ended up in older training data. # 8th June 2023, 9:29 am

First Impressions of Vision Pro and VisionOS. John Gruber’s description of his thirty minute Vision Pro demo includes a bunch of details I haven’t seen described anywhere else, including how calibration and corrective lenses work and how precise and stable the overlays of additional information are. # 8th June 2023, 6:16 am

ChatGPT Plugins Don’t Have PMF. Sam Altman was recently quoted (in a since unpublished blog post) noting that ChatGPT plugins have not yet demonstrated product market fit.

This matches my own usage patterns: I use the “browse” and “code interpreter” modes on a daily basis, but I’ve not found any of the third party developer plugins to stick for me yet.

I like Matt Rickard’s observation here: “Chat is not the right UX for plugins. If you know what you want to do, it’s often easier to just do a few clicks on the website. If you don’t, just a chat interface makes it hard to steer the model toward your goal.” # 8th June 2023, 4:59 am

Logan Kilpatrick (OpenAI). “The API does not just change without us telling you. The models are static there.”

That’s the official line on the ongoing questions concerning whether OpenAI’s models have been degrading in quality over the last few weeks and months.

Worth noting that this mentions the API but doesn’t mention ChatGPT itself, which I suspect gets model updates a lot more frequently than the models served through the API. # 5th June 2023, 3:49 pm

pytest-icdiff (via) This is neat: “pip install pytest-icdiff” provides an instant usability upgrade to the output of failed tests in pytest, especially if the assertions involve comparing larger strings or nested JSON objects. # 3rd June 2023, 4:59 pm

Vector Search. Amjith Ramanujam provides a very thorough tutorial on implementing vector similarity search using SentenceTransformers embeddings (all-MiniLM-L6-v2) executed using sqlite-utils, then served via datasette-sqlite-vss and deployed using Fly. # 2nd June 2023, 5:02 am