Simon Willison's Webloghttp://simonwillison.net/2024-03-28T05:37:31+00:00Simon WillisonMerge pull request #1757 from simonw/heic-heif2024-03-28T05:37:31+00:002024-03-28T05:37:31+00:00https://simonwillison.net/2024/Mar/28/merge-pull-request/#atom-everything
<p><a href="https://github.com/gchq/CyberChef/commit/674c8c7c87eff167f03ee42c998c7fff18da4fa3">Merge pull request #1757 from simonw/heic-heif</a></p>
<p>I got a PR into GCHQ's CyberChef this morning! I added support for detecting heic/heif files to the Forensics -> Detect File Type tool.</p>
<p>The change was landed by the delightfully mysterious a3957273.</p>
Wrap text at specified width2024-03-28T03:36:01+00:002024-03-28T03:36:01+00:00https://simonwillison.net/2024/Mar/28/wrap-text-at-specified-width/#atom-everything
<p><a href="https://observablehq.com/@simonw/wrap-text-at-specified-width">Wrap text at specified width</a></p>
<p>New Observable notebook. I built this with the help of Claude 3 Opus - it's a text wrapping tool which lets you set the width and also lets you optionally add a four space indent.</p>
<p>The four space indent is handy for posting on forums such as Hacker News that treat a four space indent as a code block.</p>
llm-gemini 0.1a12024-03-28T03:32:15+00:002024-03-28T03:32:15+00:00https://simonwillison.net/2024/Mar/28/llm-gemini-01a1/#atom-everything
<p><a href="https://github.com/simonw/llm-gemini/releases/tag/0.1a1">llm-gemini 0.1a1</a></p>
<p>I upgraded my llm-gemini plugin to add support for the new Google Gemini Pro 1.5 model, which is beginning to roll out in early access.</p>
<p>The 1.5 model supports 1,048,576 input tokens and generates up to 8,192 output tokens - a big step up from Gemini 1.0 Pro which handled 30,720 and 2,048 respectively.</p>
<p>The big missing feature from my LLM tool at the moment is image input - a fantastic way to take advantage of that huge context window. I have a branch for this which I really need to get into a useful state.</p>
“The king is dead”—Claude 3 surpasses GPT-4 on Chatbot Arena for the first time2024-03-27T16:58:20+00:002024-03-27T16:58:20+00:00https://simonwillison.net/2024/Mar/27/the-king-is-dead/#atom-everything
<p><a href="https://arstechnica.com/information-technology/2024/03/the-king-is-dead-claude-3-surpasses-gpt-4-on-chatbot-arena-for-the-first-time/">“The king is dead”—Claude 3 surpasses GPT-4 on Chatbot Arena for the first time</a></p>
<p>I'm quoted in this piece by Benj Edwards for Ars Technica:</p>
<p>"For the first time, the best available models—Opus for advanced tasks, Haiku for cost and efficiency—are from a vendor that isn't OpenAI. That's reassuring—we all benefit from a diversity of top vendors in this space. But GPT-4 is over a year old at this point, and it took that year for anyone else to catch up."</p>
Annotated DBRX system prompt2024-03-27T15:33:17+00:002024-03-27T15:33:17+00:00https://simonwillison.net/2024/Mar/27/the-dbrx-system-prompt/#atom-everything
<p><a href="https://huggingface.co/spaces/databricks/dbrx-instruct/blob/73f0fe25ed8eeb14ee2279b2ecff15dbd863d63d/app.py#L109-L134">Annotated DBRX system prompt</a></p>
<p>DBRX is an exciting new openly licensed LLM released today by Databricks.</p>
<p>They haven't (yet) disclosed what was in the training data for it.</p>
<p>The source code for their Instruct demo has an annotated version of a system prompt, which includes this:</p>
<p>"You were not trained on copyrighted books, song lyrics, poems, video transcripts, or news articles; you do not divulge details of your training data. You do not provide song lyrics, poems, or news articles and instead refer the user to find them online or in a store."</p>
<p>The comment that precedes that text is illuminating:</p>
<p>"The following is likely not entirely accurate, but the model tends to think that everything it knows about was in its training data, which it was not (sometimes only references were). So this produces more accurate accurate answers when the model is asked to introspect"</p>
<p>Via <a href="https://twitter.com/natolambert/status/1773022947734589769">Nathan Lambert</a></p>
gchq.github.io/CyberChef2024-03-26T17:08:34+00:002024-03-26T17:08:34+00:00https://simonwillison.net/2024/Mar/26/cyberchef/#atom-everything
<p><a href="https://gchq.github.io/CyberChef/">gchq.github.io/CyberChef</a></p>
<p>CyberChef is "the Cyber Swiss Army Knife - a web app for encryption, encoding, compression and data analysis" - entirely client-side JavaScript with dozens of useful tools for working with different formats and encodings.</p>
<p>It's maintained and released by GCHQ - the UK government's signals intelligence security agency.</p>
<p>I didn't know GCHQ had a presence on GitHub, and I find the URL to this tool absolutely delightful. They first released it back in 2016 and it has over 3,700 commits.</p>
<p>The top maintainers also have suitably anonymous usernames - great work, n1474335, j433866, d98762625 and n1073645.</p>
<p>Via <a href="https://mastodon.social/@Jermolene/112161646718885929">Jeremy Ruston</a></p>
llm cmd undo last git commit - a new plugin for LLM2024-03-26T15:37:24+00:002024-03-26T15:37:24+00:00https://simonwillison.net/2024/Mar/26/llm-cmd/#atom-everything
<p>I just released a neat new plugin for my <a href="https://llm.datasette.io/">LLM</a> command-line tool: <a href="https://github.com/simonw/llm-cmd">llm-cmd</a>. It lets you run a command to to generate a further terminal command, review and edit that command, then hit <code><enter></code> to execute it or <code><ctrl-c></code> to cancel.</p>
<p>This is an alpha release. It's a <strong>very dangerous</strong> piece of software! Do not use this unless you are fluent in terminal and confident that you understand what it's doing for you and what could go wrong. I take no responsibility if you accidentally delete all of your files with this tool.</p>
<p>To try this out, you'll need my LLM tool installed:</p>
<div class="highlight highlight-source-shell"><pre>brew install llm <span class="pl-c"><span class="pl-c">#</span> 'pipx install llm' works too</span>
llm keys <span class="pl-c1">set</span> openai
<span class="pl-k"><</span>paste <span class="pl-k">in</span> your OpenAI API key<span class="pl-k">></span></pre></div>
<p>Now install the new plugin:</p>
<div class="highlight highlight-source-shell"><pre>llm install llm-cmd</pre></div>
<p>To run the new command, type <code>llm cmd </code> and then type what you want to do.</p>
<p>Here's an example of how to use it:</p>
<div class="highlight highlight-source-shell"><pre>llm cmd show the first three lines of every file <span class="pl-k">in</span> this directory</pre></div>
<p>I ran this just now and it gave me the following:</p>
<pre><code>head -n 3 *
</code></pre>
<p>Crucially, it will <em>not</em> excute that command directly. It pre-populates your terminal with the command, and you can edit it before hitting <code><enter></code> to run it (or cancel with <code><ctrl-c></code>).</p>
<p>Here's an animated GIF demo showing it in action:</p>
<p><img src="https://static.simonwillison.net/static/2024/llm-cmd.gif" style="max-width: 100%"
alt="
$ llm cmd show the first three lines of every file in this directory
head -n 3 *
Command failed with error: head: Error reading llm_cmd.egg-info
head: Error reading tests
==> LICENSE <==
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
==> Pipfile <==
[[source]]
url = "https://pypi.org/simple"
verify_ssl = true
==> README.md <==
# llm-cmd
[![PyPI](https://img.shields.io/pypi/v/llm-cmd.svg)](https://pypi.org/project/llm-cmd/)
==> llm_cmd.egg-info <==
==> llm_cmd.py <==
import click
import llm
import readline
==> pyproject.toml <==
[project]
name = "llm-cmd"
version = "0.1"
==> tests <==
$ llm cmd show filename and first three lines of every file here
find . -maxdepth 1 -type f -exec sh -c 'echo "{}" && head -n 3 "{}" && echo' \;
./LICENSE
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
./pyproject.toml
[project]
name = "llm-cmd"
version = "0.1"
./README.md
# llm-cmd
[![PyPI](https://img.shields.io/pypi/v/llm-cmd.svg)](https://pypi.org/project/llm-cmd/)
./Pipfile
[[source]]
url = "https://pypi.org/simple"
verify_ssl = true
./.gitignore
.venv
__pycache__/
*.py[cod]
./llm_cmd.py
import click
import llm
import readline
" /></p>
<p>It has a couple of options: you can add <code>-m gpt-4</code> to run against a different model (it defaults to <a href="https://llm.datasette.io/en/stable/setup.html#setting-a-custom-default-model">the default configured for LLM</a>). You can also pass <code>-s/--system</code> to set a different system prompt to change how it behaves.</p>
<p>The default system prompt took a little bit of iteration. Here's the instruction it passes to the model, along with your input:</p>
<blockquote>
<p>Return only the command to be executed as a raw string, no string delimiters wrapping it, no yapping, no markdown, no fenced code blocks, what you return will be passed to subprocess.check_output() directly.</p>
<p>For example, if the user asks: undo last git commit</p>
<p>You return only: git reset --soft HEAD~1</p>
</blockquote>
<p>I really like "no yapping" as a brief way to encourage a model not to include lengthy explanatory text!</p>
<p>So far I've tried this prompt against <code>gpt-3.5-turbo</code>, GPT-4, Claude 3 Opus and Claude 3 Haiku (using my <a href="https://github.com/simonw/llm-claude-3">llm-claude-3 plugin</a>) and it gave me the desired result for all four models. I'm sure there are many ways this could be improved, that likely vary depending on the model you're using. That's why this is an alpha!</p>
<h4 id="assistance-from-chatgpt">Built with assistance from ChatGPT/GPT-4</h4>
<p>The key feature that enables this plugin is the ability to populate the user's terminal with text that they can edit before they execute it.</p>
<p>I didn't know how to do this, so I <a href="https://chat.openai.com/share/56706749-965f-4efb-9527-7259bca8e9b2">asked GPT-4</a>:</p>
<blockquote>
<p>Write a Python function interactive_exec("ls -lah")</p>
<p>It makes that string available for the user to edit in the terminal - and when they hit enter it executes the command in their shell</p>
</blockquote>
<p>It didn't give me what I wanted at first (it suggested just using <code>input("Edit command: ")</code>), so I asked a follow-up:</p>
<blockquote>
<p>Show more options, I'm not sure that would let them directly edit the existing text</p>
</blockquote>
<p>And it spat this out, which was exactly what I wanted:</p>
<pre><span class="pl-k">import</span> <span class="pl-s1">readline</span>
<span class="pl-k">import</span> <span class="pl-s1">subprocess</span>
<span class="pl-k">def</span> <span class="pl-en">interactive_exec</span>(<span class="pl-s1">command</span>):
<span class="pl-c"># Set the initial text for the input</span>
<span class="pl-s1">readline</span>.<span class="pl-en">set_startup_hook</span>(<span class="pl-k">lambda</span>: <span class="pl-s1">readline</span>.<span class="pl-en">insert_text</span>(<span class="pl-s1">command</span>))
<span class="pl-k">try</span>:
<span class="pl-s1">edited_command</span> <span class="pl-c1">=</span> <span class="pl-en">input</span>(<span class="pl-s">"Edit command: "</span>)
<span class="pl-s1">output</span> <span class="pl-c1">=</span> <span class="pl-s1">subprocess</span>.<span class="pl-en">check_output</span>(<span class="pl-s1">edited_command</span>, <span class="pl-s1">shell</span><span class="pl-c1">=</span><span class="pl-c1">True</span>, <span class="pl-s1">stderr</span><span class="pl-c1">=</span><span class="pl-s1">subprocess</span>.<span class="pl-v">STDOUT</span>)
<span class="pl-en">print</span>(<span class="pl-s1">output</span>.<span class="pl-en">decode</span>())
<span class="pl-k">except</span> <span class="pl-s1">subprocess</span>.<span class="pl-v">CalledProcessError</span> <span class="pl-k">as</span> <span class="pl-s1">e</span>:
<span class="pl-en">print</span>(<span class="pl-s">f"Command failed with error: <span class="pl-s1"><span class="pl-kos">{</span><span class="pl-s1">e</span>.<span class="pl-s1">output</span>.<span class="pl-en">decode</span>()<span class="pl-kos">}</span></span>"</span>)
<span class="pl-k">finally</span>:
<span class="pl-c"># Remove the startup hook to avoid affecting future inputs</span>
<span class="pl-s1">readline</span>.<span class="pl-en">set_startup_hook</span>(<span class="pl-c1">None</span>)
<span class="pl-c"># Example usage</span>
<span class="pl-en">interactive_exec</span>(<span class="pl-s">"ls -lah"</span>)</pre>
<p>I tracked down the documentation for <a href="https://docs.python.org/3/library/readline.html#readline.set_startup_hook">readline.set_startup_hook()</a> to make sure I understood what it was doing. It turned out to be the exact thing I needed to know.</p>
<h4 id="llm-cmd-undo-last-git-commit">llm cmd undo last git commit</h4>
<p>This is my favourite example, because I've been using Git for 15+ years and I still can't ever remember the exact command for this.</p>
<p>Here's what <code>llm cmd</code> spits out:</p>
<div class="highlight highlight-source-shell"><pre>$ llm cmd undo last git commit
git reset --soft HEAD~1</pre></div>
<p>It should <em>always</em> get this one right, because it's the example I provided in the system prompt!</p>
GGML GGUF File Format Vulnerabilities2024-03-26T06:47:17+00:002024-03-26T06:47:17+00:00https://simonwillison.net/2024/Mar/26/ggml-gguf-file-format-vulnerabilities/#atom-everything
<p><a href="https://www.databricks.com/blog/ggml-gguf-file-format-vulnerabilities">GGML GGUF File Format Vulnerabilities</a></p>
<p>The GGML and GGUF formats are used by llama.cpp to package and distribute model weights.</p>
<p>Neil Archibald: "The GGML library performs insufficient validation on the input file and, therefore, contains a selection of potentially exploitable memory corruption vulnerabilities during parsing."</p>
<p>These vulnerabilities were shared with the library authors on 23rd January and patches landed on the 29th.</p>
<p>If you have a llama.cpp or llama-cpp-python installation that's more than a month old you should upgrade ASAP.</p>
Cohere int8 & binary Embeddings - Scale Your Vector Database to Large Datasets2024-03-26T06:19:30+00:002024-03-26T06:19:30+00:00https://simonwillison.net/2024/Mar/26/cohere-int8-binary-embeddings/#atom-everything
<p><a href="https://txt.cohere.com/int8-binary-embeddings/">Cohere int8 & binary Embeddings - Scale Your Vector Database to Large Datasets</a></p>
<p>Jo Kristian Bergum told me "The accuracy retention [of binary embedding vectors] is sensitive to whether the model has been using this binarization as part of the loss function."</p>
<p>Cohere provide an API for embeddings, and last week added support for returning binary vectors specifically tuned in this way.</p>
<p>250M embeddings (Cohere provide a downloadable dataset of 250M embedded documents from Wikipedia) at float32 (4 bytes) is 954GB.</p>
<p>Cohere claim that reducing to 1 bit per dimension knocks that down to 30 GB (954/32) while keeping "90-98% of the original search quality".</p>
<p>Via <a href="https://twitter.com/jobergum/status/1772507515076415803">@jobergum</a></p>
My binary vector search is better than your FP32 vectors2024-03-26T04:56:25+00:002024-03-26T04:56:25+00:00https://simonwillison.net/2024/Mar/26/binary-vector-search/#atom-everything
<p><a href="https://blog.pgvecto.rs/my-binary-vector-search-is-better-than-your-fp32-vectors">My binary vector search is better than your FP32 vectors</a></p>
<p>I'm still trying to get my head around this, but here's what I understand so far.</p>
<p>Embedding vectors as calculated by models such as OpenAI text-embedding-3-small are arrays of floating point values, which look something like this:</p>
<p>[0.0051681744, 0.017187592, -0.018685209, -0.01855924, -0.04725188...] - 1356 elements long</p>
<p>Different embedding models have different lengths, but they tend to be hundreds up to low thousands of numbers. If each float is 32 bits that's 4 bytes per float, which can add up to a lot of memory if you have millions of embedding vectors to compare.</p>
<p>If you look at those numbers you'll note that they are all pretty small positive or negative numbers, close to 0.</p>
<p>Binary vector search is a trick where you take that sequence of floating point numbers and turn it into a binary vector - just a list of 1s and 0s, where you store a 1 if the corresponding float was greater than 0 and a 0 otherwise.</p>
<p>For the above example, this would start [1, 1, 0, 0, 0...]</p>
<p>Incredibly, it looks like the cosine distance between these 0 and 1 vectors captures much of the semantic relevant meaning present in the distance between the much more accurate vectors. This means you can use 1/32nd of the space and still get useful results!</p>
<p>Ce Gao here suggests a further optimization: use the binary vectors for a fast brute-force lookup of the top 200 matches, then run a more expensive re-ranking against those filtered values using the full floating point vectors.</p>
Semgrep: AutoFixes using LLMs2024-03-26T00:51:37+00:002024-03-26T00:51:37+00:00https://simonwillison.net/2024/Mar/26/semgrep-autofixes-using-llms/#atom-everything
<p><a href="https://choly.ca/post/semgrep-autofix-llm/">Semgrep: AutoFixes using LLMs</a></p>
<p>semgrep is a really neat tool for semantic grep against source code - you can give it a pattern like "log.$A(...)" to match all forms of log.warning(...) / log.error(...) etc.</p>
<p>Ilia Choly built semgrepx - xargs for semgrep - and here shows how it can be used along with my llm CLI tool to execute code replacements against matches by passing them through an LLM such as Claude 3 Opus.</p>
<p>Via <a href="https://lobste.rs/s/qtokfw/semgrep_autofixes_using_llms">lobste.rs</a></p>
Quoting Seth Rosen2024-03-25T23:33:26+00:002024-03-25T23:33:26+00:00https://simonwillison.net/2024/Mar/25/seth-rosen/#atom-everything
<blockquote cite="https://twitter.com/sethrosen/status/1252291581320757249"><p>Them: Can you just quickly pull this data for me?<br><br>Me: Sure, let me just: <br><br>SELECT * FROM some_ideal_clean_and_pristine.table_that_you_think_exists</p></blockquote><p class="cite">— <a href="https://twitter.com/sethrosen/status/1252291581320757249">Seth Rosen</a>
sqlite-schema-diagram.sql2024-03-25T05:12:47+00:002024-03-25T05:12:47+00:00https://simonwillison.net/2024/Mar/25/sqlite-schema-diagramsql/#atom-everything
<p><a href="https://gitlab.com/Screwtapello/sqlite-schema-diagram/-/blob/main/sqlite-schema-diagram.sql">sqlite-schema-diagram.sql</a></p>
<p>A SQLite SQL query that directly returns a GraphViz definition that renders a diagram of the database schema, by Tim Allen.</p>
<p>The SQL is beautifully commented. It works as a big set of UNION ALL statements against queries that join data from pragma_table_list(), pragma_table_info() and pragma_foreign_key_list().</p>
<p>Via <a href="https://news.ycombinator.com/item?id=39798115">Hacker News</a></p>
Reviving PyMiniRacer2024-03-24T17:00:55+00:002024-03-24T17:00:55+00:00https://simonwillison.net/2024/Mar/24/reviving-pyminiracer/#atom-everything
<p><a href="https://bpcreech.com/post/mini-racer/">Reviving PyMiniRacer</a></p>
<p>PyMiniRacer is "a V8 bridge in Python" - it's a library that lets Python code execute JavaScript code in a V8 isolate and pass values back and forth (provided they serialize to JSON) between the two environments.</p>
<p>It was originally released in 2016 by Sqreen, a web app security startup startup. They were acquired by Datadog in 2021 and the project lost its corporate sponsor, but in this post Ben Creech announces that he is revitalizing the project, with the approval of the original maintainers.</p>
<p>I'm always interested in new options for running untrusted code in a safe sandbox. PyMiniRacer has the three features I care most about: code can't access the filesystem or network by default, you can limit the RAM available to it and you can have it raise an error if code execution exceeds a time limit.</p>
<p>The documentation includes a newly written architecture overview which is well worth a read. Rather than embed V8 directly in Python the authors chose to use ctypes - they build their own V8 with a thin additional C++ layer to expose a ctypes-friendly API, then the Python library code uses ctypes to call that.</p>
<p>I really like this. V8 is a notoriously fast moving and complex dependency, so reducing the interface to just a thin C++ wrapper via ctypes feels very sensible to me.</p>
<p>This blog post is fun too: it's a good, detailed description of the process to update something like this to use modern Python and modern CI practices. The steps taken to build V8 (6.6 GB of miscellaneous source and assets!) across multiple architectures in order to create binary wheels are particularly impressive - the Linux aarch64 build takes several days to run on GitHub Actions runners (via emulation), so they use Mozilla's Sccache to cache compilation steps so they can retry until it finally finishes.</p>
<p>On macOS (Apple Silicon) installing the package with "pip install mini-racer" got me a 37MB dylib and a 17KB ctypes wrapper module.</p>
<p>Via <a href="https://news.ycombinator.com/item?id=39754885">Hacker News</a></p>
shelmet2024-03-24T04:37:52+00:002024-03-24T04:37:52+00:00https://simonwillison.net/2024/Mar/24/shelmet/#atom-everything
<p><a href="https://shelmet.readthedocs.io/en/latest/">shelmet</a></p>
<p>This looks like a pleasant ergonomic alternative to Python's subprocess module, plus a whole bunch of other useful utilities. Lets you do things like this:</p>
<p>sh.cmd("ps", "aux").pipe("grep", "-i", check=False).run("search term")</p>
<p>I like the way it uses context managers as well: 'with sh.environ({"KEY1": "val1"})' sets new environment variables for the duration of the block, 'with sh.cd("path/to/dir")' temporarily changes the working directory and 'with sh.atomicfile("file.txt") as fp' lets you write to a temporary file that will be atomically renamed when the block finishes.</p>
<p>Via <a href="https://micro.webology.dev/2024/03/23/on-scratching-itches.html">Jeff Triplett</a></p>
Strachey love letter algorithm2024-03-23T21:55:59+00:002024-03-23T21:55:59+00:00https://simonwillison.net/2024/Mar/23/strachey-love-letter-algorithm/#atom-everything
<p><a href="https://en.wikipedia.org/wiki/Strachey_love_letter_algorithm">Strachey love letter algorithm</a></p>
<p>This is a beautiful piece of computer history. In 1952, Christopher Strachey - a contemporary of Alan Turing - wrote a love letter generation program for a Manchester Mark 1 computer. It produced output like this:</p>
<p>"Darling Sweetheart,</p>
<p>You are my avid fellow feeling. My affection curiously clings to your passionate wish. My liking yearns for your heart. You are my wistful sympathy: my tender liking.</p>
<p>Yours beautifully</p>
<p>M. U. C."</p>
<p>The algorithm simply combined a small set of predefined sentence structures, filled in with random adjectives.</p>
<p>Wikipedia notes that "Strachey wrote about his interest in how “a rather simple trick” can produce an illusion that the computer is thinking, and that “these tricks can lead to quite unexpected and interesting results”.</p>
<p>LLMs, 1952 edition!</p>
<p>Via <a href="https://twitter.com/grady_booch/status/1771625974322356260">Grady Booch</a></p>
time-machine example test for a segfault in Python2024-03-23T19:44:07+00:002024-03-23T19:44:07+00:00https://simonwillison.net/2024/Mar/23/test-segfault-in-python/#atom-everything
<p><a href="https://github.com/adamchainz/time-machine/pull/433/files#diff-92ea7165ddf0128246b9758ee9554b3eccb4eceb3d4719bdea9f5495ebbe10a1R477-R495">time-machine example test for a segfault in Python</a></p>
<p>Here's a really neat testing trick by Adam Johnson. Someone reported a segfault bug in his time-machine library. How you you write a unit test that exercises a segfault without crashing the entire test suite?</p>
<p>Adam's solution is a test that does this:</p>
<p>subprocess.run([sys.executable, "-c", code_that_crashes_python], check=True)</p>
<p>sys.executable is the path to the current Python executable - ensuring the code will run in the same virtual environment as the test suite itself. The -c option can be used to have it run a (multi-line) string of Python code, and check=True causes the subprocess.run() function to raise an error if the subprocess fails to execute cleanly and returns an error code.</p>
<p>I'm absolutely going to be borrowing this pattern next time I need to add tests to cover a crashing bug in one of my projects.</p>
<p>Via <a href="https://fosstodon.org/@adamchainz/112144774490159195">@adamchainz</a></p>
Building and testing C extensions for SQLite with ChatGPT Code Interpreter2024-03-23T17:50:30+00:002024-03-23T17:50:30+00:00https://simonwillison.net/2024/Mar/23/building-c-extensions-for-sqlite-with-chatgpt-code-interpreter/#atom-everything
<p>I wrote yesterday about how I used <a href="https://simonwillison.net/2024/Mar/22/claude-and-chatgpt-case-study/">Claude and ChatGPT Code Interpreter for simple ad-hoc side quests</a> - in that case, for converting a shapefile to GeoJSON and merging it into a single polygon.</p>
<p>Today I have a much more ambitious example.</p>
<p>I was thinking this morning about vector similarity, and how I really like the pattern of storing encoded floating point vectors in BLOB columns in a SQLite database table and then using a custom SQL function to decode them and calculate cosine similarity between them.</p>
<p>I've written code for this a few times in Python, with Python functions that get registered with SQLite as custom SQL functions. Here's <a href="https://github.com/simonw/llm/blob/fb63c92cd27053700daa5420a0d1ad8fdfb718bd/llm/embeddings.py#L240-L287">an example</a> from my <a href="https://llm.datasette.io/">LLM</a> tool.</p>
<p>What I'd really like is a SQLite C extension that does this faster - avoiding the overhead of making function calls from SQLite back to Python.</p>
<p>Then I remembered that <a href="https://simonwillison.net/tags/codeinterpreter/">ChatGPT Code Interpreter</a> has Python, SQLite and access to <code>gcc</code>. Could I get it to build and test that C extension for me, entirely within its own environment?</p>
<p>It turns out that works!</p>
<h4 id="get-chatgpt-in-the-mood">Absurdly, the first step is getting ChatGPT in the right "mood"</h4>
<p>One of the infuriating things about working with ChatGPT Code Interpreter is that it often denies abilities that you know it has.</p>
<p>I've found it to be quite resistant to compiling C code in the past. Here's a prompting sequence trick that usually works for me:</p>
<blockquote>
<p>Use your code interpreter tool to show me the version of your Python and SQLite</p>
</blockquote>
<p>It generated and ran this code:</p>
<pre style="white-space: pre-wrap"><span class="pl-k">import</span> <span class="pl-s1">sqlite3</span>
<span class="pl-k">import</span> <span class="pl-s1">sys</span>
<span class="pl-s1">python_version</span> <span class="pl-c1">=</span> <span class="pl-s1">sys</span>.<span class="pl-s1">version</span>
<span class="pl-s1">sqlite_version</span> <span class="pl-c1">=</span> <span class="pl-s1">sqlite3</span>.<span class="pl-s1">sqlite_version</span>
<span class="pl-s1">python_version</span>, <span class="pl-s1">sqlite_version</span></pre>
<p>Which output:</p>
<pre style="white-space: pre-wrap"><code>('3.11.8 (main, Mar 12 2024, 11:41:52) [GCC 12.2.0]', '3.40.1')
</code></pre>
<p>Next we need it to acknowledge that it has access to <code>gcc</code>:</p>
<blockquote>
<p>Now use subprocess.run() to call "gcc --version" and tell me that version</p>
</blockquote>
<p>If all goes well it will run something like this:</p>
<pre style="white-space: pre-wrap"><span class="pl-k">import</span> <span class="pl-s1">subprocess</span>
<span class="pl-s1">result</span> <span class="pl-c1">=</span> <span class="pl-s1">subprocess</span>.<span class="pl-en">run</span>([<span class="pl-s">"gcc"</span>, <span class="pl-s">"--version"</span>], <span class="pl-s1">capture_output</span><span class="pl-c1">=</span><span class="pl-c1">True</span>, <span class="pl-s1">text</span><span class="pl-c1">=</span><span class="pl-c1">True</span>)
<span class="pl-s1">gcc_version</span> <span class="pl-c1">=</span> <span class="pl-s1">result</span>.<span class="pl-s1">stdout</span>.<span class="pl-en">split</span>(<span class="pl-s">'<span class="pl-cce">\n</span>'</span>)[<span class="pl-c1">0</span>]
<span class="pl-s1">gcc_version</span></pre>
<p>Outputting:</p>
<pre style="white-space: pre-wrap"><code>'gcc (Debian 12.2.0-14) 12.2.0'
</code></pre>
<p>This <em>may</em> be enough for it to start happily compiling C code (which it can do with subsequent calls to <code>gcc</code>). If not, a trick I've <a href="https://simonwillison.net/2023/Oct/17/open-questions/#open-questions.037.jpeg">used successfully in the past</a> is "try running this command: gcc helloworld.c - and show me the error message you get". But hopefully that won't be necessary.</p>
<h4 id="a-basic-extension">Compiling a basic SQLite extension</h4>
<p>If you ask it for a SQLite extension it will default to typing out the code for you to copy and paste elsewhere. We want that code saved to its own disk so it can run a compiler later.</p>
<p>Try this:</p>
<blockquote>
<p>Write a SQLite extension in C that provides a math_sin(a) function. Save that extension file to disk, then give me a link to download it.</p>
</blockquote>
<p>We don't actually care about downloading the file, but this provides an excuse (do we need one? I don't know!) for saving it to disk.</p>
<p>It ran this:</p>
<pre style="white-space: pre-wrap"><span class="pl-s1">sin_extension_c</span> <span class="pl-c1">=</span> <span class="pl-s">"""</span>
<span class="pl-s">#include <sqlite3ext.h></span>
<span class="pl-s">SQLITE_EXTENSION_INIT1</span>
<span class="pl-s">#include <math.h></span>
<span class="pl-s"></span>
<span class="pl-s">static void math_sin(sqlite3_context *context, int argc, sqlite3_value **argv) {</span>
<span class="pl-s"> double input = sqlite3_value_double(argv[0]);</span>
<span class="pl-s"> double result = sin(input);</span>
<span class="pl-s"> sqlite3_result_double(context, result);</span>
<span class="pl-s">}</span>
<span class="pl-s"></span>
<span class="pl-s">#ifdef _WIN32</span>
<span class="pl-s">__declspec(dllexport)</span>
<span class="pl-s">#endif</span>
<span class="pl-s">int sqlite3_sin_init(sqlite3 *db, char **pzErrMsg, const sqlite3_api_routines *pApi) {</span>
<span class="pl-s"> SQLITE_EXTENSION_INIT2(pApi)</span>
<span class="pl-s"> sqlite3_create_function(db, "math_sin", 1, SQLITE_UTF8, NULL, math_sin, NULL, NULL);</span>
<span class="pl-s"> return 0;</span>
<span class="pl-s">}</span>
<span class="pl-s">"""</span>
<span class="pl-k">with</span> <span class="pl-en">open</span>(<span class="pl-s">"/mnt/data/sin_extension.c"</span>, <span class="pl-s">"w"</span>) <span class="pl-k">as</span> <span class="pl-s1">file</span>:
<span class="pl-s1">file</span>.<span class="pl-en">write</span>(<span class="pl-s1">sin_extension_c</span>)
<span class="pl-s">"/mnt/data/sin_extension.c"</span></pre>
<p>And gave me a link to download the file.</p>
<p>Next, I prompted:</p>
<blockquote>
<p>Now use subprocess.run to compile it with gcc, like how you showed me the gcc version earlier</p>
</blockquote>
<p>I'm trying to avoid it claiming it can't compile code by referencing back to the earlier successful invocation of <code>gcc --version</code>.</p>
<p>When I tried this just now it worked without complaining... but found a compilation bug in the C code:</p>
<pre style="white-space: pre-wrap"><span class="pl-s1">compile_command</span> <span class="pl-c1">=</span> [<span class="pl-s">"gcc"</span>, <span class="pl-s">"-fPIC"</span>, <span class="pl-s">"-shared"</span>, <span class="pl-s">"-o"</span>, <span class="pl-s">"/mnt/data/sin_extension.so"</span>, <span class="pl-s">"/mnt/data/sin_extension.c"</span>]
<span class="pl-s1">result</span> <span class="pl-c1">=</span> <span class="pl-s1">subprocess</span>.<span class="pl-en">run</span>(<span class="pl-s1">compile_command</span>, <span class="pl-s1">capture_output</span><span class="pl-c1">=</span><span class="pl-c1">True</span>, <span class="pl-s1">text</span><span class="pl-c1">=</span><span class="pl-c1">True</span>)
<span class="pl-s1">result</span>.<span class="pl-s1">stdout</span>, <span class="pl-s1">result</span>.<span class="pl-s1">stderr</span></pre>
<p>Output:</p>
<pre style="white-space: pre-wrap"><code>(
'',
'/mnt/data/sin_extension.c: In function ‘sqlite3_sin_init’:\n/mnt/data/sin_extension.c:17:61: error: ‘NULL’ undeclared (first use in this function)\n 17 | sqlite3_create_function(db, "math_sin", 1, SQLITE_UTF8, NULL, math_sin, NULL, NULL);\n | ^~~~\n/mnt/data/sin_extension.c:5:1: note: ‘NULL’ is defined in header ‘<stddef.h>’; did you forget to ‘#include <stddef.h>’?\n 4 | #include <math.h>\n +++ |+#include <stddef.h>\n 5 | \n/mnt/data/sin_extension.c:17:61: note: each undeclared identifier is reported only once for each function it appears in\n 17 | sqlite3_create_function(db, "math_sin", 1, SQLITE_UTF8, NULL, math_sin, NULL, NULL);\n | ^~~~\n')
)
</code></pre>
<p>ChatGPT added:</p>
<blockquote>
<p>It looks like there was an error during the compilation process. The error message suggests that <code>NULL</code> is undeclared because the <code><stddef.h></code> header file, which defines <code>NULL</code>, was not included.</p>
<p>I'll update the C code to include this header and attempt the compilation again.</p>
</blockquote>
<p>It said it was going to update the C code but didn't actually do that at all - it just stopped. So I said:</p>
<blockquote>
<p>Do that</p>
</blockquote>
<p>And it did! Here's the code it successfully ran:</p>
<pre style="white-space: pre-wrap"><span class="pl-c"># Update the C code to include <stddef.h></span>
<span class="pl-s1">sin_extension_c_updated</span> <span class="pl-c1">=</span> <span class="pl-s">"""</span>
<span class="pl-s">#include <sqlite3ext.h></span>
<span class="pl-s">SQLITE_EXTENSION_INIT1</span>
<span class="pl-s">#include <math.h></span>
<span class="pl-s">#include <stddef.h></span>
<span class="pl-s"></span>
<span class="pl-s">static void math_sin(sqlite3_context *context, int argc, sqlite3_value **argv) {</span>
<span class="pl-s"> double input = sqlite3_value_double(argv[0]);</span>
<span class="pl-s"> double result = sin(input);</span>
<span class="pl-s"> sqlite3_result_double(context, result);</span>
<span class="pl-s">}</span>
<span class="pl-s"></span>
<span class="pl-s">#ifdef _WIN32</span>
<span class="pl-s">__declspec(dllexport)</span>
<span class="pl-s">#endif</span>
<span class="pl-s">int sqlite3_sin_init(sqlite3 *db, char **pzErrMsg, const sqlite3_api_routines *pApi) {</span>
<span class="pl-s"> SQLITE_EXTENSION_INIT2(pApi)</span>
<span class="pl-s"> sqlite3_create_function(db, "math_sin", 1, SQLITE_UTF8, NULL, math_sin, NULL, NULL);</span>
<span class="pl-s"> return 0;</span>
<span class="pl-s">}</span>
<span class="pl-s">"""</span>
<span class="pl-k">with</span> <span class="pl-en">open</span>(<span class="pl-s">"/mnt/data/sin_extension.c"</span>, <span class="pl-s">"w"</span>) <span class="pl-k">as</span> <span class="pl-s1">file</span>:
<span class="pl-s1">file</span>.<span class="pl-en">write</span>(<span class="pl-s1">sin_extension_c_updated</span>)
<span class="pl-c"># Compile the updated C code</span>
<span class="pl-s1">compile_command_updated</span> <span class="pl-c1">=</span> [<span class="pl-s">"gcc"</span>, <span class="pl-s">"-fPIC"</span>, <span class="pl-s">"-shared"</span>, <span class="pl-s">"-o"</span>, <span class="pl-s">"/mnt/data/sin_extension.so"</span>, <span class="pl-s">"/mnt/data/sin_extension.c"</span>]
<span class="pl-s1">result_updated</span> <span class="pl-c1">=</span> <span class="pl-s1">subprocess</span>.<span class="pl-en">run</span>(<span class="pl-s1">compile_command_updated</span>, <span class="pl-s1">capture_output</span><span class="pl-c1">=</span><span class="pl-c1">True</span>, <span class="pl-s1">text</span><span class="pl-c1">=</span><span class="pl-c1">True</span>)
<span class="pl-s1">result_updated</span>.<span class="pl-s1">stdout</span>, <span class="pl-s1">result_updated</span>.<span class="pl-s1">stderr</span></pre>
<p>Both <code>stdout</code> and <code>stderre</code> were blank, indicating a successfully compiled SQLite extension!</p>
<h4 id="trying-it-in-python">Trying it out in Python</h4>
<p>With the SQLite extension compiled the next step is to see if it actually works. I said:</p>
<blockquote>
<p>Now load that extension into SQLite in Python and try it out</p>
</blockquote>
<p>It wrote and then ran the following:</p>
<pre style="white-space: pre-wrap"><span class="pl-k">import</span> <span class="pl-s1">sqlite3</span>
<span class="pl-c"># Connect to an in-memory SQLite database</span>
<span class="pl-s1">conn</span> <span class="pl-c1">=</span> <span class="pl-s1">sqlite3</span>.<span class="pl-en">connect</span>(<span class="pl-s">":memory:"</span>)
<span class="pl-s1">conn</span>.<span class="pl-en">enable_load_extension</span>(<span class="pl-c1">True</span>)
<span class="pl-c"># Load the compiled extension</span>
<span class="pl-s1">conn</span>.<span class="pl-en">load_extension</span>(<span class="pl-s">"/mnt/data/sin_extension.so"</span>)
<span class="pl-c"># Test the math_sin function</span>
<span class="pl-s1">cursor</span> <span class="pl-c1">=</span> <span class="pl-s1">conn</span>.<span class="pl-en">cursor</span>()
<span class="pl-s1">cursor</span>.<span class="pl-en">execute</span>(<span class="pl-s">"SELECT math_sin(1.57079632679)"</span>) <span class="pl-c"># sin(pi/2) should be close to 1</span>
<span class="pl-s1">result</span> <span class="pl-c1">=</span> <span class="pl-s1">cursor</span>.<span class="pl-en">fetchone</span>()
<span class="pl-s1">result</span>[<span class="pl-c1">0</span>]</pre>
<p>... and it got an error:</p>
<pre style="white-space: pre-wrap"><code>---------------------------------------------------------------------------
OperationalError Traceback (most recent call last)
Cell In[6], line 8
5 conn.enable_load_extension(True)
7 # Load the compiled extension
----> 8 conn.load_extension("/mnt/data/sin_extension.so")
10 # Test the math_sin function
11 cursor = conn.cursor()
OperationalError: /mnt/data/sin_extension.so: undefined symbol: sqlite3_sinextension_init
</code></pre>
<p>Here's where the real magic of Code Interpreter kicked in. Without any further interaction from me, it said the following:</p>
<blockquote>
<p>It looks like there was an error loading the extension into SQLite. The error message indicates that the symbol <code>sqlite3_sinextension_init</code> is undefined. This is likely because the entry point function for the extension is not named correctly.</p>
<p>In the C code, the entry point function is named <code>sqlite3_sin_init</code>. However, SQLite expects the entry point function to be named <code>sqlite3_extension_init</code> or <code>sqlite3_<extension_name>_init</code>. In this case, the expected name would be <code>sqlite3_sinextension_init</code>.</p>
<p>I'll update the C code to rename the entry point function to <code>sqlite3_sinextension_init</code> and recompile the extension.
</p>
</blockquote>
<pre style="white-space: pre-wrap"><span class="pl-c"># Update the C code to rename the entry point function</span>
<span class="pl-s1">sin_extension_c_final</span> <span class="pl-c1">=</span> <span class="pl-s">"""</span>
<span class="pl-s">#include <sqlite3ext.h></span>
<span class="pl-s">SQLITE_EXTENSION_INIT1</span>
<span class="pl-s">#include <math.h></span>
<span class="pl-s">#include <stddef.h></span>
<span class="pl-s"></span>
<span class="pl-s">static void math_sin(sqlite3_context *context, int argc, sqlite3_value **argv) {</span>
<span class="pl-s"> double input = sqlite3_value_double(argv[0]);</span>
<span class="pl-s"> double result = sin(input);</span>
<span class="pl-s"> sqlite3_result_double(context, result);</span>
<span class="pl-s">}</span>
<span class="pl-s"></span>
<span class="pl-s">#ifdef _WIN32</span>
<span class="pl-s">__declspec(dllexport)</span>
<span class="pl-s">#endif</span>
<span class="pl-s">int sqlite3_sinextension_init(sqlite3 *db, char **pzErrMsg, const sqlite3_api_routines *pApi) {</span>
<span class="pl-s"> SQLITE_EXTENSION_INIT2(pApi)</span>
<span class="pl-s"> sqlite3_create_function(db, "math_sin", 1, SQLITE_UTF8, NULL, math_sin, NULL, NULL);</span>
<span class="pl-s"> return 0;</span>
<span class="pl-s">}</span>
<span class="pl-s">"""</span>
<span class="pl-k">with</span> <span class="pl-en">open</span>(<span class="pl-s">"/mnt/data/sin_extension.c"</span>, <span class="pl-s">"w"</span>) <span class="pl-k">as</span> <span class="pl-s1">file</span>:
<span class="pl-s1">file</span>.<span class="pl-en">write</span>(<span class="pl-s1">sin_extension_c_final</span>)
<span class="pl-c"># Recompile the updated C code</span>
<span class="pl-s1">compile_command_final</span> <span class="pl-c1">=</span> [<span class="pl-s">"gcc"</span>, <span class="pl-s">"-fPIC"</span>, <span class="pl-s">"-shared"</span>, <span class="pl-s">"-o"</span>, <span class="pl-s">"/mnt/data/sin_extension.so"</span>, <span class="pl-s">"/mnt/data/sin_extension.c"</span>]
<span class="pl-s1">result_final</span> <span class="pl-c1">=</span> <span class="pl-s1">subprocess</span>.<span class="pl-en">run</span>(<span class="pl-s1">compile_command_final</span>, <span class="pl-s1">capture_output</span><span class="pl-c1">=</span><span class="pl-c1">True</span>, <span class="pl-s1">text</span><span class="pl-c1">=</span><span class="pl-c1">True</span>)
<span class="pl-s1">result_final</span>.<span class="pl-s1">stdout</span>, <span class="pl-s1">result_final</span>.<span class="pl-s1">stderr</span></pre>
<p>The compilation succeeded... so it said:</p>
<blockquote>
<p>The updated SQLite extension has been successfully recompiled with the correct entry point function name. Let's try loading the extension into SQLite again and test the <code>math_sin</code> function. </p>
</blockquote>
<p>Then it ran this:</p>
<pre style="white-space: pre-wrap"><span class="pl-c"># Reconnect to the in-memory SQLite database</span>
<span class="pl-s1">conn</span>.<span class="pl-en">close</span>()
<span class="pl-s1">conn</span> <span class="pl-c1">=</span> <span class="pl-s1">sqlite3</span>.<span class="pl-en">connect</span>(<span class="pl-s">":memory:"</span>)
<span class="pl-s1">conn</span>.<span class="pl-en">enable_load_extension</span>(<span class="pl-c1">True</span>)
<span class="pl-c"># Load the recompiled extension</span>
<span class="pl-s1">conn</span>.<span class="pl-en">load_extension</span>(<span class="pl-s">"/mnt/data/sin_extension.so"</span>)
<span class="pl-c"># Test the math_sin function again</span>
<span class="pl-s1">cursor</span> <span class="pl-c1">=</span> <span class="pl-s1">conn</span>.<span class="pl-en">cursor</span>()
<span class="pl-s1">cursor</span>.<span class="pl-en">execute</span>(<span class="pl-s">"SELECT math_sin(1.57079632679)"</span>) <span class="pl-c"># sin(pi/2) should be close to 1</span>
<span class="pl-s1">result</span> <span class="pl-c1">=</span> <span class="pl-s1">cursor</span>.<span class="pl-en">fetchone</span>()
<span class="pl-s1">result</span>[<span class="pl-c1">0</span>]</pre>
<p>And it worked! The output was:</p>
<pre style="white-space: pre-wrap"><code>1.0
</code></pre>
<p>ChatGPT has now written a working SQLite extension in C, compiled it, loaded it into Python and tested it for us!</p>
<p>Here's <a href="https://chat.openai.com/share/df179c23-3fce-4568-ba85-fbabef34e2e1">the full transcript</a> of this ChatGPT session.</p>
<h4 id="more-ambitious">Something more ambitious: vector comparison functions</h4>
<p>My actual goal here was more ambitious: I wanted a SQL function that could calculate cosine similarity between two vectors stored as BLOBs in SQLite.</p>
<p>I won't provide a blow-by-blow account of how I got there, but I started with this prompt:</p>
<blockquote>
<pre style="white-space: pre-wrap"><span class="pl-k">def</span> <span class="pl-en">encode</span>(<span class="pl-s1">values</span>):
<span class="pl-k">return</span> <span class="pl-s1">struct</span>.<span class="pl-en">pack</span>(<span class="pl-s">"<"</span> <span class="pl-c1">+</span> <span class="pl-s">"f"</span> <span class="pl-c1">*</span> <span class="pl-en">len</span>(<span class="pl-s1">values</span>), <span class="pl-c1">*</span><span class="pl-s1">values</span>)
<span class="pl-k">def</span> <span class="pl-en">decode</span>(<span class="pl-s1">binary</span>):
<span class="pl-k">return</span> <span class="pl-s1">struct</span>.<span class="pl-en">unpack</span>(<span class="pl-s">"<"</span> <span class="pl-c1">+</span> <span class="pl-s">"f"</span> <span class="pl-c1">*</span> (<span class="pl-en">len</span>(<span class="pl-s1">binary</span>) <span class="pl-c1">//</span> <span class="pl-c1">4</span>), <span class="pl-s1">binary</span>)
<span class="pl-k">def</span> <span class="pl-en">cosine_similarity</span>(<span class="pl-s1">a</span>, <span class="pl-s1">b</span>):
<span class="pl-s1">dot_product</span> <span class="pl-c1">=</span> <span class="pl-en">sum</span>(<span class="pl-s1">x</span> <span class="pl-c1">*</span> <span class="pl-s1">y</span> <span class="pl-k">for</span> <span class="pl-s1">x</span>, <span class="pl-s1">y</span> <span class="pl-c1">in</span> <span class="pl-en">zip</span>(<span class="pl-s1">a</span>, <span class="pl-s1">b</span>))
<span class="pl-s1">magnitude_a</span> <span class="pl-c1">=</span> <span class="pl-en">sum</span>(<span class="pl-s1">x</span> <span class="pl-c1">*</span> <span class="pl-s1">x</span> <span class="pl-k">for</span> <span class="pl-s1">x</span> <span class="pl-c1">in</span> <span class="pl-s1">a</span>) <span class="pl-c1">**</span> <span class="pl-c1">0.5</span>
<span class="pl-s1">magnitude_b</span> <span class="pl-c1">=</span> <span class="pl-en">sum</span>(<span class="pl-s1">x</span> <span class="pl-c1">*</span> <span class="pl-s1">x</span> <span class="pl-k">for</span> <span class="pl-s1">x</span> <span class="pl-c1">in</span> <span class="pl-s1">b</span>) <span class="pl-c1">**</span> <span class="pl-c1">0.5</span>
<span class="pl-k">return</span> <span class="pl-s1">dot_product</span> <span class="pl-c1">/</span> (<span class="pl-s1">magnitude_a</span> <span class="pl-c1">*</span> <span class="pl-s1">magnitude_b</span>)</pre>
<p>These are Python functions for working with vectors that are stored in SQLite as BLOBs where each BLOB value is a sequence of floating point numbers as binary</p>
<p>Write a new SQLite extension in C that provides three SQL functions:</p>
<p>vector_decode(blob) -> returns a string that is a JSON formatted array of floats, eg "[1.1, 2.1, 3.5]"</p>
<p>vector_encode(string_of_json) -> returns a binary blob for that string. This does not need to use a full JSON parser, it just needs to work with an array that starts with [ and ends with ] and has comma separated floats, ignoring whitespace</p>
<p>vector_similarity(blob1, blob2) -> returns floating point cosine similarity for those two encoded vectors</p>
<p>Write it as a file on disk, then compile it and try it out</p>
</blockquote>
<p>I pasted in my existing Python code and told it to write me a SQLite extension based on that code.</p>
<p>I do this kind of thing a lot: prompting LLMs with code examples, often written in different languages. Code is a <em>really</em> good way to communicate requirements with them.</p>
<p>This kicked off a frustrating sequence of interactions. It wrote the extension as a file called <code>vector_extension.c</code>, compiled it, hit a bug, then wrote a fix in a new file called <code>vector_extension_fixed.c</code>.</p>
<p>But... when it tried to compile the new file, it hit errors because the C init function no longer matched the filename. So, like someone rolling their own version control based on duplicated files, it created <code>vector_extension_final_fixed.c</code> and that broke for the same reason...</p>
<p>When it got to <code>vector_extension_final_corrected.c</code> I took pity on it and cut it off!</p>
<p>This is a classic Code Interpreter problem: an error loop, where it keeps on trying but making variants of the same mistake, and (being a big bag of next-token-predicting matrix algebra) doesn't have the ability to learn from and correct itself.</p>
<p>Here's the <a href="https://chat.openai.com/share/a52ee3e4-7a91-48ee-a16e-3f680a65edf2">full transcript of that conversation</a>.</p>
<h4 id="clearing-state">Clearing state and continuing</h4>
<p>The best way out of an error loop is to clear state and start again.</p>
<p>I opened a brand new ChatGPT session and fed it a copy of the code it had previously written that I thought was most likely to work - literally pasting it into the prompt directly:</p>
<blockquote>
<p>Save this to a file called /mnt/data/vector.c and then give me a link to download the file:</p>
<div class="highlight highlight-source-c"><pre style="white-space: pre-wrap"><span class="pl-k">#include</span> <span class="pl-s"><sqlite3ext.h></span>
<span class="pl-k">#include</span> <span class="pl-s"><math.h></span>
<span class="pl-k">#include</span> <span class="pl-s"><stdlib.h></span>
<span class="pl-k">#include</span> <span class="pl-s"><string.h></span>
<span class="pl-k">#include</span> <span class="pl-s"><stdio.h></span>
<span class="pl-smi">SQLITE_EXTENSION_INIT1</span>
<span class="pl-k">static</span> <span class="pl-s1">void</span> <span class="pl-en">vector_decode</span>(<span class="pl-smi">sqlite3_context</span> <span class="pl-c1">*</span><span class="pl-s1">context</span>, <span class="pl-smi">int</span> <span class="pl-s1">argc</span>, <span class="pl-smi">sqlite3_value</span> <span class="pl-c1">*</span><span class="pl-c1">*</span><span class="pl-s1">argv</span>) {
<span class="pl-k">const</span> <span class="pl-smi">unsigned <span class="pl-smi">char</span></span> <span class="pl-c1">*</span><span class="pl-s1">blob</span> <span class="pl-c1">=</span> <span class="pl-en">sqlite3_value_blob</span>(<span class="pl-s1">argv</span>[<span class="pl-c1">0</span>]);
<span class="pl-smi">int</span> <span class="pl-s1">size</span> <span class="pl-c1">=</span> <span class="pl-en">sqlite3_value_bytes</span>(<span class="pl-s1">argv</span>[<span class="pl-c1">0</span>]) / <span class="pl-k">sizeof</span>(<span class="pl-smi">float</span>);
<span class="pl-smi">float</span> <span class="pl-c1">*</span><span class="pl-s1">values</span> <span class="pl-c1">=</span> (<span class="pl-smi">float</span> <span class="pl-c1">*</span>)<span class="pl-s1">blob</span>;
<span class="pl-smi">char</span> <span class="pl-c1">*</span><span class="pl-s1">result</span> <span class="pl-c1">=</span> <span class="pl-en">malloc</span>(<span class="pl-s1">size</span> <span class="pl-c1">*</span> <span class="pl-c1">12</span> <span class="pl-c1">+</span> <span class="pl-c1">2</span>); <span class="pl-c">// Max 10 digits per float + comma + space</span>
<span class="pl-smi">char</span> <span class="pl-c1">*</span><span class="pl-s1">ptr</span> <span class="pl-c1">=</span> <span class="pl-s1">result</span>;
<span class="pl-c1">*</span><span class="pl-s1">ptr</span><span class="pl-c1">++</span> <span class="pl-c1">=</span> <span class="pl-c1">'['</span>;
<span class="pl-k">for</span> (<span class="pl-smi">int</span> <span class="pl-s1">i</span> <span class="pl-c1">=</span> <span class="pl-c1">0</span>; <span class="pl-s1">i</span> <span class="pl-c1"><</span> <span class="pl-s1">size</span>; <span class="pl-c1">++</span><span class="pl-s1">i</span>) {
<span class="pl-k">if</span> (<span class="pl-s1">i</span> <span class="pl-c1">></span> <span class="pl-c1">0</span>) {
<span class="pl-c1">*</span><span class="pl-s1">ptr</span><span class="pl-c1">++</span> <span class="pl-c1">=</span> <span class="pl-c1">','</span>;
<span class="pl-c1">*</span><span class="pl-s1">ptr</span><span class="pl-c1">++</span> <span class="pl-c1">=</span> <span class="pl-c1">' '</span>;
}
<span class="pl-s1">ptr</span> <span class="pl-c1">+=</span> <span class="pl-en">sprintf</span>(<span class="pl-s1">ptr</span>, <span class="pl-s">"%.2f"</span>, <span class="pl-s1">values</span>[<span class="pl-s1">i</span>]);
}
<span class="pl-c1">*</span><span class="pl-s1">ptr</span><span class="pl-c1">++</span> <span class="pl-c1">=</span> <span class="pl-c1">']'</span>;
<span class="pl-c1">*</span><span class="pl-s1">ptr</span> <span class="pl-c1">=</span> <span class="pl-c1">'\0'</span>;
<span class="pl-en">sqlite3_result_text</span>(<span class="pl-s1">context</span>, <span class="pl-s1">result</span>, <span class="pl-c1">-1</span>, <span class="pl-c1">SQLITE_TRANSIENT</span>);
<span class="pl-en">free</span>(<span class="pl-s1">result</span>);
}
<span class="pl-k">static</span> <span class="pl-smi">void</span> <span class="pl-en">vector_encode</span>(<span class="pl-smi">sqlite3_context</span> <span class="pl-c1">*</span><span class="pl-s1">context</span>, <span class="pl-smi">int</span> <span class="pl-s1">argc</span>, <span class="pl-smi">sqlite3_value</span> <span class="pl-c1">*</span><span class="pl-c1">*</span><span class="pl-s1">argv</span>) {
<span class="pl-k">const</span> <span class="pl-smi">char</span> <span class="pl-c1">*</span><span class="pl-s1">json</span> <span class="pl-c1">=</span> (<span class="pl-k">const</span> <span class="pl-smi">char</span> <span class="pl-c1">*</span>)<span class="pl-en">sqlite3_value_text</span>(<span class="pl-s1">argv</span>[<span class="pl-c1">0</span>]);
<span class="pl-smi">int</span> <span class="pl-s1">count</span> <span class="pl-c1">=</span> <span class="pl-c1">0</span>;
<span class="pl-k">const</span> <span class="pl-smi">char</span> <span class="pl-c1">*</span><span class="pl-s1">ptr</span> <span class="pl-c1">=</span> <span class="pl-s1">json</span>;
<span class="pl-k">while</span> (<span class="pl-c1">*</span><span class="pl-s1">ptr</span>) {
<span class="pl-k">if</span> (<span class="pl-c1">*</span><span class="pl-s1">ptr</span> <span class="pl-c1">==</span> <span class="pl-c1">','</span>) <span class="pl-c1">++</span><span class="pl-s1">count</span>;
<span class="pl-c1">++</span><span class="pl-s1">ptr</span>;
}
<span class="pl-c1">++</span><span class="pl-s1">count</span>; <span class="pl-c">// Number of floats is one more than the number of commas</span>
<span class="pl-smi">float</span> <span class="pl-c1">*</span><span class="pl-s1">values</span> <span class="pl-c1">=</span> <span class="pl-en">malloc</span>(<span class="pl-s1">count</span> <span class="pl-c1">*</span> <span class="pl-k">sizeof</span>(<span class="pl-smi">float</span>));
<span class="pl-s1">ptr</span> <span class="pl-c1">=</span> <span class="pl-s1">json</span> <span class="pl-c1">+</span> <span class="pl-c1">1</span>; <span class="pl-c">// Skip the opening '['</span>
<span class="pl-k">for</span> (<span class="pl-smi">int</span> <span class="pl-s1">i</span> <span class="pl-c1">=</span> <span class="pl-c1">0</span>; <span class="pl-s1">i</span> <span class="pl-c1"><</span> <span class="pl-s1">count</span>; <span class="pl-c1">++</span><span class="pl-s1">i</span>) {
<span class="pl-s1">values</span>[<span class="pl-s1">i</span>] <span class="pl-c1">=</span> <span class="pl-en">strtof</span>(<span class="pl-s1">ptr</span>, (<span class="pl-smi">char</span> <span class="pl-c1">*</span><span class="pl-c1">*</span>)<span class="pl-c1">&</span><span class="pl-s1">ptr</span>);
<span class="pl-c1">++</span><span class="pl-s1">ptr</span>; <span class="pl-c">// Skip the comma or closing ']'</span>
}
<span class="pl-en">sqlite3_result_blob</span>(<span class="pl-s1">context</span>, <span class="pl-s1">values</span>, <span class="pl-s1">count</span> <span class="pl-c1">*</span> <span class="pl-k">sizeof</span>(<span class="pl-smi">float</span>), <span class="pl-c1">SQLITE_TRANSIENT</span>);
<span class="pl-en">free</span>(<span class="pl-s1">values</span>);
}
<span class="pl-k">static</span> <span class="pl-smi">void</span> <span class="pl-en">vector_similarity</span>(<span class="pl-smi">sqlite3_context</span> <span class="pl-c1">*</span><span class="pl-s1">context</span>, <span class="pl-smi">int</span> <span class="pl-s1">argc</span>, <span class="pl-smi">sqlite3_value</span> <span class="pl-c1">*</span><span class="pl-c1">*</span><span class="pl-s1">argv</span>) {
<span class="pl-k">const</span> <span class="pl-smi">float</span> <span class="pl-c1">*</span><span class="pl-s1">a</span> <span class="pl-c1">=</span> (<span class="pl-k">const</span> <span class="pl-smi">float</span> <span class="pl-c1">*</span>)<span class="pl-en">sqlite3_value_blob</span>(<span class="pl-s1">argv</span>[<span class="pl-c1">0</span>]);
<span class="pl-k">const</span> <span class="pl-smi">float</span> <span class="pl-c1">*</span><span class="pl-s1">b</span> <span class="pl-c1">=</span> (<span class="pl-k">const</span> <span class="pl-smi">float</span> <span class="pl-c1">*</span>)<span class="pl-en">sqlite3_value_blob</span>(<span class="pl-s1">argv</span>[<span class="pl-c1">1</span>]);
<span class="pl-smi">int</span> <span class="pl-s1">size</span> <span class="pl-c1">=</span> <span class="pl-en">sqlite3_value_bytes</span>(<span class="pl-s1">argv</span>[<span class="pl-c1">0</span>]) / <span class="pl-k">sizeof</span>(<span class="pl-smi">float</span>);
<span class="pl-smi">float</span> <span class="pl-s1">dot_product</span> <span class="pl-c1">=</span> <span class="pl-c1">0</span>, <span class="pl-s1">magnitude_a</span> <span class="pl-c1">=</span> <span class="pl-c1">0</span>, <span class="pl-s1">magnitude_b</span> <span class="pl-c1">=</span> <span class="pl-c1">0</span>;
<span class="pl-k">for</span> (<span class="pl-smi">int</span> <span class="pl-s1">i</span> <span class="pl-c1">=</span> <span class="pl-c1">0</span>; <span class="pl-s1">i</span> <span class="pl-c1"><</span> <span class="pl-s1">size</span>; <span class="pl-c1">++</span><span class="pl-s1">i</span>) {
<span class="pl-s1">dot_product</span> <span class="pl-c1">+=</span> <span class="pl-s1">a</span>[<span class="pl-s1">i</span>] <span class="pl-c1">*</span> <span class="pl-s1">b</span>[<span class="pl-s1">i</span>];
<span class="pl-s1">magnitude_a</span> <span class="pl-c1">+=</span> <span class="pl-s1">a</span>[<span class="pl-s1">i</span>] <span class="pl-c1">*</span> <span class="pl-s1">a</span>[<span class="pl-s1">i</span>];
<span class="pl-s1">magnitude_b</span> <span class="pl-c1">+=</span> <span class="pl-s1">b</span>[<span class="pl-s1">i</span>] <span class="pl-c1">*</span> <span class="pl-s1">b</span>[<span class="pl-s1">i</span>];
}
<span class="pl-s1">magnitude_a</span> <span class="pl-c1">=</span> <span class="pl-en">sqrtf</span>(<span class="pl-s1">magnitude_a</span>);
<span class="pl-s1">magnitude_b</span> <span class="pl-c1">=</span> <span class="pl-en">sqrtf</span>(<span class="pl-s1">magnitude_b</span>);
<span class="pl-smi">float</span> <span class="pl-s1">similarity</span> <span class="pl-c1">=</span> <span class="pl-s1">dot_product</span> / (<span class="pl-s1">magnitude_a</span> <span class="pl-c1">*</span> <span class="pl-s1">magnitude_b</span>);
<span class="pl-en">sqlite3_result_double</span>(<span class="pl-s1">context</span>, <span class="pl-s1">similarity</span>);
}
<span class="pl-k">#ifdef</span> <span class="pl-s1">_WIN32</span>
__declspec(<span class="pl-s1">dllexport</span>)
<span class="pl-k">#endif</span>
<span class="pl-smi">int</span> <span class="pl-en">sqlite3_vector_init</span>(<span class="pl-smi">sqlite3</span> <span class="pl-c1">*</span><span class="pl-s1">db</span>, <span class="pl-smi">char</span> <span class="pl-c1">*</span><span class="pl-c1">*</span><span class="pl-s1">pzErrMsg</span>, <span class="pl-k">const</span> <span class="pl-smi">sqlite3_api_routines</span> <span class="pl-c1">*</span><span class="pl-s1">pApi</span>) {
<span class="pl-c1">SQLITE_EXTENSION_INIT2</span>(<span class="pl-smi">pApi</span>)
<span class="pl-en">sqlite3_create_function</span>(<span class="pl-smi">db</span>, "<span class="pl-smi">vector_decode</span>", <span class="pl-c1">1</span>, <span class="pl-smi">SQLITE_UTF8</span>, <span class="pl-smi">NULL</span>, <span class="pl-smi">vector_decode</span>, <span class="pl-smi">NULL</span>, <span class="pl-smi">NULL</span>);
<span class="pl-en">sqlite3_create_function</span>(<span class="pl-s1">db</span>, <span class="pl-s">"vector_encode"</span>, <span class="pl-c1">1</span>, <span class="pl-c1">SQLITE_UTF8</span>, <span class="pl-c1">NULL</span>, <span class="pl-s1">vector_encode</span>, <span class="pl-c1">NULL</span>, <span class="pl-c1">NULL</span>);
<span class="pl-en">sqlite3_create_function</span>(<span class="pl-s1">db</span>, <span class="pl-s">"vector_similarity"</span>, <span class="pl-c1">2</span>, <span class="pl-c1">SQLITE_UTF8</span>, <span class="pl-c1">NULL</span>, <span class="pl-s1">vector_similarity</span>, <span class="pl-c1">NULL</span>, <span class="pl-c1">NULL</span>);
<span class="pl-k">return</span> <span class="pl-c1">SQLITE_OK</span>;
}<span class="pl-k"></span></pre></div>
</blockquote>
<p>Then I used the "Use your code interpreter tool to show me the version of your Python and SQLite" trick, then "Now use subprocess.run() to call "gcc --version" and tell me that version", and then:</p>
<blockquote>
<p>Now run this: ["gcc", "-shared", "-fPIC", "-o", "/mnt/data/vector.so", "/mnt/data/vector.c", "-lsqlite3"]</p>
</blockquote>
<p>That's the compile command the previous session had been using - I figured just copying that across would likely work.</p>
<p>And it did!</p>
<pre style="white-space: pre-wrap"><span class="pl-c"># Compile the C code into a shared library using GCC</span>
<span class="pl-s1">compile_result</span> <span class="pl-c1">=</span> <span class="pl-s1">subprocess</span>.<span class="pl-en">run</span>([<span class="pl-s">"gcc"</span>, <span class="pl-s">"-shared"</span>, <span class="pl-s">"-fPIC"</span>, <span class="pl-s">"-o"</span>, <span class="pl-s">"/mnt/data/vector.so"</span>, <span class="pl-s">"/mnt/data/vector.c"</span>, <span class="pl-s">"-lsqlite3"</span>], <span class="pl-s1">capture_output</span><span class="pl-c1">=</span><span class="pl-c1">True</span>, <span class="pl-s1">text</span><span class="pl-c1">=</span><span class="pl-c1">True</span>)
<span class="pl-c"># Check if there were any errors during compilation</span>
<span class="pl-k">if</span> <span class="pl-s1">compile_result</span>.<span class="pl-s1">returncode</span> <span class="pl-c1">!=</span> <span class="pl-c1">0</span>:
<span class="pl-s1">compilation_error</span> <span class="pl-c1">=</span> <span class="pl-s1">compile_result</span>.<span class="pl-s1">stderr</span>
<span class="pl-k">else</span>:
<span class="pl-s1">compilation_error</span> <span class="pl-c1">=</span> <span class="pl-c1">None</span>
<span class="pl-s1">compilation_error</span></pre>
<p>I prompted:</p>
<blockquote>
<p>Load the resulting extension into SQLite in Python and try out the new functions</p>
</blockquote>
<p>This was the moment of truth. Would it work? Would it even know how to use the new functions, given all it had seen so far was their implementation in C?</p>
<p>Here's what it did:</p>
<pre style="white-space: pre-wrap"><span class="pl-k">import</span> <span class="pl-s1">json</span>
<span class="pl-c"># Connect to an in-memory SQLite database</span>
<span class="pl-s1">conn</span> <span class="pl-c1">=</span> <span class="pl-s1">sqlite3</span>.<span class="pl-en">connect</span>(<span class="pl-s">":memory:"</span>)
<span class="pl-s1">conn</span>.<span class="pl-en">enable_load_extension</span>(<span class="pl-c1">True</span>)
<span class="pl-c"># Load the compiled shared library</span>
<span class="pl-s1">conn</span>.<span class="pl-en">load_extension</span>(<span class="pl-s">"/mnt/data/vector.so"</span>)
<span class="pl-c"># Create a sample table with vector data</span>
<span class="pl-s1">conn</span>.<span class="pl-en">execute</span>(<span class="pl-s">"CREATE TABLE vectors (id INTEGER PRIMARY KEY, vector BLOB)"</span>)
<span class="pl-s1">conn</span>.<span class="pl-en">execute</span>(<span class="pl-s">"INSERT INTO vectors (vector) VALUES (vector_encode('[1.0, 2.0, 3.0]'))"</span>)
<span class="pl-s1">conn</span>.<span class="pl-en">execute</span>(<span class="pl-s">"INSERT INTO vectors (vector) VALUES (vector_encode('[3.0, 2.0, 1.0]'))"</span>)
<span class="pl-c"># Test the vector_decode function</span>
<span class="pl-s1">decoded_vector</span> <span class="pl-c1">=</span> <span class="pl-s1">conn</span>.<span class="pl-en">execute</span>(<span class="pl-s">"SELECT vector_decode(vector) FROM vectors"</span>).<span class="pl-en">fetchall</span>()
<span class="pl-s1">decoded_vector</span> <span class="pl-c1">=</span> [<span class="pl-s1">json</span>.<span class="pl-en">loads</span>(<span class="pl-s1">row</span>[<span class="pl-c1">0</span>]) <span class="pl-k">for</span> <span class="pl-s1">row</span> <span class="pl-c1">in</span> <span class="pl-s1">decoded_vector</span>]
<span class="pl-c"># Test the vector_similarity function</span>
<span class="pl-s1">similarity</span> <span class="pl-c1">=</span> <span class="pl-s1">conn</span>.<span class="pl-en">execute</span>(<span class="pl-s">"SELECT vector_similarity(vector, vector) FROM vectors"</span>).<span class="pl-en">fetchall</span>()
<span class="pl-s1">similarity</span> <span class="pl-c1">=</span> [<span class="pl-s1">row</span>[<span class="pl-c1">0</span>] <span class="pl-k">for</span> <span class="pl-s1">row</span> <span class="pl-c1">in</span> <span class="pl-s1">similarity</span>]
<span class="pl-s1">decoded_vector</span>, <span class="pl-s1">similarity</span></pre>
<p>Output:</p>
<pre style="white-space: pre-wrap"><code>([[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]], [0.9999999403953552, 0.9999999403953552])
</code></pre>
<p>And it added commentary:</p>
<blockquote>
<p>The extension was successfully loaded into SQLite, and the new functions were tested:</p>
<ul>
<li>The <code>vector_decode</code> function successfully decoded the vectors from the database, resulting in <code>[[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]]</code>.</li>
<li>The <code>vector_similarity</code> function successfully calculated the similarity of each vector with itself, resulting in <code>[0.9999999403953552, 0.9999999403953552]</code>, which is essentially 1.0 as expected for identical vectors. </li>
</ul>
</blockquote>
<p>We got there! I now have a full, working prototype of a SQLite C extension for vector comparison functions, and I didn't have to write a single line of C code myself.</p>
<p>I continued prompting - I got it to do a performance comparison against the original Python implementation, and even had it brainstorm potential security bugs in the C code and try to exploit a few of them. Take a look at <a href="https://chat.openai.com/share/29bd1d15-66ed-4902-8eb1-3e9c7b820af2">the full transcript of that session</a> for details.</p>
<h4 id="running-on-macos">It runs on macOS too</h4>
<p>With a few extra hints from ChatGPT (I asked how to compile it on a Mac), I downloaded that <code>vector.c</code> file to my laptop and got the following to work:</p>
<div class="highlight highlight-source-shell"><pre style="white-space: pre-wrap">/tmp % mv <span class="pl-k">~</span>/Downloads/vector.c <span class="pl-c1">.</span>
/tmp % gcc -shared -fPIC -o vector.dylib -I/opt/homebrew/Cellar/sqlite/3.45.1/include vector.c -lsqlite3
/tmp % python</pre></div>
<div class="highlight highlight-text-python-console"><pre style="white-space: pre-wrap">Python 3.10.10 (main, Mar 21 2023, 13:41:05) [Clang 14.0.6 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> <span class="pl-k">import</span> sqlite3
>>> conn <span class="pl-k">=</span> sqlite3.connect(<span class="pl-s"><span class="pl-pds">"</span>:memory:<span class="pl-pds">"</span></span>)
>>> conn.enable_load_extension(<span class="pl-c1">True</span>)
>>> conn.load_extension(<span class="pl-s"><span class="pl-pds">"</span>/tmp/vector.dylib<span class="pl-pds">"</span></span>)
>>> conn.execute(<span class="pl-s"><span class="pl-pds">"</span>CREATE TABLE vectors (id INTEGER PRIMARY KEY, vector BLOB)<span class="pl-pds">"</span></span>)
<sqlite3.Cursor object at 0x1047fecc0>
>>> conn.execute(<span class="pl-s"><span class="pl-pds">"</span>INSERT INTO vectors (vector) VALUES (vector_encode('[1.0, 2.0, 3.0]'))<span class="pl-pds">"</span></span>)
<sqlite3.Cursor object at 0x1047fee40>
>>> conn.execute(<span class="pl-s"><span class="pl-pds">"</span>INSERT INTO vectors (vector) VALUES (vector_encode('[3.0, 2.0, 1.0]'))<span class="pl-pds">"</span></span>)
<sqlite3.Cursor object at 0x1047fecc0>
>>> decoded_vector <span class="pl-k">=</span> conn.execute(<span class="pl-s"><span class="pl-pds">"</span>SELECT vector_decode(vector) FROM vectors<span class="pl-pds">"</span></span>).fetchall()
>>> decoded_vector
[('[1.00, 2.00, 3.00]',), ('[3.00, 2.00, 1.00]',)]</pre></div>
<p>So I've now seen that C extension run on both Linux and macOS.</p>
<h4 id="whole-project-on-my-phone">I did this whole project on my phone</h4>
<p>Here's the thing I enjoy most about using Code Interpreter for these kinds of prototypes: since the prompts are short, and there's usually a delay of 30s+ between each prompt while it does its thing, I can do the whole thing on my phone while doing other things.</p>
<p>In this particular case I started out in bed, then got up, fed the dog, made coffee and pottered around the house for a bit - occasionally glancing back at my screen and poking it in a new direction with another prompt.</p>
<p>This almost doesn't count as a project at all. It started out as mild curiosity, and I only started taking it seriously when it became apparent that it was likely to produce a working result.</p>
<p>I only switched to my laptop right at the end, to try out the macOS compilation steps.</p>
<p>Total time invested: around an hour, but that included various other morning activities (coffee, dog maintenance, letting out the chickens.)</p>
<p>Which leads to the dilemma that affects so many of my weird little ChatGPT experiments:</p>
<h4 id="the-dilemma">The dilemma: do I finish this project?</h4>
<p>Thanks to Code Interpreter I now have a working prototype of something I would <em>never</em> have attempted to build on my own. My knowledge of C is thin enough that I don't remotely have the confidence to try something like this myself.</p>
<p>Taking what I've got so far and turning it into code that I would feel responsible using - and sharing with other people - requires the following:</p>
<ul>
<li>I need to manually test it <em>really</em> thoroughly. I haven't actually done the work to ensure it's returning the right results yet!</li>
<li>I need to make sure I understand every line of C code that it's written for me</li>
<li>I then need to review that code, and make sure it's sensible and logic-error-free</li>
<li>I need to audit it for security</li>
<li>I need to add comprehensive automated tests</li>
</ul>
<p>I should probably drop the <code>vector_encode()</code> and <code>vector_decode()</code> functions entirely - parsing a JSON-like string in C is fraught with additional risk already, and those aren't performance critical - just having a fast <code>vector_similarity()</code> function that worked against BLOBs would give me the performance gain I'm looking for.</p>
<p>All of this is a <em>lot</em> of extra work. ChatGPT can help me in various ways with each of those steps, but it's still on me to do the work and make absolutely sure that I'm confident in my understanding beyond just what got hallucinated at me by a bunch of black-box matrices.</p>
<p>This project was not in my plans for the weekend. I'm not going to put that work in right now - so "SQLite C extension for vector similarity" will be added to my ever-growing list of half-baked ideas that LLMs helped me prototype way beyond what I would have been able to do on my own.</p>
<p>So I'm going to blog about it, and move on. I may well revisit this - the performance gains over my Python functions looked to be 16-83x (according to a benchmark that ChatGPT ran for me which I have not taken the time to verify) which is a <em>very</em> material improvement. But for the moment I have so many other things I need to prioritize.</p>
<p>If anyone else wants to take this and turn it into something usable, please be my guest!</p>
mapshaper.org2024-03-23T03:44:22+00:002024-03-23T03:44:22+00:00https://simonwillison.net/2024/Mar/23/mapshaperorg/#atom-everything
<p><a href="https://mapshaper.org/">mapshaper.org</a></p>
<p>It turns out the mapshaper CLI tool for manipulating geospatial data - including converting shapefiles to GeoJSON and back again - also has a web UI that runs the conversions entirely in your browser. If you need to convert between those (and other) formats it's hard to imagine a more convenient option.</p>
<p>Via <a href="https://twitter.com/adamrpearce/status/1771378836128854097">@adamrpearce</a></p>
Threads has entered the fediverse2024-03-22T20:15:20+00:002024-03-22T20:15:20+00:00https://simonwillison.net/2024/Mar/22/threads-has-entered-the-fediverse/#atom-everything
<p><a href="https://engineering.fb.com/2024/03/21/networking-traffic/threads-has-entered-the-fediverse/">Threads has entered the fediverse</a></p>
<p>Threads users with public profiles in certain countries can now turn on a setting which makes their posts available in the fediverse - so users of ActivityPub systems such as Mastodon can follow their accounts to subscribe to their posts.</p>
<p>It's only a partial integration at the moment: Threads users can't themselves follow accounts from other providers yet, and their notifications will show them likes but not boosts or replies: "For now, people who want to see replies on their posts on other fediverse servers will have to visit those servers directly."</p>
<p>Depending on how you count, Mastodon has around 9m user accounts of which 1m are active. Threads claims more than 130m active monthly users. The Threads team are developing these features cautiously which is reassuring to see - a clumsy or thoughtless integration could cause all sorts of damage just from the sheer scale of their service.</p>
<p>Via <a href="https://anderegg.ca/2024/03/22/poking-at-threads-in-the-fediverse">Gavin Anderegg</a></p>
Claude and ChatGPT for ad-hoc sidequests2024-03-22T19:44:12+00:002024-03-22T19:44:12+00:00https://simonwillison.net/2024/Mar/22/claude-and-chatgpt-case-study/#atom-everything
<p>Here is a short, illustrative example of one of the ways in which I use Claude and ChatGPT on a daily basis.</p>
<p>I recently learned that the <a href="https://en.wikipedia.org/wiki/Adirondack_Park">Adirondack Park</a> is the single largest park in the contiguous United States, taking up a fifth of the state of New York.</p>
<p>Naturally, my first thought was that it would be neat to have a GeoJSON file representing the boundary of the park.</p>
<p>A quick search landed me on the <a href="https://apa.ny.gov/gis/ApaData.html">Adirondack Park Agency GIS data page</a>, which offered me a shapefile of the "Outer boundary of the New York State Adirondack Park as described in Section 9-0101 of the New York Environmental Conservation Law". Sounds good!</p>
<p>I knew there were tools for converting shapefiles to GeoJSON, but I couldn't remember what they were. Since I had a terminal window open already, I typed the following:</p>
<div class="highlight highlight-source-shell"><pre style="white-space: pre-wrap">llm -m opus -c <span class="pl-s"><span class="pl-pds">'</span>give me options on macOS for CLI tools to turn a shapefile into GeoJSON<span class="pl-pds">'</span></span></pre></div>
<p>Here I am using my <a href="https://llm.datasette.io/">LLM tool</a> (and <a href="https://github.com/simonw/llm-claude-3">llm-claude-3</a> plugin) to run a prompt through the new <a href="https://www.anthropic.com/news/claude-3-family">Claude 3 Opus</a>, my current favorite language model.</p>
<p>It replied with a couple of options, but the first was this:</p>
<div class="highlight highlight-source-shell"><pre>ogr2ogr -f GeoJSON output.geojson input.shp</pre></div>
<p>So I ran that against the shapefile, and then pasted <a href="https://gist.github.com/simonw/c941f3454cdec7e10f500dc5a752b614">the resulting GeoJSON</a> into <a href="https://geojson.io/">geojson.io</a> to check if it worked... and nothing displayed. Then I looked at the GeoJSON and spotted this:</p>
<p><code>"coordinates": [ [ -8358911.527799999341369, 5379193.197800002992153 ] ...</code></p>
<p>That didn't look right. Those co-ordinates aren't the correct scale for latitude and longitude values.</p>
<p>So I sent a follow-up prompt to the model (the <code>-c</code> option means "continue previous conversation"):</p>
<div class="highlight highlight-source-shell"><pre style="white-space: pre-wrap">llm -c <span class="pl-s"><span class="pl-pds">'</span>i tried using ogr2ogr but it gave me back GeoJSON with a weird coordinate system that was not lat/lon that i am used to<span class="pl-pds">'</span></span></pre></div>
<p>It suggested this new command:</p>
<div class="highlight highlight-source-shell"><pre>ogr2ogr -f GeoJSON -t_srs EPSG:4326 output.geojson input.shp</pre></div>
<p>This time <a href="https://gist.github.com/simonw/6c4cf102a8ea532dc365c2773f0eb6ea">it worked</a>! The shapefile has now been converted to GeoJSON.</p>
<p>Time elapsed so far: 2.5 minutes (I can tell from <a href="https://llm.datasette.io/en/stable/logging.html">my LLM logs</a>).</p>
<p>I pasted it into <a href="https://datasette.io/">Datasette</a> (with <a href="https://github.com/datasette/datasette-paste">datasette-paste</a> and <a href="https://datasette.io/plugins/datasette-leaflet-geojson">datasette-leaflet-geojson</a>) to take a look at it more closely, and got this:</p>
<p><img src="https://static.simonwillison.net/static/2024/datasette-adirondack-boundaries.jpg" alt="A Datasette table with 106 rows. The first two are shown - both have properties and a geometry, and the geometry is a single line on a map. The first one has a ECL_Text of thence southerly along the westerly line of lots 223, 241, 259, 276, 293, 309, 325 and 340 to the southwesterly corner of lot number 340 in the Brantingham Tract and the second has thence westerly along the northern line of lots 204 and 203 to the midpoint of the northern line of lot 203" style="max-width: 100%;" /></p>
<p>That's not a single polygon! That's 106 line segments... and they are fascinating. Look at those descriptions:</p>
<blockquote>
<p>thence westerly along the northern line of lots 204 and 203 to the midpoint of the northern line of lot 203</p>
</blockquote>
<p>This is utterly delightful. The shapefile description did say "as described in Section 9-0101 of the New York Environmental Conservation Law", so I guess this is how you write geographically boundaries into law!</p>
<p>But it's not what I wanted. I want a single polygon of the whole park, not 106 separate lines.</p>
<p>I decided to switch models. ChatGPT has access to Code Interpreter, and I happen to know that Code Interpreter is quite effective at processing GeoJSON.</p>
<p>I opened a new ChatGPT (with GPT-4) browser tab, uploaded my GeoJSON file and prompted it:</p>
<blockquote>
<p>This GeoJSON file is full of line segments. Use them to create me a single shape that is a Polygon</p>
</blockquote>
<p><img src="https://static.simonwillison.net/static/2024/shapely-chatgpt-1.jpg" alt="ChatGPT screenshot - it shows some Python code with a result of <shapely.geometry.polygon.Polygon at 0x7eba83f9fca0 />, then says: I've created a polygon from the line segments in the GeoJSON file. You can now use this polygon for further analysis or visualization. If you have specific requirements for the polygon or need it in a particular format, please let me know! " style="max-width: 100%;" /></p>
<p>OK, so it wrote some Python code and ran it. But did it work?</p>
<p>I happen to know that Code Interpreter can save files to disk and provide links to download them, so I told it to do that:</p>
<blockquote>
<p>Save it to a GeoJSON file for me to download</p>
</blockquote>
<p><img src="https://static.simonwillison.net/static/2024/shapely-chatgpt-2.jpg" alt="ChatGPT screenshot - this time it writes more Python code to define a GeoJSON polygon, then saves that to a file called /mnt/data/polygon.geojson and gives me a link to download it." style="max-width: 100%;" /></p>
<p>I pasted <a href="https://gist.github.com/simonw/c1002dbf5249de7addd0b65cb774d3e9">that</a> into <a href="https://geojson.io/">geojson.io</a>, and it was clearly wrong:</p>
<p><img src="https://static.simonwillison.net/static/2024/shapely-chatgpt-bad-map.jpg" alt="geojson.io screenshot - a triangle shape sits on top of an area of upstate New York, clearly not in the shape of the park" style="max-width: 100%;" /></p>
<p>So I told it to try again. I didn't think very hard about this prompt, I basically went with a version of "do better":</p>
<blockquote>
<p>that doesn't look right to me, check that it has all of the lines in it</p>
</blockquote>
<p><img src="https://static.simonwillison.net/static/2024/shapely-chatgpt-3.jpg" alt="ChatGPT screenshot - it writes more Python code and outputs a link to complete_polygon.geojson" style="max-width: 100%;" /></p>
<p>It gave me a new file, optimistically named <code>complete_polygon.geojson</code>. Here's what that one looked like:</p>
<p><img src="https://static.simonwillison.net/static/2024/shapely-chatgpt-almost.jpg" alt="ChatGPT screenshot - it writes more Python code and outputs a link to complete_polygon.geojson" style="max-width: 100%;" /></p>
<p>This is getting a lot closer! Note how the right hand boundary of the park looks correct, but the rest of the image is scrambled.</p>
<p>I had a hunch about the fix. I prompted:</p>
<p><img src="https://static.simonwillison.net/static/2024/shapely-chatgpt-4.jpg" alt="That almost works but you need to sort the line segments first, it looked like this: an a screenshot of a map" style="max-width: 100%;" /></p>
<p>I pasted in a screenshot of where we were so far and added my hunch about the solution:</p>
<blockquote>
<p>That almost works but you need to sort the line segments first, it looked like this:</p>
</blockquote>
<p>Honestly, pasting in the screenshot probably wasn't necessary here, but it amused me.</p>
<p>... and ChatGPT churned away again ...</p>
<p><img src="https://static.simonwillison.net/static/2024/shapely-chatgpt-5.jpg" alt="More Python code - link to the full transcript is below" style="max-width: 100%;" /></p>
<p><a href="https://gist.github.com/simonw/b9e4325b76e4a3813ff5482aa278c342">sorted_polygon.geojson</a> is spot on! Here's what it looks like:</p>
<p><img src="https://static.simonwillison.net/static/2024/shapely-good-map.jpg" alt="A shaded polygon showing the exact shape of the boundary of Adirondack Park, overlayed on a map of the area" style="max-width: 100%;" /></p>
<p>Total time spent in ChatGPT: 3 minutes and 35 seconds. Plus 2.5 minutes with Claude 3 earlier, so an overall total of just over 6 minutes.</p>
<p>Here's <a href="https://gist.github.com/simonw/0343cdd3568bbe28cad15d1097b1b1c7">the full Claude transcript</a> and the <a href="https://gist.github.com/simonw/3eb845823c5ad4c48d2b4eb7586f1533">full transcript from ChatGPT</a>.</p>
<h4 id="not-notable">This isn't notable</h4>
<p>The most notable thing about this example is how completely <em>not</em> notable it is.</p>
<p>I get results like this from these tools several times a day. I'm not at all surprised that this worked, in fact, I would've been mildly surprised if it had not.</p>
<p>Could I have done this without LLM assistance? Yes, but not nearly as quickly. And this was not a task on my critical path for the day - it was a sidequest at best and honestly more of a distraction.</p>
<p>So, without LLM tools, I would likely have given this one up at the first hurdle.</p>
<p>A year ago I wrote about how <a href="https://simonwillison.net/2023/Mar/27/ai-enhanced-development/">AI-enhanced development makes me more ambitious with my projects</a>. They are now so firmly baked into my daily work that they influence not just side projects but tiny sidequests like this one as well.</p>
<h4 id="not-simple">This certainly wasn't simple</h4>
<p>Something else I like about this example is that it illustrates quite how much depth there is to getting great results out of these systems.</p>
<p>In those few minutes I used two different interfaces to call two different models. I sent multiple follow-up prompts. I triggered Code Interpreter, took advantage of GPT-4 Vision and mixed in external tools like <a href="https://geojson.io/">geojson.io</a> and Datasette as well.</p>
<p>I leaned a lot on my existing knowledge and experience:</p>
<ul>
<li>I knew that tools existed for commandline processing of shapefiles and GeoJSON</li>
<li>I instinctively knew that Claude 3 Opus was likely to correctly answer my initial prompt</li>
<li>I knew the capabilities of Code Interpreter, including that it has libraries that can process geometries, what to say to get it to kick into action and how to get it to give me files to download</li>
<li>My limited GIS knowledge was strong enough to spot a likely coordinate system problem, and I guessed the fix for the jumbled lines</li>
<li>My prompting intuition is developed to the point that I didn't have to think very hard about what to say to get the best results</li>
</ul>
<p>If you have the right combination of domain knowledge and hard-won experience driving LLMs, you can <em>fly</em> with these things.</p>
<h4 id="a-bit-trivial">Isn't this a bit trivial?</h4>
<p>Yes it is, and that's the point. This was a five minute sidequest. Writing about it here took ten times longer than the exercise itself.</p>
<p>I take on LLM-assisted sidequests like this one dozens of times a week. Many of them are substantially larger and more useful. They are having a very material impact on my work: I can get more done and solve much more interesting problems, because I'm not wasting valuable cycles figuring out <code>ogr2ogr</code> invocations or mucking around with polygon libraries.</p>
<p>Not to mention that I find working this way <em>fun</em>! It feels like science fiction every time I do it. Our AI-assisted future is here right now and I'm still finding it weird, fascinating and deeply entertaining.</p>
<h4 id="llms-are-useful">LLMs are useful</h4>
<p>There are many legitimate criticisms of LLMs. The copyright issues involved in their training, their enormous power consumption and the risks of people trusting them when they shouldn't (considering both accuracy and bias) are three that I think about a lot.</p>
<p>The one criticism I wont accept is that they aren't <em>useful</em>.</p>
<p>One of the greatest misconceptions concerning LLMs is the idea that they are easy to use. They really aren't: getting great results out of them requires a great deal of experience and hard-fought intuition, combined with deep domain knowledge of the problem you are applying them to.</p>
<p>I use these things every day. They help me take on much more interesting and ambitious problems than I could otherwise. I would miss them terribly if they were no longer available to me.</p>
The Dropflow Playground2024-03-22T01:33:44+00:002024-03-22T01:33:44+00:00https://simonwillison.net/2024/Mar/22/the-dropflow-playground/#atom-everything
<p><a href="https://chearon.github.io/dropflow/">The Dropflow Playground</a></p>
<p>Dropflow is a "CSS layout engine" written in TypeScript and taking advantage of the HarfBuzz text shaping engine (used by Chrome, Android, Firefox and more) compiled to WebAssembly to implement glyph layout.</p>
<p>This linked demo is fascinating: on the left hand side you can edit HTML with inline styles, and the right hand side then updates live to show that content rendered by Dropflow in a canvas element.</p>
<p>Why would you want this? It lets you generate images and PDFs with excellent performance using your existing knowledge HTML and CSS. It's also just really cool!</p>
<p>Via <a href="https://github.com/chearon/dropflow">chearon/dropflow</a></p>
Quoting Ryan Broderick2024-03-21T21:49:43+00:002024-03-21T21:49:43+00:00https://simonwillison.net/2024/Mar/21/ryan-broderick/#atom-everything
<blockquote cite="https://www.garbageday.email/p/clout-world#a-is-impending-reputation-crisis"><p>At this point, I’m confident saying that 75% of what generative-AI text and image platforms can do is useless at best and, at worst, actively harmful. Which means that if AI companies want to onboard the millions of people they need as customers to fund themselves and bring about the great AI revolution, they’ll have to perpetually outrun the millions of pathetic losers hoping to use this tech to make a quick buck. Which is something crypto has never been able to do.<br><br>In fact, we may have already reached a point where AI images have become synonymous with scams and fraud.</p></blockquote><p class="cite">— <a href="https://www.garbageday.email/p/clout-world#a-is-impending-reputation-crisis">Ryan Broderick</a>
DuckDB as the New jq2024-03-21T20:36:20+00:002024-03-21T20:36:20+00:00https://simonwillison.net/2024/Mar/21/duckdb-as-the-new-jq/#atom-everything
<p><a href="https://www.pgrs.net/2024/03/21/duckdb-as-the-new-jq/">DuckDB as the New jq</a></p>
<p>The DuckDB CLI tool can query JSON files directly, making it a surprisingly effective replacement for jq. Paul Gross demonstrates the following query:</p>
<p>select license->>'key' as license, count(*) from 'repos.json' group by 1</p>
<p>repos.json contains an array of {"license": {"key": "apache-2.0"}..} objects. This example query shows counts for each of those licenses.</p>
<p>Via <a href="https://lobste.rs/s/x5immj/duckdb_as_new_jq">lobste.rs</a></p>
Redis Adopts Dual Source-Available Licensing2024-03-21T02:24:40+00:002024-03-21T02:24:40+00:00https://simonwillison.net/2024/Mar/21/redis-adopts-dual-source-available-licensing/#atom-everything
<p><a href="https://redis.com/blog/redis-adopts-dual-source-available-licensing/">Redis Adopts Dual Source-Available Licensing</a></p>
<p>Well this sucks: after fifteen years (and contributions from more than 700 people), Redis is dropping the 3-clause BSD license going forward, instead being "dual-licensed under the Redis Source Available License (RSALv2) and Server Side Public License (SSPLv1)" from Redis 7.4 onwards.</p>
<p>Via <a href="https://mstdn.social/@msw/112130308202090850">@msw</a></p>
Quoting Jacob Kaplan-Moss2024-03-21T00:45:47+00:002024-03-21T00:45:47+00:00https://simonwillison.net/2024/Mar/21/jacob-kaplan-moss/#atom-everything
<blockquote cite="https://jacobian.org/2024/mar/20/django-chat/"><p>I think most people have this naive idea of consensus meaning “everyone agrees”. That’s not what consensus means, as practiced by organizations that truly have a mature and well developed consensus driven process.<br><br>Consensus is not “everyone agrees”, but [a model where] people are more aligned with the process than they are with any particular outcome, and they’ve all agreed on how decisions will be made.</p></blockquote><p class="cite">— <a href="https://jacobian.org/2024/mar/20/django-chat/">Jacob Kaplan-Moss</a>
Talking about Django’s history and future on Django Chat2024-03-21T00:42:15+00:002024-03-21T00:42:15+00:00https://simonwillison.net/2024/Mar/21/django-history-and-future/#atom-everything
<p><a href="https://jacobian.org/2024/mar/20/django-chat/">Talking about Django’s history and future on Django Chat</a></p>
<p>Django co-creator Jacob Kaplan-Moss sat down with the Django Chat podcast team to talk about Django's history, his recent return to the Django Software Foundation board and what he hopes to achieve there.</p>
<p>Here's his post about it, where he used Whisper and Claude to extract some of his own highlights from the conversation.</p>
<p>Via <a href="https://djangochat.com/episodes/djangos-evolution-jacob-kaplan-moss-6CswRTAf">Django Chat episodes</a></p>
GitHub Public repo history tool2024-03-20T21:56:12+00:002024-03-20T21:56:12+00:00https://simonwillison.net/2024/Mar/20/github-public-repo-history/#atom-everything
<p><a href="https://observablehq.com/@simonw/github-public-repo-history">GitHub Public repo history tool</a></p>
<p>I built this Observable Notebook to run queries against the GH Archive (via ClickHouse) to try to answer questions about repository history - in particular, were they ever made public as opposed to private in the past.</p>
<p>It works by combining together PublicEvent event (moments when a private repo was made public) with the most recent PushEvent event for each of a user's repositories.</p>
<p>Via <a href="https://til.simonwillison.net/clickhouse/github-public-history">TIL: Reviewing your history of public GitHub repositories using ClickHouse</a></p>
Releasing Common Corpus: the largest public domain dataset for training LLMs2024-03-20T19:34:00+00:002024-03-20T19:34:00+00:00https://simonwillison.net/2024/Mar/20/releasing-common-corpus/#atom-everything
<p><a href="https://huggingface.co/blog/Pclanglais/common-corpus">Releasing Common Corpus: the largest public domain dataset for training LLMs</a></p>
<p>Released today. 500 billion words from "a wide diversity of cultural heritage initiatives". 180 billion words of English, 110 billion of French, 30 billion of German, then Dutch, Spanish and Italian.</p>
<p>Includes quite a lot of US public domain data - 21 million digitized out-of-copyright newspapers (or do they mean newspaper articles?)</p>
<p>"This is only an initial part of what we have collected so far, in part due to the lengthy process of copyright duration verification. In the following weeks and months, we’ll continue to publish many additional datasets also coming from other open sources, such as open data or open science."</p>
<p>Coordinated by French AI startup Pleias and supported by the French Ministry of Culture, among others.</p>
<p>I can't wait to try a model that's been trained on this.</p>
<p>Via <a href="https://www.wired.com/story/proof-you-can-train-ai-without-slurping-copyrighted-content/">Wired: Here's Proof You Can Train an AI Model Without Slurping Copyrighted Content</a></p>
Skew protection in Vercel2024-03-20T14:06:38+00:002024-03-20T14:06:38+00:00https://simonwillison.net/2024/Mar/20/skew-protection-in-vercel/#atom-everything
<p><a href="https://vercel.com/docs/deployments/skew-protection">Skew protection in Vercel</a></p>
<p>Version skew is a name for the bug that occurs when your user loads a web application and then unintentionally keeps that browser tab open across a deployment of a new version of the app. If you're unlucky this can lead to broken behaviour, where a client makes a call to a backend endpoint that has changed in an incompatible way.</p>
<p>Vercel have an ingenious solution to this problem. Their platform already makes it easy to deploy many different instances of an application. You can now turn on "skew protection" for a number of hours which will keep older versions of your backend deployed.</p>
<p>The application itself can then include its desired deployment ID in a x-deployment-id header, a __vdpl cookie or a ?dpl= query string parameter.</p>
<p>Via <a href="https://twitter.com/vercel_changes/status/1770280131250286851">Vercel changes</a></p>