Entries

Filters: Sorted by date

3,316 results, page 3 / 111

First impressions of Claude Cowork, Anthropic’s general agent

New from Anthropic today is Claude Cowork, a “research preview” that they describe as “Claude Code for the rest of your work”. It’s currently available only to Max subscribers ($100 or $200 per month plans) as part of the updated Claude Desktop macOS application. Update 16th January 2026: it’s now also available to $20/month Claude Pro subscribers.

[... 1,863 words]

9:46 pm / 12th January 2026 / sandboxing, ai, prompt-injection, generative-ai, llms, anthropic, claude, ai-agents, claude-code, lethal-trifecta, claude-cowork

My answers to the questions I posed about porting open source code with LLMs

Last month I wrote about porting JustHTML from Python to JavaScript using Codex CLI and GPT-5.2 in a few hours while also buying a Christmas tree and watching Knives Out 3. I ended that post with a series of open questions about the ethics and legality of this style of work. Alexander Petros on lobste.rs just challenged me to answer them, which is fair enough! Here’s my attempt at that.

[... 1,034 words]

10:59 pm / 11th January 2026 / definitions, open-source, ai, generative-ai, llms, ai-assisted-programming, ai-ethics, conformance-suites, vibe-porting

Fly’s new Sprites.dev addresses both developer sandboxes and API sandboxes at the same time

New from Fly.io today: Sprites.dev. Here’s their blog post and YouTube demo. It’s an interesting new product that’s quite difficult to explain—Fly call it “Stateful sandbox environments with checkpoint & restore” but I see it as hitting two of my current favorite problems: a safe development environment for running coding agents and an API for running untrusted code in a secure sandbox.

[... 1,560 words]

11:57 pm / 9th January 2026 / sandboxing, thomas-ptacek, ai, fly, coding-agents

LLM predictions for 2026, shared with Oxide and Friends

I joined a recording of the Oxide and Friends podcast on Tuesday to talk about 1, 3 and 6 year predictions for the tech industry. This is my second appearance on their annual predictions episode; you can see my predictions from January 2025 here. Here’s the page for this year’s episode, with options to listen in all of your favorite podcast apps or directly on YouTube.

[... 1,773 words]

7:42 pm / 8th January 2026 / predictions, sandboxing, ai, kakapo, generative-ai, llms, ai-assisted-programming, oxide, bryan-cantrill, coding-agents, jevons-paradox, conformance-suites, browser-challenge, deep-blue, november-2025-inflection

Introducing gisthost.github.io

I am a huge fan of gistpreview.github.io, the site by Leon Huang that lets you append ?GIST_id to see a browser-rendered version of an HTML page that you have saved to a Gist. The last commit was ten years ago and I needed a couple of small changes so I’ve forked it and deployed an updated version at gisthost.github.io.

[... 956 words]

10:12 pm / 1st January 2026 / github, http, javascript, projects, ai-assisted-programming, cors
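
The URL scheme described in the entry above (append `?GIST_id` to the site address) can be sketched in a few lines of Python. This is only an illustration: the `gisthost_url` helper, the `https://` scheme and the example ID `abc123` are assumptions, not something taken from the post.

```python
from urllib.parse import quote

# Base address of the forked preview site named in the entry above.
# The https:// scheme is an assumption.
GISTHOST_BASE = "https://gisthost.github.io/"


def gisthost_url(gist_id: str) -> str:
    """Build a preview URL by appending the Gist ID as the query string.

    Hypothetical helper: it only assembles the URL and does not check
    that the Gist exists or contains an HTML file.
    """
    gist_id = gist_id.strip()
    if not gist_id:
        raise ValueError("gist_id must not be empty")
    # Percent-encode defensively; real Gist IDs are plain hex strings.
    return f"{GISTHOST_BASE}?{quote(gist_id, safe='')}"


# "abc123" is a made-up ID used purely for illustration.
print(gisthost_url("abc123"))
```
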

2025: The year in LLMs

This is the third in my annual series reviewing everything that happened in the LLM space over the past 12 months. For previous years see Stuff we figured out about AI in 2023 and Things we learned about LLMs in 2024.

[... 8,273 words]

11:50 pm / 31st December 2025 / ai, openai, generative-ai, llms, anthropic, gemini, ai-agents, pelican-riding-a-bicycle, vibe-coding, coding-agents, ai-in-china, conformance-suites

How Rob Pike got spammed with an AI slop “act of kindness”

Rob Pike (that Rob Pike) is furious. Here’s a Bluesky link for if you have an account there and a link to it in my thread viewer if you don’t.

[... 2,158 words]

6:16 pm / 26th December 2025 / rob-pike, ai, shot-scraper, generative-ai, llms, slop, ai-agents, ai-ethics, ai-misuse

A new way to extract detailed transcripts from Claude Code

I’ve released claude-code-transcripts, a new Python CLI tool for converting Claude Code transcripts to detailed HTML pages that provide a better interface for understanding what Claude Code has done than even Claude Code itself. The resulting transcripts are also designed to be shared, using any static HTML hosting or even via GitHub Gists.

[... 1,082 words]

11:52 pm / 25th December 2025 / projects, ai, generative-ai, llms, ai-assisted-programming, anthropic, claude, coding-agents, claude-code

Cooking with Claude

I’ve been having an absurd amount of fun recently using LLMs for cooking. I started out using them for basic recipes, but as I’ve grown more confident in their culinary abilities I’ve leaned into them for more advanced tasks. Today I tried something new: having Claude vibe-code up a custom application to help with the timing for a complicated meal preparation. It worked really well!

[... 1,313 words]

5:01 am / 23rd December 2025 / cooking, devfort, localstorage, tools, ai, generative-ai, llms, anthropic, claude, vision-llms, vibe-coding

Your job is to deliver code you have proven to work

In all of the debates about the value of AI-assistance in software development there’s one depressing anecdote that I keep on seeing: the junior engineer, empowered by some class of LLM tool, who deposits giant, untested PRs on their coworkers—or open source maintainers—and expects the “code review” process to handle the rest.

[... 840 words]

2:49 pm / 18th December 2025 / programming, careers, ai, generative-ai, llms, ai-assisted-programming, ai-ethics, vibe-coding, coding-agents

Gemini 3 Flash

It continues to be a busy December, if not quite as busy as last year. Today’s big news is Gemini 3 Flash, the latest in Google’s “Flash” line of faster and less expensive models.

[... 1,271 words]

10:44 pm / 17th December 2025 / google, ai, web-components, generative-ai, llms, llm, gemini, llm-pricing, pelican-riding-a-bicycle, llm-release

I ported JustHTML from Python to JavaScript with Codex CLI and GPT-5.2 in 4.5 hours

I wrote about JustHTML yesterday—Emil Stenström’s project to build a new standards compliant HTML5 parser in pure Python code using coding agents running against the comprehensive html5lib-tests testing library. Last night, purely out of curiosity, I decided to try porting JustHTML from Python to JavaScript with the least amount of effort possible, using Codex CLI and GPT-5.2. It worked beyond my expectations.

[... 1,818 words]

11:58 pm / 15th December 2025 / html, javascript, python, ai, generative-ai, llms, ai-assisted-programming, gpt-5, codex, november-2025-inflection, vibe-porting, gpt

JustHTML is a fascinating example of vibe engineering in action

I recently came across JustHTML, a new Python library for parsing HTML released by Emil Stenström. It’s a very interesting piece of software, both as a useful library and as a case study in sophisticated AI-assisted programming.

[... 956 words]

3:59 pm / 14th December 2025 / html, python, ai, generative-ai, llms, ai-assisted-programming, vibe-coding, coding-agents, conformance-suites

OpenAI are quietly adopting skills, now available in ChatGPT and Codex CLI

One of the things that most excited me about Anthropic’s new Skills mechanism back in October is how easy it looked for other platforms to implement. A skill is just a folder with a Markdown file and some optional extra resources and scripts, so any LLM tool with the ability to navigate and read from a filesystem should be capable of using them. It turns out OpenAI are doing exactly that, with skills support quietly showing up in both their Codex CLI tool and now also in ChatGPT itself.

[... 1,360 words]

11:29 pm / 12th December 2025 / pdf, ai, kakapo, openai, prompt-engineering, generative-ai, chatgpt, llms, ai-assisted-programming, anthropic, coding-agents, gpt-5, codex, skills, gpt
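
To illustrate why the folder-plus-Markdown design described in the entry above is easy for any filesystem-capable tool to adopt, here is a rough sketch of skill discovery in Python. The `find_skills` function is hypothetical, and the `SKILL.md` filename is an assumption that the excerpt does not state; nothing here reflects how any particular tool actually loads skills.

```python
from pathlib import Path


def find_skills(root: Path, filename: str = "SKILL.md") -> dict[str, str]:
    """Map each skill folder name under `root` to its Markdown text.

    Hypothetical sketch: a "skill" is taken to be any direct subfolder
    that contains a Markdown file called `filename` (assumed name).
    Extra resources and scripts in the folder are ignored.
    """
    skills: dict[str, str] = {}
    for folder in sorted(p for p in root.iterdir() if p.is_dir()):
        doc = folder / filename
        if doc.is_file():
            skills[folder.name] = doc.read_text(encoding="utf-8")
    return skills
```

A tool could call `find_skills(Path("skills"))` at startup and show the model only the folder names, reading the full Markdown when a skill looks relevant. That lazy-loading step is a design guess, not something the excerpt confirms.
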

GPT-5.2

OpenAI reportedly declared a “code red” on the 1st of December in response to increasingly credible competition from the likes of Google’s Gemini 3. It’s less than two weeks later and they just announced GPT-5.2, calling it “the most capable model series yet for professional knowledge work”.

[... 964 words]

11:58 pm / 11th December 2025 / ai, openai, generative-ai, llms, llm, pelican-riding-a-bicycle, llm-release, gpt-5, gpt

Useful patterns for building HTML tools

I’ve started using the term HTML tools to refer to HTML applications that I’ve been building which combine HTML, JavaScript, and CSS in a single file and use them to provide useful functionality. I have built over 150 of these in the past two years, almost all of them written by LLMs. This article presents a collection of useful patterns I’ve discovered along the way.

[... 4,231 words]

9 pm / 10th December 2025 / definitions, github, html, javascript, localstorage, projects, tools, ai, webassembly, generative-ai, llms, ai-assisted-programming, vibe-coding, coding-agents, claude-code

Under the hood of Canada Spends with Brendan Samek

I talked to Brendan Samek about Canada Spends, a project from Build Canada that makes Canadian government financial data accessible and explorable using a combination of Datasette, a neat custom frontend, Ruby ingestion scripts, sqlite-utils and pieces of LLM-powered PDF extraction.

[... 561 words]

11:52 pm / 9th December 2025 / data-journalism, politics, sqlite, youtube, datasette, sqlite-utils

Highlights from my appearance on the Data Renegades podcast with CL Kao and Dori Wilson

I talked with CL Kao and Dori Wilson for an episode of their new Data Renegades podcast titled Data Journalism Unleashed with Simon Willison.

[... 2,964 words]

12:29 am / 26th November 2025 / data, data-journalism, django, ai, datasette, podcast-appearances

Claude Opus 4.5, and why evaluating new LLMs is increasingly difficult

Anthropic released Claude Opus 4.5 this morning, which they call “best model in the world for coding, agents, and computer use”. This is their attempt to retake the crown for best coding model after significant challenges from OpenAI’s GPT-5.1-Codex-Max and Google’s Gemini 3, both released within the past week!

[... 1,120 words]

7:37 pm / 24th November 2025 / prompt-injection, generative-ai, llms, anthropic, claude, evals, llm-pricing, pelican-riding-a-bicycle, llm-release, november-2025-inflection

sqlite-utils 4.0a1 has several (minor) backwards incompatible changes

I released a new alpha version of sqlite-utils last night—the 128th release of that package since I started building it back in 2018.

[... 1,049 words]

2:52 pm / 24th November 2025 / projects, sqlite, sqlite-utils, annotated-release-notes, ai-assisted-programming, coding-agents, claude-code

Olmo 3 is a fully open LLM

Olmo is the LLM series from Ai2—the Allen Institute for AI. Unlike most open weight models these are notable for including the full training data, training process and checkpoints along with those releases.

[... 1,834 words]

11:59 pm / 22nd November 2025 / ai, generative-ai, llms, interpretability, pelican-riding-a-bicycle, llm-reasoning, ai2, ai-ethics, llm-release, lm-studio, nathan-lambert, olmo

Nano Banana Pro aka gemini-3-pro-image-preview is the best available image generation model

Hot on the heels of Tuesday’s Gemini 3 Pro release, today it’s Nano Banana Pro, also known as Gemini 3 Pro Image. I’ve had a few days of preview access and this is an astonishingly capable image generation model.

[... 1,641 words]

4:32 pm / 20th November 2025 / google, ai, datasette, generative-ai, llms, gemini, text-to-image, llm-release, nano-banana

How I automate my Substack newsletter with content from my blog

I sent out my weekly-ish Substack newsletter this morning and took the opportunity to record a YouTube video demonstrating my process and describing the different components that make it work. There’s a lot of digital duct tape involved, taking the content from Django+Heroku+PostgreSQL to GitHub Actions to SQLite+Datasette+Fly.io to JavaScript+Observable and finally to Substack.

[... 1,345 words]

10 pm / 19th November 2025 / blogging, django, javascript, postgresql, sql, sqlite, youtube, heroku, datasette, observable, github-actions, fly, newsletter, substack, site-upgrades

Trying out Gemini 3 Pro with audio transcription and a new pelican benchmark

Google released Gemini 3 Pro today. Here’s the announcement from Sundar Pichai, Demis Hassabis, and Koray Kavukcuoglu, their developer blog announcement from Logan Kilpatrick, the Gemini 3 Pro Model Card, and their collection of 11 more articles. It’s a big release!

[... 2,476 words]

7 pm / 18th November 2025 / google, ai, generative-ai, llms, llm, gemini, llm-pricing, pelican-riding-a-bicycle, llm-reasoning, llm-release

What happens if AI labs train for pelicans riding bicycles?

Almost every time I share a new example of an SVG of a pelican riding a bicycle a variant of this question pops up: how do you know the labs aren’t training for your benchmark?

[... 325 words]

4:03 pm / 13th November 2025 / ai, generative-ai, llms, pelican-riding-a-bicycle

Reverse engineering Codex CLI to get GPT-5-Codex-Mini to draw me a pelican

OpenAI partially released a new model yesterday called GPT-5-Codex-Mini, which they describe as “a more compact and cost-efficient version of GPT-5-Codex”. It’s currently only available via their Codex CLI tool and VS Code extension, with proper API access “coming soon”. I decided to use Codex to reverse engineer the Codex CLI tool and give me the ability to prompt the new model directly.

[... 1,774 words]

3:31 am / 9th November 2025 / ai, rust, openai, generative-ai, llms, ai-assisted-programming, pelican-riding-a-bicycle, llm-release, vibe-coding, coding-agents, gpt-5, codex, gpt-codex, gpt

Video + notes on upgrading a Datasette plugin for the latest 1.0 alpha, with help from uv and OpenAI Codex CLI

I’m upgrading various plugins for compatibility with the new Datasette 1.0a20 alpha release and I decided to record a video of the process. This post accompanies that video with detailed additional notes.

[... 1,094 words]

6:26 pm / 6th November 2025 / plugins, python, youtube, ai, datasette, generative-ai, llms, ai-assisted-programming, uv, coding-agents, codex

Code research projects with async coding agents like Claude Code and Codex

I’ve been experimenting with a pattern for LLM usage recently that’s working out really well: asynchronous code research tasks. Pick a research question, spin up an asynchronous coding agent and let it go and run some experiments and report back when it’s done.

[... 2,017 words]

3:53 pm / 6th November 2025 / ai, webassembly, generative-ai, llms, ai-assisted-programming, slop, ai-agents, coding-agents, claude-code, jules, codex

A new SQL-powered permissions system in Datasette 1.0a20

Datasette 1.0a20 is out with the biggest breaking API change on the road to 1.0, improving how Datasette’s permissions system works by migrating permission logic to SQL running in SQLite. This release involved 163 commits, with 10,660 additions and 1,825 deletions, most of which was written with the help of Claude Code.

[... 2,750 words]

9:34 pm / 4th November 2025 / plugins, projects, python, sql, sqlite, datasette, annotated-release-notes, uv, coding-agents, claude-code, codex

New prompt injection papers: Agents Rule of Two and The Attacker Moves Second

Two interesting new papers regarding LLM security and prompt injection came to my attention this weekend.

[... 1,433 words]

11:09 pm / 2nd November 2025 / definitions, security, openai, prompt-injection, anthropic, nicholas-carlini, paper-review, lethal-trifecta

Simon Willison’s Weblog