Simon Willison’s Weblog

Sunday, 2nd March 2025

Hallucinations in code are the least dangerous form of LLM mistakes

A surprisingly common complaint I see from developers who have tried using LLMs for code is that they encountered a hallucination—usually the LLM inventing a method or even a full software library that doesn’t exist—and it crashed their confidence in LLMs as a tool for writing code. How could anyone productively use these things if they invent methods that don’t exist?

[... 979 words]
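As the title suggests, invented methods and libraries tend to be the easiest mistakes to catch, because they fail the moment you run the code. A minimal sketch of that failure mode, using a made-up module name purely for illustration:

```python
# Sketch: a hallucinated import fails immediately at runtime.
# "pyautoparse" is an invented library name, used only for illustration.
try:
    import pyautoparse  # hallucinated: this package does not exist
except ModuleNotFoundError as exc:
    print(f"Caught at runtime: {exc}")
```

Running the code is all it takes to surface this class of error, unlike subtler bugs that execute without complaint.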

18f.org. New site by members of 18F, the team within the US government that were doing some of the most effective work at improving government efficiency.

For over 11 years, 18F has been proudly serving you to make government technology work better. We are non-partisan civil servants. 18F has worked on hundreds of projects, all designed to make government technology not just efficient but effective, and to save money for American taxpayers.

However, all employees at 18F – a group that the Trump Administration GSA Technology Transformation Services Director called "the gold standard" of civic tech – were terminated today at midnight ET.

18F was doing exactly the type of work that DOGE claims to want – yet we were eliminated.

The entire team is now on "administrative leave" and locked out of their computers.

But these are not the kind of civil servants to abandon their mission without a fight:

We’re not done yet.

We’re still absorbing what has happened. We’re wrestling with what it will mean for ourselves and our families, as well as the impact on our partners and the American people.

But we came to the government to fix things. And we’re not done with this work yet.

More to come.

You can follow @team18f.bsky.social on Bluesky.

# 9:24 am / government, political-hacking, politics, bluesky

Regarding the recent blog post, I think a simpler explanation is that hallucinating a non-existent library is such an inhuman error it throws people. A human making such an error would be almost unforgivably careless.

Kellan Elliott-McCrea

# 2:16 pm / ai-assisted-programming, generative-ai, kellan-elliott-mccrea, ai, llms

Notes from my Accessibility and Gen AI podcast appearance


I was a guest on the most recent episode of the Accessibility + Gen AI Podcast, hosted by Eamon McErlean and Joe Devon. We had a really fun, wide-ranging conversation about a host of different topics. I’ve extracted a few choice quotes from the transcript.

[... 947 words]

After publishing this piece, I was contacted by Anthropic who told me that Sonnet 3.7 would not be considered a 10^26 FLOP model and cost a few tens of millions of dollars to train, though future models will be much bigger.

Ethan Mollick

# 5:56 pm / ethan-mollick, anthropic, claude, generative-ai, ai, llms
