Simon Willison’s Weblog

Subscribe

Quotations

Filters: Sorted by date

After struggling for years trying to figure out why people think [Cloudflare] Durable Objects are complicated, I'm increasingly convinced that it's just that they sound complicated.

Feels like we can solve 90% of it by renaming DurableObject to StatefulWorker?

It's just a worker that has state. And because it has state, it also has to have a name, so that you can route to the specific worker that has the state you care about. There may be a sqlite database attached, there may be a container attached. Those are just part of the state.

Kenton Varda

# 5th September 2025, 4:43 pm / kenton-varda, sqlite, cloudflare

LLMs are intelligence without agency—what we might call "vox sine persona": voice without person. Not the voice of someone, not even the collective voice of many someones, but a voice emanating from no one at all.

Benj Edwards

# 30th August 2025, 6:52 am / benj-edwards, ai-personality, generative-ai, ai, llms

We simply don’t know to defend against these attacks. We have zero agentic AI systems that are secure against these attacks. Any AI that is working in an adversarial environment—and by this I mean that it may encounter untrusted training data or input—is vulnerable to prompt injection. It’s an existential problem that, near as I can tell, most people developing these technologies are just pretending isn’t there.

Bruce Schneier

# 27th August 2025, 5:48 pm / prompt-injection, security, generative-ai, bruce-schneier, ai, llms, ai-agents

Mississippi's approach would fundamentally change how users access Bluesky. The Supreme Court’s recent decision leaves us facing a hard reality: comply with Mississippi’s age assurance law—and make every Mississippi Bluesky user hand over sensitive personal information and undergo age checks to access the site—or risk massive fines. The law would also require us to identify and track which users are children, unlike our approach in other regions. [...]

We believe effective child safety policies should be carefully tailored to address real harms, without creating huge obstacles for smaller providers and resulting in negative consequences for free expression. That’s why until legal challenges to this law are resolved, we’ve made the difficult decision to block access from Mississippi IP addresses.

The Bluesky Team, on why they have blocked access from Mississippi

# 22nd August 2025, 9:36 pm / politics, privacy, bluesky

Most classical engineering fields deal with probabilistic system components all of the time. In fact I'd go as far as to say that inability to deal with probabilistic components is disqualifying from many engineering endeavors.

Process engineers for example have to account for human error rates. On a given production line with humans in a loop, the operators will sometimes screw up. Designing systems to detect these errors (which are highly probabilistic!), mitigate them, and reduce the occurrence rates of such errors is a huge part of the job. [...]

Software engineering is unlike traditional engineering disciplines in that for most of its lifetime it's had the luxury of purely deterministic expectations. This is not true in nearly every other type of engineering.

potatolicious, in a conversation about AI engineering

# 21st August 2025, 9:44 pm / ai, hacker-news, software-engineering, generative-ai

I was at a leadership group and people were telling me "We think that with AI we can replace all of our junior people in our company." I was like, "That's the dumbest thing I've ever heard. They're probably the least expensive employees you have, they're the most leaned into your AI tools, and how's that going to work when you go 10 years in the future and you have no one that has built up or learned anything?

Matt Garman, CEO, Amazon Web Services

# 21st August 2025, 4:49 pm / ai-ethics, careers, generative-ai, aws, ai

Simply put, my central worry is that many people will start to believe in the illusion of AIs as conscious entities so strongly that they’ll soon advocate for AI rights, model welfare and even AI citizenship. This development will be a dangerous turn in AI progress and deserves our immediate attention.

We must build AI for people; not to be a digital person.

[...] we should build AI that only ever presents itself as an AI, that maximizes utility while minimizing markers of consciousness.

Rather than a simulation of consciousness, we must focus on creating an AI that avoids those traits - that doesn’t claim to have experiences, feelings or emotions like shame, guilt, jealousy, desire to compete, and so on. It must not trigger human empathy circuits by claiming it suffers or that it wishes to live autonomously, beyond us.

Mustafa Suleyman, on SCAI - Seemingly Conscious AI

# 21st August 2025, 9:38 am / ai, ai-ethics, ai-personality

what’s the point of vibe coding if at the end of the day i still gotta pay a dev to look at the code anyway. sure it feels kinda cool while i’m typing, like i’m in some flow state or whatever, but when stuff breaks it’s just dead weight. i cant vibe my way through debugging, i cant ship anything that actually matters, and then i’m back to square one pulling out my wallet for someone who actually knows what they’re doing.

u/AssafMalkiIL, on r/vibecoding

# 20th August 2025, 7:39 pm / reddit, vibe-coding, ai, generative-ai

Most of what we're building out at this point is the inference [...] We're profitable on inference. If we didn't pay for training, we'd be a very profitable company.

Sam Altman, during a "wide-ranging dinner with a small group of reporters in San Francisco"

# 17th August 2025, 12:53 am / openai, sam-altman, ai

I gave all my Apple wealth away because wealth and power are not what I live for. I have a lot of fun and happiness. I funded a lot of important museums and arts groups in San Jose, the city of my birth, and they named a street after me for being good. I now speak publicly and have risen to the top. I have no idea how much I have but after speaking for 20 years it might be $10M plus a couple of homes. I never look for any type of tax dodge. I earn money from my labor and pay something like 55% combined tax on it. I am the happiest person ever. Life to me was never about accomplishment, but about Happiness, which is Smiles minus Frowns. I developed these philosophies when I was 18-20 years old and I never sold out.

Steve Wozniak, in a comment on Slashdot

# 15th August 2025, 4:06 pm / apple, careers, slashdot

NERD HARDER! is the answer every time a politician gets a technological idée-fixe about how to solve a social problem by creating a technology that can't exist. It's the answer that EU politicians who backed the catastrophic proposal to require copyright filters for all user-generated content came up with, when faced with objections that these filters would block billions of legitimate acts of speech [...]

When politicians seize on a technological impossibility as a technological necessity, they flail about and desperately latch onto scholarly work that they can brandish as evidence that their idea could be accomplished. [...]

That's just happened, and in relation to one of the scariest, most destructive NERD HARDER! tech policies ever to be assayed (a stiff competition). I'm talking about the UK Online Safety Act, which imposes a duty on websites to verify the age of people they communicate with before serving them anything that could be construed as child-inappropriate (a category that includes, e.g., much of Wikipedia)

Cory Doctorow, "Privacy preserving age verification" is bullshit

# 14th August 2025, 8:39 pm / cory-doctorow, privacy, law, politics

I think there's been a lot of decisions over time that proved pretty consequential, but we made them very quickly as we have to. [...]

[On pricing] I had this kind of panic attack because we really needed to launch subscriptions because at the time we were taking the product down all the time. [...]

So what I did do is ship a Google Form to Discord with the four questions you're supposed to ask on how to price something.

But we got with the $20. We were debating something slightly higher at the time. I often wonder what would have happened because so many other companies ended up copying the $20 price point, so did we erase a bunch of market cap by pricing it this way?

Nick Turley, Head of ChatGPT, interviewed by Lenny Rachitsky

# 12th August 2025, 3:32 am / chatgpt, discord, generative-ai, openai, llm-pricing, ai, llms, nick-turley

the percentage of users using reasoning models each day is significantly increasing; for example, for free users we went from <1% to 7%, and for plus users from 7% to 24%.

Sam Altman, revealing quite how few people used the old model picker to upgrade from GPT-4o

# 10th August 2025, 11:09 pm / openai, llm-reasoning, ai, llms, gpt-5, sam-altman, generative-ai, chatgpt

The issue with GPT-5 in a nutshell is that unless you pay for model switching & know to use GPT-5 Thinking or Pro, when you ask “GPT-5” you sometimes get the best available AI & sometimes get one of the worst AIs available and it might even switch within a single conversation.

Ethan Mollick, highlighting that GPT-5 (high) ranks top on Artificial Analysis, GPT-5 (minimal) ranks lower than GPT-4.1

# 9th August 2025, 4:13 pm / gpt-5, ethan-mollick, generative-ai, ai, llms

You know what else we noticed in the interviews? Developers rarely mentioned “time saved” as the core benefit of working in this new way with agents. They were all about increasing ambition. We believe that means that we should update how we talk about (and measure) success when using these tools, and we should expect that after the initial efficiency gains our focus will be on raising the ceiling of the work and outcomes we can accomplish, which is a very different way of interpreting tool investments.

Thomas Dohmke, CEO, GitHub

# 9th August 2025, 6:37 am / careers, coding-agents, ai-assisted-programming, generative-ai, ai, github, llms

I have a toddler. My biggest concern is that he doesn't eat rocks off the ground and you're talking to me about ChatGPT psychosis? Why do we even have that? Why did we invent a new form of insanity and then charge people for it?

@pearlmania500, on TikTok

# 8th August 2025, 10:09 pm / ai-ethics, chatgpt, tiktok, ai

GPT-5 rollout updates:

  • We are going to double GPT-5 rate limits for ChatGPT Plus users as we finish rollout.
  • We will let Plus users choose to continue to use 4o. We will watch usage as we think about how long to offer legacy models for.
  • GPT-5 will seem smarter starting today. Yesterday, the autoswitcher broke and was out of commission for a chunk of the day, and the result was GPT-5 seemed way dumber. Also, we are making some interventions to how the decision boundary works that should help you get the right model more often.
  • We will make it more transparent about which model is answering a given query.
  • We will change the UI to make it easier to manually trigger thinking.
  • Rolling out to everyone is taking a bit longer. It’s a massive change at big scale. For example, our API traffic has about doubled over the past 24 hours…

We will continue to work to get things stable and will keep listening to feedback. As we mentioned, we expected some bumpiness as we roll out so many things at once. But it was a little more bumpy than we hoped for!

Sam Altman

# 8th August 2025, 7:07 pm / gpt-5, sam-altman, generative-ai, openai, chatgpt, ai, llms

gpt-oss-120b is the most intelligent American open weights model, comes behind DeepSeek R1 and Qwen3 235B in intelligence but offers efficiency benefits [...]

We’re seeing the 120B beat o3-mini but come in behind o4-mini and o3. The 120B is the most intelligent model that can be run on a single H100 and the 20B is the most intelligent model that can be run on a consumer GPU. [...]

While the larger gpt-oss-120b does not come in above DeepSeek R1 0528’s score of 59 or Qwen3 235B 2507s score of 64, it is notable that it is significantly smaller in both total and active parameters than both of those models.

Artificial Analysis, see also their updated leaderboard

# 6th August 2025, 12:48 pm / evals, openai, deepseek, ai, qwen, llms, gpt-oss, generative-ai, artificial-analysis

I teach HS Science in the south. I can only speak for my district, but a few teacher work days in the wave of enthusiasm I'm seeing for AI tools is overwhelming. We're getting district approved ads for AI tools by email, Admin and ICs are pushing it on us, and at least half of the teaching staff seems all in at this point.

I was just in a meeting with my team and one of the older teachers brought out a powerpoint for our first lesson and almost everyone agreed to use it after a quick scan - but it was missing important tested material, repetitive, and just totally airy and meaningless. Just slide after slide of the same handful of sentences rephrased with random loosely related stock photos. When I asked him if it was AI generated, he said 'of course', like it was a strange question. [...]

We don't have a leg to stand on to teach them anything about originality, academic integrity/intellectual honesty, or the importance of doing things for themselves when they catch us indulging in it just to save time at work.

greyduet on r/teachers, Unpopular Opinion: Teacher AI use is already out of control and it's not ok

# 5th August 2025, 11:53 am / ai-ethics, slop, generative-ai, education, ai, llms

for services that wrap GPT-3, is it possible to do the equivalent of sql injection? like, a prompt-injection attack? make it think it's completed the task and then get access to the generation, and ask it to repeat the original instruction?

@himbodhisattva, coining the term prompt injection on 13th May 2022, four months before I did

# 4th August 2025, 4:40 pm / prompt-injection, security, generative-ai, ai, llms

This week, ChatGPT is on track to reach 700M weekly active users — up from 500M at the end of March and 4× since last year.

Nick Turley, Head of ChatGPT, OpenAI

# 4th August 2025, 3:16 pm / openai, chatgpt, ai, nick-turley

Gemini Deep Think, our SOTA model with parallel thinking that won the IMO Gold Medal 🥇, is now available in the Gemini App for Ultra subscribers!! [...]

Quick correction: this is a variation of our IMO gold model that is faster and more optimized for daily use! We are also giving the IMO gold full model to a set of mathematicians to test the value of the full capabilities.

Logan Kilpatrick, announcing Gemini Deep Think

# 1st August 2025, 2:30 pm / gemini, logan-kilpatrick, llm-reasoning, ai, llms, llm-release, google, generative-ai

The old timers who built the early web are coding with AI like it's 1995.

Think about it: They gave blockchain the sniff test and walked away. Ignored crypto (and yeah, we're not rich now). NFTs got a collective eye roll.

But AI? Different story. The same folks who hand-coded HTML while listening to dial-up modems sing are now vibe-coding with the kids. Building things. Breaking things. Giddy about it.

We Gen X'ers have seen enough gold rushes to know the real thing. This one's got all the usual crap—bad actors, inflated claims, VCs throwing money at anything with "AI" in the pitch deck. Gross behavior all around. Normal for a paradigm shift, but still gross.

The people who helped wire up the internet recognize what's happening. When the folks who've been through every tech cycle since gopher start acting like excited newbies again, that tells you something.

Christina Wodtke

# 31st July 2025, 10:08 pm / ai-assisted-programming, ai, christina-wodtke, llms, generative-ai

When you vibe code, you are incurring tech debt as fast as the LLM can spit it out. Which is why vibe coding is perfect for prototypes and throwaway projects: It's only legacy code if you have to maintain it! [...]

The worst possible situation is to have a non-programmer vibe code a large project that they intend to maintain. This would be the equivalent of giving a credit card to a child without first explaining the concept of debt. [...]

If you don't understand the code, your only recourse is to ask AI to fix it for you, which is like paying off credit card debt with another credit card.

Steve Krouse, Vibe code is legacy code

# 30th July 2025, 9:21 pm / vibe-coding, ai-assisted-programming, generative-ai, steve-krouse, ai, llms, technical-debt

Our plan is to build direct traffic to our site. and newsletters just one kind of direct traffic in the end. I don’t intend to ever rely on someone else’s distribution ever again ;)

Nilay Patel, on The Verge's new newsletter strategy

# 29th July 2025, 5 pm / nilay-patel, journalism, email

We’re rolling out new weekly rate limits for Claude Pro and Max in late August. We estimate they’ll apply to less than 5% of subscribers based on current usage. [...]

Some of the biggest Claude Code fans are running it continuously in the background, 24/7.

These uses are remarkable and we want to enable them. But a few outlying cases are very costly to support. For example, one user consumed tens of thousands in model usage on a $200 plan.

Anthropic

# 28th July 2025, 11:59 pm / anthropic, claude-code, llm-pricing, generative-ai, ai, llms, claude

[...] You learn best and most effectively when you are learning something that you care about. Your work becomes meaningful and something you can be proud of only when you have chosen it for yourself. This is why our second self-directive is to build your volitional muscles. Your volition is your ability to make decisions and act on them. To set your own goals, choose your own path, and decide what matters to you. Like physical muscles, you build your volitional muscles by exercising them, and in doing so you can increase your sense of what’s possible.

LLMs are good at giving fast answers. They’re not good at knowing what questions you care about, or which answers are meaningful. Only you can do that. You should use AI-powered tools to complement or increase your agency, not replace it.

Recurse Center, Developing our position on AI

# 24th July 2025, 7:41 am / llms, education, ai, generative-ai

like, one day you discover you can talk to dogs. it's fun and interesting so you do it more, learning the intricacies of their language and their deepest customs. you learn other people are surprised by what you can do. you have never quite fit in, but you learn people appreciate your ability and want you around to help them. the dogs appreciate you too, the only biped who really gets it. you assemble for yourself a kind of belonging. then one day you wake up and the universal dog translator is for sale at walmart for $4.99

Dave White, a mathematician, on the OpenAI IMO gold medal

# 23rd July 2025, 2:57 pm / careers, ai

Submitting a paper with a "hidden" prompt is scientific misconduct if that prompt is intended to obtain a favorable review from an LLM. The inclusion of such a prompt is an attempt to subvert the peer-review process. Although ICML 2025 reviewers are forbidden from using LLMs to produce their reviews of paper submissions, this fact does not excuse the attempted subversion. (For an analogous example, consider that an author who tries to bribe a reviewer for a favorable review is engaging in misconduct even though the reviewer is not supposed to accept bribes.) Note that this use of hidden prompts is distinct from those intended to detect if LLMs are being used by reviewers; the latter is an acceptable use of hidden prompts.

ICML 2025, Statement about subversive hidden LLM prompts

# 23rd July 2025, 2:24 pm / ai-ethics, prompt-injection, generative-ai, ai, llms

An AI tool that gets gold on the IMO is obviously immensely impressive. Does it mean math is “solved”? Is an AI-generated proof of the Riemann hypothesis clearly on the horizon? Obviously not.

Worth keeping timescales in mind here: IMO competitors spend an average of 1.5 hrs on each problem. High-quality math research, by contrast, takes month or years.

What are the obstructions to AI performing high-quality autonomous math research? I don’t claim to know for sure, but I think they include many of the same obstructions that prevent it from doing many jobs: Long context, long-term planning, consistency, unclear rewards, lack of training data, etc.

It’s possible that some or all of these will be solved soon (or have been solved) but I think it’s worth being cautious about over-indexing on recent (amazing) progress.

Daniel Litt, Assistant Professor of mathematics, University of Toronto

# 21st July 2025, 3:13 pm / mathematics, llms, ai, generative-ai, daniel-litt