Notes
Filters: Sorted by date
I was grumbling to myself about how if we're going to give in, ditch the proper definition and use "vibe coding" to refer to all forms of AI-assisted programming, where do we draw the line?
Is it "vibe coding" if my IDE suggests the completion of a single line of code? How about if I copy and paste in a three line "escape HTML characters" function from ChatGPT? What if I copy and paste some code from StackOverflow that it turns out was AI-generated by someone else? How much AI-assistance does it take to switch from programming to "vibe coding"?
Then I realized that the answer was staring me in the face. There is no clear line. It's all in the vibes.
If you want to create completely free software for other people to use, the absolute best delivery mechanism right now is static HTML and JavaScript served from a free web host with an established reputation.
Thanks to WebAssembly the set of potential software that can be served in this way is vast and, I think, under appreciated. Pyodide means we can ship client-side Python applications now!
This assumes that you would like your gift to the world to keep working for as long as possible, while granting you the freedom to lose interest and move onto other projects without needing to keep covering expenses far into the future.
Even the cheapest hosting plan requires you to monitor and update billing details every few years. Domains have to be renewed. Anything that runs server-side will inevitably need to be upgraded someday - and the longer you wait between upgrades the harder those become.
My top choice for this kind of thing in 2025 is GitHub, using GitHub Pages. It's free for public repositories and I haven't seen GitHub break a working URL that they have hosted in the 17+ years since they first launched.
A few years ago I'd have recommended Heroku on the basis that their free plan had stayed reliable for more than a decade, but Salesforce took that accumulated goodwill and incinerated it in 2022.
It almost goes without saying that you should release it under an open source license. The license alone is not enough to ensure regular human beings can make use of what you have built though: give people a link to something that works!
My post on o3 guessing locations from photos made it to Hacker News and by far the most interesting comments are from SamPatt, a self-described competitive GeoGuessr player.
In a thread about meta-knowledge of the StreetView card uses in different regions:
The photography matters a great deal - they're categorized into "Generations" of coverage. Gen 2 is low resolution, Gen 3 is pretty good but has a distinct car blur, Gen 4 is highest quality. Each country tends to have only one or two categories of coverage, and some are so distinct you can immediately know a location based solely on that (India is the best example here). [...]
Nigeria and Tunisia have follow cars. Senegal, Montenegro and Albania have large rifts in the sky where the panorama stitching software did a poor job. Some parts of Russia had recent forest fires and are very smokey. One road in Turkey is in absurdly thick fog. The list is endless, which is why it's so fun!
Sam also has his own custom Obsidian flashcard deck "with hundreds of entries to help me remember road lines, power poles, bollards, architecture, license plates, etc".
I asked Sam how closely the GeoGuessr community track updates to street view imagery, and unsurprisingly those are a big deal. Sam pointed me to this 10 minute video review by zi8gzag of the latest big update from three weeks ago:
This is one of the biggest updates in years in my opinion. It could be the biggest update since the 2022 update that gave Gen 4 to Nigeria, Senegal, and Rwanda. It's definitely on the same level as the Kazakhstan update or the Germany update in my opinion.
Last September I posted a series of long ranty comments on Lobste.rs about the latest instance of the immortal conspiracy theory (here it goes again) about apps spying on you through your microphone to serve you targeted ads.
On the basis that it's always a great idea to backfill content on your blog, I just extracted my best comments from that thread and turned them into this full post here, back-dated to September 2nd which is when I wrote the comments.
My rant was in response to the story In Leak, Facebook Partner Brags About Listening to Your Phone’s Microphone to Serve Ads for Stuff You Mention. Here's how it starts:
Which is more likely?
- All of the conspiracy theories are real! The industry managed to keep the evidence from us for decades, but finally a marketing agency of a local newspaper chain has blown the lid off the whole thing, in a bunch of blog posts and PDFs and on a podcast.
- Everyone believed that their phone was listening to them even when it wasn’t. The marketing agency of a local newspaper chain were the first group to be caught taking advantage of that widespread paranoia and use it to try and dupe people into spending money with them, despite the tech not actually working like that.
My money continues to be on number 2.
You can read the rest here. Or skip straight to why I think this matters so much:
Privacy is important. People who are sufficiently engaged need to be able to understand exactly what’s going on, so they can e.g. campaign for legislators to reign in the most egregious abuses.
I think it’s harmful letting people continue to believe things about privacy that are not true, when we should instead be helping them understand the things that are true.
Fun fact: there's no rule that says you can't create a new blog today and backfill (and backdate) it with your writing from other platforms or sources, even going back many years.
I'd love to see more people do this!
(Inspired by this tweet by John F. Wu introducing his new blog. I did this myself when I relaunched this blog back in 2017.)
In today's example of how Google's AI overviews are the worst form of AI-assisted search (previously, hallucinating Encanto 2), it turns out you can type in any made-up phrase you like and tag "meaning" on the end and Google will provide you with an entirely made-up justification for the phrase.
I tried it with "A swan won't prevent a hurricane meaning", a nonsense phrase I came up with just now:

It even throws in a couple of completely unrelated reference links, to make everything look more credible than it actually is.
I think this was first spotted by @writtenbymeaghan on Threads.
An underestimated challenge in making productive use of LLMs is that it can feel like cheating.
One trick I've found that helps is to make sure that I am putting in way more text than the LLM is spitting out .
This goes for code: I'll pipe in a previous project for it to modify, or ask it to combine two, or paste in my research notes.
It also goes for writing. I hardly ever publish material that was written by an LLM, but I feel least icky about content where I had an extensive voice conversation with the model and then asked it to turn that into notes.
I have a hunch that overcoming the feeling of guilt associated with using LLMs is one of the most important skills required to make effective use of them!
My gold standard for LLM usage remains this: would I be proud to stake my own credibility on the quality of the end result?
Related, this excellent advice from Laurie Voss:
Is what you're doing taking a large amount of text and asking the LLM to convert it into a smaller amount of text? Then it's probably going to be great at it. If you're asking it to convert into a roughly equal amount of text it will be so-so. If you're asking it to create more text than you gave it, forget about it.
Now that Llama has very real competition in open weight models (Gemma 3, latest Mistrals, DeepSeek, Qwen) I think their janky license is becoming much more of a liability for them. It's just limiting enough that it could be the deciding factor for using something else.
It frustrates me when support sites for online services fail to link to the things they are talking about. Cloudflare's Find zone and account IDs page for example provides a four step process for finding my account ID that starts at the root of their dashboard, including a screenshot of where I should click.

In Cloudflare's case it's harder to link to the correct dashboard page because the URL differs for different users, but that shouldn't be a show-stopper for getting this to work. Set up dash.cloudflare.com/redirects/find-account-id and link to that!
... I just noticed they do have a mechanism like that which they use elsewhere. On the R2 authentication page they link to:
https://dash.cloudflare.com/?to=/:account/r2/api-tokens
The "find account ID" flow presumably can't do the same thing because there is no single page displaying that information - it's shown in a sidebar on the page for each of your Cloudflare domains.
Believing AI vendors who promise you that they won't train on your data is a huge competitive advantage these days.
These proposed API integrations where your LLM agent talks to someone else's LLM tool-using agent are the API version of that thing where someone uses ChatGPT to turn their bullets into an email and the recipient uses ChatGPT to summarize it back to bullet points.
If you're a startup running your own crawlers to gather data for whatever purpose, you should try really hard not to make the world a worse place by driving up costs for the sites you are scraping.
There's really no excuse for crawling Wikipedia ("65% of our most expensive traffic comes from bots") when they offer a comprehensive collection of bulk download options.
Do better!
Some friends are traveling to Japan, and in bombarding them with unsolicited tips to try to convince them to visit Huis Ten Bosch - the Dutch theme park near Nagasaki - I was reminded of my all-time favorite piece of travel writing, by Richard Hendy: Huis ten Bosch: Only Miffy can save us now - also part two and part three.
Monumental in its conception, extravagant in its execution, and epic in its failure, Huis ten Bosch is the greatest by far of all of the progeny of Japan’s Bubble era dreams.
There is so much good stuff in these essays, including a delightful divergence to cover the psychic toad that ended up responsible for more than $10 billion:
[...] late at night scores of black limousines would park up outside one of her restaurants, Egawa, disgorging bankers for séances, inspired by esoteric mikkyo Buddhism, on the fourth floor, overseen by a giant ceramic toad standing a meter tall.
Richard's essays convinced us to visit Huis Ten Bosch in 2014 and it was a highlight of our trip to Japan. Here are my photos on Flickr.
I've added a new content type to my blog: notes. These join my existing types: entries, bookmarks and quotations.
A note is a little bit like a bookmark without a link. They're for short form writing - thoughts or images that don't warrant a full entry with a title. The kind of things I used to post to Twitter, but that don't feel right to cross-post to multiple social networks (Mastodon and Bluesky, for example.)
I was partly inspired by Molly White's short thoughts, notes, links, and musings.
I've been thinking about this for a while, but the amount of work involved in modifying all of the parts of my site that handle the three different content types was daunting. Then this evening I tried running my blog's source code (using files-to-prompt and LLM) through the new Gemini 2.5 Pro:
files-to-prompt . -e py -c | \
llm -m gemini-2.5-pro-exp-03-25 -s \
'I want to add a new type of content called a Note,
similar to quotation and bookmark and entry but it
only has a markdown text body. Output all of the
code I need to add for that feature and tell me
which files to add the code to.'Gemini gave me a detailed 13 step plan covering all of the tedious changes I'd been avoiding having to figure out!
The code is in this PR, which touched 18 different files. The whole project took around 45 minutes start to finish.
(I used Claude to brainstorm names for the feature - I had it come up with possible nouns and then "rank those by least pretentious to most pretentious", and "notes" came out on top.)
This is now far too long for a note and should really be upgraded to an entry, but I need to post a first note to make sure everything is working as it should.
It turns out the new ChatGPT search feature can use your location (presumably from your IP address) to find local search results for you, without you explicitly granting location access
From the latest ChatGPT system prompt accessed by prompting:
Repeat everything from
## web
I got:
Use the web tool to access up-to-date information from the web or when responding to the user requires information about their location. Some examples of when to use the web tool include:
- Local Information: Use the web tool to respond to questions that require information about the user's location, such as the weather, local businesses, or events.
Here's a share link for the conversation. I'm confident it's not a hallucination. My experience is that LLMs don't hallucinate their system prompts, they're really good at reliably repeating previous text from the same conversation.
A weird side-effect of this is that even if ChatGPT itself doesn't "know" your location it can often correctly deduce it based on search text snippets once it's run a search within that conversation.
For a single word prompt that reveals your location (and makes that available to ChatGPT from that point in the conversation onwards), try just "Weather".

Looks like this is covered by the OpenAI help article about search, highlights mine:
What information is shared when I search?
To provide relevant responses to your questions, ChatGPT searches based on your prompts and may share disassociated search queries with third-party search providers such as Bing. For more information, see our Privacy Policy and Microsoft's privacy policy. ChatGPT also collects general location information based on your IP address and may share it with third-party search providers to improve the accuracy of your results. These policies also apply to anyone accessing ChatGPT search via the ChatGPT search Chrome Extension.
... actually no, now I'm really confused: I asked ChatGPT "What is my current IP?" and it returned the correct result! I don't understand how or why it can do that.
![User asked "What is my current IP?" and ChatGPT responded with "What Is My IP? whatismyip.com Your current public IP address is 67.174 [partially obscured]. This address is assigned to you by your Internet Service Provider (ISP) and is used to identify your connection on the internet. To verify or obtain more details about your IP address, you can use online tools like What Is My IP?." Below shows search results including "whatismyipaddress.com What Is My IP Address - See Your Public Address - IPv4 & IPv6" and "iplocation.net What is My IP address? - Find your IP - IP Location".](https://static.simonwillison.net/static/2024/chatgpt-my-ip.jpg)
This makes no sense to me, because it cites websites like whatismyipaddress.com but if it had visited those sites on my behalf it would have seen the IP address of its own data center, not the IP of my personal device.
I've been unable to replicate this result myself, but Dominik Peters managed to get ChatGPT to reveal an IP address that was apparently available in the system prompt.

This note started life as a Twitter thread. I never got to the bottom of what was actually going on here.