<?xml version="1.0" encoding="utf-8"?>
<feed xml:lang="en-us" xmlns="http://www.w3.org/2005/Atom"><title>Simon Willison's Weblog: http</title><link href="http://simonwillison.net/" rel="alternate"/><link href="http://simonwillison.net/tags/http.atom" rel="self"/><id>http://simonwillison.net/</id><updated>2026-02-27T17:50:54+00:00</updated><author><name>Simon Willison</name></author><entry><title>Unicode Explorer using binary search over fetch() HTTP range requests</title><link href="https://simonwillison.net/2026/Feb/27/unicode-explorer/#atom-tag" rel="alternate"/><published>2026-02-27T17:50:54+00:00</published><updated>2026-02-27T17:50:54+00:00</updated><id>https://simonwillison.net/2026/Feb/27/unicode-explorer/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://tools.simonwillison.net/unicode-binary-search"&gt;Unicode Explorer using binary search over fetch() HTTP range requests&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Here's a little prototype I built this morning from my phone as an experiment in HTTP range requests, and a general example of using LLMs to satisfy curiosity.&lt;/p&gt;
&lt;p&gt;I've been collecting &lt;a href="https://simonwillison.net/tags/http-range-requests/"&gt;HTTP range tricks&lt;/a&gt; for a while now, and I decided it would be fun to build something with them myself that used binary search against a large file to do something useful.&lt;/p&gt;
&lt;p&gt;So I &lt;a href="https://claude.ai/share/47860666-cb20-44b5-8cdb-d0ebe363384f"&gt;brainstormed with Claude&lt;/a&gt;. The challenge was coming up with a use case where the data could be naturally sorted in a way that would benefit from binary search.&lt;/p&gt;
&lt;p&gt;One of Claude's suggestions was looking up information about Unicode codepoints, which means searching through many megabytes of metadata.&lt;/p&gt;
&lt;p&gt;I had Claude write me a spec to feed to Claude Code - &lt;a href="https://github.com/simonw/research/pull/90#issue-4001466642"&gt;visible here&lt;/a&gt; - then kicked off an &lt;a href="https://simonwillison.net/2025/Nov/6/async-code-research/"&gt;asynchronous research project&lt;/a&gt; with Claude Code for web against my &lt;a href="https://github.com/simonw/research"&gt;simonw/research&lt;/a&gt; repo to turn that into working code.&lt;/p&gt;
&lt;p&gt;Here's the &lt;a href="https://github.com/simonw/research/tree/main/unicode-explorer-binary-search#readme"&gt;resulting report and code&lt;/a&gt;. One interesting thing I learned is that Range request tricks aren't compatible with HTTP compression because they mess with the byte offset calculations. I added &lt;code&gt;'Accept-Encoding': 'identity'&lt;/code&gt; to the &lt;code&gt;fetch()&lt;/code&gt; calls but this isn't actually necessary because Cloudflare and other CDNs automatically skip compression if a &lt;code&gt;content-range&lt;/code&gt; header is present.&lt;/p&gt;
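&lt;p&gt;To make the byte offset arithmetic concrete, here's a hedged sketch (not the actual code from the project) of a ranged &lt;code&gt;fetch()&lt;/code&gt; helper. The &lt;code&gt;parseContentRange&lt;/code&gt; helper name is my own invention for illustration; the &lt;code&gt;content-range&lt;/code&gt; response header is what tells you the total file size without downloading the whole thing:&lt;/p&gt;

```javascript
// Hedged sketch: fetch a byte range and parse the Content-Range response
// header, e.g. "bytes 1000-1999/76600000". parseContentRange is a
// hypothetical helper name, not part of the original tool.
function parseContentRange(header) {
  const m = /^bytes (\d+)-(\d+)\/(\d+)$/.exec(header);
  if (!m) throw new Error('Unexpected Content-Range: ' + header);
  return { start: Number(m[1]), end: Number(m[2]), total: Number(m[3]) };
}

async function fetchRange(url, start, end) {
  const res = await fetch(url, {
    headers: {
      Range: `bytes=${start}-${end}`,
      // Ask for uncompressed bytes so offsets line up; per the post,
      // CDNs typically skip compression for range responses anyway.
      'Accept-Encoding': 'identity',
    },
  });
  const { total } = parseContentRange(res.headers.get('Content-Range'));
  return { bytes: await res.text(), total };
}
```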
&lt;p&gt;I deployed the result &lt;a href="https://tools.simonwillison.net/unicode-binary-search"&gt;to my tools.simonwillison.net site&lt;/a&gt;, after first tweaking it to query the data via range requests against a CORS-enabled 76.6MB file in an S3 bucket fronted by Cloudflare.&lt;/p&gt;
&lt;p&gt;The demo is fun to play with - type in a single character like &lt;code&gt;ø&lt;/code&gt; or a hexadecimal codepoint indicator like &lt;code&gt;1F99C&lt;/code&gt; and it will binary search its way through the large file and show you the steps it takes along the way:&lt;/p&gt;
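&lt;p&gt;The core idea can be sketched like this - a binary search over byte offsets in a file of sorted, newline-delimited records. This is an illustrative self-contained version, not the deployed code: &lt;code&gt;readRange(start, end)&lt;/code&gt; stands in for a &lt;code&gt;fetch()&lt;/code&gt; call with a &lt;code&gt;Range&lt;/code&gt; header, and here it just slices an in-memory string:&lt;/p&gt;

```javascript
// Hedged sketch of binary search over a sorted "KEY;..." file using only
// small byte-range reads. readRange(start, end) is inclusive of both ends,
// standing in for a ranged fetch() against a remote file.
function makeReader(text) {
  return (start, end) => text.slice(start, end + 1);
}

// Find the full line that contains byte offset pos.
function lineAt(readRange, fileSize, pos) {
  let start = pos;
  while (start > 0 && readRange(start - 1, start - 1) !== '\n') start--;
  let end = pos;
  while (end < fileSize && readRange(end, end) !== '\n') end++;
  return { lineStart: start, line: readRange(start, end - 1) };
}

// Binary search the byte range [0, fileSize) for a line whose key matches.
function findLine(readRange, fileSize, key) {
  let lo = 0, hi = fileSize;
  while (lo < hi) {
    const mid = Math.floor((lo + hi) / 2);
    const { lineStart, line } = lineAt(readRange, fileSize, mid);
    const k = line.split(';')[0];
    if (k === key) return line;
    if (k < key) lo = lineStart + line.length + 1; // search after this line
    else hi = lineStart;                           // search before this line
  }
  return null; // key not present
}
```

&lt;p&gt;With a real remote file you'd fetch a small window around each probe instead of single bytes, which is why the demo reports only a few KB transferred per search.&lt;/p&gt;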
&lt;p&gt;&lt;img alt="Animated demo of a web tool called Unicode Explorer. I enter the ampersand character and hit Search. A box below shows a sequence of HTTP binary search requests made, finding in 17 steps with 3,864 bytes transferred and telling me that ampersand is U+0026 in Punctuation other, Basic Latin" src="https://static.simonwillison.net/static/2026/unicode-explore.gif" /&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/algorithms"&gt;algorithms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/http"&gt;http&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/research"&gt;research&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/tools"&gt;tools&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/unicode"&gt;unicode&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/vibe-coding"&gt;vibe-coding&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/http-range-requests"&gt;http-range-requests&lt;/a&gt;&lt;/p&gt;



</summary><category term="algorithms"/><category term="http"/><category term="research"/><category term="tools"/><category term="unicode"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="ai-assisted-programming"/><category term="vibe-coding"/><category term="http-range-requests"/></entry><entry><title>Introducing gisthost.github.io</title><link href="https://simonwillison.net/2026/Jan/1/gisthost/#atom-tag" rel="alternate"/><published>2026-01-01T22:12:20+00:00</published><updated>2026-01-01T22:12:20+00:00</updated><id>https://simonwillison.net/2026/Jan/1/gisthost/#atom-tag</id><summary type="html">
    &lt;p&gt;I am a huge fan of &lt;a href="https://gistpreview.github.io/"&gt;gistpreview.github.io&lt;/a&gt;, the site by Leon Huang that lets you append &lt;code&gt;?GIST_id&lt;/code&gt; to see a browser-rendered version of an HTML page that you have saved to a Gist. The last commit was ten years ago and I needed a couple of small changes so I've forked it and deployed an updated version at &lt;a href="https://gisthost.github.io/"&gt;gisthost.github.io&lt;/a&gt;.&lt;/p&gt;
&lt;h4 id="some-background-on-gistpreview"&gt;Some background on gistpreview&lt;/h4&gt;
&lt;p&gt;The genius thing about &lt;code&gt;gistpreview.github.io&lt;/code&gt; is that it's a core piece of GitHub infrastructure, hosted and cost-covered entirely by GitHub, that wasn't built with any involvement from GitHub at all.&lt;/p&gt;
&lt;p&gt;To understand how it works we need to first talk about Gists.&lt;/p&gt;
&lt;p&gt;Any file hosted in a &lt;a href="https://gist.github.com/"&gt;GitHub Gist&lt;/a&gt; can be accessed via a direct URL that looks like this:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;https://gist.githubusercontent.com/simonw/d168778e8e62f65886000f3f314d63e3/raw/79e58f90821aeb8b538116066311e7ca30c870c9/index.html&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;That URL is served with a few key HTTP headers:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;Content-Type: text/plain; charset=utf-8
X-Content-Type-Options: nosniff
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;These ensure that every file is treated by browsers as plain text, so an HTML file will not be rendered even by older browsers that attempt to guess the content type based on the content.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;Via: 1.1 varnish
Cache-Control: max-age=300
X-Served-By: cache-sjc1000085-SJC
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;These confirm that the file is served via GitHub's caching CDN, which means I don't feel guilty about linking to them in potentially high-traffic scenarios.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;Access-Control-Allow-Origin: *
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This is my favorite HTTP header! It means I can hit these files with a &lt;code&gt;fetch()&lt;/code&gt; call from any domain on the internet, which is fantastic for building &lt;a href="https://simonwillison.net/2025/Dec/10/html-tools/"&gt;HTML tools&lt;/a&gt; that do useful things with content hosted in a Gist.&lt;/p&gt;
&lt;p&gt;The one big catch is that Content-Type header. It means you can't use a Gist to serve HTML files that people can view.&lt;/p&gt;
&lt;p&gt;That's where &lt;code&gt;gistpreview&lt;/code&gt; comes in. The &lt;code&gt;gistpreview.github.io&lt;/code&gt; site belongs to the dedicated &lt;a href="https://github.com/gistpreview"&gt;gistpreview&lt;/a&gt; GitHub organization, and is served out of the &lt;a href="https://github.com/gistpreview/gistpreview.github.io"&gt;github.com/gistpreview/gistpreview.github.io&lt;/a&gt; repository by GitHub Pages.&lt;/p&gt;
&lt;p&gt;It's not much code. The key functionality is this snippet of JavaScript from &lt;a href="https://github.com/gistpreview/gistpreview.github.io/blob/master/main.js"&gt;main.js&lt;/a&gt;:&lt;/p&gt;
&lt;div class="highlight highlight-source-js"&gt;&lt;pre&gt;&lt;span class="pl-en"&gt;fetch&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;'https://api.github.com/gists/'&lt;/span&gt; &lt;span class="pl-c1"&gt;+&lt;/span&gt; &lt;span class="pl-s1"&gt;gistId&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;
&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;then&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-k"&gt;function&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;res&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
  &lt;span class="pl-k"&gt;return&lt;/span&gt; &lt;span class="pl-s1"&gt;res&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;json&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;then&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-k"&gt;function&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;body&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
    &lt;span class="pl-k"&gt;if&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;res&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;status&lt;/span&gt; &lt;span class="pl-c1"&gt;===&lt;/span&gt; &lt;span class="pl-c1"&gt;200&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
      &lt;span class="pl-k"&gt;return&lt;/span&gt; &lt;span class="pl-s1"&gt;body&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
    &lt;span class="pl-kos"&gt;}&lt;/span&gt;
    &lt;span class="pl-smi"&gt;console&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;log&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;res&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-s1"&gt;body&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt; &lt;span class="pl-c"&gt;// debug&lt;/span&gt;
    &lt;span class="pl-k"&gt;throw&lt;/span&gt; &lt;span class="pl-k"&gt;new&lt;/span&gt; &lt;span class="pl-v"&gt;Error&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;'Gist &amp;lt;strong&amp;gt;'&lt;/span&gt; &lt;span class="pl-c1"&gt;+&lt;/span&gt; &lt;span class="pl-s1"&gt;gistId&lt;/span&gt; &lt;span class="pl-c1"&gt;+&lt;/span&gt; &lt;span class="pl-s"&gt;'&amp;lt;/strong&amp;gt;, '&lt;/span&gt; &lt;span class="pl-c1"&gt;+&lt;/span&gt; &lt;span class="pl-s1"&gt;body&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;message&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;replace&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-pds"&gt;&lt;span class="pl-c1"&gt;/&lt;/span&gt;&lt;span class="pl-cce"&gt;\(&lt;/span&gt;.&lt;span class="pl-c1"&gt;*&lt;/span&gt;&lt;span class="pl-cce"&gt;\)&lt;/span&gt;&lt;span class="pl-c1"&gt;/&lt;/span&gt;&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-s"&gt;''&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;
&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;then&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-k"&gt;function&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;info&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
  &lt;span class="pl-k"&gt;if&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;fileName&lt;/span&gt; &lt;span class="pl-c1"&gt;===&lt;/span&gt; &lt;span class="pl-s"&gt;''&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
    &lt;span class="pl-k"&gt;for&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-k"&gt;var&lt;/span&gt; &lt;span class="pl-s1"&gt;file&lt;/span&gt; &lt;span class="pl-k"&gt;in&lt;/span&gt; &lt;span class="pl-s1"&gt;info&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;files&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
      &lt;span class="pl-c"&gt;// index.html or the first file&lt;/span&gt;
      &lt;span class="pl-k"&gt;if&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;fileName&lt;/span&gt; &lt;span class="pl-c1"&gt;===&lt;/span&gt; &lt;span class="pl-s"&gt;''&lt;/span&gt; &lt;span class="pl-c1"&gt;||&lt;/span&gt; &lt;span class="pl-s1"&gt;file&lt;/span&gt; &lt;span class="pl-c1"&gt;===&lt;/span&gt; &lt;span class="pl-s"&gt;'index.html'&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
        &lt;span class="pl-s1"&gt;fileName&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;file&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
      &lt;span class="pl-kos"&gt;}&lt;/span&gt;
    &lt;span class="pl-kos"&gt;}&lt;/span&gt;
  &lt;span class="pl-kos"&gt;}&lt;/span&gt;
  &lt;span class="pl-k"&gt;if&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;info&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;files&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;hasOwnProperty&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;fileName&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-c1"&gt;===&lt;/span&gt; &lt;span class="pl-c1"&gt;false&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
    &lt;span class="pl-k"&gt;throw&lt;/span&gt; &lt;span class="pl-k"&gt;new&lt;/span&gt; &lt;span class="pl-v"&gt;Error&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;'File &amp;lt;strong&amp;gt;'&lt;/span&gt; &lt;span class="pl-c1"&gt;+&lt;/span&gt; &lt;span class="pl-s1"&gt;fileName&lt;/span&gt; &lt;span class="pl-c1"&gt;+&lt;/span&gt; &lt;span class="pl-s"&gt;'&amp;lt;/strong&amp;gt; is not exist'&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-kos"&gt;}&lt;/span&gt;
  &lt;span class="pl-k"&gt;var&lt;/span&gt; &lt;span class="pl-s1"&gt;content&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;info&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;files&lt;/span&gt;&lt;span class="pl-kos"&gt;[&lt;/span&gt;&lt;span class="pl-s1"&gt;fileName&lt;/span&gt;&lt;span class="pl-kos"&gt;]&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;content&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-smi"&gt;document&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;write&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;content&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This chain of promises fetches the Gist content from the GitHub API, finds the section of that JSON corresponding to the requested file name and then outputs it to the page like this:&lt;/p&gt;
&lt;div class="highlight highlight-source-js"&gt;&lt;pre&gt;&lt;span class="pl-smi"&gt;document&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;write&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;content&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This is smart. Injecting the content using &lt;code&gt;document.body.innerHTML = content&lt;/code&gt; would fail to execute inline scripts. Using &lt;code&gt;document.write()&lt;/code&gt; causes the browser to treat the HTML as if it was directly part of the parent page.&lt;/p&gt;
&lt;p&gt;That's pretty much the whole trick! Read the Gist ID from the query string, fetch the content via the JSON API and &lt;code&gt;document.write()&lt;/code&gt; it into the page.&lt;/p&gt;
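&lt;p&gt;The file-selection step in that snippet can be condensed into a pure function - this is my own rephrasing of the logic above, not code from the repo:&lt;/p&gt;

```javascript
// Mirror the main.js logic: use the requested name if given, otherwise
// prefer index.html, falling back to the first file in the Gist.
function pickFile(files, requested) {
  let fileName = requested;
  if (fileName === '') {
    for (const file in files) {
      if (fileName === '' || file === 'index.html') fileName = file;
    }
  }
  if (!Object.prototype.hasOwnProperty.call(files, fileName)) {
    throw new Error('File ' + fileName + ' does not exist');
  }
  return fileName;
}
```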
&lt;p&gt;Here's a demo:&lt;/p&gt;
&lt;p&gt;&lt;a href="https://gistpreview.github.io/?d168778e8e62f65886000f3f314d63e3"&gt;https://gistpreview.github.io/?d168778e8e62f65886000f3f314d63e3&lt;/a&gt;&lt;/p&gt;
&lt;h4 id="fixes-for-gisthost-github-io"&gt;Fixes for gisthost.github.io&lt;/h4&gt;
&lt;p&gt;I forked &lt;code&gt;gistpreview&lt;/code&gt; to add two new features:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;A workaround for Substack mangling the URLs&lt;/li&gt;
&lt;li&gt;The ability to serve larger files that get truncated in the JSON API&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;I also removed some dependencies (jQuery and Bootstrap and an old &lt;code&gt;fetch()&lt;/code&gt; polyfill) and inlined the JavaScript into &lt;a href="https://github.com/gisthost/gisthost.github.io/blob/main/index.html"&gt;a single index.html file&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The Substack issue was small but frustrating. If you email out a link to a &lt;code&gt;gistpreview&lt;/code&gt; page via Substack it modifies the URL to look like this:&lt;/p&gt;
&lt;p&gt;&lt;a href="https://gistpreview.github.io/?f40971b693024fbe984a68b73cc283d2=&amp;amp;utm_source=substack&amp;amp;utm_medium=email"&gt;https://gistpreview.github.io/?f40971b693024fbe984a68b73cc283d2=&amp;amp;utm_source=substack&amp;amp;utm_medium=email&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;This breaks &lt;code&gt;gistpreview&lt;/code&gt; because it treats &lt;code&gt;f40971b693024fbe984a68b73cc283d2=&amp;amp;utm_source...&lt;/code&gt; as the Gist ID.&lt;/p&gt;
&lt;p&gt;The fix is to read everything up to that equals sign. I &lt;a href="https://github.com/gistpreview/gistpreview.github.io/pull/7"&gt;submitted a PR&lt;/a&gt; for that back in November.&lt;/p&gt;
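&lt;p&gt;The fix amounts to something like this - a hypothetical helper sketching the idea, since the actual PR may be structured differently:&lt;/p&gt;

```javascript
// Hedged sketch: extract the Gist ID from location.search, stopping at the
// first "=" or "&" so Substack's appended tracking parameters are ignored.
function gistIdFromQuery(search) {
  return search.replace(/^\?/, '').split('&')[0].split('=')[0];
}
```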
&lt;p&gt;The second issue around truncated files was &lt;a href="https://github.com/simonw/claude-code-transcripts/issues/26#issuecomment-3699668871"&gt;reported against my claude-code-transcripts project&lt;/a&gt; a few days ago.&lt;/p&gt;
&lt;p&gt;That project provides a CLI tool for exporting HTML rendered versions of Claude Code sessions. It includes a &lt;code&gt;--gist&lt;/code&gt; option which uses the &lt;code&gt;gh&lt;/code&gt; CLI tool to publish the resulting HTML to a Gist and returns a gistpreview URL that the user can share.&lt;/p&gt;
&lt;p&gt;These exports can get pretty big, and some of the resulting HTML was past the size limit of what comes back from the Gist API.&lt;/p&gt;
&lt;p&gt;As of &lt;a href="https://github.com/simonw/claude-code-transcripts/releases/tag/0.5"&gt;claude-code-transcripts 0.5&lt;/a&gt; the &lt;code&gt;--gist&lt;/code&gt; option now publishes to &lt;a href="https://gisthost.github.io/"&gt;gisthost.github.io&lt;/a&gt; instead, fixing both bugs.&lt;/p&gt;
&lt;p&gt;Here's &lt;a href="https://gisthost.github.io/?02ced545666128ce4206103df6185536"&gt;the Claude Code transcript&lt;/a&gt; that refactored Gist Host to remove those dependencies, which I published to Gist Host using the following command:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;uvx claude-code-transcripts web --gist
&lt;/code&gt;&lt;/pre&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/github"&gt;github&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/http"&gt;http&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/javascript"&gt;javascript&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/cors"&gt;cors&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="github"/><category term="http"/><category term="javascript"/><category term="projects"/><category term="ai-assisted-programming"/><category term="cors"/></entry><entry><title>YouTube embeds fail with a 153 error</title><link href="https://simonwillison.net/2025/Dec/1/youtube-embed-153-error/#atom-tag" rel="alternate"/><published>2025-12-01T05:26:23+00:00</published><updated>2025-12-01T05:26:23+00:00</updated><id>https://simonwillison.net/2025/Dec/1/youtube-embed-153-error/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/simonw/simonwillisonblog/issues/561"&gt;YouTube embeds fail with a 153 error&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;I just fixed this bug on my blog. I was getting an annoying "Error 153: Video player configuration error" on some of the YouTube video embeds (like &lt;a href="https://simonwillison.net/2024/Jun/21/search-based-rag/"&gt;this one&lt;/a&gt;) on this site. After some digging it turns out the culprit was this HTTP header, which Django's SecurityMiddleware was &lt;a href="https://docs.djangoproject.com/en/5.2/ref/middleware/#module-django.middleware.security"&gt;sending by default&lt;/a&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;Referrer-Policy: same-origin
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;YouTube's &lt;a href="https://developers.google.com/youtube/terms/required-minimum-functionality#embedded-player-api-client-identity"&gt;embedded player terms documentation&lt;/a&gt; explains why this broke:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;API Clients that use the YouTube embedded player (including the YouTube IFrame Player API) must provide identification through the &lt;code&gt;HTTP Referer&lt;/code&gt; request header. In some environments, the browser will automatically set &lt;code&gt;HTTP Referer&lt;/code&gt;, and API Clients need only ensure they are not setting the &lt;a href="https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Headers/Referrer-Policy"&gt;&lt;code&gt;Referrer-Policy&lt;/code&gt;&lt;/a&gt; in a way that suppresses the &lt;code&gt;Referer&lt;/code&gt; value. YouTube recommends using &lt;code&gt;strict-origin-when-cross-origin&lt;/code&gt; Referrer-Policy, which is already the default in many browsers.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The fix, which I &lt;a href="https://github.com/simonw/simonwillisonblog/pull/562"&gt;outsourced to GitHub Copilot agent&lt;/a&gt; since I was on my phone, was to add this to my &lt;code&gt;settings.py&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;SECURE_REFERRER_POLICY = "strict-origin-when-cross-origin"
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This &lt;a href="https://developer.chrome.com/blog/referrer-policy-new-chrome-default"&gt;explainer on the Chrome blog&lt;/a&gt; describes what the header means:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;strict-origin-when-cross-origin&lt;/code&gt; offers more privacy. With this policy, only the origin is sent in the Referer header of cross-origin requests.&lt;/p&gt;
&lt;p&gt;This prevents leaks of private data that may be accessible from other parts of the full URL such as the path and query string.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Effectively it means that any time you follow a link from my site to somewhere else they'll see this in the incoming HTTP headers even if you followed the link from a page other than my homepage:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;Referer: https://simonwillison.net/
&lt;/code&gt;&lt;/pre&gt;
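&lt;p&gt;A simplified model of what the two policies send - my own sketch, ignoring URL fragments and the other policy values:&lt;/p&gt;

```javascript
// Hedged model of the Referer value sent when navigating from fromUrl to
// toUrl under two Referrer-Policy values. Real browsers handle more cases.
function refererFor(policy, fromUrl, toUrl) {
  const from = new URL(fromUrl), to = new URL(toUrl);
  const full = from.origin + from.pathname + from.search;
  const sameOrigin = from.origin === to.origin;
  if (policy === 'same-origin') {
    return sameOrigin ? full : null; // nothing at all cross-origin
  }
  if (policy === 'strict-origin-when-cross-origin') {
    if (sameOrigin) return full;
    // Downgrades (https -> http) send nothing; otherwise origin only.
    if (from.protocol === 'https:' && to.protocol === 'http:') return null;
    return from.origin + '/';
  }
  throw new Error('policy not modeled: ' + policy);
}
```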
&lt;p&gt;The previous header, &lt;code&gt;same-origin&lt;/code&gt;, is &lt;a href="https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Headers/Referrer-Policy"&gt;explained by MDN here&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Send the &lt;a href="https://developer.mozilla.org/en-US/docs/Glossary/Origin"&gt;origin&lt;/a&gt;, path, and query string for &lt;a href="https://developer.mozilla.org/en-US/docs/Glossary/Same-origin_policy"&gt;same-origin&lt;/a&gt; requests. Don't send the &lt;a href="https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Headers/Referer"&gt;&lt;code&gt;Referer&lt;/code&gt;&lt;/a&gt; header for cross-origin requests.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This meant that previously traffic from my site wasn't sending any HTTP referer at all!&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/django"&gt;django&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/http"&gt;http&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/privacy"&gt;privacy&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/youtube"&gt;youtube&lt;/a&gt;&lt;/p&gt;



</summary><category term="django"/><category term="http"/><category term="privacy"/><category term="youtube"/></entry><entry><title>httpjail</title><link href="https://simonwillison.net/2025/Sep/19/httpjail/#atom-tag" rel="alternate"/><published>2025-09-19T21:57:29+00:00</published><updated>2025-09-19T21:57:29+00:00</updated><id>https://simonwillison.net/2025/Sep/19/httpjail/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/coder/httpjail"&gt;httpjail&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Here's a promising new (experimental) project in the sandboxing space from Ammar Bandukwala at &lt;a href="https://coder.com/"&gt;Coder&lt;/a&gt;. &lt;code&gt;httpjail&lt;/code&gt; provides a Rust CLI tool for running an individual process against a custom configured HTTP proxy.&lt;/p&gt;
&lt;p&gt;The initial goal is to help run coding agents like Claude Code and Codex CLI with extra rules governing how they interact with outside services. From Ammar's blog post that introduces the new tool, &lt;a href="https://ammar.io/blog/httpjail"&gt;Fine-grained HTTP filtering for Claude Code&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;httpjail&lt;/code&gt; implements an HTTP(S) interceptor alongside process-level network isolation. Under default configuration, all DNS (udp:53) is permitted and all other non-HTTP(S) traffic is blocked.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;httpjail&lt;/code&gt; rules are either JavaScript expressions or custom programs. This approach makes them far more flexible than traditional rule-oriented firewalls and avoids the learning curve of a DSL.&lt;/p&gt;
&lt;p&gt;Block all HTTP requests other than the LLM API traffic itself:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;$ httpjail --js "r.host === 'api.anthropic.com'" -- claude "build something great"
&lt;/code&gt;&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;I tried it out using OpenAI's Codex CLI instead and found this recipe worked:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;brew upgrade rust
cargo install httpjail # Drops it in `~/.cargo/bin`
httpjail --js "r.host === 'chatgpt.com'" -- codex
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Within that Codex instance the model ran fine but any attempts to access other URLs (e.g. telling it "&lt;code&gt;Use curl to fetch simonwillison.net&lt;/code&gt;") failed at the proxy layer.&lt;/p&gt;
&lt;p&gt;This is still at a really early stage but there's a lot I like about this project. Being able to use JavaScript to filter requests via the &lt;code&gt;--js&lt;/code&gt; option is neat (it's using V8 under the hood), and there's also a &lt;code&gt;--sh shellscript&lt;/code&gt; option which instead runs a shell program passing environment variables that can be used to determine if the request should be allowed.&lt;/p&gt;
&lt;p&gt;At a basic level it works by running a proxy server and setting &lt;code&gt;HTTP_PROXY&lt;/code&gt; and &lt;code&gt;HTTPS_PROXY&lt;/code&gt; environment variables so well-behaving software knows how to route requests.&lt;/p&gt;
&lt;p&gt;It can also add a bunch of other layers. On Linux it sets up &lt;a href="https://en.wikipedia.org/wiki/Nftables"&gt;nftables&lt;/a&gt; rules to explicitly deny additional network access. There's also a &lt;code&gt;--docker-run&lt;/code&gt; option which can launch a Docker container with the specified image but first locks that container down to only have network access to the &lt;code&gt;httpjail&lt;/code&gt; proxy server.&lt;/p&gt;
&lt;p&gt;It can intercept, filter and log HTTPS requests too by generating its own certificate and making that available to the underlying process.&lt;/p&gt;
&lt;p&gt;I'm always interested in new approaches to sandboxing, and fine-grained network access is a particularly tricky problem to solve. This looks like a very promising step in that direction - I'm looking forward to seeing how this project continues to evolve.&lt;/p&gt;

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://ammar.io/blog/httpjail"&gt;Fine-grained HTTP filtering for Claude Code&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/http"&gt;http&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/javascript"&gt;javascript&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/proxies"&gt;proxies&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sandboxing"&gt;sandboxing&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/security"&gt;security&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/v8"&gt;v8&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/rust"&gt;rust&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-code"&gt;claude-code&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/codex-cli"&gt;codex-cli&lt;/a&gt;&lt;/p&gt;



</summary><category term="http"/><category term="javascript"/><category term="proxies"/><category term="sandboxing"/><category term="security"/><category term="v8"/><category term="rust"/><category term="claude-code"/><category term="codex-cli"/></entry><entry><title>tidwall/pogocache</title><link href="https://simonwillison.net/2025/Jul/21/pogocache/#atom-tag" rel="alternate"/><published>2025-07-21T23:58:53+00:00</published><updated>2025-07-21T23:58:53+00:00</updated><id>https://simonwillison.net/2025/Jul/21/pogocache/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/tidwall/pogocache"&gt;tidwall/pogocache&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;New project from Josh Baker, author of the excellent &lt;code&gt;tg&lt;/code&gt; C geospatial library (&lt;a href="https://simonwillison.net/2023/Sep/23/tg-polygon-indexing/"&gt;covered previously&lt;/a&gt;) and various other &lt;a href="https://github.com/tidwall"&gt;interesting projects&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Pogocache is fast caching software built from scratch with a focus on low latency and CPU efficiency.&lt;/p&gt;
&lt;p&gt;Faster: Pogocache is faster than Memcache, Valkey, Redis, Dragonfly, and Garnet. It has the lowest latency per request, providing the quickest response times. It's optimized to scale from one to many cores, giving you the best single-threaded and multithreaded performance.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Faster than Memcache and Redis is a big claim! The README includes a &lt;a href="https://github.com/tidwall/pogocache/blob/main/README.md#design-details"&gt;design details&lt;/a&gt; section that explains how the system achieves that performance, using a sharded hashmap inspired by Josh's &lt;a href="https://github.com/tidwall/shardmap"&gt;shardmap&lt;/a&gt; project and clever application of threads.&lt;/p&gt;
&lt;p&gt;Performance aside, the most interesting thing about Pogocache is the server interface it provides: it emulates the APIs for Redis and Memcached, provides a simple HTTP API &lt;em&gt;and&lt;/em&gt; lets you talk to it over the PostgreSQL wire protocol as well!&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;psql -h localhost -p 9401
=&amp;gt; SET first Tom;
=&amp;gt; SET last Anderson;
=&amp;gt; SET age 37;

$ curl http://localhost:9401/last
Anderson
&lt;/code&gt;&lt;/pre&gt;

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://news.ycombinator.com/item?id=44638076"&gt;Show HN&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/c"&gt;c&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/caching"&gt;caching&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/http"&gt;http&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/memcached"&gt;memcached&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/postgresql"&gt;postgresql&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/redis"&gt;redis&lt;/a&gt;&lt;/p&gt;



</summary><category term="c"/><category term="caching"/><category term="http"/><category term="memcached"/><category term="postgresql"/><category term="redis"/></entry><entry><title>Some Go web dev notes</title><link href="https://simonwillison.net/2024/Sep/27/some-go-web-dev-notes/#atom-tag" rel="alternate"/><published>2024-09-27T23:43:31+00:00</published><updated>2024-09-27T23:43:31+00:00</updated><id>https://simonwillison.net/2024/Sep/27/some-go-web-dev-notes/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://jvns.ca/blog/2024/09/27/some-go-web-dev-notes/"&gt;Some Go web dev notes&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Julia Evans on writing small, self-contained web applications in Go:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;In general everything about it feels like it makes projects easy to work on for 5 days, abandon for 2 years, and then get back into writing code without a lot of problems.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Go 1.22 &lt;a href="https://go.dev/blog/routing-enhancements"&gt;introduced HTTP routing&lt;/a&gt; in February of this year, making it even more practical to build a web application using just the Go standard library.&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/go"&gt;go&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/http"&gt;http&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/web-development"&gt;web-development&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/julia-evans"&gt;julia-evans&lt;/a&gt;&lt;/p&gt;



</summary><category term="go"/><category term="http"/><category term="web-development"/><category term="julia-evans"/></entry><entry><title>How streaming LLM APIs work</title><link href="https://simonwillison.net/2024/Sep/22/how-streaming-llm-apis-work/#atom-tag" rel="alternate"/><published>2024-09-22T03:48:12+00:00</published><updated>2024-09-22T03:48:12+00:00</updated><id>https://simonwillison.net/2024/Sep/22/how-streaming-llm-apis-work/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://til.simonwillison.net/llms/streaming-llm-apis"&gt;How streaming LLM APIs work&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;New TIL. I used &lt;code&gt;curl&lt;/code&gt; to explore the streaming APIs provided by OpenAI, Anthropic and Google Gemini and wrote up detailed notes on what I learned.&lt;/p&gt;
&lt;p&gt;Also includes example code for &lt;a href="https://til.simonwillison.net/llms/streaming-llm-apis#user-content-bonus-accessing-these-streams-using-httpx"&gt;receiving streaming events in Python with HTTPX&lt;/a&gt; and &lt;a href="https://til.simonwillison.net/llms/streaming-llm-apis#user-content-bonus--2-processing-streaming-events-in-javascript-with-fetch"&gt;receiving streaming events in client-side JavaScript using fetch()&lt;/a&gt;.&lt;/p&gt;
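All three of those streaming APIs deliver their tokens as server-sent events. Here is a minimal sketch of parsing an SSE stream in Python using only the standard library; the payload shape is illustrative (loosely modeled on OpenAI-style chunks) and the stream is simulated with an in-memory list rather than a live API:

```python
import json

def iter_sse_data(lines):
    """Yield the data payload of each server-sent event.

    Events are separated by blank lines; multiple data: lines within
    one event are joined with newlines, per the SSE spec.
    """
    buffer = []
    for line in lines:
        if line == "":               # blank line terminates the event
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith("data:"):
            buffer.append(line[5:].lstrip())
    if buffer:                       # stream ended without a trailing blank line
        yield "\n".join(buffer)

# Simulated fragment of a streaming chat completion (shape is illustrative):
stream = [
    'data: {"choices": [{"delta": {"content": "Hello"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": " world"}}]}',
    "",
    "data: [DONE]",
    "",
]

text = ""
for payload in iter_sse_data(stream):
    if payload == "[DONE]":          # OpenAI-style end-of-stream sentinel
        break
    event = json.loads(payload)
    text += event["choices"][0]["delta"].get("content", "")

print(text)  # Hello world
```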


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apis"&gt;apis&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/http"&gt;http&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/json"&gt;json&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;&lt;/p&gt;



</summary><category term="apis"/><category term="http"/><category term="json"/><category term="llms"/></entry><entry><title>SQL Injection Isn't Dead: Smuggling Queries at the Protocol Level</title><link href="https://simonwillison.net/2024/Aug/12/smuggling-queries-at-the-protocol-level/#atom-tag" rel="alternate"/><published>2024-08-12T15:36:47+00:00</published><updated>2024-08-12T15:36:47+00:00</updated><id>https://simonwillison.net/2024/Aug/12/smuggling-queries-at-the-protocol-level/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://media.defcon.org/DEF%20CON%2032/DEF%20CON%2032%20presentations/DEF%20CON%2032%20-%20Paul%20Gerste%20-%20SQL%20Injection%20Isn%27t%20Dead%20Smuggling%20Queries%20at%20the%20Protocol%20Level.pdf"&gt;SQL Injection Isn&amp;#x27;t Dead: Smuggling Queries at the Protocol Level&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;PDF slides from a presentation by &lt;a href="https://twitter.com/pspaul95"&gt;Paul Gerste&lt;/a&gt; at DEF CON 32. It turns out some databases have vulnerabilities in their binary protocols that can be exploited by carefully crafted SQL queries.&lt;/p&gt;
&lt;p&gt;Paul demonstrates an attack against PostgreSQL (which works in some but not all of the PostgreSQL client libraries) which uses a message size overflow, by embedding a string longer than 4GB (2**32 bytes) which overflows the maximum length of a string in the underlying protocol and writes data to the subsequent value. He then shows a similar attack against MongoDB.&lt;/p&gt;
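The overflow mechanism is easy to demonstrate in miniature. This toy Python sketch uses a simplified length-prefixed framing, not PostgreSQL's actual message format, to show how a 32-bit length field wraps and lets the attacker's tail bytes be parsed as new messages:

```python
import struct

# Toy protocol: each message is a 4-byte big-endian length prefix
# followed by that many bytes of payload. PostgreSQL's wire protocol
# is similarly length-prefixed; this is a deliberately simplified model.

# A client that computes the length field with 32-bit arithmetic wraps
# around for 4GB+ strings: a (2**32 + 5)-byte payload declares a length
# of only 5 bytes.
huge_length = 2**32 + 5
declared = huge_length & 0xFFFFFFFF
assert declared == 5

# The server reads the declared 5 bytes as the payload, then interprets
# the attacker-controlled remainder as the *next* protocol message:
msg = struct.pack("!I", declared) + b"AAAAA" + b"<smuggled message bytes>"
(length,) = struct.unpack("!I", msg[:4])
payload, remainder = msg[4:4 + length], msg[4 + length:]
print(payload)    # b'AAAAA'
print(remainder)  # b'<smuggled message bytes>'
```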
&lt;p&gt;The current way to protect against these attacks is to ensure a size limit on incoming requests. This can be more difficult than you may expect - Paul points out that alternative paths such as WebSockets might bypass limits that are in place for regular HTTP requests, plus some servers may apply limits before decompression, allowing an attacker to send a compressed payload that is larger than the configured limit.&lt;/p&gt;
&lt;p&gt;&lt;img alt="How Web Apps Handle Large Payloads. Potential bypasses: - Unprotected endpoints - Compression - WebSockets (highlighted) - Alternate body types - Incrementation.  Next to WebSockets:  - Compression support - Large message size - Many filters don't apply" src="https://static.simonwillison.net/static/2024/sql-injection-websockets.jpg" /&gt;&lt;/p&gt;

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://lobste.rs/s/mxgp7v/sql_injection_isn_t_dead_smuggling"&gt;lobste.rs&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/http"&gt;http&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/mongodb"&gt;mongodb&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/postgresql"&gt;postgresql&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/security"&gt;security&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sql-injection"&gt;sql-injection&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/websockets"&gt;websockets&lt;/a&gt;&lt;/p&gt;



</summary><category term="http"/><category term="mongodb"/><category term="postgresql"/><category term="security"/><category term="sql-injection"/><category term="websockets"/></entry><entry><title>Cloudflare does not consider vary values in caching decisions</title><link href="https://simonwillison.net/2023/Nov/20/cloudflare-does-not-consider-vary-values-in-caching-decisions/#atom-tag" rel="alternate"/><published>2023-11-20T05:08:52+00:00</published><updated>2023-11-20T05:08:52+00:00</updated><id>https://simonwillison.net/2023/Nov/20/cloudflare-does-not-consider-vary-values-in-caching-decisions/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://developers.cloudflare.com/cache/concepts/cache-control/#other"&gt;Cloudflare does not consider vary values in caching decisions&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Here’s the spot in Cloudflare’s documentation where they hide a crucially important detail:&lt;/p&gt;

&lt;p&gt;“Cloudflare does not consider vary values in caching decisions. Nevertheless, vary values are respected when Vary for images is configured and when the vary header is vary: accept-encoding.”&lt;/p&gt;

&lt;p&gt;This means you can’t deploy an application that uses content negotiation via the Accept header behind the Cloudflare CDN—for example serving JSON or HTML for the same URL depending on the incoming Accept header. If you do, Cloudflare may serve cached JSON to an HTML client or vice-versa.&lt;/p&gt;
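That failure mode can be modeled in a few lines of Python. The cache below keys on URL alone and ignores both Accept and Vary; this is illustrative only, not Cloudflare's implementation:

```python
def origin(url, accept):
    # Content negotiation: same URL, different representation per Accept.
    if accept == "application/json":
        return '{"title": "Example"}'
    return "<h1>Example</h1>"

cache = {}

def cdn_fetch(url, accept):
    if url not in cache:          # cache key ignores Accept / Vary entirely
        cache[url] = origin(url, accept)
    return cache[url]

first = cdn_fetch("/page", "application/json")  # JSON client populates the cache
second = cdn_fetch("/page", "text/html")        # HTML client gets the cached JSON
print(first)
print(second)
```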

&lt;p&gt;There’s an exception for image files, which Cloudflare added support for in September 2021 (for Pro accounts only) in order to support formats such as WebP which may not have full support across all browsers.&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/caching"&gt;caching&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/http"&gt;http&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/cloudflare"&gt;cloudflare&lt;/a&gt;&lt;/p&gt;



</summary><category term="caching"/><category term="http"/><category term="cloudflare"/></entry><entry><title>See this page fetch itself, byte by byte, over TLS</title><link href="https://simonwillison.net/2023/May/10/see-this-page-fetch-itself-byte-by-byte-over-tls/#atom-tag" rel="alternate"/><published>2023-05-10T13:58:36+00:00</published><updated>2023-05-10T13:58:36+00:00</updated><id>https://simonwillison.net/2023/May/10/see-this-page-fetch-itself-byte-by-byte-over-tls/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://subtls.pages.dev/"&gt;See this page fetch itself, byte by byte, over TLS&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;George MacKerron built a TLS 1.3 library in TypeScript and used it to construct this amazing educational demo, which performs a full HTTPS request for its own source code over a WebSocket and displays an annotated byte-by-byte representation of the entire exchange. This is the most useful illustration of how HTTPS actually works that I’ve ever seen.&lt;/p&gt;

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://twitter.com/b0rk/status/1656287855612682240"&gt;Julia Evans&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/encryption"&gt;encryption&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/http"&gt;http&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/https"&gt;https&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/tls"&gt;tls&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/websockets"&gt;websockets&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/explorables"&gt;explorables&lt;/a&gt;&lt;/p&gt;



</summary><category term="encryption"/><category term="http"/><category term="https"/><category term="tls"/><category term="websockets"/><category term="explorables"/></entry><entry><title>urllib3 v2.0.0 is now generally available</title><link href="https://simonwillison.net/2023/Apr/26/urllib3/#atom-tag" rel="alternate"/><published>2023-04-26T22:00:16+00:00</published><updated>2023-04-26T22:00:16+00:00</updated><id>https://simonwillison.net/2023/Apr/26/urllib3/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://sethmlarson.dev/urllib3-2.0.0"&gt;urllib3 v2.0.0 is now generally available&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;urllib3 is 12 years old now, and is a common low-level dependency for packages like requests and httpx. The biggest new feature in v2 is a higher-level API: &lt;code&gt;resp = urllib3.request("GET", "https://example.com")&lt;/code&gt;—a very welcome addition to the library.&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/http"&gt;http&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;&lt;/p&gt;



</summary><category term="http"/><category term="python"/></entry><entry><title>RFC 7807: Problem Details for HTTP APIs</title><link href="https://simonwillison.net/2022/Nov/1/rfc-7807/#atom-tag" rel="alternate"/><published>2022-11-01T03:15:05+00:00</published><updated>2022-11-01T03:15:05+00:00</updated><id>https://simonwillison.net/2022/Nov/1/rfc-7807/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://datatracker.ietf.org/doc/draft-ietf-httpapi-rfc7807bis/"&gt;RFC 7807: Problem Details for HTTP APIs&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;This RFC has been brewing for quite a while, and is currently in last call (ends 2022-11-03). I’m designing the JSON error messages for Datasette at the moment so this could not be more relevant for me.&lt;/p&gt;
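The RFC registers an application/problem+json media type for structured error bodies. Here is a quick Python sketch of one, modeled on the RFC's own "out of credit" example (the type and instance URLs are placeholders from that example):

```python
import json

# An RFC 7807 "problem details" error body, following the spec's
# out-of-credit example. Servers send this with the Content-Type
# application/problem+json rather than plain application/json.
problem = {
    "type": "https://example.com/probs/out-of-credit",
    "title": "You do not have enough credit.",
    "status": 403,
    "detail": "Your current balance is 30, but that costs 50.",
    "instance": "/account/12345/msgs/abc",
}
content_type = "application/problem+json"
body = json.dumps(problem)
print(content_type)
print(body)
```

The `type` field identifies the problem category, while `detail` and `instance` describe this specific occurrence, which is what makes the format useful for machine-readable API errors.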

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://blog.frankel.ch/structured-errors-http-apis/"&gt;Nicolas Fränkel&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/errors"&gt;errors&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/http"&gt;http&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/json"&gt;json&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/mark-nottingham"&gt;mark-nottingham&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/rfc"&gt;rfc&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/standards"&gt;standards&lt;/a&gt;&lt;/p&gt;



</summary><category term="errors"/><category term="http"/><category term="json"/><category term="mark-nottingham"/><category term="rfc"/><category term="standards"/></entry><entry><title>Introducing sqlite-http: A SQLite extension for making HTTP requests</title><link href="https://simonwillison.net/2022/Aug/10/sqlite-http/#atom-tag" rel="alternate"/><published>2022-08-10T22:22:42+00:00</published><updated>2022-08-10T22:22:42+00:00</updated><id>https://simonwillison.net/2022/Aug/10/sqlite-http/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://observablehq.com/@asg017/introducing-sqlite-http"&gt;Introducing sqlite-http: A SQLite extension for making HTTP requests&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Characteristically thoughtful SQLite extension from Alex, following his sqlite-html extension from a few days ago. sqlite-http lets you make HTTP requests from SQLite—both as a SQL function that returns a string, and as a table-valued SQL function that lets you independently access the body, headers and even the timing data for the request.&lt;/p&gt;

&lt;p&gt;This write-up is excellent: it provides interactive demos but also shows how additional SQLite extensions such as the new-to-me “define” extension can be combined with sqlite-http to create custom functions for parsing and processing HTML.&lt;/p&gt;

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://twitter.com/agarcia_me/status/1557437368818249728"&gt;@agarcia_me&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/http"&gt;http&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sqlite"&gt;sqlite&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/alex-garcia"&gt;alex-garcia&lt;/a&gt;&lt;/p&gt;



</summary><category term="http"/><category term="sqlite"/><category term="alex-garcia"/></entry><entry><title>curlconverter.com</title><link href="https://simonwillison.net/2022/Mar/10/curlconvertercom/#atom-tag" rel="alternate"/><published>2022-03-10T20:12:44+00:00</published><updated>2022-03-10T20:12:44+00:00</updated><id>https://simonwillison.net/2022/Mar/10/curlconvertercom/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://curlconverter.com/"&gt;curlconverter.com&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;This is pretty magic: paste in a “curl” command (including the ones you get from browser devtools using copy-as-curl) and this will convert that into code for making the same HTTP request... using Python, JavaScript, PHP, R, Go, Rust, Elixir, Java, MATLAB, Ansible URI, Strest, Dart or JSON.&lt;/p&gt;

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://jvns.ca/blog/2022/03/10/how-to-use-undocumented-web-apis/"&gt;Julia Evans&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/curl"&gt;curl&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/http"&gt;http&lt;/a&gt;&lt;/p&gt;



</summary><category term="curl"/><category term="http"/></entry><entry><title>Hurl</title><link href="https://simonwillison.net/2021/Nov/22/hurl/#atom-tag" rel="alternate"/><published>2021-11-22T03:32:33+00:00</published><updated>2021-11-22T03:32:33+00:00</updated><id>https://simonwillison.net/2021/Nov/22/hurl/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/Orange-OpenSource/hurl"&gt;Hurl&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Hurl is “a command line tool that runs HTTP requests defined in a simple plain text format”—written in Rust on top of curl, it lets you run HTTP requests and then execute assertions against the response, defined using JSONPath or XPath for HTML. It can even assert that responses were returned within a specified duration.&lt;/p&gt;

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://twitter.com/humphd/status/1462594205629493254"&gt;@humphd&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/curl"&gt;curl&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/http"&gt;http&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/rust"&gt;rust&lt;/a&gt;&lt;/p&gt;



</summary><category term="curl"/><category term="http"/><category term="rust"/></entry><entry><title>New HTTP standards for caching on the modern web</title><link href="https://simonwillison.net/2021/Oct/21/new-http-standards-for-caching-on-the-modern-web/#atom-tag" rel="alternate"/><published>2021-10-21T22:40:50+00:00</published><updated>2021-10-21T22:40:50+00:00</updated><id>https://simonwillison.net/2021/Oct/21/new-http-standards-for-caching-on-the-modern-web/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://httptoolkit.tech/blog/status-targeted-caching-headers/`"&gt;New HTTP standards for caching on the modern web&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Cache-Status is a new HTTP header (RFC from August 2021) designed to provide better debugging information about which caches were involved in serving a request. “Cache-Status: Nginx; hit, Cloudflare; fwd=stale; fwd-status=304; collapsed; ttl=300”, for example, indicates that Nginx served a cache hit, then Cloudflare had a stale cached version so it revalidated against Nginx, got a 304 Not Modified, collapsed multiple concurrent requests (dogpile prevention) and plans to serve the new cached value for the next five minutes. Also described are targeted Cache-Control headers, which let a response address a specific CDN individually and are already supported by Cloudflare (Cloudflare-CDN-Cache-Control:) and Akamai (Akamai-Cache-Control:).&lt;/p&gt;
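Pulling that example header apart makes the semantics clearer. A rough Python sketch with a simplified split-based parser, not a full RFC 8941 structured-fields implementation:

```python
header = "Nginx; hit, Cloudflare; fwd=stale; fwd-status=304; collapsed; ttl=300"

def parse_cache_status(value):
    """Parse a Cache-Status header into (cache_name, params) pairs.

    Caches are comma-separated, listed closest-to-origin first; each
    cache's parameters are semicolon-separated, with bare parameters
    (like "hit") treated as boolean flags.
    """
    caches = []
    for member in value.split(","):
        parts = [p.strip() for p in member.split(";")]
        name, params = parts[0], {}
        for p in parts[1:]:
            key, _, val = p.partition("=")
            params[key] = val if val else True
        caches.append((name, params))
    return caches

parsed = parse_cache_status(header)
print(parsed)
```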

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://news.ycombinator.com/item?id=28930941"&gt;Hacker News&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/caching"&gt;caching&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/dogpile"&gt;dogpile&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/http"&gt;http&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/cloudflare"&gt;cloudflare&lt;/a&gt;&lt;/p&gt;



</summary><category term="caching"/><category term="dogpile"/><category term="http"/><category term="cloudflare"/></entry><entry><title>Weeknotes: Archiving coronavirus.data.gov.uk, custom pages and directory configuration in Datasette, photos-to-sqlite</title><link href="https://simonwillison.net/2020/Apr/29/weeknotes/#atom-tag" rel="alternate"/><published>2020-04-29T19:41:11+00:00</published><updated>2020-04-29T19:41:11+00:00</updated><id>https://simonwillison.net/2020/Apr/29/weeknotes/#atom-tag</id><summary type="html">
    &lt;p&gt;I mainly made progress on three projects this week: Datasette, photos-to-sqlite and a cleaner way of archiving data to a git repository.&lt;/p&gt;

&lt;h3&gt;Archiving coronavirus.data.gov.uk&lt;/h3&gt;

&lt;p&gt;The UK government have a new portal website sharing detailed Coronavirus data for regions around the country, at &lt;a href="https://coronavirus.data.gov.uk/"&gt;coronavirus.data.gov.uk&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;As with everything else built in 2020, it's a big single-page JavaScript app. Matthew Somerville &lt;a href="http://dracos.co.uk/wrote/coronavirus-dashboard/"&gt;investigated&lt;/a&gt; what it would take to build a much lighter (and faster loading) site displaying the same information by moving much of the rendering to the server.&lt;/p&gt;

&lt;p&gt;One of the best things about the SPA craze is that it strongly encourages structured data to be published as JSON files. Matthew's article inspired me to take a look, and sure enough the government figures are available in an extremely comprehensive (and 3.3MB in size) JSON file, available from &lt;a href="https://c19downloads.azureedge.net/downloads/data/data_latest.json"&gt;https://c19downloads.azureedge.net/downloads/data/data_latest.json&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Any time I see a file like this my first questions are how often does it change - and what kind of changes are being made to it?&lt;/p&gt;

&lt;p&gt;I've written about scraping to a git repository (see my new &lt;a href="https://simonwillison.net/tags/gitscraping/"&gt;gitscraping&lt;/a&gt; tag) a bunch in the past:&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;&lt;a href="https://simonwillison.net/2017/Sep/10/scraping-irma/"&gt;Scraping hurricane Irma&lt;/a&gt; - September 2017&lt;/li&gt;&lt;li&gt;&lt;a href="https://simonwillison.net/2017/Oct/10/fires-in-the-north-bay/"&gt;Changelogs to help understand the fires in the North Bay&lt;/a&gt; - October 2017&lt;/li&gt;&lt;li&gt;&lt;a href="https://simonwillison.net/2019/Mar/13/tree-history/"&gt;Generating a commit log for San Francisco’s official list of trees&lt;/a&gt; - March 2019&lt;/li&gt;&lt;li&gt;&lt;a href="https://simonwillison.net/2019/Oct/10/pge-outages/"&gt;Tracking PG&amp;amp;E outages by scraping to a git repo&lt;/a&gt; - October 2019&lt;/li&gt;&lt;li&gt;&lt;a href="https://simonwillison.net/2020/Jan/21/github-actions-cloud-run/"&gt;Deploying a data API using GitHub Actions and Cloud Run&lt;/a&gt; - January 2020&lt;/li&gt;&lt;/ul&gt;

&lt;p&gt;Now that I've figured out a really clean way to &lt;a href="https://github.com/simonw/til/blob/master/github-actions/commit-if-file-changed.md"&gt;Commit a file if it changed&lt;/a&gt; in a GitHub Action knocking out new versions of this pattern is really quick.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/simonw/coronavirus-data-gov-archive"&gt;simonw/coronavirus-data-gov-archive&lt;/a&gt; is my new repo that does exactly that: it periodically fetches the latest versions of the JSON data files powering that site and commits them if they have changed. The aim is to build a &lt;a href="https://github.com/simonw/coronavirus-data-gov-archive/commits/master/data_latest.json"&gt;commit history&lt;/a&gt; of changes made to the underlying data.&lt;/p&gt;

&lt;p&gt;The first implementation was extremely simple - here's the &lt;a href="https://github.com/simonw/coronavirus-data-gov-archive/blob/c83d69e95ec6400bf77d7b0d474e868baa78841e/.github/workflows/scheduled.yml"&gt;entire action&lt;/a&gt;:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;name: Fetch latest data

on:
  push:
  repository_dispatch:
  schedule:
    - cron: '25 * * * *'

jobs:
  scheduled:
    runs-on: ubuntu-latest
    steps:
    - name: Check out this repo
      uses: actions/checkout@v2
    - name: Fetch latest data
      run: |-
        curl https://c19downloads.azureedge.net/downloads/data/data_latest.json | jq . &amp;gt; data_latest.json
        curl https://c19pub.azureedge.net/utlas.geojson | gunzip | jq . &amp;gt; utlas.geojson
        curl https://c19pub.azureedge.net/countries.geojson | gunzip | jq . &amp;gt; countries.geojson
        curl https://c19pub.azureedge.net/regions.geojson | gunzip | jq . &amp;gt; regions.geojson
    - name: Commit and push if it changed
      run: |-
        git config user.name "Automated"
        git config user.email "actions@users.noreply.github.com"
        git add -A
        timestamp=$(date -u)
        git commit -m "Latest data: ${timestamp}" || exit 0
        git push&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;It uses a combination of &lt;code&gt;curl&lt;/code&gt; and &lt;code&gt;jq&lt;/code&gt; (both available &lt;a href="https://github.com/actions/virtual-environments/blob/master/images/linux/Ubuntu1804-README.md"&gt;in the default worker environment&lt;/a&gt;) to pull down the data and pretty-print it (better for readable diffs), then commits the result.&lt;/p&gt;

&lt;p&gt;Matthew Somerville &lt;a href="https://twitter.com/dracos/status/1255221799085846532"&gt;pointed out&lt;/a&gt; that inefficient polling sets a bad precedent. Here I'm hitting &lt;code&gt;azureedge.net&lt;/code&gt;, the Azure CDN, so that didn't particularly worry me - but since I want this pattern to be used widely it's good to provide a best-practice example.&lt;/p&gt;

&lt;p&gt;Figuring out the best way to make &lt;a href="https://developer.mozilla.org/en-US/docs/Web/HTTP/Conditional_requests"&gt;conditional get requests&lt;/a&gt; in a GitHub Action led me down &lt;a href="https://github.com/simonw/coronavirus-data-gov-archive/issues/1"&gt;something of a rabbit hole&lt;/a&gt;. I wanted to use &lt;a href="https://daniel.haxx.se/blog/2019/12/06/curl-speaks-etag/"&gt;curl's new ETag support&lt;/a&gt; but I ran into &lt;a href="https://github.com/curl/curl/issues/5309"&gt;a curl bug&lt;/a&gt;, so I ended up rolling a simple Python CLI tool called &lt;a href="https://github.com/simonw/conditional-get"&gt;conditional-get&lt;/a&gt; to solve my problem. In the time it took me to release that tool (just a few hours) a &lt;a href="https://github.com/curl/curl/issues/5309#issuecomment-621265179"&gt;new curl release&lt;/a&gt; came out with a fix for that bug!&lt;/p&gt;
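The conditional GET pattern itself is simple enough to sketch: save each response's ETag and send it back as If-None-Match, so an unchanged file costs a 304 instead of a full re-download. This Python sketch uses an in-memory stand-in for the server rather than real HTTP, so the names and payloads are purely illustrative:

```python
# In-memory "server": one resource with a current ETag and body.
SERVER = {"etag": '"v1"', "body": b'{"cases": 120}'}

def server_get(headers):
    """Return (status, etag, body), honoring If-None-Match."""
    if headers.get("If-None-Match") == SERVER["etag"]:
        return 304, SERVER["etag"], b""       # Not Modified: empty body
    return 200, SERVER["etag"], SERVER["body"]

saved_etags = {}

def conditional_get(url):
    headers = {}
    if url in saved_etags:                    # send the ETag we saved last time
        headers["If-None-Match"] = saved_etags[url]
    status, etag, body = server_get(headers)
    if status == 200:
        saved_etags[url] = etag               # remember the ETag for next time
    return status, body

print(conditional_get("/data_latest.json"))   # first fetch: 200 plus full body
print(conditional_get("/data_latest.json"))   # unchanged: 304, nothing downloaded
```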

&lt;p&gt;Here's &lt;a href="https://github.com/simonw/coronavirus-data-gov-archive/blob/a95d7661b236a9ee9a26a441dd948eb00308f919/.github/workflows/scheduled.yml"&gt;the workflow&lt;/a&gt; using my &lt;code&gt;conditional-get&lt;/code&gt; tool. See &lt;a href="https://github.com/simonw/coronavirus-data-gov-archive/issues/1"&gt;the issue thread&lt;/a&gt; for all of the other potential solutions, including a really neat &lt;a href="https://github.com/hubgit/curl-etag"&gt;Action shell-script solution&lt;/a&gt; by Alf Eaton.&lt;/p&gt;

&lt;p&gt;To my absolute delight, the project has already been forked once by Daniel Langer to &lt;a href="https://github.com/dlanger/coronavirus-hc-infobase-archive"&gt;capture Canadian Covid-19 cases&lt;/a&gt;!&lt;/p&gt;

&lt;h3 id="new-datasette-features"&gt;New Datasette features&lt;/h3&gt;

&lt;p&gt;I pushed two new features to &lt;a href="https://github.com/simonw/datasette"&gt;Datasette&lt;/a&gt; master, ready for release in 0.41.&lt;/p&gt;

&lt;h4&gt;Configuration directory mode&lt;/h4&gt;

&lt;p&gt;This is an idea I had while building &lt;a href="https://github.com/simonw/datasette-publish-now"&gt;datasette-publish-now&lt;/a&gt;. Datasette instances can be run with custom metadata, custom plugins and custom templates. I'm increasingly finding myself working on projects that run using something like this:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;$ datasette data1.db data2.db data3.db \
    --metadata=metadata.json \
    --template-dir=templates \
    --plugins-dir=plugins&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Directory configuration mode introduces the idea that Datasette can configure itself based on a directory layout. The above example can instead be handled by creating the following layout:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;my-project/data1.db
my-project/data2.db
my-project/data3.db
my-project/metadata.json
my-project/templates/index.html
my-project/plugins/custom_plugin.py&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Then run Datasette directly targeting that directory:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;$ datasette my-project/&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;See &lt;a href="https://github.com/simonw/datasette/issues/731"&gt;issue #731&lt;/a&gt; for more details. Directory configuration mode &lt;a href="https://datasette.readthedocs.io/en/latest/config.html#configuration-directory-mode"&gt;is documented here&lt;/a&gt;.&lt;/p&gt;

&lt;h4&gt;Define custom pages using templates/pages&lt;/h4&gt;

&lt;p&gt;In &lt;a href="https://simonwillison.net/2019/Nov/25/niche-museums/"&gt;niche-museums.com, powered by Datasette&lt;/a&gt; I described how I built the &lt;a href="https://www.niche-museums.com/"&gt;www.niche-museums.com&lt;/a&gt; website as a heavily customized Datasette instance.&lt;/p&gt;

&lt;p&gt;That site has &lt;a href="https://www.niche-museums.com/about"&gt;/about&lt;/a&gt; and &lt;a href="https://www.niche-museums.com/map"&gt;/map&lt;/a&gt; pages which are served by custom templates - but I had to do some gnarly hacks with empty &lt;code&gt;about.db&lt;/code&gt; and &lt;code&gt;map.db&lt;/code&gt; files to get them to work.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/simonw/datasette/issues/648"&gt;Issue #648&lt;/a&gt; introduces a new mechanism for creating this kind of page: create a &lt;code&gt;templates/pages/map.html&lt;/code&gt; template file and custom 404 handling code will ensure that any hits to &lt;code&gt;/map&lt;/code&gt; serve the rendered contents of that template.&lt;/p&gt;

&lt;p&gt;This could work really well with the &lt;a href="https://github.com/simonw/datasette-template-sql"&gt;datasette-template-sql&lt;/a&gt; plugin, which allows templates to execute arbitrary SQL queries (à la PHP or ColdFusion).&lt;/p&gt;

&lt;p&gt;Here's the new &lt;a href="https://datasette.readthedocs.io/en/latest/custom_templates.html#custom-pages"&gt;documentation on custom pages&lt;/a&gt;, including details of how to use the new &lt;code&gt;custom_status()&lt;/code&gt;, &lt;code&gt;custom_header()&lt;/code&gt; and &lt;code&gt;custom_redirect()&lt;/code&gt; template functions to go beyond just returning HTML.&lt;/p&gt;

&lt;h3&gt;photos-to-sqlite&lt;/h3&gt;

&lt;p&gt;My &lt;a href="https://dogsheep.github.io/"&gt;Dogsheep&lt;/a&gt; personal analytics project brings my &lt;a href="https://github.com/dogsheep/twitter-to-sqlite"&gt;tweets&lt;/a&gt;, &lt;a href="https://github.com/dogsheep/github-to-sqlite"&gt;GitHub activity&lt;/a&gt;, &lt;a href="https://github.com/dogsheep/swarm-to-sqlite"&gt;Swarm checkins&lt;/a&gt; and more together in one place. But the big missing feature is my photos.&lt;/p&gt;

&lt;p&gt;As-of yesterday, I have 39,000 photos from Apple Photos uploaded to an S3 bucket using my new &lt;a href="https://github.com/dogsheep/photos-to-sqlite/"&gt;photos-to-sqlite&lt;/a&gt; tool. I can run the following SQL query and get back ten random photos!&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;select
  json_object(
    'img_src',
    'https://photos.simonwillison.net/i/' || 
    sha256 || '.' || ext || '?w=400'
  ),
  filepath,
  ext
from
  photos
where
  ext in ('jpeg', 'jpg', 'heic')
order by
  random()
limit
  10&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;code&gt;photos.simonwillison.net&lt;/code&gt; is running a modified version of my &lt;a href="https://github.com/simonw/heic-to-jpeg"&gt;heic-to-jpeg&lt;/a&gt; image converting and resizing proxy, which I'll release at some point soon.&lt;/p&gt;

&lt;p&gt;There's still plenty of work to do - I still need to import EXIF data (including locations) into SQLite, and I plan to use &lt;a href="https://github.com/RhetTbull/osxphotos"&gt;osxphotos&lt;/a&gt; to export additional metadata from my Apple Photos library. But this week it went from a pure research project to something I can actually start using, which is exciting.&lt;/p&gt;

&lt;h3&gt;TIL this week&lt;/h3&gt;

&lt;ul&gt;&lt;li&gt;&lt;a href="https://github.com/simonw/til/blob/master/macos/fixing-compinit-insecure-directories.md"&gt;Fixing "compinit: insecure directories" error&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="https://github.com/simonw/til/blob/master/tailscale/lock-down-sshd.md"&gt;Restricting SSH connections to devices within a Tailscale network&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="https://github.com/simonw/til/blob/master/python/generate-nested-json-summary.md"&gt;Generated a summary of nested JSON data&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="https://github.com/simonw/til/blob/master/pytest/session-scoped-tmp.md"&gt;Session-scoped temporary directories in pytest&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="https://github.com/simonw/til/blob/master/pytest/mock-httpx.md"&gt;How to mock httpx using pytest-mock&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt;

&lt;p&gt;Generated using &lt;a href="https://til.simonwillison.net/til?sql=select+json_object(%27pre%27%2C+group_concat(%27*+[%27+||+title+||+%27](%27+||+url+||+%27)%27%2C+%27%0D%0A%27))+from+til+where+%22created_utc%22+%3E%3D+%3Ap0+order+by+updated_utc+desc+limit+101&amp;amp;p0=2020-04-23"&gt;this query&lt;/a&gt;.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/git"&gt;git&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/http"&gt;http&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/matthew-somerville"&gt;matthew-somerville&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/photos"&gt;photos&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/weeknotes"&gt;weeknotes&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/covid19"&gt;covid19&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/git-scraping"&gt;git-scraping&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="git"/><category term="http"/><category term="matthew-somerville"/><category term="photos"/><category term="projects"/><category term="datasette"/><category term="weeknotes"/><category term="covid19"/><category term="git-scraping"/></entry><entry><title>Async Support - HTTPX</title><link href="https://simonwillison.net/2020/Jan/10/httpx/#atom-tag" rel="alternate"/><published>2020-01-10T04:49:59+00:00</published><updated>2020-01-10T04:49:59+00:00</updated><id>https://simonwillison.net/2020/Jan/10/httpx/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.python-httpx.org/async/"&gt;Async Support - HTTPX&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
HTTPX is the new async-friendly HTTP library for Python spearheaded by Tom Christie. It works in both async and non-async mode with an API very similar to requests. The async support is particularly interesting - it's a really clean API, and now that Jupyter supports top-level await you can run &lt;code&gt;(await httpx.AsyncClient().get(url)).text&lt;/code&gt; directly in a cell and get back the response. Most excitingly, the library lets you pass an ASGI app directly to the client and then perform requests against it - ideal for unit tests.

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://twitter.com/_tomchristie/status/1215240517962870784"&gt;@_tomchristie&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/async"&gt;async&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/http"&gt;http&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/asgi"&gt;asgi&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/tom-christie"&gt;tom-christie&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/httpx"&gt;httpx&lt;/a&gt;&lt;/p&gt;



</summary><category term="async"/><category term="http"/><category term="python"/><category term="asgi"/><category term="tom-christie"/><category term="httpx"/></entry><entry><title>Usage of ARIA attributes via HTTP Archive</title><link href="https://simonwillison.net/2018/Jul/12/usage-aria-attributes-http-archive/#atom-tag" rel="alternate"/><published>2018-07-12T03:16:26+00:00</published><updated>2018-07-12T03:16:26+00:00</updated><id>https://simonwillison.net/2018/Jul/12/usage-aria-attributes-http-archive/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://discuss.httparchive.org/t/usage-of-aria-attributes/778"&gt;Usage of ARIA attributes via HTTP Archive&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
A neat example of a Google BigQuery query you can run against the HTTP Archive public dataset (a crawl of the “top” websites run periodically by the Internet Archive, which captures the full details of every resource fetched) to see which ARIA attributes are used the most often. Linking to this because I used it successfully today as the basis for my own custom query—I love that it’s possible to analyze a huge representative sample of the modern web in this way.
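
&lt;p&gt;The underlying idea - count occurrences of each &lt;code&gt;aria-*&lt;/code&gt; attribute across a corpus of pages - can be sketched in miniature. The real analysis is BigQuery SQL over HTTP Archive response bodies; the sample markup fragments here are invented:&lt;/p&gt;

```python
import re
from collections import Counter

# Invented markup fragments standing in for crawled response bodies.
pages = [
    'div aria-hidden="true" ... button aria-label="Close"',
    'nav aria-label="Main" ... a aria-current="page"',
    'span aria-hidden="true"',
]

# Tally every aria-* attribute name across the corpus.
counts = Counter()
for body in pages:
    counts.update(m.lower() for m in re.findall(r'(aria-[a-z]+)=', body))

print(counts.most_common())
```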


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/aria"&gt;aria&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/http"&gt;http&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/internet-archive"&gt;internet-archive&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/big-data"&gt;big-data&lt;/a&gt;&lt;/p&gt;



</summary><category term="aria"/><category term="http"/><category term="internet-archive"/><category term="big-data"/></entry><entry><title>How Balanced does Database Migrations with Zero-Downtime</title><link href="https://simonwillison.net/2017/Nov/7/how-balanced-does-database-migrations-with-zero-downtime/#atom-tag" rel="alternate"/><published>2017-11-07T11:36:25+00:00</published><updated>2017-11-07T11:36:25+00:00</updated><id>https://simonwillison.net/2017/Nov/7/how-balanced-does-database-migrations-with-zero-downtime/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://blog.balancedpayments.com/payments-infrastructure-suspending-traffic-zero-downtime-migrations/"&gt;How Balanced does Database Migrations with Zero-Downtime&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
I’m fascinated by the idea of “pausing” traffic during a blocking site maintenance activity (like a database migration) and then un-pausing when the operation is complete—so end clients just see some of their requests taking a few seconds longer than expected. I first saw this trick described by Braintree. Balanced wrote about a neat way of doing this just using HAproxy, which lets you live reconfigure the maxconns to your backend down to zero (causing traffic to be queued up) and then bring the setting back up again a few seconds later to un-pause those requests.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/haproxy"&gt;haproxy&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/highavailability"&gt;highavailability&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/http"&gt;http&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/migrations"&gt;migrations&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/scaling"&gt;scaling&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/zero-downtime"&gt;zero-downtime&lt;/a&gt;&lt;/p&gt;



</summary><category term="haproxy"/><category term="highavailability"/><category term="http"/><category term="migrations"/><category term="scaling"/><category term="zero-downtime"/></entry><entry><title>Whether 404 custom error page necessary for a website?</title><link href="https://simonwillison.net/2014/Jan/3/whether-404-custom-error/#atom-tag" rel="alternate"/><published>2014-01-03T13:14:00+00:00</published><updated>2014-01-03T13:14:00+00:00</updated><id>https://simonwillison.net/2014/Jan/3/whether-404-custom-error/#atom-tag</id><summary type="html">
    &lt;p&gt;&lt;em&gt;My answer to &lt;a href="https://www.quora.com/Whether-404-custom-error-page-necessary-for-a-website/answer/Simon-Willison"&gt;Whether 404 custom error page necessary for a website?&lt;/a&gt; on Quora&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;They aren't required, but if you don't have a custom 404 page you're missing out on a very easy way of improving the user experience of your site, and protecting against expired or incorrect links from elsewhere on the web.&lt;/p&gt;

&lt;p&gt;Even just a search box and a link to your homepage is enough to ensure visitors who arrive on a 404 can still visit the rest of your site, and hopefully find what they were looking for when they clicked on the link.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/http"&gt;http&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/seo"&gt;seo&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/quora"&gt;quora&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="http"/><category term="seo"/><category term="quora"/></entry><entry><title>What will HTTP be superseded by?</title><link href="https://simonwillison.net/2012/Dec/26/what-will-http-be/#atom-tag" rel="alternate"/><published>2012-12-26T12:28:00+00:00</published><updated>2012-12-26T12:28:00+00:00</updated><id>https://simonwillison.net/2012/Dec/26/what-will-http-be/#atom-tag</id><summary type="html">
    &lt;p&gt;&lt;em&gt;My answer to &lt;a href="https://www.quora.com/What-will-HTTP-be-superseded-by/answer/Simon-Willison"&gt;What will HTTP be superseded by?&lt;/a&gt; on Quora&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;HTTP 1.x will likely never be completely replaced, but there is ongoing work at the moment to define HTTP 2.0. The first draft of this was released in November and is based on Google's SPDY protocol, which is already widely deployed in Google Chrome and Google's web properties (other browsers have experimented with support for SPDY as well): &lt;span&gt;&lt;a href="http://en.m.wikipedia.org/wiki/HTTP_2.0"&gt;http://en.m.wikipedia.org/wiki/H...&lt;/a&gt;&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;One thing that looks pretty likely is that any replacement will only work over SSL - not just to improve privacy and security on the web, but also because this is the most reliable way to avoid breaking all of the legacy proxy servers already deployed around the net.&lt;/p&gt;

    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/http"&gt;http&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/internet"&gt;internet&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/web-development"&gt;web-development&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/quora"&gt;quora&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="http"/><category term="internet"/><category term="web-development"/><category term="quora"/></entry><entry><title>How can I download a web server's directory and all subdirectories with one command?</title><link href="https://simonwillison.net/2012/Jan/15/how-can-i-download/#atom-tag" rel="alternate"/><published>2012-01-15T18:55:00+00:00</published><updated>2012-01-15T18:55:00+00:00</updated><id>https://simonwillison.net/2012/Jan/15/how-can-i-download/#atom-tag</id><summary type="html">
    &lt;p&gt;&lt;em&gt;My answer to &lt;a href="https://www.quora.com/How-can-I-download-a-web-servers-directory-and-all-subdirectories-with-one-command/answer/Simon-Willison"&gt;How can I download a web server&amp;#39;s directory and all subdirectories with one command?&lt;/a&gt; on Quora&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Use wget (you can install it with &lt;code&gt;apt-get install wget&lt;/code&gt;):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;$ wget --recursive http://example.com&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;That will create a directory called &lt;span&gt;&lt;a href="http://example.com"&gt;example.com&lt;/a&gt;&lt;/span&gt; and put the mirrored downloaded files in the right sub-directories inside it.&lt;/p&gt;

&lt;p&gt;If you just want to download a subdirectory, do this:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;$ wget --recursive http://example.com/subdirectory --no-parent&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The &lt;code&gt;--no-parent&lt;/code&gt; option ensures wget won't follow links up to parent directories of the one you want to download.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/http"&gt;http&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/linux"&gt;linux&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ubuntu"&gt;ubuntu&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/quora"&gt;quora&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="http"/><category term="linux"/><category term="ubuntu"/><category term="quora"/></entry><entry><title>What are the best practices in Node.js to communicate with an existing Java backend?</title><link href="https://simonwillison.net/2011/Dec/8/what-are-the-best/#atom-tag" rel="alternate"/><published>2011-12-08T12:53:00+00:00</published><updated>2011-12-08T12:53:00+00:00</updated><id>https://simonwillison.net/2011/Dec/8/what-are-the-best/#atom-tag</id><summary type="html">
    &lt;p&gt;&lt;em&gt;My answer to &lt;a href="https://www.quora.com/What-are-the-best-practices-in-Node-js-to-communicate-with-an-existing-Java-backend/answer/Simon-Willison"&gt;What are the best practices in Node.js to communicate with an existing Java backend?&lt;/a&gt; on Quora&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Node speaks HTTP extremely well, and using HTTP means you can do things like put an HTTP load balancer or cache (such as varnish) between Node and your Java application server at a later date.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/http"&gt;http&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/nodejs"&gt;nodejs&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/quora"&gt;quora&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="http"/><category term="nodejs"/><category term="quora"/></entry><entry><title>Quoting Dan Manges</title><link href="https://simonwillison.net/2011/Jun/30/braintree/#atom-tag" rel="alternate"/><published>2011-06-30T21:27:00+00:00</published><updated>2011-06-30T21:27:00+00:00</updated><id>https://simonwillison.net/2011/Jun/30/braintree/#atom-tag</id><summary type="html">
    &lt;blockquote cite="http://www.braintreepayments.com/inside-braintree/how-we-built-the-software-that-processes-billions-in-payments"&gt;&lt;p&gt;We can deploy new versions of our software, make database schema changes, or even rotate our primary database server, all without failing to respond to a single request. We can accomplish this because we gave ourselves the ability to suspend our traffic, which gives us a window of a few seconds to make some changes before letting the requests through. To make this happen, we built a custom HTTP server and application dispatching infrastructure around Python’s Tornado and Redis.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="http://www.braintreepayments.com/inside-braintree/how-we-built-the-software-that-processes-billions-in-payments"&gt;Dan Manges&lt;/a&gt;, Braintree&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/deployment"&gt;deployment&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/http"&gt;http&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/redis"&gt;redis&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/tornado"&gt;tornado&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/recovered"&gt;recovered&lt;/a&gt;&lt;/p&gt;



</summary><category term="deployment"/><category term="http"/><category term="redis"/><category term="tornado"/><category term="recovered"/></entry><entry><title>On HTTP Load Testing</title><link href="https://simonwillison.net/2011/May/18/loadtesting/#atom-tag" rel="alternate"/><published>2011-05-18T10:17:00+00:00</published><updated>2011-05-18T10:17:00+00:00</updated><id>https://simonwillison.net/2011/May/18/loadtesting/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://www.mnot.net/blog/2011/05/18/http_benchmark_rules"&gt;On HTTP Load Testing&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Mark Nottingham explains that running good HTTP benchmarks means understanding available network bandwidth, using dedicated physical hardware, testing at progressively higher loads and a whole lot more.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/http"&gt;http&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/mark-nottingham"&gt;mark-nottingham&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/recovered"&gt;recovered&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/load-testing"&gt;load-testing&lt;/a&gt;&lt;/p&gt;



</summary><category term="http"/><category term="mark-nottingham"/><category term="recovered"/><category term="load-testing"/></entry><entry><title>The Inside Story of How Facebook Responded to Tunisian Hacks</title><link href="https://simonwillison.net/2011/Jan/24/tunisia/#atom-tag" rel="alternate"/><published>2011-01-24T18:06:00+00:00</published><updated>2011-01-24T18:06:00+00:00</updated><id>https://simonwillison.net/2011/Jan/24/tunisia/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://www.theatlantic.com/technology/archive/2011/01/the-inside-story-of-how-facebook-responded-to-tunisian-hacks/70044/"&gt;The Inside Story of How Facebook Responded to Tunisian Hacks&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
“By January 5, it was clear that an entire country’s worth of passwords were in the process of being stolen right in the midst of the greatest political upheaval in two decades.”—which is why you shouldn’t serve your login form over HTTP even though it POSTs over HTTPS.

    &lt;p&gt;&lt;small&gt;Via &lt;a href="http://radar.oreilly.com/2011/01/four-short-links-24-january-20.html?utm_source=feedburner&amp;amp;utm_medium=feed&amp;amp;utm_campaign=Feed%3A oreilly%2Fradar%2Fatom %28O%27Reilly Radar%29"&gt;O&amp;#x27;Reilly Radar&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/facebook"&gt;facebook&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/http"&gt;http&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/https"&gt;https&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/security"&gt;security&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/recovered"&gt;recovered&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/tunisia"&gt;tunisia&lt;/a&gt;&lt;/p&gt;



</summary><category term="facebook"/><category term="http"/><category term="https"/><category term="security"/><category term="recovered"/><category term="tunisia"/></entry><entry><title>gzip support for Amazon Web Services CloudFront</title><link href="https://simonwillison.net/2010/Nov/12/gzip/#atom-tag" rel="alternate"/><published>2010-11-12T05:33:00+00:00</published><updated>2010-11-12T05:33:00+00:00</updated><id>https://simonwillison.net/2010/Nov/12/gzip/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://www.nomitor.com/blog/2010/11/10/gzip-support-for-amazon-web-services-cloudfront/"&gt;gzip support for Amazon Web Services CloudFront&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
This would have saved me a bunch of work a few weeks ago. CloudFront can now be pointed at your own web server rather than S3, and you can ask it to forward on the Accept-Encoding header and cache multiple content versions based on the result.
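
&lt;p&gt;The key detail is caching a separate variant per &lt;code&gt;Accept-Encoding&lt;/code&gt; result - sketched here in miniature with an invented toy cache, not CloudFront's actual logic:&lt;/p&gt;

```python
import gzip

# Miniature of a CDN caching separate variants keyed on Accept-Encoding.
cache = {}

def fetch(url, accept_encoding=""):
    variant = "gzip" if "gzip" in accept_encoding else "identity"
    key = (url, variant)
    if key not in cache:                 # cache miss: go to the origin
        body = b"hello world " * 3       # pretend origin response
        if variant == "gzip":
            body = gzip.compress(body)
        cache[key] = body
    return cache[key]

plain = fetch("/page")
zipped = fetch("/page", accept_encoding="gzip")
# Two cache entries now exist for the same URL, one per encoding.
print(len(cache), gzip.decompress(zipped) == plain)
```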


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/cloudfront"&gt;cloudfront&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/gzip"&gt;gzip&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/http"&gt;http&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/recovered"&gt;recovered&lt;/a&gt;&lt;/p&gt;



</summary><category term="cloudfront"/><category term="gzip"/><category term="http"/><category term="recovered"/></entry><entry><title>LWPx::ParanoidAgent</title><link href="https://simonwillison.net/2010/Aug/31/paranoidagent/#atom-tag" rel="alternate"/><published>2010-08-31T02:30:00+00:00</published><updated>2010-08-31T02:30:00+00:00</updated><id>https://simonwillison.net/2010/Aug/31/paranoidagent/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://search.cpan.org/dist/LWPx-ParanoidAgent/lib/LWPx/ParanoidAgent.pm"&gt;LWPx::ParanoidAgent&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Every programming language needs an equivalent of this library—a robust, secure way to make HTTP requests against URLs from untrusted sources without risk of tarpits, internal network access, socket starvation, weird server errors, or other nastiness.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/http"&gt;http&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/perl"&gt;perl&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/recovered"&gt;recovered&lt;/a&gt;&lt;/p&gt;



</summary><category term="http"/><category term="perl"/><category term="recovered"/></entry><entry><title>nodejitsu's node-http-proxy</title><link href="https://simonwillison.net/2010/Jul/28/nodejitsus/#atom-tag" rel="alternate"/><published>2010-07-28T23:34:00+00:00</published><updated>2010-07-28T23:34:00+00:00</updated><id>https://simonwillison.net/2010/Jul/28/nodejitsus/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://github.com/nodejitsu/node-http-proxy"&gt;nodejitsu&amp;#x27;s node-http-proxy&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Exactly what I’ve been waiting for—a robust HTTP proxy library for Node that makes it trivial to proxy requests to a backend with custom proxy behaviour added in JavaScript. The example app adds an artificial delay to every request to simulate a slow connection, but other exciting potential use cases could include rate limiting, API key restriction, logging, load balancing, lint testing and more besides.

    &lt;p&gt;&lt;small&gt;Via &lt;a href="http://thechangelog.com/post/872114581/node-http-proxy-reverse-proxy-for-node-js"&gt;The Changelog&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/http"&gt;http&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/javascript"&gt;javascript&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/nodejs"&gt;nodejs&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/proxies"&gt;proxies&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/recovered"&gt;recovered&lt;/a&gt;&lt;/p&gt;



</summary><category term="http"/><category term="javascript"/><category term="nodejs"/><category term="proxies"/><category term="recovered"/></entry></feed>