<?xml version="1.0" encoding="utf-8"?>
<feed xml:lang="en-us" xmlns="http://www.w3.org/2005/Atom"><title>Simon Willison's Weblog: github-issues</title><link href="http://simonwillison.net/" rel="alternate"/><link href="http://simonwillison.net/tags/github-issues.atom" rel="self"/><id>http://simonwillison.net/</id><updated>2025-05-26T07:23:17+00:00</updated><author><name>Simon Willison</name></author><entry><title>GitHub Issues search now supports nested queries and boolean operators: Here’s how we (re)built it</title><link href="https://simonwillison.net/2025/May/26/github-issues-search/#atom-tag" rel="alternate"/><published>2025-05-26T07:23:17+00:00</published><updated>2025-05-26T07:23:17+00:00</updated><id>https://simonwillison.net/2025/May/26/github-issues-search/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.blog/developer-skills/application-development/github-issues-search-now-supports-nested-queries-and-boolean-operators-heres-how-we-rebuilt-it/"&gt;GitHub Issues search now supports nested queries and boolean operators: Here’s how we (re)built it&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
GitHub Issues got a significant search upgrade &lt;a href="https://simonwillison.net/2025/Jan/16/evolving-github-issues/"&gt;back in January&lt;/a&gt;. Deborah Digges provides some behind-the-scenes details about how it works and how they rolled it out.&lt;/p&gt;
&lt;p&gt;The signature new feature is complex boolean logic: you can now search for things like &lt;code&gt;is:issue state:open author:rileybroughten (type:Bug OR type:Epic)&lt;/code&gt;, up to five levels of nesting deep.&lt;/p&gt;
&lt;p&gt;Queries are parsed into an AST using the Ruby &lt;a href="https://github.com/kschiess/parslet"&gt;parslet&lt;/a&gt; PEG grammar library. The AST is then compiled into a nested Elasticsearch &lt;code&gt;bool&lt;/code&gt; JSON query.&lt;/p&gt;
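&lt;p&gt;To make the AST-to-&lt;code&gt;bool&lt;/code&gt; step concrete, here is a minimal Python sketch. The tuple-based AST shape, the &lt;code&gt;compile_node&lt;/code&gt; helper and the field names are all assumptions made up for illustration; GitHub's real implementation is in Ruby and its internal AST and index mapping are not shown in the post.&lt;/p&gt;

```python
import json

# Hypothetical AST shape, invented for this sketch:
#   ("term", field, value), ("and", [children]), ("or", [children])
# It is not GitHub's actual AST, and the field names are placeholders.


def compile_node(node):
    """Recursively compile an AST node into an Elasticsearch-style query dict."""
    kind = node[0]
    if kind == "term":
        _, field, value = node
        return {"term": {field: value}}
    children = [compile_node(child) for child in node[1]]
    if kind == "and":
        # Every child clause has to match.
        return {"bool": {"must": children}}
    if kind == "or":
        # At least one child clause has to match.
        return {"bool": {"should": children, "minimum_should_match": 1}}
    raise ValueError("unknown node kind: " + repr(kind))


# is:issue state:open author:rileybroughten (type:Bug OR type:Epic)
ast = (
    "and",
    [
        ("term", "is", "issue"),
        ("term", "state", "open"),
        ("term", "author", "rileybroughten"),
        ("or", [("term", "type", "Bug"), ("term", "type", "Epic")]),
    ],
)

query = compile_node(ast)
print(json.dumps(query, indent=2))
```

&lt;p&gt;Each level of parentheses in the search string becomes one more nested &lt;code&gt;bool&lt;/code&gt; object, which is presumably why a nesting limit is useful.&lt;/p&gt;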
&lt;p&gt;GitHub Issues search deals with around 2,000 queries a second so robust testing is extremely important! The team rolled it out invisibly to 1% of live traffic, running the new implementation via a queue and comparing the number of results returned to try and spot any degradations compared to the old production code.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/elasticsearch"&gt;elasticsearch&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/github"&gt;github&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ops"&gt;ops&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/parsing"&gt;parsing&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ruby"&gt;ruby&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/scaling"&gt;scaling&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/search"&gt;search&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/github-issues"&gt;github-issues&lt;/a&gt;&lt;/p&gt;



</summary><category term="elasticsearch"/><category term="github"/><category term="ops"/><category term="parsing"/><category term="ruby"/><category term="scaling"/><category term="search"/><category term="github-issues"/></entry><entry><title>GitHub issues for notes</title><link href="https://simonwillison.net/2025/May/26/notes/#atom-tag" rel="alternate"/><published>2025-05-26T07:11:13+00:00</published><updated>2025-05-26T07:11:13+00:00</updated><id>https://simonwillison.net/2025/May/26/notes/#atom-tag</id><summary type="html">
    &lt;p&gt;GitHub issues is &lt;em&gt;almost&lt;/em&gt; the best notebook in the world.&lt;/p&gt;
&lt;p&gt;Free and unlimited, for both public and private notes.&lt;/p&gt;
&lt;p&gt;Comprehensive Markdown support, including syntax highlighting for almost any language. Plus you can drag and drop images or videos directly onto a note.&lt;/p&gt;
&lt;p&gt;It has fantastic inter-linking abilities. You can paste in URLs to other issues (in any other repository on GitHub) in a markdown list &lt;a href="https://github.com/simonw/llm/issues/1059#issuecomment-2907515045"&gt;like this&lt;/a&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;- https://github.com/simonw/llm/issues/1078
- https://github.com/simonw/llm/issues/1080
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Your issue will pull in the title of the other issue, plus that other issue will get back a link to yours - taking issue visibility rules into account.&lt;/p&gt;
&lt;p&gt;&lt;img alt="Screenshot of an issue showing user &amp;quot;simonw&amp;quot; posted yesterday containing text &amp;quot;Current logs could do with a bit of a redesign around tools already, see:&amp;quot; followed by two bullet points with green checkmark icons: &amp;quot;Tools in LLM logs output should only show definition first time #1078&amp;quot; and &amp;quot;New design for logs involving tool calls (and maybe tool classes) #1080&amp;quot;" src="https://static.simonwillison.net/static/2025/issue-notes.jpg" /&gt;&lt;/p&gt;
&lt;p&gt;It has excellent search, both within a repo, across all of your repos or even across the whole of GitHub if you've completely forgotten where you put something.&lt;/p&gt;
&lt;p&gt;It has a comprehensive API, both for exporting notes and creating and editing new ones. Add GitHub Actions, triggered by issue events, and you can automate it to do almost anything.&lt;/p&gt;
&lt;p&gt;The one missing feature? Synchronized offline support. I still mostly default to Apple Notes on my phone purely because it works with or without the internet and syncs up with my laptop later on.&lt;/p&gt;
&lt;p&gt;A few extra notes inspired by the &lt;a href="https://news.ycombinator.com/item?id=44094980"&gt;discussion of this post&lt;/a&gt; on Hacker News:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;I'm not worried about privacy here. A lot of companies pay GitHub a lot of money to keep the source code and related assets safe. I do not think GitHub are going to sacrifice that trust to "train a model" or whatever.&lt;/li&gt;
&lt;li&gt;There is always the risk of a bug that might expose my notes, across any note platform. That's why I keep things like passwords out of my notes!&lt;/li&gt;
&lt;li&gt;Not paying and not self-hosting is a very important feature. I don't want to risk losing my notes to a configuration or billing error!&lt;/li&gt;
&lt;li&gt;The thing where notes can include checklists using &lt;code&gt;- [ ] item&lt;/code&gt; syntax is really useful. You can even do &lt;code&gt;- [ ] #ref&lt;/code&gt; to reference another issue and the checkbox will be automatically checked when that other issue is closed.&lt;/li&gt;
&lt;li&gt;I've experimented with a bunch of ways of backing up my notes locally, such as &lt;a href="https://github.com/dogsheep/github-to-sqlite"&gt;github-to-sqlite&lt;/a&gt;. I'm not running any of them on cron on a separate machine at the moment, but I really should!&lt;/li&gt;
&lt;li&gt;I'll go back to pen and paper as soon as my paper notes can be instantly automatically backed up to at least two different continents.&lt;/li&gt;
&lt;li&gt;GitHub issues also &lt;em&gt;scales&lt;/em&gt;! &lt;a href="https://github.com/microsoft/vscode/issues?q=is%3Aissue"&gt;microsoft/vscode&lt;/a&gt; has 195,376 issues. &lt;a href="https://github.com/flutter/flutter/issues?q=is%3Aissue"&gt;flutter/flutter&lt;/a&gt; has 106,572. I'm not going to run out of space.&lt;/li&gt;
&lt;li&gt;Having my notes in a format that's easy to pipe into an LLM is really fun. Here's &lt;a href="https://github.com/simonw/sqlite-chronicle/issues/7#issuecomment-2833108867"&gt;a recent example&lt;/a&gt; where I summarized a 50+ comment, 1.5 year long issue thread into a new comment using &lt;a href="https://github.com/simonw/llm-fragments-github"&gt;llm-fragments-github&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
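&lt;p&gt;The checklist syntax mentioned in the list above is simple enough to process yourself once the notes are exported. This is a rough Python sketch: the &lt;code&gt;parse_tasks&lt;/code&gt; helper and the sample note are hypothetical, and the pattern only covers the basic task-list form rather than everything GitHub's Markdown renderer accepts.&lt;/p&gt;

```python
import re

# Matches Markdown task-list lines such as "- [ ] item" or "- [x] item".
TASK_RE = re.compile(r"^\s*[-*] \[( |x|X)\] (.+)$")


def parse_tasks(markdown):
    """Return a list of (done, text) tuples, one per task-list line."""
    tasks = []
    for line in markdown.splitlines():
        match = TASK_RE.match(line)
        if match:
            tasks.append((match.group(1) != " ", match.group(2)))
    return tasks


# Hypothetical note body, made up for this example.
note = """Release checklist
- [x] Write changelog
- [ ] #1078
- [ ] #1080
"""

tasks = parse_tasks(note)
done = sum(1 for finished, _ in tasks if finished)
print(str(done) + "/" + str(len(tasks)) + " complete")  # 1/3 complete
```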
&lt;p&gt;I was curious how many issues and comments I've created on GitHub. With &lt;a href="https://claude.ai/share/ffdd2662-d234-447b-bb84-4012898e0032"&gt;Claude's help&lt;/a&gt; I figured out you can get that using a GraphQL query:&lt;/p&gt;
&lt;pre&gt;{
  &lt;span class="pl-v"&gt;viewer&lt;/span&gt; {
    &lt;span class="pl-v"&gt;issueComments&lt;/span&gt; {
      &lt;span class="pl-v"&gt;totalCount&lt;/span&gt;
    }
    &lt;span class="pl-v"&gt;issues&lt;/span&gt; {
      &lt;span class="pl-v"&gt;totalCount&lt;/span&gt;
    }
  }
}&lt;/pre&gt;

&lt;p&gt;Running that with &lt;a href="https://docs.github.com/en/graphql/overview/explorer"&gt;the GitHub GraphQL Explorer&lt;/a&gt; tool gave me this:&lt;/p&gt;
&lt;pre&gt;{
  &lt;span class="pl-ent"&gt;"data"&lt;/span&gt;: {
    &lt;span class="pl-ent"&gt;"viewer"&lt;/span&gt;: {
      &lt;span class="pl-ent"&gt;"issueComments"&lt;/span&gt;: {
        &lt;span class="pl-ent"&gt;"totalCount"&lt;/span&gt;: &lt;span class="pl-c1"&gt;39087&lt;/span&gt;
      },
      &lt;span class="pl-ent"&gt;"issues"&lt;/span&gt;: {
        &lt;span class="pl-ent"&gt;"totalCount"&lt;/span&gt;: &lt;span class="pl-c1"&gt;9413&lt;/span&gt;
      }
    }
  }
}&lt;/pre&gt;

&lt;p&gt;That's 48,500 combined issues and comments!&lt;/p&gt;
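&lt;p&gt;The same query can be run outside the Explorer. The Python sketch below only builds the request body and adds up the counts from a decoded response; it does not make a network call. The helper names are hypothetical, and the endpoint mentioned in the comments is an assumption taken from GitHub's public GraphQL API documentation.&lt;/p&gt;

```python
import json

# Same fields as the query shown above, written compactly.
QUERY = """
{
  viewer {
    issueComments { totalCount }
    issues { totalCount }
  }
}
"""


def build_payload(query):
    """Return a JSON body suitable for a POST to https://api.github.com/graphql.

    Assumption: the request also needs an Authorization header carrying a
    personal access token, which is deliberately left out of this sketch.
    """
    return json.dumps({"query": query})


def combined_total(response):
    """Add the issue and issue-comment counts from a decoded response."""
    viewer = response["data"]["viewer"]
    return viewer["issueComments"]["totalCount"] + viewer["issues"]["totalCount"]


# Sample response copied from the output shown above.
sample = {
    "data": {
        "viewer": {
            "issueComments": {"totalCount": 39087},
            "issues": {"totalCount": 9413},
        }
    }
}

payload = build_payload(QUERY)
total = combined_total(sample)
print(total)  # 48500
```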

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/github"&gt;github&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/graphql"&gt;graphql&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/github-issues"&gt;github-issues&lt;/a&gt;&lt;/p&gt;



</summary><category term="github"/><category term="graphql"/><category term="ai-assisted-programming"/><category term="llm"/><category term="github-issues"/></entry><entry><title>cityofaustin/atd-data-tech issues</title><link href="https://simonwillison.net/2025/May/20/data-tech-issues/#atom-tag" rel="alternate"/><published>2025-05-20T18:18:39+00:00</published><updated>2025-05-20T18:18:39+00:00</updated><id>https://simonwillison.net/2025/May/20/data-tech-issues/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/cityofaustin/atd-data-tech/issues"&gt;cityofaustin/atd-data-tech issues&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
I stumbled across this today while looking for interesting frequently updated data sources from local governments. It turns out the City of Austin's &lt;a href="https://austinmobility.io/"&gt;Transportation Data &amp;amp; Technology Services&lt;/a&gt; department run everything out of a public GitHub issues instance, which currently has 20,225 closed and 2,002 open issues. They also publish an &lt;a href="https://data.austintexas.gov/Transportation-and-Mobility/Transportation-Public-Works-Data-Tech-Services-Iss/rzwg-fyv8/about_data"&gt;exported copy&lt;/a&gt; of the issues data through the &lt;a href="https://data.austintexas.gov/"&gt;data.austintexas.gov&lt;/a&gt; open data portal.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/github"&gt;github&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/open-data"&gt;open-data&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/github-issues"&gt;github-issues&lt;/a&gt;&lt;/p&gt;



</summary><category term="github"/><category term="open-data"/><category term="github-issues"/></entry><entry><title>Jules</title><link href="https://simonwillison.net/2025/May/19/jules/#atom-tag" rel="alternate"/><published>2025-05-19T21:40:11+00:00</published><updated>2025-05-19T21:40:11+00:00</updated><id>https://simonwillison.net/2025/May/19/jules/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://jules.google.com/"&gt;Jules&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
It seems like &lt;em&gt;everyone&lt;/em&gt; is rolling out AI coding assistants that attach to your GitHub account and submit PRs for you right now. We had &lt;a href="https://simonwillison.net/2025/May/16/openai-codex/"&gt;OpenAI Codex&lt;/a&gt; last week, today Microsoft announced &lt;a href="https://github.blog/changelog/2025-05-19-github-copilot-coding-agent-in-public-preview/"&gt;GitHub Copilot coding agent&lt;/a&gt; (confusingly not the same thing as &lt;a href="https://githubnext.com/projects/copilot-workspace"&gt;Copilot Workspace&lt;/a&gt;) and I found out just now that Google's Jules, &lt;a href="https://developers.googleblog.com/en/the-next-chapter-of-the-gemini-era-for-developers/"&gt;announced in December&lt;/a&gt;, is now in a beta preview.&lt;/p&gt;
&lt;p&gt;I'm flying home from PyCon but I managed to try out Jules from my phone. I took &lt;a href="https://github.com/datasette/datasette-chronicle/issues/3"&gt;this GitHub issue thread&lt;/a&gt;, converted it to copy-pasteable Markdown with &lt;a href="https://tools.simonwillison.net/github-issue-to-markdown"&gt;this tool&lt;/a&gt; and pasted it into Jules, with no further instructions.&lt;/p&gt;
&lt;p&gt;Here's &lt;a href="https://github.com/datasette/datasette-chronicle/pull/6"&gt;the resulting PR&lt;/a&gt; created from its branch. I haven't fully reviewed it yet and the tests aren't passing, so it's hard to evaluate from my phone how well it did. In a cursory first glance it looks like it's covered most of the requirements from the issue thread.&lt;/p&gt;
&lt;p&gt;My habit of &lt;a href="https://simonwillison.net/2022/Nov/26/productivity/#issue-thread"&gt;creating long issue threads&lt;/a&gt; where I talk to myself about the features I'm planning is proving to be a good fit for outsourcing implementation work to this new generation of coding assistants.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/github"&gt;github&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/google"&gt;google&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/gemini"&gt;gemini&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/github-issues"&gt;github-issues&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/async-coding-agents"&gt;async-coding-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/jules"&gt;jules&lt;/a&gt;&lt;/p&gt;



</summary><category term="github"/><category term="google"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="ai-assisted-programming"/><category term="gemini"/><category term="github-issues"/><category term="async-coding-agents"/><category term="jules"/></entry><entry><title>Evolving GitHub Issues (public preview)</title><link href="https://simonwillison.net/2025/Jan/16/evolving-github-issues/#atom-tag" rel="alternate"/><published>2025-01-16T17:41:32+00:00</published><updated>2025-01-16T17:41:32+00:00</updated><id>https://simonwillison.net/2025/Jan/16/evolving-github-issues/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.blog/changelog/2025-01-13-evolving-github-issues-public-preview/"&gt;Evolving GitHub Issues (public preview)&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
GitHub just shipped the largest set of changes to GitHub Issues I can remember in a few years. As an Issues power-user this is directly relevant to me.&lt;/p&gt;
&lt;p&gt;The big new features are sub-issues, issue types and boolean operators in search.&lt;/p&gt;
&lt;p&gt;Sub-issues look to be a more robust formalization of the existing feature where you could create a &lt;code&gt;- [ ] #123&lt;/code&gt; Markdown list of issues in the issue description to relate issues together and track a 3/5 progress bar. There are now explicit buttons for creating a sub-issue and managing its parent relationship, and clicking a sub-issue opens it in a side panel on top of the parent.&lt;/p&gt;
&lt;p&gt;Issue types took me a moment to track down: it turns out they are an organization level feature, so they won't show up on repos that belong to a specific user.&lt;/p&gt;
&lt;p&gt;Organizations can define issue types that will be available across all of their repos. I created a "Research" one to classify research tasks, joining the default task, bug and feature types.&lt;/p&gt;
&lt;p&gt;&lt;img alt="Screenshot of Datasette organization settings page showing issue types configuration. Left sidebar shows navigation options including General, Policies, Access, etc. Main content shows 4 issue types: Task (&amp;quot;A specific piece of work&amp;quot;), Bug (&amp;quot;An unexpected problem or behavior&amp;quot;), Feature (&amp;quot;A request, idea, or new functionality&amp;quot;), and Research (&amp;quot;Research is needed&amp;quot;). A &amp;quot;Create new type&amp;quot; button appears in top right." src="https://static.simonwillison.net/static/2025/issue-types.jpg" /&gt;&lt;/p&gt;
&lt;p&gt;Unlike labels an issue can have just one issue type. You can then search for all issues of a specific type across an entire organization using &lt;code&gt;org:datasette type:"Research"&lt;/code&gt; in GitHub search.&lt;/p&gt;
&lt;p&gt;The &lt;a href="https://docs.github.com/en/issues/tracking-your-work-with-issues/using-issues/filtering-and-searching-issues-and-pull-requests#using-parentheses-for-more-complicated-filters"&gt;new boolean logic&lt;/a&gt; in GitHub search looks like it could be really useful - it includes AND, OR and parentheses for grouping.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;(type:"Bug" AND assignee:octocat) OR (type:"Enhancement" AND assignee:hubot)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I'm not sure if these are available via the GitHub APIs yet.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/github"&gt;github&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/github-issues"&gt;github-issues&lt;/a&gt;&lt;/p&gt;



</summary><category term="github"/><category term="github-issues"/></entry><entry><title>AI assisted learning: Learning Rust with ChatGPT, Copilot and Advent of Code</title><link href="https://simonwillison.net/2022/Dec/5/rust-chatgpt-copilot/#atom-tag" rel="alternate"/><published>2022-12-05T21:11:08+00:00</published><updated>2022-12-05T21:11:08+00:00</updated><id>https://simonwillison.net/2022/Dec/5/rust-chatgpt-copilot/#atom-tag</id><summary type="html">
    &lt;p&gt;I'm using this year's &lt;a href="https://adventofcode.com/2022"&gt;Advent of Code&lt;/a&gt; to learn Rust - with the assistance of &lt;a href="https://github.com/features/copilot"&gt;GitHub Copilot&lt;/a&gt; and OpenAI's new &lt;a href="https://openai.com/blog/chatgpt/"&gt;ChatGPT&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I think one of the most exciting applications of large language models is to support self-guided learning. Used the right way, a language model such as GPT-3 can act as a sort of super-smart-and-super-dumb teaching assistant: you can ask it questions and follow-up questions, and if you get your questions right it can genuinely help you build a good mental model of the topic at hand.&lt;/p&gt;
&lt;p&gt;And it could also hallucinate and teach you things that are entirely divorced from reality, but in a very convincing way!&lt;/p&gt;
&lt;p&gt;I've started thinking of them as an excellent teacher for some topics who is also a conspiracy theorist around others: you can have a great conversation with them, but you need to take everything they say with a very generous grain of salt.&lt;/p&gt;
&lt;p&gt;I've been tinkering with this idea for a few months now, mostly &lt;a href="https://simonwillison.net/2022/Jun/5/play-with-gpt3/"&gt;via the GPT-3 Playground&lt;/a&gt;. ChatGPT provides a much better interface for this, and I'm ready to try it out for a larger, more structured project.&lt;/p&gt;
&lt;h4&gt;Learning Rust&lt;/h4&gt;
&lt;p&gt;I've been looking for an excuse to explore &lt;a href="https://www.rust-lang.org/"&gt;Rust&lt;/a&gt; for a few years now. As primarily a Python programmer the single biggest tool missing from my toolbox is something more low-level - I want to be able to confidently switch to more of a systems language for performance-critical tasks, while still being able to use that optimized code in my Python projects.&lt;/p&gt;
&lt;p&gt;Rust feels like the best available option for this. It has really great Python integration support, is already used extensively in the Python ecosystem (e.g. by the &lt;a href="https://cryptography.io/en/latest/"&gt;cryptography package&lt;/a&gt;) and lots of people who I respect have used it without loudly expressing their distaste for it later on!&lt;/p&gt;
&lt;p&gt;The problem was finding the right project. I learn by building things, but none of the projects I could imagine building in Rust (a high performance web proxy for example) would be useful to me if I built terrible versions of them while still learning the basics.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://adventofcode.com/"&gt;Advent of Code&lt;/a&gt; turns out to be perfect for this.&lt;/p&gt;
&lt;p&gt;Each day you get a new exercise, designed to be solved in a short amount of time (at least so far). Exercises are automatically graded using an input file that is unique to you, so you can't cheat by copying other people's answers (though you can cheat by copying and running their code).&lt;/p&gt;
&lt;p&gt;The exercise design is &lt;em&gt;so good&lt;/em&gt;! Eric Wastl has been running it for seven years now and I couldn't be more impressed with how it works or the quality of the exercises so far (I just finished day 5).&lt;/p&gt;
&lt;p&gt;It's absolutely perfect for my goal of learning a new programming language.&lt;/p&gt;
&lt;h4&gt;AI assisted learning tools&lt;/h4&gt;
&lt;p&gt;I've seen a bunch of people this year attempt to solve Advent of Code by feeding the questions to an AI model. That's a fun exercise, but what I'm doing here is a little bit different.&lt;/p&gt;
&lt;p&gt;My goal here is to get comfortable enough with basic Rust that I can attempt a larger project without feeling like I'm wasting my time writing unusably poor code.&lt;/p&gt;
&lt;p&gt;I also want to see if AI assisted learning actually works as well as I think it might.&lt;/p&gt;
&lt;p&gt;I'm using two tools to help me here:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/features/copilot"&gt;GitHub Copilot&lt;/a&gt; runs in my VS Code editor. I've used it for the past few months mainly as a typing assistant (and for &lt;a href="https://til.simonwillison.net/gpt3/writing-test-with-copilot"&gt;writing things like repetitive tests&lt;/a&gt;). For this project I'm going to lean a lot more heavily on it - I'm taking advantage of comment-driven prompting, where you can add a code comment and Copilot will suggest code that matches the comment.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://openai.com/blog/chatgpt/"&gt;ChatGPT&lt;/a&gt;. I'm using this as a professor/teaching-assistant/study partner. I ask it questions about how to do things with Rust, it replies with answers (and usually a code sample too). I've also been using it to help understand error messages, which it turns out to be &lt;em&gt;incredibly&lt;/em&gt; effective at.&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;And copious notes&lt;/h4&gt;
&lt;p&gt;I'm doing all of my work on this in the open, in my &lt;a href="https://github.com/simonw/advent-of-code-2022-in-rust"&gt;simonw/advent-of-code-2022-in-rust&lt;/a&gt; repository on GitHub. Each day gets an issue, and I'm making notes on the help I get from the AI tools in detailed issue comments.&lt;/p&gt;
&lt;p&gt;Here are my issue threads so far:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/simonw/advent-of-code-2022-in-rust/issues/1"&gt;Day 1: Calorie Counting&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/simonw/advent-of-code-2022-in-rust/issues/3"&gt;Day 2: Rock Paper Scissors&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/simonw/advent-of-code-2022-in-rust/issues/4"&gt;Day 3: Rucksack Reorganization&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/simonw/advent-of-code-2022-in-rust/issues/5"&gt;Day 4: Camp Cleanup&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/simonw/advent-of-code-2022-in-rust/issues/6"&gt;Day 5: Supply Stacks&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/simonw/advent-of-code-2022-in-rust/issues/8"&gt;Day 6: Tuning Trouble&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I recommend checking out each issue in full if you want to follow how this has all been going.&lt;/p&gt;
&lt;h4&gt;Some examples from ChatGPT&lt;/h4&gt;
&lt;p&gt;Here are a few highlights from my interactions with ChatGPT so far:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Using &lt;a href="https://github.com/simonw/advent-of-code-2022-in-rust/issues/1#issuecomment-1334161700"&gt;"add comments explaining every single line"&lt;/a&gt; to cause it to output a replacement code example with extremely verbose comments.&lt;/li&gt;
&lt;li&gt;Pasting in &lt;a href="https://github.com/simonw/advent-of-code-2022-in-rust/issues/3#issuecomment-1336192850"&gt;both the code and the resulting Rust compiler error&lt;/a&gt; - ChatGPT clearly explained the error to me and spat out a version of the code that incorporated a fix!&lt;/li&gt;
&lt;li&gt;Another example of &lt;a href="https://github.com/simonw/advent-of-code-2022-in-rust/issues/4#issuecomment-1336194817"&gt;a back-and-forth conversation&lt;/a&gt; - I started with "Given two strings in Rust, identify the character that appears in both strings", then said "Can you do this without a HashSet?", then finally asked for it "without any modules" at all. Then I said "Turn that into a function I can call" and it wrote that code for me too.&lt;/li&gt;
&lt;li&gt;I got an error from a VS Code extension I was using: "rust-analyzer failed to discover workspace" - so &lt;a href="https://github.com/simonw/advent-of-code-2022-in-rust/issues/4#issuecomment-1336250499"&gt;I pasted that into ChatGPT&lt;/a&gt; and it gave me step-by-step instructions for fixing that.&lt;/li&gt;
&lt;li&gt;I pasted in some gnarly nested code generated by Copilot, and ChatGPT &lt;a href="https://github.com/simonw/advent-of-code-2022-in-rust/issues/4#issuecomment-1336255628"&gt;suggested an improvement that removed the nesting&lt;/a&gt; without me even asking it how that code could be improved. Though amusingly it used deprecated Rust syntax for it, which took me quite a while to figure out.&lt;/li&gt;
&lt;li&gt;Amos has &lt;a href="https://fasterthanli.me/series/advent-of-code-2022/part-3"&gt;a series of posts&lt;/a&gt; about this year's Advent of Code where he uses them to illustrate advanced features of Rust. I occasionally paste his code into ChatGPT and ask it to explain each concept - in &lt;a href="https://github.com/simonw/advent-of-code-2022-in-rust/issues/4#issuecomment-1336261936"&gt;this particular case&lt;/a&gt;, I used Copilot to generate a list of questions and then pasted them into ChatGPT to get back the answers.&lt;/li&gt;
&lt;li&gt;Given two examples of how to do something, I asked it &lt;a href="https://github.com/simonw/advent-of-code-2022-in-rust/issues/5#issuecomment-1336737517"&gt;Which of these two options is the more idiomatic Rust code?&lt;/a&gt; and it answered along with justification as to why!&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;Handling CLI arguments in Rust&lt;/h4&gt;
&lt;p&gt;I'll include my most recent highlight in full:&lt;/p&gt;
&lt;p&gt;I wanted to add command-line arguments to my Rust program, so &lt;a href="https://github.com/simonw/advent-of-code-2022-in-rust/issues/6#issuecomment-1337986027"&gt;I asked ChatGPT&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;How do I write a Rust program that accepts command line options - a required position argument and an optional --multistack flag?&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;It replied:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;To write a Rust program that accepts command line options, you can use the &lt;code&gt;clap&lt;/code&gt; crate. First, add &lt;code&gt;clap&lt;/code&gt; to your &lt;code&gt;Cargo.toml&lt;/code&gt; file like this:&lt;/p&gt;
&lt;div class="highlight highlight-source-toml"&gt;&lt;pre&gt;[&lt;span class="pl-en"&gt;dependencies&lt;/span&gt;]
&lt;span class="pl-smi"&gt;clap&lt;/span&gt; = &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;2.33.0&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Next, import the crate and use its macros to define the command line arguments that your program accepts. For example:&lt;/p&gt;
&lt;div class="highlight highlight-source-rust"&gt;&lt;pre&gt;&lt;span class="pl-k"&gt;use&lt;/span&gt; clap&lt;span class="pl-kos"&gt;::&lt;/span&gt;&lt;span class="pl-kos"&gt;{&lt;/span&gt;&lt;span class="pl-v"&gt;App&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-v"&gt;Arg&lt;/span&gt;&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;

&lt;span class="pl-k"&gt;fn&lt;/span&gt; &lt;span class="pl-en"&gt;main&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
    &lt;span class="pl-k"&gt;let&lt;/span&gt; matches = &lt;span class="pl-smi"&gt;App&lt;/span&gt;&lt;span class="pl-kos"&gt;::&lt;/span&gt;&lt;span class="pl-en"&gt;new&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;"My Program"&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;
        &lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;arg&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;
            &lt;span class="pl-smi"&gt;Arg&lt;/span&gt;&lt;span class="pl-kos"&gt;::&lt;/span&gt;&lt;span class="pl-en"&gt;with_name&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;"position"&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;
                &lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;help&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;"The position argument"&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;
                &lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;required&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-c1"&gt;true&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;
                &lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;index&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-c1"&gt;1&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
        &lt;span class="pl-kos"&gt;)&lt;/span&gt;
        &lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;arg&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;
            &lt;span class="pl-smi"&gt;Arg&lt;/span&gt;&lt;span class="pl-kos"&gt;::&lt;/span&gt;&lt;span class="pl-en"&gt;with_name&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;"multistack"&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;
                &lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;help&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;"A flag that enables the use of multiple stacks"&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;
                &lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;long&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;"multistack"&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;
                &lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;short&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;"m"&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
        &lt;span class="pl-kos"&gt;)&lt;/span&gt;
        &lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;get_matches&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;

    &lt;span class="pl-c"&gt;// Get the value of the position argument&lt;/span&gt;
    &lt;span class="pl-k"&gt;let&lt;/span&gt; position = matches&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;value_of&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;"position"&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;unwrap&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;

    &lt;span class="pl-c"&gt;// Check if the multistack flag was passed&lt;/span&gt;
    &lt;span class="pl-k"&gt;let&lt;/span&gt; multistack = matches&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;is_present&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;"multistack"&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;In this example, the &lt;code&gt;position&lt;/code&gt; argument is required and must be specified as the first argument to the program. The &lt;code&gt;multistack&lt;/code&gt; flag is optional and can be specified using the &lt;code&gt;--multistack&lt;/code&gt; or &lt;code&gt;-m&lt;/code&gt; flag.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Here's the thing though: I'm still working on Rust basics, so I didn't particularly want to jump straight to using an external dependency for this. So I asked a follow-up question:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Is there a way to do that without using any external crates at all?&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;And it replied:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Yes, you can parse command line arguments without using any external crates in Rust. To do this, you can use the &lt;code&gt;std::env::args&lt;/code&gt; function, which returns an iterator over the arguments that were passed to the program.&lt;/p&gt;
&lt;p&gt;For example, you could parse the required &lt;code&gt;position&lt;/code&gt; argument and the optional &lt;code&gt;multistack&lt;/code&gt; flag like this:&lt;/p&gt;
&lt;div class="highlight highlight-source-rust"&gt;&lt;pre&gt;&lt;span class="pl-k"&gt;use&lt;/span&gt; std&lt;span class="pl-kos"&gt;::&lt;/span&gt;env&lt;span class="pl-kos"&gt;;&lt;/span&gt;

&lt;span class="pl-k"&gt;fn&lt;/span&gt; &lt;span class="pl-en"&gt;main&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
    &lt;span class="pl-c"&gt;// Get the arguments as an iterator&lt;/span&gt;
    &lt;span class="pl-k"&gt;let&lt;/span&gt; args&lt;span class="pl-kos"&gt;:&lt;/span&gt; &lt;span class="pl-smi"&gt;Vec&lt;/span&gt;&lt;span class="pl-kos"&gt;&amp;lt;&lt;/span&gt;&lt;span class="pl-smi"&gt;String&lt;/span&gt;&lt;span class="pl-kos"&gt;&amp;gt;&lt;/span&gt; = env&lt;span class="pl-kos"&gt;::&lt;/span&gt;&lt;span class="pl-en"&gt;args&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;collect&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;

    &lt;span class="pl-c"&gt;// Ensure that at least one argument was passed&lt;/span&gt;
    &lt;span class="pl-k"&gt;if&lt;/span&gt; args&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;len&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &amp;lt; &lt;span class="pl-c1"&gt;2&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
        &lt;span class="pl-en"&gt;println&lt;/span&gt;&lt;span class="pl-en"&gt;!&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;"error: missing required argument 'position'"&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
        &lt;span class="pl-k"&gt;return&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
    &lt;span class="pl-kos"&gt;}&lt;/span&gt;

    &lt;span class="pl-c"&gt;// Get the value of the position argument&lt;/span&gt;
    &lt;span class="pl-k"&gt;let&lt;/span&gt; position = &lt;span class="pl-c1"&gt;&amp;amp;&lt;/span&gt;args&lt;span class="pl-kos"&gt;[&lt;/span&gt;&lt;span class="pl-c1"&gt;1&lt;/span&gt;&lt;span class="pl-kos"&gt;]&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;

    &lt;span class="pl-c"&gt;// Check if the multistack flag was passed&lt;/span&gt;
    &lt;span class="pl-k"&gt;let&lt;/span&gt; multistack = args&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;iter&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;any&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;|arg| arg == &lt;span class="pl-s"&gt;"--multistack"&lt;/span&gt; || arg == &lt;span class="pl-s"&gt;"-m"&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;In this example, the &lt;code&gt;position&lt;/code&gt; argument is required and must be specified as the first argument to the program. The &lt;code&gt;multistack&lt;/code&gt; flag is optional and can be specified using the &lt;code&gt;--multistack&lt;/code&gt; or &lt;code&gt;-m&lt;/code&gt; flag.&lt;/p&gt;
&lt;p&gt;You can then use the values of these arguments in your program as shown. Note that this approach does not provide as much functionality or convenience as using a dedicated crate like &lt;code&gt;clap&lt;/code&gt;, but it can be useful in some situations.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I really like this example.&lt;/p&gt;
&lt;p&gt;I deliberately crafted my original question to include details of exactly what I wanted to do - not just add CLI options, but add a specific positional argument and a specific flag.&lt;/p&gt;
&lt;p&gt;This generated the exact code I needed, plus a very high quality prose explanation of it.&lt;/p&gt;
&lt;p&gt;Then I used a follow-up question to get a completely different approach that better matched my personal taste. Unprompted, it even included a note about the disadvantages of that approach at the end!&lt;/p&gt;
&lt;h4&gt;Copilot as well&lt;/h4&gt;
&lt;p&gt;I haven't been keeping as detailed notes on my Copilot interactions, since those take the form of autocompletions in my editor which are harder to write down.&lt;/p&gt;
&lt;p&gt;I did capture a few illustrative screenshots though. Some examples:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2022/copilot-1.jpg" alt="I wrote a code comment saying And an extra score based on what I picked X = 1, Y = 2, Z = 3. Copilot suggested: score += me.chars().next().unwrap() as i32 - 'X' as i32 + 1;" style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;I didn't like that suggestion at all - way too convoluted. So I changed my comment prompt and got something much better:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2022/copilot-2.jpg" alt="I added use match for that too to my comment, and Copilot swapped out the implementation for one that uses match and is much easier to read. score += match me { 'X' =&gt; 1, 'Y' =&gt; 2, 'Z' =&gt; 3, _ =&gt; 0, };" style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;This comment-driven approach to prompting Copilot has proven to be amazingly effective. I'm learning Rust without having to spend any time looking things up - I'm using Copilot to show me examples, then if I don't understand them I paste them into ChatGPT and ask for a detailed explanation.&lt;/p&gt;
&lt;h4&gt;Where it goes wrong&lt;/h4&gt;
&lt;p&gt;An interesting part of this exercise is spotting where things go wrong.&lt;/p&gt;
&lt;p&gt;Rust is &lt;em&gt;not&lt;/em&gt; an easy language to learn. There are concepts like the borrow checker that I've not even started touching on yet, and I'm still getting the hang of basic concepts like Options and Results.&lt;/p&gt;
&lt;p&gt;Mostly Copilot and ChatGPT have been able to act as confident guides - but every now and then I've run up against the sharp edges of their fake confidence combined with the fact that they're actually just language models with no genuine understanding of what they are doing.&lt;/p&gt;
&lt;p&gt;I had one instance where I lost about an hour to an increasingly frustrating back-and-forth over &lt;a href="https://github.com/simonw/advent-of-code-2022-in-rust/issues/4#issuecomment-1336249276"&gt;an integer overflow error&lt;/a&gt; - I ended up having to actually think hard about the problem after failing to debug it with ChatGPT!&lt;/p&gt;
&lt;p&gt;I wanted to figure out if the first character of a line was a &lt;code&gt;"1"&lt;/code&gt;. ChatGPT led me down &lt;a href="https://github.com/simonw/advent-of-code-2022-in-rust/issues/6#issuecomment-1336783594"&gt;an infuriatingly complicated warren&lt;/a&gt; of options - at one point I asked it "Why is this so hard!?" - until I finally independently stumbled across &lt;code&gt;if line.starts_with("1")&lt;/code&gt; which was exactly what I needed. Turns out I should have asked "how do I check if a string starts with another string" - using the word "character" had thrown it completely off.&lt;/p&gt;
&lt;p&gt;I also had an incident where I installed a package using &lt;code&gt;cargo add itertools&lt;/code&gt; and decided I wanted to remove it. &lt;a href="https://github.com/simonw/advent-of-code-2022-in-rust/issues/5#issuecomment-1336759988"&gt;I asked ChatGPT about it&lt;/a&gt; and it confidently gave me instructions on using &lt;code&gt;cargo remove itertools&lt;/code&gt;... which turns out to be a command that does not exist! It hallucinated that, then hallucinated some more options until I gave up and figured it out by myself.&lt;/p&gt;
&lt;h4&gt;So is it working?&lt;/h4&gt;
&lt;p&gt;So far I think this is working really well.&lt;/p&gt;
&lt;p&gt;I feel like I'm beginning to get a good mental model of how Rust works, and a lot of the basic syntax is beginning to embed itself into my muscle memory.&lt;/p&gt;
&lt;p&gt;The real test is going to be if I can first make it to day 25 (with no prior Advent of Code experience I don't know how much the increasing difficulty level will interfere with my learning) and then if I can actually write a useful Rust program after that without any assistance from these AI models.&lt;/p&gt;
&lt;p&gt;And honestly, the other big benefit here is that this is simply &lt;em&gt;a lot of fun&lt;/em&gt;. I'm finding interacting with AIs in this way - as an actual exercise, not just to try them out - is deeply satisfying and intellectually stimulating.&lt;/p&gt;
&lt;h4&gt;And is this ethical?&lt;/h4&gt;
&lt;p&gt;The ethical issues around generative AI - both large language models like GPT-3 and image generation models &lt;a href="https://simonwillison.net/2022/Sep/5/laion-aesthetics-weeknotes/"&gt;such as Stable Diffusion&lt;/a&gt; - continue to be the most complex I've encountered in my career to date.&lt;/p&gt;
&lt;p&gt;I'm confident that one thing that &lt;em&gt;is&lt;/em&gt; ethical is learning as much as possible about these tools, and helping other people to understand them too.&lt;/p&gt;
&lt;p&gt;Using them for personal learning exercises like this feels to me like one of the best ways to do that.&lt;/p&gt;
&lt;p&gt;I like that this is a space where I can write code that's not going to be included in products, or used to make money. I don't feel bad about bootstrapping my Rust education off a model that was trained on a vast corpus of data collected without the permission of the people who created it.&lt;/p&gt;
&lt;p&gt;(Advent of Code does have a competitive leaderboard to see who can solve the exercises fastest. I have no interest at all in competing on that front, and I'm avoiding trying to leap on the exercises as soon as they are released.)&lt;/p&gt;
&lt;p&gt;My current ethical position around these models is best summarized as acknowledging that the technology exists now, and it can't be put back in its bottle.&lt;/p&gt;
&lt;p&gt;Our job is to figure out ways to maximize its benefit to society while minimizing the harm it causes.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/education"&gt;education&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/github"&gt;github&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/rust"&gt;rust&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/gpt-3"&gt;gpt-3&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/openai"&gt;openai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/chatgpt"&gt;chatgpt&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/github-copilot"&gt;github-copilot&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/github-issues"&gt;github-issues&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="education"/><category term="github"/><category term="projects"/><category term="ai"/><category term="rust"/><category term="gpt-3"/><category term="openai"/><category term="generative-ai"/><category term="chatgpt"/><category term="github-copilot"/><category term="llms"/><category term="ai-assisted-programming"/><category term="github-issues"/></entry><entry><title>Coping strategies for the serial project hoarder</title><link href="https://simonwillison.net/2022/Nov/26/productivity/#atom-tag" rel="alternate"/><published>2022-11-26T15:47:02+00:00</published><updated>2022-11-26T15:47:02+00:00</updated><id>https://simonwillison.net/2022/Nov/26/productivity/#atom-tag</id><summary type="html">
    &lt;p&gt;I gave a talk at DjangoCon US 2022 in San Diego last month about productivity on personal projects, titled "Massively increase your productivity on personal projects with comprehensive documentation and automated tests".&lt;/p&gt;
&lt;p&gt;The alternative title for the talk was &lt;em&gt;Coping strategies for the serial project hoarder&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;I'm maintaining a &lt;em&gt;lot&lt;/em&gt; of different projects at the moment. Somewhat unintuitively, the way I'm handling this is by scaling down techniques that I've seen working for large engineering teams spread out across multiple continents.&lt;/p&gt;
&lt;p&gt;The key trick is to ensure that every project has comprehensive documentation and automated tests. This scales my productivity horizontally, by freeing me up from needing to remember all of the details of all of the different projects I'm working on at the same time.&lt;/p&gt;
&lt;p&gt;You can watch the talk &lt;a href="https://www.youtube.com/watch?v=GLkRK2rJGB0"&gt;on YouTube&lt;/a&gt; (25 minutes). Alternatively, I've included a detailed annotated version of the slides and notes below.&lt;/p&gt;
&lt;div class="resp-container"&gt;
  &lt;iframe allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen="allowfullscreen" frameborder="0" height="315" src="https://www.youtube-nocookie.com/embed/GLkRK2rJGB0" width="560"&gt; &lt;/iframe&gt;
&lt;/div&gt;
&lt;!-- cutoff --&gt;
&lt;div class="slide"&gt;
&lt;img loading="lazy" src="https://static.simonwillison.net/static/2022/djangocon-productivity/productivity.001.jpeg" alt="Title slide: Massively increase your productivity on personal projects with comprehensive documentation and automated tests - Simon Willison, DjangoCon US 2022" style="max-width: 100%;" width="450" height="253" /&gt;&lt;div&gt;
&lt;p&gt;This was the title I originally submitted to the conference. But I realized a better title was probably...&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="slide"&gt;
&lt;img loading="lazy" src="https://static.simonwillison.net/static/2022/djangocon-productivity/productivity.003.jpeg" alt="Same title slide, but the title has been replaced" style="max-width: 100%;" width="450" height="253" /&gt;&lt;div&gt;
&lt;p&gt;Coping strategies for the serial project hoarder&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="slide"&gt;
&lt;img loading="lazy" src="https://static.simonwillison.net/static/2022/djangocon-productivity/productivity.004.jpeg" alt="A static frame from a video: a monkey sits on some steps stuffing itself with several pastries. In the longer video the monkey is handed more and more pastries and can't resist trying to hold and eat all of them at once, no matter how many it receives." style="max-width: 100%;" width="450" height="253" /&gt;&lt;div&gt;
&lt;p&gt;&lt;a href="https://twitter.com/devisridhar/status/1576170527882121217"&gt;This video&lt;/a&gt; is a neat representation of my approach to personal projects: I always have a few on the go, but I can never resist the temptation to add even more.&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="slide"&gt;
&lt;img loading="lazy" src="https://static.simonwillison.net/static/2022/djangocon-productivity/productivity.005.jpeg" alt="A screenshot of my profile on PyPI - my join date is Oct 26, 2017 and I have 185 projects listed." style="max-width: 100%;" width="450" height="253" /&gt;&lt;div&gt;
&lt;p&gt;&lt;a href="https://pypi.org/user/simonw/"&gt;My PyPI profile&lt;/a&gt; (which is only five years old) lists 185 Python packages that I've released. Technically I'm actively maintaining all of them, in that if someone reports a bug I'll push out a fix. Many of them receive new releases at least once a year.&lt;/p&gt;
&lt;p&gt;Aside: I took this screenshot using &lt;a href="https://shot-scraper.datasette.io/"&gt;shot-scraper&lt;/a&gt; with a little bit of extra JavaScript to hide a notification bar at the top of the page:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;shot-scraper &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;https://pypi.org/user/simonw/&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; \
--javascript &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;
&lt;span class="pl-s"&gt;    document.body.style.paddingTop = 0;&lt;/span&gt;
&lt;span class="pl-s"&gt;    document.querySelector(&lt;/span&gt;
&lt;span class="pl-s"&gt;        '#sticky-notifications'&lt;/span&gt;
&lt;span class="pl-s"&gt;    ).style.display = 'none';&lt;/span&gt;
&lt;span class="pl-s"&gt;  &lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt; --height 1000&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="slide"&gt;
&lt;img loading="lazy" src="https://static.simonwillison.net/static/2022/djangocon-productivity/productivity.006.jpeg" alt="A map of the world with the Eventbrite logo overlaid on it. There are pins on San Francisco, Nashville, Mendoza and Madrid." style="max-width: 100%;" width="450" height="253" /&gt;&lt;div&gt;
&lt;p&gt;How can one individual maintain 185 projects?&lt;/p&gt;
&lt;p&gt;Surprisingly, I'm using techniques that I've scaled down from working at a company with hundreds of engineers.&lt;/p&gt;
&lt;p&gt;I spent seven years at Eventbrite, during which time the engineering team grew to span three different continents. We had major engineering centers in San Francisco, Nashville, Mendoza in Argentina and Madrid in Spain.&lt;/p&gt;
&lt;p&gt;Consider timezones: engineers in Madrid and engineers in San Francisco had almost no overlap in their working hours. Good asynchronous communication was essential.&lt;/p&gt;
&lt;p&gt;Over time, I noticed that the teams that were most effective at this scale were the teams that had a strong culture of documentation and automated testing.&lt;/p&gt;
&lt;p&gt;As I started to work on my own array of smaller personal projects, I found that the same discipline that worked for large teams somehow sped me up, when intuitively I would have expected it to slow me down.&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="slide"&gt;
&lt;img loading="lazy" src="https://static.simonwillison.net/static/2022/djangocon-productivity/productivity.007.jpeg" alt="The perfect commit: Implementation + tests + documentation and a link to an issue thread" style="max-width: 100%;" width="450" height="253" /&gt;&lt;div&gt;
&lt;p&gt;I wrote an extended description of this in &lt;a href="https://simonwillison.net/2022/Oct/29/the-perfect-commit/"&gt;The Perfect Commit&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I've started structuring the majority of my work in terms of what I think of as "the perfect commit" - a commit that combines implementation, tests, documentation and a link to an issue thread.&lt;/p&gt;
&lt;p&gt;As software engineers, it's important to note that our job generally isn't to write new software: it's to make changes to existing software.&lt;/p&gt;
&lt;p&gt;As such, the commit is our unit of work. It's worth us paying attention to how we can make our commits as useful as possible.&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="slide"&gt;
&lt;img loading="lazy" src="https://static.simonwillison.net/static/2022/djangocon-productivity/productivity.008.jpeg" alt="Screenshot of a commit on GitHub: the title is Async support for prepare_jinja2_environment, closes #1809" style="max-width: 100%;" width="450" height="253" /&gt;&lt;div&gt;
&lt;p&gt;Here's &lt;a href="https://github.com/simonw/datasette/commit/ddc999ad1296e8c69cffede3e367dda059b8adad"&gt;a recent example&lt;/a&gt; from one of my projects, Datasette.&lt;/p&gt;
&lt;p&gt;It's a single commit which bundles together the implementation, some related documentation improvements and the tests that show it works. And it links back to an issue thread from the commit message.&lt;/p&gt;
&lt;p&gt;Let's talk about each component in turn.&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="slide"&gt;
&lt;img loading="lazy" src="https://static.simonwillison.net/static/2022/djangocon-productivity/productivity.009.jpeg" alt="Implementation: it should just do one thing (thing here is deliberately vague)" style="max-width: 100%;" width="450" height="253" /&gt;&lt;div&gt;
&lt;p&gt;There's not much to be said about the implementation: your commit should change something!&lt;/p&gt;
&lt;p&gt;It should only change one thing, but what that actually means varies on a case by case basis.&lt;/p&gt;
&lt;p&gt;It should be a single change that can be documented, tested and explained independently of other changes.&lt;/p&gt;
&lt;p&gt;(Being able to cleanly revert it is a useful property too.)&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="slide"&gt;
&lt;img loading="lazy" src="https://static.simonwillison.net/static/2022/djangocon-productivity/productivity.010.jpeg" alt="Tests: prove that the implementation works. Pass if the new implementation is correct, fail otherwise." style="max-width: 100%;" width="450" height="253" /&gt;&lt;div&gt;
&lt;p&gt;The goal of the tests that accompany a commit is to prove that the new implementation works.&lt;/p&gt;
&lt;p&gt;If you apply the implementation the new tests should pass. If you revert it the tests should fail.&lt;/p&gt;
&lt;p&gt;I often use &lt;code&gt;git stash&lt;/code&gt; to try this out.&lt;/p&gt;
&lt;p&gt;If you tell people they need to write tests for &lt;em&gt;every single change&lt;/em&gt; they'll often push back that this is too much of a burden, and will harm their productivity.&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="slide"&gt;
&lt;img loading="lazy" src="https://static.simonwillison.net/static/2022/djangocon-productivity/productivity.011.jpeg" alt="Every project should start with a test. assert 1 + 1 == 2 is fine! Adding tests to an existing test suite is SO MUCH less work than starting a new test suite from scratch." style="max-width: 100%;" width="450" height="253" /&gt;&lt;div&gt;
&lt;p&gt;But I find that the incremental cost of adding a test to an existing test suite keeps getting lower over time.&lt;/p&gt;
&lt;p&gt;The hard bit of testing is getting a testing framework set up in the first place - with a test runner, and fixtures, and objects under test and suchlike.&lt;/p&gt;
&lt;p&gt;Once that's in place, adding new tests becomes really easy.&lt;/p&gt;
&lt;p&gt;So my personal rule is that every new project starts with a test. It doesn't really matter what that test does - what matters is that you can run &lt;code&gt;pytest&lt;/code&gt; to run the tests, and you have an obvious place to start building more of them.&lt;/p&gt;
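&lt;p&gt;A minimal sketch of what that first test might look like. The file name and function name here are purely illustrative, not taken from any particular project:&lt;/p&gt;

```python
# tests/test_placeholder.py (hypothetical file name)
# A deliberately trivial first test: its only job is to prove that
# running pytest finds and executes tests in this project.


def test_placeholder():
    assert 1 + 1 == 2
```

&lt;p&gt;Once a file like this exists, the next real test is just another function alongside it.&lt;/p&gt;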
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="slide"&gt;
&lt;img loading="lazy" src="https://static.simonwillison.net/static/2022/djangocon-productivity/productivity.012.jpeg" alt="Cookiecutter repo templates: simonw/python-lib, simonw/click-app, simonw/datasette-plugin" style="max-width: 100%;" width="450" height="253" /&gt;&lt;div&gt;
&lt;p&gt;I maintain three &lt;a href="https://cookiecutter.readthedocs.io/"&gt;cookiecutter&lt;/a&gt; templates to help with this, for the three kinds of projects I most frequently create:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/simonw/python-lib"&gt;simonw/python-lib&lt;/a&gt; for Python libraries&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/simonw/click-app"&gt;simonw/click-app&lt;/a&gt; for command line tools&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/simonw/datasette-plugin"&gt;simonw/datasette-plugin&lt;/a&gt; for Datasette plugins&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Each of these templates creates a project with a &lt;code&gt;setup.py&lt;/code&gt; file, a README, a test suite and GitHub Actions workflows to run those tests and ship tagged releases to PyPI.&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="slide"&gt;
&lt;img loading="lazy" src="https://static.simonwillison.net/static/2022/djangocon-productivity/productivity.013.jpeg" alt="Screenshot of the GitHub page to create a new repository from python-lib-template-repository, which asks for a repository name, a description string and if the new repo should be public or private." style="max-width: 100%;" width="450" height="253" /&gt;&lt;div&gt;
&lt;p&gt;I have a trick for running &lt;code&gt;cookiecutter&lt;/code&gt; as part of creating a brand new repository on GitHub. I described that in &lt;a href="https://simonwillison.net/2021/Aug/28/dynamic-github-repository-templates/"&gt;Dynamic content for GitHub repository templates using cookiecutter and GitHub Actions&lt;/a&gt;.&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="slide"&gt;
&lt;img loading="lazy" src="https://static.simonwillison.net/static/2022/djangocon-productivity/productivity.014.jpeg" alt="Documentation: Same repository as the code! Document changes that impact external developers. Update the docs in the same commit as the change. Catch missing documentation updates in PR / code review" style="max-width: 100%;" width="450" height="253" /&gt;&lt;div&gt;
&lt;p&gt;This is a hill that I will die on: your documentation must live in the same repository as your code!&lt;/p&gt;
&lt;p&gt;You often see projects keep their documentation somewhere else, like in a wiki.&lt;/p&gt;
&lt;p&gt;Inevitably it goes out of date. And my experience is that if your documentation is out of date people will lose trust in it, which means they'll stop reading it and stop contributing to it.&lt;/p&gt;
&lt;p&gt;The gold standard of documentation has to be that it's reliably up to date with the code.&lt;/p&gt;
&lt;p&gt;The only way you can do that is if the documentation and code are in the same repository.&lt;/p&gt;
&lt;p&gt;This gives you versioned snapshots of the documentation that exactly match the code at that time.&lt;/p&gt;
&lt;p&gt;More importantly, it means you can enforce it through code review. You can say in a PR "this is great, but don't forget to update this paragraph on this page of the documentation to reflect the change you're making".&lt;/p&gt;
&lt;p&gt;If you do this you can finally get documentation that people learn to trust over time.&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="slide"&gt;
&lt;img loading="lazy" src="https://static.simonwillison.net/static/2022/djangocon-productivity/productivity.015.jpeg" alt="Bonus trick: documentation unit tests" style="max-width: 100%;" width="450" height="253" /&gt;&lt;div&gt;
&lt;p&gt;Another trick I like to use is something I call documentation unit tests.&lt;/p&gt;
&lt;p&gt;The idea here is to use unit tests to enforce that concepts introspected from your code are at least mentioned in your documentation.&lt;/p&gt;
&lt;p&gt;I wrote more about that in &lt;a href="https://simonwillison.net/2018/Jul/28/documentation-unit-tests/"&gt;Documentation unit tests&lt;/a&gt;.&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="slide"&gt;
&lt;img loading="lazy" src="https://static.simonwillison.net/static/2022/djangocon-productivity/productivity.016.jpeg" alt="Screenshot showing pytest running 26 passing tests, each with a name like test_plugin_hook_are_documented[filters_from_request]" style="max-width: 100%;" width="450" height="253" /&gt;&lt;div&gt;
&lt;p&gt;Here's an example. Datasette has &lt;a href="https://github.com/simonw/datasette/blob/0.63.1/tests/test_docs.py#L41-L53"&gt;a test&lt;/a&gt; that scans through each of the Datasette plugin hooks and checks that there is a heading for each one in the documentation.&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="slide"&gt;
&lt;img loading="lazy" src="https://static.simonwillison.net/static/2022/djangocon-productivity/productivity.017.jpeg" alt="Screenshot of the code linked to above" style="max-width: 100%;" width="450" height="253" /&gt;&lt;div&gt;
&lt;p&gt;The test itself is pretty simple: it uses &lt;code&gt;pytest&lt;/code&gt; parametrization to look through every introspected plugin hook name, and for each one checks that it has a matching heading in the documentation.&lt;/p&gt;
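&lt;p&gt;To illustrate the general shape of a documentation unit test, here is a simplified, dependency-free sketch. It is not the Datasette test itself: the hook names and documentation text are hypothetical stand-ins, and it uses one plain test with a list comparison rather than the &lt;code&gt;pytest&lt;/code&gt; parametrization described above.&lt;/p&gt;

```python
import re

# Hypothetical stand-ins: a real project would introspect the hook names
# from its code and read the documentation text from files on disk.
HOOK_NAMES = ["example_hook_one", "example_hook_two"]

DOCS_TEXT = """
example_hook_one(datasette)
---------------------------

Explanation of the first hook.

example_hook_two(datasette)
---------------------------

Explanation of the second hook.
"""


def documented_headings(docs_text):
    # Treat a line like "name(args)" followed by a line of dashes as a
    # heading, and collect just the name part.
    pattern = r"^(\w+)\(.*\)\n-+$"
    return set(re.findall(pattern, docs_text, flags=re.MULTILINE))


def undocumented_hooks(hook_names, docs_text):
    headings = documented_headings(docs_text)
    return [name for name in hook_names if name not in headings]


def test_hooks_are_documented():
    # Fails, showing the missing names, if any hook lacks a heading.
    assert undocumented_hooks(HOOK_NAMES, DOCS_TEXT) == []
```

&lt;p&gt;Parametrizing over the hook names instead gives each hook its own named pass or fail in the test output, which is what the screenshot in the previous slide shows.&lt;/p&gt;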
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="slide" id="issue-thread"&gt;
&lt;img loading="lazy" src="https://static.simonwillison.net/static/2022/djangocon-productivity/productivity.018.jpeg" alt="Everything links to an issue thread" style="max-width: 100%;" width="450" height="253" /&gt;&lt;div&gt;
&lt;p&gt;The final component of my perfect commit is this: every commit must link to an issue thread.&lt;/p&gt;
&lt;p&gt;I'll usually have these open in advance, but sometimes I'll open an issue thread just so I can close it with a commit a few seconds later!&lt;/p&gt;
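&lt;p&gt;This rule is simple enough that it could be checked mechanically. The following is only a sketch under my own assumptions: the function names are hypothetical, and it treats any &lt;code&gt;#123&lt;/code&gt;-style reference in a commit message as a link to an issue.&lt;/p&gt;

```python
import re

# Matches references such as "#1809", including inside "closes #1809".
ISSUE_REFERENCE = re.compile(r"#(\d+)")


def issue_numbers(commit_message):
    # Return every issue number referenced in the message, in order.
    return [int(number) for number in ISSUE_REFERENCE.findall(commit_message)]


def links_to_issue(commit_message):
    # True if the message references at least one issue.
    return bool(issue_numbers(commit_message))
```

&lt;p&gt;For the commit title shown earlier, "Async support for prepare_jinja2_environment, closes #1809", &lt;code&gt;issue_numbers()&lt;/code&gt; returns &lt;code&gt;[1809]&lt;/code&gt;.&lt;/p&gt;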
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="slide"&gt;
&lt;img loading="lazy" src="https://static.simonwillison.net/static/2022/djangocon-productivity/productivity.019.jpeg" alt="A screenshot of the issue titled prepare_jinja2_environment() hook should take datasette argument - it has 11 comments" style="max-width: 100%;" width="450" height="253" /&gt;&lt;div&gt;
&lt;p&gt;Here's &lt;a href="https://github.com/simonw/datasette/issues/1809"&gt;the issue&lt;/a&gt; for the commit I showed earlier. It has 11 comments, and every single one of those comments is by me.&lt;/p&gt;
&lt;p&gt;I have literally thousands of issues on GitHub that look like this: issue threads that are effectively me talking to myself about the changes that I'm making.&lt;/p&gt;
&lt;p&gt;It turns out this is a fantastic form of additional documentation.&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="slide"&gt;
&lt;img loading="lazy" src="https://static.simonwillison.net/static/2022/djangocon-productivity/productivity.020.jpeg" alt="What goes in an issue?" style="max-width: 100%;" width="450" height="253" /&gt;&lt;div&gt;
&lt;p&gt;What goes in an issue?&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Background: the reasons for the change. In six months' time you'll want to know why you did this.&lt;/li&gt;
&lt;li&gt;State of play beforehand: embed existing code, link to existing docs. I like to start my issues with "I'm going to change this code right here" - that way if I come back the next day I don't have to repeat that little piece of research.&lt;/li&gt;
&lt;li&gt;Links to things! Documentation, inspiration, clues found on StackOverflow. The idea is to capture all of the loose information floating around that topic.&lt;/li&gt;
&lt;li&gt;Code snippets illustrating potential designs and false-starts.&lt;/li&gt;
&lt;li&gt;Decisions. What did you consider? What did you decide? As programmers we make decisions constantly, all day, about everything. That work doesn't have to be invisible. Writing them down also avoids having to re-litigate them several months later when you've forgotten your original reasoning.&lt;/li&gt;
&lt;li&gt;Screenshots - of everything! Animated screenshots are even better. I even take screenshots of things like the AWS console to remind me what I did there.&lt;/li&gt;
&lt;li&gt;When you close it: a link to the updated documentation and demo&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="slide"&gt;
&lt;img loading="lazy" src="https://static.simonwillison.net/static/2022/djangocon-productivity/productivity.021.jpeg" alt="Temporal documentation. It's timestamped and contextual. You don't need to commit to keeping it up-to-date in the future (but you can add more comments if you like)" style="max-width: 100%;" width="450" height="253" /&gt;&lt;div&gt;
&lt;p&gt;The reason I love issues is that they're a form of documentation that I think of as &lt;em&gt;temporal documentation&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;Regular documentation comes with a big commitment: you have to keep it up to date in the future.&lt;/p&gt;
&lt;p&gt;Issue comments skip that commitment entirely. They're displayed with a timestamp, in the context of the work you were doing at the time.&lt;/p&gt;
&lt;p&gt;No-one will be upset or confused if you fail to keep them updated to match future changes.&lt;/p&gt;
&lt;p&gt;So it's a commitment-free form of documentation, which I for one find incredibly liberating.&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="slide"&gt;
&lt;img loading="lazy" src="https://static.simonwillison.net/static/2022/djangocon-productivity/productivity.022.jpeg" alt="Issue driven development" style="max-width: 100%;" width="450" height="253" /&gt;&lt;div&gt;
&lt;p&gt;I think of this approach as &lt;em&gt;issue driven development&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;Everything you are doing is issue-first, and from that you drive the rest of the development process.&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="slide"&gt;
&lt;img loading="lazy" src="https://static.simonwillison.net/static/2022/djangocon-productivity/productivity.023.jpeg" alt="Don't remember anything: you can go back to a project in six months and pick up right where you left off" style="max-width: 100%;" width="450" height="253" /&gt;&lt;div&gt;
&lt;p&gt;This is how it relates back to maintaining 185 projects at the same time.&lt;/p&gt;
&lt;p&gt;With issue driven development you &lt;em&gt;don't have to remember anything&lt;/em&gt; about any of these projects at all.&lt;/p&gt;
&lt;p&gt;I've had issues where I did a bunch of design work in issue comments, then dropped it, then came back 12 months later and implemented that design - without having to rethink it.&lt;/p&gt;
&lt;p&gt;I've had projects where I forgot that the project existed entirely! But I've found it again, and there's been an open issue, and I've been able to pick up work again.&lt;/p&gt;
&lt;p&gt;It's a way of working where you treat it like every project is going to be maintained by someone else, and it's the classic cliche here that the somebody else is you in the future.&lt;/p&gt;
&lt;p&gt;It horizontally scales you and lets you tackle way more interesting problems.&lt;/p&gt;
&lt;p&gt;Programmers always complain when you interrupt them - there's this idea of "flow state" and that interrupting a programmer for a moment costs them half an hour in getting back up to speed.&lt;/p&gt;
&lt;p&gt;This fixes that! It's much easier to get back to what you are doing if you have an issue thread that records where you've got to.&lt;/p&gt;
&lt;p&gt;Issue driven development is my key productivity hack for taking on much more ambitious projects in much larger quantities.&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="slide"&gt;
&lt;img loading="lazy" src="https://static.simonwillison.net/static/2022/djangocon-productivity/productivity.024.jpeg" alt="Laboratory notebooks - and a picture of a page from one by Leonardo da Vinci" style="max-width: 100%;" width="450" height="253" /&gt;&lt;div&gt;
&lt;p&gt;Another way to think about this is to compare it to laboratory notebooks.&lt;/p&gt;
&lt;p&gt;Here's &lt;a href="https://en.wikipedia.org/wiki/Studies_of_the_Fetus_in_the_Womb"&gt;a page&lt;/a&gt; from one by Leonardo da Vinci.&lt;/p&gt;
&lt;p&gt;Great scientists and great engineers have always kept detailed notes.&lt;/p&gt;
&lt;p&gt;We can use GitHub issues as a really quick and easy way to do the same thing!&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="slide"&gt;
&lt;img loading="lazy" src="https://static.simonwillison.net/static/2022/djangocon-productivity/productivity.025.jpeg" alt="Issue: Figure out how to deploy Datasette to AWS lambda using function URLs and Mangum" style="max-width: 100%;" width="450" height="253" /&gt;&lt;div&gt;
&lt;p&gt;Another thing I like to use these for is deep research tasks.&lt;/p&gt;
&lt;p&gt;Here's an example, from when I was trying to figure out how to run my Python web application in an AWS Lambda function:&lt;/p&gt;
&lt;p&gt;&lt;a href="https://github.com/simonw/public-notes/issues/6"&gt;Figure out how to deploy Datasette to AWS Lambda using function URLs and Mangum&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;This took me 65 comments over the course of a few days... but by the end of that thread I'd figured out how to do it!&lt;/p&gt;
&lt;p&gt;Here's the follow-up, with another 77 comments, in which I &lt;a href="https://github.com/simonw/public-notes/issues/1"&gt;figure out how to serve an AWS Lambda function with a Function URL from a custom subdomain&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I will never have to figure this out ever again! That's a huge win.&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="slide"&gt;
&lt;img loading="lazy" src="https://static.simonwillison.net/static/2022/djangocon-productivity/productivity.026.jpeg" alt="simonw/public-notes/issues" style="max-width: 100%;" width="450" height="253" /&gt;&lt;div&gt;
&lt;p&gt;&lt;a href="https://github.com/simonw/public-notes"&gt;https://github.com/simonw/public-notes&lt;/a&gt; is a public repository where I keep some of these issue threads, transferred from my private notes repos &lt;a href="https://til.simonwillison.net/github/transfer-issue-private-to-public"&gt;using this trick&lt;/a&gt;.&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="slide"&gt;
&lt;img loading="lazy" src="https://static.simonwillison.net/static/2022/djangocon-productivity/productivity.027.jpeg" alt="Tell people what you did! (It's so easy to skip this step)" style="max-width: 100%;" width="450" height="253" /&gt;&lt;div&gt;
&lt;p&gt;The last thing I want to encourage you to do is this: if you do a project, tell people what it is you did!&lt;/p&gt;
&lt;p&gt;This counts for both personal and work projects. It's so easy to skip this step.&lt;/p&gt;
&lt;p&gt;Once you've shipped a feature or built a project, it's so tempting to skip the step of spending half an hour or more writing about the work you have done.&lt;/p&gt;
&lt;p&gt;But you are missing out on &lt;em&gt;so much&lt;/em&gt; of the value of your work if you don't give other people a chance to understand what you did.&lt;/p&gt;
&lt;p&gt;I wrote more about this here: &lt;a href="https://simonwillison.net/2022/Nov/6/what-to-blog-about/"&gt;What to blog about&lt;/a&gt;.&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="slide"&gt;
&lt;img loading="lazy" src="https://static.simonwillison.net/static/2022/djangocon-productivity/productivity.028.jpeg" alt="Release notes (with dates)" style="max-width: 100%;" width="450" height="253" /&gt;&lt;div&gt;
&lt;p&gt;For projects with releases, release notes are a really good way to do this.&lt;/p&gt;
&lt;p&gt;I like using GitHub releases for this - they're quick and easy to write, and I have automation set up for my projects such that creating release notes in GitHub triggers a build and release to PyPI.&lt;/p&gt;
&lt;p&gt;I've done over 1,000 releases in this way. Having them automated is crucial, and having automation makes it really easy to ship releases more often.&lt;/p&gt;
&lt;p&gt;Please make sure your release notes have dates on them. I need to know when your change went out, because if it's only a week old it's unlikely people will have upgraded to it yet, whereas a change from five years ago is probably safe to depend on.&lt;/p&gt;
&lt;p&gt;I wrote more about &lt;a href="https://simonwillison.net/2022/Jan/31/release-notes/"&gt;writing better release notes&lt;/a&gt; here.&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="slide"&gt;
&lt;img loading="lazy" src="https://static.simonwillison.net/static/2022/djangocon-productivity/productivity.029.jpeg" alt="Expand your definition of done to include writing about what you did" style="max-width: 100%;" width="450" height="253" /&gt;&lt;div&gt;
&lt;p&gt;This is a mental trick which works really well for me. "No project of mine is finished until I've told people about it in some way" is a really useful habit to form.&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="slide"&gt;
&lt;img loading="lazy" src="https://static.simonwillison.net/static/2022/djangocon-productivity/productivity.030.jpeg" alt="Twitter threads (embed images + links + videos)" style="max-width: 100%;" width="450" height="253" /&gt;&lt;div&gt;
&lt;p&gt;Twitter threads are (or were) a great low-effort way to write about a project. Build a quick thread with some links and images, and maybe even a video.&lt;/p&gt;
&lt;p&gt;Get a little unit about your project out into the world, and then you can stop thinking about it.&lt;/p&gt;
&lt;p&gt;(I'm trying to do this &lt;a href="https://simonwillison.net/2022/Nov/5/mastodon/"&gt;on Mastodon now&lt;/a&gt; instead.)&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="slide"&gt;
&lt;img loading="lazy" src="https://static.simonwillison.net/static/2022/djangocon-productivity/productivity.031.jpeg" alt="Get a blog" style="max-width: 100%;" width="450" height="253" /&gt;&lt;div&gt;
&lt;p&gt;Even better: get a blog! Having your own corner of the internet to write about the work that you are doing is a small investment that will pay off many times over.&lt;/p&gt;
&lt;p&gt;("Nobody blogs anymore" I said in the talk... Phil Gyford disagrees with that meme so much that he launched &lt;a href="https://ooh.directory/blog/2022/welcome/"&gt;a new blog directory&lt;/a&gt; to show how wrong it is.)&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="slide"&gt;
&lt;img loading="lazy" src="https://static.simonwillison.net/static/2022/djangocon-productivity/productivity.032.jpeg" alt="GUILT is the enemy of projects" style="max-width: 100%;" width="450" height="253" /&gt;&lt;div&gt;
&lt;p&gt;The enemy of projects, especially personal projects, is &lt;em&gt;guilt&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;The more projects you have, the more guilty you feel about working on any one of them - because you're not working on the others, and those projects haven't yet achieved their goals.&lt;/p&gt;
&lt;p&gt;You have to overcome guilt if you're going to work on 185 projects at once!&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="slide" id="avoid-user-accounts"&gt;
&lt;img loading="lazy" src="https://static.simonwillison.net/static/2022/djangocon-productivity/productivity.033.jpeg" alt="Avoid side projects with user accounts. If it has user accounts it's not a side-project, it's an unpaid job." style="max-width: 100%;" width="450" height="253" /&gt;&lt;div&gt;
&lt;p&gt;This is the most important tip: avoid side projects with user accounts.&lt;/p&gt;
&lt;p&gt;If you build something that people can sign into, that's not a side-project, it's an unpaid job. It's a very big responsibility, avoid at all costs!&lt;/p&gt;
&lt;p&gt;Almost all of my projects right now are open source things that people can run on their own machines, because that's about as far away from user accounts as I can get.&lt;/p&gt;
&lt;p&gt;I still have a responsibility for shipping security updates and things like that, but at least I'm not holding onto other people's data for them.&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="slide"&gt;
&lt;img loading="lazy" src="https://static.simonwillison.net/static/2022/djangocon-productivity/productivity.034.jpeg" alt="If your project is tested and documented, you have nothing to feel guilty about. That's what I tell myself anyway!" style="max-width: 100%;" width="450" height="253" /&gt;&lt;div&gt;
&lt;p&gt;I feel like if your project is tested and documented, &lt;em&gt;you have nothing to feel guilty about&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;You have put a thing out into the world, and it has tests to show that it works, and it has documentation that explains what it is.&lt;/p&gt;
&lt;p&gt;This means I can step back and say that it's OK for me to work on other things. That thing there is a unit that makes sense to people.&lt;/p&gt;
&lt;p&gt;That's what I tell myself anyway! It's OK to have 185 projects provided they all have documentation and they all have tests.&lt;/p&gt;
&lt;p&gt;Do that and the guilt just disappears. You can live guilt free!&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="slide"&gt;
&lt;img loading="lazy" src="https://static.simonwillison.net/static/2022/djangocon-productivity/productivity.035.jpeg" alt="Thank you - simonwillison.net - twitter.com/simonw / github.com/simonw" style="max-width: 100%;" width="450" height="253" /&gt;&lt;div&gt;
&lt;p&gt;You can follow me on Mastodon at &lt;a href="https://fedi.simonwillison.net/@simon"&gt;@simon@simonwillison.net&lt;/a&gt; or on GitHub at &lt;a href="https://github.com/simonw"&gt;github.com/simonw&lt;/a&gt;. Or subscribe to my blog at &lt;a href="https://simonwillison.net/"&gt;simonwillison.net&lt;/a&gt;!&lt;/p&gt;
&lt;p&gt;From the Q&amp;amp;A:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;You've tweeted about using GitHub Projects. Could you talk about that?
&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://docs.github.com/en/issues/planning-and-tracking-with-projects/learning-about-projects/about-projects"&gt;GitHub Projects V2&lt;/a&gt; is the perfect TODO list for me, because it lets me bring together issues from different repositories. I use a project called "Everything" on a daily basis (it's my browser default window) - I add issues to it that I plan to work on, including personal TODO list items as well as issues from my various public and private repositories. It's kind of like a cross between Trello and Airtable and I absolutely love it.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;How did you move notes from the private to the public repo?
&lt;ul&gt;
&lt;li&gt;GitHub doesn't let you do this. But there's a trick I use involving a &lt;code&gt;temp&lt;/code&gt; repo which I switch between public and private to help transfer notes. More in &lt;a href="https://til.simonwillison.net/github/transfer-issue-private-to-public"&gt;this TIL&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Question about the perfect commit: do you commit your failing tests?
&lt;ul&gt;
&lt;li&gt;I don't: I try to keep the commits that land on my &lt;code&gt;main&lt;/code&gt; branch always passing. I'll sometimes write the failing test before the implementation and then commit them together. For larger projects I'll work in a branch and then squash-merge the final result into a perfect commit to main later on.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;&lt;/div&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/djangocon"&gt;djangocon&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/documentation"&gt;documentation&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/productivity"&gt;productivity&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/my-talks"&gt;my-talks&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/testing"&gt;testing&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/annotated-talks"&gt;annotated-talks&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/github-issues"&gt;github-issues&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="djangocon"/><category term="documentation"/><category term="productivity"/><category term="my-talks"/><category term="testing"/><category term="annotated-talks"/><category term="github-issues"/></entry><entry><title>The Perfect Commit</title><link href="https://simonwillison.net/2022/Oct/29/the-perfect-commit/#atom-tag" rel="alternate"/><published>2022-10-29T20:41:01+00:00</published><updated>2022-10-29T20:41:01+00:00</updated><id>https://simonwillison.net/2022/Oct/29/the-perfect-commit/#atom-tag</id><summary type="html">
    &lt;p&gt;For the last few years I've been trying to center my work around creating what I consider to be the &lt;em&gt;Perfect Commit&lt;/em&gt;. This is a single commit that contains all of the following:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The &lt;strong&gt;implementation&lt;/strong&gt;: a single, focused change&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tests&lt;/strong&gt; that demonstrate the implementation works&lt;/li&gt;
&lt;li&gt;Updated &lt;strong&gt;documentation&lt;/strong&gt; reflecting the change&lt;/li&gt;
&lt;li&gt;A link to an &lt;strong&gt;issue thread&lt;/strong&gt; providing further context&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Our job as software engineers generally isn't to write new software from scratch: we spend the majority of our time adding features and fixing bugs in existing software.&lt;/p&gt;
&lt;p&gt;The commit is our principal unit of work. It deserves to be treated thoughtfully and with care.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;&lt;strong&gt;Update 26th November 2022&lt;/strong&gt;: My 25 minute talk &lt;a href="https://simonwillison.net/2022/Nov/26/productivity/"&gt;Massively increase your productivity on personal projects with comprehensive documentation and automated tests&lt;/a&gt; describes this approach to software development in detail.&lt;/em&gt;&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2022/Oct/29/the-perfect-commit/#implementation"&gt;Implementation&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2022/Oct/29/the-perfect-commit/#tests"&gt;Tests&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2022/Oct/29/the-perfect-commit/#documentation"&gt;Documentation&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2022/Oct/29/the-perfect-commit/#link-to-an-issue"&gt;A link to an issue&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2022/Oct/29/the-perfect-commit/#issue-over-commit-message"&gt;An issue is more valuable than a commit message&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2022/Oct/29/the-perfect-commit/#not-all-perfect"&gt;Not every commit needs to be "perfect"&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2022/Oct/29/the-perfect-commit/#scrappy-branches"&gt;Write scrappy commits in a branch&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2022/Oct/29/the-perfect-commit/#examples"&gt;Some examples&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h4 id="implementation"&gt;Implementation&lt;/h4&gt;
&lt;p&gt;Each commit should change a single thing.&lt;/p&gt;
&lt;p&gt;The definition of "thing" here is left deliberately vague!&lt;/p&gt;
&lt;p&gt;The goal is to have something that can be easily reviewed, and that can be clearly understood in the future when revisited using tools like &lt;code&gt;git blame&lt;/code&gt; or &lt;a href="https://til.simonwillison.net/git/git-bisect"&gt;git bisect&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I like to keep my commit history linear, as I find that makes it much easier to comprehend later. This further reinforces the value of each commit being a single, focused change.&lt;/p&gt;
&lt;p&gt;Atomic commits are also much easier to cleanly revert if something goes wrong - or to cherry-pick into other branches.&lt;/p&gt;
&lt;p&gt;For things like web applications that can be deployed to production, a commit should be a unit that can be deployed. Aiming to keep the main branch in a deployable state is a good rule of thumb for deciding if a commit is a sensible atomic change or not.&lt;/p&gt;
&lt;h4 id="tests"&gt;Tests&lt;/h4&gt;
&lt;p&gt;The ultimate goal of tests is to &lt;em&gt;increase&lt;/em&gt; your productivity. If your testing practices are slowing you down, you should consider ways to improve them.&lt;/p&gt;
&lt;p&gt;In the longer term, this productivity improvement comes from gaining the freedom to make changes and stay confident that your change hasn't broken something else.&lt;/p&gt;
&lt;p&gt;But tests can help increase productivity in the immediate short term as well.&lt;/p&gt;
&lt;p&gt;How do you know when the change you have made is finished and ready to commit? It's ready when the new tests pass.&lt;/p&gt;
&lt;p&gt;I find this reduces the time I spend second-guessing myself and questioning whether I've done enough and thought through all of the edge cases.&lt;/p&gt;
&lt;p&gt;Without tests, there's a very strong possibility that your change will have broken some other, potentially unrelated feature. Your commit could be held up by hours of tedious manual testing. Or you could &lt;abbr title="You Only Live Once"&gt;YOLO&lt;/abbr&gt; it and learn that you broke something important later!&lt;/p&gt;
&lt;p&gt;Writing tests becomes far less time consuming if you already have good testing practices in place.&lt;/p&gt;
&lt;p&gt;Adding a new test to a project with a lot of existing tests is easy: you can often find an existing test that has 90% of the pattern you need already worked out for you.&lt;/p&gt;
&lt;p&gt;If your project has no tests at all, adding a test for your change will be a lot more work.&lt;/p&gt;
&lt;p&gt;This is why I start every single one of my projects with a passing test. It doesn't matter what this test is - &lt;code&gt;assert 1 + 1 == 2&lt;/code&gt; is fine! The key thing is to get a testing framework in place, such that you can run a command (for me that's usually &lt;code&gt;pytest&lt;/code&gt;) to execute the test suite - and you have an obvious place to add new tests in the future.&lt;/p&gt;
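&lt;p&gt;As a rough illustration, that first test can be as small as the following. The file name &lt;code&gt;tests/test_example.py&lt;/code&gt; and the function name are assumptions; anything &lt;code&gt;pytest&lt;/code&gt; can discover will do.&lt;/p&gt;

```python
# tests/test_example.py (hypothetical file name)
def test_arithmetic_still_works():
    # A placeholder whose only job is to prove the test runner is wired up
    assert 1 + 1 == 2
```

&lt;p&gt;Running &lt;code&gt;pytest&lt;/code&gt; in the project root should then report one passing test, and later tests can be added alongside it.&lt;/p&gt;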
&lt;p&gt;I use &lt;a href="https://simonwillison.net/2021/Aug/28/dynamic-github-repository-templates/"&gt;these cookiecutter templates&lt;/a&gt; for almost all of my new projects. They configure a testing framework with a single passing test and GitHub Actions workflows to exercise it all from the very start.&lt;/p&gt;
&lt;p&gt;I'm not a huge advocate of test-first development, where tests are written before the code itself. What I care about is tests-included development, where the final commit bundles the tests and the implementation together. I wrote more about my approach to testing in &lt;a href="https://simonwillison.net/2020/Feb/11/cheating-at-unit-tests-pytest-black/"&gt;How to cheat at unit tests with pytest and Black&lt;/a&gt;.&lt;/p&gt;
&lt;h4 id="documentation"&gt;Documentation&lt;/h4&gt;
&lt;p&gt;If your project defines APIs that are meant to be used outside of your project, they need to be documented. In my work these projects are usually one of the following:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Python APIs (modules, functions and classes) that provide code designed to be imported into other projects.&lt;/li&gt;
&lt;li&gt;Web APIs - usually JSON over HTTP these days - that provide functionality to be consumed by other applications.&lt;/li&gt;
&lt;li&gt;Command line interface tools, such as those implemented using &lt;a href="https://click.palletsprojects.com/"&gt;Click&lt;/a&gt; or &lt;a href="https://typer.tiangolo.com/"&gt;Typer&lt;/a&gt; or &lt;a href="https://docs.python.org/3/library/argparse.html"&gt;argparse&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;It is critical that this documentation &lt;strong&gt;must live in the same repository as the code itself&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;This is important for a number of reasons.&lt;/p&gt;
&lt;p&gt;Documentation is only valuable &lt;strong&gt;if people trust it&lt;/strong&gt;. People will only trust it if they know that it is kept up to date.&lt;/p&gt;
&lt;p&gt;If your docs live in a separate wiki somewhere it's easy for them to get out of date - but more importantly it's hard for anyone to quickly confirm if the documentation is being updated in sync with the code or not.&lt;/p&gt;
&lt;p&gt;Documentation should be &lt;strong&gt;versioned&lt;/strong&gt;. People need to be able to find the docs for the specific version of your software that they are using. Keeping it in the same repository as the code gives you synchronized versioning for free.&lt;/p&gt;
&lt;p&gt;Documentation changes should be &lt;strong&gt;reviewed&lt;/strong&gt; in the same way as your code. If they live in the same repository you can catch changes that need to be reflected in the documentation as part of your code review process.&lt;/p&gt;
&lt;p&gt;And ideally, documentation should be &lt;strong&gt;tested&lt;/strong&gt;. I wrote about my approach to doing this using &lt;a href="https://simonwillison.net/2018/Jul/28/documentation-unit-tests/"&gt;Documentation unit tests&lt;/a&gt;. Executing example code in the documentation using a testing framework is a great idea too.&lt;/p&gt;
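&lt;p&gt;One way to execute example code from documentation is the standard library &lt;code&gt;doctest&lt;/code&gt; module. This is a sketch under assumptions rather than the approach from the linked article: the documentation snippet and the &lt;code&gt;run_doc_examples()&lt;/code&gt; helper are invented for illustration.&lt;/p&gt;

```python
import doctest

# Hypothetical documentation text containing an interactive example
DOCS_SNIPPET = """
Doubling a list repeats its items:

    >>> [1, 2] * 2
    [1, 2, 1, 2]
"""


def run_doc_examples(text, name="docs"):
    """Run every interactive example in text; return (failed, attempted)."""
    parser = doctest.DocTestParser()
    test = parser.get_doctest(text, globs={}, name=name, filename=None, lineno=0)
    runner = doctest.DocTestRunner(verbose=False)
    runner.run(test)
    results = runner.summarize(verbose=False)
    return results.failed, results.attempted


def test_documentation_examples_pass():
    failed, attempted = run_doc_examples(DOCS_SNIPPET)
    assert attempted == 1
    assert failed == 0
```

&lt;p&gt;If the documented output drifts away from what the code actually returns, the test fails and the docs get fixed in the same commit.&lt;/p&gt;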
&lt;p&gt;As with tests, writing documentation from scratch is much more work than incrementally modifying existing documentation.&lt;/p&gt;
&lt;p&gt;Many of my commits include documentation that is just a sentence or two. This doesn't take very long to write, but it adds up to something very comprehensive over time.&lt;/p&gt;
&lt;p&gt;How about end-user facing documentation? I'm still figuring that out myself. I created my &lt;a href="https://simonwillison.net/2022/Mar/10/shot-scraper/"&gt;shot-scraper tool&lt;/a&gt; to help automate the process of keeping screenshots up-to-date, but I've not yet found personal habits and styles for end-user documentation that I'm confident in.&lt;/p&gt;
&lt;h4 id="link-to-an-issue"&gt;A link to an issue&lt;/h4&gt;
&lt;p&gt;Every perfect commit should include a link to an issue thread that accompanies that change.&lt;/p&gt;
&lt;p&gt;Sometimes I'll even open an issue seconds before writing the commit message, just to give myself something I can link to from the commit itself!&lt;/p&gt;
&lt;p&gt;The reason I like issue threads is that they provide effectively unlimited space for commentary and background for the change that is being made.&lt;/p&gt;
&lt;p&gt;Most of my issue threads are me talking to myself - sometimes with dozens of issue comments, all written by me.&lt;/p&gt;
&lt;p&gt;Things that can go in an issue thread include:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Background&lt;/strong&gt;: the reason for the change. I try to include this in the opening comment.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;State of play&lt;/strong&gt; before the change. I'll often link to the current version of the code and documentation. This is great if I return to an open issue a few days later, as it saves me from having to repeat that initial research.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Links to things&lt;/strong&gt;. So many links! Inspiration for the change, relevant documentation, conversations on Slack or Discord, clues found on StackOverflow.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Code snippets&lt;/strong&gt; illustrating potential designs and false-starts. Use &lt;code&gt;```python ... ```&lt;/code&gt; blocks to get syntax highlighting in your issue comments.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Decisions&lt;/strong&gt;. What did you consider? What did you decide? As programmers we make hundreds of tiny decisions a day. Write them down! Then you'll never find yourself relitigating them in the future having forgotten your original reasoning.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Screenshots&lt;/strong&gt;. What it looked like before, what it looked like after. Animated screenshots are even better! I use &lt;a href="https://www.cockos.com/licecap/"&gt;LICEcap&lt;/a&gt; to generate quick GIF screen captures or QuickTime to capture videos - both of which can be dropped straight into a GitHub issue comment.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Prototypes&lt;/strong&gt;. I'll often paste a few lines of code copied from a Python console session. Sometimes I'll even paste in a block of HTML and CSS, or add a screenshot of a UI prototype.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;After I've closed my issues I like to add one last comment that links to the updated documentation and ideally a live demo of the new feature.&lt;/p&gt;
&lt;h4 id="issue-over-commit-message"&gt;An issue is more valuable than a commit message&lt;/h4&gt;
&lt;p&gt;I went through a several-year phase of writing essays in my commit messages, trying to capture as much of the background context and thinking as possible.&lt;/p&gt;
&lt;p&gt;My commit messages grew a lot shorter when I started bundling the updated documentation in the commit - since often much of the material I'd previously included in the commit message was now in that documentation instead.&lt;/p&gt;
&lt;p&gt;As I extended my practice of writing issue threads, I found that they were a better place for most of this context than the commit messages themselves. They supported embedded media, were more discoverable and I could continue to extend them even after the commit had landed.&lt;/p&gt;
&lt;p&gt;Today many of my commit messages are a single line summary and a link to an issue!&lt;/p&gt;
&lt;p&gt;The biggest benefit of lengthy commit messages is that they are guaranteed to survive for as long as the repository itself. If you're going to use issue threads in the way I describe here it is critical that you consider their long term archival value.&lt;/p&gt;
&lt;p&gt;I expect this to be controversial! I'm advocating for abandoning one of the core ideas of Git here - that each repository should incorporate a full, decentralized record of its history that is copied in its entirety when someone clones a repo.&lt;/p&gt;
&lt;p&gt;I understand that philosophy. All I'll say here is that my own experience has been that dropping that requirement has resulted in a net increase in my overall productivity. Other people may reach a different conclusion.&lt;/p&gt;
&lt;p&gt;If this offends you too much, you're welcome to construct an &lt;em&gt;even more perfect commit&lt;/em&gt; that incorporates background information and additional context in an extended commit message as well.&lt;/p&gt;
&lt;p&gt;One of the reasons I like GitHub Issues is that it includes a comprehensive API, which can be used to extract all of that data. I use my &lt;a href="https://github.com/dogsheep/github-to-sqlite"&gt;github-to-sqlite tool&lt;/a&gt; to maintain an ongoing archive of my issues and issue comments as a SQLite database file.&lt;/p&gt;
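&lt;p&gt;To give a feel for what such an archive involves, here is a minimal sketch that stores already-fetched issue data in SQLite. It is not how &lt;code&gt;github-to-sqlite&lt;/code&gt; works: the &lt;code&gt;archive_issues()&lt;/code&gt; function, the table layout and the sample data are all assumptions, and fetching from the API is left out entirely.&lt;/p&gt;

```python
import json
import sqlite3


def archive_issues(conn, issues):
    """Insert or replace issue dicts in a table keyed on issue number."""
    conn.execute(
        "create table if not exists issues ("
        "number integer primary key, title text, state text, raw_json text)"
    )
    conn.executemany(
        "insert or replace into issues (number, title, state, raw_json) "
        "values (?, ?, ?, ?)",
        [
            (issue["number"], issue["title"], issue["state"], json.dumps(issue))
            for issue in issues
        ],
    )
    conn.commit()


# Canned data standing in for a parsed API response (hypothetical values)
sample_issues = [
    {"number": 1, "title": "First issue", "state": "closed"},
    {"number": 2, "title": "Second issue", "state": "open"},
]
conn = sqlite3.connect(":memory:")
archive_issues(conn, sample_issues)
archive_issues(conn, sample_issues)  # Re-running replaces rows, no duplicates
```

&lt;p&gt;Keeping the full JSON alongside a few indexed columns means nothing is lost if the table layout turns out to be too narrow later.&lt;/p&gt;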
&lt;h4 id="not-all-perfect"&gt;Not every commit needs to be "perfect"&lt;/h4&gt;
&lt;p&gt;I find that the vast majority of my work fits into this pattern, but there are exceptions.&lt;/p&gt;
&lt;p&gt;Typo fix for some documentation or a comment? Just ship it, it's fine.&lt;/p&gt;
&lt;p&gt;Bug fix that doesn't deserve documentation? Still bundle the implementation and the test plus a link to an issue, but no need to update the docs - especially if they already describe the expected bug-free behaviour.&lt;/p&gt;
&lt;p&gt;Generally though, I find that aiming for implementation, tests, documentation and an issue link covers almost all of my work. It's a really good default model.&lt;/p&gt;
&lt;h4 id="scrappy-branches"&gt;Write scrappy commits in a branch&lt;/h4&gt;
&lt;p&gt;If I'm writing more exploratory or experimental code it often doesn't make sense to work in this strict way. For those instances I'll usually work in a branch, where I can ship "WIP" commit messages and failing tests with abandon. I'll then squash-merge them into a single perfect commit (sometimes via a self-closed GitHub pull request) to keep my main branch as tidy as possible.&lt;/p&gt;
&lt;h4 id="examples"&gt;Some examples&lt;/h4&gt;
&lt;p&gt;Here are some examples of my commits that follow this pattern:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/simonw/datasette/commit/9676b2deb07cff20247ba91dad3e84a4ab0b00d1"&gt;Upgrade Docker images to Python 3.11&lt;/a&gt; for &lt;a href="https://github.com/simonw/datasette/issues/1853"&gt;datasette #1853&lt;/a&gt; - a pretty tiny change, but still includes tests, docs and an issue link.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/simonw/sqlite-utils/commit/ab8d4aad0c42f905640981f6f24bc1e37205ae62"&gt;sqlite-utils schema now takes optional tables&lt;/a&gt; for &lt;a href="https://github.com/simonw/sqlite-utils/issues/299"&gt;sqlite-utils #299&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/simonw/shot-scraper/commit/5048e21a1ca5accedfeca6ac25a16a38dc240b81"&gt;shot-scraper html command&lt;/a&gt; for &lt;a href="https://github.com/simonw/shot-scraper/issues/96"&gt;shot-scraper #96&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/simonw/s3-credentials/commit/c7bb7268c4a124349bb511f7ec3ee3f28f9581ad"&gt;s3-credentials put-objects command&lt;/a&gt; for &lt;a href="https://github.com/simonw/s3-credentials/issues/68"&gt;s3-credentials #68&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/simonw/datasette-gunicorn/commit/0d561d7a94f76079b1eb7779b3e944c163d2539e"&gt;Initial implementation&lt;/a&gt; for &lt;a href="https://github.com/simonw/datasette-gunicorn/issues/1"&gt;datasette-gunicorn #1&lt;/a&gt; - this was the first commit to this repository, but I still bundled the tests, docs, implementation and a link to an issue.&lt;/li&gt;
&lt;/ul&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/code-review"&gt;code-review&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/definitions"&gt;definitions&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/documentation"&gt;documentation&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/git"&gt;git&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/github"&gt;github&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/software-engineering"&gt;software-engineering&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/testing"&gt;testing&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/github-issues"&gt;github-issues&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="code-review"/><category term="definitions"/><category term="documentation"/><category term="git"/><category term="github"/><category term="software-engineering"/><category term="testing"/><category term="github-issues"/></entry><entry><title>A tool to run caption extraction against online videos using Whisper and GitHub Issues/Actions</title><link href="https://simonwillison.net/2022/Sep/30/action-transcription/#atom-tag" rel="alternate"/><published>2022-09-30T00:56:28+00:00</published><updated>2022-09-30T00:56:28+00:00</updated><id>https://simonwillison.net/2022/Sep/30/action-transcription/#atom-tag</id><summary type="html">
    &lt;p&gt;I released a new project this weekend, built during the Bellingcat Hackathon (I came second!) It's called &lt;a href="https://github.com/simonw/action-transcription"&gt;Action Transcription&lt;/a&gt; and it's a tool for capturing captions and transcripts from online videos.&lt;/p&gt;
&lt;p&gt;Here's &lt;a href="https://www.youtube.com/watch?v=AneNxjSGn1I"&gt;my video&lt;/a&gt; introducing the new tool:&lt;/p&gt;
&lt;iframe allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen="allowfullscreen" frameborder="0" height="315" src="https://www.youtube-nocookie.com/embed/AneNxjSGn1I" style="max-width: 100%" width="560"&gt; &lt;/iframe&gt;
&lt;h4&gt;Bellingcat&lt;/h4&gt;
&lt;p&gt;&lt;a href="https://www.bellingcat.com/about/"&gt;Bellingcat&lt;/a&gt; describe themselves as an "independent international collective of researchers, investigators and citizen journalists using open source and social media investigation to probe a variety of subjects".&lt;/p&gt;
&lt;p&gt;They specialize in open source intelligence - which, confusingly, does NOT mean "open source software" - this is a &lt;a href="https://en.wikipedia.org/wiki/Open-source_intelligence"&gt;much older usage of the term&lt;/a&gt; that describes the use of publicly available information to gather intelligence.&lt;/p&gt;
&lt;p&gt;They have broken a LOT of impressive stories over their eight-year lifespan. Wikipedia &lt;a href="https://en.wikipedia.org/wiki/Bellingcat"&gt;has a good list&lt;/a&gt; - highlights include identifying the suspects behind the &lt;a href="https://en.wikipedia.org/wiki/Bellingcat#Skripal_poisoning"&gt;Skripal poisoning case&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The theme of the hackathon was "General Digital Investigation Tools". The goal was to build prototypes of tools that could be used by their community of investigators - most of whom are volunteers working from home with little-to-no budget, and often with limited technical skills (they can use tools very effectively but they might not be comfortable writing code or using the command-line).&lt;/p&gt;
&lt;p&gt;Inspired by the recent release of &lt;a href="https://github.com/openai/whisper"&gt;OpenAI's Whisper&lt;/a&gt;, I decided to build a tool that would make it easier to extract captions and transcripts from videos on social media sites.&lt;/p&gt;
&lt;h4&gt;Why GitHub Actions and GitHub Issues?&lt;/h4&gt;
&lt;p&gt;My goals for the project were:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Help people achieve something useful&lt;/li&gt;
&lt;li&gt;Make it as inexpensive to run as possible - ideally free&lt;/li&gt;
&lt;li&gt;Make it easy for people to install and run their own copies&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I decided to build the entire thing using GitHub Actions and GitHub Issues.&lt;/p&gt;
&lt;p&gt;GitHub Actions is a powerful service for running CI jobs and other automation, but its best feature for this particular project is that it's free.&lt;/p&gt;
&lt;p&gt;I'm fine with spending money myself, but if I'm building tools for other people, having a way for them to run the tool without paying for anything is a huge win.&lt;/p&gt;
&lt;p&gt;My tool needed a UI. To keep things as simple as possible, I didn't want to host anything outside of GitHub itself. So I turned to GitHub Issues to provide the interface layer.&lt;/p&gt;
&lt;p&gt;It's easy to create Actions scripts that trigger when a new issue is created. And those scripts can then interact with that issue - attaching comments, or even closing it as completed.&lt;/p&gt;
&lt;p&gt;I decided that my flow would be:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;The user opens an issue and pastes in a link to an online video.&lt;/li&gt;
&lt;li&gt;GitHub Actions is triggered by that issue, extracts the URL and fetches the video using &lt;a href="https://youtube-dl.org/"&gt;youtube-dl&lt;/a&gt; (which, despite the name, can actually download videos from &lt;a href="http://ytdl-org.github.io/youtube-dl/supportedsites.html"&gt;over 1,200 sites&lt;/a&gt; including many of the social media services popular in Russia).&lt;/li&gt;
&lt;li&gt;The script extracts just the audio from the video.&lt;/li&gt;
&lt;li&gt;The audio is then passed through OpenAI's Whisper, which can create a high quality transcript in the original language AND create a shockingly good English translation.&lt;/li&gt;
&lt;li&gt;The caption is then both written back to the GitHub repository and attached to the original issue as a comment.&lt;/li&gt;
&lt;/ol&gt;
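&lt;p&gt;As a rough illustration of step 2, here is a minimal sketch of pulling the first URL out of an issue body. The &lt;code&gt;extractFirstUrl&lt;/code&gt; name and the regular expression are assumptions for illustration, not code taken from the actual workflow:&lt;/p&gt;

```javascript
// Hypothetical helper (not from the real workflow): find the first
// http(s) URL in the text of an issue body.
function extractFirstUrl(issueBody) {
  // Match from the scheme up to the next whitespace character
  const match = (issueBody || "").match(/https?:\/\/\S+/);
  return match ? match[0] : null;
}

console.log(
  extractFirstUrl("Please transcribe https://www.youtube.com/watch?v=AneNxjSGn1I thanks")
);
```

&lt;p&gt;A pattern this simple will also capture any punctuation attached to the end of the link, so a real implementation would probably want to trim or validate the result before handing it to a downloader.&lt;/p&gt;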
&lt;p&gt;GitHub Actions doesn't (yet) provide GPUs, and Whisper works a whole lot faster with GPU access. So I decided to run Whisper using &lt;a href="https://replicate.com/cjwbw/whisper"&gt;this hosted copy of the model on Replicate&lt;/a&gt;.&lt;/p&gt;
&lt;h4&gt;Extracting YouTube's captions directly&lt;/h4&gt;
&lt;p&gt;I had a check-in meeting with Tristan from Bellingcat just to make sure my hack wasn't a duplicate effort, and to get feedback on the plan.&lt;/p&gt;
&lt;p&gt;Tristan liked the plan, but pointed out that extracting captions directly from YouTube would be a useful additional feature.&lt;/p&gt;
&lt;p&gt;In addition to supporting manual captions, it turns out YouTube already creates machine-generated captions in over 100 languages! The quality of these isn't nearly as good as OpenAI Whisper, but they're still useful. And they're free (running Whisper currently costs me money).&lt;/p&gt;
&lt;p&gt;So I adapted the plan, to provide the user with two options. The default option would extract captions directly from the video provider - which would definitely work for YouTube and might work for other sites too.&lt;/p&gt;
&lt;p&gt;The second option would use Whisper to create a transcript and a translation, taking longer but providing results even for sites that didn't offer their own captions.&lt;/p&gt;
&lt;p&gt;I decided to use issue tags to trigger these two workflows: tag with "captions" to extract captions directly, tag with "whisper" to use Whisper.&lt;/p&gt;
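&lt;p&gt;The tag-based routing could be sketched like this. The &lt;code&gt;chooseMode&lt;/code&gt; function is hypothetical, and letting "whisper" win when both tags are present is my assumption rather than documented behaviour. In an &lt;code&gt;actions/github-script&lt;/code&gt; step the names would most likely be read from &lt;code&gt;context.payload.issue.labels&lt;/code&gt;:&lt;/p&gt;

```javascript
// Hypothetical dispatcher: map the names of an issue's labels to a mode.
function chooseMode(labelNames) {
  // Assumption: if both labels are present, prefer the Whisper route
  if (labelNames.includes("whisper")) return "whisper";
  if (labelNames.includes("captions")) return "captions";
  return null; // neither label present, so there is nothing to do
}

// Label objects in an issue payload carry a "name" property
const labels = [{ name: "captions" }];
console.log(chooseMode(labels.map((label) => label.name)));
```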
&lt;h4&gt;The implementation&lt;/h4&gt;
&lt;p&gt;The implementation ended up being &lt;a href="https://github.com/simonw/action-transcription/blob/7d900b209c6c465df35a27bb812d03754677cb78/.github/workflows/issue_created.yml"&gt;218 lines&lt;/a&gt; of JavaScript-embedded-in-YAML in a GitHub Actions &lt;code&gt;issue_created.yml&lt;/code&gt; workflow.&lt;/p&gt;
&lt;p&gt;I used &lt;a href="https://github.com/actions/github-script"&gt;actions/github-script&lt;/a&gt; for it - a convenient reusable Action that provides a pre-configured set of JavaScript objects for interacting with the GitHub API.&lt;/p&gt;
&lt;p&gt;The code isn't hugely elegant: I'm not hugely familiar with the Node.js ecosystem so I ended up hacking around with Copilot quite a bit to figure out the patterns that would work.&lt;/p&gt;
&lt;p&gt;It turns out captions can come back in a variety of different formats. The two most common appeared to be TTML - which uses XML, and WebVTT, a text-based format.&lt;/p&gt;
&lt;p&gt;I decided to archive the original caption files in the GitHub repository itself, but I wanted to extract just the text and post that as the issue comment.&lt;/p&gt;
&lt;p&gt;So I ended up building two tiny new tools: &lt;a href="https://github.com/simonw/webvtt-to-json"&gt;webvtt-to-json&lt;/a&gt; and &lt;a href="https://github.com/simonw/ttml-to-json"&gt;ttml-to-json&lt;/a&gt; - which converted the different formats into a standard JSON format of my own invention, normalizing the captions so I could then extract the text and include it in a comment.&lt;/p&gt;
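&lt;p&gt;To give a flavour of what that normalization involves, here is a minimal sketch of turning WebVTT into a list of cues and then into plain text. This is not the code from &lt;code&gt;webvtt-to-json&lt;/code&gt;: the function names and the &lt;code&gt;start&lt;/code&gt;/&lt;code&gt;end&lt;/code&gt;/&lt;code&gt;text&lt;/code&gt; shape are assumptions, and it ignores details such as inline styling tags and cue identifiers with unusual content.&lt;/p&gt;

```javascript
// Minimal WebVTT sketch (hypothetical, not the webvtt-to-json code):
// returns one {start, end, text} object per cue.
function parseWebVtt(vtt) {
  const cues = [];
  // Normalize line endings, then split into blocks on blank lines
  for (const block of vtt.replace(/\r\n/g, "\n").split(/\n\n+/)) {
    const lines = block.split("\n").filter((line) => line.trim() !== "");
    // The timing line is the one containing the "-->" separator
    const timingIndex = lines.findIndex((line) => line.includes("-->"));
    if (timingIndex === -1) continue; // header, NOTE or STYLE block
    const [start, rest] = lines[timingIndex].split("-->");
    cues.push({
      start: start.trim(),
      // Drop any cue settings that follow the end timestamp
      end: rest.trim().split(/\s+/)[0],
      // Everything after the timing line is the caption text
      text: lines.slice(timingIndex + 1).join(" "),
    });
  }
  return cues;
}

// Join the cue text into one string suitable for an issue comment
function cuesToText(cues) {
  return cues.map((cue) => cue.text).join("\n");
}
```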
&lt;p&gt;Hackathons tend to encourage some pretty scrappy solutions!&lt;/p&gt;
&lt;h4&gt;The results&lt;/h4&gt;
&lt;p&gt;These two issues demonstrate the final result of the tool:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/simonw/action-transcription-demo/issues/3"&gt;Example issue with a VK video transcribed to English using Whisper&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/simonw/action-transcription-demo/issues/4"&gt;Example issue that extracted YouTube auto-generated English captions&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;That first one in particular shows quite how good the Whisper model is at handling Russian text, and translating it to English.&lt;/p&gt;
&lt;h4&gt;Adding issue templates&lt;/h4&gt;
&lt;p&gt;I added one last enhancement to the project after recording the demo video for the judges embedded above.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://docs.github.com/en/communities/using-templates-to-encourage-useful-issues-and-pull-requests/configuring-issue-templates-for-your-repository"&gt;Issue templates&lt;/a&gt; are a new GitHub feature that let you define a form that users must fill out when they create a new issue.&lt;/p&gt;
&lt;p&gt;Frustratingly, these only work with public repositories. I had built my hack in a private repo at first, so I was only able to explore using issue templates once I had made it public.&lt;/p&gt;
&lt;p&gt;I created &lt;a href="https://github.com/simonw/action-transcription/tree/7d900b209c6c465df35a27bb812d03754677cb78/.github/ISSUE_TEMPLATE"&gt;two issue templates&lt;/a&gt; - one for caption tasks and one for whisper tasks.&lt;/p&gt;
&lt;p&gt;Now when a user goes to open a new issue they get to choose one of the two templates and fill in the URL as part of a form! Here's a GIF demo showing that flow in action:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2022/action-transcription-demo.gif" alt="Animated demo. Click Issues, then New Issue, then select Get Started on the Capture captions menu option. Paste in a URL and click Submit new issue." style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;h4&gt;Template repositories&lt;/h4&gt;
&lt;p&gt;One last trick. I want users to be able to run this system themselves, on their own GitHub account.&lt;/p&gt;
&lt;p&gt;I made &lt;a href="https://github.com/simonw/action-transcription"&gt;simonw/action-transcription&lt;/a&gt; a &lt;a href="https://docs.github.com/en/repositories/creating-and-managing-repositories/creating-a-template-repository"&gt;template repository&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;This means that any user can click a green button to get their own copy of the repository - and when they do, they'll get their own fully configured copy of the GitHub Actions workflows too.&lt;/p&gt;
&lt;p&gt;If they want to use Whisper they'll need to get an API key from &lt;a href="https://replicate.com/"&gt;Replicate.com&lt;/a&gt; and add it to their repository's secrets - but regular caption extraction will work fine without that.&lt;/p&gt;
&lt;p&gt;I've used this technique before - I wrote about it here:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://simonwillison.net/2022/Mar/14/shot-scraper-template/"&gt;Instantly create a GitHub repository to take screenshots of a web page&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://simonwillison.net/2021/Aug/28/dynamic-github-repository-templates/"&gt;Dynamic content for GitHub repository templates using cookiecutter and GitHub Actions&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;GitHub Actions as a platform&lt;/h4&gt;
&lt;p&gt;I'm pleased with how this project turned out. But I'm mainly excited about the underlying pattern. I think building tools using GitHub Actions that people can clone to their own accounts is a really promising way of developing sophisticated automated software that people can then run independently, entirely through the GitHub web interface.&lt;/p&gt;
&lt;p&gt;I'm excited to see more tools adopt a similar pattern.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/hackathons"&gt;hackathons&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/bellingcat"&gt;bellingcat&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/github-actions"&gt;github-actions&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/openai"&gt;openai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/whisper"&gt;whisper&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/replicate"&gt;replicate&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/github-issues"&gt;github-issues&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/speech-to-text"&gt;speech-to-text&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="projects"/><category term="hackathons"/><category term="bellingcat"/><category term="github-actions"/><category term="openai"/><category term="whisper"/><category term="replicate"/><category term="github-issues"/><category term="speech-to-text"/></entry><entry><title>upptime</title><link href="https://simonwillison.net/2022/May/26/upptime/#atom-tag" rel="alternate"/><published>2022-05-26T03:53:35+00:00</published><updated>2022-05-26T03:53:35+00:00</updated><id>https://simonwillison.net/2022/May/26/upptime/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/upptime/upptime"&gt;upptime&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
“Open-source uptime monitor and status page, powered entirely by GitHub Actions, Issues, and Pages.” This is a very creative (ab)use of GitHub Actions: it runs a scheduled action to check the availability of sites that you specify, records the results in a YAML file (with the commit history tracking them over time) and can automatically open a GitHub issue for you if it detects a new incident.

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://twitter.com/ray_voelker/status/1528820787205099520"&gt;Ray Voelker&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/github-actions"&gt;github-actions&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/github-issues"&gt;github-issues&lt;/a&gt;&lt;/p&gt;



</summary><category term="github-actions"/><category term="github-issues"/></entry><entry><title>Automatically opening issues when tracked file content changes</title><link href="https://simonwillison.net/2022/Apr/28/issue-on-changes/#atom-tag" rel="alternate"/><published>2022-04-28T17:18:14+00:00</published><updated>2022-04-28T17:18:14+00:00</updated><id>https://simonwillison.net/2022/Apr/28/issue-on-changes/#atom-tag</id><summary type="html">
    &lt;p&gt;I figured out a GitHub Actions pattern to keep track of a file published somewhere on the internet and automatically open a new repository issue any time the contents of that file changes.&lt;/p&gt;
&lt;h4&gt;Extracting GZipMiddleware from Starlette&lt;/h4&gt;
&lt;p&gt;Here's why I needed to solve this problem.&lt;/p&gt;
&lt;p&gt;I want to add gzip support to my &lt;a href="https://datasette.io/"&gt;Datasette&lt;/a&gt; open source project. Datasette builds on the Python &lt;a href="https://asgi.readthedocs.io/"&gt;ASGI&lt;/a&gt; standard, and &lt;a href="https://www.starlette.io/"&gt;Starlette&lt;/a&gt; provides an extremely well tested, robust &lt;a href="https://www.starlette.io/middleware/#gzipmiddleware"&gt;GZipMiddleware class&lt;/a&gt; that adds gzip support to any ASGI application. As with everything else in Starlette, it's &lt;em&gt;really&lt;/em&gt; good code.&lt;/p&gt;
&lt;p&gt;The problem is, I don't want to add the whole of Starlette as a dependency. I'm trying to keep Datasette's core as small as possible, so I'm very careful about new dependencies. Starlette itself is actually very light (and only has a tiny number of dependencies of its own) but I still don't want the whole thing just for that one class.&lt;/p&gt;
&lt;p&gt;So I decided to extract the &lt;code&gt;GZipMiddleware&lt;/code&gt; class into a separate Python package, under the same BSD license as Starlette itself.&lt;/p&gt;
&lt;p&gt;The result is my new &lt;a href="https://pypi.org/project/asgi-gzip/"&gt;asgi-gzip&lt;/a&gt; package, now available on PyPI.&lt;/p&gt;
&lt;h4&gt;What if Starlette fixes a bug?&lt;/h4&gt;
&lt;p&gt;The problem with extracting code like this is that Starlette is a very effectively maintained package. What if they make improvements or fix bugs in the &lt;code&gt;GZipMiddleware&lt;/code&gt; class? How can I make sure to apply those same fixes to my extracted copy?&lt;/p&gt;
&lt;p&gt;As I thought about this challenge, I realized I had most of the solution already.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;a href="https://simonwillison.net/2020/Oct/9/git-scraping/"&gt;Git scraping&lt;/a&gt;&lt;/strong&gt; is the name I've given to the trick of running a periodic scraper that writes to a git repository in order to track changes to data over time.&lt;/p&gt;
&lt;p&gt;It may seem redundant to do this against a file that already &lt;a href="https://github.com/encode/starlette/commits/master/starlette/middleware/gzip.py"&gt;lives in version control&lt;/a&gt; elsewhere - but in addition to tracking changes, Git scraping can offer a cheap and easy way to add automation that triggers when a change is detected.&lt;/p&gt;
&lt;p&gt;I need an actionable alert any time the Starlette code changes so I can review the change and apply a fix to my own library, if necessary.&lt;/p&gt;
&lt;p&gt;Since I already run all of my projects out of GitHub issues, automatically opening an issue against the &lt;a href="https://github.com/simonw/asgi-gzip"&gt;asgi-gzip repository&lt;/a&gt; would be ideal.&lt;/p&gt;
&lt;p&gt;My &lt;a href="https://github.com/simonw/asgi-gzip/blob/0.1/.github/workflows/track.yml"&gt;track.yml workflow&lt;/a&gt; does exactly that: it implements the Git scraping pattern against the &lt;a href="https://github.com/encode/starlette/blob/master/starlette/middleware/gzip.py"&gt;gzip.py module&lt;/a&gt; in Starlette, and files an issue any time it detects changes to that file.&lt;/p&gt;
&lt;p&gt;Starlette haven't made any changes to that file since I started tracking it, so I created &lt;a href="https://github.com/simonw/issue-when-changed"&gt;a test repo&lt;/a&gt; to try this out.&lt;/p&gt;
&lt;p&gt;Here's &lt;a href="https://github.com/simonw/issue-when-changed/issues/3"&gt;one of the example issues&lt;/a&gt;. I decided to include the visual diff in the issue description and have a link to it from the underlying commit as well.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2022/issue-when-changed.jpg" alt="Screenshot of an open issue page. The issue is titled &amp;quot;gzip.py was updated&amp;quot; and contains a visual diff showing the change to a file. A commit that references the issue is listed too." style="max-width:100%;" /&gt;&lt;/p&gt;
&lt;h4&gt;How it works&lt;/h4&gt;
&lt;p&gt;The implementation is contained entirely in this &lt;a href="https://github.com/simonw/asgi-gzip/blob/0.1/.github/workflows/track.yml"&gt;track.yml workflow&lt;/a&gt;. I designed it as a single file so that it's easy to copy, paste and adapt for other projects.&lt;/p&gt;
&lt;p&gt;It uses &lt;a href="https://github.com/actions/github-script"&gt;actions/github-script&lt;/a&gt;, which makes it easy to do things like file new issues using JavaScript.&lt;/p&gt;
&lt;p&gt;Here's a heavily annotated copy:&lt;/p&gt;
&lt;div class="highlight highlight-source-yaml"&gt;&lt;pre&gt;&lt;span class="pl-ent"&gt;name&lt;/span&gt;: &lt;span class="pl-s"&gt;Track the Starlette version of this&lt;/span&gt;

&lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#&lt;/span&gt; Run on repo pushes, and if a user clicks the "run this action" button,&lt;/span&gt;
&lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#&lt;/span&gt; and on a schedule at 5:21am UTC every day&lt;/span&gt;
&lt;span class="pl-ent"&gt;on&lt;/span&gt;:
  &lt;span class="pl-ent"&gt;push&lt;/span&gt;:
  &lt;span class="pl-ent"&gt;workflow_dispatch&lt;/span&gt;:
  &lt;span class="pl-ent"&gt;schedule&lt;/span&gt;:
  - &lt;span class="pl-ent"&gt;cron&lt;/span&gt;:  &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;21 5 * * *&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;

&lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#&lt;/span&gt; Without this block I got this error when the action ran:&lt;/span&gt;
&lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#&lt;/span&gt; HttpError: Resource not accessible by integration&lt;/span&gt;
&lt;span class="pl-ent"&gt;permissions&lt;/span&gt;:
  &lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#&lt;/span&gt; Allow the action to create issues&lt;/span&gt;
  &lt;span class="pl-ent"&gt;issues&lt;/span&gt;: &lt;span class="pl-s"&gt;write&lt;/span&gt;
  &lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#&lt;/span&gt; Allow the action to commit back to the repository&lt;/span&gt;
  &lt;span class="pl-ent"&gt;contents&lt;/span&gt;: &lt;span class="pl-s"&gt;write&lt;/span&gt;

&lt;span class="pl-ent"&gt;jobs&lt;/span&gt;:
  &lt;span class="pl-ent"&gt;check&lt;/span&gt;:
    &lt;span class="pl-ent"&gt;runs-on&lt;/span&gt;: &lt;span class="pl-s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="pl-ent"&gt;steps&lt;/span&gt;:
    - &lt;span class="pl-ent"&gt;uses&lt;/span&gt;: &lt;span class="pl-s"&gt;actions/checkout@v2&lt;/span&gt;
    - &lt;span class="pl-ent"&gt;uses&lt;/span&gt;: &lt;span class="pl-s"&gt;actions/github-script@v6&lt;/span&gt;
      &lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#&lt;/span&gt; Using env: here to demonstrate how an action like this can&lt;/span&gt;
      &lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#&lt;/span&gt; be adjusted to take dynamic inputs&lt;/span&gt;
      &lt;span class="pl-ent"&gt;env&lt;/span&gt;:
        &lt;span class="pl-ent"&gt;URL&lt;/span&gt;: &lt;span class="pl-s"&gt;https://raw.githubusercontent.com/encode/starlette/master/starlette/middleware/gzip.py&lt;/span&gt;
        &lt;span class="pl-ent"&gt;FILE_NAME&lt;/span&gt;: &lt;span class="pl-s"&gt;tracking/gzip.py&lt;/span&gt;
      &lt;span class="pl-ent"&gt;with&lt;/span&gt;:
        &lt;span class="pl-ent"&gt;script&lt;/span&gt;: &lt;span class="pl-s"&gt;|&lt;/span&gt;
&lt;span class="pl-s"&gt;          const { URL, FILE_NAME } = process.env;&lt;/span&gt;
&lt;span class="pl-s"&gt;          // promisify pattern for getting an await version of child_process.exec&lt;/span&gt;
&lt;span class="pl-s"&gt;          const util = require("util");&lt;/span&gt;
&lt;span class="pl-s"&gt;          // Used exec_ here because 'exec' variable name is already used:&lt;/span&gt;
&lt;span class="pl-s"&gt;          const exec_ = util.promisify(require("child_process").exec);&lt;/span&gt;
&lt;span class="pl-s"&gt;          // Use curl to download the file&lt;/span&gt;
&lt;span class="pl-s"&gt;          await exec_(`curl -o ${FILE_NAME} ${URL}`);&lt;/span&gt;
&lt;span class="pl-s"&gt;          // Use 'git diff' to detect if the file has changed since last time&lt;/span&gt;
&lt;span class="pl-s"&gt;          const { stdout } = await exec_(`git diff ${FILE_NAME}`);&lt;/span&gt;
&lt;span class="pl-s"&gt;          if (stdout) {&lt;/span&gt;
&lt;span class="pl-s"&gt;            // There was a diff to that file&lt;/span&gt;
&lt;span class="pl-s"&gt;            const title = `${FILE_NAME} was updated`;&lt;/span&gt;
&lt;span class="pl-s"&gt;            const body =&lt;/span&gt;
&lt;span class="pl-s"&gt;              `${URL} changed:` +&lt;/span&gt;
&lt;span class="pl-s"&gt;              "\n\n```diff\n" +&lt;/span&gt;
&lt;span class="pl-s"&gt;              stdout +&lt;/span&gt;
&lt;span class="pl-s"&gt;              "\n```\n\n" +&lt;/span&gt;
&lt;span class="pl-s"&gt;              "Close this issue once those changes have been integrated here";&lt;/span&gt;
&lt;span class="pl-s"&gt;            const issue = await github.rest.issues.create({&lt;/span&gt;
&lt;span class="pl-s"&gt;              owner: context.repo.owner,&lt;/span&gt;
&lt;span class="pl-s"&gt;              repo: context.repo.repo,&lt;/span&gt;
&lt;span class="pl-s"&gt;              title: title,&lt;/span&gt;
&lt;span class="pl-s"&gt;              body: body,&lt;/span&gt;
&lt;span class="pl-s"&gt;            });&lt;/span&gt;
&lt;span class="pl-s"&gt;            const issueNumber = issue.data.number;&lt;/span&gt;
&lt;span class="pl-s"&gt;            // Now commit and reference that issue number, so the commit shows up&lt;/span&gt;
&lt;span class="pl-s"&gt;            // listed at the bottom of the issue page&lt;/span&gt;
&lt;span class="pl-s"&gt;            const commitMessage = `${FILE_NAME} updated, refs #${issueNumber}`;&lt;/span&gt;
&lt;span class="pl-s"&gt;            // https://til.simonwillison.net/github-actions/commit-if-file-changed&lt;/span&gt;
&lt;span class="pl-s"&gt;            await exec_(`git config user.name "Automated"`);&lt;/span&gt;
&lt;span class="pl-s"&gt;            await exec_(`git config user.email "actions@users.noreply.github.com"`);&lt;/span&gt;
&lt;span class="pl-s"&gt;            await exec_(`git add -A`);&lt;/span&gt;
&lt;span class="pl-s"&gt;            await exec_(`git commit -m "${commitMessage}" || exit 0`);&lt;/span&gt;
&lt;span class="pl-s"&gt;            await exec_(`git pull --rebase`);&lt;/span&gt;
&lt;span class="pl-s"&gt;            await exec_(`git push`);&lt;/span&gt;
&lt;span class="pl-s"&gt;          }&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;In the &lt;a href="https://github.com/simonw/asgi-gzip"&gt;asgi-gzip&lt;/a&gt; repository I keep the fetched &lt;code&gt;gzip.py&lt;/code&gt; file in a &lt;code&gt;tracking/&lt;/code&gt; directory. This directory isn't included in the Python package that gets uploaded to PyPI - it's there only so that my code can track changes to it over time.&lt;/p&gt;
&lt;h4&gt;More interesting applications&lt;/h4&gt;
&lt;p&gt;I built this to solve my "tell me when Starlette update their &lt;code&gt;gzip.py&lt;/code&gt; file" problem, but clearly this pattern has much more interesting uses.&lt;/p&gt;
&lt;p&gt;You could point this at any web page to get a new GitHub issue opened when that page content changes. Subscribe to notifications for that repository and you get a robust, shared mechanism for alerts - plus an issue system where you can post additional comments and close the issue once someone has reviewed the change.&lt;/p&gt;
&lt;p&gt;There's a lot of potential here for solving all kinds of interesting problems. And it doesn't cost anything either: GitHub Actions (somehow) remains completely free for public repositories!&lt;/p&gt;
&lt;h4&gt;Update: October 13th 2022&lt;/h4&gt;
&lt;p&gt;Almost six months after writing about this... it triggered for the first time!&lt;/p&gt;
&lt;p&gt;Here's the issue that the script opened: &lt;a href="https://github.com/simonw/asgi-gzip/issues/4"&gt;#4: tracking/gzip.py was updated&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I applied the improvement (Marcelo Trylesinski and Kai Klingenberg updated Starlette's code to avoid gzipping if the response already had a Content-Encoding header) and released &lt;a href="https://github.com/simonw/asgi-gzip/releases/tag/0.2"&gt;version 0.2&lt;/a&gt; of the package.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/github"&gt;github&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/gzip"&gt;gzip&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/asgi"&gt;asgi&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/github-actions"&gt;github-actions&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/git-scraping"&gt;git-scraping&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/github-issues"&gt;github-issues&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="github"/><category term="gzip"/><category term="projects"/><category term="python"/><category term="datasette"/><category term="asgi"/><category term="github-actions"/><category term="git-scraping"/><category term="github-issues"/></entry><entry><title>How I build a feature</title><link href="https://simonwillison.net/2022/Jan/12/how-i-build-a-feature/#atom-tag" rel="alternate"/><published>2022-01-12T18:10:17+00:00</published><updated>2022-01-12T18:10:17+00:00</updated><id>https://simonwillison.net/2022/Jan/12/how-i-build-a-feature/#atom-tag</id><summary type="html">
    &lt;p&gt;I'm maintaining &lt;a href="https://github.com/simonw/simonw/blob/main/releases.md"&gt;a lot of different projects&lt;/a&gt; at the moment. I thought it would be useful to describe the process I use for adding a new feature to one of them, using the new &lt;a href="https://sqlite-utils.datasette.io/en/stable/cli.html#cli-create-database"&gt;sqlite-utils create-database&lt;/a&gt; command as an example.&lt;/p&gt;
&lt;p&gt;I like each feature to be represented by what I consider to be the &lt;strong&gt;perfect commit&lt;/strong&gt; - one that bundles together the implementation, the tests, the documentation and a link to an external issue thread.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Update 29th October 2022:&lt;/strong&gt; I wrote &lt;a href="https://simonwillison.net/2022/Oct/29/the-perfect-commit/"&gt;more about the perfect commit&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;sqlite-utils create-database&lt;/code&gt; command is very simple: it creates a new, empty SQLite database file. You use it like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;% sqlite-utils create-database empty.db
&lt;/code&gt;&lt;/pre&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2022/Jan/12/how-i-build-a-feature/#everything-starts-with-an-issue"&gt;Everything starts with an issue&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2022/Jan/12/how-i-build-a-feature/#development-environment"&gt;Development environment&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2022/Jan/12/how-i-build-a-feature/#automated-tests"&gt;Automated tests&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2022/Jan/12/how-i-build-a-feature/#implementing-the-feature"&gt;Implementing the feature&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2022/Jan/12/how-i-build-a-feature/#code-formatting-with-black"&gt;Code formatting with Black&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2022/Jan/12/how-i-build-a-feature/#linting"&gt;Linting&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2022/Jan/12/how-i-build-a-feature/#documentation"&gt;Documentation&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2022/Jan/12/how-i-build-a-feature/#committing-the-change"&gt;Committing the change&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2022/Jan/12/how-i-build-a-feature/#branches-and-pull-requests"&gt;Branches and pull requests&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2022/Jan/12/how-i-build-a-feature/#release-notes-and-a-release"&gt;Release notes, and a release&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2022/Jan/12/how-i-build-a-feature/#a-live-demo"&gt;A live demo&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2022/Jan/12/how-i-build-a-feature/#tell-the-world-about-it"&gt;Tell the world about it&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2022/Jan/12/how-i-build-a-feature/#more-examples-of-this-pattern"&gt;More examples of this pattern&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h4 id="everything-starts-with-an-issue"&gt;Everything starts with an issue&lt;/h4&gt;
&lt;p&gt;Every piece of work I do has an associated issue. This acts as ongoing work-in-progress notes and lets me record decisions, reference any research, drop in code snippets and sometimes even add screenshots and video - stuff that is really helpful but doesn't necessarily fit in code comments or commit messages.&lt;/p&gt;
&lt;p&gt;Even if it's a tiny improvement that's only a few lines of code, I'll still open an issue for it - sometimes just a few minutes before closing it again as complete.&lt;/p&gt;
&lt;p&gt;Any commits that I create that relate to an issue reference the issue number in their commit message. GitHub does a great job of automatically linking these together, bidirectionally so I can navigate from the commit to the issue or from the issue to the commit.&lt;/p&gt;
&lt;p&gt;Having an issue also gives me something I can link to from my release notes.&lt;/p&gt;
&lt;p&gt;In the case of the &lt;code&gt;create-database&lt;/code&gt; command, I opened &lt;a href="https://github.com/simonw/sqlite-utils/issues/348"&gt;this issue&lt;/a&gt; in November when I had the idea for the feature.&lt;/p&gt;
&lt;p&gt;I didn't do the work until over a month later - but because I had designed the feature in the issue comments I could get started on the implementation really quickly.&lt;/p&gt;
&lt;h4 id="development-environment"&gt;Development environment&lt;/h4&gt;
&lt;p&gt;Being able to quickly spin up a development environment for a project is crucial. All of my projects have a section in the README or the documentation describing how to do this - here's &lt;a href="https://sqlite-utils.datasette.io/en/stable/contributing.html"&gt;that section for sqlite-utils&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;On my own laptop each project gets a directory, and I use &lt;code&gt;pipenv shell&lt;/code&gt; in that directory to activate a directory-specific virtual environment, then &lt;code&gt;pip install -e '.[test]'&lt;/code&gt; to install the dependencies and test dependencies.&lt;/p&gt;
&lt;h4 id="automated-tests"&gt;Automated tests&lt;/h4&gt;
&lt;p&gt;All of my features are accompanied by automated tests. This gives me the confidence to boldly make changes to the software in the future without fear of breaking any existing features.&lt;/p&gt;
&lt;p&gt;This means that writing tests needs to be as quick and easy as possible - the less friction here the better.&lt;/p&gt;
&lt;p&gt;The best way to make writing tests easy is to have a great testing framework in place from the very beginning of the project. My cookiecutter templates (&lt;a href="https://github.com/simonw/python-lib"&gt;python-lib&lt;/a&gt;, &lt;a href="https://github.com/simonw/datasette-plugin"&gt;datasette-plugin&lt;/a&gt; and &lt;a href="https://github.com/simonw/click-app"&gt;click-app&lt;/a&gt;) all configure &lt;a href="https://docs.pytest.org/"&gt;pytest&lt;/a&gt; and add a &lt;code&gt;tests/&lt;/code&gt; folder with a single passing test, to give me something to start adding tests to.&lt;/p&gt;
&lt;p&gt;I can't say enough good things about pytest. Before I adopted it, writing tests was a chore. Now it's an activity I genuinely look forward to!&lt;/p&gt;
&lt;p&gt;I'm not a religious adherent to writing the tests first - see &lt;a href="https://simonwillison.net/2020/Feb/11/cheating-at-unit-tests-pytest-black/"&gt;How to cheat at unit tests with pytest and Black&lt;/a&gt; for more thoughts on that - but I'll write the test first if it's pragmatic to do so.&lt;/p&gt;
&lt;p&gt;In the case of &lt;code&gt;create-database&lt;/code&gt;, writing the test first felt like the right thing to do. Here's the test I started with:&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-k"&gt;def&lt;/span&gt; &lt;span class="pl-en"&gt;test_create_database&lt;/span&gt;(&lt;span class="pl-s1"&gt;tmpdir&lt;/span&gt;):
    &lt;span class="pl-s1"&gt;db_path&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;tmpdir&lt;/span&gt; &lt;span class="pl-c1"&gt;/&lt;/span&gt; &lt;span class="pl-s"&gt;"test.db"&lt;/span&gt;
    &lt;span class="pl-k"&gt;assert&lt;/span&gt; &lt;span class="pl-c1"&gt;not&lt;/span&gt; &lt;span class="pl-s1"&gt;db_path&lt;/span&gt;.&lt;span class="pl-en"&gt;exists&lt;/span&gt;()
    &lt;span class="pl-s1"&gt;result&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-v"&gt;CliRunner&lt;/span&gt;().&lt;span class="pl-en"&gt;invoke&lt;/span&gt;(
        &lt;span class="pl-s1"&gt;cli&lt;/span&gt;.&lt;span class="pl-s1"&gt;cli&lt;/span&gt;, [&lt;span class="pl-s"&gt;"create-database"&lt;/span&gt;, &lt;span class="pl-en"&gt;str&lt;/span&gt;(&lt;span class="pl-s1"&gt;db_path&lt;/span&gt;)]
    )
    &lt;span class="pl-k"&gt;assert&lt;/span&gt; &lt;span class="pl-s1"&gt;result&lt;/span&gt;.&lt;span class="pl-s1"&gt;exit_code&lt;/span&gt; &lt;span class="pl-c1"&gt;==&lt;/span&gt; &lt;span class="pl-c1"&gt;0&lt;/span&gt;
    &lt;span class="pl-k"&gt;assert&lt;/span&gt; &lt;span class="pl-s1"&gt;db_path&lt;/span&gt;.&lt;span class="pl-en"&gt;exists&lt;/span&gt;()&lt;/pre&gt;
&lt;p&gt;This test uses the &lt;a href="https://docs.pytest.org/en/6.2.x/tmpdir.html#the-tmpdir-fixture"&gt;tmpdir pytest fixture&lt;/a&gt; to provide a temporary directory that will be automatically cleaned up by pytest after the test run finishes.&lt;/p&gt;
&lt;p&gt;It checks that the &lt;code&gt;test.db&lt;/code&gt; file doesn't exist yet, then uses the Click framework's &lt;a href="https://click.palletsprojects.com/en/8.0.x/testing/"&gt;CliRunner utility&lt;/a&gt; to execute the create-database command. Then it checks that the command didn't throw an error and that the file has been created.&lt;/p&gt;
&lt;p&gt;Then I run the test, and watch it fail - because I haven't built the feature yet!&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;% pytest -k test_create_database

============ test session starts ============
platform darwin -- Python 3.8.2, pytest-6.2.4, py-1.10.0, pluggy-0.13.1
rootdir: /Users/simon/Dropbox/Development/sqlite-utils
plugins: cov-2.12.1, hypothesis-6.14.5
collected 808 items / 807 deselected / 1 selected                           

tests/test_cli.py F                                                   [100%]

================= FAILURES ==================
___________ test_create_database ____________

tmpdir = local('/private/var/folders/wr/hn3206rs1yzgq3r49bz8nvnh0000gn/T/pytest-of-simon/pytest-659/test_create_database0')

    def test_create_database(tmpdir):
        db_path = tmpdir / "test.db"
        assert not db_path.exists()
        result = CliRunner().invoke(
            cli.cli, ["create-database", str(db_path)]
        )
&amp;gt;       assert result.exit_code == 0
E       assert 1 == 0
E        +  where 1 = &amp;lt;Result SystemExit(1)&amp;gt;.exit_code

tests/test_cli.py:2097: AssertionError
========== short test summary info ==========
FAILED tests/test_cli.py::test_create_database - assert 1 == 0
===== 1 failed, 807 deselected in 0.99s ====
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The &lt;code&gt;-k&lt;/code&gt; option lets me run any tests that match the search string, rather than running the full test suite. I use this all the time.&lt;/p&gt;
&lt;p&gt;Other pytest features I often use:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;pytest -x&lt;/code&gt;: runs the entire test suite but quits at the first test that fails&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;pytest --lf&lt;/code&gt;: re-runs any tests that failed during the last test run&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;pytest --pdb -x&lt;/code&gt;: opens the Python debugger at the first failed test (omit the &lt;code&gt;-x&lt;/code&gt; to open it at every failed test). This is the main way I interact with the Python debugger. I often use this to help write the tests, since I can add &lt;code&gt;assert False&lt;/code&gt; and get a shell inside the test to interact with various objects and figure out how to best run assertions against them.&lt;/li&gt;
&lt;/ul&gt;
&lt;h4 id="implementing-the-feature"&gt;Implementing the feature&lt;/h4&gt;
&lt;p&gt;Test in place, it's time to implement the command. I added this code to my existing &lt;a href="https://github.com/simonw/sqlite-utils/blob/3.20/sqlite_utils/cli.py"&gt;cli.py module&lt;/a&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-en"&gt;@&lt;span class="pl-s1"&gt;cli&lt;/span&gt;.&lt;span class="pl-en"&gt;command&lt;/span&gt;(&lt;span class="pl-s1"&gt;name&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-s"&gt;"create-database"&lt;/span&gt;)&lt;/span&gt;
&lt;span class="pl-en"&gt;@&lt;span class="pl-s1"&gt;click&lt;/span&gt;.&lt;span class="pl-en"&gt;argument&lt;/span&gt;(&lt;/span&gt;
&lt;span class="pl-en"&gt;    &lt;span class="pl-s"&gt;"path"&lt;/span&gt;,&lt;/span&gt;
&lt;span class="pl-en"&gt;    &lt;span class="pl-s1"&gt;type&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-s1"&gt;click&lt;/span&gt;.&lt;span class="pl-v"&gt;Path&lt;/span&gt;(&lt;span class="pl-s1"&gt;file_okay&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-c1"&gt;True&lt;/span&gt;, &lt;span class="pl-s1"&gt;dir_okay&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-c1"&gt;False&lt;/span&gt;, &lt;span class="pl-s1"&gt;allow_dash&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-c1"&gt;False&lt;/span&gt;),&lt;/span&gt;
&lt;span class="pl-en"&gt;    &lt;span class="pl-s1"&gt;required&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-c1"&gt;True&lt;/span&gt;,&lt;/span&gt;
&lt;span class="pl-en"&gt;)&lt;/span&gt;
&lt;span class="pl-k"&gt;def&lt;/span&gt; &lt;span class="pl-en"&gt;create_database&lt;/span&gt;(&lt;span class="pl-s1"&gt;path&lt;/span&gt;):
    &lt;span class="pl-s"&gt;"Create a new empty database file."&lt;/span&gt;
    &lt;span class="pl-s1"&gt;db&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;sqlite_utils&lt;/span&gt;.&lt;span class="pl-v"&gt;Database&lt;/span&gt;(&lt;span class="pl-s1"&gt;path&lt;/span&gt;)
    &lt;span class="pl-s1"&gt;db&lt;/span&gt;.&lt;span class="pl-en"&gt;vacuum&lt;/span&gt;()&lt;/pre&gt;
&lt;p&gt;(I happen to know that the quickest way to create an empty SQLite database file is to run &lt;code&gt;VACUUM&lt;/code&gt; against it.)&lt;/p&gt;
&lt;p&gt;The test now passes!&lt;/p&gt;
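&lt;p&gt;The &lt;code&gt;VACUUM&lt;/code&gt; trick can be tried on its own with Python's standard library &lt;code&gt;sqlite3&lt;/code&gt; module. This is a minimal sketch, not the sqlite-utils implementation: the &lt;code&gt;create_empty_database()&lt;/code&gt; helper and the temporary file location are made up for illustration.&lt;/p&gt;

```python
import os
import sqlite3
import tempfile


def create_empty_database(path):
    """Create a valid, empty SQLite database file at path (hypothetical helper)."""
    conn = sqlite3.connect(path)
    try:
        # VACUUM rebuilds the database file, which makes SQLite write out
        # the file header even though no tables have been created yet.
        conn.execute("VACUUM")
    finally:
        conn.close()


# Assumption: a throwaway temporary directory stands in for a real project path.
db_path = os.path.join(tempfile.mkdtemp(), "empty.db")
create_empty_database(db_path)

# Read the first 16 bytes, where SQLite stores its format identifier.
with open(db_path, "rb") as fp:
    header = fp.read(16)
print(header)
```

&lt;p&gt;Reading back the first few bytes is a quick way to confirm that the result is a real SQLite file rather than a zero-length placeholder.&lt;/p&gt;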
&lt;p&gt;I iterated on this implementation a little bit more, to add the &lt;code&gt;--enable-wal&lt;/code&gt; option I had designed &lt;a href="https://github.com/simonw/sqlite-utils/issues/348#issuecomment-983120066"&gt;in the issue comments&lt;/a&gt; - and updated the test to match. You can see the final implementation in this commit: &lt;a href="https://github.com/simonw/sqlite-utils/commit/1d64cd2e5b402ff957f9be2d9bb490d313c73989"&gt;1d64cd2e5b402ff957f9be2d9bb490d313c73989&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;If I add a new test and it passes the first time, I’m always suspicious of it. I’ll deliberately break the test (change a 1 to a 2 for example) and run it again to make sure it fails, then change it back again.&lt;/p&gt;
&lt;h4 id="code-formatting-with-black"&gt;Code formatting with Black&lt;/h4&gt;
&lt;p&gt;&lt;a href="https://github.com/psf/black"&gt;Black&lt;/a&gt; has increased my productivity as a Python developer by a material amount. I used to spend a whole bunch of brain cycles agonizing over how to indent my code, where to break up long function calls and suchlike. Thanks to Black I never think about this at all - I instinctively run &lt;code&gt;black .&lt;/code&gt; in the root of my project and accept whatever style decisions it applies for me.&lt;/p&gt;
&lt;h4 id="linting"&gt;Linting&lt;/h4&gt;
&lt;p&gt;I have a few linters set up to run on every commit. I can run these locally too - how to do that is &lt;a href="https://sqlite-utils.datasette.io/en/stable/contributing.html#linting-and-formatting"&gt;documented here&lt;/a&gt; - but I'm often a bit lazy and leave them to &lt;a href="https://github.com/simonw/sqlite-utils/blob/main/.github/workflows/test.yml"&gt;run in CI&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;In this case one of my linters failed! I accidentally called the new command function &lt;code&gt;create_table()&lt;/code&gt; when it should have been called &lt;code&gt;create_database()&lt;/code&gt;. The code worked fine due to how the &lt;code&gt;cli.command(name=...)&lt;/code&gt; decorator works but &lt;code&gt;mypy&lt;/code&gt; &lt;a href="https://github.com/simonw/sqlite-utils/runs/4754944593?check_suite_focus=true"&gt;complained about&lt;/a&gt; the redefined function name. I fixed that in &lt;a href="https://github.com/simonw/sqlite-utils/commit/2f8879235afc6a06a8ae25ded1b2fe289ad8c3a6#diff-76294b3d4afeb27e74e738daa01c26dd4dc9ccb6f4477451483a2ece1095902e"&gt;a separate commit&lt;/a&gt;.&lt;/p&gt;
&lt;h4 id="documentation"&gt;Documentation&lt;/h4&gt;
&lt;p&gt;My policy these days is that if a feature isn't documented it doesn't exist. Updating documentation isn't much work at all if it already exists, and over time these incremental improvements add up to something really comprehensive.&lt;/p&gt;
&lt;p&gt;For smaller projects I use a single &lt;code&gt;README.md&lt;/code&gt; which gets displayed on both GitHub and PyPI (and the Datasette website too, for example on &lt;a href="https://datasette.io/tools/git-history"&gt;datasette.io/tools/git-history&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;My larger projects, such as &lt;a href="https://docs.datasette.io/"&gt;Datasette&lt;/a&gt; and &lt;a href="https://sqlite-utils.datasette.io/"&gt;sqlite-utils&lt;/a&gt;, use &lt;a href="https://readthedocs.org/"&gt;Read the Docs&lt;/a&gt; and &lt;a href="https://www.sphinx-doc.org/en/master/usage/restructuredtext/basics.html"&gt;reStructuredText&lt;/a&gt; with &lt;a href="https://www.sphinx-doc.org/"&gt;Sphinx&lt;/a&gt; instead.&lt;/p&gt;
&lt;p&gt;I like reStructuredText mainly because it has really good support for internal reference links - something that is missing from Markdown, though it can be enabled using &lt;a href="https://myst-parser.readthedocs.io"&gt;MyST&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;sqlite-utils&lt;/code&gt; uses Sphinx. I have the &lt;a href="https://github.com/executablebooks/sphinx-autobuild"&gt;sphinx-autobuild&lt;/a&gt; extension configured, which means I can run a live reloading server with the documentation like so:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;cd docs
make livehtml
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Any time I'm working on the documentation I have that server running, so I can hit "save" in VS Code and see a preview in my browser a few seconds later.&lt;/p&gt;
&lt;p&gt;For Markdown documentation I use the VS Code preview pane directly.&lt;/p&gt;
&lt;p&gt;The moment the documentation is live online, I like to add a link to it in a comment on the issue thread.&lt;/p&gt;
&lt;h4 id="committing-the-change"&gt;Committing the change&lt;/h4&gt;
&lt;p&gt;I run &lt;code&gt;git diff&lt;/code&gt; a LOT while hacking on code, to make sure I haven’t accidentally changed something unrelated. This also helps spot things like rogue &lt;code&gt;print()&lt;/code&gt; debug statements I may have added.&lt;/p&gt;
&lt;p&gt;Before my final commit, I sometimes even run &lt;code&gt;git diff | grep print&lt;/code&gt; to check for those.&lt;/p&gt;
&lt;p&gt;My goal with the commit is to bundle the test, documentation and implementation. If those are the only files I've changed I do this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;git commit -a -m "sqlite-utils create-database command, closes #348"
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If this completes the work on the issue I use "&lt;code&gt;closes #N&lt;/code&gt;", which causes GitHub to close the issue for me. If it's not yet ready to close I use "&lt;code&gt;refs #N&lt;/code&gt;" instead.&lt;/p&gt;
&lt;p&gt;Sometimes there will be unrelated changes in my working directory. If so, I use &lt;code&gt;git add &amp;lt;files&amp;gt;&lt;/code&gt; and then commit just with &lt;code&gt;git commit -m message&lt;/code&gt;.&lt;/p&gt;
&lt;h4 id="branches-and-pull-requests"&gt;Branches and pull requests&lt;/h4&gt;
&lt;p&gt;&lt;code&gt;create-database&lt;/code&gt; is a good example of a feature that can be implemented in a single commit, with no need to work in a branch.&lt;/p&gt;
&lt;p&gt;For larger features, I'll work in a feature branch:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;git checkout -b my-feature
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I'll make a commit (often just labelled "WIP prototype, refs #N") and then push that to GitHub and open a pull request for it:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;git push -u origin my-feature 
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I ensure the new pull request links back to the issue in its description, then switch my ongoing commentary to comments on the pull request itself.&lt;/p&gt;
&lt;p&gt;I'll sometimes add a task checklist to the opening comment on the pull request, since tasks there get reflected in the GitHub UI anywhere that links to the PR. Then I'll check those off as I complete them.&lt;/p&gt;
&lt;p&gt;An example of a PR I used like this is &lt;a href="https://github.com/simonw/sqlite-utils/pull/361"&gt;#361: --lines and --text and --convert and --import&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I don't like merge commits - I much prefer to keep my &lt;code&gt;main&lt;/code&gt; branch history as linear as possible. I usually merge my PRs through the GitHub web interface using the squash feature, which results in a single, clean commit to main with the combined tests, documentation and implementation. Occasionally I will see value in keeping the individual commits, in which case I will rebase merge them.&lt;/p&gt;
&lt;p&gt;Another goal here is to keep the &lt;code&gt;main&lt;/code&gt; branch releasable at all times. Incomplete work should stay in a branch. This makes turning around and releasing quick bug fixes a lot less stressful!&lt;/p&gt;
&lt;h4 id="release-notes-and-a-release"&gt;Release notes, and a release&lt;/h4&gt;
&lt;p&gt;A feature isn't truly finished until it's been released to &lt;a href="https://pypi.org/"&gt;PyPI&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;All of my projects are configured the same way: they use GitHub releases to trigger a GitHub Actions workflow which publishes the new release to PyPI. The &lt;code&gt;sqlite-utils&lt;/code&gt; workflow for that &lt;a href="https://github.com/simonw/sqlite-utils/blob/main/.github/workflows/publish.yml"&gt;is here in publish.yml&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;My &lt;a href="https://cookiecutter.readthedocs.io/"&gt;cookiecutter&lt;/a&gt; templates for new projects set up this workflow for me. I just need to create a PyPI token for the project and assign it as a repository secret. See the &lt;a href="https://github.com/simonw/python-lib"&gt;python-lib cookiecutter README&lt;/a&gt; for details.&lt;/p&gt;
&lt;p&gt;To push out a new release, I need to increment the version number in &lt;a href="https://github.com/simonw/sqlite-utils/blob/main/setup.py"&gt;setup.py&lt;/a&gt; and write the release notes.&lt;/p&gt;
&lt;p&gt;I use &lt;a href="https://semver.org/"&gt;semantic versioning&lt;/a&gt; - a new feature is a minor version bump, a breaking change is a major version bump (I try very hard to avoid these) and a bug fix or documentation-only update is a patch increment.&lt;/p&gt;
&lt;p&gt;Since &lt;code&gt;create-database&lt;/code&gt; was a new feature, it went out in &lt;a href="https://github.com/simonw/sqlite-utils/releases/3.21"&gt;release 3.21&lt;/a&gt;.&lt;/p&gt;
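&lt;p&gt;The version bump rule described above can be sketched in a few lines of Python. This is illustrative only: &lt;code&gt;bump_version()&lt;/code&gt; and its change labels ("breaking", "feature", "fix", "docs") are hypothetical names rather than part of any release tooling mentioned here, and the sketch always emits a three-part version number.&lt;/p&gt;

```python
def bump_version(version, change):
    """Return the next version string for a given kind of change.

    Hypothetical helper: the change labels are invented for this sketch.
    """
    parts = [int(p) for p in version.split(".")]
    # Pad short versions such as "3.20" out to major.minor.patch.
    parts += [0] * (3 - len(parts))
    major, minor, patch = parts
    if change == "breaking":
        # Breaking change: major bump, reset minor and patch.
        return "{}.0.0".format(major + 1)
    if change == "feature":
        # New feature: minor bump, reset patch.
        return "{}.{}.0".format(major, minor + 1)
    if change in ("fix", "docs"):
        # Bug fix or documentation-only update: patch increment.
        return "{}.{}.{}".format(major, minor, patch + 1)
    raise ValueError("Unknown change type: {}".format(change))


print(bump_version("3.20", "feature"))
```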
&lt;p&gt;My projects that use Sphinx for documentation have &lt;a href="https://github.com/simonw/sqlite-utils/blob/main/docs/changelog.rst"&gt;changelog.rst&lt;/a&gt; files in their repositories. I add the release notes there, linking to the relevant issues and cross-referencing the new documentation. Then I ship a commit that bundles the release notes with the bumped version number, with a commit message that looks like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;git commit -m "Release 3.21

Refs #348, #364, #366, #368, #371, #372, #374, #375, #376, #379"
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Here's &lt;a href="https://github.com/simonw/sqlite-utils/commit/7c637b11805adc3d3970076a7ba6afe8e34b371e"&gt;the commit for release 3.21&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Referencing the issue numbers in the release automatically adds a note to their issue threads indicating the release that they went out in.&lt;/p&gt;
&lt;p&gt;I generate that list of issue numbers by pasting the release notes into an Observable notebook I built for the purpose: &lt;a href="https://observablehq.com/@simonw/extract-issue-numbers-from-pasted-text"&gt;Extract issue numbers from pasted text&lt;/a&gt;. Observable is really great for building this kind of tiny interactive utility.&lt;/p&gt;
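&lt;p&gt;The same idea can be approximated in Python with a regular expression. This is a rough sketch rather than a port of the Observable notebook: &lt;code&gt;extract_issue_refs()&lt;/code&gt; is a hypothetical helper, the sample release notes are invented for the example, and it assumes every &lt;code&gt;#123&lt;/code&gt; style token in the text is an issue reference.&lt;/p&gt;

```python
import re


def extract_issue_refs(release_notes):
    """Build a 'Refs #1, #2' line from issue numbers found in the text.

    Hypothetical helper: treats every '#' followed by digits as an issue.
    """
    # A set removes duplicates; sorting gives a stable, ascending order.
    numbers = {int(n) for n in re.findall(r"#(\d+)", release_notes)}
    return "Refs " + ", ".join("#{}".format(n) for n in sorted(numbers))


# Invented sample text, only loosely shaped like a changelog entry.
notes = """
- New create-database command. (#348)
- A bug fix. (#364)
- Documentation improvements. (#348, #366)
"""
print(extract_issue_refs(notes))
```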
&lt;p&gt;For projects that just have a README I write the release notes in Markdown and paste them directly into the GitHub "new release" form.&lt;/p&gt;
&lt;p&gt;I like to duplicate the release notes to GitHub releases for my Sphinx changelog projects too. This is mainly so the &lt;a href="https://datasette.io/"&gt;datasette.io&lt;/a&gt; website will display the release notes on its homepage, which is populated &lt;a href="https://simonwillison.net/2020/Dec/13/datasette-io/"&gt;at build time&lt;/a&gt; using the GitHub GraphQL API.&lt;/p&gt;
&lt;p&gt;To convert my reStructuredText to Markdown I copy and paste the rendered HTML into this brilliant &lt;a href="https://euangoddard.github.io/clipboard2markdown/"&gt;Paste to Markdown&lt;/a&gt; tool by &lt;a href="https://github.com/euangoddard"&gt;Euan Goddard&lt;/a&gt;.&lt;/p&gt;
&lt;h4 id="a-live-demo"&gt;A live demo&lt;/h4&gt;
&lt;p&gt;When possible, I like to have a live demo that I can link to.&lt;/p&gt;
&lt;p&gt;This is easiest for features in Datasette core. Datasette’s main branch gets &lt;a href="https://github.com/simonw/datasette/blob/0.60a1/.github/workflows/deploy-latest.yml#L51-L73"&gt;deployed automatically&lt;/a&gt; to &lt;a href="https://latest.datasette.io/"&gt;latest.datasette.io&lt;/a&gt; so I can often link to a demo there.&lt;/p&gt;
&lt;p&gt;For Datasette plugins, I’ll deploy a fresh instance with the plugin (e.g. &lt;a href="https://datasette-graphql-demo.datasette.io/"&gt;this one for datasette-graphql&lt;/a&gt;) or (more commonly) add it to my big &lt;a href="https://latest-with-plugins.datasette.io/"&gt;latest-with-plugins.datasette.io&lt;/a&gt; instance - which tries to demonstrate what happens to Datasette if you install dozens of plugins at once (so far it works OK).&lt;/p&gt;
&lt;p&gt;Here’s a demo of the &lt;a href="https://datasette.io/plugins/datasette-copyable"&gt;datasette-copyable plugin&lt;/a&gt; running there:  &lt;a href="https://latest-with-plugins.datasette.io/github/commits.copyable"&gt;https://latest-with-plugins.datasette.io/github/commits.copyable&lt;/a&gt;&lt;/p&gt;
&lt;h4 id="tell-the-world-about-it"&gt;Tell the world about it&lt;/h4&gt;
&lt;p&gt;The last step is to tell the world (beyond the people who meticulously read the release notes) about the new feature.&lt;/p&gt;
&lt;p&gt;Depending on the size of the feature, I might do this with a tweet &lt;a href="https://twitter.com/simonw/status/1455266746701471746"&gt;like this one&lt;/a&gt; - usually with a screenshot and a link to the documentation. I often extend this into a short Twitter thread, which gives me a chance to link to related concepts and demos or add more screenshots.&lt;/p&gt;
&lt;p&gt;For larger or more interesting features I'll blog about them. I may save this for my weekly &lt;a href="https://simonwillison.net/tags/weeknotes/"&gt;weeknotes&lt;/a&gt;, but sometimes for particularly exciting features I'll write up a dedicated blog entry. Some examples include:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://simonwillison.net/2020/Sep/23/sqlite-advanced-alter-table/"&gt;Executing advanced ALTER TABLE operations in SQLite&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://simonwillison.net/2020/Jul/30/fun-binary-data-and-sqlite/"&gt;Fun with binary data and SQLite&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://simonwillison.net/2020/Sep/23/sqlite-utils-extract/"&gt;Refactoring databases with sqlite-utils extract&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://simonwillison.net/2021/Jun/19/sqlite-utils-memory/"&gt;Joining CSV and JSON data with an in-memory SQLite database&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://simonwillison.net/2021/Aug/6/sqlite-utils-convert/"&gt;Apply conversion functions to data in SQLite columns with the sqlite-utils CLI tool&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I may even assemble a full set of &lt;a href="https://simonwillison.net/tags/annotatedreleasenotes/"&gt;annotated release notes&lt;/a&gt; on my blog, where I quote each item from the release in turn and provide some fleshed out examples plus background information on why I built it.&lt;/p&gt;
&lt;p&gt;If it’s a new Datasette (or Datasette-adjacent) feature, I’ll try to remember to write about it in the next edition of the &lt;a href="https://datasette.substack.com/"&gt;Datasette Newsletter&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Finally, if I learned a new trick while building a feature I might extract that into &lt;a href="https://til.simonwillison.net/"&gt;a TIL&lt;/a&gt;. If I do that I'll link to the new TIL from the issue thread.&lt;/p&gt;
&lt;h4 id="more-examples-of-this-pattern"&gt;More examples of this pattern&lt;/h4&gt;
&lt;p&gt;Here are a bunch of examples of commits that implement this pattern, combining the tests, implementation and documentation into a single unit:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;sqlite-utils: &lt;a href="https://github.com/simonw/sqlite-utils/commit/324ebc31308752004fe5f7e4941fc83706c5539c"&gt;adding --limit and --offset to sqlite-utils rows&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;sqlite-utils: &lt;a href="https://github.com/simonw/sqlite-utils/commit/d83b2568131f2b1cc01228419bb08c96d843d65d"&gt;--where and -p options for sqlite-utils convert&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;s3-credentials: &lt;a href="https://github.com/simonw/s3-credentials/commit/905258379817e8b458528e4ccc5e6cc2c8cf4352"&gt;s3-credentials policy command&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;datasette: &lt;a href="https://github.com/simonw/datasette/commit/5cadc244895fc47e0534c6e90df976d34293921e"&gt;db.execute_write_script() and db.execute_write_many()&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;datasette: &lt;a href="https://github.com/simonw/datasette/commit/992496f2611a72bd51e94bfd0b17c1d84e732487"&gt;?_nosuggest=1 parameter for table views&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;datasette-graphql: &lt;a href="https://github.com/simonw/datasette-graphql/commit/2d8c042e93e3429c5b187121d26f8817997073dd"&gt;GraphQL execution limits: time_limit_ms and num_queries_limit&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/git"&gt;git&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/github"&gt;github&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/software-engineering"&gt;software-engineering&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/testing"&gt;testing&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/pytest"&gt;pytest&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/black"&gt;black&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/read-the-docs"&gt;read-the-docs&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/github-issues"&gt;github-issues&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="git"/><category term="github"/><category term="software-engineering"/><category term="testing"/><category term="pytest"/><category term="black"/><category term="read-the-docs"/><category term="github-issues"/></entry><entry><title>Goodbye Zeit Now v1, hello datasette-publish-now - and talking to myself in GitHub issues</title><link href="https://simonwillison.net/2020/Apr/8/weeknotes-zeit-now-v2/#atom-tag" rel="alternate"/><published>2020-04-08T03:32:24+00:00</published><updated>2020-04-08T03:32:24+00:00</updated><id>https://simonwillison.net/2020/Apr/8/weeknotes-zeit-now-v2/#atom-tag</id><summary type="html">
    &lt;p&gt;This week I’ve been mostly dealing with the finally announced shutdown of Zeit Now v1. And having long-winded conversations with myself in GitHub issues.&lt;/p&gt;

&lt;h3&gt;How Zeit Now inspired Datasette&lt;/h3&gt;

&lt;p&gt;I first started experimenting with Zeit’s serverless &lt;a href="https://zeit.co/home"&gt;Now&lt;/a&gt; hosting platform back &lt;a href="https://simonwillison.net/2017/Oct/14/async-python-sanic-now/"&gt;in October 2017&lt;/a&gt;, when I used it to deploy &lt;a href="https://json-head.now.sh/"&gt;json-head.now.sh&lt;/a&gt; - an updated version of an API tool I originally built for Google App Engine &lt;a href="https://simonwillison.net/2008/Jul/29/jsonhead/"&gt;in July 2008&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;I liked Zeit Now, a lot. Instant, inexpensive deploys of any stateless project that could be defined using a Dockerfile? Just type &lt;code&gt;now&lt;/code&gt; to deploy the project in your current directory? Every deployment gets its own permanent URL? Amazing!&lt;/p&gt;

&lt;p&gt;There was just one catch: since Now deployments are ephemeral, applications running on them need to be stateless. If you want a database, you need to involve another (potentially costly) service. It's a limitation shared by other scalable hosting solutions - Heroku, App Engine and so on. How much interesting stuff can you build without a database?&lt;/p&gt;

&lt;p&gt;I was musing about this in the shower one day (that &lt;a href="https://lifehacker.com/science-explains-why-our-best-ideas-come-in-the-shower-5987858"&gt;old cliche&lt;/a&gt; really happened for me) when I had a thought: sure, you can't write to a database... but if your data is read-only, why not bundle the database alongside the application code as part of the Docker image?&lt;/p&gt;
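
&lt;p&gt;A minimal sketch of that idea, using only Python's standard &lt;code&gt;sqlite3&lt;/code&gt; module: the bundled file is opened read-only, so the application can query it but never write to it. The function name &lt;code&gt;open_read_only()&lt;/code&gt; is hypothetical, and this illustrates the pattern rather than Datasette's actual implementation.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import sqlite3


def open_read_only(path):
    # Hypothetical helper: a file: URI with mode=ro asks SQLite to open
    # the database read-only, so any write attempt raises an error.
    return sqlite3.connect("file:{}?mode=ro".format(path), uri=True)&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Any &lt;code&gt;INSERT&lt;/code&gt; or &lt;code&gt;UPDATE&lt;/code&gt; against a connection opened this way fails with &lt;code&gt;sqlite3.OperationalError&lt;/code&gt;, which fits a deployment where the database ships inside an immutable image.&lt;/p&gt;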

&lt;p&gt;Ever since I &lt;a href="https://simonwillison.net/2009/Mar/10/openplatform/"&gt;helped launch the Datablog&lt;/a&gt; at the Guardian back in 2009 I had been interested in finding better ways to publish data journalism datasets than CSV files or Google spreadsheets - so building something that could package and bundle read-only data was of extreme interest to me.&lt;/p&gt;

&lt;p&gt;In November 2017 I released &lt;a href="https://simonwillison.net/2017/Nov/13/datasette/"&gt;the first version&lt;/a&gt; of Datasette. The original idea was very much inspired by Zeit Now.&lt;/p&gt;

&lt;p&gt;I gave &lt;a href="https://www.youtube.com/watch?v=_uwrqB--eM4"&gt;a talk about Datasette&lt;/a&gt; at the Zeit Day conference in San Francisco in April 2018. Suffice to say I was a huge fan!&lt;/p&gt;

&lt;h3&gt;Goodbye, Zeit Now v1&lt;/h3&gt;

&lt;p&gt;In November 2018, Zeit &lt;a href="https://simonwillison.net/2018/Nov/19/smaller-python-docker-images/"&gt;announced Now v2&lt;/a&gt;. And it was, &lt;em&gt;different&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;v2 is an entirely different architecture from v1. Where v1 built on Docker containers, v2 is built on top of serverless functions - AWS Lambda in particular.&lt;/p&gt;

&lt;p&gt;I can see why Zeit did this. Lambda functions can launch from cold &lt;em&gt;way faster&lt;/em&gt; - v1's Docker infrastructure had tough cold-start times. They are much cheaper to run as well - crucial for Zeit given their &lt;a href="https://zeit.co/pricing"&gt;extremely generous pricing plans&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;But it was bad news for my projects. Lambdas are tightly size constrained, which is tough when you're bundling potentially large SQLite database files with your deployments.&lt;/p&gt;

&lt;p&gt;More importantly, in 2018 Amazon were deliberately excluding the Python &lt;code&gt;sqlite3&lt;/code&gt; standard library module from the Python Lambda environment! I guess they hadn't considered people who might want to work with read-only database files.&lt;/p&gt;

&lt;p&gt;So Datasette on Now v2 just wasn't going to work. Zeit kept v1 supported for the time being, but the writing was clearly on the wall.&lt;/p&gt;

&lt;p&gt;In April 2019 &lt;a href="https://cloud.google.com/blog/products/serverless/announcing-cloud-run-the-newest-member-of-our-serverless-compute-stack"&gt;Google announced Cloud Run&lt;/a&gt;, a serverless, scale-to-zero hosting environment based around Docker containers. In many ways it's Google's version of Zeit Now v1 - it has many of the characteristics I loved about v1, albeit with a clunkier developer experience and much more friction in assigning nice URLs to projects. Romain Primet &lt;a href="https://github.com/simonw/datasette/pull/434"&gt;contributed Cloud Run support to Datasette&lt;/a&gt; and it has since become my preferred hosting target for my new projects (see &lt;a href="https://simonwillison.net/2020/Jan/21/github-actions-cloud-run/"&gt;Deploying a data API using GitHub Actions and Cloud Run&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;Last week, Zeit &lt;a href="https://twitter.com/simonw/status/1246300304917680128"&gt;finally announced&lt;/a&gt; the sunset date for v1. From 1st of May new deploys won't be allowed, and on the 7th of August they'll be turning off the old v1 infrastructure and deleting all existing Now v1 deployments.&lt;/p&gt;

&lt;p&gt;I engaged in &lt;a href="https://twitter.com/simonw/status/1246300304917680128"&gt;an extensive Twitter conversation&lt;/a&gt; about this, where I praised Zeit's handling of the shutdown while bemoaning the loss of the v1 product I had loved so much.&lt;/p&gt;

&lt;h3 id="migrating-my-projects"&gt;Migrating my projects&lt;/h3&gt;

&lt;p&gt;My newer projects have been on Cloud Run for quite some time, but I still have a bunch of old projects that I care about and want to keep running past the v1 shutdown.&lt;/p&gt;

&lt;p&gt;The first project I ported was &lt;a href="https://latest.datasette.io/"&gt;latest.datasette.io&lt;/a&gt;, a live demo of Datasette which updates with the latest code any time I push to the Datasette master branch on GitHub.&lt;/p&gt;

&lt;p&gt;Any time I do some kind of ops task like this I've gotten into the habit of meticulously documenting every single step in comments on a GitHub issue. Here's &lt;a href="https://github.com/simonw/datasette/issues/705"&gt;the issue&lt;/a&gt; for porting latest.datasette.io to Cloud Run (and switching from Circle CI to GitHub Actions at the same time).&lt;/p&gt;

&lt;p&gt;My next project was &lt;a href="https://global-power-plants.datasettes.com/global-power-plants/global-power-plants"&gt;global-power-plants-datasette&lt;/a&gt;, a small project which takes a database of global power plants &lt;a href="https://www.wri.org/publication/global-power-plant-database"&gt;published by the World Resources Institute&lt;/a&gt; and publishes it using Datasette. It checks for new updates to &lt;a href="https://github.com/wri/global-power-plant-database"&gt;their repo&lt;/a&gt; once a day. I originally built it as a demo for &lt;a href="https://github.com/simonw/datasette-cluster-map"&gt;datasette-cluster-map&lt;/a&gt;, since it's fun seeing 33,000 power plants on a single map. Here's &lt;a href="https://github.com/simonw/global-power-plants-datasette/issues/1"&gt;that issue&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Having warmed up with these two, my next target was the most significant: porting my &lt;a href="https://www.niche-museums.com/"&gt;Niche Museums&lt;/a&gt; website.&lt;/p&gt;

&lt;p&gt;Niche Museums is the most heavily customized Datasette instance I've run anywhere - it incorporates custom templates, CSS and plugins.&lt;/p&gt;

&lt;p&gt;Here's &lt;a href="https://github.com/simonw/museums/issues/20"&gt;the tracking issue&lt;/a&gt; for porting it to Cloud Run. I ran into a few hurdles with DNS and TLS certificates, and I had to do &lt;a href="https://github.com/simonw/museums/issues/21"&gt;some additional work&lt;/a&gt; to ensure &lt;code&gt;niche-museums.com&lt;/code&gt; redirects to &lt;code&gt;www.niche-museums.com&lt;/code&gt;, but it's now fully migrated.&lt;/p&gt;

&lt;h3 id="hello-zeit-now-v2"&gt;Hello, Zeit Now v2&lt;/h3&gt;

&lt;p&gt;In &lt;a href="https://twitter.com/simonw/status/1246302021608591360"&gt;complaining about&lt;/a&gt; the lack of that essential &lt;code&gt;sqlite3&lt;/code&gt; module I figured it would be responsible to double-check and make sure that was still true.&lt;/p&gt;

&lt;p&gt;It was not! Today Now's Python environment &lt;a href="https://twitter.com/simonw/status/1246600935289184256"&gt;includes sqlite3&lt;/a&gt; after all.&lt;/p&gt;

&lt;p&gt;Datasette's &lt;a href="https://datasette.readthedocs.io/en/0.39/plugins.html#publish-subcommand-publish"&gt;publish_subcommand() plugin hook&lt;/a&gt; lets plugins add new publishing targets to the &lt;code&gt;datasette publish&lt;/code&gt; command (I used it to build &lt;a href="https://github.com/simonw/datasette-publish-fly"&gt;datasette-publish-fly&lt;/a&gt; last month). How hard would it be to build a plugin for Zeit Now v2?&lt;/p&gt;

&lt;p&gt;I fired up a new &lt;a href="https://github.com/simonw/datasette/issues/717"&gt;lengthy talking-to-myself GitHub issue&lt;/a&gt; and started prototyping.&lt;/p&gt;

&lt;p&gt;Now v2 may not support Docker, but it does support the &lt;a href="https://asgi.readthedocs.io/en/latest/"&gt;ASGI Python standard&lt;/a&gt; (the asynchronous alternative to WSGI, shepherded by Andrew Godwin).&lt;/p&gt;

&lt;p&gt;Zeit are keen proponents of the &lt;a href="https://jamstack.org/"&gt;Jamstack&lt;/a&gt; approach, where websites are built using static pre-rendered HTML and JavaScript that calls out to APIs for dynamic data. v2 deployments are expected to consist of static HTML with "serverless functions" - standalone server-side scripts that live in an &lt;code&gt;api/&lt;/code&gt; directory by convention and are compiled into separate lambdas.&lt;/p&gt;

&lt;p&gt;Datasette works just fine without JavaScript, which means it needs to handle all of the URL routes for a site. Essentially I need to build a single function that runs the whole of Datasette, then route all incoming traffic to it.&lt;/p&gt;

&lt;p&gt;It took me a while to figure it out, but it turns out the Now v2 recipe for that is a &lt;code&gt;now.json&lt;/code&gt; file that looks like this:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;{
    "version": 2,
    "builds": [
        {
            "src": "index.py",
            "use": "@now/python"
        }
    ],
    "routes": [
        {
            "src": "(.*)",
            "dest": "index.py"
        }
    ]
}&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Thanks Aaron Boodman for &lt;a href="https://twitter.com/aboodman/status/1246605658067066882"&gt;the tip&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Given the above configuration, Zeit will install any Python dependencies in a &lt;code&gt;requirements.txt&lt;/code&gt; file, then treat an &lt;code&gt;app&lt;/code&gt; variable in the &lt;code&gt;index.py&lt;/code&gt; file as an ASGI application it should route all incoming traffic to. Exactly what I need to deploy Datasette!&lt;/p&gt;
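
&lt;p&gt;For illustration, here is a minimal sketch of what such an &lt;code&gt;index.py&lt;/code&gt; could look like. It is a hypothetical stand-in rather than the code the plugin generates: a bare ASGI callable named &lt;code&gt;app&lt;/code&gt; that answers every HTTP request with a plain-text response.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;async def app(scope, receive, send):
    # Hypothetical stand-in for Datasette's ASGI application.
    # This sketch ignores non-HTTP scopes such as lifespan.
    if scope["type"] != "http":
        return
    body = "Hello from {}".format(scope["path"]).encode("utf-8")
    # An ASGI HTTP response is sent as a start message, then the body.
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [(b"content-type", b"text/plain; charset=utf-8")],
    })
    await send({"type": "http.response.body", "body": body})&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Because the catch-all route sends every path to &lt;code&gt;index.py&lt;/code&gt;, a single callable like this sees all incoming traffic and can do its own URL routing.&lt;/p&gt;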

&lt;p&gt;This was everything I needed to build the new plugin. &lt;a href="https://github.com/simonw/datasette-publish-now"&gt;datasette-publish-now&lt;/a&gt; is the result.&lt;/p&gt;

&lt;p&gt;Here's &lt;a href="https://datasette-public.now.sh/_src"&gt;the generated source code&lt;/a&gt; for a project deployed using the plugin, showing how the underlying ASGI application is configured.&lt;/p&gt;

&lt;p&gt;It's currently an alpha - not every feature is supported (see &lt;a href="https://github.com/simonw/datasette-publish-now/milestone/1"&gt;this milestone&lt;/a&gt;) and it relies on a minor deprecated feature (which I've &lt;a href="https://github.com/zeit/now/discussions/4021"&gt;implored Zeit to reconsider&lt;/a&gt;) but it's already full-featured enough that I can start using it to upgrade some of my smaller existing Now projects.&lt;/p&gt;

&lt;p&gt;The first I upgraded is one of my favourites: &lt;a href="https://polar-bears.now.sh/"&gt;polar-bears.now.sh&lt;/a&gt;, which visualizes tracking data from polar bear ear tags (using &lt;a href="https://github.com/simonw/datasette-cluster-map"&gt;datasette-cluster-map&lt;/a&gt;) that was &lt;a href="https://alaska.usgs.gov/products/data.php?dataid=130"&gt;published by the USGS Alaska Science Center, Polar Bear Research Program&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Here's the command I used to deploy the site:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;$ pip install datasette-publish-now
$ datasette publish now2 polar-bears.db \
    --title "Polar Bear Ear Tags, 2009-2011" \
    --source "USGS Alaska Science Center, Polar Bear Research Program" \
    --source_url "https://alaska.usgs.gov/products/data.php?dataid=130" \
    --install datasette-cluster-map \
    --project=polar-bears&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;I exported a full list of my Now v1 projects from their handy &lt;a href="https://zeit.co/dashboard/active-v1-instances"&gt;active v1 instances&lt;/a&gt; page.&lt;/p&gt;

&lt;h3&gt;The rest of my projects&lt;/h3&gt;

&lt;p&gt;I scraped the page using the following JavaScript, constructed with the help of the &lt;a href="https://simonwillison.net/2020/Apr/7/new-developer-features-firefox-75/"&gt;instant evaluation&lt;/a&gt; console feature in Firefox 75:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;console.log(
  JSON.stringify(
    Array.from(
      Array.from(
        document.getElementsByTagName("table")[1].
          getElementsByTagName("tr")
      ).slice(1).map(
        (tr) =&amp;gt;
          Array.from(
            tr.getElementsByTagName("td")
        ).map((td) =&amp;gt; td.innerText)
      )
    )
  )
);&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Then I loaded them into Datasette for analysis.&lt;/p&gt;

&lt;p&gt;After filtering out the &lt;code&gt;datasette-latest-commithash.now.sh&lt;/code&gt; projects I had deployed for every push to GitHub, it turns out I have 34 distinct projects running there.&lt;/p&gt;

&lt;p&gt;I won't port all of them, but given &lt;code&gt;datasette-publish-now&lt;/code&gt; I should be able to port the ones that I care about without too much trouble.&lt;/p&gt;

&lt;h3 id="git-bisect"&gt;Debugging Datasette with git bisect run&lt;/h3&gt;

&lt;p&gt;I fixed two bugs in Datasette this week using &lt;code&gt;git bisect run&lt;/code&gt; - a tool I've been meaning to figure out for years, which lets you run an automated binary search against a commit log to find the source of a bug.&lt;/p&gt;

&lt;p&gt;Since I was figuring out a new tool, I fired up another GitHub issue self-conversation: in &lt;a href="https://github.com/simonw/datasette/issues/716"&gt;issue #716&lt;/a&gt; I document my process of both learning to use &lt;code&gt;git bisect run&lt;/code&gt; and using it to find a solution to that particular bug.&lt;/p&gt;

&lt;p&gt;It worked great, so I used the same trick on &lt;a href="https://github.com/simonw/datasette/issues/689"&gt;issue #689&lt;/a&gt; as well.&lt;/p&gt;

&lt;p&gt;Watching &lt;code&gt;git bisect run&lt;/code&gt; churn through 32 revisions in a few seconds and pinpoint the exact moment a bug was introduced is pretty delightful:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;$ git bisect start master 0.34
Bisecting: 32 revisions left to test after this (roughly 5 steps)
[dc80e779a2e708b2685fc641df99e6aae9ad6f97] Handle scope path if it is a string
$ git bisect run python check_templates_considered.py
running python check_templates_considered.py
Traceback (most recent call last):
...
AssertionError
Bisecting: 15 revisions left to test after this (roughly 4 steps)
[7c6a9c35299f251f9abfb03fd8e85143e4361709] Better tests for prepare_connection() plugin hook, refs #678
running python check_templates_considered.py
Traceback (most recent call last):
...
AssertionError
Bisecting: 7 revisions left to test after this (roughly 3 steps)
[0091dfe3e5a3db94af8881038d3f1b8312bb857d] More reliable tie-break ordering for facet results
running python check_templates_considered.py
Traceback (most recent call last):
...
AssertionError
Bisecting: 3 revisions left to test after this (roughly 2 steps)
[ce12244037b60ba0202c814871218c1dab38d729] Release notes for 0.35
running python check_templates_considered.py
Traceback (most recent call last):
...
AssertionError
Bisecting: 1 revision left to test after this (roughly 1 step)
[70b915fb4bc214f9d064179f87671f8a378aa127] Datasette.render_template() method, closes #577
running python check_templates_considered.py
Traceback (most recent call last):
...
AssertionError
Bisecting: 0 revisions left to test after this (roughly 0 steps)
[286ed286b68793532c2a38436a08343b45cfbc91] geojson-to-sqlite
running python check_templates_considered.py
70b915fb4bc214f9d064179f87671f8a378aa127 is the first bad commit
commit 70b915fb4bc214f9d064179f87671f8a378aa127
Author: Simon Willison
Date:   Tue Feb 4 12:26:17 2020 -0800

    Datasette.render_template() method, closes #577

    Pull request #664.

:040000 040000 def9e31252e056845609de36c66d4320dd0c47f8 da19b7f8c26d50a4c05e5a7f05220b968429725c M	datasette
bisect run success&lt;/code&gt;&lt;/pre&gt;
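
&lt;p&gt;&lt;code&gt;git bisect run&lt;/code&gt; decides whether each commit is good or bad from the exit status of the command it is given: 0 means good, 125 means skip, and any other status from 1 to 127 means bad. An uncaught &lt;code&gt;AssertionError&lt;/code&gt; makes Python exit with status 1, which is why each traceback in the log above marks a bad commit. The contents of &lt;code&gt;check_templates_considered.py&lt;/code&gt; are not shown here, so the following is only a hypothetical sketch of the general shape of such a check, with placeholder names throughout.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;def check():
    # Hypothetical placeholder: replace this with an assertion that
    # fails only on commits where the bug is present.
    rendered = "templates considered: index.html"
    assert "templates considered" in rendered


def exit_code_for(check_fn):
    # Map the check result to the exit status git bisect run expects:
    # 0 marks the commit as good, 1 marks it as bad.
    try:
        check_fn()
    except AssertionError:
        return 1
    return 0&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;A script built this way would end with &lt;code&gt;sys.exit(exit_code_for(check))&lt;/code&gt; and be passed to &lt;code&gt;git bisect run python&lt;/code&gt; in the same way as the command above.&lt;/p&gt;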

&lt;h3&gt;Supporting metadata.yaml&lt;/h3&gt;

&lt;p&gt;The other Datasette project I completed this week is a relatively small feature with hopefully a big impact: you can &lt;a href="https://github.com/simonw/datasette/issues/713"&gt;now use YAML for Datasette's metadata configuration&lt;/a&gt; as an alternative to JSON.&lt;/p&gt;

&lt;p&gt;I'm not crazy about YAML: I still don't feel like I've mastered it, and I've been &lt;a href="https://simonwillison.net/tags/yaml/"&gt;tracking it for 18 years&lt;/a&gt;! But it has one big advantage over JSON for configuration files: robust support for multi-line strings.&lt;/p&gt;

&lt;p&gt;Datasette's &lt;a href="https://datasette.readthedocs.io/en/latest/metadata.html"&gt;metadata file&lt;/a&gt; can include lengthy SQL statements and strings of HTML, both of which benefit from multi-line strings.&lt;/p&gt;
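
&lt;p&gt;To illustrate the problem with JSON, here is a small sketch using Python's standard &lt;code&gt;json&lt;/code&gt; module. The SQL query and its table name are made up for the example. JSON strings cannot contain literal newlines, so a multi-line query ends up on a single line with &lt;code&gt;\n&lt;/code&gt; escapes.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import json

# Hypothetical multi-line SQL query of the kind a metadata file might hold.
sql = "select name, count(*) as n\nfrom plants\ngroup by name"

# Encoding to JSON collapses the query onto one line, with each
# newline replaced by a backslash-n escape sequence.
encoded = json.dumps({"sql": sql})
print(encoded)&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The encoded form round-trips correctly, but it is much harder to read and edit by hand than the same query spread over several lines in a YAML block.&lt;/p&gt;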

&lt;p&gt;I first used YAML for metadata for my &lt;a href="https://simonwillison.net/2018/Aug/6/russian-facebook-ads/"&gt;Analyzing US Election Russian Facebook Ads&lt;/a&gt; project. The &lt;a href="https://github.com/simonw/russian-ira-facebook-ads-datasette/blob/336ba87ef8071e664441ad0a95e3b8d0a33f682a/russian-ads-metadata.yaml"&gt;metadata file for that&lt;/a&gt; demonstrates both embedded HTML and embedded SQL - and an accompanying &lt;a href="https://github.com/simonw/russian-ira-facebook-ads-datasette/blob/336ba87ef8071e664441ad0a95e3b8d0a33f682a/build_metadata.py"&gt;build_metadata.py&lt;/a&gt; script converted it to JSON at build time. I've since used the same trick for a number of other projects.&lt;/p&gt;

&lt;p&gt;The next release of Datasette (hopefully within a week) will ship the new feature, at which point those conversion scripts won't be necessary.&lt;/p&gt;

&lt;p&gt;This should work particularly well with the forthcoming &lt;a href="https://github.com/simonw/datasette/issues/698"&gt;ability for a canned query to write to a database&lt;/a&gt;. Getting that wrapped up and shipped will be my focus for the next few days.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/git"&gt;git&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/github"&gt;github&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/yaml"&gt;yaml&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/zeit-now"&gt;zeit-now&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/weeknotes"&gt;weeknotes&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/github-issues"&gt;github-issues&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="git"/><category term="github"/><category term="projects"/><category term="yaml"/><category term="zeit-now"/><category term="datasette"/><category term="weeknotes"/><category term="github-issues"/></entry><entry><title>github-trending-repos</title><link href="https://simonwillison.net/2018/Feb/23/github-trending-repos/#atom-tag" rel="alternate"/><published>2018-02-23T17:36:41+00:00</published><updated>2018-02-23T17:36:41+00:00</updated><id>https://simonwillison.net/2018/Feb/23/github-trending-repos/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/vitalets/github-trending-repos"&gt;github-trending-repos&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
This is a really clever hack: Vitaliy Potapov built a system for subscribing to a weekly digest of trending GitHub repos in your favourite languages entirely on top of the existing GitHub issues notification system. Find the issue for your particular language and hit “subscribe” and you’ll get an email (or push notification depending on how you get your issue notifications) once a week with the latest trends. The implementation is a 220-line Node.js script which runs on a daily and weekly schedule using Circle CI, so Vitaliy doesn’t even have to host or pay for any of the underlying infrastructure. It’s brilliant.

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://news.ycombinator.com/item?id=16446250"&gt;Show HN&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/github"&gt;github&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/nodejs"&gt;nodejs&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/github-issues"&gt;github-issues&lt;/a&gt;&lt;/p&gt;



</summary><category term="github"/><category term="nodejs"/><category term="github-issues"/></entry></feed>