Simon Willison’s Weblog

42 items tagged “documentation”

2022

Software engineering practices

Gergely Orosz started a Twitter conversation asking about recommended “software engineering practices” for development teams.

[... 1557 words]

How I’m a Productive Programmer With a Memory of a Fruit Fly (via) Hynek Schlawack describes the value he gets from searchable offline developer documentation, and advocates for the Documentation Sets format which bundles docs, metadata and a SQLite search index. Hynek’s doc2dash command can convert documentation generated by tools like Sphinx into a docset that’s compatible with several offline documentation browser applications. # 19th September 2022, 4:19 pm

Your documentation is complete when someone can use your module without ever having to look at its code. This is very important. This makes it possible for you to separate your module’s documented interface from its internal implementation (guts). This is good because it means that you are free to change the module’s internals as long as the interface remains the same. Remember: the documentation, not the code, defines what a module does.

Ken Williams # 4th August 2022, 3:50 pm

Cleaning data with sqlite-utils and Datasette (via) I wrote a new tutorial for the Datasette website, showing how to use sqlite-utils to import a CSV file, clean up the resulting schema, fix date formats and extract some of the columns into a separate table. It’s accompanied by a ten minute video originally recorded for the HYTRADBOI conference. # 31st July 2022, 7:57 pm
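
The cleanup steps mentioned above (import a CSV, fix date formats, extract a column into a separate table) can be illustrated with a minimal sketch. It uses only Python's standard library rather than sqlite-utils itself, so it shows the idea and not the tutorial's actual commands. The CSV contents, the table names and the `load_and_clean` helper are all hypothetical.

```python
import csv
import io
import sqlite3
from datetime import datetime

# Hypothetical CSV data, invented for this sketch.
CSV_TEXT = """name,species,seen
Cleo,dog,07/31/2022
Azi,chicken,08/04/2022
Pancakes,dog,09/19/2022
"""


def load_and_clean(csv_text):
    """Import CSV rows, normalise the dates and extract species to a lookup table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table species (id integer primary key, name text unique)")
    conn.execute(
        "create table sightings "
        "(name text, species_id integer references species(id), seen text)"
    )
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Rewrite MM/DD/YYYY as ISO 8601 so the dates sort correctly as text.
        seen = datetime.strptime(row["seen"], "%m/%d/%Y").date().isoformat()
        # Store each distinct species once, then reference it by id.
        conn.execute(
            "insert or ignore into species (name) values (?)", (row["species"],)
        )
        species_id = conn.execute(
            "select id from species where name = ?", (row["species"],)
        ).fetchone()[0]
        conn.execute(
            "insert into sightings values (?, ?, ?)", (row["name"], species_id, seen)
        )
    return conn
```

Calling `load_and_clean(CSV_TEXT)` returns an in-memory connection where `sightings.seen` holds ISO dates and `species` holds one row per distinct value.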

Weeknotes: Building Datasette Cloud on Fly Machines, Furo for documentation

Hosting provider Fly released Fly Machines this week. I got an early preview and I’ve been working with it for a few days—it’s a fascinating new piece of technology. I’m using it to get my hosting service for Datasette ready for wider release.

[... 1005 words]

GOV.UK Guidance: Documenting APIs (via) Characteristically excellent guide from GOV.UK on writing great API documentation. “Task-based guidance helps users complete the most common integration tasks, based on the user needs from your research.” # 21st May 2022, 11:31 pm

jq language description (via) I love jq but I’ve always found it difficult to remember how to use it, and the manual hasn’t helped me as much as I would hope. It turns out the jq wiki on GitHub offers an alternative, more detailed description of the language which fits the way my brain works a lot better. # 26th April 2022, 7:04 pm

Deno by example (via) Interesting approach to documentation: a big list of annotated examples illustrating the Deno way of solving a bunch of common problems. # 17th March 2022, 1:02 am

Instantly create a GitHub repository to take screenshots of a web page

I just released shot-scraper-template, a GitHub repository template that helps you start taking automated screenshots of a web page by filling out a form.

[... 1177 words]

Weeknotes: Distracted by Playwright

My goal for this week was to unblock progress on Datasette by finally finishing the dash encoding implementation I described last week. I was getting close, and then I got very distracted by Playwright.

[... 892 words]

shot-scraper: automated screenshots for documentation, built on Playwright

shot-scraper is a new tool that I’ve built to help automate the process of keeping screenshots up-to-date in my documentation. It also doubles as a scraping tool (hence the name), which I picked as a complement to my git scraping and help scraping techniques.

[... 1781 words]

Weeknotes: Datasette Tutorials

I published two new tutorials for Datasette this week, both aimed at end users of the web application.

[... 479 words]

2021

Making world-class docs takes effort (via) Curl maintainer Daniel Stenberg writes about his principles for good documentation. I agree with all of these: he emphasizes keeping docs in the repo, avoiding the temptation to exclusively generate them from code, featuring examples and ensuring every API you provide has documentation. Daniel describes an approach similar to the documentation unit tests I’ve been using for my own projects: he has scripts which scan the curl documentation to ensure not only that everything is documented but that each documentation area contains the same sections in the same order. # 6th September 2021, 6:58 pm
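
A check of the kind described above, where every document must contain the same sections in the same order, might look something like the following sketch. It is not curl's actual script: the required section names, the `# Heading` convention and both function names are assumptions made for illustration.

```python
import re

# Hypothetical required sections; a real project would define its own list.
REQUIRED_SECTIONS = ["NAME", "SYNOPSIS", "DESCRIPTION", "EXAMPLE"]


def section_headings(text):
    """Return the headings in a document, in order of appearance.

    Assumes a heading is any line starting with '# ' (an illustrative
    convention, not curl's real format).
    """
    return [m.group(1).strip() for m in re.finditer(r"^# (.+)$", text, re.MULTILINE)]


def check_document(name, text, required=REQUIRED_SECTIONS):
    """Return a list of human-readable problems found in one document."""
    headings = section_headings(text)
    problems = [
        f"{name}: missing section {section}"
        for section in required
        if section not in headings
    ]
    # Compare the required headings that are present against the expected order.
    present = [h for h in headings if h in required]
    expected = [s for s in required if s in headings]
    if present != expected:
        problems.append(f"{name}: sections out of order or repeated: {present}")
    return problems
```

Run against every file in a docs directory from a test suite, a non-empty result would fail the build, which is the same basic shape as a documentation unit test.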

The Diátaxis documentation framework. Daniele Procida’s model of four types of technical documentation—tutorials, how-to guides, technical reference and explanation—now has a name: Diátaxis. # 21st August 2021, 10:59 pm

Datasette on Codespaces, sqlite-utils API reference documentation and other weeknotes

This week I broke my streak of not sending out the Datasette newsletter, figured out how to use Sphinx for Python class documentation, worked out how to run Datasette on GitHub Codespaces, implemented Datasette column metadata and got tantalizingly close to a solution for an elusive Datasette feature.

[... 2164 words]

Adding Sphinx autodoc to a project, and configuring Read The Docs to build it. My TIL notes from figuring out how to use sphinx-autodoc for the sqlite-utils reference documentation today. # 11th August 2021, 1:21 am

sqlite-utils API reference (via) I released sqlite-utils 3.15.1 today with just one change, but it’s a big one: I’ve added docstrings and type annotations to nearly every method in the library, and I’ve started using sphinx-autodoc to generate an API reference page in the documentation directly from those docstrings. I’ve deliberately avoided building this kind of documentation in the past because I so often see projects where the class reference is the ONLY documentation, which I find makes it really hard to figure out how to actually use it. sqlite-utils already has extensive narrative prose documentation so in this case I think it’s a useful enhancement—especially since the docstrings and type hints can help improve the usability of the library in IDEs and Jupyter notebooks. # 11th August 2021, 1:03 am
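
To show why docstrings and type annotations are useful to tooling, here is a small sketch using only the standard library. It does not reproduce how sphinx-autodoc works internally, and `first_value` is a hypothetical function, not part of sqlite-utils. It only demonstrates that both pieces of information can be read from a live function object, which is roughly what documentation generators and IDEs rely on.

```python
import inspect
from typing import Dict, List, Optional, get_type_hints


def first_value(
    rows: List[Dict[str, str]], column: str, default: Optional[str] = None
) -> Optional[str]:
    """Return the value of ``column`` from the first row that has it.

    :param rows: rows represented as dictionaries
    :param column: name of the column to look up
    :param default: value returned when no row contains ``column``
    """
    for row in rows:
        if column in row:
            return row[column]
    return default


def describe(func):
    """Collect the signature, type hints and docstring a doc tool could read."""
    return {
        "signature": f"{func.__name__}{inspect.signature(func)}",
        "hints": get_type_hints(func),
        "doc": inspect.getdoc(func),
    }
```

`describe(first_value)` returns a dictionary containing the rendered signature, the resolved type hints and the cleaned-up docstring.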

2020

Design Docs at Google. Useful description of the format used for software design docs at Google—informal documents of between 3 and 20 pages that outline the proposed design of a new project, discuss trade-offs that were considered and solicit feedback before the code starts to be written. # 7th August 2020, 4:31 pm

The unofficial Google Cloud Run FAQ. This is really useful: a no-fluff, content rich explanation of Google Cloud Run hosted as a GitHub repo that actively accepts pull requests from the community. It’s maintained by Ahmet Alp Balkan, a Cloud Run engineer who states “Googlers: If you find this repo useful, you should recognize the work internally, as I actively fight for alternative forms of content like this”. One of the hardest parts of working with AWS and GCP is digging through the marketing materials to figure out what the product actually does, so the more alternative forms of documentation like this the better. # 22nd July 2020, 5:20 pm

How to find what you want in the Django documentation (via) Useful guide by Matthew Segal to navigating the Django documentation, and tips for reading documentation in general. The Django docs have a great reputation so it’s easy to forget how intimidating they can be for newcomers. Matthew emphasizes that docs are rarely meant to be read in full: the trick is learning how to quickly search them for the things you need to understand right now. # 3rd July 2020, 3:04 pm

2019

Documentation needs to include and be structured around its four different functions: tutorials, how-to guides, explanation and technical reference. Each of them requires a distinct mode of writing. People working with software need these four different kinds of documentation at different times, in different circumstances—so software usually needs them all.

Daniele Procida # 3rd August 2019, 8:29 am

2018

The subset of reStructuredText worth committing to memory

reStructuredText is the standard for documentation in the Python world.

[... 1186 words]

Honeycomb changelog (via) Too few hosted services have detailed user-facing changelogs. This one from Honeycomb (a metrics, tracing and observability platform) is a particularly great example. I especially like the use of animated screenshots, something I’ve been evangelizing pretty heavily recently for internal communication at work. # 25th August 2018, 3:12 am

Documentation unit tests

Or: Test-driven documentation.

[... 1521 words]

SpatiaLite — Datasette documentation. Datasette’s documentation now includes extensive coverage of the SpatiaLite extension for SQLite: how to install it, how to import latitude/longitude points, shapefiles and GeoJSON data into SpatiaLite tables, and how to run SQL queries against it that take advantage of spatial indexes. I’m learning SpatiaLite at the moment and filling out the documentation with each new trick I learn as I go—as Mark Pilgrim once taught me, the best way to learn a new technology is to write about it. # 30th May 2018, 4:34 am

2017

TLDR pages. This is an absurdly good idea: a community maintained set of alternative man pages for common commands with a focus on usage examples, plus a “tldr netstat” command to see them. The man pages themselves are maintained on GitHub. # 24th November 2017, 5:38 am

gitchangelog. Handy Python utility that can generate a reStructuredText changelog from your git commit log. I used this to help get the Datasette release notes started. # 16th November 2017, 4:52 pm

Datasette 0.12. I just released v0.12 of Datasette. The most exciting new feature is the ability to display a UI for editing named parameters—so you can construct an arbitrarily complex SQL query, include some named parameters and then link directly to it in Datasette to provide a simple interface for changing those parameters. An example involving Australian dogs is included in the release notes. # 16th November 2017, 3:55 pm
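
Named parameters are a standard SQLite feature, so the underlying mechanism can be sketched with Python's built-in `sqlite3` module. This is not Datasette's code: the `dogs` table, its rows and the `run` helper are hypothetical, standing in for the values an editing form would supply.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table dogs (name text, breed text, age integer)")
# Hypothetical rows, invented for this sketch.
conn.executemany(
    "insert into dogs values (?, ?, ?)",
    [("Cleo", "Kelpie", 5), ("Pancakes", "Kelpie", 2), ("Rex", "Dingo", 4)],
)

# The query text stays fixed; only the :breed and :min_age values change.
SQL = """
select name from dogs
where breed = :breed and age >= :min_age
order by name
"""


def run(params):
    """Execute the fixed query with a dictionary of named parameter values."""
    return [row[0] for row in conn.execute(SQL, params)]
```

Calling `run({"breed": "Kelpie", "min_age": 3})` binds both values safely without any string concatenation, so the same query can be reused with different inputs.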

Doc of docs

Here’s a low-tech, high-impact trick I recently learned at work that’s amazingly useful: create a doc-of-docs.

[... 215 words]

2009

Writing good documentation (part 1). Jacob explains some of the philosophy behind Django’s documentation. Topical guides are particularly interesting: many projects skip them (leaving books to cover that ground) but they bridge an essential gap between tutorials and low-level reference documentation. # 11th November 2009, 7:13 am