Simon Willison’s Weblog

May 2020

May 2, 2020

github-to-sqlite 2.2 highlights thread. I released github-to-sqlite 2.2 today with a new “stargazers” command for importing users who have starred one or more specific repositories. This Twitter thread lists highlights of recent releases and links to a live Datasette demo that shows what the tool can do.

# 10:16 pm / projects, dogsheep, datasette, github

May 4, 2020

How to install and upgrade Datasette using pipx (via) I’ve been using pipx to run Datasette for a while now—it’s a neat Python packaging tool which installs a Python CLI command with all of its dependencies in its own isolated virtual environment. Today, thanks to Twitter, I figured out how to install and upgrade plugins in the same environment—so I added a section to the Datasette installation documentation about it.

# 7:23 pm / datasette, pip, python

How to get Rich with Python (a terminal rendering library). Will McGugan introduces Rich, his new Python library for rendering content on the terminal. This is a very cool piece of software—out of the box it supports coloured text, emoji, tables, rendering Markdown, syntax highlighting code, rendering Python tracebacks, progress bars and more. “pip install rich” and then “python -m rich” to render a “test card” demo demonstrating the features of the library.

# 11:27 pm / cli, python, will-mcgugan

May 5, 2020

A hands-on introduction to static code analysis. Useful tutorial on using the Python standard library tokenize and ast modules to find specific patterns in Python source code, using the visitor pattern.
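
A minimal sketch of the visitor pattern the tutorial describes, using only the standard library ast module. The CallFinder class, the find_calls helper and the SOURCE sample are names made up for this illustration and are not taken from the tutorial.

```python
import ast

# Sample code to analyse (hypothetical, for illustration only).
SOURCE = '''
import os

def risky(cmd):
    os.system(cmd)
    print("done")
'''


class CallFinder(ast.NodeVisitor):
    """Collect (line number, dotted name) for every function call in a tree."""

    def __init__(self):
        self.calls = []

    def visit_Call(self, node):
        # NodeVisitor dispatches to visit_<NodeType> for each node it meets.
        name = self._dotted_name(node.func)
        if name is not None:
            self.calls.append((node.lineno, name))
        # Keep walking so calls nested inside arguments are found too.
        self.generic_visit(node)

    def _dotted_name(self, node):
        # Turn Name / Attribute chains such as os.system into "os.system".
        if isinstance(node, ast.Name):
            return node.id
        if isinstance(node, ast.Attribute):
            prefix = self._dotted_name(node.value)
            return f"{prefix}.{node.attr}" if prefix else None
        return None


def find_calls(source):
    finder = CallFinder()
    finder.visit(ast.parse(source))
    return finder.calls
```

Calling find_calls(SOURCE) returns the line number and name of each call, which is enough to flag something like os.system in a simple linter.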

# 12:15 am / compilers, python, staticanalysis

May 7, 2020

Weeknotes: Datasette 0.41, photos breakthroughs

Shorter weeknotes this week, because my main project for the week warrants a detailed write-up on its own (coming soon... update 21st May here it is).

[... 867 words]

html-to-svg (via) This is absolutely ingenious: 50 lines of JavaScript which uses Puppeteer to get headless Chrome to grab a PDF screenshot of a page, then shells out to Inkscape to convert the PDF to SVG. Wraps the whole thing up in a Docker container and ships it to Cloud Run as a web service you can call by passing it a URL.

# 6:01 am / cloudrun, chrome, svg, puppeteer

May 8, 2020

Datasette table diagram, now with a DOT graph (via) Thomas Ballinger shared a huge improvement to my Observable notebook for rendering a diagram of a collection of Datasette tables. He showed how to use the DOT language to render a full schema diagram with arrows joining together the different tables. I’ve applied his changes to my notebook.
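
A rough sketch of the same idea in Python rather than the notebook’s JavaScript: read the foreign keys from a SQLite file and emit DOT, with one edge per foreign key. This is not the notebook’s code. It assumes direct access to the database via the sqlite3 module, and schema_to_dot, demo_connection and the users/repos tables are invented for the example.

```python
import sqlite3


def schema_to_dot(conn):
    """Return a DOT digraph: one node per table, one edge per foreign key."""
    tables = [
        row[0]
        for row in conn.execute(
            "select name from sqlite_master where type = 'table' order by name"
        )
    ]
    lines = ["digraph schema {", "  rankdir=LR;"]
    for table in tables:
        lines.append(f'  "{table}";')
    for table in tables:
        # foreign_key_list rows are: id, seq, table, from, to, on_update, ...
        for fk in conn.execute(f"PRAGMA foreign_key_list([{table}])"):
            target, from_col, to_col = fk[2], fk[3], fk[4]
            lines.append(
                f'  "{table}" -> "{target}" [label="{from_col} -> {to_col}"];'
            )
    lines.append("}")
    return "\n".join(lines)


def demo_connection():
    # Hypothetical two-table schema, just to have something to draw.
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        create table users (id integer primary key, login text);
        create table repos (
            id integer primary key,
            name text,
            owner integer references users(id)
        );
        """
    )
    return conn


if __name__ == "__main__":
    print(schema_to_dot(demo_connection()))
```

The output can be pasted into any Graphviz renderer to get boxes for the tables and an arrow from repos to users.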

# 3:23 am / observable, datasette, visualization

May 9, 2020

pyp: Easily run Python at the shell (via) Fascinating little CLI utility which uses some deeply clever AST introspection to enable little Python one-liners that act as replacements for all manner of pipe-oriented unix utilities. Took me a while to understand how it works from the README, but then I looked at the code and the entire thing is only 380 lines long. There’s also a useful --explain option which outputs the Python source code that it would execute for a given command.
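
To get a feel for what AST introspection can do here, this is a toy illustration of the general technique and not pyp’s actual implementation. It looks for names an expression reads but never assigns, then generates a wrapper script based on which ones it finds. The function names and the choice of x and lines as the special variables are assumptions made for this sketch.

```python
import ast
import builtins


def undefined_names(code):
    """Names the snippet reads but never assigns, ignoring builtins."""
    loaded, stored = set(), set()
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Load):
                loaded.add(node.id)
            else:
                stored.add(node.id)
    return {name for name in loaded - stored if not hasattr(builtins, name)}


def explain(expr):
    """Return the source of the script a one-line expression expands to."""
    names = undefined_names(expr)
    if "x" in names:
        # The expression mentions x: evaluate it once per input line.
        body = [
            "for x in sys.stdin:",
            "    x = x.rstrip('\\n')",
            f"    print({expr})",
        ]
    elif "lines" in names:
        # The expression mentions lines: read all of stdin into a list first.
        body = [
            "lines = [line.rstrip('\\n') for line in sys.stdin]",
            f"print({expr})",
        ]
    else:
        body = [f"print({expr})"]
    return "\n".join(["import sys"] + body)
```

Here explain("x.upper()") produces a loop over standard input, while explain("len(lines)") produces a script that reads every line up front, which is roughly the kind of output an explain option makes visible.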

# 9:05 pm / shell, python

May 11, 2020

And for what? Again - there is a swath of use cases which would be hard without React and which aren’t complicated enough to push beyond React’s limits. But there are also a lot of problems for which I can’t see any concrete benefit to using React. Those are things like blogs, shopping-cart-websites, mostly-CRUD-and-forms-websites. For these things, all of the fancy optimizations are optimizations to get you closer to the performance you would’ve gotten if you just hadn’t used so much technology.

Tom MacWright

# 12:03 am / react, tom-macwright

Data Journalism Academy (via) MaryJo Webster is the data editor for the Star Tribune in Minneapolis, and a 2019 Pulitzer nominee. She has a huge amount of experience teaching data journalism and has just released her accumulated teaching materials in the form of the Data Journalism Academy.

# 4:45 am / data-journalism

Why we at $FAMOUS_COMPANY Switched to $HYPED_TECHNOLOGY (via) Beautiful piece of writing by Saagar Jha. “Ultimately, however, our decision to switch was driven by our difficulty in hiring new talent for $UNREMARKABLE_LANGUAGE, despite it being taught in dozens of universities across the United States. Our blog posts on $PRACTICAL_OPEN_SOURCE_FRAMEWORK seemed to get fewer upvotes when posted on Reddit as well, cementing our conviction that our technology stack was now legacy code.”

# 7:11 pm / migration

May 13, 2020

Deno 1.0. Deno is a new take on server-side JavaScript from a team led by Ryan Dahl, who originally created Node.js. It’s built using Rust and crammed with fascinating ideas, like the ability to import code directly from a URL.

# 11:38 pm / nodejs, ryan-dahl, javascript, rust, deno

May 14, 2020

Weeknotes: Working on my screenplay

I’m taking an Introduction to Screenwriting course with Adam Tobin at Stanford, and my partial screenplay is due this week. I’m pulling together some scenes that tell the story of the Russian 1917 February Revolution and the fall of the Tsar through the lens of the craftsmen working on the Tsar’s last Fabergé egg. So I’ve not been spending much time on anything else.

[... 226 words]

Web apps are typically continuously delivered, not rolled back, and you don't have to support multiple versions of the software running in the wild.

This is not the class of software that I had in mind when I wrote the blog post 10 years ago. If your team is doing continuous delivery of software, I would suggest to adopt a much simpler workflow (like GitHub flow) instead of trying to shoehorn git-flow into your team.

Vincent Driessen

# 1:49 pm / git, continuous-deployment, continuous-integration

May 18, 2020

Doordash and Pizza Arbitrage (via) In which a pizza restaurant owner notices that Doordash, uninvited, have started offering their $24 pizzas for $16 and starts ordering their own pizzas and keeping the difference.

# 2:32 pm / money

May 20, 2020

Company culture is the shared way everyone acts when you aren’t around to see it

Adam Kalsey

# 3:30 am / management

May 21, 2020

Using SQL to find my best photo of a pelican according to Apple Photos

According to the Apple Photos internal SQLite database, this is the most aesthetically pleasing photograph I have ever taken of a pelican:

[... 1,937 words]

May 22, 2020

Food consumption really only grows at the rate of population growth, so if you want to grow faster than that, you have to take market share from someone else. Ideally, you take it from someone weaker, who has less information. In this industry, the delivery platforms have found unsuspecting victims in restaurants and drivers.

Collin Wallace

# 3:05 am / restaurants

Using SQL to Look Through All of Your iMessage Text Messages (via) Dan Kelch shows how to access the iMessage SQLite database at ~/Library/Messages/chat.db—it’s protected under macOS Catalina so you have to enable Full Disk Access in the privacy settings first. I usually use the macOS terminal app but I installed iTerm for this because I’d rather enable full disk access to a separate terminal program than let anything I’m running in my regular terminal take advantage of it. It worked! Now I can run “datasette ~/Library/Messages/chat.db” to browse my messages.
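
The same database can be queried from Python with the standard sqlite3 module. This is a hedged sketch: the path comes from the post, but the table and column names (a message table whose handle_id joins to handle.ROWID, with handle.id holding the phone number or email) are assumptions about the schema that may differ between macOS versions. The function names are made up for the example.

```python
import sqlite3
from pathlib import Path

# Location given in the post; needs Full Disk Access on macOS Catalina.
CHAT_DB = Path.home() / "Library" / "Messages" / "chat.db"


def open_read_only(path):
    # mode=ro means this script cannot modify the messages database.
    return sqlite3.connect(f"{Path(path).resolve().as_uri()}?mode=ro", uri=True)


def messages_per_contact(conn, limit=10):
    """Count messages per contact, busiest first.

    ASSUMPTION: message.handle_id joins to handle.ROWID, and handle.id
    is the contact's phone number or email address.
    """
    sql = """
        select handle.id, count(*) as n
        from message
        join handle on handle.ROWID = message.handle_id
        group by handle.id
        order by n desc, handle.id
        limit ?
    """
    return conn.execute(sql, (limit,)).fetchall()
```

If the schema matches, messages_per_contact(open_read_only(CHAT_DB)) should list the ten contacts with the most messages.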

# 4:45 pm / datasette, apple, sql, sqlite

May 26, 2020

Waiting in asyncio. Handy cheatsheet explaining the differences between asyncio.gather(), asyncio.wait_for(), asyncio.as_completed() and asyncio.wait() by Hynek Schlawack.
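
A small side-by-side sketch of the four functions, written for this note rather than taken from the cheatsheet. The job coroutine, its delays and the results dictionary are invented for the example.

```python
import asyncio


async def job(name, delay):
    await asyncio.sleep(delay)
    return name


async def main():
    results = {}

    # gather(): run awaitables concurrently, results come back in argument order.
    results["gather"] = await asyncio.gather(job("slow", 0.1), job("fast", 0.01))

    # wait_for(): wrap a single awaitable in a timeout; raises when it expires.
    try:
        await asyncio.wait_for(job("too slow", 0.5), timeout=0.01)
    except asyncio.TimeoutError:
        results["wait_for"] = "timed out"

    # as_completed(): iterate over awaitables in the order they finish.
    finished = []
    for next_done in asyncio.as_completed([job("slow", 0.1), job("fast", 0.01)]):
        finished.append(await next_done)
    results["as_completed"] = finished

    # wait(): returns (done, pending) sets of tasks and cancels nothing itself.
    tasks = [
        asyncio.create_task(job("slow", 0.5)),
        asyncio.create_task(job("fast", 0.01)),
    ]
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    results["wait"] = sorted(task.result() for task in done)
    for task in pending:
        task.cancel()
    # Let the cancelled tasks finish so nothing is left dangling.
    await asyncio.gather(*pending, return_exceptions=True)

    return results


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The main difference to notice is who owns the leftovers: gather() and wait_for() hand back results or raise, while wait() leaves any pending tasks for the caller to cancel or await.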

# 3:28 pm / async, python, hynek-schlawack

Serving photos locally with datasette-media. datasette-media is a new Datasette plugin which can serve static files from disk in response to a configured SQL query that maps incoming URL parameters to a path to a file. I built it so I could run dogsheep-photos locally on my laptop and serve up thumbnails of images that match particular queries. I’ve added documentation to the dogsheep-photos README explaining how to use datasette-media, datasette-json-html and datasette-template-sql to create custom interfaces onto Apple Photos data on your machine.

# 3:53 pm / projects, dogsheep, applephotos, datasette, plugins

AWS services explained in one line each (via) Impressive effort to summarize all 163(!) AWS services. This helped clarify a whole bunch that I haven’t figured out yet. Only a few defeated the author, with a single question mark for the description. I enjoyed Amazon Braket: “Some quantum thing. It’s in preview so I have no idea what it is.”

# 4:41 pm / aws

May 27, 2020

Why we use homework to recruit engineers. Ad Hoc run a remote-first team, and use detailed homework assignments as part of their interview process in place of in-person technical interviews. The homework assignments are really interesting to browse through: “Containerize”, for example, involves building a Docker container to run a Python app with nginx and a modern cipher suite. I’m nervous about the extra burden this places on candidates, but Ad Hoc address that: “We recognize that we’re asking folks to invest time into our process, but we feel like our homework compares favorably to extensive on-site interviews or other evaluation techniques, especially for candidates who have responsibilities outside of their work life.”

# 6:04 pm / recruiting

May 28, 2020

Any time you can think of something that is possible this year and wasn’t possible last year, you should pay attention. You may have the seed of a great startup idea. This is especially true if next year will be too late.

Sam Altman

# 9:36 pm / ideas, startups

Advice on specifying more granular permissions with Google Cloud IAM (via) My single biggest frustration working with both Google Cloud and AWS is permissions: more specifically, figuring out the smallest set of permissions I need to assign in order to achieve different goals. Katie McLaughlin’s new series aims to address exactly that problem. I learned a ton from this that I’ve previously missed, and there’s plenty of actionable advice on tooling that can be used to help figure this stuff out.

# 10:44 pm / permissions, cloudrun

May 29, 2020

Weeknotes: Datasette 0.43

My main achievement this week was shipping Datasette 0.43, with a collection of smaller improvements and one big one: a redesign of the register_output_renderer plugin hook.

[... 475 words]

Deno is a Browser for Code (via) One of the most interesting ideas in Deno is that code imports are loaded directly from URLs—which can themselves depend on other URL-based packages. On first encounter it feels wrong—obviously insecure. Deno contributor Kitson Kelly provides a deeper exploration of the idea, and explains how the combination of caching and lock files makes it no less secure than code installed from npm or PyPI.
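
The caching plus lock file argument can be illustrated with a short Python sketch. This is an analogy and not Deno’s implementation. It assumes a lock file that maps each URL to a SHA-256 hash of the content, and load_module, IntegrityError and the injected fetch function are all hypothetical names.

```python
import hashlib


class IntegrityError(Exception):
    """Raised when content does not match the hash recorded in the lock."""


def sha256_hex(content):
    return hashlib.sha256(content).hexdigest()


def load_module(url, fetch, cache, lock):
    """Return the source for url, fetching at most once and checking the lock.

    fetch: callable taking a URL and returning bytes (hypothetical).
    cache: dict of url -> bytes, standing in for an on-disk cache.
    lock:  dict of url -> sha256 hex digest, standing in for a lock file.
    """
    if url not in cache:
        # Only the first use of a URL touches the network.
        cache[url] = fetch(url)
    content = cache[url]
    digest = sha256_hex(content)
    expected = lock.get(url)
    if expected is None:
        # First time this URL is seen: record what we got.
        lock[url] = digest
    elif expected != digest:
        raise IntegrityError(f"{url} does not match the locked hash")
    return content
```

Once the lock dictionary has an entry for a URL, any later change to that content raises IntegrityError instead of being silently executed, which is the same guarantee a lock file gives for packages installed from a registry.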

# 2:36 am / packaging, deno

Practical Python Programming (via) David Beazley has been developing and presenting this three day Python course (aimed at people with some prior programming experience) for over thirteen years, and he’s just released the course materials under a Creative Commons license for the first time.

# 1:15 pm / david-beazley, python
