Simon Willison’s Weblog

779 items tagged “python”

2022

Datasette Lite: a server-side Python web application running in a browser

Datasette Lite is a new way to run Datasette: entirely in a browser, taking advantage of the incredible Pyodide project which provides Python compiled to WebAssembly plus a whole suite of useful extras.

[... 4760 words]

sqlite-utils 3.26.1 (via) I released sqlite-utils 3.26.1 with one tiny but exciting feature: I fixed its one dependency that wasn’t published as a pure Python wheel, which means it can now be used with Pyodide (Python compiled to WebAssembly running in your browser)! # 2nd May 2022, 6:43 pm

PyScript demos (via) PyScript was announced at PyCon this morning. It’s a new open source project that provides Web Components built on top of Pyodide, allowing you to use Python directly within your HTML pages in a way that is executed using a WebAssembly copy of Python running in your browser. These demos really help illustrate what it can do—it’s a fascinating new piece of the Python web ecosystem. # 30th April 2022, 9:50 pm

Testing Datasette parallel SQL queries in the nogil/python fork. As part of my ongoing research into whether Datasette can be sped up by running SQL queries in parallel I’ve been growing increasingly suspicious that the GIL is holding me back. I know the sqlite3 module releases the GIL and was hoping that would give me parallel queries, but it looks like there’s still a ton of work going on in Python GIL land creating Python objects representing the results of the query. Sam Gross has been working on a nogil fork of Python and I decided to give it a go. It’s published as a Docker image and it turns out trying it out really did just take a few commands... and it produced the desired results: my parallel code started beating my serial code, where previously the two had produced effectively the same performance numbers. I’m pretty stunned by this. I had no idea how far along the nogil fork was. It’s amazing to see it in action. # 29th April 2022, 5:45 am
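To make the serial versus parallel comparison concrete, here is a minimal sketch using only the standard library. It is not Datasette's code: the function names, the demo table and the three queries are all hypothetical, and each query opens its own connection so that no connection is shared between threads.

```python
import sqlite3
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def build_demo_db(path):
    # Hypothetical demo data: a single table holding the integers 0..999.
    conn = sqlite3.connect(path)
    with conn:
        conn.execute("create table numbers (n integer)")
        conn.executemany(
            "insert into numbers (n) values (?)", [(i,) for i in range(1000)]
        )
    conn.close()


def run_query(path, sql):
    # One connection per call, so threads never share a connection.
    conn = sqlite3.connect(path)
    try:
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


def run_serial(path, queries):
    return [run_query(path, sql) for sql in queries]


def run_parallel(path, queries, max_workers=4):
    # pool.map preserves the order of the input queries.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda sql: run_query(path, sql), queries))


QUERIES = [
    "select count(*) from numbers",
    "select sum(n) from numbers",
    "select max(n) from numbers",
]

with tempfile.TemporaryDirectory() as tmp:
    db_path = str(Path(tmp) / "demo.db")
    build_demo_db(db_path)
    serial_results = run_serial(db_path, QUERIES)
    parallel_results = run_parallel(db_path, QUERIES)
```

Both functions return the same rows. Whether the threaded version is actually faster depends on how much time is spent building Python result objects while holding the GIL, which is the effect described above.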

Automatically opening issues when tracked file content changes

I figured out a GitHub Actions pattern to keep track of a file published somewhere on the internet and automatically open a new repository issue any time the contents of that file changes.

[... 1151 words]

Weeknotes: Parallel SQL queries for Datasette, plus some middleware tricks

A promising new performance optimization for Datasette, plus new datasette-gzip and datasette-total-page-time plugins.

[... 1534 words]

Useful tricks with pip install URL and GitHub

The pip install command can accept a URL to a zip file or tarball. GitHub provides URLs that can create a zip file of any branch, tag or commit in any repository. Combining these is a really useful trick for maintaining Python packages.

[... 929 words]
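A small sketch of how the two pieces combine: build a GitHub archive URL for a branch, tag or commit, then hand it to pip. The helper names are hypothetical, and simonw/datasette is used purely as an example repository.

```python
def github_archive_url(owner, repo, ref, kind="branch"):
    # Hypothetical helper: returns the zip archive URL for a branch, tag or commit.
    base = f"https://github.com/{owner}/{repo}/archive"
    if kind == "branch":
        return f"{base}/refs/heads/{ref}.zip"
    if kind == "tag":
        return f"{base}/refs/tags/{ref}.zip"
    if kind == "commit":
        return f"{base}/{ref}.zip"
    raise ValueError(f"unknown kind: {kind!r}")


def pip_install_command(url):
    # Argument list suitable for subprocess.run(); nothing is executed here.
    return ["python", "-m", "pip", "install", url]


url = github_archive_url("simonw", "datasette", "main")
command = pip_install_command(url)
```

Passing `command` to `subprocess.run(command, check=True)` would perform the install; the sketch stops short of that so it has no side effects.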

Glue code to quickly copy data from one Postgres table to another (via) The Python script that Retool used to migrate 4TB of data between two PostgreSQL databases. I find the structure of this script really interesting—it uses Python to spin up a queue full of ID ranges to be transferred and then starts some threads, but then each thread shells out to a command that runs “psql COPY (SELECT ...) TO STDOUT” and pipes the result to “psql COPY xxx FROM STDIN”. Clearly this works really well (“saturate the database’s hardware capacity” according to a comment on HN), and neatly sidesteps any issues with Python’s GIL. # 19th April 2022, 4:57 pm
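The queue-plus-threads structure can be sketched roughly like this. This is not Retool's script: every name is hypothetical, and `copy_range` is a stand-in that only counts IDs where the real script would shell out to the psql pipeline quoted above.

```python
import queue
import threading


def id_ranges(min_id, max_id, batch_size):
    # Yield inclusive (start, end) ID ranges covering min_id..max_id.
    start = min_id
    while start <= max_id:
        end = min(start + batch_size - 1, max_id)
        yield (start, end)
        start = end + 1


def copy_range(start, end):
    # Placeholder: a real version would launch an external copy command here
    # (for example via subprocess), so the heavy lifting happens outside Python.
    return end - start + 1


def worker(work_queue, results, lock):
    # Pull ranges until the queue is empty, then exit.
    while True:
        try:
            start, end = work_queue.get_nowait()
        except queue.Empty:
            return
        copied = copy_range(start, end)
        with lock:
            results.append((start, end, copied))


def run_copy(min_id, max_id, batch_size, thread_count=4):
    work_queue = queue.Queue()
    for id_range in id_ranges(min_id, max_id, batch_size):
        work_queue.put(id_range)
    results = []
    lock = threading.Lock()
    threads = [
        threading.Thread(target=worker, args=(work_queue, results, lock))
        for _ in range(thread_count)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return sorted(results)


results = run_copy(1, 10, 4)
```

Because each thread would spend its time waiting on an external process, the GIL is not a bottleneck in this design: Python only coordinates the work.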

typesplainer (via) A Python module that produces human-readable English descriptions of Python type definitions—also available as a web interface. # 15th March 2022, 6:18 am

[history] When I tried this in 1996 (via) “I removed the GIL back in 1996 from Python 1.4...” is the start of a fascinating (supportive) comment by Greg Stein on the promising nogil Python fork that Sam Gross has been putting together. Greg provides some historical context that I’d never heard before, relating to an embedded Python for Microsoft IIS. # 21st February 2022, 10:43 pm

Mypyc (via) Spotted this in the Black release notes: “Black is now compiled with mypyc for an overall 2x speed-up”. Mypyc is a tool that compiles Python modules (written in a subset of Python) to C extensions—similar to Cython but using just Python syntax, taking advantage of type annotations to perform type checking and type inference. It’s part of the mypy type checking project, which has been using it since 2019 to gain a 4x performance improvement over regular Python. # 30th January 2022, 1:31 am

Black 22.1.0 (via) Black, the uncompromising code formatter for Python, has had its first stable non-beta release after almost four years of releases. I adopted Black a few years ago for all of my projects and I wouldn’t release Python code without it now—the productivity boost I get from not spending even a second thinking about code formatting and indentation is huge. I know Django has been holding off on adopting it until a stable release was announced, so hopefully that will happen soon. # 30th January 2022, 1:23 am

Weeknotes: python_requires, documentation SEO

Fixed Datasette on Python 3.6 for the last time. Worked on documentation infrastructure improvements. Spent some time with Fly Volumes.

[... 1497 words]

2021

Annotated explanation of David Beazley’s dataklasses (via) David Beazley released a self-described “deliciously evil spin on dataclasses” that uses some deep Python trickery to implement a dataclass style decorator which creates classes that import 15-20 times faster than the original. I put together a heavily annotated version of his code while trying to figure out how all of the different Python tricks in it work. # 20th December 2021, 5:05 am

TypeScript for Pythonistas (via) Really useful explanation of how TypeScript differs from Python with mypy. I hadn’t realized TypeScript leans so far into structural typing, to the point that two types with different names but the same “shape” are identified as being the same type as each other. # 17th December 2021, 7:43 pm
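For comparison, the closest Python analogue to that structural behaviour is probably `typing.Protocol`. The sketch below is an illustration under that assumption, with hypothetical class names; it is not taken from the linked article.

```python
from dataclasses import dataclass
from typing import Protocol


class HasName(Protocol):
    # Any object with a str attribute called "name" matches this protocol.
    name: str


@dataclass
class Person:
    name: str


@dataclass
class Pet:
    name: str


def greet(thing: HasName) -> str:
    return f"Hello, {thing.name}"


# Person and Pet are unrelated classes with the same "shape", so a type
# checker such as mypy should accept both as HasName arguments.
greetings = [greet(Person("Ada")), greet(Pet("Rex"))]
```

Without the `Protocol`, Python classes are checked nominally, so `Person` and `Pet` would remain distinct types even though their fields are identical.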

wheel.yml for Pyjion using cibuildwheel (via) cibuildwheel, maintained by the Python Packaging Authority, builds and tests Python wheels across multiple platforms. I hadn’t realized quite how minimal a configuration using their GitHub Actions action was until I looked at how Pyjion was using it. # 10th December 2021, 3:05 am

Introducing stack graphs (via) GitHub launched “precise code navigation” for Python today—the first language to get support for this feature. Click on any Python symbol in GitHub’s code browsing views and a box will show you exactly where that symbol was defined—all based on static analysis by a custom parser written in Rust as opposed to executing any Python code directly. The underlying computer science uses a technique called stack graphs, based on scope graphs research from Eelco Visser’s research group at TU Delft. # 9th December 2021, 11:07 pm

An oral history of Bank Python (via) Fascinating description of a very custom Python environment inside a large investment bank—where all of the code lives inside the Python environment itself, everything can be imported into the same process and a directed acyclic graph engine implements Excel-style reactive dependencies. Plenty of extra flavour from people who’ve worked with this (and related) Python systems in the Hacker News comments. # 5th November 2021, 5:18 am

How to build, test and publish an open source Python library

At PyGotham this year I presented a ten minute workshop on how to package up a new open source Python library and publish it to the Python Package Index. Here is the video and accompanying notes, which should make sense even without watching the talk.

[... 2017 words]

s3-credentials: a tool for creating credentials for S3 buckets

I’ve built a command-line tool called s3-credentials to solve a problem that’s been frustrating me for ages: how to quickly and easily create AWS credentials (an access key and secret key) that have permission to read or write from just a single S3 bucket.

[... 1618 words]

Why you shouldn’t invoke setup.py directly (via) Paul Ganssle explains why you shouldn’t use “python setup.py command” any more. I’ve mostly switched to pip and pytest and twine but I was still using “python setup.py sdist”—apparently the new replacement recipe for that is “python -m build”. # 19th October 2021, 5:22 pm

Where does all the effort go? Looking at Python core developer activity (via) Łukasz Langa used Datasette to explore 28,780 pull requests made to the CPython GitHub repository, using some custom Python scripts (and sqlite-utils) to load in the data. # 18th October 2021, 8:21 pm

Tests aren’t enough: Case study after adding type hints to urllib3. Very thorough write-up by Seth Michael Larson describing what it took for the urllib3 Python library to fully embrace mypy and optional typing and what they learned along the way. # 18th October 2021, 7:03 pm

Web Browser Engineering (via) In-progress free online book by Pavel Panchekha and Chris Harrelson that demonstrates how a web browser works by writing one from scratch using Python, tkinter and the DukPy wrapper around the Duktape JavaScript interpreter. # 17th October 2021, 3:53 pm

Finding and reporting an asyncio bug in Python 3.10

I found a bug in Python 3.10 today! Some notes on how I found it and my process for handling it once I figured out what was going on.

[... 1789 words]

The GIL and its effects on Python multithreading (via) Victor Skvortsov presents the most in-depth explanation of the Python Global Interpreter Lock I’ve seen anywhere. I learned a ton from reading this. # 29th September 2021, 5:23 pm

django-upgrade (via) Adam Johnson’s new CLI tool for upgrading Django projects by automatically applying changes to counter deprecations made in different versions of the framework. Uses the Python standard library tokenize module which gives it really quick performance in parsing and rewriting Python code. Exciting to see this kind of codemod approach becoming more common in the Python world: JavaScript developers use this kind of thing a lot. # 26th September 2021, 5:42 am
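As a rough illustration of what a tokenize-based codemod looks like, here is a minimal sketch that renames one identifier while leaving comments and string literals alone. It is not django-upgrade's code; the function and the names being renamed are hypothetical.

```python
import io
import tokenize


def rename_name(source, old, new):
    # Tokenize the source, then patch matching NAME tokens in place.
    lines = source.splitlines(keepends=True)
    tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
    # Work backwards so earlier column offsets stay valid after each edit.
    for tok in reversed(tokens):
        if tok.type == tokenize.NAME and tok.string == old:
            row, col = tok.start
            line = lines[row - 1]
            lines[row - 1] = line[:col] + new + line[col + len(old):]
    return "".join(lines)


SOURCE = 'value = old_helper(1)  # old_helper in a comment\ntext = "old_helper"\n'
rewritten = rename_name(SOURCE, "old_helper", "new_helper")
```

Only the call site changes: the comment and the string literal are COMMENT and STRING tokens, so they are skipped. A plain text search-and-replace could not make that distinction.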

SQLModel. A new project by FastAPI creator Sebastián Ramírez: SQLModel builds on top of both SQLAlchemy and the Pydantic validation library to provide a new ORM that’s designed around Python 3’s optional typing. The real brilliance here is that a SQLModel subclass is simultaneously a valid SQLAlchemy ORM model AND a valid Pydantic validation model, saving on duplicate code by allowing the same class to be used both for form/API validation and for interacting with the database. # 24th August 2021, 11:16 pm

sqlite-utils API reference (via) I released sqlite-utils 3.15.1 today with just one change, but it’s a big one: I’ve added docstrings and type annotations to nearly every method in the library, and I’ve started using sphinx-autodoc to generate an API reference page in the documentation directly from those docstrings. I’ve deliberately avoided building this kind of documentation in the past because I so often see projects where the class reference is the ONLY documentation, which I find makes it really hard to figure out how to actually use it. sqlite-utils already has extensive narrative prose documentation so in this case I think it’s a useful enhancement—especially since the docstrings and type hints can help improve the usability of the library in IDEs and Jupyter notebooks. # 11th August 2021, 1:03 am

How the Python import system works (via) Remarkably detailed and thorough dissection of how exactly import, modules and packages work in Python—eventually digging right down into the C code. Part of Victor Skvortsov’s excellent “Python behind the scenes” series. # 24th July 2021, 8:12 pm