Simon Willison’s Weblog

Subscribe

921 items tagged “python”

2023

Data analysis with SQLite and Python. I turned my 2hr45m workshop from PyCon into the latest official tutorial on the Datasette website. It includes an extensive handout which should be useful independently of the video itself. # 2nd July 2023, 4:48 pm

Status of Python Versions (via) Very clear and useful page showing the exact status of different Python versions. 3.7 reaches end of life today (no more security updates), while 3.11 will continue to be supported until October 2027. # 27th June 2023, 2:01 pm

Building Search DSLs with Django (via) Neat tutorial by Dan Lamanna: how to build a GitHub-style search feature—supporting modifiers like “is:open author:danlamanna”—using PyParsing and the Django ORM. # 19th June 2023, 8:30 am

Symbex: search Python code for functions and classes, then pipe them into a LLM

I just released a new Python CLI tool called Symbex. It’s a search tool, loosely inspired by ripgrep, which lets you search Python code for functions and classes by name or wildcard, then see just the source code of those matching entities.

[... 1183 words]

sqlean.py: Python’s sqlite3 with extensions. Anton Zhiyanov built a new Python package which bundles a fresh, compiled copy of SQLite with his SQLean family of C extensions built right in. Installing it gets you the latest SQLite—3.42.0—with nearly 200 additional functions, including things like define() and eval(), fileio_read() and fileio_write(), percentile_95() and uuid4() and many more. “import sqlean as sqlite3” works as a drop-in replacement for the module from the standard library. # 17th June 2023, 10:42 pm

simpleaichat (via) Max Woolf released his own Python package for building against the GPT-3.5 and GPT-4 APIs (and potentially other LLMs in the future).

It’s a very clean piece of API design with some useful additional features: there’s an AsyncAIChat subclass that works with Python asyncio, and the library includes a mechanism for registering custom functions that can then be called by the LLM as tools.

One trick I haven’t seen before: it uses a combination of max_tokens: 1 and a ChatGPT logit_bias to ensure that answers to one of its default prompts are restricted to just numerals between 0 and 9. This is described in the PROMPTS.md file. # 8th June 2023, 9:06 pm

pytest-icdiff (via) This is neat: “pip install pytest-icdiff” provides an instant usability upgrade to the output of failed tests in pytest, especially if the assertions involve comparing larger strings or nested JSON objects. # 3rd June 2023, 4:59 pm

The Python Language Summit 2023: Making the Global Interpreter Lock Optional. Extremely informative update covering Sam Gross’s python-nogil proposal from this year’s language summit at PyCon.

Sam has been working hard on his fork for the past year, and now has it rebased for Python 3.12. If his PEP is accepted it could end up as an optional compile-time build in time for Python 3.13.

“The plan for nogil remains that it would be enabled via a compile-time flag, named --disable-gil. Third-party C extensions would need to provide separate wheels for GIL-disabled Python.” # 31st May 2023, 12:04 am

Trogon (via) The latest project from the Textualize/Rich crew, Trogon provides a Python decorator—@tui—which, when applied to a Click CLI application, adds a new interactive TUI mode which introspects the available subcommands and their options and creates a full Text User Interface—with keyboard and mouse support—for assembling invocations of those various commands.

I just shipped sqlite-utils 3.32 with support for this—it uses an optional dependency, so you’ll need to run “sqlite-utils install trogon” and then “sqlite-utils tui” to try it out. # 21st May 2023, 9:39 pm

Writing Python like it’s Rust (via) Fascinating article by Jakub Beránek describing in detail patterns for using type annotations in Python inspired by working in Rust. I learned new tricks about both languages from reading this. # 21st May 2023, 12:18 am

Real Multithreading is Coming to Python—Learn How You Can Use It Now (via) Martin Heinz provides a detailed tutorial on trying out the new Per-Interpreter GIL feature that’s landing in Python 3.12, which allows Python code to run concurrently in multiple threads by spawning separate sub-interpreters, each with their own dedicated GIL.

It’s not an easy feature to play with yet! First you need to compile Python yourself, and then use APIs that are generally only available to C code (but should hopefully become available to Python code itself in Python 3.13).

Martin’s workaround for this is ingenious: it turns out the Python test.support package provides utility functions to help write tests against interpreters, and Martin shows how to abuse this module to launch, run and cleanup interpreters using regular Python code.

He also demonstrates test.support.interpreters.create_channel(), which can be used to create channels with receiver and sender ends, somewhat similar to Go. # 15th May 2023, 7:42 pm

Implement DNS in a weekend (via) Fantastically clear and useful guide to implementing DNS lookups, from scratch, using Python’s struct, socket and dataclass modules—Julia Evans plans to follow this up with one for TLS which I am very much looking forward to. # 12th May 2023, 6:14 pm

Mojo may be the biggest programming advance in decades (via) Jeremy Howard makes a very convincing argument for why the new programming language Mojo is a big deal.

Mojo is a superset of Python designed by a team lead by Chris Lattner, who previously created LLVM, Clang and and Swift.

Existing Python code should work unmodified, but it also adds features that enable performant low-level programming—like “fn” for creating typed, compiled functions and “struct” for memory-optimized alternatives to classes.

It’s worth watching Jeremy’s video where he uses these features to get more than a 2000x speed up implementing matrix multiplication, while still keeping the code readable and easy to follow.

Mojo isn’t available yet outside of a playground preview environment, but it does look like an intriguing new project. # 4th May 2023, 4:41 am

urllib3 v2.0.0 is now generally available. urllib3 is 12 years old now, and is a common low-level dependency for packages like requests and httpx. The biggest new feature in v2 is a higher-level API: resp = urllib3.request(“GET”, “https://example.com”)—a very welcome addition to the library. # 26th April 2023, 10 pm

Rye. Armin Ronacher’s take on a Python packaging tool. There are a lot of interesting ideas in this one—it’s written in Rust, configured using pyproject.toml and has some very strong opinions, including completely hiding pip from view and insisting you use “rye add package” instead. Notably, it doesn’t use the system Python at all: instead, it downloads a pre-compiled standalone Python from Gregory Szorc’s python-build-standalone project—the same approach I used for the Datasette Desktop Electron app.

Armin warns that this is just an exploration, with no guarantees of future maintenance—and even has an issue open titled “Should Rye exist?” # 24th April 2023, 4:02 am

Introducing PyPI Organizations. Launched at PyCon US today: Organizations allow packages on the Python Package Index to be owned by a group, not an individual user account. “We’re making organizations available to community projects for free, forever, and to corporate projects for a small fee.”—this is the first revenue generating PyPI feature. # 23rd April 2023, 8:29 pm

codespaces-jupyter (via) This is really neat. Click “Use this template” -> “Open in a codespace” and you get a full in-browser VS Code interface where you can open existing notebook files (or create new ones) and start playing with them straight away. # 14th April 2023, 10:38 pm

Running Python micro-benchmarks using the ChatGPT Code Interpreter alpha

Today I wanted to understand the performance difference between two Python implementations of a mechanism to detect changes to a SQLite database schema. I rendered the difference between the two as this chart:

[... 2939 words]

My strong hunch is that the GIL does not need removing, if a) subinterpreters have their own GILs and b) an efficient way is provided to pass (some) data between subinterpreters lock free and c) we find good patterns to make working with subinterpreters work.

Armin Ronacher # 11th April 2023, 4:47 pm

Several libraries let you declare objects with type-hinted members and automatically derive validation rules and serialization/deserialization from the type hints – Pydantic is the most popular, but alternatives like msgspec are out there too. There’s also a whole new generation of web frameworks like FastAPI and Starlite which use type hints at runtime to do not just input validation and serialization/deserialization but also things like dependency injection.

Personally, I’ve seen more significant gains in productivity from those runtime usages of Python’s type hints than from any static ahead-of-time type checking, which mostly is only useful to me as documentation.

James Bennett # 7th April 2023, 2:19 am

The different uses of Python type hints (via) Luke Plant describes five different categories for how Python optional types are being used today: IDE assistants, type checking, runtime behavior changes via introspection (e.g. Pydantic), code documentation, compiler instructions (ala mypyc)—and a bonus sixth, dependency injection. # 7th April 2023, 2:17 am

textual-mandelbrot (via) I love this: run “pipx install textual-mandelbrot” and then “mandelexp” to get an interactive Mandelbrot fractal exploration interface right there in your terminal, built on top of Textual. The code for this is only 250 lines of Python and delightfully easy to follow. # 1st April 2023, 7:23 pm

djngo.com: Portable Django (via) “A 20mb executable zip file with Python 3.6 and Django 2.2. Works on Windows, Linux, MacOSX with x86_64 and aarch64 (yes, Apple M1 and Raspberry Pi).” The latest wizardry from the ecosystem surrounding the Cosmopolitan project, which provides a should-be-impossible mechanism for running the same executable on a bunch of different platforms. This utility by Ariel Núñez bundles Python and Django and SQLite, such that a Django application can become a portable executable ready to run on multiple platforms. It’s currently limited to Python 3.6 and Django 2.2 since those are the versions that run under Cosmopolitan, but I expect we’ll see more recent versions of those dependencies in the future. # 24th February 2023, 12:52 am

PocketPy. PocketPy is “a lightweight(~5000 LOC) Python interpreter for game engines”. It’s implemented as a single C++ header which provides an impressive subset of the Python language: functions, dictionaries, lists, strings and basic classes too. There’s also a browser demo that loads a 766.66 KB pypocket.wasm file (240.72 KB compressed) and uses it to power a basic terminal interface. I tried and failed to get that pypocket.wasm file working from wasmer/wasmtime/wasm3—it should make a really neat lightweight language to run in a WebAssembly sandbox. # 8th February 2023, 5:13 am

Making SQLite extensions pip install-able (via) Alex Garcia figured out how to bundle a compiled SQLite extension in a Python wheel (building different wheels for different platforms) and publish them to PyPI. This is a huge leap forward in terms of the usability of SQLite extensions, which have previously been pretty difficult to actually install and run. Alex also created Datasette plugins that depend on his packages, so you can now “datasette install datasette-sqlite-regex” (or datasette-sqlite-ulid, datasette-sqlite-fastrand, datasette-sqlite-jsonschema) to gain access to his custom SQLite extensions in your Datasette instance. It even works with “datasette publish --install” to deploy to Vercel, Fly.io and Cloud Run. # 6th February 2023, 7:44 pm

Python’s “Disappointing” Superpowers. Luke Plant provides a fascinating detailed list of Python libraries that use dynamic meta-programming tricks in interesting ways—including SQLAlchemy, Django, Werkzeug, pytest and more. # 1st February 2023, 10:41 pm

pyfakefs usage (via) New to me pytest fixture library that provides a really easy way to mock Python’s filesystem functions—open(), os.path.listdir() and so on—so a test can run against a fake set of files. This looks incredibly useful. # 1st February 2023, 10:37 pm

Python Sandbox in Web Assembly (via) Jim Kring responded to my questions on Mastodon about running Python in a WASM sandbox by building this repo, which demonstrates using wasmer-python to run a build of Python 3.6 compiled to WebAssembly, complete with protected access to a sandbox directory. # 25th January 2023, 9:10 pm

Mapping Python to LLVM (via) Codon is a fascinating new entry in the “compile Python code to something else” world—this time targeting LLVM. Ariya Shajii describes in great detail how it pulls this off, including tricks such as transforming Python generators to LLVM coroutines. Codon doesn’t promise that all Python code will work—it’s best thought of as a Python-like language which can be used to create compiled modules which can then be imported back into regular Python projects. # 10th January 2023, 2:08 am

nanoGPT. “The simplest, fastest repository for training/finetuning medium-sized GPTs”—by Andrej Karpathy, in about 600 lines of Python. # 2nd January 2023, 11:27 pm