Simon Willison on python

1,161 posts tagged “python”

The Python programming language.

2024

Talking about Django’s history and future on Django Chat (via) Django co-creator Jacob Kaplan-Moss sat down with the Django Chat podcast team to talk about Django’s history, his recent return to the Django Software Foundation board and what he hopes to achieve there.

Here’s his post about it, where he used Whisper and Claude to extract some of his own highlights from the conversation.

# 21st March 2024, 12:42 am / jacob-kaplan-moss, python, django, podcasts, dsf

Every dunder method in Python. Trey Hunner: "Python includes 103 'normal' dunder methods, 12 library-specific dunder methods, and at least 52 other dunder attributes of various types."

This cheat sheet doubles as a tour of many of the more obscure corners of the Python language and standard library.

I did not know that Python has over 100 dunder methods now! Quite a few of these were new to me, like __class_getitem__ which can be used to implement type annotations such as list[int].

# 20th March 2024, 3:45 am / python, trey-hunner

DiskCache (via) Grant Jenks built DiskCache as an alternative caching backend for Django (also usable without Django), using a SQLite database on disk. The performance numbers are impressive—it even beats memcached in microbenchmarks, due to avoiding the need to access the network.

The source code (particularly in core.py) is a great case-study in SQLite performance optimization, after five years of iteration on making it all run as fast as possible.

# 19th March 2024, 3:43 pm / sqlite, django, python, performance

pywebview 5 (via) pywebview is a library for building desktop (and now Android) applications using Python, based on the idea of displaying windows that use the system default browser to display an interface to the user—styled such that the fact they run on HTML, CSS and JavaScript is mostly hidden from the end-user.

It’s a bit like a much simpler version of Electron. Unlike Electron it doesn’t bundle a full browser engine (Electron bundles Chromium), which reduces the size of the dependency a lot but does mean that cross-browser differences (quite rare these days) do come back into play.

I tried out their getting started example and it’s very pleasant to use—import webview, create a window and then start the application loop running to display it.

You can register JavaScript functions that call back to Python, and you can execute JavaScript in a window from your Python code.

# 13th March 2024, 2:15 pm / electron, python, javascript

gh-116167: Allow disabling the GIL with PYTHON_GIL=0 or -X gil=0. Merged into python:main 14 hours ago. Looks like the first phase of Sam Gross’s phenomenal effort to provide a GIL free Python (here via an explicit opt-in) will ship in Python 3.13.

# 12th March 2024, 5:40 am / gil, python

llm-claude-3. I built a new plugin for LLM—my command-line tool and Python library for interacting with Large Language Models—which adds support for the new Claude 3 models from Anthropic.

# 4th March 2024, 6:46 pm / llm, anthropic, claude, ai, llms, python, generative-ai, projects

The Zen of Python, Unix, and LLMs. Here’s the YouTube recording of my 1.5 hour conversation with Hugo Bowne-Anderson yesterday.

I fed a Whisper transcript to Google Gemini Pro 1.5 and asked it for the themes from our conversation, and it said we talked about “Python’s success and versatility, the rise and potential of LLMs, data sharing and ethics in the age of LLMs, Unix philosophy and its influence on software development and the future of programming and human-computer interaction”.

# 29th February 2024, 9:04 pm / speaking, ai, python, llms, gemini, whisper, my-talks

aiolimiter. I found myself wanting an asyncio rate limiter for Python today—so I could send POSTs to an API endpoint no more than once every 10 seconds. This library worked out really well—it has a very neat design and lets you set up rate limits for things like “no more than 50 items every 10 seconds”, implemented using the leaky bucket algorithm.

# 20th February 2024, 1:15 am / async, python, rate-limiting

datasette-studio. I've been thinking for a while that it might be interesting to have a version of Datasette that comes bundled with a set of useful plugins, aimed at expanding Datasette's default functionality to cover things like importing data and editing schemas.

This morning I built the very first experimental preview of what that could look like. Install it using pipx:

pipx install datasette-studio

I recommend pipx because it will ensure datasette-studio gets its own isolated environment, independent of any other Datasette installations you might have.

Now running datasette-studio instead of datasette will get you the version with the bundled plugins.

The implementation of this is fun - it's a single pyproject.toml file defining the dependencies and setting up the datasette-studio CLI hook, which is enough to provide the full set of functionality.

Is this a good idea? I don't know yet, but it's certainly an interesting initial experiment.

# 18th February 2024, 8:38 pm / datasette, python, pypi, plugins, projects, cli

wddbfs – Mount a sqlite database as a filesystem. Ingenious hack from Adam Obeng. Install this Python tool and run it against a SQLite database:

wddbfs --anonymous --db-path path/to/content.db

Then tell the macOS Finder to connect to Go -> Connect to Server -> http://127.0.0.1:8080/ (connect as guest) - connecting via WebDAV.

/Volumes/127.0.0.1/content.db will now be a folder full of CSV, TSV, JSON and JSONL files - one of each format for every table.

This means you can open data from SQLite directly in any application that supports that format, and you can even run CLI commands such as grep, ripgrep or jq directly against the data!

Adam used WebDAV because "Despite how clunky it is, this seems to be the best way to implement a filesystem given that getting FUSE support is not straightforward". What a neat trick.

# 18th February 2024, 3:31 am / sqlite, webdav, python, jq, cli

uv: Python packaging in Rust (via) "uv is an extremely fast Python package installer and resolver, written in Rust, and designed as a drop-in replacement for pip and pip-tools workflows."

From Charlie Marsh and Astral, the team behind Ruff, who describe it as a milestone in their pursuit of a "Cargo for Python".

Also in this announcement: Astral are taking over stewardship of Armin Ronacher's Rye packaging tool, another Rust project.

uv is reported to be 8-10x faster than regular pip, increasing to 80-115x faster with a warm global module cache thanks to copy-on-write and hard links on supported filesystems - which saves on disk space too.

It also has a --resolution=lowest option for installing the lowest available version of dependencies - extremely useful for testing, I've been wanting this for my own projects for a while.

Also included: uv venv - a fast tool for creating new virtual environments with no dependency on Python itself.

# 15th February 2024, 7:57 pm / rust, python, armin-ronacher, rye, pip, ruff, uv, astral, charlie-marsh

Python Development on macOS Notes: pyenv and pyenv-virtualenvwrapper (via) Jeff Triplett shares the recipe he uses for working with pyenv (initially installed via Homebrew) on macOS.

I really need to start habitually using this. The benefit of pyenv over Homebrew’s default Python is that pyenv managed Python versions are forever—your projects won’t suddenly stop working in the future when Homebrew changes its default Python version.

# 11th February 2024, 4:41 am / jeff-triplett, macos, python

Rye: Added support for marking virtualenvs ignored for cloud sync (via) A neat feature in the new Rye 0.22.0 release. It works by using an xattr Rust crate to set the attributes “com.dropbox.ignored” and “com.apple.fileprovider.ignore#P” on the folder.

# 10th February 2024, 6:50 am / rye, rust, python, dropbox

Rye lets you get from no Python on a computer to a fully functioning Python project in under a minute with linting, formatting and everything in place.

[...] Because it was demonstrably designed to avoid interference with any pre-existing Python configurations, Rye allows for a smooth and gradual integration and the emotional barrier of picking it up even for people who use other tools was shown to be low.

— Armin Ronacher

# 4th February 2024, 3:12 pm / armin-ronacher, python, rye

unstructured. Relatively new but impressively capable Python library (Apache 2 licensed) for extracting information from unstructured documents, such as PDFs, images, Word documents and many other formats.

I got some good initial results against a PDF by running “pip install ’unstructured[pdf]’” and then using the “unstructured.partition.pdf.partition_pdf(filename)” function.

There are a lot of moving parts under the hood: pytesseract, OpenCV, various PDF libraries, even an ONNX model—but it installed cleanly for me on macOS and worked out of the box.

# 2nd February 2024, 2:47 am / ocr, python, pdf

snoop. Neat Python debugging utility by Alex Hall: snoop lets you “import snoop” and then add “@snoop” as a decorator to any function, which causes that function’s source code to be output directly to the console with details of any variable state changes that occur while it’s running.

I didn’t know you could make a Python module callable like that—turns out it’s running “sys.modules[’snoop’] = snoop” in the __init__.py module!

# 31st January 2024, 6:19 pm / debugging, python

Beej’s Guide to Networking Concepts (via) Beej's Guide to Network Programming is a legendary tutorial on network programming in C, continually authored and updated by Brian "Beej" Hall since 1995.

This is NOT that. Beej's Guide to Networking Concepts is brand new - started in March 2023 - and illustrates a whole bunch of networking concepts using Python instead of C.

From the forward:

Is it Beej’s Guide to Network Programming in Python? Well, kinda, actually. The C book is more about how C’s (well, Unix’s) network API works. And this book is more about the concepts underlying it, using Python as a vehicle.

# 30th January 2024, 10:08 pm / networking, c, python

urllib3 2.2.0. Highlighted feature: “urllib3 now works in the browser”—the core urllib3 library now includes code that can integrate with Pyodide, using the browser’s fetch() or XMLHttpRequest APIs to make HTTP requests (to CORS-enabled endpoints).

# 30th January 2024, 4:31 pm / webassembly, pyodide, python, cors

Getting Started With CUDA for Python Programmers (via) if, like me, you’ve avoided CUDA programming (writing efficient code that runs on NVIGIA GPUs) in the past, Jeremy Howard has a new 1hr17m video tutorial that demystifies the basics. The code is all run using PyTorch in notebooks running on Google Colab, and it starts with a very clear demonstration of how to convert a RGB image to black and white.

# 29th January 2024, 9:23 pm / jeremy-howard, pytorch, ai, python, gpus

Find a level of abstraction that works for what you need to do. When you have trouble there, look beneath that abstraction. You won’t be seeing how things really work, you’ll be seeing a lower-level abstraction that could be helpful. Sometimes what you need will be an abstraction one level up. Is your Python loop too slow? Perhaps you need a C loop. Or perhaps you need numpy array operations.

You (probably) don’t need to learn C.

— Ned Batchelder

# 24th January 2024, 6:25 pm / programming, python, ned-batchelder, abstractions, c

Python packaging must be getting better—a datapoint (via) Luke Plant reports on a recent project he developed on Linux using a requirements.txt file and some complex binary dependencies—Qt5 and VTK—and when he tried to run it on Windows... it worked! No modifications required.

I think Python’s packaging system has never been more effective... provided you know how to use it. The learning curve is still too high, which I think accounts for the bulk of complaints about it today.

# 22nd January 2024, 6:06 pm / luke-plant, windows, packaging, python

Publish Python packages to PyPI with a python-lib cookiecutter template and GitHub Actions

I use cookiecutter to start almost all of my Python projects. It helps me quickly generate a skeleton of a project with my preferred directory structure and configured tools.

[... 686 words]

9:59 pm / 16th January 2024 / cookiecutter, github-actions, python, projects, pypi, github

Marimo (via) This is a really interesting new twist on Python notebooks.

The most powerful feature is that these notebooks are reactive: if you change the value or code in a cell (or change the value in an input widget) every other cell that depends on that value will update automatically. It’s the same pattern implemented by Observable JavaScript notebooks, but now it works for Python.

There are a bunch of other nice touches too. The notebook file format is a regular Python file, and those files can be run as “applications” in addition to being edited in the notebook interface. The interface is very nicely built, especially for such a young project—they even have GitHub Copilot integration for their CodeMirror cell editors.

# 12th January 2024, 9:17 pm / jupyter, open-source, python, observable, github-copilot, marimo

The Random Transformer (via) “Understand how transformers work by demystifying all the math behind them”—Omar Sanseviero from Hugging Face meticulously implements the transformer architecture behind LLMs from scratch using Python and numpy. There’s a lot to take in here but it’s all very clearly explained.

# 10th January 2024, 5:09 am / generative-ai, numpy, python, transformers, ai, llms

Python 3.13 gets a JIT. “In late December 2023 (Christmas Day to be precise), CPython core developer Brandt Bucher submitted a little pull-request to the Python 3.13 branch adding a JIT compiler.”

Anthony Shaw does a deep dive into this new experimental JIT, explaining how it differs from other JITs. It’s an implementation of a copy-and-patch JIT, an idea that only emerged in 2021. This makes it architecturally much simpler than a traditional JIT, allowing it to compile faster and take advantage of existing LLVM tools on different architectures.

So far it’s providing a 2-9% performance improvement, but the real impact will be from the many future optimizations it enables.

# 9th January 2024, 9:25 pm / llvm, jit, python, anthony-shaw

Fastest Way to Read Excel in Python (via) Haki Benita produced a meticulously researched and written exploration of the options for reading a large Excel spreadsheet into Python. He explored Pandas, Tablib, Openpyxl, shelling out to LibreOffice, DuckDB and python-calamine (a Python wrapper of a Rust library). Calamine was the winner, taking 3.58s to read 500,00 rows—compared to Pandas in last place at 32.98s.

# 3rd January 2024, 8:04 pm / rust, duckdb, python, pandas, excel, haki-benita

2023

iSH: The Linux shell for iOS (via) Installing this iOS app gives you a full Linux shell environment running on your phone, using a “usermode x86 emulator”. You can even install packages: “apk add python3” gave me a working Python 3.9 interpreter, installed from the apk.ish.app repository.

I didn’t think this kind of thing was allowed by the App Store, but that’s not been the case for a few years now: Section 4.5.2 of the App Store guidelines clarifies that “Educational apps designed to teach, develop, or allow students to test executable code may, in limited circumstances, download code provided that such code is not used for other purposes.”

# 31st December 2023, 4:20 am / appstore, ios, emulator, linux, python

ast-grep (via) There are a lot of interesting things about this year-old project.

sg (an alias for ast-grep) is a CLI tool for running AST-based searches against code, built in Rust on top of the Tree-sitter parsing library. You can run commands like this:

sg -p ’await await_me_maybe($ARG)’ datasette --lang python

To search the datasette directory for code that matches the search pattern, in a syntax-aware way.

It works across 19 different languages, and can handle search-and-replace too, so it can work as a powerful syntax-aware refactoring tool.

My favourite detail is how it’s packaged. You can install the CLI utility using Homebrew, Cargo, npm or pip/pipx—each of which will give you a CLI tool you can start running. On top of that it provides API bindings for Rust, JavaScript and Python!

# 10th December 2023, 7:56 pm / treesitter, rust, tools, javascript, python, search, cli

It’s Time For A Change: datetime.utcnow() Is Now Deprecated (via) Miguel Grinberg explains the deprecation of datetime.utcnow() and utcfromtimestamp() in Python 3.12, since they return naive datetime objects which cause all sorts of follow-on problems.

The replacement idiom is datetime.datetime.now(datetime.timezone.utc)

# 18th November 2023, 7:24 pm / timezones, python

Fleet Context. This project took the source code and documentation for 1221 popular Python libraries and ran them through the OpenAI text-embedding-ada-002 embedding model, then made those pre-calculated embedding vectors available as Parquet files for download from S3 or via a custom Python CLI tool.

I haven’t seen many projects release pre-calculated embeddings like this, it’s an interesting initiative.

# 15th November 2023, 10:20 pm / embeddings, ai, python, llms

«« first « previous page 6 / 39 next » last »»

Simon Willison’s Weblog

1,161 posts tagged “python”

2024

Publish Python packages to PyPI with a python-lib cookiecutter template and GitHub Actions

2023