Simon Willison’s Weblog


Items tagged python in 2024


urllib3 2.2.0. Highlighted feature: “urllib3 now works in the browser”—the core urllib3 library now includes code that can integrate with Pyodide, using the browser’s fetch() or XMLHttpRequest APIs to make HTTP requests (to CORS-enabled endpoints). # 30th January 2024, 4:31 pm

Getting Started With CUDA for Python Programmers (via) If, like me, you’ve avoided CUDA programming (writing efficient code that runs on NVIDIA GPUs) in the past, Jeremy Howard has a new 1hr17m video tutorial that demystifies the basics. The code is all run using PyTorch in notebooks running on Google Colab, and it starts with a very clear demonstration of how to convert an RGB image to black and white. # 29th January 2024, 9:23 pm
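
To make the RGB to black and white step concrete, here is a minimal pure-Python sketch of the per-pixel calculation. It is not the tutorial's code: the function name `rgb_to_grey` and the luma weights (the widely used 0.299 / 0.587 / 0.114 values) are assumptions chosen for illustration.

```python
# Commonly used luma weights; assumed here, not taken from the video.
R_WEIGHT, G_WEIGHT, B_WEIGHT = 0.299, 0.587, 0.114


def rgb_to_grey(pixels):
    """Convert a list of (r, g, b) tuples (0-255) to a list of grey levels."""
    # Each output value depends only on its own input pixel, which is why
    # this kind of loop is a natural fit for one GPU thread per pixel.
    return [
        round(R_WEIGHT * r + G_WEIGHT * g + B_WEIGHT * b)
        for r, g, b in pixels
    ]


# Hypothetical three-pixel "image": white, black, pure red.
image = [(255, 255, 255), (0, 0, 0), (255, 0, 0)]
grey = rgb_to_grey(image)
```

Because every pixel is independent, the body of the list comprehension is roughly the work a single GPU thread would do in a CUDA version.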

Find a level of abstraction that works for what you need to do. When you have trouble there, look beneath that abstraction. You won’t be seeing how things really work, you’ll be seeing a lower-level abstraction that could be helpful. Sometimes what you need will be an abstraction one level up. Is your Python loop too slow? Perhaps you need a C loop. Or perhaps you need numpy array operations.

You (probably) don’t need to learn C.

Ned Batchelder # 24th January 2024, 6:25 pm

Python packaging must be getting better—a datapoint (via) Luke Plant reports on a recent project he developed on Linux using a requirements.txt file and some complex binary dependencies—Qt5 and VTK—and when he tried to run it on Windows... it worked! No modifications required.

I think Python’s packaging system has never been more effective... provided you know how to use it. The learning curve is still too high, which I think accounts for the bulk of complaints about it today. # 22nd January 2024, 6:06 pm

Publish Python packages to PyPI with a python-lib cookiecutter template and GitHub Actions

I use cookiecutter to start almost all of my Python projects. It helps me quickly generate a skeleton of a project with my preferred directory structure and configured tools.

[... 686 words]

Marimo (via) This is a really interesting new twist on Python notebooks.

The most powerful feature is that these notebooks are reactive: if you change the value or code in a cell (or change the value in an input widget) every other cell that depends on that value will update automatically. It’s the same pattern implemented by Observable JavaScript notebooks, but now it works for Python.

There are a bunch of other nice touches too. The notebook file format is a regular Python file, and those files can be run as “applications” in addition to being edited in the notebook interface. The interface is very nicely built, especially for such a young project—they even have GitHub Copilot integration for their CodeMirror cell editors. # 12th January 2024, 9:17 pm
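
The reactive pattern described above can be sketched in a few lines of plain Python. This is a toy illustration, not Marimo's implementation or API: the `Notebook` class and its `cell` and `set` methods are hypothetical, and dependencies are declared by hand here, which is an assumption made to keep the sketch short.

```python
class Notebook:
    """Toy reactive notebook: re-runs dependent cells when a value changes."""

    def __init__(self):
        self.cells = {}   # cell name -> (function, names of cells it reads)
        self.values = {}  # cell name -> most recent result

    def cell(self, name, func, deps=()):
        """Define (or redefine) a cell, then run it and its dependents."""
        self.cells[name] = (func, tuple(deps))
        self._run(name)

    def set(self, name, value):
        """Replace a cell with a constant, like editing an input widget."""
        self.cell(name, lambda: value)

    def _run(self, name):
        func, deps = self.cells[name]
        self.values[name] = func(*(self.values[d] for d in deps))
        # Propagate: re-run every cell that reads the one just updated.
        for other, (_, other_deps) in self.cells.items():
            if name in other_deps:
                self._run(other)


nb = Notebook()
nb.cell("x", lambda: 2)
nb.cell("double", lambda x: x * 2, deps=["x"])
nb.cell("label", lambda d: f"double is {d}", deps=["double"])

nb.set("x", 10)  # "double" and "label" update without being re-run by hand
```

This naive version may run a cell more than once when several of its inputs change together; a real implementation would order the updates so each cell runs once.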

The Random Transformer (via) “Understand how transformers work by demystifying all the math behind them”—Omar Sanseviero from Hugging Face meticulously implements the transformer architecture behind LLMs from scratch using Python and numpy. There’s a lot to take in here but it’s all very clearly explained. # 10th January 2024, 5:09 am

Python 3.13 gets a JIT. “In late December 2023 (Christmas Day to be precise), CPython core developer Brandt Bucher submitted a little pull-request to the Python 3.13 branch adding a JIT compiler.”

Anthony Shaw does a deep dive into this new experimental JIT, explaining how it differs from other JITs. It’s an implementation of a copy-and-patch JIT, an idea that only emerged in 2021. This makes it architecturally much simpler than a traditional JIT, allowing it to compile faster and take advantage of existing LLVM tools on different architectures.

So far it’s providing a 2-9% performance improvement, but the real impact will be from the many future optimizations it enables. # 9th January 2024, 9:25 pm

Fastest Way to Read Excel in Python (via) Haki Benita produced a meticulously researched and written exploration of the options for reading a large Excel spreadsheet into Python. He explored Pandas, Tablib, Openpyxl, shelling out to LibreOffice, DuckDB and python-calamine (a Python wrapper of a Rust library). Calamine was the winner, taking 3.58s to read 500,000 rows, compared to Pandas in last place at 32.98s. # 3rd January 2024, 8:04 pm