Simon Willison’s Weblog

Subscribe

Items tagged python in Oct

Filters: Month: Oct × python × Sorted by date


My User Experience Porting Off setup.py (via) PyOxidizer maintainer Gregory Szorc provides a detailed account of his experience trying to figure out how to switch from setup.py to pyproject.toml for his zstandard Python package.

This kind of detailed usability feedback is incredibly valuable for project maintainers, especially when the user encountered this many different frustrations along the way. It’s like the written version of a detailed usability testing session. # 31st October 2023, 7:57 pm

chDB (via) This is a really interesting development: chDB offers “an embedded SQL OLAP Engine” as a Python package, which you can install using “pip install chdb”. What you’re actually getting is a wrapper around ClickHouse—it’s almost like ClickHouse has been repackaged into an embedded database similar to SQLite. # 24th October 2023, 11:04 pm

I’m banned for life from advertising on Meta. Because I teach Python. (via) If accurate, this describes a nightmare scenario of automated decision making.

Reuven recently found he had a permanent ban from advertising on Facebook. They won’t tell him exactly why, and have marked this as a final decision that can never be reviewed.

His best theory (impossible for him to confirm) is that it’s because he tried advertising a course on Python and Pandas a few years ago which was blocked because a dumb algorithm thought he was trading exotic animals!

The worst part? An appeal is no longer possible because relevant data is only retained for 180 days and so all of the related evidence has now been deleted.

Various comments on Hacker News from people familiar with these systems confirm that this story likely holds up. # 19th October 2023, 2:56 pm

Bottleneck T5 Text Autoencoder (via) Colab notebook by Linus Lee demonstrating his Contra Bottleneck T5 embedding model, which can take up to 512 tokens of text, convert that into a 1024 floating point number embedding vector... and then then reconstruct the original text (or a close imitation) from the embedding again.

This allows for some fascinating tricks, where you can do things like generate embeddings for two completely different sentences and then reconstruct a new sentence that combines the weights from both. # 10th October 2023, 2:12 am

New sqlite3 CLI tool in Python 3.12. The newly released Python 3.12 includes a SQLite shell, which you can open using “python -m sqlite3”—handy for when you’re using a machine that has Python installed but no sqlite3 binary.

I installed Python 3.12 for macOS using the official installer from Python.org and now “/usr/local/bin/python3 -m sqlite3” gives me a SQLite 3.41.1 shell—a pleasantly recent version from March 2023 (the latest SQLite is 3.43.1, released in September). # 3rd October 2023, 6:57 pm

[On Python 3.12 subinterpreters] there’s massive advantages for mixed C(++) and Python: I can now have multiple sub interpreters running concurrently and accessing the same shared state in a thread-safe C++ library.

Previously this required rewriting the whole C++ library to support either pickling (multiplying the total memory consumption by the number of cores), or support allocating everything in shared memory (which means normal C++ types like `std::string` are unusable, need to switch e.g. to boost::interprocess).

Now is sufficient to pickle a pointer to a C++ object as an integer, and it’ll still be a valid pointer in the other subinterpreter.

ynik # 2nd October 2023, 6:13 pm

Chris Amico’s Python setup for 2022 (via) Homebrew to install pyenv, then pyenv to install specific Python versions. pipx and pipenv for package management. I need to habitually start using pyenv for everything. # 18th October 2022, 2:04 pm

How to create a Python package in 2022 (via) Fantastic tutorial on modern Python packaging by Rodrigo Girão Serrão. I’ve been meaning to figure out Poetry for a while now and this gave me exactly the information I needed to start figuring it out. Great coverage of GitHub Actions, Tox and pre-commit as well. # 15th October 2022, 10:10 pm

py2rs. Extremely useful document providing resources for learning Rust followed by an extensive collection of common Python tasks (building a list, opening a file, spawning a thread, running a simple web server) and their Rust equivalents. # 7th October 2022, 5:44 pm

Why you shouldn’t invoke setup.py directly (via) Paul Ganssle explains why you shouldn’t use “python setup.py command” any more. I’ve mostly switched to pip and pytest and twine but I was still using “python setup.py sdist”—apparently the new replacement recipe for that is “python -m build”. # 19th October 2021, 5:22 pm

Where does all the effort go? Looking at Python core developer activity (via) Łukasz Langa used Datasette to explore 28,780 pull requests made to the CPython GitHub repository, using some custom Python scripts (and sqlite-utils) to load in the data. # 18th October 2021, 8:21 pm

Tests aren’t enough: Case study after adding type hints to urllib3. Very thorough write-up by Seth Michael Larson describing what it took for the urllib3 Python library to fully embrace mypy and optional typing and what they learned along the way. # 18th October 2021, 7:03 pm

Web Browser Engineering (via) In progress free online book by Pavel Panchekha and Chris Harrelson that demonstrates how a web browser works by writing one from scratch using Python, tkinter and the DukPy wrapper around the Duktape JavaScript interpreter. # 17th October 2021, 3:53 pm

Finding and reporting an asyncio bug in Python 3.10

I found a bug in Python 3.10 today! Some notes on how I found it and my process for handling it once I figured out what was going on.

[... 1789 words]

Streamlit: Turn Python Scripts into Beautiful ML Tools (via) A really interesting new tool / application development framework. Streamlit is designed to help machine learning engineers build usable web frontends for their work. It does this by providing a simple, productive Python environment which lets you declaratively build up a sort-of Notebook style interface for your code. It includes the ability to insert a DataFrame, geospatial map rendering, chart or image into the application with a single Python function call. It’s hard to describe how it works, but the tutorial and demo worked really well for me: “pip install streamlit” and then “streamlit hello” to get a full-featured demo in a browser, then you can run through the tutorial to start building a real interactive application in a few dozen lines of code. # 6th October 2019, 3:52 am

The ASGI specification provides an opportunity for Python to hit a productivity/performance sweet-spot for a wide range of use-cases, from writing high-volume proxy servers through to bringing large-scale web applications to market at speed.

Tom Christie # 8th October 2018, 2:43 pm

Exploring United States Policing Data Using Python. Outstanding introduction to data analysis with Jupyter and Pandas. # 29th October 2017, 4:58 pm

Fast GeoSpatial Analysis in Python. Some clever advanced performance tricks with Cython and Dask, but it also introduced me to GeoPandas. # 29th October 2017, 4:47 pm

profiling. “An interactive continuous Python profiler”. This is really neat—simply run “profiling myscript.py” to get an interactive, navigable console-based profile inspector at the end of your script... or run “profiling live-profile mywebverver.py” to see a live, updating profile of a long-running process. Has options for statistical profiling as well, which has a much lower overhead in exchange for a less accurate view of what is going on. # 28th October 2017, 1:06 am

Contributors to python/cpython, Aug 5, 1990—Oct 26, 2017. I love how the graphs on this page summarize the history of the last 27 years of Python development, showing exactly when each core contributor was most active. # 26th October 2017, 6:58 pm

hupper (via) Handy Python module for adding “live reload” development support to just about anything. I’m using it with Sanic—I run “hupper -m app” and it starts up my code in app.py and automatically reloads it any time any of the corresponding files changes on disk. # 23rd October 2017, 12:34 am

Parse shell one-liners with pyparsing. Neat introduction to the pyparsing library, both for parsing tokens into labeled sections and constructing an AST from them. # 22nd October 2017, 1:35 pm

Getting the Most out of Sqlite3 with Python. A couple of neat tricks I didn’t know: you can skip cursors entirely by calling .execute and .executemany directly on the connection object, and you can use the connection object as a context manager to execute transactions using a “with” block. # 22nd October 2017, 12:35 pm

Porting my blog to Python 3

This blog is now running on Python 3! Admittedly this is nearly nine years after the first release of Python 3.0, but it’s the first Python 3 project I’ve deployed myself so I’m pretty excited about it.

[... 883 words]

Deploying an asynchronous Python microservice with Sanic and Zeit Now

Back in 2008 Natalie Downe and I deployed what today we would call a microservice: json-head, a tiny Google App Engine app that allowed you to make an HTTP head request against a URL and get back the HTTP headers as JSON. One of our initial use-scase for this was Natalie’s addSizes.js, an unobtrusive jQuery script that could annotate links to PDFs and other large files with their corresponding file size pulled from the Content-Length header. Another potential use-case is detecting broken links, since the API can be used to spot 404 status codes (as in this example).

[... 1361 words]

Sanic. “Sanic is a Flask-like Python 3.5+ web server that’s written to go fast [...] On top of being Flask-like, Sanic supports async request handlers. This means you can use the new shiny async/await syntax from Python 3.5, making your code non-blocking and speedy”. # 7th October 2017, 6:39 pm

uvloop: Blazing fast Python networking. “uvloop makes asyncio fast. In fact, it is at least 2x faster than nodejs, gevent, as well as any other Python asynchronous framework. The performance of uvloop-based asyncio is close to that of Go programs.” # 7th October 2017, 5:53 pm

Building a statistical profiler in python. Generating flame graphs of production Python code is surprisingly straight-forward. Brian Pitts built a statistical flame graph profiler into our tikibar debugging tool at Eventbrite and it’s proved extremely useful. # 5th October 2017, 3:44 pm

For a Django application, deployed on Heroku, what are my options for storing user-uploaded media files?

S3 is really a no-brainer for this, it’s extremely inexpensive, very easy to integrate with and unbelievably reliable. It’s so cheap that it will be practically free for testing purposes (expect to spend pennies a month on it).

[... 88 words]

What is the scope, as a career, for a Python developer?

Don’t be an “X developer”. You’re selling yourself short if you define yourself by the technology you most frequently use.

[... 169 words]