Simon Willison’s Weblog

Subscribe

Items tagged python in Oct, 2023

Filters: Year: 2023 × Month: Oct × python × Sorted by date


My User Experience Porting Off setup.py (via) PyOxidizer maintainer Gregory Szorc provides a detailed account of his experience trying to figure out how to switch from setup.py to pyproject.toml for his zstandard Python package.

This kind of detailed usability feedback is incredibly valuable for project maintainers, especially when the user encountered this many different frustrations along the way. It’s like the written version of a detailed usability testing session. # 31st October 2023, 7:57 pm

chDB (via) This is a really interesting development: chDB offers “an embedded SQL OLAP Engine” as a Python package, which you can install using “pip install chdb”. What you’re actually getting is a wrapper around ClickHouse—it’s almost like ClickHouse has been repackaged into an embedded database similar to SQLite. # 24th October 2023, 11:04 pm

I’m banned for life from advertising on Meta. Because I teach Python. (via) If accurate, this describes a nightmare scenario of automated decision making.

Reuven recently found he had a permanent ban from advertising on Facebook. They won’t tell him exactly why, and have marked this as a final decision that can never be reviewed.

His best theory (impossible for him to confirm) is that it’s because he tried advertising a course on Python and Pandas a few years ago which was blocked because a dumb algorithm thought he was trading exotic animals!

The worst part? An appeal is no longer possible because relevant data is only retained for 180 days and so all of the related evidence has now been deleted.

Various comments on Hacker News from people familiar with these systems confirm that this story likely holds up. # 19th October 2023, 2:56 pm

Bottleneck T5 Text Autoencoder (via) Colab notebook by Linus Lee demonstrating his Contra Bottleneck T5 embedding model, which can take up to 512 tokens of text, convert that into a 1024 floating point number embedding vector... and then then reconstruct the original text (or a close imitation) from the embedding again.

This allows for some fascinating tricks, where you can do things like generate embeddings for two completely different sentences and then reconstruct a new sentence that combines the weights from both. # 10th October 2023, 2:12 am

New sqlite3 CLI tool in Python 3.12. The newly released Python 3.12 includes a SQLite shell, which you can open using “python -m sqlite3”—handy for when you’re using a machine that has Python installed but no sqlite3 binary.

I installed Python 3.12 for macOS using the official installer from Python.org and now “/usr/local/bin/python3 -m sqlite3” gives me a SQLite 3.41.1 shell—a pleasantly recent version from March 2023 (the latest SQLite is 3.43.1, released in September). # 3rd October 2023, 6:57 pm

[On Python 3.12 subinterpreters] there’s massive advantages for mixed C(++) and Python: I can now have multiple sub interpreters running concurrently and accessing the same shared state in a thread-safe C++ library.

Previously this required rewriting the whole C++ library to support either pickling (multiplying the total memory consumption by the number of cores), or support allocating everything in shared memory (which means normal C++ types like `std::string` are unusable, need to switch e.g. to boost::interprocess).

Now is sufficient to pickle a pointer to a C++ object as an integer, and it’ll still be a valid pointer in the other subinterpreter.

ynik # 2nd October 2023, 6:13 pm