Simon Willison’s Weblog

Subscribe

Items tagged rust in 2024

Filters: Year: 2024 × rust × Sorted by date


Zed Decoded: Rope & SumTree (via) Text editors like Zed need in-memory data structures that are optimized for handling large strings where text can be inserted or deleted at any point without needing to copy the whole string.

Ropes are a classic, widely used data structure for this.

Zed have their own implementation of ropes in Rust, but it's backed by something even more interesting: a SumTree, described here as a thread-safe, snapshot-friendly, copy-on-write B+ tree where each leaf node contains multiple items and a Summary for each Item, and internal tree nodes contain a Summary of the items in its subtree.

These summaries allow for some very fast traversal tree operations, such as turning an offset in the file into a line and row coordinate and vice-versa. The summary itself can be anything, so each application of SumTree in Zed collects different summary information.

Uses in Zed include tracking highlight regions, code folding state, git blame information, project file trees and more - over 20 different classes and counting.

Zed co-founder Nathan Sobo calls SumTree "the soul of Zed".

Also notable: this detailed article is accompanied by an hour long video with a four-way conversation between Zed maintainers providing a tour of these data structures in the Zed codebase. # 28th April 2024, 3:25 pm

Ruff v0.4.0: a hand-written recursive descent parser for Python. The latest release of Ruff—a Python linter and formatter, written in Rust—includes a complete rewrite of the core parser. Previously Ruff used a parser borrowed from RustPython, generated using the LALRPOP parser generator. Victor Hugo Gomes contributed a new parser written from scratch, which provided a 2x speedup and also added error recovery, allowing parsing of invalid Python—super-useful for a linter.

I tried Ruff 0.4.0 just now against Datasette—a reasonably large Python project—and it ran in less than 1/10th of a second. This thing is Fast. # 19th April 2024, 5 am

How we built JSR (via) Really interesting deep dive by Luca Casonato into the engineering behind the new JSR alternative JavaScript package registry launched recently by Deno.

The backend uses PostgreSQL and a Rust API server hosted on Google Cloud Run.

The frontend uses Fresh, Deno’s own server-side JavaScript framework which leans heavily in the concept of “islands”—a progressive enhancement technique where pages are rendered on the server and small islands of interactivity are added once the page has loaded. # 12th April 2024, 3:49 pm

uv: Python packaging in Rust (via) “uv is an extremely fast Python package installer and resolver, written in Rust, and designed as a drop-in replacement for pip and pip-tools workflows.”

From Charlie Marsh and Astral, the team behind Ruff, who describe it as a milestone in their pursuit of a “Cargo for Python”.

Also in this announcement: Astral are taking over stewardship of Armin Ronacher’s Rye packaging tool, another Rust project.

uv is reported to be 8-10x faster than regular pip, increasing to 80-115x faster with a warm global module cache thanks to copy-on-write and hard links on supported filesystems—which saves on disk space too.

It also has a --resolution=lowest option for installing the lowest available version of dependencies—extremely useful for testing, I’ve been wanting this for my own projects for a while.

Also included: “uv venv”—a fast tool for creating new virtual environments with no dependency on Python itself. # 15th February 2024, 7:57 pm

Rye: Added support for marking virtualenvs ignored for cloud sync (via) A neat feature in the new Rye 0.22.0 release. It works by using an xattr Rust crate to set the attributes “com.dropbox.ignored” and “com.apple.fileprovider.ignore#P” on the folder. # 10th February 2024, 6:50 am

NYT Flash-based visualizations work again. The New York Times are using the open source Ruffle Flash emulator—built using Rust, compiled to WebAssembly—to get their old archived data visualization interactives working again. # 21st January 2024, 5:58 am

Fastest Way to Read Excel in Python (via) Haki Benita produced a meticulously researched and written exploration of the options for reading a large Excel spreadsheet into Python. He explored Pandas, Tablib, Openpyxl, shelling out to LibreOffice, DuckDB and python-calamine (a Python wrapper of a Rust library). Calamine was the winner, taking 3.58s to read 500,00 rows—compared to Pandas in last place at 32.98s. # 3rd January 2024, 8:04 pm