May 2021
49 posts: 5 entries, 16 links, 2 quotes, 26 beats
May 2, 2021
Query Engines: Push vs. Pull (via) Justin Jaffray (who has worked on Materialize) explains the difference between push and pull query execution engines using some really clear examples built around JavaScript generators.
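To make the distinction concrete, here is a minimal sketch in Python rather than the JavaScript used in the post. The operator names (`scan_pull`, `filter_pull`, `scan_push`, `filter_push`) and the sample rows are invented for illustration and are not taken from the article. In the pull version the consumer asks for rows through generators; in the push version the producer hands rows to a callback.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, int]

# Pull model: every operator is a generator, and the consumer drives
# execution by asking the operator above it for the next row.
def scan_pull(rows: Iterable[Row]) -> Iterator[Row]:
    for row in rows:
        yield row

def filter_pull(source: Iterator[Row], predicate: Callable[[Row], bool]) -> Iterator[Row]:
    for row in source:
        if predicate(row):
            yield row

# Push model: every operator is given a callback, and the producer drives
# execution by pushing each row into the operator below it.
def scan_push(rows: Iterable[Row], emit: Callable[[Row], None]) -> None:
    for row in rows:
        emit(row)

def filter_push(predicate: Callable[[Row], bool], emit: Callable[[Row], None]) -> Callable[[Row], None]:
    def receive(row: Row) -> None:
        if predicate(row):
            emit(row)
    return receive

# Hypothetical sample data.
ROWS: List[Row] = [{"id": 1, "n": 5}, {"id": 2, "n": 50}, {"id": 3, "n": 500}]

def is_big(row: Row) -> bool:
    return row["n"] > 10

# Pull: list() repeatedly asks the filter for rows, which asks the scan.
pulled = list(filter_pull(scan_pull(ROWS), is_big))

# Push: the scan feeds rows to the filter, which feeds the result list.
pushed: List[Row] = []
scan_push(ROWS, filter_push(is_big, pushed.append))
```

Both pipelines produce the same rows; the difference is which end of the pipeline controls the loop.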
One year of TILs
Just over a year ago I started tracking TILs, inspired by Josh Branchaud’s collection. I’ve since published 148 TILs across 43 different topics. It’s a great format!
[... 224 words]
Hosting SQLite databases on Github Pages (via) I've seen the trick of running SQLite compiled to WASM in the browser before, but here it comes with an incredibly clever bonus trick: it uses SQLite's page structure to fetch subsets of the database file via HTTP range requests, which means you can run indexed SQL queries against a 600MB database file while only fetching a few MBs of data over the wire. Absolutely brilliant. Tucked away at the end of the post is another neat trick: making the browser DOM available to SQLite as a virtual table, so you can query and update the DOM of the current page using SQL!
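The core arithmetic behind the range-request trick can be sketched as follows. This is not the post's implementation, just an illustration of how a page number maps to an HTTP `Range` header. The helper names are hypothetical, and the 4096-byte page size is an assumption (a real client would read the page size from the database header).

```python
from typing import Dict, Tuple

def page_byte_range(page_number: int, page_size: int = 4096) -> Tuple[int, int]:
    """Return the inclusive (start, end) byte offsets of a 1-indexed SQLite page."""
    if page_number < 1:
        raise ValueError("SQLite page numbers start at 1")
    start = (page_number - 1) * page_size
    return start, start + page_size - 1

def range_header(page_number: int, page_size: int = 4096) -> Dict[str, str]:
    """Build the HTTP header that asks a server for exactly one page."""
    start, end = page_byte_range(page_number, page_size)
    return {"Range": f"bytes={start}-{end}"}

# Page 1 holds the database header; later pages hold tables and indexes.
first_page = range_header(1)
third_page_small = range_header(3, page_size=1024)
```

An indexed query only needs the handful of pages on the path through the index B-tree, which is why so little of the file has to travel over the wire.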
May 3, 2021
Adding GeoDjango to an existing Django project
Work on VIAL for Vaccinate The States continues.
[... 1,503 words]
May 4, 2021
Practical SQL for Data Analysis (via) This is a really great SQL tutorial: it starts with the basics, but quickly moves on to a whole array of advanced PostgreSQL techniques - CTEs, window functions, efficient sampling, rollups, pivot tables and even linear regressions executed directly in the database using regr_slope(), regr_intercept() and regr_r2(). I picked up a whole bunch of tips for things I didn't know you could do with PostgreSQL here.
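For a sense of what those three aggregates compute, here is a rough pure-Python equivalent of an ordinary least-squares fit. It is a sketch of the underlying formulas, not PostgreSQL's implementation, and the function name `linear_regression` and the sample data are made up for illustration.

```python
from typing import Sequence, Tuple

def linear_regression(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float, float]:
    """Return (slope, intercept, r_squared) for a least-squares line y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least two points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and cross-deviations from the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("x values are all identical, so the slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # In this sketch a constant y is treated as a perfect fit.
    r_squared = 1.0 if syy == 0 else (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared

# Hypothetical data lying exactly on y = 2x + 1.
slope, intercept, r_squared = linear_regression([1, 2, 3, 4], [3, 5, 7, 9])
```

The appeal of doing this in the database is that the rows never have to leave it: the aggregates return just the three numbers.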
Observable Plot (via) This is huge: a brand new high-level JavaScript visualization library from Mike Bostock, the author of D3—partially inspired by Vega-Lite which I’ve used enthusiastically in the past. First impressions are that this is a big step forward for quickly building high-quality visualizations. It’s released under the ISC license which is “functionally equivalent to the BSD 2-Clause and MIT licenses”.
Plot & Vega-Lite. Useful documentation comparing the brand new Observable Plot to Vega-Lite, complete with examples of how to achieve the same thing in both libraries.
cinder: Instagram’s performance oriented fork of CPython (via) Instagram forked CPython to add some performance-oriented features they wanted, including a method-at-a-time JIT compiler and a mechanism for eagerly evaluating coroutines (avoiding the overhead of creating a coroutine if an awaited function returns a value without itself needing to await). They’re open sourcing the code to help start conversations about implementing some of these features in CPython itself. I particularly enjoyed the warning that accompanies the repo: this is not intended to be a supported release, and if you decide to run it in production you are on your own!
May 5, 2021
A museum bot (via) Shawn Graham built a Twitter bot, using R, which tweets out random items from the collection at the Canadian Science and Technology Museum—using a Datasette instance that he’s running based on a CSV export of their collections data.
May 8, 2021
May 9, 2021
May 10, 2021
Django SQL Dashboard
I’ve released the first non-alpha version of Django SQL Dashboard, which provides an interface for running arbitrary read-only SQL queries directly against a PostgreSQL database, protected by the Django authentication scheme. It can also be used to create saved dashboards that can be published or shared internally.
[... 2,171 words]
May 12, 2021
New Major Versions Released! Flask 2.0, Werkzeug 2.0, Jinja 3.0, Click 8.0, ItsDangerous 2.0, and MarkupSafe 2.0. Huge set of releases from the Pallets team. Python 3.6+ required and comprehensive type annotations. Flask now supports async views, Jinja async templates (used extensively by Datasette) “no longer requires patching”, Click has a bunch of new code around shell tab completion, ItsDangerous supports key rotation and so much more.
Async functions require an event loop to run. Flask, as a WSGI application, uses one worker to handle one request/response cycle. When a request comes in to an async view, Flask will start an event loop in a thread, run the view function there, then return the result.
Each request still ties up one worker, even for async views. The upside is that you can run async code within a view, for example to make multiple concurrent database queries, HTTP requests to an external API, etc. However, the number of requests your application can handle at one time will remain the same.
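The behaviour described in that quote can be approximated with nothing but the standard library. This is a sketch of the idea, not Flask's actual implementation: `run_async_view_in_thread`, `async_view` and `fetch` are hypothetical names, and the sleeps stand in for database queries or API calls.

```python
import asyncio
import threading
from typing import Any, Awaitable, Callable, Dict, List

async def fetch(name: str, delay: float) -> str:
    # Stand-in for an awaitable database query or HTTP request.
    await asyncio.sleep(delay)
    return name

async def async_view() -> List[str]:
    # Inside the view, the two "requests" run concurrently.
    return await asyncio.gather(fetch("db", 0.05), fetch("api", 0.05))

def run_async_view_in_thread(view: Callable[[], Awaitable[Any]]) -> Any:
    """Run an async view to completion on a fresh event loop in a helper thread.

    The calling worker blocks until the view finishes, so it is still
    occupied for the whole request/response cycle.
    """
    outcome: Dict[str, Any] = {}

    def target() -> None:
        try:
            outcome["value"] = asyncio.run(view())
        except BaseException as exc:  # surface errors in the calling thread
            outcome["error"] = exc

    thread = threading.Thread(target=target)
    thread.start()
    thread.join()
    if "error" in outcome:
        raise outcome["error"]
    return outcome["value"]

response = run_async_view_in_thread(async_view)
```

The concurrency gain is entirely inside the view: the two fetches overlap, but the worker that called `run_async_view_in_thread` cannot pick up another request until it returns.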
May 13, 2021
Folks think s3 is static assets hosting but really it's a consistent and highly available key value store with first class blob support
May 14, 2021
Powering the Python Package Index in 2021. PyPI now serves “nearly 900 terabytes over more than 2 billion requests per day”. Bandwidth is donated by Fastly, a value estimated at 1.8 million dollars per month! Lots more detail about how PyPI has evolved over the past years in this post by Dustin Ingram.
May 16, 2021
May 17, 2021
geocode-sqlite. Neat command-line Python utility by Chris Amico: point it at a SQLite database file and it will add latitude and longitude columns and populate them by geocoding one or more of the other fields, using your choice from four currently supported geocoders.
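The general pattern (add two columns, then fill them in from a geocoder) can be sketched with Python's built-in `sqlite3` module. This is not how geocode-sqlite itself is implemented: the `geocode` function is a hypothetical stand-in backed by a hard-coded dictionary with made-up coordinates, and the table and column names are invented.

```python
import sqlite3
from typing import Optional, Tuple

# Hypothetical lookup table standing in for a real geocoding service.
FAKE_GEOCODER = {"1 Example Street": (10.0, 20.0)}

def geocode(address: str) -> Optional[Tuple[float, float]]:
    """Return (latitude, longitude) for a known address, or None."""
    return FAKE_GEOCODER.get(address)

def add_coordinates(conn: sqlite3.Connection, table: str, address_column: str) -> int:
    """Add latitude/longitude columns and populate them; return rows updated."""
    conn.execute(f'ALTER TABLE "{table}" ADD COLUMN latitude REAL')
    conn.execute(f'ALTER TABLE "{table}" ADD COLUMN longitude REAL')
    rows = conn.execute(f'SELECT rowid, "{address_column}" FROM "{table}"').fetchall()
    updated = 0
    for rowid, address in rows:
        found = geocode(address)
        if found is None:
            continue  # leave the new columns NULL when geocoding fails
        conn.execute(
            f'UPDATE "{table}" SET latitude = ?, longitude = ? WHERE rowid = ?',
            (found[0], found[1], rowid),
        )
        updated += 1
    conn.commit()
    return updated

# Usage against an in-memory database with invented rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE places (name TEXT, address TEXT)")
conn.executemany(
    "INSERT INTO places VALUES (?, ?)",
    [("Library", "1 Example Street"), ("Unknown", "nowhere")],
)
updated = add_coordinates(conn, "places", "address")
```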
No feigning surprise (via) Don’t feign surprise if someone doesn’t know something that you think they should know. Even better: even if you are surprised, don’t let them know! “When people feign surprise, it’s usually to make them feel better about themselves and others feel worse.”
