Simon Willison’s Weblog

246 items tagged “projects”

2022

viewport-preview (via) I built a tiny tool which lets you preview a URL in a bunch of different common browser viewport widths, using iframes. # 26th July 2022, 12 am

sqlite-comprehend: run AWS entity extraction against content in a SQLite database

I built a new tool this week: sqlite-comprehend, which passes text from a SQLite database through the AWS Comprehend entity extraction service and stores the returned entities.

[... 1146 words]

s3-ocr: Extract text from PDF files stored in an S3 bucket

I’ve released s3-ocr, a new tool that runs Amazon’s Textract OCR text extraction against PDF files in an S3 bucket, then writes the resulting text out to a SQLite database with full-text search configured so you can run searches against the extracted data.

[... 1493 words]

Joining CSV files in your browser using Datasette Lite

I added a new feature to Datasette Lite—my version of Datasette that runs entirely in your browser using WebAssembly (previously): you can now use it to load one or more CSV files by URL, and then run SQL queries against them—including joins across data from multiple files.

[... 546 words]

A tiny web app to create images from OpenStreetMap maps

Earlier today I found myself wanting to programmatically generate some images of maps.

[... 1388 words]

Weeknotes: Datasette Cloud ready to preview

I made an absolute ton of progress building Datasette Cloud on Fly this week, and also had a bunch of fun playing with GPT-3.

[... 370 words]

Weeknotes: Building Datasette Cloud on Fly Machines, Furo for documentation

Hosting provider Fly released Fly Machines this week. I got an early preview and I’ve been working with it for a few days—it’s a fascinating new piece of technology. I’m using it to get my hosting service for Datasette ready for wider release.

[... 1005 words]

simonw/datasette-screenshots (via) I started a new GitHub repository to automate taking screenshots of Datasette for marketing purposes, using my shot-scraper browser automation tool. # 17th May 2022, 5:56 pm

Weeknotes: Datasette Lite, nogil Python, HYTRADBOI

My big project this week was Datasette Lite, a new way to run Datasette directly in a browser, powered by WebAssembly and Pyodide. I also continued my research into running SQL queries in parallel, described last week. Plus I spoke at HYTRADBOI.

[... 1434 words]

Datasette Lite: a server-side Python web application running in a browser

Datasette Lite is a new way to run Datasette: entirely in a browser, taking advantage of the incredible Pyodide project which provides Python compiled to WebAssembly plus a whole suite of useful extras.

[... 4787 words]

Automatically opening issues when tracked file content changes

I figured out a GitHub Actions pattern to keep track of a file published somewhere on the internet and automatically open a new repository issue any time the contents of that file changes.

[... 1151 words]

Building a Covid sewage Twitter bot (and other weeknotes)

I built a new Twitter bot today: @covidsewage. It tweets a daily screenshot of the latest Covid sewage monitoring data published by Santa Clara county.

[... 1079 words]

Pillar Point Stewards, pypi-to-sqlite, improvements to shot-scraper and appreciating datasette-dashboards

This week I helped Natalie launch the Pillar Point Stewards website and built a new tool for loading PyPI package data into SQLite, in order to help promote the excellent datasette-dashboards plugin by Romain Clement.

[... 1985 words]

Instantly create a GitHub repository to take screenshots of a web page

I just released shot-scraper-template, a GitHub repository template that helps you start taking automated screenshots of a web page by filling out a form.

[... 1177 words]

shot-scraper: automated screenshots for documentation, built on Playwright

shot-scaper is a new tool that I’ve built to help automate the process of keeping screenshots up-to-date in my documentation. It also doubles as a scraping tool—hence the name—which I picked as a complement to my git scraping and help scraping techniques.

[... 1781 words]

Google Drive to SQLite

I released a new tool this week: google-drive-to-sqlite. It’s a CLI utility for fetching metadata about files in your Google Drive and writing them to a local SQLite database.

[... 1221 words]

Datasette 0.60: The annotated release notes

I released Datasette 0.60 today. It’s a big release, incorporating 61 commits and 18 issues. Here are the annotated release notes.

[... 1119 words]

What’s new in sqlite-utils 3.20 and 3.21

sqlite-utils is my combined CLI tool and Python library for manipulating SQLite databases. Consider this the annotated release notes for sqlite-utils 3.20 and 3.21, both released in the past week.

[... 2429 words]

2021

Weeknotes: datasette-tiddlywiki, filters_from_request

I made some good progress on the big refactor this week, including extracting some core logic out into a new Datasette plugin hook. I also got distracted by TiddlyWiki and released a new Datasette plugin that lets you run TiddlyWiki inside Datasette.

[... 1197 words]

git-history: a tool for analyzing scraped data collected using Git and SQLite

I described Git scraping last year: a technique for writing scrapers where you periodically snapshot a source of data to a Git repository in order to record changes to that source over time.

[... 2002 words]

s3-credentials 0.8. The latest release of my s3-credentials CLI tool for creating S3 buckets with credentials to access them (with read-write, read-only or write-only policies) adds a new --public option for creating buckets that allow public access, such that anyone who knows a filename can download a file. The s3-credentials put-object command also now sets the appropriate Content-Type heading on the uploaded object. # 7th December 2021, 7:04 am

Weeknotes: Shaving some beautiful yaks

I’ve been mostly shaving yaks this week—two in particular: the Datasette table refactor and the next release of git-history. I also built and released my first Web Component!

[... 1307 words]

Weeknotes: git-history, created for a Git scraping workshop

My main project this week was a 90 minute workshop I delivered about Git scraping at Coda.Br 2021, a Brazilian data journalism conference, on Friday. This inspired the creation of a brand new tool, git-history, plus smaller improvements to a range of other projects.

[... 1239 words]

Weeknotes: datasette-jupyterlite, s3-credentials and a Python packaging talk

My big project this week was s3-credentials, described yesterday—but I also put together a fun expermiental Datasette plugin bundling JupyterLite and wrote up my PyGotham talk on Python packaging.

[... 476 words]

s3-credentials: a tool for creating credentials for S3 buckets

I’ve built a command-line tool called s3-credentials to solve a problem that’s been frustrating me for ages: how to quickly and easily create AWS credentials (an access key and secret key) that have permission to read or write from just a single S3 bucket.

[... 1618 words]

Weeknotes: CDC vaccination history fixes, developing in GitHub Codespaces

I spent the last week mostly surrounded by boxes: we’re completing our move to the new place and life is mostly unpacking now. I did find some time to fix some issues with my CDC vaccination history Datasette instance though.

[... 514 words]

Weeknotes number 100

This entry marks my 100th weeknotes, which I’ve managed to post once a week (plus or minus a few days) consistently since 13th September 2019.

[... 593 words]

Datasette Desktop 0.2.0: The annotated release notes

Datasette Desktop is a new macOS desktop application version of Datasette, an “open source multi-tool for exploring and publishing data” built on top of SQLite. I released the first version last week—I’ve just released version 0.2.0 (and a 0.2.1 bug fix) with a whole bunch of critical improvements.

[... 2208 words]

Datasette Desktop—a macOS desktop application for Datasette

I just released version 0.1.0 of the new Datasette macOS desktop application, the first version that end-users can easily install. I would very much appreciate your help testing it out!

[... 1761 words]

Datasette Desktop 0.1.0 (via) This is the first installable version of the new Datasette Desktop macOS application I’ve been building. Please try it out and leave feedback on Twitter or on the GitHub Discussions thread linked from the release notes. # 8th September 2021, 5:14 am