Simon Willison’s Weblog

491 posts tagged “projects”

Posts about projects I have worked on.

2022

AI assisted learning: Learning Rust with ChatGPT, Copilot and Advent of Code

I’m using this year’s Advent of Code to learn Rust—with the assistance of GitHub Copilot and OpenAI’s new ChatGPT.

[... 2,661 words]

Datasette’s new JSON write API: The first alpha of Datasette 1.0

This week I published the first alpha release of Datasette 1.0, with a significant new feature: Datasette core now includes a JSON API for creating and dropping tables and inserting, updating and deleting data.

[... 2,817 words]

Tracking Mastodon user numbers over time with a bucket of tricks

Mastodon is definitely having a moment. User growth is skyrocketing as more and more people migrate over from Twitter.

[... 1,534 words]

Datasette Lite: Loading JSON data (via) I added a new feature to Datasette Lite: you can now pass it the URL to a JSON file (hosted on a CORS-compatible hosting provider such as GitHub or GitHub Gists) and it will load that file into a database table for you. It expects an array of objects, but if your file has an object as the root it will search through it looking for the first key that is an array of objects and load those instead.

# 18th November 2022, 6:43 pm / datasette-lite, json, projects, datasette, cors
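
The root-object fallback described above can be sketched in Python. This is a minimal illustration, not Datasette Lite's actual code: the function names `is_array_of_objects` and `find_rows` are hypothetical, and it assumes only the top-level keys of the root object are checked, in document order.

```python
import json


def is_array_of_objects(value):
    # A non-empty list whose items are all JSON objects (Python dicts).
    return (
        isinstance(value, list)
        and len(value) > 0
        and all(isinstance(item, dict) for item in value)
    )


def find_rows(data):
    # Hypothetical helper: return the list of objects to load as table rows.
    if is_array_of_objects(data):
        return data
    if isinstance(data, dict):
        # Assumption: only top-level keys are searched, in document order.
        for value in data.values():
            if is_array_of_objects(value):
                return value
    raise ValueError("No array of objects found in JSON document")


document = json.loads(
    '{"meta": {"count": 2}, "tags": ["a", "b"], '
    '"items": [{"id": 1}, {"id": 2}]}'
)
rows = find_rows(document)
print(rows)
```

Here `meta` is skipped because it is an object and `tags` is skipped because it is an array of strings, so the rows come from `items`.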

Datasette is 5 today: a call for birthday presents

Five years ago today I published the first release of Datasette, in Datasette: instantly create and publish an API for your SQLite databases.

[... 548 words]

Designing a write API for Datasette

Building out Datasette Cloud has made one thing clear to me: Datasette needs a write API for ingesting new data into its attached SQLite databases.

[... 1,493 words]

Datasette 0.63: The annotated release notes

I released Datasette 0.63 today. These are the annotated release notes.

[... 1,531 words]

Weeknotes: DjangoCon, SQLite in Django, datasette-gunicorn

I spent most of this week at DjangoCon in San Diego—my first outside-of-the-Bay-Area conference since the before-times.

[... 1,184 words]

Measuring traffic during the Half Moon Bay Pumpkin Festival

This weekend was the 50th annual Half Moon Bay Pumpkin Festival.

[... 2,693 words]

Half Moon Bay Pumpkin Festival traffic on Saturday 15th October 2022 (via) It’s the Half Moon Bay Pumpkin Festival this weekend... and its impact on the traffic between our little town of El Granada and Half Moon Bay—8 minutes drive away—is notorious. So I built a git scraper that archives estimated driving times from the Google Maps Navigation API, and used git-history to turn that scraped data into a SQLite database and visualize it on a chart.

# 16th October 2022, 3:56 am / projects, git-scraping, git-history, half-moon-bay

shot-scraper 1.0 (via) Only a minor release in terms of features, but I decided that I’m comfortable enough with the CLI design at this point that I’m ready to stamp a 1.0 on it and commit to not making backwards-incompatible changes (at least without shipping a 2.0 release, which I’d like to avoid if possible).

# 15th October 2022, 9:28 pm / projects, shot-scraper, cli

Weeknotes: Publishing data using Datasette Cloud

My initial preview releases of Datasette Cloud (the SaaS version of my Datasette open source project) have focused on private data collaboration.

[... 582 words]

Weeknotes: Datasette Cloud preview invitations

This week I finally started sending out invitations for people to try out the preview of the new Datasette Cloud, my SaaS offering for Datasette.

[... 713 words]

A tool to run caption extraction against online videos using Whisper and GitHub Issues/Actions

I released a new project this weekend, built during the Bellingcat Hackathon (I came second!) It’s called Action Transcription and it’s a tool for capturing captions and transcripts from online videos.

[... 1,362 words]

Exploring 10m scraped Shutterstock videos used to train Meta’s Make-A-Video text-to-video model

Make-A-Video is a new “state-of-the-art AI system that generates videos from text” from Meta AI. It looks incredible—it really is DALL-E / Stable Diffusion for video. And it appears to have been trained on 10m video preview clips scraped from Shutterstock.

[... 923 words]

Weeknotes: Datasette Lite, s3-credentials, shot-scraper, datasette-edit-templates and more

Despite distractions from AI I managed to make progress on a bunch of different projects this week, including new releases of s3-credentials and shot-scraper, a new datasette-edit-templates plugin and a small but neat improvement to Datasette Lite.

[... 1,562 words]

Open every CSV file in a GitHub repository in Datasette Lite (via) I built an Observable notebook that accepts a GitHub repository as input, scans it for CSV files and generates a link to open all of those CSV files in Datasette Lite.

# 1st September 2022, 7:24 pm / datasette-lite, projects, observable, github

Building a searchable archive for the San Francisco Microscopical Society

The San Francisco Microscopical Society was founded in 1870 by a group of scientists dedicated to advancing the field of microscopy.

[... 1,845 words]

Analyzing ScotRail audio announcements with Datasette—from prototype to production

Scottish train operator ScotRail released a two-hour long MP3 file containing all of the components of its automated station announcements. Messing around with them is proving to be a huge amount of fun.

[... 4,428 words]

Plugin support for Datasette Lite

I’ve added a new feature to Datasette Lite, my distribution of Datasette that runs entirely in the browser using Python and SQLite compiled to WebAssembly. You can now install additional Datasette plugins by passing them in the URL.

[... 865 words]

viewport-preview (via) I built a tiny tool which lets you preview a URL in a bunch of different common browser viewport widths, using iframes.

# 26th July 2022, 12 am / css, testing, projects, mobile, iframes

sqlite-comprehend: run AWS entity extraction against content in a SQLite database

I built a new tool this week: sqlite-comprehend, which passes text from a SQLite database through the AWS Comprehend entity extraction service and stores the returned entities.

[... 1,146 words]

s3-ocr: Extract text from PDF files stored in an S3 bucket

I’ve released s3-ocr, a new tool that runs Amazon’s Textract OCR text extraction against PDF files in an S3 bucket, then writes the resulting text out to a SQLite database with full-text search configured so you can run searches against the extracted data.

[... 1,493 words]

Joining CSV files in your browser using Datasette Lite

I added a new feature to Datasette Lite—my version of Datasette that runs entirely in your browser using WebAssembly (previously): you can now use it to load one or more CSV files by URL, and then run SQL queries against them—including joins across data from multiple files.

[... 546 words]
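
To illustrate the kind of cross-file join this enables, here is a rough sketch using Python's standard `sqlite3` and `csv` modules rather than Datasette Lite itself. The `load_csv` helper, the table names and the sample data are all made up for the example, and the CSV text is inlined instead of fetched by URL.

```python
import csv
import io
import sqlite3


def load_csv(conn, table, text):
    # Hypothetical helper: create a table from CSV text, one column per header.
    reader = csv.DictReader(io.StringIO(text))
    columns = reader.fieldnames
    column_sql = ", ".join(f'"{name}"' for name in columns)
    conn.execute(f'CREATE TABLE "{table}" ({column_sql})')
    placeholders = ", ".join("?" for _ in columns)
    conn.executemany(
        f'INSERT INTO "{table}" VALUES ({placeholders})',
        ([row[name] for name in columns] for row in reader),
    )


conn = sqlite3.connect(":memory:")
# Made-up sample data standing in for two CSV files loaded by URL.
load_csv(conn, "authors", "id,name\n1,Ada\n2,Grace\n")
load_csv(conn, "books", "title,author_id\nNotes,1\nCompilers,2\n")

# Join rows from the two "files" on the shared author identifier.
rows = conn.execute(
    """
    SELECT books.title, authors.name
    FROM books
    JOIN authors ON books.author_id = authors.id
    ORDER BY books.title
    """
).fetchall()
print(rows)
```

Each CSV becomes its own table, so an ordinary SQL `JOIN` is all that is needed to combine them.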

A tiny web app to create images from OpenStreetMap maps

Earlier today I found myself wanting to programmatically generate some images of maps.

[... 1,388 words]

Weeknotes: Datasette Cloud ready to preview

I made an absolute ton of progress building Datasette Cloud on Fly this week, and also had a bunch of fun playing with GPT-3.

[... 370 words]

Weeknotes: Building Datasette Cloud on Fly Machines, Furo for documentation

Hosting provider Fly released Fly Machines this week. I got an early preview and I’ve been working with it for a few days—it’s a fascinating new piece of technology. I’m using it to get my hosting service for Datasette ready for wider release.

[... 1,005 words]

simonw/datasette-screenshots (via) I started a new GitHub repository to automate taking screenshots of Datasette for marketing purposes, using my shot-scraper browser automation tool.

# 17th May 2022, 5:56 pm / projects, shot-scraper, github-actions, datasette

Weeknotes: Datasette Lite, nogil Python, HYTRADBOI

My big project this week was Datasette Lite, a new way to run Datasette directly in a browser, powered by WebAssembly and Pyodide. I also continued my research into running SQL queries in parallel, described last week. Plus I spoke at HYTRADBOI.

[... 1,434 words]

Datasette Lite: a server-side Python web application running in a browser

Datasette Lite is a new way to run Datasette: entirely in a browser, taking advantage of the incredible Pyodide project which provides Python compiled to WebAssembly plus a whole suite of useful extras.

[... 4,800 words]