Simon Willison’s Weblog

Subscribe

Posts tagged datasette in Nov

Filters: Month: Nov × datasette × Sorted by date

Weeknotes: asynchronous LLMs, synchronous embeddings, and I kind of started a podcast

These past few weeks I’ve been bringing Datasette and LLM together and distracting myself with a new sort-of-podcast crossed with a live streaming experiment.

[... 896 words]

Project: Civic Band—scraping and searching PDF meeting minutes from hundreds of municipalities

Visit Project: Civic Band - scraping and searching PDF meeting minutes from hundreds of municipalities

I interviewed Philip James about Civic Band, his “slowly growing collection of databases of the minutes from civic governments”. Philip demonstrated the site and talked through his pipeline for scraping and indexing meeting minutes from many different local government authorities around the USA.

[... 762 words]

Visualizing local election results with Datasette, Observable and MapLibre GL

Visit Visualizing local election results with Datasette, Observable and MapLibre GL

Alex Garcia and myself hosted the first Datasette Open Office Hours on Friday—a live-streamed video session where we hacked on a project together and took questions and tips from community members on Discord.

[... 3,390 words]

Datasette Public Office Hours, Friday Nov 8th at 2pm PT. Tomorrow afternoon (Friday 8th November) at 2pm PT we'll be hosting the first Datasette Public Office Hours - a livestream video session on Discord where Alex Garcia and myself will live code on some Datasette projects and hang out to chat about the project.

This is our first time trying this format. If it works out well I plan to turn it into a series.

Discord event card promoting Datasette Public Office Hours

# 7th November 2024, 7:10 pm / open-source, datasette, discord, alex-garcia, datasette-public-office-hours

Annotate and explore your data with datasette-comments. New plugin for Datasette and Datasette Cloud: datasette-comments, providing tools for collaborating on data exploration with a team through posting comments on individual rows of data.

Alex Garcia built this for Datasette Cloud but as with almost all of our work there it’s also available as an open source Python package.

# 30th November 2023, 9:59 pm / collaboration, projects, datasette, datasette-cloud, alex-garcia

Weeknotes: DevDay, GitHub Universe, OpenAI chaos

Three weeks of conferences and Datasette Cloud work, four days of chaos for OpenAI.

[... 766 words]

Financial sustainability for open source projects at GitHub Universe

Visit Financial sustainability for open source projects at GitHub Universe

I presented a ten minute segment at GitHub Universe on Wednesday, ambitiously titled Financial sustainability for open source projects.

[... 2,485 words]

Weeknotes: Implementing a write API, Mastodon distractions

Everything is so distracting at the moment. The ongoing Twitter catastrophe, the great migration (at least amongst most of the people I pay attention to) to Mastodon, the FTX calamity. It’s been very hard to focus!

[... 916 words]

Tracking Mastodon user numbers over time with a bucket of tricks

Visit Tracking Mastodon user numbers over time with a bucket of tricks

Mastodon is definitely having a moment. User growth is skyrocketing as more and more people migrate over from Twitter.

[... 1,534 words]

Datasette Lite: Loading JSON data (via) I added a new feature to Datasette Lite: you can now pass it the URL to a JSON file (hosted on a CORS-compatible hosting provider such as GitHub or GitHub Gists) and it will load that file into a database table for you. It expects an array of objects, but if your file has an object as the root it will search through it looking for the first key that is an array of objects and load those instead.

# 18th November 2022, 6:43 pm / json, projects, datasette, datasette-lite, cors

Datasette is 5 today: a call for birthday presents

Visit Datasette is 5 today: a call for birthday presents

Five years ago today I published the first release of Datasette, in Datasette: instantly create and publish an API for your SQLite databases.

[... 548 words]

Designing a write API for Datasette

Building out Datasette Cloud has made one thing clear to me: Datasette needs a write API for ingesting new data into its attached SQLite databases.

[... 1,493 words]

Weeknotes: Apache proxies in Docker containers, refactoring Datasette

Updates to six major projects this week, plus finally some concrete progress towards Datasette 1.0.

[... 1,630 words]

Weeknotes: git-history, created for a Git scraping workshop

Visit Weeknotes: git-history, created for a Git scraping workshop

My main project this week was a 90 minute workshop I delivered about Git scraping at Coda.Br 2021, a Brazilian data journalism conference, on Friday. This inspired the creation of a brand new tool, git-history, plus smaller improvements to a range of other projects.

[... 1,239 words]

Datasette is four years old today. I marked the occasion with a short Twitter thread about the project so far.

# 13th November 2021, 6:14 pm / datasette

AWS IAM definitions in Datasette (via) As part of my ongoing quest to conquer IAM permissions, I built myself a Datasette instance that lets me run queries against all 10,441 permissions across 280 AWS services. It’s deployed by a build script running in GitHub Actions which downloads a 8.9MB JSON file from the Salesforce policy_sentry repository—policy_sentry itself creates that JSON file by running an HTML scraper against the official AWS documentation!

# 6th November 2021, 3:47 am / aws, datasette

Weeknotes: datasette-jupyterlite, s3-credentials and a Python packaging talk

Visit Weeknotes: datasette-jupyterlite, s3-credentials and a Python packaging talk

My big project this week was s3-credentials, described yesterday—but I also put together a fun expermiental Datasette plugin bundling JupyterLite and wrote up my PyGotham talk on Python packaging.

[... 476 words]

Datasette Weekly volume 4. Datasette Office Hours, Personal Data Warehouses, datasette-ripgrep, datasette-indieauth, Datasette Client for Observable and more.

# 30th November 2020, 11:02 pm / datasette

Datasette Office Hours (via) I’ve decided to try running Datasette Office Hours every Friday. If you’d like to chat with me over Zoom about the project for 20 minutes for any reason at all you can grab a slot here using Calendly.

# 30th November 2020, 11:01 pm / datasette

Datasette 0.52. A relatively small release—it has a new plugin hook (database_actions(), for adding links to a new database actions menu), renames the --config option to --setting and adds a new “datasette publish cloudrun --apt-get-install” option.

# 29th November 2020, 12:56 am / projects, datasette

datasette-ripgrep: deploy a regular expression search engine for your source code

Visit datasette-ripgrep: deploy a regular expression search engine for your source code

This week I built datasette-ripgrep—a web application for running regular expression searches against source code, built on top of the amazing ripgrep command-line tool.

[... 1,362 words]

Datasette Client for Observable (via) Really elegant piece of code design from Alex Garcia: DatasetteClient is a client library he built designed to work in Observable notebooks, which uses JavaScript tagged template literals to allow SQL query results to be executed against a Datasette instance and displayed as inline tables in a notebook, or used to return JSON data for further processing. His example notebook includes a neat d3 stacked area chart example built against a Datasette of congresspeople, plus examples using interactive widgets to update the Notebook.

# 24th November 2020, 6:53 pm / javascript, datasette, observable, alex-garcia

Weeknotes: datasette-indieauth, datasette-graphql, PyCon Argentina

Visit Weeknotes: datasette-indieauth, datasette-graphql, PyCon Argentina

Last week’s weeknotes took the form of my Personal Data Warehouses: Reclaiming Your Data talk write-up, which represented most of what I got done that week. This week I mainly worked on datasette-indieauth, but I also gave a keynote at PyCon Argentina and released a version of datasette-graphql with a small security fix.

[... 724 words]

datasette-graphql 1.2 (via) A new release of the datasette-graphql plugin, fixing a minor security flaw: previous versions of the plugin could expose the schema (but not the actual data) of tables in databases that were otherwise protected by Datasette’s permission system.

# 21st November 2020, 10:21 pm / projects, security, graphql, datasette

Implementing IndieAuth for Datasette

Visit Implementing IndieAuth for Datasette

IndieAuth is a spiritual successor to OpenID, developed and maintained by the IndieWeb community and based on OAuth 2. This weekend I attended IndieWebCamp East Coast and was inspired to try my hand at an implementation. datasette-indieauth is the result, a new plugin which enables IndieAuth logins to a Datasette instance.

[... 1,225 words]

Personal Data Warehouses: Reclaiming Your Data

Visit Personal Data Warehouses: Reclaiming Your Data

I gave a talk yesterday about personal data warehouses for GitHub’s OCTO Speaker Series, focusing on my Datasette and Dogsheep projects. The video of the talk is now available, and I’m presenting that here along with an annotated summary of the talk, including links to demos and further information.

[... 5,166 words]

Intent to Remove: HTTP/2 and gQUIC server push (via) The Chrome / Blink team announce their intent to remove HTTP/2 server push support, where servers can start pushing an asset to a client before it has been requested. It’s been in browsers for over five years now and adoption is terrible. “Over the past 28 days [...] 99.97% of connections never received a pushed stream that got matched with a request [...] These numbers are exactly the same as in June 2019”. Datasette serves redirects with Link: preload headers that cause smart proxies (like Cloudflare) to push the redirected page to the client along with the redirect, but I don’t exepect to miss that optimization if it quietly stops working.

# 12th November 2020, 1:44 am / chrome, http2, datasette

Datasette 0.51 (plus weeknotes)

Visit Datasette 0.51 (plus weeknotes)

I shipped Datasette 0.51 today, with a new visual design, plugin hooks for adding navigation options, better handling of binary data, URL building utility methods and better support for running Datasette behind a proxy. It’s a lot of stuff! Here are the annotated release notes.

[... 2,020 words]

niche-museums.com, powered by Datasette

I just released a major upgrade to my www.niche-museums.com website (launched last month).

[... 1,154 words]

Weeknotes: datasette-template-sql

Last week I talked about wanting to take ona a larger Datasette project, and listed some candidates. I ended up pushing a big project that I hadn’t listed there: the upgrade of Datasette to Python 3.8, which meant dropping support for Python 3.5 (thanks to incompatible dependencies).

[... 521 words]