Simon Willison on datasette

437 posts tagged “datasette”

Datasette is an open source tool for exploring and publishing data.

2020

Weeknotes: Datasette Cloud and zero downtime deployments

Yesterday’s piece on Tracking FARA by deploying a data API using GitHub Actions and Cloud Run was originally intended to be my weeknotes, but ended up getting a bit too involved.

[... 1,431 words]

8:56 pm / 21st January 2020 / docker, traefik, csv, museums, deployment, datasette, projects, weeknotes, datasette-cloud, digitalocean, zero-downtime

Tracking FARA by deploying a data API using GitHub Actions and Cloud Run

I’m using the combination of GitHub Actions and Google Cloud Run to retrieve data from the U.S. Department of Justice FARA website and deploy it as a queryable API using Datasette.

[... 1,599 words]

7:51 am / 21st January 2020 / cloudrun, continuous-deployment, continuous-integration, github-actions, datasette, projects, data-journalism, github, git-scraping

Building a sitemap.xml with a one-off Datasette plugin

One of the fun things about launching a new website is re-learning what it takes to promote a website from scratch on the modern web. I’ve been thoroughly enjoying using Niche Museums as an excuse to explore 2020-era SEO.

[... 1,078 words]

11:02 pm / 6th January 2020 / plugins, seo, museums, datasette, projects, weeknotes

2019

Weeknotes: Datasette 0.33

I released Datasette 0.33 yesterday. The release represents an accumulation of small changes and features since Datasette 0.32 back in November. Duplicating the release notes:

[... 678 words]

9:38 pm / 23rd December 2019 / datasette, weeknotes

Building tools to bring data-driven reporting to more newsrooms. I wrote about my fellowship project so far and my goals for the next few months for the JSK Medium publication. My next priority: an invite-only hosted version for newsrooms so that figuring out how to install and manage the software isn’t the biggest barrier to entry.

# 20th December 2019, 11:17 am / jsk, data-journalism, datasette

Logging to SQLite using ASGI middleware

I had some fun playing around with ASGI middleware and logging during our flight back to England for the holidays.

[... 2,535 words]

10:30 pm / 16th December 2019 / sqlite, asgi, django, datasette, projects, logging, weeknotes, sqlite-utils

Better presentations through storytelling and STAR moments

Last week I completed GSBGEN 315: Strategic Communication at the Stanford Graduate School of Business.

[... 643 words]

12 am / 10th December 2019 / jsk, datasette, speaking, weeknotes

datasette-atom: Define an Atom feed using a custom SQL query

I’ve been having a ton of fun iterating on www.niche-museums.com. I put together some notes on how the site works last week, and I’ve been taking advantage of the Thanksgiving break to continue exploring ways in which Datasette can be used to quickly build database-backed static websites.

[... 1,084 words]

12:20 am / 3rd December 2019 / atom, syndication, datasette, projects, weeknotes

niche-museums.com, powered by Datasette

I just released a major upgrade to my www.niche-museums.com website (launched last month).

[... 1,154 words]

10:27 pm / 25th November 2019 / projects, datasette, museums, weeknotes, yaml, baked-data

Weeknotes: datasette-template-sql

Last week I talked about wanting to take ona a larger Datasette project, and listed some candidates. I ended up pushing a big project that I hadn’t listed there: the upgrade of Datasette to Python 3.8, which meant dropping support for Python 3.5 (thanks to incompatible dependencies).

[... 521 words]

10:29 pm / 18th November 2019 / projects, sql, datasette, weeknotes, coldfusion

datasette-template-sql (via) New Datasette plugin, celebrating the new ability in Datasette 0.32 to have asynchronous custom template functions in Jinja (which was previously blocked by the need to support Python 3.5). The plugin adds a sql() function which can be used to execute SQL queries that are embedded directly in custom templates.

# 15th November 2019, 12:59 am / projects, sql, templates, datasette, jinja

Datasette 0.31. Released today: this version adds compatibility with Python 3.8 and breaks compatibility with Python 3.5. Since Glitch support Python 3.7.3 now I decided I could finally give up on 3.5. This means Datasette can use f-strings now, but more importantly it opens up the opportunity to start taking advantage of Starlette, which makes all kinds of interesting new ASGI-based plugins much easier to build.

# 12th November 2019, 6:11 am / glitch, asgi, datasette, python, projects

Weeknotes: Python 3.7 on Glitch, datasette-render-markdown

Streaks is really working well for me. I’m at 12 days of commits to Datasette, 16 posting a daily Niche Museum, 19 of actually reviewing my email inbox and 14 of guitar practice. I rewarded myself for that last one by purchasing an actual classical (as opposed to acoustic) guitar.

[... 1,141 words]

11:26 pm / 11th November 2019 / glitch, markdown, museums, datasette, weeknotes, sqlite-utils, python

Setting up Datasette, step by step (via) Tobias describes how he runs Datasette on his own server/VPS, using nginx and systemd. I’m doing something similar for some projects and systemd really does feel like the solution to the “ensure a Python process keeps running” problem I’ve been fighting for over a decade. I really like how Tobias creates a dedicated Linux user for each of his deployed Python projects.

# 21st October 2019, 2:20 am / sysadmin, datasette

2018 Central Park Squirrel Census in Datasette (via) The Squirrel Census project released their data! 3,000 squirrel observations in Central Park, each with fur color and latitude and longitude and behavioral observations. I love this data so much. I’ve loaded it into a Datasette running on Glitch.

# 16th October 2019, 6:01 pm / squirrels, datasette

Weeknotes: PG&E outages, and Open Source works!

My big focus this week was the PG&E outages project. I’m really pleased with how this turned out: the San Francisco Chronicle used data from it for their excellent PG&E outage interactive (mixing in data on wind conditions) and it earned a bunch of interest on Twitter and some discussion on Hacker News.

[... 452 words]

5 pm / 14th October 2019 / open-source, datasette, projects, dogsheep, weeknotes

goodreads-to-sqlite (via) This is so cool! Tobias Kunze built a Python CLI tool to import your Goodreads data into a SQLite database, inspired by github-to-sqlite and my various other Dogsheep tools. It’s the first Dogsheep style tool I’ve seen that wasn’t built by me—and Tobias’ write-up includes some neat examples of queries you can run against your Goodreads data. I’ve now started using Goodreads and I’m importing my books into my own private Dogsheep Datasette instance.

# 14th October 2019, 4:07 am / dogsheep, books, datasette, sqlite, cli

Tracking PG&E outages by scraping to a git repo

PG&E have cut off power to several million people in northern California, supposedly as a precaution against wildfires.

[... 868 words]

11:32 pm / 10th October 2019 / projects, scraping, data-journalism, sqlite, datasette, git-scraping, digitalocean, sqlite-utils

SQL Murder Mystery in Datasette (via) “A crime has taken place and the detective needs your help. The detective gave you the crime scene report, but you somehow lost it. You vaguely remember that the crime was a murder that occurred sometime on Jan.15, 2018 and that it took place in SQL City. Start by retrieving the corresponding crime scene report from the police department’s database.”—Really fun game to help exercise your skills with SQL by the NU Knight Lab. I loaded their SQLite database into Datasette so you can play in your browser.

# 7th October 2019, 11:37 pm / datasette, projects, sql, sqlite

Weeknotes: Design thinking for journalists, genome-to-sqlite, datasette-atom

I haven’t had much time for code this week: we’ve had a full five day workshop at JSK with Tran Ha (a JSK alumni) learning how to apply Design Thinking to our fellowship projects and generally to challenges facing journalism.

[... 870 words]

8:13 pm / 20th September 2019 / design-thinking, sqlite, reading, genetics, datasette, projects, jsk, weeknotes

genome-to-sqlite. I just found out 23andMe let you export your genome as a zipped TSV file, so I wrote a little Python command-line tool to import it into a SQLite database.

# 19th September 2019, 3:58 pm / projects, datasette, genetics, sqlite

Weeknotes: ONA19, twitter-to-sqlite, datasette-rure

I’ve decided to start writing weeknotes for the duration of my JSK fellowship. Here goes!

[... 919 words]

9:35 pm / 13th September 2019 / regex, sqlite, twitter, datasette, projects, weeknotes

My JSK Fellowship: Building an open source ecosystem of tools for data journalism

I started a new chapter of my career last week: I began a year long fellowship with the John S. Knight Journalism Fellowships program at Stanford.

[... 876 words]

11:29 pm / 10th September 2019 / careers, journalism, open-source, datasette, jsk, data-journalism, personalnews

healthkit-to-sqlite. Ever since I got an Apple Watch I’ve been itching to get my hands on the step tracking and health data that it’s been collecting for me. I know it’s there in a SQLite database on my wrist, but I couldn’t figure out how to get it! A few days ago I stumbled across the “Export Health Data” button in the iOS Health app, and it turns out it creates a zip file containing XML with a full dump of the data collected by Apple Health. healthkit-to-sqlite is the tool I’ve built that can read that export and use it to create a SQLite database ready to be queried and explored with Datasette. It’s a pretty basic implementation but it’s already giving me access to over 3 million rows of data. Lots of potential here for interesting work with personal analytics.

# 22nd July 2019, 3:34 am / dogsheep, health, datasette, sqlite

Unlocking the Department of State’s foreign military training data for good this time (via) I’m so excited about this: Security Force Monitor used Datasette to publish a 200,000 row database of training engagements between the US military and foreign military units, based on their own massive efforts to clean up the official data (from thousands of PDF files). This is pretty much my dream use-case for Datasette, and their future goals are inspiring: “Our hope is that when the next report arrives in a short few months, we will be able to turn it into machine readable data and pass it around the sector in minutes, rather than months.”

# 20th July 2019, 4:42 am / datasette

Single sign-on against GitHub using ASGI middleware

I released Datasette 0.29 last weekend, the first version of Datasette to be built on top of ASGI (discussed previously in Porting Datasette to ASGI, and Turtles all the way down).

[... 1,612 words]

1:18 am / 14th July 2019 / asgi, middleware, security, datasette, projects, github

datasette-cors (via) My other Datasette ASGI plugin: this one wraps my asgi-cors project and lets you configure CORS access from a list of domains (or a set of domain wildcards) so you can make JavaScript calls to a Datasette instance from a specific set of other hosts.

# 8th July 2019, 4:30 am / projects, asgi, datasette, cors

datasette-auth-github (via) My first big ASGI plugin for Datasette: datasette-auth-github adds the ability to require users to authenticate against the GitHub OAuth API. You can whitelist specific users, or you can restrict access to members of specific GitHub organizations or teams. While it’s structured as a Datasette plugin it also includes ASGI middleware which can be applied to any ASGI application.

# 8th July 2019, 4:28 am / asgi, oauth, datasette, projects, github

Datasette 0.29 (via) I shipped Datasette 0.29! • ASGI all the way down! Plus a new asgi_wrapper plugin hook letting plugins do all kinds of powerful new things • New mechanism for secret plugin configuration options • Facet by date • ?_through= for joins through m2m tables. Much more.

# 8th July 2019, 4:26 am / projects, datasette

db-to-sqlite 1.0 release. I’ve released version 1.0 of my db-to-sqlite tool, which lets you create a SQLite database copy of any database supported by SQLAlchemy (I’ve tested it against MySQL and PostgreSQL). The tool has a bunch of new features: you can use --redact to redact specific columns, specify --table multiple times to copy a subset of tables, and the --all option now efficiently adds all foreign keys at the end of the import. The project now has unit tests which run against MySQL and PostgreSQL in Travis CI. Also included in the README: a shell one-liner for creating a local SQLite copy of a remote Heroku Postgres database based on extracting the connection string from a Heroku config environment variable.

# 1st July 2019, 1:35 am / projects, datasette, sqlite, mysql, postgresql, heroku

«« first « previous page 12 / 15 next » last »»