Simon Willison’s Weblog

22 items tagged “dogsheep”


Personal Data Warehouses: Reclaiming Your Data

I gave a talk yesterday about personal data warehouses for GitHub’s OCTO Speaker Series, focusing on my Datasette and Dogsheep projects. The video of the talk is now available, and I’m presenting that here along with an annotated summary of the talk, including links to demos and further information.

[... 5150 words]

OCTO Speaker Series: Simon Willison—Personal Data Warehouses: Reclaiming Your Data. I’m giving a talk in the GitHub OCTO (Office of the CTO) speaker series about Datasette and my Dogsheep personal analytics project. You can register for free here—the stream will be on Thursday November 12, 2020 at 8:30am PST (4:30pm GMT). # 23rd October 2020, 3 am

Dogsheep: Personal analytics with Datasette. The second edition of my new Datasette Weekly newsletter, talks about Dogsheep, Dogsheep Beta, Datasette 1.0 and features datasette-cluster-map as the plugin of the week. # 19th October 2020, 4:38 pm

Building an Evernote to SQLite exporter

I’ve been using Evernote for over a decade, and I’ve long wanted to export my data from it so I can do interesting things with it.

[... 1879 words]

evernote-to-sqlite (via) The latest tool in my Dogsheep series of utilities for personal analytics: evernote-to-sqlite takes Evernote note exports en their ENEX XML format and loads them into a SQLite database. Embedded images are loaded into a BLOB column and the output of their cloud-based OCR system is added to a full-text search index. Notes have a latitude and longitude which means you can visualize your notes on a map using Datasette and datasette-cluster-map. # 12th October 2020, 12:38 am

Weeknotes: airtable-export, generating screenshots in GitHub Actions, Dogsheep!

This week I figured out how to populate Datasette from Airtable, wrote code to generate social media preview card page screenshots using Puppeteer, and made a big breakthrough with my Dogsheep project.

[... 1461 words]

Serving photos locally with datasette-media. datasette-media is a new Datasette plugin which can serve static files from disk in response to a configured SQL query that maps incoming URL parameters to a path to a file. I built it so I could run dogsheep-photos locally on my laptop and serve up thumbnails of images that match particular queries. I’ve added documentation to the dogsheep-photos README explaining how to use datasette-media, datasette-json-html and datasette-template-sql to create custom interfaces onto Apple Photos data on your machine. # 26th May 2020, 3:53 pm

Using SQL to find my best photo of a pelican according to Apple Photos

According to the Apple Photos internal SQLite database, this is the most aesthetically pleasing photograph I have ever taken of a pelican:

[... 1937 words]

Weeknotes: Datasette 0.41, photos breakthroughs

Shorter weeknotes this week, because my main project for the week warrants a detailed write-up on its own (coming soon... update 21st May here it is).

[... 867 words]

github-to-sqlite 2.2 highlights thread. I released github-to-sqlite 2.2 today with a new “stargazers” command for importing users who have starred one or more specific repositories. This Twitter thread lists highlights of recent releases and links to a live Datasette demo that shows what the tool can do. # 2nd May 2020, 10:16 pm

Weeknotes: Datasette 0.40, various projects, Dogsheep photos

A new release of Datasette, two new projects and progress towards a Dogsheep photos solution.

[... 826 words]

Weeknotes: Hacking on 23 different projects

I wrote a lot of code this week: 184 commits over 23 repositories! I’ve also started falling for Zeit Now v2, having found workarounds for some of my biggest problems with it.

[... 901 words]

Weeknotes: Covid-19, First Python Notebook, more Dogsheep, Tailscale

My project publishes information on COVID-19 cases around the world. The project started out using data from Johns Hopkins CSSE, but last week the New York Times started publishing high quality USA county- and state-level daily numbers to their own repository. Here’s the change that added the NY Times data.

[... 993 words]

Weeknotes: Datasette 0.39 and many other projects

This week’s theme: Well, I’m not going anywhere. So a ton of progress to report on various projects.

[... 806 words]

hacker-news-to-sqlite (via) The latest in my Dogsheep series of tools: hacker-news-to-sqlite uses the Hacker News API to fetch your comments and submissions from Hacker News and save them to a SQLite database. # 21st March 2020, 4:27 am


pinboard-to-sqlite (via) Jacob Kaplan-Moss just released the second Dogsheep tool that wasn’t written by me (after goodreads-to-sqlite by Tobias Kunze)—this one imports your Pinterest bookmarks. The repo includes a really clean minimal example of how to use GitHub actions to run tests and release packages to PyPI. # 7th November 2019, 8:46 pm

Weeknotes: PG&E outages, and Open Source works!

My big focus this week was the PG&E outages project. I’m really pleased with how this turned out: the San Francisco Chronicle used data from it for their excellent PG&E outage interactive (mixing in data on wind conditions) and it earned a bunch of interest on Twitter and some discussion on Hacker News.

[... 452 words]

goodreads-to-sqlite (via) This is so cool! Tobias Kunze built a Python CLI tool to import your Goodreads data into a SQLite database, inspired by github-to-sqlite and my various other Dogsheep tools. It’s the first Dogsheep style tool I’ve seen that wasn’t built by me—and Tobias’ write-up includes some neat examples of queries you can run against your Goodreads data. I’ve now started using Goodreads and I’m importing my books into my own private Dogsheep Datasette instance. # 14th October 2019, 4:07 am

Weeknotes: Dogsheep

Having figured out my Stanford schedule, this week I started getting back into the habit of writing some code.

[... 1367 words]

twitter-to-sqlite 0.6, with track and follow. I shipped a new release of my twitter-to-sqlite command-line tool this evening. It now includes experimental features for subscribing to the Twitter streaming API: you can track keywords or follow users and matching Tweets will be written to a SQLite database in real-time as they come in through the API. Since Datasette supports mutable databases now you can run Datasette against the database and run queries against the tweets as they are inserted into the tables. # 6th October 2019, 4:54 am

Client-Side Certificate Authentication with nginx. I’m intrigued by client-side browser certificates, which allow you to lock down a website such that only browsers with a specific certificate installed can access them. They work on both laptops and mobile phones. I followed the steps in this tutorial and managed to get an nginx instance running which only allows connections from my personal laptop and iPhone. # 5th October 2019, 5:26 pm

healthkit-to-sqlite. Ever since I got an Apple Watch I’ve been itching to get my hands on the step tracking and health data that it’s been collecting for me. I know it’s there in a SQLite database on my wrist, but I couldn’t figure out how to get it! A few days ago I stumbled across the “Export Health Data” button in the iOS Health app, and it turns out it creates a zip file containing XML with a full dump of the data collected by Apple Health. healthkit-to-sqlite is the tool I’ve built that can read that export and use it to create a SQLite database ready to be queried and explored with Datasette. It’s a pretty basic implementation but it’s already giving me access to over 3 million rows of data. Lots of potential here for interesting work with personal analytics. # 22nd July 2019, 3:34 am