Simon Willison’s Weblog

Subscribe

31 items tagged “dogsheep”

2023

apple-notes-to-sqlite (via) With the help of ChatGPT I finally figured out just enough AppleScript to automate the export of my notes to a SQLite database. AppleScript is a notoriously read-only language, which is turns out makes it a killer app for LLM-assisted coding. # 9th March 2023, 6:04 am

2022

Google Drive to SQLite

I released a new tool this week: google-drive-to-sqlite. It’s a CLI utility for fetching metadata about files in your Google Drive and writing them to a local SQLite database.

[... 1221 words]

2021

Notes on Notes.app. Apple’s Notes app keeps its data in a SQLite database at ~/Library/Group\ Containers/group.com.apple.notes/NoteStore.sqlite—but it’s pretty difficult to extract data from. It turns out the note text is stored as a gzipped protocol buffers object in the ZICNOTEDATA.ZDATA column. Steve Dunham did the hard work of figuring out how it all works—the complexity stems from Apple’s use of CRDT’s to support seamless multiple edits from different devices. # 9th December 2021, 10:39 pm

Weeknotes: Getting my personal Dogsheep up and running again

I gave a talk about Dogsheep at Noisebridge’s Five Minutes of Fame on Thursday. Just one problem: my regular Dogsheep demo was broken, so I ended up building it from scratch again. In doing so I fixed a few bugs in some Dogsheep tools.

[... 1311 words]

Datasette—an ecosystem of tools for working with small data

This is the transcript and video from a talk I gave at PyGotham 2020 about using SQLite, Datasette and Dogsheep to work with small data.

[... 4654 words]

How to secure an Ubuntu server using Tailscale and UFW. This is the Tailscale tutorial I’ve always wanted: it explains in detail how you can run an Ubuntu server (from any cloud provider) such that only devices on your personal Tailscale network can access it. # 26th February 2021, 8:31 pm

2020

Building a search engine for datasette.io

This week I added a search engine to datasette.io, using the search indexing tool I’ve been building for Dogsheep.

[... 1391 words]

The case against client certificates (via) Colm MacCárthaigh provides a passionately argued Twitter thread about client certificates and why they should be avoided. I tried using them as an extra layer of protection fir my personal Dogsheep server and ended up abandoning them—certificate management across my devices was too fiddly. # 9th December 2020, 2:41 pm

Weeknotes: github-to-sqlite workflows, datasette-ripgrep enhancements, Datasette 0.52

This week: Improvements to datasette-ripgrep, github-to-sqlite and datasette-graphql, plus Datasette 0.52 and a flurry of dot-releases.

[... 1123 words]

Personal Data Warehouses: Reclaiming Your Data

I gave a talk yesterday about personal data warehouses for GitHub’s OCTO Speaker Series, focusing on my Datasette and Dogsheep projects. The video of the talk is now available, and I’m presenting that here along with an annotated summary of the talk, including links to demos and further information.

[... 5166 words]

OCTO Speaker Series: Simon Willison—Personal Data Warehouses: Reclaiming Your Data. I’m giving a talk in the GitHub OCTO (Office of the CTO) speaker series about Datasette and my Dogsheep personal analytics project. You can register for free here—the stream will be on Thursday November 12, 2020 at 8:30am PST (4:30pm GMT). # 23rd October 2020, 3 am

Dogsheep: Personal analytics with Datasette. The second edition of my new Datasette Weekly newsletter, talks about Dogsheep, Dogsheep Beta, Datasette 1.0 and features datasette-cluster-map as the plugin of the week. # 19th October 2020, 4:38 pm

Building an Evernote to SQLite exporter

I’ve been using Evernote for over a decade, and I’ve long wanted to export my data from it so I can do interesting things with it.

[... 1879 words]

evernote-to-sqlite (via) The latest tool in my Dogsheep series of utilities for personal analytics: evernote-to-sqlite takes Evernote note exports en their ENEX XML format and loads them into a SQLite database. Embedded images are loaded into a BLOB column and the output of their cloud-based OCR system is added to a full-text search index. Notes have a latitude and longitude which means you can visualize your notes on a map using Datasette and datasette-cluster-map. # 12th October 2020, 12:38 am

Weeknotes: airtable-export, generating screenshots in GitHub Actions, Dogsheep!

This week I figured out how to populate Datasette from Airtable, wrote code to generate social media preview card page screenshots using Puppeteer, and made a big breakthrough with my Dogsheep project.

[... 1461 words]

Serving photos locally with datasette-media. datasette-media is a new Datasette plugin which can serve static files from disk in response to a configured SQL query that maps incoming URL parameters to a path to a file. I built it so I could run dogsheep-photos locally on my laptop and serve up thumbnails of images that match particular queries. I’ve added documentation to the dogsheep-photos README explaining how to use datasette-media, datasette-json-html and datasette-template-sql to create custom interfaces onto Apple Photos data on your machine. # 26th May 2020, 3:53 pm

Using SQL to find my best photo of a pelican according to Apple Photos

According to the Apple Photos internal SQLite database, this is the most aesthetically pleasing photograph I have ever taken of a pelican:

[... 1937 words]

Weeknotes: Datasette 0.41, photos breakthroughs

Shorter weeknotes this week, because my main project for the week warrants a detailed write-up on its own (coming soon... update 21st May here it is).

[... 867 words]

github-to-sqlite 2.2 highlights thread. I released github-to-sqlite 2.2 today with a new “stargazers” command for importing users who have starred one or more specific repositories. This Twitter thread lists highlights of recent releases and links to a live Datasette demo that shows what the tool can do. # 2nd May 2020, 10:16 pm

Weeknotes: Datasette 0.40, various projects, Dogsheep photos

A new release of Datasette, two new projects and progress towards a Dogsheep photos solution.

[... 826 words]

Weeknotes: Hacking on 23 different projects

I wrote a lot of code this week: 184 commits over 23 repositories! I’ve also started falling for Zeit Now v2, having found workarounds for some of my biggest problems with it.

[... 901 words]

Weeknotes: Covid-19, First Python Notebook, more Dogsheep, Tailscale

My covid-19.datasettes.com project publishes information on COVID-19 cases around the world. The project started out using data from Johns Hopkins CSSE, but last week the New York Times started publishing high quality USA county- and state-level daily numbers to their own repository. Here’s the change that added the NY Times data.

[... 993 words]

Weeknotes: Datasette 0.39 and many other projects

This week’s theme: Well, I’m not going anywhere. So a ton of progress to report on various projects.

[... 806 words]

hacker-news-to-sqlite (via) The latest in my Dogsheep series of tools: hacker-news-to-sqlite uses the Hacker News API to fetch your comments and submissions from Hacker News and save them to a SQLite database. # 21st March 2020, 4:27 am

2019

pinboard-to-sqlite (via) Jacob Kaplan-Moss just released the second Dogsheep tool that wasn’t written by me (after goodreads-to-sqlite by Tobias Kunze)—this one imports your Pinterest bookmarks. The repo includes a really clean minimal example of how to use GitHub actions to run tests and release packages to PyPI. # 7th November 2019, 8:46 pm

Weeknotes: PG&E outages, and Open Source works!

My big focus this week was the PG&E outages project. I’m really pleased with how this turned out: the San Francisco Chronicle used data from it for their excellent PG&E outage interactive (mixing in data on wind conditions) and it earned a bunch of interest on Twitter and some discussion on Hacker News.

[... 452 words]

goodreads-to-sqlite (via) This is so cool! Tobias Kunze built a Python CLI tool to import your Goodreads data into a SQLite database, inspired by github-to-sqlite and my various other Dogsheep tools. It’s the first Dogsheep style tool I’ve seen that wasn’t built by me—and Tobias’ write-up includes some neat examples of queries you can run against your Goodreads data. I’ve now started using Goodreads and I’m importing my books into my own private Dogsheep Datasette instance. # 14th October 2019, 4:07 am

Weeknotes: Dogsheep

Having figured out my Stanford schedule, this week I started getting back into the habit of writing some code.

[... 1367 words]

twitter-to-sqlite 0.6, with track and follow. I shipped a new release of my twitter-to-sqlite command-line tool this evening. It now includes experimental features for subscribing to the Twitter streaming API: you can track keywords or follow users and matching Tweets will be written to a SQLite database in real-time as they come in through the API. Since Datasette supports mutable databases now you can run Datasette against the database and run queries against the tweets as they are inserted into the tables. # 6th October 2019, 4:54 am

Client-Side Certificate Authentication with nginx. I’m intrigued by client-side browser certificates, which allow you to lock down a website such that only browsers with a specific certificate installed can access them. They work on both laptops and mobile phones. I followed the steps in this tutorial and managed to get an nginx instance running which only allows connections from my personal laptop and iPhone. # 5th October 2019, 5:26 pm