Simon Willison’s Weblog

Weeknotes: Improv at Stanford, planning Datasette Cloud

Last week was the first week of the quarter at Stanford—which is called “shopping week” here because students are expected to try different classes to see which ones they are going to stick with.

I’ve settled on three classes this quarter: Beginning Improvising, Designing Machine Learning and Entrepreneurship: Formation of New Ventures.

Beginning Improvising is the Stanford improv theater course. It’s a big time commitment: three two-hours sessions a week for ten weeks is nearly 60 hours of improv!

It’s already proving to be really interesting though: it turns out the course is a thinly disguised applied psychology course.

Improv is about creating a creative space for other people to shine. The applications to professional teamwork are obvious and fascinating to me. I’ll probably write more about this as the course continues.

Designing Machine Learning is a class at the Stanford d.School taught by Michelle Carney and Emily Callaghan. It focuses on multidisciplinary applications of machine learning, mixing together students from many different disciplines around Stanford.

I took a fast.ai deep learning course last year which gave me a basic understanding of the code size of neural networks, but I’m much more interestind in figuring out applications so this seems like a much more interesting option than a more code-focused course.

The class started out building some initial models using Google’s Teachable Machine tool, which is fascinating. It lets you train transfer learning models for image, audio and posture recognition entirely in your browser—no data is transferred to Google’s servers at all. You can then export those models and use them with a variety of different libraries—I’ve got them to work with both JavaScript and Python already.

I’m taking Entrepreneurship: Formation of New Ventures because of the rave reviews I heard from other JSK fellows who took it last quarter. It’s a classic case-study business school class: each session features a guest speaker who is a successful entrepreneur, and the class discusses their case for the first two thirds of the section while they listen in—then finds out how well the discussion matched to what actually happened.

Planning Datasette Cloud

Shopping week kept me pretty busy so I’ve not done much actual development over the past week, but I have started planning out and researching my next major project, which I’m currently calling Datasette Cloud.

Datasette Cloud will be an invite-only hosted SaaS version of Datasette. It’s designed to help get news organizations on board with the software without having to talk them through figuring out their own hosting, so I can help them solve real problems and learn more about how the ecosystem should evolve to support them.

I’d love to be able to run this on serverless hosting platforms like Google Cloud Run or Heroku, but sadly those tools aren’t an option to me due to a key problem: I’m trying to build a stateful service (SQLite databases need to live on a local disk) in 2020.

I posed this challenge on Twitter back in October:

I’ve been exploring my options since then, and I think I’ve settled on a decidedly 2010-era way of doing this: I’m going to run my own instances! So I’ve been exploring hosting Datasette on both AWS Lightsail and Digital Ocean Droplets over the past few months.

My current plan is to have each Datasette Cloud account run as a Datasette instance in its own Docker container, primarily to ensure filesystem isolation: different accounts must not be able to see each other’s database files.

I started another discussion about this on Twitter and had several recommendations for Traefik as a load balancer for assigning hostnames to different Docker containers, which is exactly what I need to do.

So this afternoon I made my way through Digital Ocean’s outstanding tutorial How To Use Traefik as a Reverse Proxy for Docker Containers on Ubuntu 18.04 and I think I’ve convinced myself that this is a smart way forward.

So, mostly a research week but I’ve got a solid plan for my next steps.

This week’s Niche Museums

I also finally got around to implementing a map.

This is Weeknotes: Improv at Stanford, planning Datasette Cloud by Simon Willison, posted on 14th January 2020.

Tagged , , , ,

Next: Deploying a data API using GitHub Actions and Cloud Run

Previous: Building a sitemap.xml with a one-off Datasette plugin