Simon Willison’s Weblog

Entries in Oct

Weeknotes: Niche Museums, Kepler, Trees and Streaks

Every now and then someone will ask “so when are you going to build Museums Near Me then?”, based on my obsession with niche museums and websites like www.owlsnearme.com.

[... 872 words]

Weeknotes: The Squirrel Census, Genome SQL query

This week was mostly about incremental improvements. And squirrels.

[... 911 words]

Weeknotes: PG&E outages, and Open Source works!

My big focus this week was the PG&E outages project. I’m really pleased with how this turned out: the San Francisco Chronicle used data from it for their excellent PG&E outage interactive (mixing in data on wind conditions) and it earned a bunch of interest on Twitter and some discussion on Hacker News.

[... 452 words]

Tracking PG&E outages by scraping to a git repo

PG&E have cut off power to several million people in northern California, supposedly as a precaution against wildfires.

[... 833 words]

Weeknotes: Dogsheep

Having figured out my Stanford schedule, this week I started getting back into the habit of writing some code.

[... 1367 words]

Automatically playing science communication games with transfer learning and fastai

This weekend was the 9th annual Science Hack Day San Francisco, which was also the 100th Science Hack Day held worldwide.

[... 1174 words]

How to Instantly Publish Data to the Internet with Datasette

I spoke about my Datasette project at PyBay in August and they’ve just posted the video of my talk.

[... 58 words]

How I moderated the State of Django panel at DjangoCon US.

On Wednesday last week I moderated the State of Django panel as the closing session for DjangoCon US 2018.

[... 1210 words]

The interesting ideas in Datasette

Datasette (previously) is my open source tool for exploring and publishing structured data. There are a lot of ideas embedded in Datasette. I realized that I haven’t put many of them into writing.

[... 2857 words]

Late night dining near Great American Music Hall

Tommy’s Joynt is a couple of blocks away and is a San Francisco institution—great comfort food, inexpensive, crammed with personality and open late.

[... 40 words]

Porting my blog to Python 3

This blog is now running on Python 3! Admittedly this is nearly nine years after the first release of Python 3.0, but it’s the first Python 3 project I’ve deployed myself so I’m pretty excited about it.

[... 883 words]

How to set up world-class continuous deployment using free hosted tools

I’m going to describe a way to put together a world-class continuous deployment infrastructure for your side-project without spending any money.

[... 1273 words]

Deploying an asynchronous Python microservice with Sanic and Zeit Now

Back in 2008 Natalie Downe and I deployed what today we would call a microservice: json-head, a tiny Google App Engine app that allowed you to make an HTTP HEAD request against a URL and get back the HTTP headers as JSON. One of our initial use-cases for this was Natalie’s addSizes.js, an unobtrusive jQuery script that could annotate links to PDFs and other large files with their corresponding file size pulled from the Content-Length header. Another potential use-case is detecting broken links, since the API can be used to spot 404 status codes (as in this example).

[... 1361 words]
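
To make the json-head idea concrete, here is a minimal sketch using only the Python standard library. It is not the original App Engine code or the Sanic version: the function names `summarize_response` and `head_as_json`, the lowercased header keys, and the shape of the returned JSON are all assumptions made for illustration.

```python
import json
import urllib.error
import urllib.request


def summarize_response(status, headers):
    """Build a JSON-serializable summary from a status code and (name, value) pairs.

    Hypothetical helper: header names are lowercased so lookups do not
    depend on how the server capitalized them.
    """
    header_dict = {name.lower(): value for name, value in headers}
    length = header_dict.get("content-length")
    return {
        "status": status,
        # Anything below 400 is treated as a working link; 404 and friends are not.
        "ok": status < 400,
        "headers": header_dict,
        # File size in bytes, or None if the server did not send a usable value.
        "content_length": int(length) if length and length.isdigit() else None,
    }


def head_as_json(url, timeout=10):
    """Send an HTTP HEAD request to `url` and return the summary as a JSON string."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            summary = summarize_response(response.status, response.getheaders())
    except urllib.error.HTTPError as error:
        # urllib raises HTTPError for 4xx/5xx, but the status and headers are still available.
        summary = summarize_response(error.code, error.headers.items())
    return json.dumps(summary, indent=2)
```

Calling `head_as_json` on a link to a PDF would give a script like addSizes.js the `content_length` it needs, and a `status` of 404 with `ok` set to false flags a broken link.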

Changelogs to help understand the fires in the North Bay

The situation in the counties north of San Francisco is horrifying right now. I’ve repurposed some of the tools I built for the Irma Response project last month to collect and track some data that might be of use to anyone trying to understand what’s happening up there. I’m sharing these now in the hope that they might prove useful.

[... 383 words]

Recovering missing content from the Internet Archive

When I restored my blog last weekend I used the most recent SQL backup of my blog’s database from back in 2010. I thought it had all of my content from before I started my 7 year hiatus, but in watching the 404 logs I started seeing the occasional hit to something that really should have been there but wasn’t. Turns out the SQL backup I was working from was missing some content.

[... 636 words]

Should I build my startup’s web-based product as if it’s going to one day be widely adopted and experience high-volume, or instead focus on quick delivery over scalability?

Absolutely the second: build for rapid learning, not for eventual scalability. The vast majority of startups fail, and the number one reason they fail is that they didn’t achieve product-market fit: they failed to build something that customers actually wanted.

[... 169 words]

Implementing faceted search with Django and PostgreSQL

I’ve added a faceted search engine to this blog, powered by PostgreSQL. It supports regular text search (proper search, not just SQL “like” queries), filter by tag, filter by date, filter by content type (entries vs blogmarks vs quotations) and any combination of the above. Some example searches:

[... 3049 words]

Running gunicorn behind nginx on Heroku for buffering and logging

Heroku’s default setup for Django uses the gunicorn application server. Each Heroku dyno can only run a limited number of gunicorn workers, which means a limited number of requests can be served in parallel (around 4 per dyno is a good rule of thumb).

[... 400 words]

Getting the blog back together

Getting this blog up and running again has turned out to be one of those side-projects that keeps threatening to fall down a rabbit hole.

[... 160 words]

Tell me when to quit.

If you want to get into Person of Interest without having to wade through the less great monster-of-the-week episodes in season one, Io9 has an episode guide listing the season one episodes essential to the overall plot arc. I think it’s solid gold from season two onwards—by far the smartest fictional depiction of artificial intelligence I’ve seen anywhere.

[... 112 words]

Practical gift ideas to positively improve a friend’s life and hobbies

I’m a big fan of the Dorling Kindersley travel books, which are chock full of photos, maps, diagrams and illustrations. Thanks to the internet there’s really not much point carting around a reference-style guidebook like Lonely Planet—TripAdvisor etc will always be more comprehensive and up-to-date. This makes guidebooks more important for general inspiration and browsing.

[... 75 words]

Probably need to GTFO [another how do I spin leaving question]

It strikes me that the core problem here is that your current company’s runway is privileged information: your current employer doesn’t need rumors about their financial health to start spreading. I think your instinct to avoid straight out saying that to other companies is very reasonable.

[... 132 words]

Where should we stay in Nashville?

For a delightfully unique Nashville experience, I suggest looking up Santa’s Pub. It’s a dive bar in a double wide trailer run by a man who looks a bit like Santa, and every night there is karaoke night. Aside from being a bit smokey it’s an enormous amount of fun.

[... 63 words]

Japanese pantry staples?

These answers are fantastic! I’m so glad I asked here. Thank you all very much.

[... 46 words]

Difficulty level: Eating dumplings

When I lived in Islington a few years ago I really liked New Culture Revolution for dumplings. It was never busy, not very expensive and the food was great.

[... 40 words]

My contract with web developer says a) she will create a unique description for each page, AND b) that she is not responsible for writing or inputting any content. Now she wants me to compose the page descriptions, citing (b). What do you think? Edit...

You should write the copy. Copy is important—you don’t want it to be written by someone who believes it isn’t even their responsibility, and will hence probably do a poor job of it.

[... 93 words]

Is it common for a Silicon Valley startup to give employees free shares, or just the option to purchase shares?

The problem with giving people shares outright is that they have to pay tax on them. If the company later goes bust (as many do) and the shares hence prove to be worthless, the employee has paid tax on something that has no actual value.

[... 147 words]

Can I promote an online/virtual/webinar event through Meetup?

No. That’s specifically mentioned in Meetup’s Community Guidelines as something Meetup should not be used for.

[... 54 words]

Should I ever use GIF image format for non animated elements?

These days probably not—anything non-animated that you would use a GIF for is generally better as a PNG.

[... 37 words]

How was FriendFeed’s schema less db faster than pure MySQL?

The principal reason they switched to a schemaless DB was to work around the challenges of having to make schema changes in MySQL, which can lock the table and take hours if not days to complete in large tables.

[... 115 words]