4 items tagged “zerodowntime”
2023
Database Migrations. Vadim Kravcenko provides a useful, in-depth description of the less obvious challenges of applying database migrations successfully. Vadim uses and likes Django’s migrations (as do I) but notes that running them at scale still involves a number of thorny challenges.
The biggest of these, which I’ve encountered myself multiple times, is that if you want truly zero downtime deploys you can’t guarantee that your schema migrations will be deployed at the exact same instant as changes you make to your application code.
This means all migrations need to be forward-compatible: you need to apply a schema change in a way that your existing code will continue to work error-free, then ship the related code change as a separate operation.
Vadim describes what this looks like in detail for a number of common operations: adding a field, removing a field and changing a field that has associated business logic implications. He also discusses the importance of knowing when to deploy a dual-write strategy. # 1st October 2023, 11:55 pm
MRSK. A new open source web application deployment tool from 37signals, developed to help migrate their Hey webmail app out of the cloud and onto their own managed hardware. The key feature is one that I care about deeply: it enables zero-downtime deploys by running all traffic through a Traefik reverse proxy in a way that allows requests to be paused while a new deployment is going out—so end users get a few seconds delay on their HTTP requests before being served by the replaced application. # 29th April 2023, 11:54 pm
2020
Zero Downtime Release: Disruption-free Load Balancing of a Multi-Billion User Website (via) I remain fascinated by techniques for zero downtime deployment—once you have it working it makes shipping changes to your software so much less stressful, which means you can iterate faster and generally be much more confident in shipping code.
Facebook have invested vast amounts of effort into getting this right, and their new paper for the ACM SIGCOMM conference goes into detail about how it all works. # 5th August 2020, 3:27 am
2010
Zero-downtime Redis upgrade discussion. GitHub have a short window of scheduled downtime in order to upgrade their Redis server. I asked in their comments if they’d considered trying to run the upgrade with no downtime at all using Redis replication, and Ryan Tomayko has posted some interesting replies. # 28th May 2010, 2:50 pm