Simon Willison’s Weblog

29 items tagged “git”

2020

Commits are snapshots, not diffs (via) Useful, clearly explained revision of some Git fundamentals. # 17th December 2020, 10:01 pm

nyt-2020-election-scraper. Brilliant application of git scraping by Alex Gaynor and a growing team of contributors. Takes a JSON snapshot of the NYT’s latest election poll figures every five minutes, then runs a Python script to iterate through the history and build an HTML page showing the trends, including what percentage of the remaining votes each candidate needs to win each state. This is the perfect case study in why it can be useful to take a “snapshot if the world right now” data source and turn it into a git revision history over time. # 6th November 2020, 2:24 pm

Git scraping: track changes over time by scraping to a Git repository

Git scraping is the name I’ve given a scraping technique that I’ve been experimenting with for a few years now. It’s really effective, and more people should use it.

[... 941 words]

Weeknotes: Rocky Beaches, Datasette 0.48, a commit history of my database

This week I helped Natalie launch Rocky Beaches, shipped Datasette 0.48 and several releases of datasette-graphql, upgraded the CSRF protection for datasette-upload-csvs and figured out how to get a commit log of changes to my blog by backing up its database to a GitHub repository.

[... 1294 words]

Weeknotes: cookiecutter templates, better plugin documentation, sqlite-generate

I spent this week spreading myself between a bunch of smaller projects, and finally getting familiar with cookiecutter. I wrote about my datasette-plugin cookiecutter template earlier in the week; here’s what else I’ve been working on.

[... 703 words]

Web apps are typically continuously delivered, not rolled back, and you don’t have to support multiple versions of the software running in the wild. This is not the class of software that I had in mind when I wrote the blog post 10 years ago. If your team is doing continuous delivery of software, I would suggest to adopt a much simpler workflow (like GitHub flow) instead of trying to shoehorn git-flow into your team.

Vincent Driessen # 14th May 2020, 1:49 pm

Weeknotes: Archiving coronavirus.data.gov.uk, custom pages and directory configuration in Datasette, photos-to-sqlite

I mainly made progress on three projects this week: Datasette, photos-to-sqlite and a cleaner way of archiving data to a git repository.

[... 1132 words]

Goodbye Zeit Now v1, hello datasette-publish-now—and talking to myself in GitHub issues

This week I’ve been mostly dealing with the finally announced shutdown of Zeit Now v1. And having long-winded conversations with myself in GitHub issues.

[... 2049 words]

2019

Repository driven development (via) I’m already a big fan of keeping documentation and code in the same repo so you can update them both from within the same code review, but this takes it even further: in repository driven development every aspect of the code and configuration needed to define, document, test and ship a service live in the service repository—all the way down to the configurations for reporting dashboards. This sounds like heaven. # 24th July 2019, 8:41 am

Generating a commit log for San Francisco’s official list of trees

San Francisco has a neat open data portal (as do an increasingly large number of cities these days). For a few years my favourite file on there has been Street Tree List, a list of all 190,000 trees in the city maintained by the Department of Public Works.

[... 1051 words]

2018

isomorphic-git (via) A pure-JavaScript implementation of the git protocol and underlying tools which works both server-side (Node.js) AND in the client, using an emulation of the fs API. Given the right CORS headers it can clone a GitHub repository over HTTPS right into your browser. Impressive. # 16th May 2018, 8:54 pm

Telling stories through your commits. Joel Chippendale’s excellent guide to writing a useful commit history. I spend a lot of time on my commit messages, because when I’m trying to understand code later on they are the only form of documentation that is guaranteed to remain up-to-date against the code at that exact point of time. These tips are clear, concise, teadabale and include some great examples. # 13th January 2018, 7:44 pm

2017

Anyone that has me on too high of a pedestal should see me fumbling around with git.

John Carmack # 12th November 2017, 3:50 pm

Exploding Git Repositories. Kate Murphy describes how git is vulnerable to a similar attack to the XML “billion laughs” recursive entity expansion attack—you can create a tiny git repository that acts as a “git bomb”, expanding 12 root objects to over a billion files using recursive blob references. # 12th October 2017, 7:43 pm

2013

What are the differences between “forking,” “cloning,” and downloading the project as a zip file on GitHub?

“fork” creates a copy of the project hosted on your own GitHub account. This is an exclusive Build software better, together. (links to: http://Github.com) feature and not a Git feature.

[... 98 words]

2012

Should I use Dropbox instead of Git for 2 coders? In terms of going really fast and working on things at the same time, I’m thinking it may be uber productive to use Dropbox for it’s instant syncing instead of Git/Github. What are the pros/cons?

Dropbox is definitely the wrong tool for this—you’ll find yourself running in to all sorts of weird problems very quickly if you attempt to use it this way.

[... 119 words]

2010

GitHub: Announcing SVN Support. The best kind of April Fool’s joke: one that works. It’s read-only, but that’s good enough to support referencing GitHub repositories from SVN externals. # 1st April 2010, 11:33 am

A successful Git branching model (via) This looks eminently sensible. The master branch is used for production-ready code, and is only updated by merging from either release branches or emergency hotfix branches. A develop branch is used for integration (from feature branches), and is branched to create release branches when a release is nearly ready. It’s all comprehensively documented and comes with some well-designed diagrams. # 20th January 2010, 7:30 pm

2009

Introducing the YUI 3 Gallery. Write a plugin for YUI3, BSD license it and sign a CLA and Yahoo! will push your module out to their CDN and make it loadable using the YUI().use() statement. They’re coordinating the submissions using GitHub. # 4th November 2009, 11:14 pm

How We Made GitHub Fast. Detailed overview of the new GitHub architecture. It’s a lot more complicated than I would have expected—lots of moving parts are involved in ensuring they can scale horizontally when they need to. Interesting components include nginx, Unicorn, Rails, DRBD, HAProxy, Redis, Erlang, memcached, SSH, git and a bunch of interesting new open source projects produced by the GitHub team such as BERT/Ernie and ProxyMachine. # 21st October 2009, 9:14 pm

homebrew. Exciting alternative to MacPorts for compiling software on OS X—homebrew avoids sudo and defines packages as simple Ruby scripts, shared and distributed using Git. # 21st September 2009, 6:51 pm

Fabric, Django, Git, Apache, mod_wsgi, virtualenv and pip deployment. I’m slowly working my way through this stack at the moment—next stop, fabric. # 28th July 2009, 11:56 am

djangopeople.net on GitHub. I’ve released the source code for Django People, the geographical community site developed last year by myself and Natalie Downe (it hasn’t otherwise been touched since April last year, so it needs porting to Django 1.1). If you want a new feature on the site, implement it and I’ll see about merging it in. # 4th May 2009, 6:12 pm

Dulwich. A pure Python implementation of the Git file format and protocols. Reinforces my impression that a key to Git’s success is stable, well designed and documented on-disk formats. # 16th February 2009, 10:27 pm

2008

Sam Vilain converted Perl’s history from Perforce to Git. [..] He spent more than a year building custom tools to transform 21 years of Perl history into the first ever unified repository of every single change to Perl. In addition to changes from Perforce, Sam patched together a comprehensive view of Perl’s history incorporating publicly available snapshot releases, changes from historical mailing list archives and patch sets recovered from the hard drives of previous Perl release engineers.

The Perl Foundation # 22nd December 2008, 6:06 pm

Maybe git is the monads of version control

Piers Cawley # 5th August 2008, 10:51 pm

Tailor. “Tailor is a tool to migrate or replicate changesets between ArX, Bazaar, Bazaar-NG, CVS, Codeville, Darcs, Git, Mercurial, Monotone, Subversion and Tla repositories.”—written in Python. # 24th June 2008, 9:59 am

Using Git as a versioned data store in Python. gitshelve supports the same interface as Python’s built-in shelve module but stores things to a versioned Git repository instead of just a pickled dictionary. I’ve been casually wondering what a Git-powered CMS would look like. # 15th May 2008, 3:25 pm

2007

A look back: Bram Cohen vs Linus Torvalds. Makes the case that Git has proved Linus Torvald correct on every point of his infamous debate with Bram Cohen back in 2005. # 17th July 2007, 10:29 pm