39 items tagged “git”
Figure out who’s leaving the company: dump, diff, repeat (via) Rachel Kroll describes a neat hack for companies with an internal LDAP server or similar machine-readable employee directory: run a cron somewhere internal that grabs the latest version and diffs it against the previous to figure out who has joined or left the company.
I suggest using Git for this—a form of Git scraping—as then you get a detailed commit log of changes over time effectively for free.
I really enjoyed Rachel’s closing thought: “Incidentally, if someone gets mad about you running this sort of thing, you probably don’t want to work there anyway. On the other hand, if you’re able to build such tools without IT or similar getting “threatened” by it, then you might be somewhere that actually enjoys creating interesting and useful stuff. Treasure such places. They don’t tend to last.” # 9th February 2024, 5:44 am
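The diff step of "dump, diff, repeat" can be sketched in a few lines of Python. This is only an illustration, not Rachel's implementation: the `diff_directory` function and the usernames are hypothetical, and it assumes each dump has already been reduced to a list of unique identifiers.

```python
def diff_directory(previous, current):
    """Compare two directory snapshots and report who joined and who left.

    Each snapshot is an iterable of unique identifiers (usernames here,
    which is an assumption; any stable key would work).
    """
    before, after = set(previous), set(current)
    joined = sorted(after - before)  # present now, absent last time
    left = sorted(before - after)    # present last time, absent now
    return joined, left


# Hypothetical snapshots from two consecutive runs of the cron job.
yesterday = ["alice", "bob", "carol"]
today = ["alice", "carol", "dave"]

joined, left = diff_directory(yesterday, today)
print("joined:", joined)
print("left:", left)
```

Committing each dump to a Git repository gives the same information through the commit log, without any custom comparison code.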
git log -L :path_with_format:__init__.py
Tracking SQLite Database Changes in Git (via) A neat trick from Garrit Franke that I hadn’t seen before: you can teach “git diff” how to display human readable versions of the differences between binary files with a specific extension using the following:
git config diff.sqlite3.binary true
git config diff.sqlite3.textconv "echo .dump | sqlite3"
That way you can store binary files in your repo but still get back SQL diffs to compare them.
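To show the kind of output this produces, here is a rough Python equivalent of comparing two SQL dumps. It is a sketch under assumptions: it uses the standard library's `Connection.iterdump()` rather than the `sqlite3` command-line tool, so the exact text may differ from what Git would display, and the `trees` table and its rows are made up. Git itself also needs a `.gitattributes` entry that maps the file extension to the `sqlite3` diff driver.

```python
import difflib
import sqlite3


def dump_sql(conn):
    """Return the database contents as a list of SQL statements."""
    return list(conn.iterdump())


def sql_diff(old_conn, new_conn):
    """Produce a unified diff between the SQL dumps of two databases."""
    lines = difflib.unified_diff(
        dump_sql(old_conn),
        dump_sql(new_conn),
        fromfile="old.db",
        tofile="new.db",
        lineterm="",
    )
    return "\n".join(lines)


# Two in-memory databases standing in for two committed versions of a file.
old_db = sqlite3.connect(":memory:")
old_db.execute("CREATE TABLE trees (id INTEGER PRIMARY KEY, species TEXT)")
old_db.execute("INSERT INTO trees VALUES (1, 'Oak')")
old_db.commit()

new_db = sqlite3.connect(":memory:")
new_db.execute("CREATE TABLE trees (id INTEGER PRIMARY KEY, species TEXT)")
new_db.execute("INSERT INTO trees VALUES (1, 'Oak')")
new_db.execute("INSERT INTO trees VALUES (2, 'Elm')")
new_db.commit()

print(sql_diff(old_db, new_db))
```

The added row shows up as a single `+INSERT` line in the diff, which is far easier to review than a notice that a binary file changed.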
For the last few years I’ve been trying to center my work around creating what I consider to be the Perfect Commit. This is a single commit that contains all of the following:[... 2019 words]
[... 1146 words]
A tiny CI system (via) Christian Ştefănescu shares a recipe for building a tiny self-hosted CI system using Git and Redis. A post-receive hook runs when a commit is pushed to the repo and uses redis-cli to push jobs to a list. Then a separate bash script runs a loop with a blocking “redis-cli blpop jobs” operation which waits for new jobs and then executes the CI job as a shell script. # 26th April 2022, 3:39 pm
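The worker half of that recipe is a loop around a blocking pop. The sketch below is not Christian's bash script: it swaps the Redis list for Python's in-process `queue.Queue` so that it runs without a Redis server, and the `run_worker` function, the job commands and the `None` stop marker are all assumptions made for illustration.

```python
import queue
import subprocess
import sys


def run_worker(jobs, results, stop_marker=None):
    """Block on the queue, run each job as a subprocess, record the outcome."""
    while True:
        job = jobs.get()  # blocks until a job arrives, in the spirit of BLPOP
        if job is stop_marker:
            break
        completed = subprocess.run(job, capture_output=True, text=True)
        results.append((job, completed.returncode, completed.stdout))


jobs = queue.Queue()
results = []

# Stand-ins for what a post-receive hook would push after each commit.
jobs.put([sys.executable, "-c", "print('tests passed')"])
jobs.put([sys.executable, "-c", "import sys; sys.exit(1)"])
jobs.put(None)  # stop marker so this example terminates

run_worker(jobs, results)
for job, returncode, stdout in results:
    print(returncode, stdout.strip())
```

A real deployment would loop forever and keep the queue in Redis, so that the hook and the worker can run as separate processes.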
I’ve been experimenting with a new variant of Git scraping this week which I’m calling Help scraping. The key idea is to track changes made to CLI tools over time by recording the output of their --help commands in a Git repository.
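The recording step might look something like this in Python. It is a minimal sketch rather than the script from the post: the `record_help` function, the output file naming and the use of the Python interpreter as the example tool are all assumptions, and committing the resulting file to Git is left as a separate step.

```python
import pathlib
import subprocess
import sys
import tempfile


def record_help(command, output_dir):
    """Run `command --help` and save the output to a text file in output_dir."""
    result = subprocess.run(command + ["--help"], capture_output=True, text=True)
    output_dir = pathlib.Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    path = output_dir / (pathlib.Path(command[0]).name + ".txt")
    # Some tools print their help text to stderr, so fall back to that.
    path.write_text(result.stdout or result.stderr)
    return path


# Example: record the help output of the running Python interpreter.
with tempfile.TemporaryDirectory() as tmp:
    saved = record_help([sys.executable], tmp)
    print(saved.name)
    print(saved.read_text().splitlines()[0])
```

Running this on a schedule inside a Git checkout and committing the files means any change to a tool's options appears as a diff in the history.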
I’m maintaining a lot of different projects at the moment. I thought it would be useful to describe the process I use for adding a new feature to one of them, using the new sqlite-utils create-database command as an example.[... 2779 words]
I described Git scraping last year: a technique for writing scrapers where you periodically snapshot a source of data to a Git repository in order to record changes to that source over time.[... 2002 words]
nyt-2020-election-scraper. Brilliant application of git scraping by Alex Gaynor and a growing team of contributors. Takes a JSON snapshot of the NYT’s latest election poll figures every five minutes, then runs a Python script to iterate through the history and build an HTML page showing the trends, including what percentage of the remaining votes each candidate needs to win each state. This is the perfect case study in why it can be useful to take a “snapshot of the world right now” data source and turn it into a git revision history over time. # 6th November 2020, 2:24 pm
Git scraping is the name I’ve given a scraping technique that I’ve been experimenting with for a few years now. It’s really effective, and more people should use it.[... 963 words]
This week I helped Natalie launch Rocky Beaches, shipped Datasette 0.48 and several releases of datasette-graphql, upgraded the CSRF protection for datasette-upload-csvs and figured out how to get a commit log of changes to my blog by backing up its database to a GitHub repository.
I spent this week spreading myself between a bunch of smaller projects, and finally getting familiar with cookiecutter. I wrote about my datasette-plugin cookiecutter template earlier in the week; here’s what else I’ve been working on.[... 703 words]
Web apps are typically continuously delivered, not rolled back, and you don’t have to support multiple versions of the software running in the wild.
This is not the class of software that I had in mind when I wrote the blog post 10 years ago. If your team is doing continuous delivery of software, I would suggest to adopt a much simpler workflow (like GitHub flow) instead of trying to shoehorn git-flow into your team.
Weeknotes: Archiving coronavirus.data.gov.uk, custom pages and directory configuration in Datasette, photos-to-sqlite
I mainly made progress on three projects this week: Datasette, photos-to-sqlite and a cleaner way of archiving data to a git repository.[... 1132 words]
This week I’ve been mostly dealing with the finally announced shutdown of Zeit Now v1. And having long-winded conversations with myself in GitHub issues.[... 2050 words]
Repository driven development (via) I’m already a big fan of keeping documentation and code in the same repo so you can update them both from within the same code review, but this takes it even further: in repository driven development every aspect of the code and configuration needed to define, document, test and ship a service live in the service repository—all the way down to the configurations for reporting dashboards. This sounds like heaven. # 24th July 2019, 8:41 am
San Francisco has a neat open data portal (as do an increasingly large number of cities these days). For a few years my favourite file on there has been Street Tree List, a list of all 190,000 trees in the city maintained by the Department of Public Works.[... 1051 words]
Telling stories through your commits. Joel Chippendale’s excellent guide to writing a useful commit history. I spend a lot of time on my commit messages, because when I’m trying to understand code later on they are the only form of documentation that is guaranteed to remain up-to-date against the code at that exact point of time. These tips are clear, concise, readable and include some great examples. # 13th January 2018, 7:44 pm
Anyone that has me on too high of a pedestal should see me fumbling around with git.
Exploding Git Repositories. Kate Murphy describes how git is vulnerable to a similar attack to the XML “billion laughs” recursive entity expansion attack—you can create a tiny git repository that acts as a “git bomb”, expanding 12 root objects to over a billion files using recursive blob references. # 12th October 2017, 7:43 pm
What are the differences between “forking,” “cloning,” and downloading the project as a zip file on GitHub?
[... 98 words]
Should I use Dropbox instead of Git for 2 coders? In terms of going really fast and working on things at the same time, I’m thinking it may be uber productive to use Dropbox for its instant syncing instead of Git/GitHub. What are the pros/cons?
Dropbox is definitely the wrong tool for this: you’ll find yourself running into all sorts of weird problems very quickly if you attempt to use it this way.[... 119 words]
GitHub: Announcing SVN Support. The best kind of April Fool’s joke: one that works. It’s read-only, but that’s good enough to support referencing GitHub repositories from SVN externals. # 1st April 2010, 11:33 am
A successful Git branching model (via) This looks eminently sensible. The master branch is used for production-ready code, and is only updated by merging from either release branches or emergency hotfix branches. A develop branch is used for integration (from feature branches), and is branched to create release branches when a release is nearly ready. It’s all comprehensively documented and comes with some well-designed diagrams. # 20th January 2010, 7:30 pm
Introducing the YUI 3 Gallery. Write a plugin for YUI3, BSD license it and sign a CLA and Yahoo! will push your module out to their CDN and make it loadable using the YUI().use() statement. They’re coordinating the submissions using GitHub. # 4th November 2009, 11:14 pm
How We Made GitHub Fast. Detailed overview of the new GitHub architecture. It’s a lot more complicated than I would have expected—lots of moving parts are involved in ensuring they can scale horizontally when they need to. Interesting components include nginx, Unicorn, Rails, DRBD, HAProxy, Redis, Erlang, memcached, SSH, git and a bunch of interesting new open source projects produced by the GitHub team such as BERT/Ernie and ProxyMachine. # 21st October 2009, 9:14 pm