136 items tagged “github”
2020
OCTO Speaker Series: Simon Willison—Personal Data Warehouses: Reclaiming Your Data. I’m giving a talk in the GitHub OCTO (Office of the CTO) speaker series about Datasette and my Dogsheep personal analytics project. You can register for free here—the stream will be on Thursday November 12, 2020 at 8:30am PST (4:30pm GMT).
Git scraping: track changes over time by scraping to a Git repository
Git scraping is the name I’ve given a scraping technique that I’ve been experimenting with for a few years now. It’s really effective, and more people should use it.
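As a rough illustration of the idea (not the exact code from the post), a single scraping pass might look like the sketch below. It assumes `repo_dir` is already a Git repository and that `git` is on the path; the function names and the commit message are hypothetical.

```python
import subprocess
import urllib.request
from pathlib import Path


def write_if_changed(path: Path, content: bytes) -> bool:
    """Write content to path; return True only if the file actually changed."""
    if path.exists() and path.read_bytes() == content:
        return False
    path.write_bytes(content)
    return True


def scrape_once(url: str, repo_dir: Path, filename: str) -> bool:
    """Fetch url into repo_dir/filename and commit it if the content changed."""
    with urllib.request.urlopen(url) as response:
        content = response.read()
    if not write_if_changed(repo_dir / filename, content):
        return False
    # Plain git commands; assumes repo_dir is already a Git repository
    subprocess.run(["git", "add", filename], cwd=repo_dir, check=True)
    subprocess.run(
        ["git", "commit", "-m", f"Latest data: {filename}"],
        cwd=repo_dir,
        check=True,
    )
    return True
```

Run on a schedule, each commit then records one change to the scraped resource, so the repository history becomes the change log.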
[... 963 words]
Render Markdown tool (via) I wrote a quick JavaScript tool for rendering Markdown via the GitHub Markdown API—which includes all of their clever extensions like tables and syntax highlighting—and then stripping out some extraneous HTML to give me back the format I like using for my blog posts.
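The tool itself is JavaScript, but the same round trip can be sketched in Python. This is a minimal sketch that assumes the public `POST /markdown` endpoint with `mode` set to `gfm`. The cleanup step here (dropping `class` attributes) is a hypothetical stand-in, since the post does not say exactly which HTML gets stripped.

```python
import json
import re
import urllib.request

GITHUB_MARKDOWN_URL = "https://api.github.com/markdown"


def build_markdown_request(markdown_text: str) -> urllib.request.Request:
    """Build (but do not send) a POST request to GitHub's Markdown API."""
    payload = json.dumps({"text": markdown_text, "mode": "gfm"}).encode("utf-8")
    return urllib.request.Request(
        GITHUB_MARKDOWN_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def strip_class_attributes(html: str) -> str:
    """Remove class="..." attributes; a stand-in for whatever cleanup you prefer."""
    return re.sub(r'\s+class="[^"]*"', "", html)


def render_markdown(markdown_text: str) -> str:
    """Send the request and return the cleaned-up HTML (requires network access)."""
    with urllib.request.urlopen(build_markdown_request(markdown_text)) as response:
        return strip_class_attributes(response.read().decode("utf-8"))
```

Unauthenticated calls are rate limited, so anything beyond occasional use would likely want an `Authorization` header added to the request.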
Weeknotes: Rocky Beaches, Datasette 0.48, a commit history of my database
This week I helped Natalie launch Rocky Beaches, shipped Datasette 0.48 and several releases of datasette-graphql, upgraded the CSRF protection for datasette-upload-csvs and figured out how to get a commit log of changes to my blog by backing up its database to a GitHub repository.
Doing Stupid Stuff with GitHub Actions (via) I love the idea here of running a scheduled action once a year that deliberately fails, causing GitHub to send you a “Happy New Year” failure email!
zhiiiyang/zhiiiyang profile README (via) This is a brilliant hack: a GitHub profile README that uses an action to retrieve the author’s latest tweet (using R), render it as a PNG screenshot in headless Chrome via rstudio/webshot2 and embed that image in their profile.
Building a self-updating profile README for GitHub
GitHub quietly released a new feature at some point in the past few days: profile READMEs. Create a repository with the same name as your GitHub account (in my case that’s github.com/simonw/simonw), add a README.md to it and GitHub will render the contents at the top of your personal profile page—for me that’s github.com/simonw.
A cookiecutter template for writing Datasette plugins
Datasette’s plugin system is one of the most interesting parts of the entire project. As I explained to Matt Asay in this interview, the great thing about plugins is that Datasette can gain new functionality overnight without me even having to review a pull request. I just need to get more people to write them!
[... 914 words]
github-to-sqlite 2.2 highlights thread. I released github-to-sqlite 2.2 today with a new “stargazers” command for importing users who have starred one or more specific repositories. This Twitter thread lists highlights of recent releases and links to a live Datasette demo that shows what the tool can do.
Weeknotes: Datasette 0.40, various projects, Dogsheep photos
A new release of Datasette, two new projects and progress towards a Dogsheep photos solution.
[... 826 words]
Using a self-rewriting README powered by GitHub Actions to track TILs
I’ve started tracking TILs—Today I Learneds—inspired by this five-year-and-counting collection by Josh Branchaud on GitHub (found via Hacker News). I’m keeping mine in GitHub too, and using GitHub Actions to automatically generate an index page README in the repository and a SQLite-backed search engine.
[... 1,100 words]
Goodbye Zeit Now v1, hello datasette-publish-now—and talking to myself in GitHub issues
This week I’ve been mostly dealing with the finally announced shutdown of Zeit Now v1. And having long-winded conversations with myself in GitHub issues.
[... 2,050 words]
Weeknotes: Datasette 0.39 and many other projects
This week’s theme: Well, I’m not going anywhere. So a ton of progress to report on various projects.
[... 806 words]
Tracking FARA by deploying a data API using GitHub Actions and Cloud Run
I’m using the combination of GitHub Actions and Google Cloud Run to retrieve data from the U.S. Department of Justice FARA website and deploy it as a queryable API using Datasette.
[... 1,599 words]
Your own hosted blog, the easy, free, open way (even if you’re not a computer expert) (via) Jeremy Howard and the fast.ai team have released fast_template—a GitHub repository designed to be used as a template to create new repositories with a complete Jekyll blog configured for use with GitHub Pages. GitHub’s official documentation recommends you install Ruby on your machine to do this, but Jeremy points out that with the right repository setup you can run a blog entirely by editing files through the GitHub web interface.
How we use “ship small” to rapidly build new features at GitHub (via) Useful insight into how GitHub develop new features. They make aggressive use of feature flags, shipping a rough skeleton of a new feature to production as early as possible and actively soliciting feedback from other employees as they iterate on the feature. They use static JSON mocks of APIs to unblock their frontend engineers and iterate on the necessary data structures while the real backend is being implemented.
2019
GitHub Actions ci.yml for deno. Spotted this today: it’s one of the cleanest examples I’ve seen of a complex CI configuration for GitHub Actions, testing, linting, benchmarking and building Ryan Dahl’s Deno JavaScript runtime.
pinboard-to-sqlite (via) Jacob Kaplan-Moss just released the second Dogsheep tool that wasn’t written by me (after goodreads-to-sqlite by Tobias Kunze)—this one imports your Pinboard bookmarks. The repo includes a really clean minimal example of how to use GitHub Actions to run tests and release packages to PyPI.
Cloud Run Button: Click-to-deploy your git repos to Google Cloud (via) Google Cloud Run now has its own version of the Heroku deploy button: you can add a button to a GitHub repository which, when clicked, will provide an interface for deploying your repo to the user’s own Google Cloud account using Cloud Run.
Single sign-on against GitHub using ASGI middleware
I released Datasette 0.29 last weekend, the first version of Datasette to be built on top of ASGI (discussed previously in Porting Datasette to ASGI, and Turtles all the way down).
[... 1,612 words]datasette-auth-github (via) My first big ASGI plugin for Datasette: datasette-auth-github adds the ability to require users to authenticate against the GitHub OAuth API. You can whitelist specific users, or you can restrict access to members of specific GitHub organizations or teams. While it’s structured as a Datasette plugin it also includes ASGI middleware which can be applied to any ASGI application.
Building a stateless API proxy (via) This is a really clever idea. The GitHub API is infuriatingly coarsely grained with its permissions: you often end up having to create a token with way more permissions than you actually need for your project. Thea Flowers proposes running your own proxy in front of their API that adds more finely grained permissions, based on custom encrypted proxy API tokens that use JWT to encode the original API key along with the permissions you want to grant to that particular token (as a list of regular expressions matching paths on the underlying API).
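To make the permissions part concrete, here is a minimal sketch of a token that carries a list of path regular expressions. It deliberately swaps in a plain HMAC-signed payload from the standard library instead of the encrypted JWT described above, so it does not carry the original API key and is not how the linked proxy is implemented. The function names and the `paths` field are hypothetical.

```python
import base64
import hashlib
import hmac
import json
import re


def make_token(secret: bytes, allowed_paths: list[str]) -> str:
    """Return a signed token carrying a list of path regular expressions."""
    body = base64.urlsafe_b64encode(json.dumps({"paths": allowed_paths}).encode())
    signature = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return f"{body.decode()}.{signature}"


def is_allowed(secret: bytes, token: str, path: str) -> bool:
    """Verify the signature, then check path against the token's patterns."""
    body, _, signature = token.rpartition(".")
    expected = hmac.new(secret, body.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature, expected):
        return False
    patterns = json.loads(base64.urlsafe_b64decode(body))["paths"]
    # fullmatch so a pattern has to cover the whole path, not just a prefix
    return any(re.fullmatch(pattern, path) for pattern in patterns)
```

Because everything needed to make the decision travels inside the token, the proxy only has to hold the signing secret, which is what keeps it stateless.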
Using dependabot to bump Django on my blog from 2.2 to 2.2.1 (via) GitHub recently acquired dependabot and made it free, and I decided to try it out on my blog. It’s a really neat piece of automation: it scans your requirements.txt (plus a number of other packaging definitions across several different languages), checks for updates to your dependencies and opens pull requests against any that it finds. Combine it with a CI service such as Circle CI and your tests will run automatically against the pull request, letting you know if it’s safe to merge. dependabot constantly rebases other changes against the pull request to try and ensure it will merge as cleanly as possible.
2018
October 21 post-incident analysis (via) Legitimately fascinating post-mortem by GitHub. They run database masters in multiple data centers with raft for leader election... but when they had an unexpected network split between east and west coast they ended up with several seconds of writes that had not been correctly replicated. Cleaning up the resulting mess took the best part of 24 hours! Distributed systems are hard.
github-debug.com (via) This is a neat trick: GitHub have a dedicated site for their support engineers to send you to if you can’t connect to them. The site tests download speeds from their various domains and then lets you click a button to have GitHub run a traceroute/ping from their servers to your detected IP address and output the results (use devtools to spy on their API method for doing this). Then you can paste the results into a message to their support team. Turns out fastly-debug.com and dropbox-debug.com implement a similar pattern for those services as well.
In case you missed it: @GoogleColab can open any @ProjectJupyter notebook directly from @github!
To run the notebook, just replace "github.com" with "colab.research.google.com/github/" in the notebook URL, and it will be loaded into Colab.
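The URL rewrite is simple enough to sketch as a small helper. The function name and the `octocat/demo` repository below are made up for illustration.

```python
from urllib.parse import urlsplit


def colab_url(github_url: str) -> str:
    """Turn a github.com notebook URL into the matching Colab URL."""
    parts = urlsplit(github_url)
    if parts.netloc != "github.com":
        raise ValueError(f"Not a github.com URL: {github_url}")
    # parts.path already starts with "/", so no extra slash is needed
    return f"https://colab.research.google.com/github{parts.path}"


print(colab_url("https://github.com/octocat/demo/blob/main/notebook.ipynb"))
# https://colab.research.google.com/github/octocat/demo/blob/main/notebook.ipynb
```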
MySQL High Availability at GitHub. Cutting edge high availability case-study: GitHub are now using Consul, raft, their own custom load balancer and their own custom orchestrator replication management toolkit to achieve cross-datacenter failover for their MySQL master/replica clusters.
GitHub for Nonprofits (via) TIL GitHub provide legally recognized nonprofits with free organization accounts with unlimited users and unlimited private repos—and they’ve registered 30,000 nonprofit accounts through the program as of May 2017.
github-trending-repos (via) This is a really clever hack: Vitaliy Potapov built a system for subscribing to a weekly digest of trending GitHub repos in your favourite languages entirely on top of the existing GitHub issues notification system. Find the issue for your particular language and hit “subscribe” and you’ll get an email (or push notification depending on how you get your issue notifications) once a week with the latest trends. The implementation is a 220-line Node.js script which runs on a daily and weekly schedule using Circle CI, so Vitaliy doesn’t even have to host or pay for any of the underlying infrastructure. It’s brilliant.