Simon Willison’s Weblog

Useful tricks with pip install URL and GitHub

The pip install command can accept a URL to a zip file or tarball. GitHub provides URLs that can create a zip file of any branch, tag or commit in any repository. Combining these is a really useful trick for maintaining Python packages.

pip install URL

The most common way of using pip is with package names from PyPI:

pip install datasette

But the pip install command has a bunch of other abilities—it can install files, pull from various version control systems and most importantly it can install packages from a URL.

I sometimes use this to distribute ad-hoc packages that I don’t want to upload to PyPI. Here’s a quick and simple Datasette plugin I built a while ago that I install using this option:

pip install 'https://static.simonwillison.net/static/2021/datasette_expose_some_environment_variables-0.1-py3-none-any.whl'

(Source code here)

You can also list URLs like this directly in your requirements.txt file, one per line.

datasette install

Datasette has a datasette install command which wraps pip install. It exists purely so that people can install Datasette plugins easily without first having to figure out the location of Datasette’s Python virtual environment.

This works with URLs too, so you can install that plugin like so:

datasette install https://static.simonwillison.net/static/2021/datasette_expose_some_environment_variables-0.1-py3-none-any.whl
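To illustrate why a wrapper like this helps, here is a minimal sketch of the general idea: run pip as a module of whichever Python interpreter is currently executing, so packages land in that interpreter's environment. The function name `pip_install_command` is hypothetical, and this is not a claim about how Datasette implements the command internally.

```python
import sys


def pip_install_command(*packages):
    """Build a pip command that targets the interpreter running this code."""
    # sys.executable is the Python binary of the current (virtual) environment,
    # so "-m pip" installs into that same environment without needing its path.
    return [sys.executable, "-m", "pip", "install", *packages]


# Package names and URLs can be mixed freely, just like on the command line.
cmd = pip_install_command(
    "datasette-vega",
    "https://static.simonwillison.net/static/2021/"
    "datasette_expose_some_environment_variables-0.1-py3-none-any.whl",
)
print(cmd)
# To actually run it: import subprocess; subprocess.run(cmd, check=True)
```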

The datasette publish commands have an --install option for installing plugins, which works with URLs too:

datasette publish cloudrun mydatabase.db \
  --service=plugins-demo \
  --install datasette-vega \
  --install https://static.simonwillison.net/static/2021/datasette_expose_some_environment_variables-0.1-py3-none-any.whl \
  --install datasette-graphql

Installing branches, tags and commits

Any reference in a GitHub repository can be downloaded as a zip file or tarball—that means branches, tags and commits are all available.

If your repository contains a Python package with a setup.py file, those URLs will be compatible with pip install.

This means you can use URLs to install tags, branches and even exact commits!

Some examples:

  • pip install https://github.com/simonw/datasette/archive/refs/heads/main.zip installs the latest main branch from the simonw/datasette repository.
  • pip install https://github.com/simonw/datasette/archive/refs/tags/0.61.1.zip—installs version 0.61.1 of Datasette, via this tag.
  • pip install https://github.com/simonw/datasette/archive/refs/heads/0.60.x.zip—installs the latest head from my 0.60.x branch.
  • pip install https://github.com/simonw/datasette/archive/e64d14e4.zip—installs the package from the snapshot at commit e64d14e413a955a10df88e106a8b5f1572ec8613—note that you can use just the first few characters in the URL rather than the full commit hash.

That last option, installing from a specific commit hash, is particularly useful in requirements.txt files: unlike a branch or a tag, which can later be moved to point at different code, a commit hash always refers to exactly the same content.

As you can see, the URLs are all predictable—GitHub has really good URL design. But if you don’t want to remember or look them up you can instead find them using the Code -> Download ZIP menu item for any view onto the repository:

Screenshot of the GitHub web interface - click on the green Code button, then right click on Download ZIP and select Copy Link
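Because the URL patterns are so regular, they can also be generated in code. This is a small sketch that reproduces the three patterns from the examples above; the helper name `github_archive_url` and its `kind` parameter are made up for illustration.

```python
def github_archive_url(owner, repo, ref, kind="commit"):
    """Return the zip archive URL for a branch, tag or commit on GitHub."""
    base = f"https://github.com/{owner}/{repo}/archive"
    if kind == "branch":
        return f"{base}/refs/heads/{ref}.zip"
    if kind == "tag":
        return f"{base}/refs/tags/{ref}.zip"
    if kind == "commit":
        # A full hash or an abbreviated prefix both follow this pattern.
        return f"{base}/{ref}.zip"
    raise ValueError(f"unknown kind: {kind!r}")


# Rebuild the URLs used in the examples above.
print(github_archive_url("simonw", "datasette", "main", kind="branch"))
print(github_archive_url("simonw", "datasette", "0.61.1", kind="tag"))
print(github_archive_url("simonw", "datasette", "e64d14e4"))
```

The output of any of these calls can be passed straight to pip install.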

Installing from a fork

I sometimes use this trick when I find a bug in an open source Python library and need to apply my fix before it has been accepted by upstream.

I create a fork on GitHub, apply my fix and send a pull request to the project.

Then in my requirements.txt file I drop in a URL to the fix in my own repository—with a comment reminding me to switch back to the official package as soon as they’ve applied the bug fix.
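As a rough sketch of what that requirements.txt entry could look like, the snippet below builds the two lines: a `#` comment (which pip ignores in requirements files) followed by the archive URL for a pinned commit. Everything here is a placeholder: `your-username`, `some-library`, the commit hash `0123abc` and the function name `fork_requirement` are all hypothetical.

```python
def fork_requirement(owner, repo, commit, reminder):
    """Return requirements.txt text: a reminder comment plus a pinned fork URL."""
    # Pinning to a commit hash means the installed code cannot change unexpectedly.
    url = f"https://github.com/{owner}/{repo}/archive/{commit}.zip"
    return f"# {reminder}\n{url}\n"


# Placeholder values only; substitute your own fork and commit hash.
entry = fork_requirement(
    owner="your-username",
    repo="some-library",
    commit="0123abc",
    reminder="TODO: switch back to the PyPI release once the fix is merged upstream",
)
print(entry, end="")
```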

Installing pull requests

This is a new trick I discovered this morning: there’s a hard-to-find URL that lets you do the same thing for code in pull requests.

Consider PR #1717 against Datasette, by Tim Sherratt, adding a --timeout option to the datasette publish cloudrun command.

I can install that in a fresh environment on my machine using:

pip install https://api.github.com/repos/simonw/datasette/zipball/pull/1717/head

This isn’t as useful as checking out the code directly, since it’s harder to review the code in a text editor—but it’s useful knowing it’s possible.
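The pull request URL follows a fixed pattern as well, so it can be generated for any PR number. A minimal sketch, assuming the pattern shown above holds for other repositories and pull requests; the helper name `pull_request_zip_url` is hypothetical.

```python
def pull_request_zip_url(owner, repo, number):
    """Return the GitHub API zipball URL for the head of a pull request."""
    return (
        f"https://api.github.com/repos/{owner}/{repo}"
        f"/zipball/pull/{number}/head"
    )


# Rebuilds the URL for PR #1717 against simonw/datasette.
print(pull_request_zip_url("simonw", "datasette", 1717))
```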

Installing gists

GitHub Gists also get URLs to zip files. This means it’s possible to create and host a full Python package just using a Gist, by packaging together a setup.py file and one or more Python modules.

Here’s an example Gist containing my datasette-expose-some-environment-variables plugin.

You can right click and copy link on the “Download ZIP” button to get this URL:

https://gist.github.com/simonw/b6dbb230d755c33490087581821d7082/archive/872818f6b928d9393737eee541c3c76d6aa4b1ba.zip

Then pass that to pip install or datasette install to install it.

That Gist has two files—a setup.py file containing the following:

from setuptools import setup

VERSION = "0.1"

setup(
    name="datasette-expose-some-environment-variables",
    description="Expose environment variables in Datasette at /-/env",
    author="Simon Willison",
    license="Apache License, Version 2.0",
    version=VERSION,
    py_modules=["datasette_expose_some_environment_variables"],
    entry_points={
        "datasette": [
            "expose_some_environment_variables = datasette_expose_some_environment_variables"
        ]
    },
    install_requires=["datasette"],
)

And a datasette_expose_some_environment_variables.py file containing the actual plugin:

from datasette import hookimpl
from datasette.utils.asgi import Response
import os

REDACT = {"GPG_KEY"}


async def env(request):
    output = []
    for key, value in os.environ.items():
        if key not in REDACT:
            output.append("{}={}".format(key, value))
    return Response.text("\n".join(output))


@hookimpl
def register_routes():
    return [
        (r"^/-/env$", env)
    ]