llm, ttok and strip-tags—CLI tools for working with ChatGPT and other LLMs
I’ve been building out a small suite of command-line tools for working with ChatGPT, GPT-4 and potentially other language models in the future.
The three tools I’ve built so far are:
- llm—a command-line tool for sending prompts to the OpenAI APIs, outputting the response and logging the results to a SQLite database. I introduced that a few weeks ago.
- ttok—a tool for counting and truncating text based on tokens
- strip-tags—a tool for stripping HTML tags from text, and optionally outputting a subset of the page based on CSS selectors
The idea with these tools is to support working with language model prompts using Unix pipes.
You can install the three like this:
    pipx install llm
    pipx install ttok
    pipx install strip-tags
Or use pip if you haven’t adopted pipx yet.
llm depends on an OpenAI API key in the OPENAI_API_KEY environment variable or a ~/.openai-api-key.txt text file. The other tools don’t require any configuration.
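If you want to check that configuration before running anything, here is a minimal sketch. The has_openai_key function is a hypothetical helper of my own, not part of llm, and it only looks at the two locations mentioned above without validating the key itself.

```shell
# has_openai_key is a hypothetical helper, not part of llm.
# It succeeds if the environment variable is non-empty or the key file
# exists and is non-empty. It does not check that the key is valid.
has_openai_key() {
  [ -n "${OPENAI_API_KEY:-}" ] || [ -s "$HOME/.openai-api-key.txt" ]
}

if has_openai_key; then
  echo "OpenAI API key found"
else
  echo "No OpenAI API key configured" >&2
fi
```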
Now let’s use them to summarize the homepage of the New York Times:
    curl -s https://www.nytimes.com/ \
      | strip-tags .story-wrapper \
      | ttok -t 4000 \
      | llm --system 'summary bullet points' -s
Here’s what that command outputs when you run it in the terminal:
Let’s break that down.
- curl -s https://www.nytimes.com/ uses curl to retrieve the HTML for the New York Times homepage. The -s option prevents it from outputting any progress information.
- strip-tags .story-wrapper accepts HTML on standard input, finds just the areas of that page identified by the CSS selector .story-wrapper, then outputs the text for those areas with all HTML tags removed.
- ttok -t 4000 accepts text on standard input, tokenizes it using the default tokenizer for the gpt-3.5-turbo model, truncates to the first 4,000 tokens and outputs those tokens converted back to text.
- llm --system 'summary bullet points' -s accepts the text on standard input as the user prompt and adds a system prompt of “summary bullet points”. The -s option tells the tool to stream the results to the terminal as they are returned, rather than waiting for the full response before outputting anything.
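If you find yourself running that pipeline a lot, one option is to wrap it in a shell function. This is just a sketch: summarize_url is a hypothetical name of my own, and it uses only the commands and options shown above.

```shell
# summarize_url is a hypothetical wrapper around the pipeline described above.
# Usage: summarize_url URL [CSS_SELECTOR...]
summarize_url() {
  url="$1"
  shift
  # Any remaining arguments are passed to strip-tags as CSS selectors.
  # With no selectors, strip-tags outputs the text of the whole page.
  curl -s "$url" \
    | strip-tags "$@" \
    | ttok -t 4000 \
    | llm --system 'summary bullet points' -s
}
```

Calling summarize_url https://www.nytimes.com/ .story-wrapper should then behave like the original command, assuming all three tools are installed and an API key is configured.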
It’s all about the tokens
I built ttok this morning because I needed better ways to work with tokens.
LLMs such as ChatGPT and GPT-4 work with tokens, not characters.
This is an implementation detail, but it’s one that you can’t avoid for two reasons:
- APIs have token limits. If you try and send more than the limit you’ll get an error message like this one: “This model’s maximum context length is 4097 tokens. However, your messages resulted in 116142 tokens. Please reduce the length of the messages.”
- Tokens are how pricing works. gpt-3.5-turbo (the model used by ChatGPT, and the default model used by the llm command) costs $0.002 / 1,000 tokens. GPT-4 is $0.03 / 1,000 tokens of input and $0.06 / 1,000 for output.
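To make the pricing concrete, here is a rough sketch of the arithmetic. The estimate_cost function is a hypothetical helper, not part of any of these tools, and the prices are the ones quoted above, which may well change.

```shell
# estimate_cost is a hypothetical helper, not part of llm or ttok.
# Usage: estimate_cost TOKENS PRICE_PER_1000_TOKENS
# Prints the estimated cost in US dollars to four decimal places.
estimate_cost() {
  LC_ALL=C awk -v tokens="$1" -v price="$2" \
    'BEGIN { printf "%.4f\n", tokens / 1000 * price }'
}

estimate_cost 4000 0.002   # 4,000 tokens at the gpt-3.5-turbo price
estimate_cost 4000 0.03    # the same 4,000 tokens as GPT-4 input
```

Since ttok prints a token count when run with no options, something like estimate_cost "$(ttok < prompt.txt)" 0.002 should give an estimate for a file, where prompt.txt is a stand-in for your own file.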
Being able to keep track of token counts is really important.
But tokens are actually really hard to count! The rule of thumb is that one token is roughly 0.75 words, so the token count is about number-of-words / 0.75, but you can get an exact count by running the same tokenizer that the model uses on your own machine.
OpenAI’s tiktoken library (documented in this notebook) is the best way to do this.
My ttok tool is a very thin wrapper around that library. It can do three different things:
- Count tokens
- Truncate text to a desired number of tokens
- Show you the tokens
Here’s a quick example showing all three of those in action:
    $ echo 'Here is some text' | ttok
    5
    $ echo 'Here is some text' | ttok --truncate 2
    Here is
    $ echo 'Here is some text' | ttok --tokens
    8586 374 1063 1495 198
My GPT-3 token encoder and decoder Observable notebook provides an interface for exploring how these tokens work in more detail.
Stripping tags from HTML
HTML tags take up a lot of tokens, and usually aren’t relevant to the prompt you are sending to the model.
The strip-tags command strips those tags out.
Here’s an example showing quite how much of a difference that can make:
    $ curl -s https://simonwillison.net/ | ttok
    21543
    $ curl -s https://simonwillison.net/ | strip-tags | ttok
    9688
For my blog’s homepage, stripping tags reduces the token count by more than half!
The above is still too many tokens to send to the API.
We could truncate them, like this:
    $ curl -s https://simonwillison.net/ \
      | strip-tags | ttok --truncate 4000 \
      | llm --system 'turn this into a bad poem' -s
But often it’s only specific parts of a page that we care about. The strip-tags command takes an optional list of CSS selectors as arguments. If provided, only those parts of the page will be output.
That’s how the New York Times example works above. Compare the following:
    $ curl -s https://www.nytimes.com/ | ttok
    210544
    $ curl -s https://www.nytimes.com/ | strip-tags | ttok
    115117
    $ curl -s https://www.nytimes.com/ | strip-tags .story-wrapper | ttok
    2165
By selecting just the text from within the <section class="story-wrapper"> elements we can trim the whole page down to just the headlines and summaries of each of the main articles on the page.
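To put numbers on those savings, here is a small sketch that turns the token counts above into percentages. The percent_saved function is a hypothetical helper of my own, and the counts are the ones from my runs, so yours will differ as the pages change.

```shell
# percent_saved is a hypothetical helper, not part of ttok or strip-tags.
# Usage: percent_saved TOKENS_BEFORE TOKENS_AFTER
# Prints how much smaller the second count is, as a percentage of the first.
percent_saved() {
  LC_ALL=C awk -v before="$1" -v after="$2" \
    'BEGIN { printf "%.1f%%\n", (1 - after / before) * 100 }'
}

percent_saved 21543 9688      # my blog: raw HTML vs. all tags stripped
percent_saved 210544 115117   # nytimes.com: raw HTML vs. all tags stripped
percent_saved 210544 2165     # nytimes.com: raw HTML vs. .story-wrapper only
```

Stripping all the tags saves roughly half the tokens in both cases, while the CSS selector cuts the New York Times homepage by about 99%.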
I’m really enjoying being able to use the terminal to interact with LLMs in this way. Having a quick way to pipe content to a model opens up all kinds of fun opportunities.
Want a quick explanation of how some code works using GPT-4? Try this:
    cat ttok/cli.py | llm --system 'Explain this code' -s --gpt4
I’ve been having fun piping my shot-scraper tool into it too, which goes a step further than strip-tags in providing a full headless browser.
Here’s an example that uses the Readability recipe from this TIL to extract the main article content, then further strips HTML tags from it and pipes it into the
In terms of next steps, the thing I’m most excited about is teaching that llm command how to talk to other models: initially Claude and PaLM2 via APIs, but I’d love to get it working against locally hosted models running on things like llama.cpp as well.