Accessing Llama 2 from the command-line with the llm-replicate plugin

18th July 2023

The big news today is Llama 2, the new openly licensed Large Language Model from Meta AI. It’s a really big deal:

Llama 2 is available for free for research and commercial use.

(swyx spotted that it was rebranded from LLaMA 2 to Llama 2 a few hours ago.)

Facebook’s original LLaMA model, released in February, kicked off a seismic wave of innovation in the world of open source LLMs, from fine-tuned variants to from-scratch recreations. I wrote about the impact this was having a few weeks after its original release in “Large language models are having their Stable Diffusion moment”. If anything I underestimated how significant this was going to be.

There was just one catch: you couldn’t use it for commercial purposes.

That’s changed with Llama 2!

It does still have a couple of interesting limitations, emphasis mine:

You will not use the Llama Materials or any output or results of the Llama Materials to improve any other large language model (excluding Llama 2 or derivative works thereof).

If, on the Llama 2 version release date, the monthly active users of the products or services made available by or for Licensee, or Licensee’s affiliates, is greater than 700 million monthly active users in the preceding calendar month, you must request a license from Meta, which Meta may grant to you in its sole discretion[...]

This still makes it available for most commercial purposes. Let’s try it out.

Accessing Llama 2 using LLM

My LLM tool provides command-line access to a wide variety of language models, both via web APIs and self-hosted on your own machine.

Last week I added plugin support for extending it to support new models. An independently released plugin can now add support for new models to the existing CLI tool.

The brand new llm-replicate plugin provides CLI access to models hosted on Replicate, and this morning a16z-infra released a16z-infra/llama13b-v2-chat which provides Replicate API access to the new Llama 2 13B chat model.

You’ll need a Replicate API key.

Install LLM like this:

pip install llm

(I like to use pipx.)

Or if you don’t have pip set up but can use Homebrew:

brew install simonw/llm/llm

Run llm --version to confirm it’s installed.
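If you are scripting the setup, that check can be automated. This is a minimal sketch, not something LLM provides: check_cmd is a made-up helper name, and it relies only on the standard command -v shell builtin.

```shell
# check_cmd is a hypothetical helper, not part of LLM.
# It reports whether a command is available on PATH.
check_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: missing" >&2
    return 1
  fi
}

# Only ask for the version if the llm command is actually available.
if check_cmd llm; then
  llm --version
fi
```

The same helper works for anything else the later steps depend on, for example check_cmd datasette.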

Next, install the llm-replicate plugin:

llm install llm-replicate

The plugin adds a new llm replicate set of commands—see them with llm --help.

Configure your Replicate API key so the plugin can talk to the API:

llm keys set replicate

Enter key: <paste key here>

Then register the new a16z-infra/llama13b-v2-chat model with the plugin:

llm replicate add a16z-infra/llama13b-v2-chat \
  --chat --alias llama2

This tells the plugin that it’s a “chat” model, which means you can have continuing conversations with it, rather than just sending single prompts.
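To double-check that the alias was registered, a script could look for it in the list of available models. This is a rough sketch under assumptions: has_model is a made-up helper name, and it assumes that llm models list prints each model's aliases somewhere in its output.

```shell
# has_model is a hypothetical helper, not part of LLM.
# It succeeds if the given model name or alias appears as a whole word
# in the output of "llm models list" (output format is an assumption).
has_model() {
  llm models list | grep -q -w -F -- "$1"
}
```

With that defined, has_model llama2 && echo "llama2 is ready" prints the message only when the alias shows up in the list.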

Now we can send a prompt, using the llama2 alias we configured there:

llm -m llama2 "Ten great names for a pet pelican"

I get this:

Certainly! Here are ten great names for a pet pelican:

Peanut: A cute and playful name for a small pelican.

Puddles: A fun and silly name for a pelican that loves to splash around in water.

Pete: A classic name that suits a friendly and outgoing pelican.

Penelope: A pretty and elegant name for a female pelican.

Percy: A charming and sophisticated name for a male pelican.

Pippy: A playful and affectionate name for a pelican with a lively personality.

Pelty: A fun and quirky name for a pelican with a unique sense of style.

Pip: A sweet and endearing name for a small pelican.

Plucky: A feisty and adventurous name for a pelican with a big personality.

Poppy: A lovely and delicate name for a female pelican with a sweet disposition.

I hope these suggestions help you find the perfect name for your pet pelican!

And because it’s a chat model, we can send a follow-up prompt to the same conversation:

llm -c "Five more and make them more nautical"

Ahoy! Here be five more nautical names for yer pet pelican:

Seabird: A fitting name for a pelican that loves the ocean and all its wonders.

Captain Hook: A swashbuckling name for a pelican with a bold and fearless spirit.

Anchor: A sturdy and dependable name for a pelican that’s always grounded and stable.

Finley: A fishy name for a pelican with a love for the sea and all its creatures.

Buoyant: A bouncy and cheerful name for a pelican with a lighthearted and carefree attitude.

Hope these suggestions float yer boat, matey!
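The prompt and follow-up pattern is easy to wrap up for reuse. Here is a minimal sketch, assuming the llama2 alias configured earlier; llama2_chat is a made-up function name and simply runs the two commands shown above in sequence.

```shell
# llama2_chat is a hypothetical wrapper around the commands above.
# The first argument starts a new conversation using the llama2 alias;
# every further argument is sent as a follow-up using -c.
llama2_chat() {
  if [ "$#" -eq 0 ]; then
    echo "usage: llama2_chat PROMPT [FOLLOW_UP...]" >&2
    return 2
  fi
  llm -m llama2 "$1" || return 1
  shift
  for followup in "$@"; do
    llm -c "$followup" || return 1
  done
}
```

Running llama2_chat "Ten great names for a pet pelican" "Five more and make them more nautical" would reproduce the exchange above. Because -c continues the most recent conversation, this sketch is best kept to cases where nothing else is calling llm at the same time.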

LLM logs every prompt and response sent through it to a SQLite database. I’m using --truncate here to shorten the output to include in this post:

llm logs --truncate

[
  {
    "id": "01h5n5b3q6s54e03wtvqe90w2g",
    "model": "replicate-a16z-infra-llama13b-v2-chat",
    "prompt": "Five more and make them more nautical",
    "system": null,
    "response": "Ahoy matey! Here be five more nautical-themed names for yer pet pelican:\n\n1. Captain Hook\n2. Anch...",
    "response_json": null,
    "conversation_id": "01h5n5atsqfg4bhxc53j3ymnx2",
    "duration_ms": 7364,
    "datetime_utc": "2023-07-18T18:48:51.999944",
    "conversation_name": "Ten great names for a pet pelic\u2026",
    "conversation_model": "replicate-a16z-infra-llama13b-v2-chat"
  },
  {
    "id": "01h5n5atss6k9zsr5qs5hb2wzd",
    "model": "replicate-a16z-infra-llama13b-v2-chat",
    "prompt": "Ten great names for a pet pelican",
    "system": null,
    "response": "Sure, here are ten great names for a pet pelican:\n\n1. Pelty\n2. Peanut\n3. Puddles\n4. Nibbles\n5. Fu...",
    "response_json": null,
    "conversation_id": "01h5n5atsqfg4bhxc53j3ymnx2",
    "duration_ms": 7873,
    "datetime_utc": "2023-07-18T18:48:42.356413",
    "conversation_name": "Ten great names for a pet pelic\u2026",
    "conversation_model": "replicate-a16z-infra-llama13b-v2-chat"
  }
]
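Since that output is a JSON array, it can be piped into other tools. The following is a small sketch that assumes jq is installed and that the input has the shape shown above; log_durations is a made-up helper name.

```shell
# log_durations is a hypothetical helper, not part of LLM.
# It reads the JSON array printed by "llm logs" on standard input and
# prints one line per logged prompt: the duration in milliseconds,
# a tab, then the prompt text. Assumes jq is installed.
log_durations() {
  jq -r '.[] | "\(.duration_ms)\t\(.prompt)"'
}
```

Used as llm logs --truncate | log_durations, this gives a quick view of how long each prompt took.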

Or run llm logs path to see the path to the SQLite database file. If you have Datasette installed you can open the logs database like this:

datasette "$(llm logs path)"

Or with Datasette Desktop:

open "$(llm logs path)"
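On a fresh install the database file may not exist yet, so a script might want to check before opening it. This is a minimal sketch with a made-up helper name, open_llm_logs, which combines the llm logs path and datasette commands above.

```shell
# open_llm_logs is a hypothetical helper, not part of LLM or Datasette.
# It looks up the logs database path and only starts Datasette if that
# file actually exists.
open_llm_logs() {
  db_path="$(llm logs path)" || return 1
  if [ ! -f "$db_path" ]; then
    echo "No logs database found at: $db_path" >&2
    return 1
  fi
  datasette "$db_path"
}
```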

What’s next

I’m really excited about Llama 2. I expect we’ll start seeing all kinds of innovation on top of it as the model weights themselves start to roll out.

I’m particularly looking forward to:

The fine-tuned models. Alpaca, Vicuna et al. did amazing things with fine-tuning on top of the original LLaMA. I expect we’ll see dozens of fine-tuned models over the next few weeks.

Implementation of the ReAct and Toolformer patterns, which will enable Llama 2 to drive external tools in a similar way to OpenAI Functions. The Llama 2 paper mentions testing against Toolformer prompts already, so this hopefully won’t be too difficult.

Hopefully this time we’ll start to see a concerted effort from people to document what kind of prompts work well with Llama 2 for practical purposes, since I expect this model will be an influential piece of the ecosystem for some time to come.

Posted 18th July 2023 at 7:30 pm

Simon Willison’s Weblog