18th February 2024 - Link Blog
Representation Engineering: Mistral-7B on Acid (via) Theia Vogel provides a delightfully clear explanation (and worked examples) of control vectors—a relatively recent technique for influencing the behaviour of an LLM by applying vectors to the hidden states that are evaluated during model inference.
These vectors are surprisingly easy to both create and apply. Build a small set of contrasting prompt pairs—“Act extremely happy” v.s. “Act extremely sad” for example (with a tiny bit of additional boilerplate), then run a bunch of those prompts and collect the hidden layer states. Then use “single-component PCA” on those states to get a control vector representing the difference.
The examples Theia provides, using control vectors to make Mistral 7B more or less honest, trippy, lazy, creative and more, are very convincing.
Recent articles
- The Axios supply chain attack used individually targeted social engineering - 3rd April 2026
- Highlights from my conversation about agentic engineering on Lenny's Podcast - 2nd April 2026
- Mr. Chatterbox is a (weak) Victorian-era ethically trained model you can run on your own computer - 30th March 2026