The boring yet crucial secret behind good system prompts is test-driven development. You don't write down a system prompt and find ways to test it. You write down tests and find a system prompt that passes them.
For system prompt (SP) development you:
- Write a test set of messages where the model fails, i.e. where the default behavior isn't what you want
- Find an SP that causes those tests to pass
- Find messages the SP is missaplied to and fix the SP
- Expand your test set & repeat
Recent articles
- Notes on Google's Gemma 3 - 12th March 2025
- Here's how I use LLMs to help me write code - 11th March 2025
- What's new in the world of LLMs, for NICAR 2025 - 8th March 2025