The boring yet crucial secret behind good system prompts is test-driven development. You don't write down a system prompt and find ways to test it. You write down tests and find a system prompt that passes them.
For system prompt (SP) development you:
- Write a test set of messages where the model fails, i.e. where the default behavior isn't what you want
- Find an SP that causes those tests to pass
- Find messages the SP is misapplied to and fix the SP
- Expand your test set & repeat
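The loop above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a real harness: `Case`, `run_tests`, `first_passing`, and `fake_model` are hypothetical names, and `fake_model` is a toy stand-in for an actual LLM call so the example runs offline.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# Hypothetical signature: (system_prompt, user_message) -> model reply.
Model = Callable[[str, str], str]


@dataclass
class Case:
    message: str                  # user message sent to the model
    check: Callable[[str], bool]  # True when the reply is what you want


def run_tests(model: Model, system_prompt: str, cases: List[Case]) -> List[Case]:
    """Return the cases whose replies fail their check."""
    return [c for c in cases if not c.check(model(system_prompt, c.message))]


def first_passing(model: Model, candidates: List[str], cases: List[Case]) -> Optional[str]:
    """Return the first candidate SP that passes every case, or None."""
    for sp in candidates:
        if not run_tests(model, sp, cases):
            return sp
    return None


def fake_model(system_prompt: str, message: str) -> str:
    """Toy stand-in for an LLM (assumption): adds a polite opener when told to."""
    sp = system_prompt.lower()
    polite = "be polite" in sp
    if polite and "unless asked for a bare answer" in sp and "just" in message.lower():
        polite = False
    reply = f"Answer to: {message}"
    return "Certainly! " + reply if polite else reply


# Step 1: a message where the default behavior isn't what you want.
failing = [Case("What is 2+2?", lambda r: r.startswith("Certainly!"))]

candidates = [
    "",
    "Be polite.",
    "Be polite unless asked for a bare answer.",
]

# Step 2: find an SP that makes the test pass.
sp_v1 = first_passing(fake_model, candidates, failing)

# Step 3: a message the SP is misapplied to (the user wants a bare answer).
misapplied = Case(
    "Reply with just the number: 2+2",
    lambda r: not r.startswith("Certainly!"),
)
regressions = run_tests(fake_model, sp_v1, [misapplied])

# Step 4: expand the test set and search again.
sp_v2 = first_passing(fake_model, candidates, failing + [misapplied])
```

With a real model you would replace `fake_model` with a function that calls your provider's API, and the checks would usually be looser (keyword checks, a grading prompt, or human review) because replies are not deterministic.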