Examples are the #1 thing I recommend people use in their prompts because they work so well. The problem is that adding tons of examples increases your API costs and latency. Prompt caching fixes this. You can now add tons of examples to every prompt and create an alternative to a model finetuned on your task with basically zero cost/latency increase. […]
This works even better with smaller models. You can generate tons of examples (test case + solution) with 3.5 Sonnet and then use those examples to create a few-shot prompt for Haiku.
Recent articles
- GPT-5 Thinking in ChatGPT (aka Research Goblin) is shockingly good at search - 6th September 2025
- V&A East Storehouse and Operation Mincemeat in London - 27th August 2025
- The Summer of Johann: prompt injections as far as the eye can see - 15th August 2025