A quote from Thariq Shihipar

20th February 2026

Long running agentic products like Claude Code are made feasible by prompt caching which allows us to reuse computation from previous roundtrips and significantly decrease latency and cost. [...]

At Claude Code, we build our entire harness around prompt caching. A high prompt cache hit rate decreases costs and helps us create more generous rate limits for our subscription plans, so we run alerts on our prompt cache hit rate and declare SEVs if they're too low.

— Thariq Shihipar

Posted 20th February 2026 at 7:13 am

Simon Willison’s Weblog

Recent articles