The challenge [with RAG] is that most corner-cutting solutions look like they’re working on small datasets while letting you pretend that things like search relevance don’t matter, while in reality relevance significantly impacts quality of responses when you move beyond prototyping (whether they’re literally search relevance or are better tuned SQL queries to retrieve more appropriate rows). This creates a false expectation of how the prototype will translate into a production capability, with all the predictable consequences: underestimating timelines, poor production behavior/performance, etc.
Recent articles
- Highlights from the Claude 4 system prompt - 25th May 2025
- Live blog: Claude 4 launch at Code with Claude - 22nd May 2025
- I really don't like ChatGPT's new memory dossier - 21st May 2025