Simon Willison’s Weblog

Subscribe

Wednesday, 29th May 2024

In their rush to cram in “AI” “features”, it seems to me that many companies don’t actually understand why people use their products. [...] Trust is a precious commodity. It takes a long time to build trust. It takes a short time to destroy it.

Jeremy Keith # 11:06 am

Training is not the same as chatting: ChatGPT and other LLMs don’t remember everything you say

I’m beginning to suspect that one of the most common misconceptions about LLMs such as ChatGPT involves how “training” works.

[... 1543 words]

What We Learned from a Year of Building with LLMs (Part I). Accumulated wisdom from six experienced LLM hackers. Lots of useful tips in here. On providing examples in a prompt:

If n is too low, the model may over-anchor on those specific examples, hurting its ability to generalize. As a rule of thumb, aim for n ≥ 5. Don’t be afraid to go as high as a few dozen.

There's a recommendation not to overlook keyword search when implementing RAG - tricks with embeddings can miss results for things like names or acronyms, and keyword search is much easier to debug.

Plus this tip on using the LLM-as-judge pattern for implementing automated evals:

Instead of asking the LLM to score a single output on a Likert scale, present it with two options and ask it to select the better one. This tends to lead to more stable results.

# 8:59 am

Sometimes the most creativity is found in enumerating the solution space. Design is the process of prioritizing tradeoffs in a high dimensional space. Understand that dimensionality.

Chris Perry # 7:17 am