Whenever you build a security system that relies on detection and identification, you invite the bad guys to subvert the system so it detects and identifies someone else. [...] Build a detection system, and the bad guys try to frame someone else. Build a detection system to detect framing, and the bad guys try to frame someone else framing someone else. Build a detection system to detect framing of framing, and well, there's no end, really.
Recent articles
- The Summer of Johann: prompt injections as far as the eye can see - 15th August 2025
- Open weight LLMs exhibit inconsistent performance across providers - 15th August 2025
- LLM 0.27, the annotated release notes: GPT-5 and improved tool calling - 11th August 2025