22nd July 2024 - Link Blog
Breaking Instruction Hierarchy in OpenAI's gpt-4o-mini. Johann Rehberger digs further into GPT-4o's "instruction hierarchy" protection and finds that it has little impact at all on common prompt injection approaches.
I spent some time this weekend to get a better intuition about gpt-4o-mini model and instruction hierarchy, and the conclusion is that system instructions are still not a security boundary.
From a security engineering perspective nothing has changed: Do not depend on system instructions alone to secure a system, protect data or control automatic invocation of sensitive tools.
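To make that last point concrete, here is a minimal sketch of what not depending on system instructions alone could look like: the decision to run a sensitive tool is made in ordinary application code, outside the model, so an injected prompt cannot talk its way past it. The tool names, the `ToolCall` type, and the `authorize` function are all hypothetical and chosen only for illustration; they do not come from Johann's article or from any OpenAI API.

```python
from dataclasses import dataclass, field

# Hypothetical tool names, used only for illustration.
SENSITIVE_TOOLS = {"send_email", "delete_file"}


@dataclass
class ToolCall:
    """A tool invocation requested by the model (assumed shape)."""
    name: str
    arguments: dict = field(default_factory=dict)


def authorize(call: ToolCall, user_confirmed: bool) -> bool:
    """Decide in application code whether a model-requested tool call may run.

    Sensitive tools require explicit confirmation from the human user,
    regardless of what the system prompt or the model output says.
    """
    if call.name in SENSITIVE_TOOLS:
        return user_confirmed
    return True


# A sensitive call without user confirmation is refused.
blocked = authorize(ToolCall("send_email", {"to": "someone@example.com"}), user_confirmed=False)
# A non-sensitive call goes through.
allowed = authorize(ToolCall("search_docs", {"query": "pricing"}), user_confirmed=False)
```

The check here is deliberately simple. What matters is that it sits outside the prompt, so it holds even when an injection overrides the system instructions.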