Hacking Google Bard - From Prompt Injection to Data Exfiltration (via) Bard recently grew extension support, allowing it access to a user’s personal documents. Here’s the first reported prompt injection attack against that.
This kind of attack against LLM systems is inevitable any time you combine access to private data with exposure to untrusted inputs. In this case the attack vector is a Google Doc shared with the user, containing prompt injection instructions that instruct the model to encode previous data into an URL and exfiltrate it via a markdown image.
Google’s CSP headers restrict those images to *.google.com—but it turns out you can use Google AppScript to run your own custom data exfiltration endpoint on script.google.com.
Google claim to have fixed the reported issue—I’d be interested to learn more about how that mitigation works, and how robust it is against variations of this attack.
- Prompt injection explained, November 2023 edition - 27th November 2023
- I'm on the Newsroom Robots podcast, with thoughts on the OpenAI board - 25th November 2023
- Weeknotes: DevDay, GitHub Universe, OpenAI chaos - 22nd November 2023
- Deciphering clues in a news article to understand how it was reported - 22nd November 2023
- Exploring GPTs: ChatGPT in a trench coat? - 15th November 2023
- Financial sustainability for open source projects at GitHub Universe - 10th November 2023