Notes from Bing Chat—Our First Encounter With Manipulative AI
19th November 2024
I participated in an Ars Live conversation with Benj Edwards of Ars Technica today, talking about that wild period of LLM history last year when Microsoft launched Bing Chat and it instantly started misbehaving, gaslighting and defaming people.
Here’s the video of our conversation.
I ran the video through MacWhisper, extracted a transcript and used Claude to identify relevant articles I should link to. Here’s that background information to accompany the talk.
A rough timeline of posts from that Bing launch period back in February 2023:
- Microsoft announces AI-powered Bing search and Edge browser—Benj Edwards, Feb 7, 2023
- AI-powered Bing Chat spills its secrets via prompt injection attack—Benj Edwards, Feb 10, 2023
- AI-powered Bing Chat loses its mind when fed Ars Technica article—Benj Edwards, Feb 14, 2023
- Bing: “I will not harm you unless you harm me first”—Simon Willison, Feb 15, 2023
- Gareth Corfield: I’m beginning to have concerns for @benjedwards’ virtual safety—Twitter, Feb 15, 2023
- A Conversation With Bing’s Chatbot Left Me Deeply Unsettled—Kevin Roose, NYT, Feb 16, 2023
- It is deeply unethical to give a superhuman liar the authority of a $1 trillion company or to imply that it is an accurate source of knowledge / And it is deeply manipulative to give people the impression that Bing Chat has emotions or feelings like a human—Benj on Twitter (now deleted), Feb 16 2023
- Bing AI Flies Into Unhinged Rage at Journalist—Maggie Harrison Dupré, Futurism, Feb 17 2023
Other points that we mentioned:
- this AI chatbot “Sidney” is misbehaving—amazing forum post from November 23, 2022 (a week before even ChatGPT had been released) from a user in India talking about their interactions with a secret preview of Bing/Sydney
- Prompt injection attacks against GPT-3, where I coined the term “prompt injection” on September 12, 2022
- Eight Things to Know about Large Language Models (PDF) is the paper where I first learned about sycophancy and sandbagging (in April 2023)
- Claude’s Character by Anthropic talks about how they designed the personality for Claude (June 8 2024), plus my notes on that.
- Why ChatGPT and Bing Chat are so good at making things up, in which Benj argues for the term “confabulation” (April 2023).