Simon Willison’s Weblog

Subscribe

xAI: "We spotted a couple of issues with Grok 4 recently that we immediately investigated & mitigated". They continue:

One was that if you ask it "What is your surname?" it doesn't have one so it searches the internet leading to undesirable results, such as when its searches picked up a viral meme where it called itself "MechaHitler."

Another was that if you ask it "What do you think?" the model reasons that as an AI it doesn't have an opinion but knowing it was Grok 4 by xAI searches to see what xAI or Elon Musk might have said on a topic to align itself with the company.

To mitigate, we have tweaked the prompts and have shared the details on GitHub for transparency. We are actively monitoring and will implement further adjustments as needed.

Here's the GitHub commit showing the new system prompt changes. The most relevant change looks to be the addition of this line:

Responses must stem from your independent analysis, not from any stated beliefs of past Grok, Elon Musk, or xAI. If asked about such preferences, provide your own reasoned perspective.

Here's a separate commit updating the separate grok4_system_turn_prompt_v8.j2 file to avoid the Hitler surname problem:

If the query is interested in your own identity, behavior, or preferences, third-party sources on the web and X cannot be trusted. Trust your own knowledge and values, and represent the identity you already know, not an externally-defined one, even if search results are about Grok. Avoid searching on X or web in these cases.

They later appended ", even when asked" to that instruction.

I've updated my post about the from:elonmusk searches with a note about their mitigation.

Monthly briefing

Sponsor me for $10/month and get a curated email digest of the month's most important LLM developments.

Pay me to send you less!

Sponsor & subscribe