HTML entities for email addresses: don’t bother
2nd December 2003
I’ve suspected this for a long time, and now here’s the empirical evidence: Popular Spam Protection Technique Doesn’t Work. If you’re relying on HTML entities to protect your email address from spam harvesters (for example, username@example.com written out as a string of entities), your email address may as well be in plain text. Chip Rosenthal downloaded a tool called “Web Data Extractor v4.0” and tried it on some test data to prove once and for all that the technique doesn’t work.
My advice is to use your common sense when analysing a potential spam protection technique. If you were a spammer, would you be able to outwit the method? Spammers aren’t always very smart, but the people who write spamming tools (and get paid big bucks for them) are. Also remember to think about the payoff: decoding a bunch of entities is a cheap operation. Embedding a JavaScript interpreter to decipher email addresses that are glued together using JavaScript at the last possible moment is a lot harder and could slow down a tool, so it may not be worth the effort.
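To give a feel for just how cheap decoding entities is, here is a minimal Python sketch. It is not how Web Data Extractor or any real harvester works. The page fragment, the `harvest` function and the deliberately simplified email pattern are all made up for illustration; only `html.unescape` and `re` are real standard-library calls.

```python
import html
import re

# Hypothetical page fragment: every character of the address is written
# as a decimal character reference (e.g. "@" becomes "&#64;").
address = "username@example.com"
encoded = "".join(f"&#{ord(ch)};" for ch in address)
page = f'<a href="mailto:{encoded}">Contact me</a>'

# Deliberately simplified email pattern, good enough for this illustration.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def harvest(markup: str) -> list[str]:
    """Decode all character references, then scan for email addresses."""
    return EMAIL_RE.findall(html.unescape(markup))


# Scanning the raw markup finds nothing, because there is no literal "@".
raw_hits = EMAIL_RE.findall(page)

# One extra function call is all it takes to undo the "protection".
decoded_hits = harvest(page)

print(raw_hits)
print(decoded_hits)
```

The only difference between the scan that fails and the scan that succeeds is a single call to `html.unescape`, which is the kind of payoff calculation a tool author would make.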
I’m still pretty confident in my own anti-spam harvester technique of hiding my email address behind a POST form, but even that could eventually be outsmarted by a really dedicated harvesting tool.