HTML entities for email addresses: don’t bother
2nd December 2003
I’ve suspected this for a long time, and now here’s the empirical evidence: Popular Spam Protection Technique Doesn’t Work. If you’re relying on HTML entities to protect your email address from spam harvesters—for example username@example.com
—your email address may as well be in plain text. Chip Rosenthal downloaded a tool called “Web Data Extractor v4.0” and tried it on some test data to prove once and for all that the technique doesn’t work.
My advice is to use your common sense when analysing a potential spam protection technique. If you were a spammer, would you be able to outwit the method? Spammers aren’t always very smart, but the people who write spamming tools (and get paid big bucks for them) are. Also remember to think about the payoff—unencoding a bunch of entities is a cheap operation. Embedding a Javascript interpreter to decipher email addresses that are glued together using Javascript at the last possible moment is a lot harder and could slow down a tool, so it may not be worth the effort.
I’m still pretty confident in my own anti-spam harvester technique of hiding my email address behind a POST form, but even that could eventually be outsmarted by a really dedicated harvesting tool.
More recent articles
- Weeknotes: Llama 3, AI for Data Journalism, llm-evals and datasette-secrets - 23rd April 2024
- Options for accessing Llama 3 from the terminal using LLM - 22nd April 2024
- AI for Data Journalism: demonstrating what we can do with this stuff right now - 17th April 2024
- Three major LLM releases in 24 hours (plus weeknotes) - 10th April 2024
- Building files-to-prompt entirely using Claude 3 Opus - 8th April 2024
- Running OCR against PDFs and images directly in your browser - 30th March 2024
- llm cmd undo last git commit - a new plugin for LLM - 26th March 2024
- Building and testing C extensions for SQLite with ChatGPT Code Interpreter - 23rd March 2024
- Claude and ChatGPT for ad-hoc sidequests - 22nd March 2024
- Weeknotes: the aftermath of NICAR - 16th March 2024