Social whitelisting with OpenID
22nd January 2007
A key feature of OpenID is that it provides a globally unique identifier for every user, no matter what site or service they are using on the Web.
This gives us a powerful tool to fight comment spam. If someone has logged in with an OpenID and we are confident that they are not a spammer (remember, spammers can create OpenIDs too) we can add them to a whitelist, allowing their comments to skip any moderation step or spam guard that we might have in place.
This weblog has a comment spam detection system based on simple heuristics. Comments are assigned a score; if the score exceeds a certain level the comment is placed in a queue for moderation. As of today, one of the heuristics is “does the comment author have an OpenID that is on the whitelist”. I’ve populated my whitelist with the OpenIDs of people who have posted two or more useful comments and do not appear to be using an anonymous provider. I’ll be adding to it regularly in the future.
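To make the scoring idea concrete, here is a minimal Python sketch of how a whitelist check might sit alongside other heuristics. Everything in it is an assumption made for illustration: the weights, the threshold, the link-counting heuristic and all of the names are hypothetical, not the code this weblog actually runs.

```python
from dataclasses import dataclass
from typing import Optional, Set

# Illustrative values only; the real weights and threshold are not published.
MODERATION_THRESHOLD = 5
LINK_PENALTY = 3
WHITELIST_BONUS = -10


@dataclass
class Comment:
    body: str
    openid: Optional[str] = None  # None when the author did not sign in


def spam_score(comment: Comment, whitelist: Set[str]) -> int:
    """Add up the heuristics; a higher score means more spam-like."""
    score = 0
    # Hypothetical heuristic: every link in the body looks a bit suspicious.
    score += LINK_PENALTY * comment.body.lower().count("http://")
    # The whitelist heuristic: a trusted OpenID pulls the score down.
    if comment.openid is not None and comment.openid in whitelist:
        score += WHITELIST_BONUS
    return score


def needs_moderation(comment: Comment, whitelist: Set[str]) -> bool:
    """Queue the comment for moderation if its score exceeds the threshold."""
    return spam_score(comment, whitelist) > MODERATION_THRESHOLD
```

With these made-up numbers, a comment containing two links is queued for moderation when it is anonymous, but goes straight through when its author's OpenID is on the whitelist.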
Here comes the social part: I’m sharing my whitelist. If you run your own OpenID-enabled weblog you are welcome to include my whitelist in your comment spam heuristics. If you publish your own whitelist, I will happily do the same.
Social whitelisting benefits from being decentralised, just like OpenID. If I find that you have whitelisted a spammer, I can unsubscribe from your whitelist. There’s no central authority or point of failure.
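Subscribing and unsubscribing could be as simple as the following Python sketch. It assumes, purely for illustration, that a shared whitelist is a plain text file with one OpenID per line and `#` for comments; no format has been specified here, and the function names are hypothetical.

```python
def parse_whitelist(text):
    """Parse a whitelist, assumed to be plain text with one OpenID per line."""
    entries = set()
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines.
        if line and not line.startswith("#"):
            entries.add(line)
    return entries


def combined_whitelist(own, subscriptions):
    """Union of our own whitelist and every list we currently subscribe to.

    `subscriptions` maps a publisher's name to the set of OpenIDs they trust.
    """
    combined = set(own)
    for entries in subscriptions.values():
        combined |= entries
    return combined
```

Because each publisher's list is kept separate until the final merge, unsubscribing is just a matter of deleting that publisher's entry from `subscriptions` and recomputing the union. Nobody else needs to be involved.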
Long-time readers may be feeling a strong sense of déjà vu. Way back in September 2003, I proposed shared comment blacklists as a solution to weblog comment spam. The idea was simple: every time you delete a spam comment, you add the link it was advertising to a public blacklist. Other blogs could then subscribe to your blacklist and block any new comments advertising the same site.
The blacklisting idea was flawed from the very start. It was a classic example of Marcus J. Ranum’s number one dumbest idea in computer security: Default Permit. Spam blacklists assume that if we don’t know a link is bad, it’s good. Spammers can create new bad links far faster than we can blacklist them.
Here’s Ranum’s suggested alternative:
The opposite of “Default Permit” is “Default Deny” and it is a really good idea. It takes dedication, thought, and understanding to implement a “Default Deny” policy, which is why it is so seldom done. It’s not that much harder to do than “Default Permit” but you’ll sleep much better at night.
Social whitelisting uses Default Deny. As such, I believe it has a much higher chance of making a useful impact on the comment spam problem.
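The difference between the two policies comes down to how each one treats something it has never seen before. A tiny Python sketch, with hypothetical function names, makes the contrast explicit:

```python
def blacklist_allows(link, blacklist):
    # Default Permit: anything not already known to be bad gets through.
    return link not in blacklist


def whitelist_allows(openid, whitelist):
    # Default Deny: only identities already known to be good skip moderation.
    return openid in whitelist
```

A brand new spam link passes the blacklist check until someone gets around to listing it. A brand new OpenID does not pass the whitelist check until someone decides to trust it.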
Update: I should have mentioned that this idea developed over a number of discussions with Tom Coates, which totally slipped my mind when I was writing it up at 3am.