Friday, 16th August 2002
Paul Graham: A Plan for Spam. Paul suggests using content-based filters that learn from users specifically marking messages as spam or legitimate mail. The system then picks emails apart looking for common terms (in both the body and the header of the message) that can later be used to identify spam messages. He claims his tests have let through only 5 per 1000 spams, with 0 false positives. Impressive stuff, and great reading for the excellent explanations of some advanced algorithmic and statistical techniques.
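To make the idea concrete, here is a rough Python sketch of that kind of filter. It is a simplification written for illustration, not Paul's actual code: the function names, the tokenizer, the 0.4 score for unseen tokens, the 0.01/0.99 clamps and the choice of the 15 most telling tokens are all assumptions made for this sketch.

```python
import math
import re
from collections import Counter

# Assumed tokenizer: runs of letters, digits, and a few punctuation marks.
TOKEN_RE = re.compile(r"[A-Za-z0-9$'-]+")


def tokenize(text):
    return [t.lower() for t in TOKEN_RE.findall(text)]


def train(messages):
    """Count how often each token appears across a list of messages."""
    counts = Counter()
    for message in messages:
        counts.update(tokenize(message))
    return counts


def token_spam_probability(token, spam_counts, ham_counts, n_spam, n_ham):
    """Estimate how strongly a single token suggests spam."""
    s = min(1.0, spam_counts[token] / max(n_spam, 1))
    h = min(1.0, ham_counts[token] / max(n_ham, 1))
    if s + h == 0:
        return 0.4  # assumed score for a token never seen in training
    # Clamp so that no single token is treated as absolute proof.
    return min(0.99, max(0.01, s / (s + h)))


def spam_score(message, spam_counts, ham_counts, n_spam, n_ham, top_n=15):
    """Combine the most telling token probabilities into one score."""
    probs = [
        token_spam_probability(t, spam_counts, ham_counts, n_spam, n_ham)
        for t in set(tokenize(message))
    ]
    # Keep the tokens whose probability is furthest from a neutral 0.5.
    probs.sort(key=lambda p: abs(p - 0.5), reverse=True)
    probs = probs[:top_n]
    spam_product = math.prod(probs)
    ham_product = math.prod(1 - p for p in probs)
    return spam_product / (spam_product + ham_product)
```

A score close to 1 suggests spam and a score close to 0 suggests legitimate mail. Where to draw the line is a tuning decision, and a real filter would need far more training mail than a toy example can show.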
Hixie has posed a fiendish markup quiz—spot the four markup errors in a document that validates. It’s harder than it sounds. I’ve mailed off my answers, but I’m not expecting to get full marks.[... 43 words]
I’ve improved the comment system at the behest of Adrian Holovaty. URLs posted in a comment (both those beginning with http:// and those beginning just with www.) will now be converted into links.
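For anyone curious how this sort of conversion can work, here is a minimal Python sketch. It is not the code running on this site: the `linkify` name, the regular expression, the handling of trailing punctuation and the extra support for https:// are all assumptions made for the example.

```python
import html
import re

# Assumed pattern: a URL starts with http://, https:// or www. and runs
# until whitespace, an angle bracket, or a quote character.
URL_RE = re.compile(r"\b(https?://|www\.)([^\s<>\"']+)", re.IGNORECASE)


def linkify(comment):
    """Escape a plain-text comment and wrap any URLs in <a> tags."""
    out = []
    last = 0
    for match in URL_RE.finditer(comment):
        # Trailing punctuation usually belongs to the sentence, not the URL.
        rest = match.group(2).rstrip(".,;:!?)")
        if not rest:
            continue  # only punctuation followed the prefix; leave as text
        url = match.group(1) + rest
        # Bare www. addresses need a scheme to work as a link target.
        href = "http://" + url if match.group(1).lower() == "www." else url
        out.append(html.escape(comment[last:match.start()]))
        out.append('<a href="%s">%s</a>' % (html.escape(href), html.escape(url)))
        last = match.start() + len(url)
    out.append(html.escape(comment[last:]))
    return "".join(out)
```

Escaping the surrounding text separately from the URLs means any markup typed into a comment is shown literally, while the generated links stay intact.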
Pink Goblin (otherwise known as HarryF) explains why magic quotes are evil. This is an issue that every PHP developer should be aware of, as it can cause all kinds of problems in your scripts if you ignore it. He suggests using a custom myAddSlashes() function which only calls addslashes() if magic quotes are turned off. I have an alternative solution: choose your preferred setting (quotes on or off) and apply it at run time to all incoming data in one go. My code for doing this is available here. By a bizarre coincidence I wrote the script this morning, then spotted a link to the Pink Goblin article on tidak ada literally five minutes after finishing it.
[... 21 words]
Mark Pilgrim has written an ultra-liberal RSS locator (in Python, naturally). I guess he had to scratch an itch. The amount of work it puts in to locating an RSS feed for a site is astonishing, especially when you consider how short the actual code is.[... 50 words]
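As an illustration of the kind of check a feed locator might start with, here is a short Python sketch that looks for `<link rel="alternate">` elements pointing at RSS feeds. It is not Mark's code and covers only this one strategy: the `FeedLinkParser` and `find_feeds` names and the set of accepted MIME types are assumptions made for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Assumed set of MIME types that mark a <link> element as a feed.
FEED_TYPES = {"application/rss+xml"}


class FeedLinkParser(HTMLParser):
    """Collect feed URLs advertised through <link rel="alternate"> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = {name: (value or "") for name, value in attrs}
        rels = attributes.get("rel", "").lower().split()
        mime_type = attributes.get("type", "").strip().lower()
        href = attributes.get("href", "")
        if "alternate" in rels and mime_type in FEED_TYPES and href:
            # Resolve relative links against the address of the page.
            self.feeds.append(urljoin(self.base_url, href))


def find_feeds(html_text, base_url):
    """Return the feed URLs advertised by a page, in document order."""
    parser = FeedLinkParser(base_url)
    parser.feed(html_text)
    parser.close()
    return parser.feeds
```

The standard library parser is fairly forgiving of sloppy markup, which matters here because many pages that advertise a feed are not valid HTML.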
css-discuss has seen some interesting threads in the past 24 hours and the new archive means I can link straight to them—so here goes. Kentaro Kaji kicked off the topic of techniques for aligning an image with the bottom of a block of text. In the same thread, Benn Nunn advocated avoiding width and height attributes on images and keeping that information in an external style sheet. Other topics included accessible navigation and a tricky absolute positioning problem with Opera. The most informative mailing list I’m currently subscribed to just keeps getting better.[... 122 words]