Simon Willison’s Weblog


Thursday, 19th June 2003

More on Search

Tim Bray’s series on full-text search has got to the meaty bit: how search engines actually work, including an overview of the kind of data structures they use (presented in XML format for readability). The basics are a lot simpler than you might think. Tim has also posted some thoughts on how people actually use search, of which the most interesting point is that advanced search is hardly ever touched.

[... 222 words]

Quick testing of alt attributes

Via Web Graphics, ScriptyGoddess’ Get ALT Info bookmarklet, which displays a list of all of the images on a page along with their alt attributes; great for testing a page to make sure you haven’t missed any.
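The same kind of check can be sketched outside the browser. This is not the bookmarklet's code, just a minimal illustration in Python using the standard library's html.parser; the names AltCollector and list_alt_text are made up for this example.

```python
from html.parser import HTMLParser


class AltCollector(HTMLParser):
    """Collects (src, alt) pairs for every <img> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.images = []  # alt is None when the attribute is missing

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            attributes = dict(attrs)
            self.images.append((attributes.get("src"), attributes.get("alt")))


def list_alt_text(html):
    """Return a list of (src, alt) tuples for the images in an HTML string."""
    collector = AltCollector()
    collector.feed(html)
    return collector.images
```

Any image whose second value comes back as None has no alt attribute at all, which is the case worth hunting for.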

[... 50 words]

Storing trees in a database

SitePoint: Storing Hierarchical Data in a Database, by Gijs Van Tulder. The article first shows the easy way of storing hierarchies in a database, using parent fields and a recursive PHP function to iterate up the tree. It then goes on to talk about a far more interesting alternative called “Modified Preorder Tree Traversal”, where trees are first “flattened” into a heap-like structure, then each node is stored with a pair of numbers representing that node’s position in the tree. I’d seen this somewhere before, but Gijs Van Tulder’s explanation is far clearer and comes with some good examples showing how this unconventional storage method can retrieve all of the descendants of a node in a single query. He also talks about ways of updating the tree structure when new items are added.
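To make the numbering idea concrete, here is a rough sketch in Python with an in-memory SQLite database. It is not the article's PHP code: the example tree, the table and column names (tree, lft, rgt) and the helper functions are all assumptions chosen for illustration. Each node gets a left number on the way down the tree and a right number on the way back up, so everything beneath a node falls between that node's two numbers.

```python
import sqlite3

# Hypothetical example tree: each key is a parent, each value its children.
TREE = {
    "Food": ["Fruit", "Meat"],
    "Fruit": ["Red", "Yellow"],
    "Red": ["Cherry"],
    "Yellow": ["Banana"],
    "Meat": ["Beef", "Pork"],
}


def number_nodes(name, counter, rows):
    """Depth-first walk: assign left going down, right coming back up.

    Appends (name, left, right) to rows and returns the next free number.
    """
    left = counter
    counter += 1
    for child in TREE.get(name, []):
        counter = number_nodes(child, counter, rows)
    rows.append((name, left, counter))
    return counter + 1


rows = []
number_nodes("Food", 1, rows)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tree (title TEXT, lft INTEGER, rgt INTEGER)")
conn.executemany("INSERT INTO tree VALUES (?, ?, ?)", rows)


def descendants(title):
    """Fetch every node below the given one with a single query."""
    cursor = conn.execute(
        "SELECT child.title FROM tree AS child, tree AS parent "
        "WHERE parent.title = ? "
        "AND child.lft > parent.lft AND child.rgt < parent.rgt "
        "ORDER BY child.lft",
        (title,),
    )
    return [row[0] for row in cursor]
```

With this tree, descendants("Fruit") returns Red, Cherry, Yellow and Banana without any recursion at query time. The trade-off is that inserting a node means renumbering every node to its right, which this sketch does not attempt.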

[... 140 words]
