Xapian performance comparison with Whoosh. Whoosh appears to be around four times slower than Xapian for indexing and empty-cache searches, but Xapian with a full cache blows Whoosh out of the water (5408 searches/second compared to 26.3). Considering how fast Xapian is, that’s still a pretty impressive result for the pure-Python Whoosh.
There is an indexing tuning parameter I should have surfaced in the API that's set pretty conservatively (indexing only uses 4MB at a time). Increasing it would make indexing go a lot faster.
For search speed, I don't know if I can do much without being extra clever or exploring parallel processing. I believe I'm just running into the speed limit of interpreted Python at this point :(
I'm thankful to Robert, the Xapian programmer, for running this test. When I get the data from him I can try reproducing it and work on making Whoosh faster for very large datasets.
Matt Chaput - 15th February 2009 02:28