Simon Willison’s Weblog

10 items tagged “fulltextsearch”

2020

Weeknotes: datasette-dump, sqlite-backup, talks

I spent some time this week digging into Python’s sqlite3 internals. I also gave two talks and recorded a third, due to air at PyGotham in October.

[... 928 words]

sqlite-utils 2.14 (via) I finally figured out porter stemming with SQLite full-text search today—it turns out it’s as easy as adding tokenize=’porter’ to the CREATE VIRTUAL TABLE statement. So I just shipped sqlite-utils 2.14 with a tokenize= option (plus the ability to insert binary file data from stdin). # 1st August 2020, 9:19 pm

datasette-search-all: a new plugin for searching multiple Datasette tables at once

I just released a new plugin for Datasette, and it’s pretty fun. datasette-search-all is a plugin written mostly in JavaScript that executes the same search query against every searchable table in every database connected to your Datasette instance.

[... 819 words]

2019

sqlite-utils 1.11. Amjith Ramanujam contributed an excellent new feature to sqlite-utils, which I’ve now released as part of version 1.11. Previously you could enable SQLite full-text-search on a table using the .enable_fts() method (or the “sqlite-utils enable-fts” CLI command) but it wouldn’t reflect future changes to the table—you had to use populate_fts() any time you inserted new records. Thanks to Amjith you can now pass create_triggers=True (or --create-triggers) to cause sqlite-utils to automatically add triggers that keeps the FTS index up-to-date any time a row is inserted, updated or deleted from the table. # 3rd September 2019, 1:05 am

Exploring search relevance algorithms with SQLite

SQLite isn’t just a fast, high quality embedded database: it also incorporates a powerful full-text search engine in the form of the FTS4 and FTS5 extensions. You’ve probably used these a bunch of times already: many iOS, Android and desktop applications use SQLite under-the-hood and use it to implement their built-in search.

[... 1398 words]

2018

Datasette: Full-text search. I wrote some documentation for Datasette’s full-text search feature, which detects tables which have been configured to use the SQLite FTS module and adds a search input box and support for a _search= querystring parameter. # 12th May 2018, 12:09 pm

2017

Implementing faceted search with Django and PostgreSQL

I’ve added a faceted search engine to this blog, powered by PostgreSQL. It supports regular text search (proper search, not just SQL“like” queries), filter by tag, filter by date, filter by content type (entries vs blogmarks vs quotation) and any combination of the above. Some example searches:

[... 3049 words]

2009

Xapian performance comparision with Whoosh. Whoosh appears to be around four times slower than Xapian for indexing and empty cache searches, but Xapian with a full cache blows Whoosh out of the water (5408 searches/second compared to 26.3). Considering how fast Xapian is, that’s still a pretty impressive result for the pure-Python Whoosh. # 14th February 2009, 1:15 pm

Tokyo Cabinet and Tokyo Tyrant Presentation. By Tokyo Cabinet author Mikio Hirabayashi. The third leg of the Tokyo tripod is Tokyo Dystopia, a full-text search engine which is presumably a modern replacement for Mikio’s older hyperestraier engine. # 14th February 2009, 11:34 am

2008

How-to: Full-text search in Google App Engine. Use search.SearchableModel instead of db.Model—it’s pretty rough at the moment which is probably why it’s still undocumented. # 27th June 2008, 8:25 am