Posts tagged python, search
Filters: python × search × Sorted by date
ast-grep (via) There are a lot of interesting things about this year-old project.
sg (an alias for ast-grep) is a CLI tool for running AST-based searches against code, built in Rust on top of the Tree-sitter parsing library. You can run commands like this:
sg -p ’await await_me_maybe($ARG)’ datasette --lang python
To search the datasette directory for code that matches the search pattern, in a syntax-aware way.
It works across 19 different languages, and can handle search-and-replace too, so it can work as a powerful syntax-aware refactoring tool.
My favourite detail is how it’s packaged. You can install the CLI utility using Homebrew, Cargo, npm or pip/pipx—each of which will give you a CLI tool you can start running. On top of that it provides API bindings for Rust, JavaScript and Python!
Building Search DSLs with Django (via) Neat tutorial by Dan Lamanna: how to build a GitHub-style search feature—supporting modifiers like “is:open author:danlamanna”—using PyParsing and the Django ORM.
Twitter API: What is the best data storage mechanism and client library for analysing tweets using Python?
It depends on how much data you intend to collect, and how you intend to then share that data.
[... 182 words]Haystack 1.0 Final Released. I’ve used Haystack on a number of projects recently, and it has proved itself as a completely painless way of adding full-text search (using Solr or Whoosh—I haven’t tried the Xapian backend yet) to a Django ORM powered project in just a few minutes. Congratulations, Daniel + contributors.
Large Problems in Django, Mostly Solved: Search. Eric Holscher shows how Haystack uses a number of common Django patterns (object registration, pluggable backends, QuerySet-style chaining and class-based views) to great effect in creating a powerful search application for Django. Makes me wonder if more of those patterns should be promoted to first class concepts within Django.
Haystack (via) A brand new modular search plugin for Django, by Daniel Lindsley. The interface is modelled after the Django ORM (complete with declarative classes for defining your search schema) and it ships with backends for both Solr and pure-python Whoosh, with more on the way. Excellent documentation.
django-springsteen and Distributed Search. Will Larson’s Django search library currently just talks to Yahoo! BOSS, but is designed to be extensible for other external search services. Interestingly, it uses threads to fire off several HTTP requests in parallel from within the Django view.
Xapian performance comparision with Whoosh. Whoosh appears to be around four times slower than Xapian for indexing and empty cache searches, but Xapian with a full cache blows Whoosh out of the water (5408 searches/second compared to 26.3). Considering how fast Xapian is, that’s still a pretty impressive result for the pure-Python Whoosh.
Whoosh. A brand new, pure-python full text indexing engine (think Lucene). Claims to offer performance in the same league as wrappers to C or Java libraries. If this works as well as it claims it will be an excellent tool for adding search to projects that wish to avoid a dependency on an external engine.
solango. Another attempt at a Django/Solr integration library, based on code written for “a top 20 newspaper site” (I’d love to know which one). This is well documented, uses a registration model clearly inspired by the Django admin which keeps search related metadata out of your regular models and includes management commands for re-indexing and generating Solr schema.xml files.
How-to: Full-text search in Google App Engine. Use search.SearchableModel instead of db.Model—it’s pretty rough at the moment which is probably why it’s still undocumented.
In-Depth django-sphinx Tutorial. Another neat Django extension from the guys at Curse: easy integration with the sphinx full text search engine.
pysolr. Python wrapper for Solr, the search web service wrapper for Lucene. One thing I’m not clear on: do you need to configure Solr with the fields you’ll be indexing in advance, or can Solr create new fields on the fly to match the data you send it?
django-sphinx (via) More code from Curse Gaming; this time a really nice API for adding Sphinx full-text search to a Django model.
Apache Solr 1.1. Solr is the search Web Service built on top of Lucene. The latest release introduces JSON, Python and Ruby response formats in addition to XML.