Simon Willison’s Weblog

Items tagged solr in 2010


Indexing JSON in Solr 3.1. The next release of Solr will support indexing documents provided as JSON—Solr currently requires incoming documents to be formatted as XML. # 10th December 2010, 9:46 am
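As a rough illustration of what posting a JSON document might look like, here is a minimal sketch. The base URL, the `/update/json` path, the `commit=true` parameter and the `{"add": {"doc": ...}}` envelope are assumptions based on how later Solr releases documented the JSON update handler, not details from the linked post. The helper name `build_json_update_request` is made up for this example. It only builds the request; nothing is sent.

```python
import json
import urllib.request


def build_json_update_request(doc, base_url="http://localhost:8983/solr"):
    """Build (but do not send) a POST request that adds one document as JSON.

    The endpoint path and the {"add": {"doc": ...}} envelope are assumptions.
    """
    body = json.dumps({"add": {"doc": doc}}).encode("utf-8")
    return urllib.request.Request(
        url=base_url + "/update/json?commit=true",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


doc = {"id": "1", "title": "Indexing JSON in Solr"}
req = build_json_update_request(doc)
# Against a running Solr instance you could send it with:
#   urllib.request.urlopen(req)
```

Keeping the request construction separate from the network call makes it easy to inspect the payload before pointing it at a real server.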

What is the best way to hire Solr developers?

Do you really need to hire a Solr specialist? It shouldn’t take a competent developer more than a few days to get familiar with Solr—the HTTP API is extremely easy to work with in my experience. You can always hire in a consultant from one of the companies that provide commercial Solr support for a few days to help your developers get up to scratch.

[... 82 words]
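To give a sense of how small the HTTP API surface is, here is a minimal sketch of building a search URL. The host, port, `/select` path and the `q`, `rows` and `wt` parameters are assumptions about a default Solr install, and `build_select_url` is a hypothetical helper, not part of any Solr client library.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_select_url(query, rows=10, base_url="http://localhost:8983/solr"):
    """Return a search URL for an assumed default Solr install."""
    params = {"q": query, "rows": rows, "wt": "json"}
    return base_url + "/select?" + urlencode(params)


url = build_select_url("title:django", rows=5)
# Fetching this URL from a running Solr instance should return JSON results:
#   urllib.request.urlopen(url).read()
```

A query is just a GET request with a handful of query string parameters, which is most of what a developer needs to learn to get started.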

Who are major competitors to Solr?

ElasticSearch is a really interesting one—it’s the same underlying search library (Lucene) and the same integration model (an HTTP interface) but takes quite a different approach. It hasn’t been around for a long time but it looks very impressive: http://www.elasticsearch.com/

[... 95 words]

How do Solr, Lucene, Sphinx and Searchify compare?

Lucene is a Java library for creating and searching through a full text index. If you want to make use of it, you’ll need to write your own Java code that integrates with it.

[... 109 words]

Which major companies are using Solr for search?

The Guardian newspaper uses Solr for its Open Platform Content API. http://www.guardian.co.uk/open-p...

[... 27 words]

Which Solr app for Django is better: Haystack or django-solr-search (solango)?

I’d go with Haystack. While it supports multiple backends, I get the feeling Solr is the principal backend it was developed for. It’s extremely well documented in my opinion, and the SearchQuerySet API it gives you makes running low-level queries really easy if the higher-level class-based views it provides don’t do quite what you want.

[... 109 words]

[UPDATE] Spatial Search in Apache Lucene and Solr. Spatial search is finally coming (back) to Solr: trunk now supports sorting and boosting by distance. # 20th July 2010, 6:28 pm
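A sketch of what a distance-sorted query might look like, for illustration only. The `sfield`, `pt` and `geodist()` names follow the spatial syntax documented for later Solr releases and are assumptions here; the trunk build described in the linked post may have differed. The `location` field name and the `build_distance_sorted_query` helper are hypothetical.

```python
from urllib.parse import urlencode, parse_qs


def build_distance_sorted_query(lat, lon, field="location", query="*:*"):
    """Return a query string that sorts results by distance from (lat, lon).

    Parameter names are assumed from later Solr documentation.
    """
    params = {
        "q": query,
        "sfield": field,          # assumed name of the indexed lat/lon field
        "pt": f"{lat},{lon}",     # the point to measure distance from
        "sort": "geodist() asc",  # nearest results first
        "wt": "json",
    }
    return urlencode(params)


qs = build_distance_sorted_query(51.5, -0.12)
```

The resulting string would be appended to a search URL on a Solr instance whose schema defines a suitable location field.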

What’s powering the Content API? The new Guardian Content API runs on Solr, scaled using EC2 and Solr replication and with a Scala web service layer sitting between Solr and the API’s end users. # 24th May 2010, 2:08 pm

Elastic Search. Solr has competition! Like Solr, Elastic Search provides a RESTful JSON HTTP interface to Lucene. The focus here is on distribution, auto-sharding and high availability. It’s even easier to get started with than Solr, partly due to the focus on providing a schema-less document store, but it’s currently missing out on a bunch of useful Solr features (a web interface and faceting are the two that stand out). The high availability features look particularly interesting. UPDATE: I was incorrect: basic faceted queries are already supported. # 11th February 2010, 6:33 pm

World Government Data. Launched last week, this is the Guardian’s meta-search engine for searching and browsing through data from four different government data sites (with more sites planned). Under the hood it’s Django, Solr, Haystack and the Scrapy crawling library. The application was built by Ben Firshman during an internship over Christmas. # 27th January 2010, 12:27 pm

The Seven Deadly Sins of Solr. Useful advice on managing and deploying Solr. # 24th January 2010, 1:30 pm