More SOLR and eDismax parser fun

I've been working on boosting certain indexed fields when searching using Django and SOLR. This seemed to work well until I deployed the code in production. In production I was getting completely nonsensical results: pages that should have been explicitly excluded showed up anyway, and I only got a fraction of the hits I expected.

It turned out the production server was running SOLR 3.6 and my buildout was still pinned to version 3.5. Yes, shame on me: I should have double-checked versions first, but I never imagined a minor version change would have such an impact.

Once I started looking into the version change and the eDismax plugin, I ran into a number of issues. Most notably this one:

eDismax: A fielded query wrapped by parens is not recognized
... a query like this
q=(name:test)
will yield 0 hits in 3.6 while it worked in 3.5. It works without the parens.

A solution / workaround turns out to be to prefix the field names with a space, so that the field name no longer directly follows the opening paren. E.g.

http://localhost:8983/solr/select/?q=%28+text%3Atest

Note the + right after %28: it is a URL-encoded space. Without it I would get zero results.
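To see where that + comes from, here is a minimal sketch using Python 3's standard urllib.parse (an assumption on my side; the code in this post dates from Python 2). The query value, including its closing paren, is made up for illustration; only the "text" field name is taken from the URL above.

```python
from urllib.parse import parse_qs, urlencode

# Hypothetical query with a space between "(" and the field name.
query = "( text:test)"

# urlencode() quotes values with quote_plus(), so the space becomes "+",
# "(" becomes %28 and ":" becomes %3A.
encoded = urlencode({"q": query})
print(encoded)

# Decoding the query string again shows the value SOLR ends up parsing.
decoded = parse_qs(encoded)["q"][0]
print(decoded)
```

So the + is nothing special on the SOLR side, it is just the space surviving URL encoding.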

I worked around this in my code by defining a custom SOLR backend (which I already had anyway, because of some Haystack issues) and added:

    # Work around the SOLR 3.6 eDismax bug: put a space after every "(".
    query_string = query_string.replace("(", "( ")
    try:
        raw_results = self.conn.search(query_string, **kwargs)
    except (IOError, SolrError) as e:
        if not self.silently_fail:
            raise
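The replace itself can also be pulled out into a small helper, which makes it easy to test without a running SOLR. This is only a sketch of a possible variation, not what my backend does: the name pad_open_parens is made up, and it uses a regex so that a paren already followed by a space is left alone.

```python
import re


def pad_open_parens(query_string):
    """Insert a space after every "(" that is not already followed by one.

    Hypothetical helper. Unlike a plain str.replace it is safe to apply
    twice, but it still pads parens that sit inside quoted phrases.
    """
    return re.sub(r"\((?! )", "( ", query_string)


print(pad_open_parens("(name:test)"))
print(pad_open_parens("((a:1) OR (b:2))"))
```

Calling it on an already padded query returns the query unchanged, so it does not matter if it runs more than once on the same string.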

 

And now the results I expected are back again :)

Last updated July 8, 2013, 5:24 p.m. | filed under solr, django