Kurt Jarchow's Blog

February 23, 2009

SOLR AND/OR Boolean Operators & DisMax

Filed under: Uncategorized — Kurt Jarchow @ 9:08 am

I had a problem with our search service.  SOLR was doing a great job of retrieving search results, but with over 300,000 of them it was hard to narrow things down to specific results.  Most search engines solve this with operators like AND/OR/NOT, and while SOLR supports these, that support seems to disappear when you enable dismax.

Unfortunately, this made the Drupal apachesolr installation incompatible with the AND/OR operators.  But, as is almost always the case with SOLR, I found a solution.  Setting the “minimum should match” (mm) parameter to 1 seemed to do the trick and enable the AND/OR operators.  (NOTE: Please test your results before setting this live; there might be some unwanted side effects.)
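As a rough illustration of the setting described above, here is a minimal Python sketch that builds a dismax request URL with mm set to 1. The base URL and the field names title and body are hypothetical placeholders, and the function name build_dismax_url is made up for this example. It assumes the dismax parser is selected with the defType parameter; depending on your Solr version and configuration, it may instead be selected through a request handler, and mm may be set in solrconfig.xml rather than on the request.

```python
from urllib.parse import urlencode


def build_dismax_url(base_url: str, user_query: str, query_fields: str) -> str:
    """Build a Solr select URL that uses dismax with mm=1.

    base_url and query_fields are placeholders for this sketch;
    substitute the values from your own installation.
    """
    params = {
        "q": user_query,        # the raw text the user typed
        "defType": "dismax",    # assumption: dismax chosen per request
        "qf": query_fields,     # fields dismax should search (hypothetical names)
        "mm": "1",              # "minimum should match" set to 1, as in the post
    }
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical host and fields, for illustration only.
url = build_dismax_url(
    "http://localhost:8983/solr",
    "java AND (cork OR dublin)",
    "title body",
)
print(url)
```

This only constructs the request; as noted above, check the results you actually get back before relying on it.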

When testing AND/OR I found it confusing until I read a great article explaining how the AND/OR system works.  By using AND/OR you are actually just identifying text as being REQUIRED or OPTIONAL.  I was first confused by the results of this search query:

java AND (cork OR dublin)

What I’m asking for here is to find all results that must contain java and either cork or dublin.  This won’t work.  Use:

java AND ( OR cork OR dublin)

This will properly identify cork and dublin as being optional.
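To make the REQUIRED/OPTIONAL idea a bit more concrete, here is a toy Python model of how required, optional and prohibited clauses combine. It is not Solr or Lucene code, and the names Term, Clause, Group and matches are invented for this sketch. The rule it assumes is the usual one for Lucene-style boolean queries: every required clause must match, no prohibited clause may match, and optional clauses only become mandatory (at least one of them) when the group has no required clause.

```python
from dataclasses import dataclass
from typing import List, Set, Union

# Clause markers, mirroring the +term / term / -term prefix notation.
MUST, SHOULD, MUST_NOT = "+", "", "-"


@dataclass
class Term:
    text: str


@dataclass
class Clause:
    occur: str                      # MUST, SHOULD or MUST_NOT
    query: "Union[Term, Group]"     # a single term or a nested group


@dataclass
class Group:
    clauses: List[Clause]


def matches(query, words: Set[str]) -> bool:
    """Return True if a document (given as a set of words) matches the query."""
    if isinstance(query, Term):
        return query.text in words
    must = [c for c in query.clauses if c.occur == MUST]
    should = [c for c in query.clauses if c.occur == SHOULD]
    must_not = [c for c in query.clauses if c.occur == MUST_NOT]
    if any(matches(c.query, words) for c in must_not):
        return False                # a prohibited clause matched
    if not all(matches(c.query, words) for c in must):
        return False                # a required clause is missing
    if must:
        return True                 # optional clauses stay optional
    # No required clauses: at least one optional clause has to match.
    return any(matches(c.query, words) for c in should)


# java is required; the group is required; cork and dublin are optional inside it.
query = Group([
    Clause(MUST, Term("java")),
    Clause(MUST, Group([
        Clause(SHOULD, Term("cork")),
        Clause(SHOULD, Term("dublin")),
    ])),
])

print(matches(query, {"java", "cork"}))
print(matches(query, {"java", "galway"}))
```

In this model a document with java and cork matches, while one with java and galway does not, because the required group has no required clause of its own and so needs at least one of its optional terms.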

3 Comments

  1. Hey Kurt -

    interesting – I was not aware of this – layering on my own understanding of boolean algebra

    java AND (cork OR dublin)

    what results would that produce? perhaps it depends on your default boolean operator? (which could be AND)

    ?
    thank you.
    Jodi

    Comment by Jodi — June 18, 2009 @ 4:03 pm

  2. plus, I’m querying with localsolr – I wonder if this is still an issue?

    ty.Jodi

    Comment by Jodi — June 18, 2009 @ 4:08 pm

  3. Hi Jodi,

    Yes, it most definitely depends on your default operator.

    The best way I can describe it is to think of every term as being marked required or optional. In the prefix notation, a plus in front means AND (required) and no prefix means OR (optional); a minus in front means the term is prohibited (NOT). If no operator is given, the default is used.

    having a default as AND:

    java AND (cork OR dublin)

    equals:

    +java +(+cork dublin)

    so docs must have java and must have cork and may have dublin.

    Very confusing.

    Comment by jarchowk — June 18, 2009 @ 4:31 pm
