Category Archives: Java

ElasticSearch Mock Solr Plugin

I just released an ElasticSearch plugin that Mocks the Solr interface. With this you can use tools and clients that are meant to talk to Solr with ElasticSearch. Some examples are Nutch, Apache ManifoldCF, SolrJ apps, etc. Currently, indexing and deleting of documents is supported 100% for XML (/update request handler) and JavaBin (/update/javabin request handler). Basic support for the Solr search handler (/select) is also included for the Solr q, start, rows, and fl parameters. The q parameter supports 100% of the lucene query syntax. Both XML and JavaBin response formats are supported.

To use the plugin:

  1. 1. Install
  2. $ES_HOME/bin/plugin install mattweber/elasticsearch-mocksolrplugin/1.0.0
  3. 2. Update your client code to point at ElasticSearch and the /_solr REST endpoint.
    http://localhost:9200/index/type/_solr

    Specifying the index and type is optional and will default to “solr” for index, and “docs” for type.

  4. 3. Use your Solr client as normal.

I have tested the plugin with Nutch and various SolrJ test code. Using Nutch with ElasticSearch is the reason I wrote this plugin. Instead of extending Nutch to support ElasticSearch as an endpoint, I figured it would be much better to support any tool trying to talk to Solr. This plugin should greatly reduce the effort in testing and/or replacing Solr with ElasticSearch. It also opens the doors for using tools that were previously not available to ElasticSearch users.

Source available on GitHub:

https://github.com/mattweber/elasticsearch-mocksolrplugin

Running Specific Solr Unit Tests

Just realized that as of 09/17/09 and revision 816090 of Solr you can now run specific unit tests instead of everything at once. This makes a developers life (mine) much easier because you no longer need to wait for all Solr’s tests to run just to test your particular piece of code.

To run specific testcase:

ant -Dtestcase=<CLASS NAME> junit

To run all tests for a specific package:

ant -Dtestpackage=<PACKAGE NAME> junit

To run all root tests of a specific package:

ant -Dtestpackageroot=<PACKAGE ROOT> junit

Solr AutoSuggest with TermsComponent and jQuery

I needed to implement an autosuggest/autocomplete search box for use with Solr. After a little research, I found the new TermsComponent feature in Solr 1.4. To use TermsComponent for suggestions, you need to provide set the prefix and lower bound to the input term and make the lower bound exclusive. Use the terms.fl parameter to set the source field. This means:

  • Set terms.lower to the input term
  • Set terms.prefix to the input term
  • Set terms.lower.incl to false
  • Set terms.fl to the name of the source field

Your resulting query should look something like this:

http://localhost:8983/solr/autoSuggest?terms=true&terms.fl=name&terms.lower=py&terms.prefix=py&terms.lower.incl=false&indent=true&wt=json

Note: This assumes you are using the default solrconfig.xml for Solr 1.4

In the example above I used “py” for my input term. You will then get output that looks similar to this:

{
 "terms":[
  "spell",[
	"pyblosxom",16,
	"pychm",16,
	"pyqt",16,
	"python",16]]}

Now that we have TermsComponent setup and working correctly its time to create the autosuggest/autocomplete search box. Since I am not one to reinvent the wheel, I did a quick search and found a jQuery UI plugin for autocomplete. The search frontend I was developing was already using jQuery, so this plugin was a perfect fit.

This autocomplete plugin is not in the current release of jQuery UI so I needed to grab it from their subversion repository. You can find instructions where to get it here.

The plugin supports AJAX calls for the data source. It expects the data source to return each suggestion on it’s own line, for example:

pyblosxom
pychm
pyqt
python

As you saw above, this is not what direct output from Solr looks like. On top of this, it is not a good idea to expose your backend server via your frontend code. Time to write a java servlet.

Unfortunately the java client for Solr, SolrJ, didn’t support TermsComponent yet. I decided to add this support, so please see this post for information on my patch.

Assuming you are using a version of SolrJ with my patch, here is a simple servlet that provides the functionality we need:

protected void doGet(HttpServletRequest req, HttpServletResponse res) throws ServletException, IOException {
        String q = req.getParameter("q");
        String limit = req.getParameter("limit");
	PrintWriter writer = res.getWriter();
	List<Term> terms = query(q, Integer.parseInt(limit));

	if (terms != null) {
		for (Term t : terms) {
			writer.println(t.getTerm());
		}
	}
}

And the query method:

private List<Term> query(String q, int limit) {
    List<Term> items = null;
    CommonsHttpSolrServer server = null;

     try {
         server = new CommonsHttpSolrServer("http://localhost:8983/solr");
     } catch(Exception e) { e.printStackTrace(); }

     // escape special characters
     SolrQuery query = new SolrQuery();
     query.addTermsField("spell");
     query.setTerms(true);
     query.setTermsLimit(limit);
     query.setTermsLower(q);
     query.setTermsPrefix(q);
     query.setQueryType("/terms");

     try {
         QueryResponse qr = server.query(query);
         TermsResponse resp = qr.getTermsResponse();
         items = resp.getTerms("spell");
     } catch (SolrServerException e) {
      	items = null;
     }

     return items;
}

Now you may be wondering why I used the “q” and “limit” parameters. I use these because this is what the jQuery autocomplete plugin sends to the servlet. “q” is the input term, and “limit” is the max number of suggestions to return.

Now to hook everything together. Insert the following javascript into the head of your search page and replace “#searchbox” with the id of the input box you want to use for autocompletion. Also insert the correct url to your servlet.

        	$(document).ready(function() {

        		$("#searchbox").autocomplete({ url: 'completion',
        			 max: 5,
        		});
          	});

Update your css file with required jQuery UI css:

/* Autocomplete
----------------------------------*/
.ui-autocomplete {}
.ui-autocomplete-results { overflow: hidden; z-index: 99999; padding: 1px; position: absolute; }
.ui-autocomplete-results ul { width: 100%; list-style-position: outside; list-style: none; padding: 0; margin: 0; } 

/* if  the width: 100%, a horizontal scrollbar will appear when scroll: true. */
/* !important! if line-height is not set, or is set to a relative unit, scroll will be broken in firefox */
.ui-autocomplete-results li { margin: 0px; padding: 2px 5px; cursor: default; display: block; font: menu; font-size: 12px; line-height: 16px; overflow: hidden; border-collapse: collapse; }
.ui-autocomplete-results li.ui-autocomplete-even { background-color: #fff; }
.ui-autocomplete-results li.ui-autocomplete-odd { background-color: #eee; }

.ui-autocomplete-results li.ui-autocomplete-state-default { background-color: #fff; border: 1px solid #fff; color: #212121; }
.ui-autocomplete-results li.ui-autocomplete-state-active { color: #000; background:#E6E6E6 url(images/ui-bg_glass_75_e6e6e6_1x400.png) repeat-x; border:1px solid #D3D3D3; }

.ui-autocomplete-loading { background: white url('images/ui-anim.basic.16x16.gif') right center no-repeat; }
.ui-autocomplete-over { background-color: #0A246A; color: white; }

Congratulations! You should now have a working Solr-based autocomple search box!
Solr AutoCompletion

SolrJ TermsComponent Support

I was working on implementing an auto-complete search box today using Solr 1.4 and the new TermsComponent. TermsComponent is a simple plugin that provides access to Lucene’s term dictionary and is very fast. Being fast and the fact it can hook into a search index makes it perfect for an auto-completion server.

Unfortunately, SolrJ does not support this new functionality yet. Well not officially because you could always parse the raw response object yourself. That is exactly what I was doing until I figured I might as well just add the support to SolrJ. I did, and it was extremely easy.

I added support for TermsComponent parameters and implemented a new TermsComponent response type. The TermsComponent response is parsed into a list of Type objects. The Type object has two methods, getTerm() and getFrequency(). getTerm() returns the suggested term, and getFrequency() returns the frequency of the term appearing in the index.

I have submitted my patch upstream for inclusion into a future version of SolrJ.

Here is the link to the JIRA bug report:
https://issues.apache.org/jira/browse/SOLR-1139

Here is the patch:
https://issues.apache.org/jira/secure/attachment/12407047/SOLR-1139.patch