Category Archives: Web

Solr AutoSuggest with TermsComponent and jQuery

I needed to implement an autosuggest/autocomplete search box for use with Solr. After a little research, I found the new TermsComponent feature in Solr 1.4. To use TermsComponent for suggestions, you need to set both the prefix and the lower bound to the input term and make the lower bound exclusive. Use the terms.fl parameter to set the source field. This means:

  • Set terms.lower to the input term
  • Set terms.prefix to the input term
  • Set terms.lower.incl to false
  • Set terms.fl to the name of the source field
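
As a rough sketch, the four parameters above could be assembled into a request URL as follows. The class and method names (TermsUrl, buildTermsUrl) are hypothetical and only for illustration; the handler URL, field name, and input term are whatever your own setup uses.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Sketch only: TermsUrl and buildTermsUrl are hypothetical names,
// not part of Solr or SolrJ.
final class TermsUrl {

    // URL-encodes a value as UTF-8 without leaking a checked exception.
    private static String enc(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // Builds a TermsComponent request for the given handler URL,
    // source field, and input term.
    static String buildTermsUrl(String handlerUrl, String field, String input) {
        String term = enc(input);
        return handlerUrl
                + "?terms=true"
                + "&terms.fl=" + enc(field)    // source field
                + "&terms.lower=" + term       // lower bound is the input term
                + "&terms.prefix=" + term      // only terms starting with the input
                + "&terms.lower.incl=false"    // make the lower bound exclusive
                + "&wt=json";
    }

    public static void main(String[] args) {
        System.out.println(
                buildTermsUrl("http://localhost:8983/solr/autoSuggest", "name", "py"));
    }
}
```

Encoding the input term matters because it comes straight from what the user typed into the search box.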

Your resulting query should look something like this:

http://localhost:8983/solr/autoSuggest?terms=true&terms.fl=name&terms.lower=py&terms.prefix=py&terms.lower.incl=false&indent=true&wt=json

Note: This assumes you are using the default solrconfig.xml for Solr 1.4

In the example above I used “py” for my input term. You will then get output that looks similar to this:

{
 "terms":[
  "spell",[
	"pyblosxom",16,
	"pychm",16,
	"pyqt",16,
	"python",16]]}

Now that we have TermsComponent set up and working correctly, it's time to create the autosuggest/autocomplete search box. Since I am not one to reinvent the wheel, I did a quick search and found a jQuery UI plugin for autocomplete. The search frontend I was developing was already using jQuery, so this plugin was a perfect fit.

This autocomplete plugin is not in the current release of jQuery UI, so I needed to grab it from their Subversion repository. You can find instructions on where to get it here.

The plugin supports AJAX calls for the data source. It expects the data source to return each suggestion on its own line, for example:

pyblosxom
pychm
pyqt
python

As you saw above, this is not what direct output from Solr looks like. On top of this, it is not a good idea to expose your backend server via your frontend code. Time to write a Java servlet.
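
To make the mismatch concrete, here is a minimal sketch of the conversion that has to happen somewhere between Solr and the plugin. It assumes the JSON for one field has already been parsed into a flat list of alternating terms and counts; TermsFlattener and toLines are hypothetical names used only for this illustration. The servlet later in this post gets the same result through SolrJ rather than by handling raw JSON.

```java
import java.util.Arrays;
import java.util.List;

// Sketch only: TermsFlattener and toLines are hypothetical names.
final class TermsFlattener {

    // Takes the alternating [term, count, term, count, ...] list that
    // TermsComponent returns for one field and keeps only the terms,
    // one per line, which is the shape the autocomplete plugin expects.
    static String toLines(List<?> termsAndCounts) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < termsAndCounts.size(); i += 2) {
            out.append(termsAndCounts.get(i)).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The values from the sample response shown earlier
        List<Object> solrValues = Arrays.<Object>asList(
                "pyblosxom", 16, "pychm", 16, "pyqt", 16, "python", 16);
        System.out.print(toLines(solrValues));
    }
}
```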

Unfortunately, the Java client for Solr, SolrJ, didn’t support TermsComponent yet. I decided to add this support, so please see this post for information on my patch.

Assuming you are using a version of SolrJ with my patch, here is a simple servlet that provides the functionality we need:

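protected void doGet(HttpServletRequest req, HttpServletResponse res) throws ServletException, IOException {
    String q = req.getParameter("q");
    String limit = req.getParameter("limit");
    res.setContentType("text/plain"); // one suggestion per line
    PrintWriter writer = res.getWriter();

    // Return an empty response when "q" or "limit" is missing or "limit" is not a number
    if (q == null || q.length() == 0 || limit == null) {
        return;
    }
    List<Term> terms;
    try {
        terms = query(q, Integer.parseInt(limit));
    } catch (NumberFormatException e) {
        return;
    }

    if (terms != null) {
        for (Term t : terms) {
            writer.println(t.getTerm());
        }
    }
}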

And the query method:

private List<Term> query(String q, int limit) {
    List<Term> items = null;
    CommonsHttpSolrServer server;

    try {
        server = new CommonsHttpSolrServer("http://localhost:8983/solr");
    } catch (Exception e) {
        // Without a server there is nothing to query
        e.printStackTrace();
        return null;
    }

    // Build the TermsComponent request: prefix and exclusive lower bound on the input term
    SolrQuery query = new SolrQuery();
    query.addTermsField("spell");
    query.setTerms(true);
    query.setTermsLimit(limit);
    query.setTermsLower(q);
    query.setTermsLowerInclusive(false);
    query.setTermsPrefix(q);
    query.setQueryType("/terms");

    try {
        QueryResponse qr = server.query(query);
        TermsResponse resp = qr.getTermsResponse();
        if (resp != null) {
            items = resp.getTerms("spell");
        }
    } catch (SolrServerException e) {
        items = null;
    }

    return items;
}

Now you may be wondering why I used the “q” and “limit” parameters. I use these because this is what the jQuery autocomplete plugin sends to the servlet. “q” is the input term, and “limit” is the max number of suggestions to return.
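
Since “limit” arrives as a request parameter, it may be missing or malformed. One possible way to handle that is sketched below; LimitParam and parseLimit are hypothetical names, and the fallback and maximum values are assumptions you would tune for your own setup.

```java
// Sketch only: LimitParam and parseLimit are hypothetical names.
final class LimitParam {

    // Parses the raw "limit" request parameter. Returns fallback when the
    // value is missing, not a number, or less than 1, and caps it at max.
    static int parseLimit(String raw, int fallback, int max) {
        if (raw == null) {
            return fallback;
        }
        try {
            int value = Integer.parseInt(raw.trim());
            if (value < 1) {
                return fallback;
            }
            return Math.min(value, max);
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseLimit("5", 10, 50));   // valid value is used as-is
        System.out.println(parseLimit(null, 10, 50));  // missing value uses the fallback
        System.out.println(parseLimit("500", 10, 50)); // large value is capped
    }
}
```

Capping the value keeps a single request from asking Solr for an arbitrarily large number of terms.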

Now to hook everything together. Insert the following JavaScript into the head of your search page and replace “#searchbox” with the id of the input box you want to use for autocompletion. Also insert the correct URL to your servlet.

$(document).ready(function() {
    $("#searchbox").autocomplete({
        url: 'completion',
        max: 5
    });
});

Update your CSS file with the required jQuery UI CSS:

/* Autocomplete
----------------------------------*/
.ui-autocomplete {}
.ui-autocomplete-results { overflow: hidden; z-index: 99999; padding: 1px; position: absolute; }
.ui-autocomplete-results ul { width: 100%; list-style-position: outside; list-style: none; padding: 0; margin: 0; } 

/* if  the width: 100%, a horizontal scrollbar will appear when scroll: true. */
/* !important! if line-height is not set, or is set to a relative unit, scroll will be broken in firefox */
.ui-autocomplete-results li { margin: 0px; padding: 2px 5px; cursor: default; display: block; font: menu; font-size: 12px; line-height: 16px; overflow: hidden; border-collapse: collapse; }
.ui-autocomplete-results li.ui-autocomplete-even { background-color: #fff; }
.ui-autocomplete-results li.ui-autocomplete-odd { background-color: #eee; }

.ui-autocomplete-results li.ui-autocomplete-state-default { background-color: #fff; border: 1px solid #fff; color: #212121; }
.ui-autocomplete-results li.ui-autocomplete-state-active { color: #000; background:#E6E6E6 url(images/ui-bg_glass_75_e6e6e6_1x400.png) repeat-x; border:1px solid #D3D3D3; }

.ui-autocomplete-loading { background: white url('images/ui-anim.basic.16x16.gif') right center no-repeat; }
.ui-autocomplete-over { background-color: #0A246A; color: white; }

Congratulations! You should now have a working Solr-based autocomplete search box!

Solr for WordPress

Solr for WordPress is a WordPress plugin that interacts with an instance of the Solr search engine. With this plugin you can:

  • Index pages and posts
  • Perform advanced queries
  • Enable faceting on fields such as tags, categories, and author
  • Treat the category facet as a taxonomy
  • Add special template tags so you can create your own custom result pages to match your theme
  • Configuration options allow you to select pages to ignore, features to enable/disable, and what type of result information you want output.
  • Hit highlighting
  • Dynamic result teasers

Solr for WordPress requires WordPress 2.7 or greater and an instance of Solr 1.3 or greater. Installation is simple: extract the plugin into your WordPress plugins folder, activate it, then point it at your Solr instance via the configuration page. From there, you can index all your pages and/or posts, and you are ready to perform searches against your WordPress data.

This plugin assumes your Solr schema contains the following fields: id, permalink, title, content, numcomments, categories, categoriessrch, tags, tagssrch, author, type, and text. The facet fields (categories, tags, author, and type) should be string fields. You can make tagssrch and categoriessrch any type you want as they are used for general searching. The plugin is distributed with a Solr schema you can use. I will eventually package up a version of Solr configured specifically for this plugin. Until then, the provided schema will have to do.

Integrating Solr for WordPress into your theme is quite simple as well. The plugin provides two template tags, one for a search box and another for search results. For the search box, use the s4w_search_form() tag. For the search results, use the s4w_search_results() tag. These template tags output valid XHTML that you can style with CSS.

This version of the plugin requires you to create your own search page template and then create a search page called “Search” using this template. It also requires you to manually update any search forms to point at the search page you just created (“/search/”) and to pass the query in the “qry” parameter. In future versions, the plugin will completely replace the standard WordPress search functionality.

By default, faceting is enabled for the category, tags, author, and post type fields. Faceting allows your users to drill down into the search results by filtering on the values of a particular facet. The category facet can be treated as a taxonomy as well.
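
To give a feel for what drilling down looks like at the Solr level, here is a minimal sketch that builds a query string using Solr's standard facet, facet.field, and fq parameters. This is not the plugin's actual code: FacetQuery and build are hypothetical names, and the field names come from the schema listed above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Sketch only: FacetQuery and build are hypothetical names and do not
// reflect how the plugin itself builds its requests.
final class FacetQuery {

    // URL-encodes a value as UTF-8 without leaking a checked exception.
    private static String enc(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // Requests facet counts for each field in facetFields. When filterField
    // is non-null, adds a filter query that narrows the results to a single
    // facet value. The value is quoted but not otherwise escaped.
    static String build(String userQuery, String[] facetFields,
                        String filterField, String filterValue) {
        StringBuilder sb = new StringBuilder("q=").append(enc(userQuery));
        sb.append("&facet=true");
        for (String field : facetFields) {
            sb.append("&facet.field=").append(enc(field));
        }
        if (filterField != null) {
            sb.append("&fq=").append(enc(filterField + ":\"" + filterValue + "\""));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // A search for "search", drilled down to posts tagged "solr"
        System.out.println(
                build("search", new String[] {"categories", "tags"}, "tags", "solr"));
    }
}
```

The filter goes in fq rather than q, so it narrows the result set without changing how the remaining documents are scored.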

UPDATE:
Released Solr for WordPress 0.2.0

Plugin Home: Solr for WordPress
Download: Solr for WordPress 0.1.0
WordPress Hosted Plugin Page: Solr for WordPress

New Design

I have finally decided to update my site. I have been pretty busy with work the last couple years and have been slacking on the site. Well, I got an itch to start working on it again and decided to kick off my renewed interest with a completely new design. The design is a heavily modified version of the Elixir theme by Michael Whalen.

Along with the new design I have integrated Solr search. Solr is an amazing search engine. I wrote a plugin called Solr for WordPress that handles all the integration between Solr and WordPress. I will be writing a post about the plugin soon.

Directing the Googlebot

While setting up PyBlosxom, there were a few things I wanted to be able to do. The most important was being able to direct bots around my site, more specifically, the Googlebot. I did some research and found a few sites that explain how the Googlebot works and how you can guide it through your site.

I found Scribbling.net’s article, “Help the Googlebot understand your web site”, which describes how the Googlebot should index a blog. Basically, you want Google to index your posts, not your main page. You do this so people can find the actual post about a topic, not your main page, which has most likely changed since the Googlebot last indexed your site. They show that you can use meta tags to tell bots when to index a page and when not to.

To do this using PyBlosxom, you can use the comments plugin and put a meta tag in the “comment-story” flavour file telling the Googlebot to index the page, and one in the regular “story” flavour file telling it not to. Out of the box, the comments plugin would display comments any time you viewed a page with one post. This is a problem when using the calendar and categories plugins, because it would show the comments when viewing categories or dates with only one post, even though you were not viewing the actual post. We do not want this because it means that we will be telling Google to index directories, not post pages. To fix this, I modified the comments plugin so that it only shows comments when viewing an actual post. Here is my modified comments plugin for anyone interested in doing this with their blog.