In my last post, I explained the REST-style API that underlies Maven Central's browser-based search UI. That API essentially comes "for free" with the main components on which Maven Central Search is built:
- Apache Solr, the popular, blazing fast open source enterprise search platform from the Apache Lucene project -- http://lucene.apache.org/solr
- ajax-solr -- http://evolvingweb.github.com/ajax-solr
In this post, I will highlight those components and describe how they were used to implement Maven Central Search.
When we started the project, we looked at a couple of options for implementing search, including Solr and the existing Nexus search capability built directly on top of Apache Lucene. The Nexus approach initially seemed compelling as we clearly have significant experience with it and Nexus search even provides a REST API for full-text search that we could have leveraged. So, why did we end up choosing Solr when we could have simply re-used the search functionality in Nexus or even crafted a web UI backed by an instance of Nexus running on top of Central? Two reasons:
- Flexibility -- We discovered early in the design phase of Central Search that we needed changes to the schemas, fields, and even field contents in the Lucene indices being used by Nexus. Making those schema changes would have required further changes within the Nexus codebase. With Solr, we could simply point our Solr installation at an existing index or even have Solr build a new index from scratch by adding documents through Solr's REST API. We could rapidly prototype schema changes (often in one or two lines of XML, without even restarting Solr) and see our updated search results almost immediately.
- Scalability -- Solr bills itself as an "enterprise search platform." One of the enterprise features that attracted us to Solr was its built-in support for replication. As query load increases in the future, we can simply balance that load across hardware serving multiple copies of the same data. Solr's support for multiple indexes also leaves us a path open for sharding our index data, once it becomes so large as to be difficult to serve out of a single index on a single server.
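To make the "adding documents" step under Flexibility concrete, here is a minimal sketch of the XML update message that Solr's update handler accepts. It only builds the message and does not send it. The field names (id, g, a, v) and the sample artifact are illustrative assumptions, not the actual Central Search schema.

```python
import xml.etree.ElementTree as ET


def build_add_message(docs):
    """Build a Solr XML <add> update message from a list of field dicts."""
    add = ET.Element("add")
    for fields in docs:
        doc = ET.SubElement(add, "doc")
        for name, value in fields.items():
            # A list value becomes a repeated (multi-valued) field.
            values = value if isinstance(value, list) else [value]
            for item in values:
                field = ET.SubElement(doc, "field", {"name": name})
                field.text = str(item)
    return ET.tostring(add, encoding="unicode")


# Hypothetical artifact document; the field names are assumptions for this sketch.
message = build_add_message([
    {"id": "com.example:demo:1.0", "g": "com.example", "a": "demo", "v": "1.0"},
])
print(message)
```

Posting a message like this to a Solr instance, followed by a commit, is what makes the new documents searchable; the HTTP call is left out so the sketch runs without a server.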
Once we made the decision to use Solr, we quickly discovered that practically all the search functionality we needed came “out-of-the-box” with Solr’s REST API. In fact, the entire second half of the Maven Central API Guide is simply a set of URLs that are proxied to our running Solr instance. We proxy the requests so that we can do some filtering and transformation of inbound requests in order to prevent a malformed or malicious request from taking down our server.
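The kind of filtering a proxy might apply can be sketched as follows. This is not our actual proxy code: the whitelist, the limits, and the function name are assumptions chosen for illustration. The idea is to forward only known Solr parameters, reject empty or oversized queries, and cap the number of rows a caller can request.

```python
from urllib.parse import parse_qsl, urlencode

# Assumed policy for this sketch; not Maven Central's actual rules.
ALLOWED_PARAMS = {"q", "rows", "start", "wt", "fl", "sort"}
MAX_ROWS = 100
MAX_QUERY_LENGTH = 500


def sanitize_query_string(raw_query):
    """Return a filtered query string to forward to Solr, or raise ValueError."""
    cleaned = {}
    for name, value in parse_qsl(raw_query):
        if name not in ALLOWED_PARAMS:
            continue  # drop anything that is not on the whitelist
        cleaned[name] = value  # if a parameter repeats, the last one wins

    query = cleaned.get("q", "").strip()
    if not query:
        raise ValueError("missing q parameter")
    if len(query) > MAX_QUERY_LENGTH:
        raise ValueError("query too long")
    cleaned["q"] = query

    # Paging parameters must be plain non-negative integers.
    for name in ("rows", "start"):
        if name in cleaned:
            if not cleaned[name].isdecimal():
                raise ValueError(f"{name} must be a non-negative integer")
            cleaned[name] = str(int(cleaned[name]))

    # Cap the page size so one request cannot ask for the whole index.
    if "rows" in cleaned:
        cleaned["rows"] = str(min(int(cleaned["rows"]), MAX_ROWS))

    return urlencode(sorted(cleaned.items()))


print(sanitize_query_string("q=guice&rows=100000&debugQuery=true"))
```

With these assumed limits, the example request is forwarded as `q=guice&rows=100`: the unknown parameter is dropped and the row count is capped.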
Next, we turned our attention to the browser-based user interface to call that API. Solr provides an administrative interface out of the box that includes search functionality. However, administrative interfaces tend to be utilitarian. We wanted our interface to be clean, but also to be a little more interactive, not just at launch, but as we added new features. So, we started researching AJAX- or JavaScript-based UIs that could sit on top of Solr, and we found ajax-solr.
The ajax-solr website is a great resource for understanding the architecture of ajax-solr and provides an excellent tutorial for building your first Solr-powered AJAX-based website. Our developers took hold of that tutorial and very quickly fashioned a prototype version of Central Search. During development, two major benefits of ajax-solr stood out:
- MVC Pattern -- The Model–view–controller (MVC) software architecture, often used for web applications, isolates "domain logic" from the user interface, permitting independent development, testing and maintenance of each. Ajax-solr applies the MVC pattern to Solr result sets within the browser, which makes for a clean and easily extensible way of working with Solr result sets (the model) and ajax-solr widgets (the views). It also helped that MVC is an easy-to-understand pattern.
- jQuery implementation -- According to the ajax-solr website, ajax-solr is designed to be JavaScript framework-agnostic. Any framework that can send AJAX requests to Solr can work with ajax-solr. However, the ajax-solr tutorial and stock widgets are all implemented with jQuery. Even though we had to learn jQuery along the way, we found it fairly easy to pick up, and once we had, it made extending ajax-solr to meet our requirements very easy. jQuery also gave us a decent amount of cross-browser compatibility. That's not to say that we don't struggle with a little bit of browser-specific tweaking here and there, but, at least anecdotally, it saved us from many hours of wondering why our pages rendered differently (or didn't render at all) in different browsers. Finally, using jQuery opened us up to a wealth of UI plugins (http://plugins.jquery.com), which we used in adding various bits of interactivity to Maven Central.
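The MVC split described above can be sketched in a few lines. This is a language-neutral illustration written in Python, not ajax-solr's actual JavaScript API: the class names are hypothetical, and the g and a fields in the sample documents are assumptions. Only the overall shape of the response (a "response" object with "numFound" and "docs") follows Solr's standard JSON output.

```python
class ResultModel:
    """Model: holds the most recent Solr-style result set."""

    def __init__(self):
        self.docs = []
        self.num_found = 0

    def load(self, response):
        body = response["response"]
        self.docs = body["docs"]
        self.num_found = body["numFound"]


class CountView:
    """View: renders only the hit count."""

    def render(self, model):
        return f"{model.num_found} results"


class ResultListView:
    """View: renders one line per document."""

    def render(self, model):
        return [f'{doc["g"]}:{doc["a"]}' for doc in model.docs]


class Manager:
    """Controller: updates the model, then asks every view to re-render."""

    def __init__(self, model):
        self.model = model
        self.views = []

    def add_view(self, view):
        self.views.append(view)

    def handle_response(self, response):
        self.model.load(response)
        return [view.render(self.model) for view in self.views]


# Hypothetical response; a real one would come back from an AJAX call to Solr.
sample = {"response": {"numFound": 2, "start": 0, "docs": [
    {"g": "com.example", "a": "demo"},
    {"g": "org.example", "a": "tool"},
]}}

manager = Manager(ResultModel())
manager.add_view(CountView())
manager.add_view(ResultListView())
rendered = manager.handle_response(sample)
print(rendered)
```

Adding a new piece of UI then means writing one more view class and registering it with the manager; neither the model nor the existing views need to change.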
In summary, we used several standard open source Java and JavaScript components to build Maven Central Search, and along the way our team added several new tools to our bag of tricks. We now have a very strong foundation for the continued improvement of Maven Central. We hope you have found the new features useful and we look forward to hearing your feedback at Get Satisfaction (http://getsatisfaction.com/sonatype).
Written by Joel Orlina
Joel is a Senior Java Developer at Sonatype. He is based in Texas.