This first of two guest blog entries, written by Neil Hodgkinson, a Microsoft Premier Field Engineer based in the UK, will cover the "why we did it" aspect of the open source faceted search solution for MOSS 2007 and MSS 2008 that has been released on CodePlex at http://www.codeplex.com/facetedsearch. The second guest blog entry, scheduled to be posted within a couple of weeks, will be written by Leonid Lyublinski, a Microsoft Consultant based in Ohio, USA, and will cover the "how we did it" aspect of the solution.
Metadata is information gathered about the resources that users are trying to locate. Classically, it is defined as information about information; more precisely, it is structured information about resources. For companies that have large data libraries or repositories for their corporate information, this metadata is often much more than a simple hierarchical set of subject labels. Typically, the metadata has several facets -- that is, multiple attributes assigned to the resource being indexed.
Examples of faceted metadata include:
In all of these cases, there is no single way to provide navigation for everyone because users have disparate needs. One person might want to look through all the albums created by one band; others might be more interested in particular musical genres or instruments.
With traditional parametric searching techniques, users are expected to provide one or more parameters that describe the object being searched for. The drawback of this approach is that, because the user must choose parameters up front, valid results may be excluded when the search criteria are too confining.
An alternative to parametric searching is full text searching. While valid in its own right, this approach loses a certain amount of refinement. To a full text search engine, the fact that a recipe contains a particular ingredient is irrelevant, because the context in which the ingredient is used has not been preserved.
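The two drawbacks described above can be illustrated with a small sketch. The recipe records, field names, and the `parametric_search` and `full_text_search` helpers below are all hypothetical and exist only for illustration; they are not part of MOSS 2007, MSS 2008, or the Faceted Search solution.

```python
# Hypothetical recipe data: structured fields plus free text.
RECIPES = [
    {"name": "Tomato soup", "course": "starter",
     "ingredients": ["tomato", "onion"],
     "text": "Simmer tomato and onion. Serve with bread."},
    {"name": "Garlic bread", "course": "side",
     "ingredients": ["bread", "garlic"],
     "text": "Toast bread with garlic butter."},
    {"name": "Onion tart", "course": "main",
     "ingredients": ["onion", "pastry"],
     "text": "Bake sliced onion in a pastry case."},
]


def parametric_search(recipes, course, ingredient):
    """Return names of recipes that match every supplied parameter."""
    return [r["name"] for r in recipes
            if r["course"] == course and ingredient in r["ingredients"]]


def full_text_search(recipes, term):
    """Return names of recipes whose free text mentions the term anywhere."""
    return [r["name"] for r in recipes if term.lower() in r["text"].lower()]


# Too confining: the onion tart is dropped because of the course parameter.
parametric_hits = parametric_search(RECIPES, "starter", "onion")

# Context lost: tomato soup matches "bread" although bread is only a
# serving suggestion, not an ingredient.
full_text_hits = full_text_search(RECIPES, "bread")
```

In this sketch, the parametric query returns only the tomato soup even though the onion tart might also interest the user, while the full text query returns the tomato soup alongside the garlic bread because it cannot tell an ingredient from a passing mention.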
A good solution to these problems is to expose the facets as dynamic taxonomies so that the user can see all of the refinement options at any time. The user can easily switch between searching and metadata browsing, using familiar terminology while recognizing the organization and vocabulary of the data.
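As a rough sketch of the idea, the following Python computes the refinement options (facet values with result counts) for a result set and then narrows the set when the user picks one. The album data, the facet names, and the `facet_counts` and `refine` helpers are assumptions made for illustration; this is not the implementation used by the Faceted Search solution on CodePlex.

```python
from collections import Counter

# Hypothetical album metadata; each key is one facet of the resource.
ALBUMS = [
    {"title": "Album A", "band": "Band X", "genre": "Jazz", "instrument": "Piano"},
    {"title": "Album B", "band": "Band X", "genre": "Rock", "instrument": "Guitar"},
    {"title": "Album C", "band": "Band Y", "genre": "Jazz", "instrument": "Guitar"},
    {"title": "Album D", "band": "Band Z", "genre": "Jazz", "instrument": "Piano"},
]

FACETS = ["band", "genre", "instrument"]


def facet_counts(items, facets):
    """Count how many items carry each value of each facet."""
    return {f: Counter(item[f] for item in items if f in item) for f in facets}


def refine(items, selections):
    """Keep only the items that match every selected facet value."""
    return [item for item in items
            if all(item.get(f) == v for f, v in selections.items())]


# Refinement options shown before the user has chosen anything.
all_counts = facet_counts(ALBUMS, FACETS)

# The user picks genre = Jazz; the remaining options are recomputed
# from the narrowed result set, which is what makes the taxonomy dynamic.
jazz = refine(ALBUMS, {"genre": "Jazz"})
jazz_counts = facet_counts(jazz, FACETS)
```

Because the counts are recomputed after every selection, the user only ever sees refinement options that lead to at least one result, and can combine facets in any order rather than following a fixed hierarchy.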
Key features for metadata search include:
The solution started in June 2007 as a field research project for one of Microsoft's customers. Leonid Lyublinski, a Microsoft Consultant, delivered the architectural design and development of a Faceted Search solution as an add-on to MOSS 2007 and MSS 2008. The initial version was released with an open source license at http://www.codeplex.com/facetedsearch and has been very well received. A second major version was released just last week and includes the following features:
Here are screenshots of a couple of example implementations:
Another major version of Faceted Search is scheduled for release within the next few weeks. It will include foundational changes to the design and code that are intended to balance search accuracy and performance. Key enhancements will include:
This new version will also include numerous bug fixes and be complemented by updated documentation for installation, configuration, and styling. It will be first demonstrated by Leonid and me at the Office Developer Conference 2008 in San Jose, California on February 10-13 and then released on CodePlex shortly thereafter.
Neil Hodgkinson, Microsoft PFE