As many of you know, Fishbowl is a Mindbreeze Certified Partner and search appliance reseller. A core component of our company culture is using the same tools and technologies we implement for our customers. For that reason, and to give readers like you a chance to see Mindbreeze in action, we have implemented Mindbreeze search here on fishbowlsolutions.com. Read on to learn more about the benefits and details of an integrated website with Mindbreeze.
Indexing Our Site
The first step in building an integrated website with Mindbreeze was to configure Mindbreeze to crawl our website using the out-of-the-box web crawler. We decided to split the content into two groups, blog posts and everything else, so that we could configure separately how blog post content would be indexed. Mindbreeze allows the configuration of one or more crawler instances, so we created two crawlers with separate follow and do-not-follow patterns to index each content group.
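To make the split concrete, here is a minimal Python sketch of how two crawler instances with separate follow and do-not-follow patterns could divide a site between them. It is not Mindbreeze configuration syntax. The `CrawlerRules` class, the `www` host name, and the assumption that blog posts live under a `/blog/` path are all hypothetical and chosen only for illustration.

```python
import re
from dataclasses import dataclass, field


@dataclass
class CrawlerRules:
    """Hypothetical stand-in for one crawler instance's URL rules."""
    name: str
    follow: list = field(default_factory=list)          # a URL must match one of these
    do_not_follow: list = field(default_factory=list)   # a match here excludes the URL

    def accepts(self, url: str) -> bool:
        # Exclusions win over inclusions.
        if any(re.search(p, url) for p in self.do_not_follow):
            return False
        return any(re.search(p, url) for p in self.follow)


# Assumed site root and blog path; adjust to the real URL layout.
SITE = r"^https://www\.fishbowlsolutions\.com/"
BLOG = SITE + r"blog/"

blog_crawler = CrawlerRules("blog", follow=[BLOG])
site_crawler = CrawlerRules("site", follow=[SITE], do_not_follow=[BLOG])


def assign(url: str) -> list:
    """Return the names of the crawlers that would index this URL."""
    return [c.name for c in (blog_crawler, site_crawler) if c.accepts(url)]


print(assign("https://www.fishbowlsolutions.com/blog/some-post/"))
print(assign("https://www.fishbowlsolutions.com/products/"))
```

Because the second crawler excludes exactly what the first one follows, every page on the site is picked up by one crawler and only one, which is what lets each group carry its own indexing settings.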
Next, we configured the extraction of content from the site. By default, the crawler indexes the entire contents of a page, but Mindbreeze can optionally restrict content indexing to a specific DIV or section. That way, words contained in your navigation or footer won’t be indexed for every page. For example, Fishbowl’s footer currently includes the word “Mindbreeze”, but when site users search for “Mindbreeze” we don’t want to return every page on the site, only those actually related to Mindbreeze. For customers already using googleon/googleoff tags for this purpose (a feature from the Google Search Appliance), Mindbreeze can interpret those tags. We used them in a few spots on our blog to restrict the indexing of blog sidebars and other non-content elements within a page template.
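The effect of those tags can be illustrated with a short Python sketch. This is not how Mindbreeze implements it internally; it only assumes the Google Search Appliance comment convention (`<!--googleoff: index-->` ... `<!--googleon: index-->`) and shows which text would remain indexable once the wrapped regions are dropped. The sample page and the `indexable_text` helper are made up for the example.

```python
import re
from html.parser import HTMLParser

# Regions wrapped in googleoff/googleon "index" (or "all") comments are dropped.
# Simplification: an "index" opener may be closed by an "all" closer and vice versa.
EXCLUDED = re.compile(
    r"<!--\s*googleoff:\s*(?:index|all)\s*-->.*?<!--\s*googleon:\s*(?:index|all)\s*-->",
    re.DOTALL | re.IGNORECASE,
)


class _TextCollector(HTMLParser):
    """Collects the non-empty text nodes of an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        if data.strip():
            self.chunks.append(data.strip())


def indexable_text(html: str) -> str:
    """Return the page text that is left after removing excluded regions."""
    visible = EXCLUDED.sub(" ", html)
    collector = _TextCollector()
    collector.feed(visible)
    collector.close()
    return " ".join(collector.chunks)


page = """
<div id="content"><h1>Careers</h1><p>Join our team.</p></div>
<!--googleoff: index-->
<footer>Mindbreeze Certified Partner</footer>
<!--googleon: index-->
"""

print(indexable_text(page))
```

With the footer excluded, a search for “Mindbreeze” would no longer match this page on the strength of its footer alone.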
Entity Recognition
As part of the index setup, we configured entity recognition to parse our pages (both blog and non-blog) for the names of the five key technologies Fishbowl works with. This was done using the Mindbreeze entity extraction feature. Each of the five possible values was mapped to a metadata field called Technology. Like the metadata extraction, the entities were extracted without having to change anything about the structure of our site or templates.
Query Expansion
Between the time when a user enters their query and the time the search engine computes relevant results, there is a critical piece in the search process often referred to as query expansion. Query expansion describes various ways in which the words the user types can be expanded upon or “understood” by the search engine in order to more accurately represent the original intent and locate the right content. One way queries can be expanded for better search is through the use of synonyms. Synonyms can be used to set related terms equal to one another, make abbreviations equal to their full meanings, or set legacy terminology as synonymous with current nomenclature. Mindbreeze query expansion is used on this site to expand queries such as “Jobs” to include “Careers” and the legacy product name “UCM” to search for the new name, “WebCenter Content”. Mindbreeze also includes default stemming and spelling expansions to allow users to find content even if their query doesn’t exactly match our site’s data. For example, stemming allows users to search for “orders” and get results containing “order,” “ordered,” and “ordering.” It means users don’t have to know whether a word was in past tense, plural, or singular in order to find what they need.
Relevancy and Result Boosting
Relevancy boosting allows administrators to further tune result ranking (also called biasing) based on factors such as metadata values, URL patterns, or date. These relevancy adjustments can be applied to specific sites, so that each audience sees what is most relevant to them. Relevancy is configured through the Mindbreeze Management Center without requiring custom development. On our site, the number of blog posts far outweighs the number of product pages; when someone searches for a product (such as Mindbreeze) we want the first result to be the main Mindbreeze product page. To ensure the main product pages (which may be older and contain fewer words than our latest blog posts) remain at the top, we can use Mindbreeze boosting to either increase the relevancy of product pages or decrease the relevancy of blog posts. All things being equal, it is better to down-boost less relevant content than to up-boost relevant content. We added a rule to reduce the relevancy of all blog post content by a factor of 0.75. We also boost our featured results by a factor of 10 to ensure they appear on top when relevant. In addition to manual tuning, Mindbreeze monitors and analyzes click patterns to learn from user behavior and improve relevancy automatically over time.
Creating the Search Results Page
The search results page used on this site was created using the Mindbreeze Search App Designer. This builder provides a drag-and-drop interface for creating modular, mobile-friendly search applications. Mindbreeze also provides a JSON API for fully custom search page development. Our search app combines a list-style results widget and three filter widgets to limit the results based on Technology, Blog Post Category, and Blog Post Author. The filter widgets available within the builder are determined by the metadata available via the indexing configuration described earlier.
Search Box Integration & Suggestions
To build an integrated website with Mindbreeze, we needed to integrate Mindbreeze with our existing website’s search box. To do this, we modified the search input in the site header to direct search form submissions to the new Mindbreeze search results page. Since we are using WordPress, this involved modifying the header.php file within our site’s child theme. We also added a call to the Mindbreeze Suggest API, displayed using jQuery autocomplete, in order to provide search suggestions as you type. Most WCM systems have template files which can be modified to integrate Mindbreeze search into existing site headers. Our customers have similar integrations within Adobe Experience Manager and Oracle WebCenter Portal, to name a few.
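The two pieces of that integration, sending the query to the results page and shaping suggestions for the autocomplete widget, can be sketched in a few lines. Our real integration lives in header.php and jQuery; the Python below is only a language-neutral illustration of the logic. The `/search/` path, the `query` parameter name, and the idea that suggestions arrive as a plain list of strings are assumptions, not the documented Mindbreeze Suggest API. The one detail taken from jQuery UI is that its autocomplete source accepts objects with `label` and `value` properties.

```python
from urllib.parse import urlencode

# Hypothetical location of the search results page.
RESULTS_PAGE = "https://www.fishbowlsolutions.com/search/"


def results_url(user_query: str) -> str:
    """URL the header search form would send the browser to."""
    # "query" is an assumed parameter name.
    return RESULTS_PAGE + "?" + urlencode({"query": user_query.strip()})


def to_autocomplete_items(suggestions, limit=5):
    """Shape suggestion strings as label/value objects for jQuery UI autocomplete.

    Duplicates that differ only by letter case are dropped, and at most
    `limit` items are returned.
    """
    seen, items = set(), []
    for text in suggestions:
        key = text.casefold()
        if key in seen:
            continue
        seen.add(key)
        items.append({"label": text, "value": text})
        if len(items) == limit:
            break
    return items


print(results_url(" WebCenter Content "))
print(to_autocomplete_items(["Mindbreeze", "mindbreeze", "Mindbreeze search"]))
```

Keeping the redirect and the suggestion shaping as two small, separate steps is what makes the same approach portable to other WCM templates.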
Closing Comments
We wanted to share the details about how we integrated our website with Mindbreeze to give anyone using or considering Mindbreeze an in-depth look at a real working search integration. The architecture and approach we took here can be applied to other platforms, both internal and external-facing, including SharePoint, Oracle WebCenter, and Liferay. Use the search box at the top of the page to try it for yourself. If you have any questions about an integrated website with Mindbreeze or other Mindbreeze search integration options, please contact us.