SharePoint Conference 2012: Prominent Role for Search in SharePoint 2013

This week, many AIS team members attended the Microsoft SharePoint Conference in Las Vegas, Nevada. We’ll be posting blog posts from each of them as they learn what’s new and what’s exciting during sessions, demonstrations and other conference highlights.

The changes made to SharePoint Search in SharePoint 2013 are too numerous to describe in a single blog post, but I'll try to provide an overview of some of the major improvements, with the intent of emphasizing the central role played by search in the new platform. Our future solution architectures will likely have search as a key design consideration. The search-related sessions that I attended at SPC 2012 were filled to capacity, so there does seem to be great interest in the future of SharePoint Search.

In his session on building search-driven applications, Scot Hillier made the point that we should no longer think of search in the limited scope of a user typing a search term into a search box and the corresponding results appearing. Rather, we should think of search as a data access technology, in the same vein as CAML, REST and CSOM. In fact, he went as far as to say that search is the data access technology because, as he put it, “Search knows where all the skeletons are buried.”

To facilitate a more ‘search-driven’ platform, Microsoft has made improvements in all areas of SharePoint Search, namely: content sources and crawling, tagging content, querying content, and rendering results.

Crawling and Content Sources

In order for applications to reach and surface the many types of content spread across different information stores, that content needs to be available in the search index database. To this end, some important optimizations have been made in the content crawling infrastructure, and more connectors and content sources are supported in 2013. For applications that require displayed information to be ‘fresh’ and not suffer from the ‘index lag’ problem (where users may have to wait for hours before new content gets crawled and indexed before appearing as a search result), a Continuous Crawl option has been introduced for SharePoint content sources. With Continuous Crawl, changed content gets picked up by the crawler and pushed to the query processing component every 15 minutes by default (the frequency is configurable). If a full crawl has started, the new system allows the latest changes to appear in results before the full crawl completes.

Tagging/Identifying Content

Document parsing has been improved in SharePoint 2013. For example, deep links can be extracted from Word and PowerPoint documents, file formats are identified automatically (instead of relying on file extensions), and there are high-performance format handlers for file types such as HTML and PDF.

Managed properties mapped to crawled properties are still the key building blocks for search applications, but some major improvements have been made. First, site columns automatically become managed properties, which is a huge improvement: site owners can create managed properties without involving the search administrator.

Second, search settings can be exported and imported, which provides a mechanism to migrate search solutions from development to test to production environments. Managed properties are included in the export/import.

Third, the managed properties themselves have settings of their own. For example, marking a managed property as ‘Sortable’ means it can be used to sort results.

Querying Content

There is a new Search REST service that allows your applications to execute queries against SharePoint Search remotely via standard REST calls. The access point is http://server/_api/search.
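As a rough sketch, a remote query is just an HTTP GET against the service's query endpoint, with the query text wrapped in single quotes. The helper below is a hypothetical illustration (the function name and server URL are assumptions, not part of the platform API):

```javascript
// Sketch: build a Search REST query URL (helper name and server are assumptions).
// The service expects the query text wrapped in single quotes, e.g.
//   http://server/_api/search/query?querytext='sharepoint'
function buildSearchQueryUrl(serverUrl, queryText) {
  // Encode the query text so spaces and special characters survive the URL.
  const encoded = encodeURIComponent(queryText);
  return serverUrl + "/_api/search/query?querytext='" + encoded + "'";
}

// Example:
const url = buildSearchQueryUrl("http://server", "sharepoint");
// url === "http://server/_api/search/query?querytext='sharepoint'"
```

An application would then issue a GET request for that URL (with the appropriate Accept header for JSON or XML) and parse the result set from the response.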

Note that the SQL query syntax used in prior versions of SharePoint has been removed in 2013 (the FAST query syntax is still supported). The new Keyword Query Language (KQL) is a powerful and feature-rich query language. For example, it supports property restrictions, which let a query narrow its results based on managed properties (e.g. filetype:docx will return only Word documents). You can use KQL in queries executed from your application via either REST or the CSOM. There are numerous features of KQL that could take up an entire blog post or even a blog series!
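To illustrate, a property restriction is simply a `property:value` pair appended to the free-text terms, so a KQL string can be composed with a small helper (the function name below is made up for this sketch; the filter names follow the filetype example above):

```javascript
// Sketch: compose a KQL query from free-text terms plus property restrictions.
// Property restrictions (e.g. filetype:docx) narrow results by managed property.
function buildKql(terms, restrictions) {
  const parts = [terms];
  for (const [prop, value] of Object.entries(restrictions)) {
    // Quote values containing spaces, per KQL conventions.
    const v = value.includes(" ") ? '"' + value + '"' : value;
    parts.push(prop + ":" + v);
  }
  return parts.join(" ");
}

// "quarterly report", restricted to Word documents by a specific author:
const kql = buildKql("quarterly report", { filetype: "docx", author: "Jane Doe" });
// kql === 'quarterly report filetype:docx author:"Jane Doe"'
```

The resulting string can be passed as the query text to either the REST service or a CSOM query object.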

Adaptive learning is also new. You can create a query rule that fires only when a particular result type is commonly clicked. For example, a search for a particular term may return various types of documents in the results, but suppose most users click only the Excel documents in that result set. The rule will then begin promoting Excel files to the top of the result set.

Rendering/Display

Search results are now displayed via standard HTML/JS templates. The results display is rich, with hover panels, refiners, counts and actions. Deep links within Word and PowerPoint documents are shown directly in the hover panel, so a user can click the title of a PowerPoint slide in the hover panel and be taken directly to that particular slide.

Easy and Powerful Extensibility

Display templates are the new way to customize the display of search results. There were cheers from the audience when it was announced that XSLT is no longer needed to customize the display of results, because display templates use standard HTML and JavaScript. Display templates are particularly powerful when used in combination with the new Content Search Web Part (CSWP), a rollup web part (like the Content Query Web Part in 2010) that allows surfacing of content across site collections. The CSWP also has a full-screen Query Builder, a powerful interface for constructing complex queries that make use of other search improvements such as Query Rules (another feature that would warrant a dedicated blog post).
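As a loose illustration of what "standard HTML and JavaScript" looks like here, an item display template mixes HTML markup with special tokens: `<!--#_ ... _#-->` blocks hold plain JavaScript, and `_#= ... =#_` tokens emit values into the markup. The fragment below is a hedged, minimal sketch (the file name, wrapper `div`, and class name are assumptions for illustration, not a complete template):

```html
<!-- Item_Minimal.html: a hypothetical, stripped-down item display template -->
<div id="Item_Minimal">
  <!--#_
    // Plain JavaScript executes inside these comment-style blocks
    var title = $getItemValue(ctx, "Title");
  _#-->
  <div class="cbs-Item">
    <!-- _#= ... =#_ tokens render values into the HTML output -->
    <a href="_#= ctx.CurrentItem.Path =#_">_#= title =#_</a>
  </div>
</div>
```

A complete template also declares which managed properties it consumes in its header, which is how site columns promoted to managed properties (described earlier) flow through to the rendered results.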

Finally, the content publishing functionality also relies on search as a foundation. In a session on this topic, it was noted that the key advantages of this approach are to “break down site collection boundaries, eliminate large list thresholds, allow flexible and dynamic publishing, separate presentation from storage.”

As developers of custom applications, whenever we’re faced with problems involving surfacing content from multiple site collections (or even multiple content sources), using the new SharePoint Search infrastructure should be foremost in our minds as a key to the solution.

About Sanjeev Bhutt

Sanjeev is a Solutions Architect at Applied Information Sciences and has been with the company for almost a decade. Sanjeev has been leading teams developing custom .NET business solutions for a large financial services company. Prior to joining AIS, Sanjeev managed the development of 'shrink-wrapped' software products. Sanjeev has an MSEE and BSEE from the University of the Witwatersrand in Johannesburg, South Africa.