Search is changing.
David Amerland opens the foreword of his book, Google Semantic Search, with this sentence. If you are a long-time user of the Internet and, more specifically, of search engines, you can see how the way we look for information has changed over the past few years. We are at a moment in which semantic search, the ability to put typed searches into context, represents the most accurate way of providing answers.
However, before we delve into this new type of search, it is appropriate to stop and ask: what has motivated the change in how we search the Internet?
New Types of Queries
Certainly, the wide range of devices from which we can search is a determining factor: PCs, laptops, smartphones, tablets, TVs, etc. With this variety of devices come different input methods, from typing a word on a computer keyboard to speaking a request to voice applications like Siri, Sherpa, Cortana, or Google Now.
These advances have moved us from older queries like ‘restaurants in Manhattan’ to more specific queries such as ‘where to eat Indian food in Manhattan’ or ‘what is the best place to eat Indian food in Manhattan’.
We can see two trends in this evolution: an increase in long-tail queries and users demanding more precision. Search engines have had to adapt and provide more relevant results.
Search engines understood that identifying keywords alone was not enough; they needed to understand how data was related, both within the same site and throughout the web. This is where the most important change within the search landscape occurs: a progression from the ubiquitous keywords to the increasingly important entities. Words become concepts and search engines evolve into genuine learning machines.
This co-evolution results in what is now known as semantic search.
What is Semantic Search?
For a definition we will use Wikipedia (as Google does):
Semantic search seeks to improve search accuracy by understanding searcher intent and the contextual meaning of terms as they appear in the searchable dataspace, whether on the Web or within a closed system, to generate more relevant results.
And the definition from Techopedia:
Semantic search is a data searching technique in which a search query aims to not only find keywords, but to determine the intent and contextual meaning of the words a person is using for search.
We should stop at two important concepts: intent and context.
Intent comes from the user and explicitly states what he or she is looking for. Context can be understood as everything that surrounds a search and steers it in one direction or another, i.e., what gives it meaning. Thus, by understanding and connecting intent and context, search engines are able to understand each query: both what motivates it and what is expected from it.
The Arrival of Hummingbird
In the summer of 2013 Google announced a new update to its algorithm: Hummingbird. Unlike Panda or Penguin, this was a complete rewrite of the algorithm, designed to present results in a completely different way from before.
Google had changed its famous algorithm to focus on the understanding phase of the searches. This would facilitate the subsequent phases by limiting the number of documents in the index that were consulted to show the best possible results.
This reinforcement of the understanding phase means that Google starts to pay more attention to the context in which a search is performed and in which the concepts appear in documents and relate to each other. But how does it work? Well, thanks to semantics.
As noted above, words are no longer just strings of letters; they have become concepts. This means that what was once a single unit now carries a number of connotations that give it a concrete meaning. This is done through a process of disambiguation. Again, we turn to Wikipedia to define this term:
In computational linguistics, word-sense disambiguation (WSD) is an open problem of natural language processing and ontology, which governs the process of identifying which sense of a word (i.e. meaning) is used in a sentence, when the word has multiple meanings.
To disambiguate a term is to clearly differentiate its meanings, that is to say, to identify the different contexts in which it can be used.
How Semantic Search Works
We’ll use an example to understand in a more visual way how semantic search operates.
If we do a search for ‘panda’, without specifying anything else, we could think that we are referring to the bear species native to China, one of the Google algorithm updates or even a computer antivirus program. We just have to see the results page to understand that, in the first instance, for Google it is not clear what we are looking for:
Google, as a first option, understands that the user is looking for information about the panda bear, and that makes perfect sense, since the animal existed first and, therefore, gives its name to all the other results that appear. However, the range of results is a clear example of the need to fine-tune our searching. In fact, if we look at the related searches offered at the end of the page, it is clear that the options are quite broad:
It is also curious how Google Suggest works:
The search engine shows a chain of Chinese restaurants as its first choice (this is particularly striking in my case, since I live in Spain and I don’t do searches on google.com, so this result has nothing to do with any previous search on my part).
We could say that, lacking context, Google provides several options, so the user’s next step is to choose which kind of ‘panda’ he or she wants more information about.
If we now run a new search by typing ‘panda diet’ in the search box, it is clear that Google knows exactly which kind of results to show: those related to the diet of this animal. What we find is the following:
This time Google responds directly to our query, offering one of the cards from its Knowledge Base, extracting data from a panda record belonging to the website of the Smithsonian National Zoological Park. For more info on how Google responds directly to inquiries from users, I recommend reading this article by Dean Cruddace.
We see that this is a step beyond the Knowledge Graph, where Google anticipated some of the information from knowledge sources like Wikipedia.
If we now look at the related searches, we can see that Google better understands the user’s query:
This time, Google Suggest is more in line with the query:
In the first query Google is dealing with a single search term but several entities. In fact, if we go into Freebase (another source of knowledge on which Google draws to better understand the web) and search for ‘panda’, the database returns 220 results.
However, in the second case the term is better defined because we are giving it a context, so we could say that it is itself an entity. It would not make sense for Google to return antivirus web pages as results when we have typed the word ‘diet’ as part of our search.
In order to find and display the most relevant results, Google looks for help from search entities. In short, this is a process by which the searches users perform establish a set of relationships that help identify the importance of the various documents and, therefore, influence the information displayed. To learn more about search entities, I recommend reading this fantastic article by Bill Slawski explaining what they are.
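The effect of adding ‘diet’ to the query can be sketched in a few lines of Python. This is only a simplified, Lesk-style illustration that picks the sense whose signature words overlap most with the query. The `SENSES` table and the `disambiguate` function are hypothetical, and nothing here is meant to describe how Google actually resolves entities.

```python
import re

# Hypothetical sense inventory: each sense of "panda" has a few signature words.
SENSES = {
    "animal": {"bear", "bamboo", "china", "diet", "zoo", "mammal"},
    "google_update": {"google", "algorithm", "update", "seo", "ranking"},
    "antivirus": {"antivirus", "software", "security", "download", "virus"},
}


def disambiguate(query, senses=SENSES):
    """Return the sense whose signature overlaps most with the query, or None."""
    words = set(re.findall(r"[a-z]+", query.lower()))
    scores = {name: len(words & signature) for name, signature in senses.items()}
    best = max(scores, key=scores.get)
    # With no overlapping context word, the query stays ambiguous.
    return best if scores[best] > 0 else None
```

Under these assumptions, `disambiguate("panda")` returns `None` because nothing in the query favors one sense, while `disambiguate("panda diet")` returns `"animal"`.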
When establishing the context in which a query occurs, Google takes into account a number of factors such as:
- User search history.
- User location: depending on the location of the user, the search engine is able to discern what type of results are more appropriate for him or her.
- Global search history: searches carried out consecutively or close in time that are associated with another search.
- Relationships between a large amount of previously stored data (named terms or entities).
- Query characteristics: spelling, variations, etc.
- Domains linked from documents on the same topic.
- Co-occurrence of terms and the distance between them.
- And more.
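One of these factors, the co-occurrence of terms and the distance between them, is easy to picture with a short Python sketch. `min_term_distance` is a hypothetical helper written for this article; it only shows what ‘distance’ can mean at the word level and makes no claim about the signals a real search engine computes.

```python
import re


def min_term_distance(text, term_a, term_b):
    """Smallest gap, in words, between any occurrence of term_a and term_b.

    Returns None when the two terms do not co-occur in the text.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    positions_a = [i for i, tok in enumerate(tokens) if tok == term_a.lower()]
    positions_b = [i for i, tok in enumerate(tokens) if tok == term_b.lower()]
    if not positions_a or not positions_b:
        return None
    return min(abs(a - b) for a in positions_a for b in positions_b)
```

For the sentence ‘The giant panda eats bamboo almost all day’, the function returns 2 for ‘panda’ and ‘bamboo’; for a text about the antivirus that never mentions bamboo, it returns `None`.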
Why Should I Pay Attention to Semantic Search?
Google’s mission is to organize the world’s information and make it universally accessible and useful.
This statement appears at the top of the Google page dedicated to explaining the company. In the task of organizing information, it is critical to find mechanisms to understand existing content more precisely.
As mentioned above, search engines are becoming authentic learning machines, so if we want search engines to understand our pages the way humans do, we should make this easier for them.
There are many components that can help us achieve our goal, but we will highlight the semantic markup of our content as the most relevant alternative.
Google, Bing, Yahoo! and Yandex decided to join forces to develop a shared vocabulary for the semantic markup of web pages in HTML. The result was Schema.org, where we can find the reference needed to semantically mark up our content appropriately.
On this website we find the following statement:
On-page markup enables search engines to understand the information on web pages and provide richer search results in order to make it easier for users to find relevant information on the web.
As we can see, semantic markup helps search engines to display more relevant results for the user. We contribute to these results with the famous rich snippets.
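As an illustration, the Python sketch below builds a Schema.org `Article` description and serializes it as JSON-LD, one of the syntaxes in which Schema.org vocabulary can be expressed. The headline, author name, and date are placeholder values, and `build_article_jsonld` is a hypothetical helper, so treat this as a starting point under those assumptions rather than a complete markup recipe.

```python
import json


def build_article_jsonld(headline, author_name, date_published):
    """Return a JSON-LD string describing a Schema.org Article."""
    data = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "author": {"@type": "Person", "name": author_name},
        "datePublished": date_published,
    }
    return json.dumps(data, indent=2)


# Placeholder values, for illustration only.
markup = build_article_jsonld("What Do Giant Pandas Eat?", "Jane Doe", "2014-11-02")
print('<script type="application/ld+json">')
print(markup)
print("</script>")
```

The printed `<script>` element would go in the HTML of the page it describes. Running the result through one of the structured data testing tools is a sensible way to check it.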
How to Take Full Advantage of Semantic Search
Below I will give a few tips to help search engines better understand the content of our pages and so make those pages more relevant.
Create a Context
When you create new content, ask yourself the best way to present the information so it is easy for the user to find. When the user can easily understand our content, we have taken a big step in helping the search engines to do likewise.
When we talk or read about a specific term, it is usually accompanied by a series of other words that help to contextualize its meaning. Following the example above, if a piece of content is about the panda bear, it is logical that we also find the following words in the text: ‘bamboo’, ‘China’, ‘animal’, ‘mammal’, etc.
If we are able to identify all these ‘companion’ words and include them in our text, we can give clues on the subject of our content to search engines.
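A quick way to check a draft against such a list is sketched below in Python. `missing_companions` is a hypothetical helper and the word list is simply the one from the panda example. It performs exact word matching only, so plurals and synonyms would need extra handling.

```python
import re


def missing_companions(text, companions):
    """Return the companion words that never appear in the text, sorted."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return sorted(w for w in companions if w.lower() not in words)


draft = "The giant panda lives in China and eats bamboo."
print(missing_companions(draft, ["bamboo", "China", "animal", "mammal"]))
# prints ['animal', 'mammal']
```

The output is a hint about which contextual words the draft has not used yet, not a list of words to force into the text.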
Synonyms and Variations
Besides enriching our text and making it more readable by not repeating our keyword again and again, synonyms and variations help search engines with two tasks that, as we have seen, are very important:
- First, identifying another set of terms that are directly related to the topic of our content.
- Second, building a richer context for search engines.
Semantic Keyword Research
Keywords are still very important, so we must keep looking for those that offer us more benefit. With semantic search, we need to know how to find these profitable keywords that also benefit the context of our website.
Note: Sujan Patel wrote a useful article on this topic on SEJ.
Link to Sources With Same Topics
When linking to external sites, ensure that their content is of equal or higher quality than your own. You will strengthen the relationships between documents as well as provide more value to your users.
As we know, high-quality content attracts the attention of users and makes them more inclined to share it. For certain social networks we also have the chance to mark up our content with metadata to improve the way it is displayed:
- Facebook: OpenGraph
- Twitter: Twitter Cards
- Pinterest: Rich Pins
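As a rough illustration of what this metadata looks like, the Python sketch below generates a handful of Open Graph and Twitter Card `<meta>` tags. `social_meta_tags` is a hypothetical helper, the URLs in the usage note are placeholders, and only a small subset of the available properties is shown.

```python
from html import escape


def social_meta_tags(title, url, image_url):
    """Return Open Graph and Twitter Card <meta> tags as a list of strings."""
    open_graph = {
        "og:title": title,
        "og:type": "article",
        "og:url": url,
        "og:image": image_url,
    }
    twitter = {"twitter:card": "summary", "twitter:title": title}
    # Open Graph tags use the `property` attribute; Twitter Cards use `name`.
    tags = [
        f'<meta property="{key}" content="{escape(value)}">'
        for key, value in open_graph.items()
    ]
    tags += [
        f'<meta name="{key}" content="{escape(value)}">'
        for key, value in twitter.items()
    ]
    return tags
```

Calling `social_meta_tags("Panda Diet", "https://example.com/panda", "https://example.com/panda.jpg")` returns six tags that would go inside the page’s `<head>`.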
There are many tools and resources we can use to give our content more semantic value. Among them:
- Google Structured Data Testing Tool
- Yandex Checker
- Structured Data Linter
- Schema Creator
- Text Razor
And many more. In this post Barbara Starr explores a great selection as well.
There is no doubt that the search landscape is changing towards a more natural and spontaneous language. Although they will remain important, keywords are not what they were. At present, we need to have a much broader view of our industry.
The language we use in the world we live in is crucial: it is what gives both users and search engines what they need to find the information they seek.
The semantic web is our greatest ally in this task. Fortunately, we have at our disposal a number of tools to get the most out of our content. Can we afford not to take advantage of them? I think not.
Semantic search is the new black.
Attribution Notes: I don’t have any affiliation with any of the tools or websites I mention and link to. All screenshots were taken on November 2, 2014. Featured image via Pixabay.