There are many words that are spelled the same but have different meanings depending on language and location. A very simple example is the word “football”. In the US and Canada, it refers to a game played with a ball that is thrown in the air and carried towards a goal, while in the UK and Australia it refers to a game that is played by kicking a ball into a goal (known as “soccer” to Americans). So, how does Google determine which meaning of a specific word a user is after?
Every time someone conducts one of these ambiguous searches, Google’s algorithm must first work out the user’s preferred language simply to understand which category of results should be returned, before it even determines how those results are ranked.
While the word football is spelled the same by all English speakers, a human audience would not know which type of game is being referenced in a conversation unless they knew where the person talking about the game came from. In both games, there are similar features like a great deal of running, passing, and even goal kicking.
Within a very short spoken conversation or statement, there would probably not be any semantic clues to help the listener figure out which kind of football was being referenced. If someone just asked, “What time is the football game?” or “Do you play football?”, the answer would depend on the specific kind of football. (In speech, an accent may give the listener a hint, but this advantage does not exist for phrases typed into a search box.) However, if the conversation continues, the listener will eventually be able to figure out whether the primary topic is American football or soccer.
As in spoken conversation, Google will also use the words surrounding the ambiguous term in longer queries to help refine the query. A query like “football pitch” suggests that a user is looking for soccer, while “football field goal” suggests an American (or Canadian) football query. Furthermore, Google combines additional query words with timing to understand the query. “What time is the football game?” searched on an NFL game-day Sunday would be a strong indicator of the user’s intent.
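To make the idea of using surrounding words concrete, here is a minimal sketch in Python. It is not Google's method, which is not public; it simply counts overlaps between the query and small cue-word lists. The names SENSE_CUES and guess_sense, and the cue words themselves, are assumptions made up for this illustration.

```python
# Hypothetical cue words for each sense of "football".
# Illustrative only; not taken from any real search system.
SENSE_CUES = {
    "american_football": {"field", "touchdown", "nfl", "quarterback"},
    "soccer": {"pitch", "premier", "fifa", "striker"},
}


def guess_sense(query, ambiguous_term="football"):
    """Return the sense whose cue words overlap the query most.

    Returns None when the term is absent, when no cue word appears,
    or when two senses tie, i.e. when the query stays ambiguous.
    """
    words = set(query.lower().split())
    if ambiguous_term not in words:
        return None
    context = words - {ambiguous_term}
    # Score each sense by how many of its cue words appear in the query.
    scores = {sense: len(context & cues) for sense, cues in SENSE_CUES.items()}
    best = max(scores.values())
    winners = [sense for sense, score in scores.items() if score == best]
    if best == 0 or len(winners) > 1:
        return None
    return winners[0]
```

With these assumed cue lists, guess_sense("football pitch") picks the soccer sense and guess_sense("football field goal") picks the American football sense, while a bare "football" returns None because there is no context to score.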
One Word Query
When the query is just one word, this becomes far more challenging. Figuring out which kind of sport a user is seeking is certainly difficult, but at least both variations refer to a game. Google could just return results for both definitions of football, but that would not be a very good user experience. An American seeking the NFL would not understand why there are results for soccer on the results page.
Google is able to get away with returning different categories of results for ambiguous queries like “breadcrumbs” because a user understands that the word could have multiple meanings. In the screenshot below, Google is returning results for recipes, the breadcrumb design element, a product, and a book. All of these make sense, and nothing suggests that Google failed to interpret the query. Adding a result from another culture or language is a lot more jarring.
This is an even greater challenge for the dozens of examples where a word means one thing in one language but something entirely different in another. In English, a “gift” is something nice you give to people, while in German, “Gift” means poison. In French, “pain” is bread, while in English, it is something we try very hard to avoid. (For some off-color examples, have a look at this Reddit thread.)
Language Prioritization for User Experience
If Google were to return results across multiple languages, the user would probably think there was something wrong with Google and use another search engine. It is even more important in these cases that Google correctly determines the user’s preferred language and returns only relevant results.
If there are other words that accompany the multi-use word, Google can use these to match the user’s language and return the best result. As before, the real challenge is when there is only a one-word query.
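The logic in this section can be sketched as follows: use accompanying words to guess the language when they exist, and fall back to the user's preferred language when the query is a single word. This is only an illustration under stated assumptions, not a description of how Google works. The FUNCTION_WORDS lists and the choose_language function are hypothetical, and a real system would draw on far richer signals.

```python
# Hypothetical lists of common function words per language.
# Illustrative only; a real system would use far richer signals.
FUNCTION_WORDS = {
    "en": {"the", "for", "a", "my", "and"},
    "de": {"der", "die", "das", "für", "und"},
    "fr": {"le", "la", "du", "pour", "et"},
}


def choose_language(query, preferred_language):
    """Pick a language for the query.

    If the accompanying words clearly point to one language, use it.
    Otherwise (for example, a one-word query) fall back to the
    user's preferred language.
    """
    words = set(query.lower().split())
    # Count how many known function words of each language appear.
    scores = {lang: len(words & fw) for lang, fw in FUNCTION_WORDS.items()}
    best = max(scores.values())
    winners = [lang for lang, score in scores.items() if score == best]
    if best > 0 and len(winners) == 1:
        return winners[0]
    return preferred_language
```

Under these assumptions, choose_language("gift für katzen", "en") returns "de" because of the word "für", while choose_language("gift", "de") has nothing but the user's preference to go on and returns "de".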