Google Patent: Organic Results Ranked by User Profiling
Google has filed a patent application for organic search, titled Personalization of placed content ordering in search results, that describes serving organic search results based on user profiles. Google has also applied for a similar behavioral targeting patent for its advertising network, but this appears to be the first sign that Google plans to integrate user profiling into natural search ranking.
Google builds these profiles from previous queries; web navigation behavior captured via tracked links and possibly via visits to sites that serve Google ads; computers with Google applications installed, such as Desktop Search, Google Wi-fi Connection or Sidebar; and personal information that, in Google’s words, may be “implicitly or explicitly provided by the user.”
This new ranking system, a spin-off of PageRank and the current Google ranking algorithm, could be referred to as Profile Rank. How does it differ from Google Personalized Search? Personalized Search was beta tested by Google users who opted in to Google profile building. The new Profile Rank is based on user profiles built by tracking a user’s web habits inside and outside of Google Search, even if the user has not opted in to personalized results and is not a registered Google Account member. In other words, widespread personalization for all users.
In the patent application, Google explains that when a search engine generates results in response to a search query, each listed site that satisfies the query is assigned a query score, QueryScore, in accordance with the search query. This query score is then modulated by the site’s PageRank to generate a generic score, GenericScore, expressed as: GenericScore=QueryScore*PageRank.
However, Google states that the GenericScore system may not be relevant enough and proposes a more in-depth Profile Rank (PersonalizedScore): This GenericScore may not appropriately reflect the site’s importance to a particular user if the user’s interests or preferences are dramatically different from that of the random surfer. The relevance of a site to a user can be accurately characterized by a set of profile ranks, based on the correlation between a site’s content and the user’s term-based profile, herein called the TermScore, the correlation between one or more categories associated with a site and the user’s category-based profile, herein called the CategoryScore, and the correlation between the URL and/or host of the site and the user’s link-based profile, herein called the LinkScore. Therefore, the site may be assigned a personalized rank that is a function of both the document’s generic score and the user profile scores. This personalized score can be expressed as: PersonalizedScore=GenericScore*(TermScore+CategoryScore+LinkScore).
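To make the two formulas concrete, here is a minimal Python sketch. The formulas themselves come from the patent text quoted above; the function names, the ProfileScores container and all of the numbers are made up for illustration, and nothing here should be read as how Google actually computes its rankings.

```python
from dataclasses import dataclass


@dataclass
class ProfileScores:
    """Hypothetical container for the three user-profile correlations."""
    term: float      # TermScore: site content vs. term-based profile
    category: float  # CategoryScore: site categories vs. category-based profile
    link: float      # LinkScore: site URL/host vs. link-based profile


def generic_score(query_score: float, page_rank: float) -> float:
    # GenericScore = QueryScore * PageRank
    return query_score * page_rank


def personalized_score(generic: float, profile: ProfileScores) -> float:
    # PersonalizedScore = GenericScore * (TermScore + CategoryScore + LinkScore)
    return generic * (profile.term + profile.category + profile.link)


# Illustrative numbers only; none of them come from the patent.
generic_a = generic_score(query_score=0.8, page_rank=0.5)
generic_b = generic_score(query_score=0.5, page_rank=0.5)

site_a = personalized_score(generic_a, ProfileScores(term=0.1, category=0.2, link=0.1))
site_b = personalized_score(generic_b, ProfileScores(term=0.5, category=0.25, link=0.25))

# Site B has the lower generic score but ends up ranked higher for this user.
print(generic_a > generic_b, site_b > site_a)
```

With these made-up inputs, site A wins on GenericScore alone, while site B wins once the user profile scores are multiplied in. That reordering is the whole point of the proposed Profile Rank.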
Google gives an example of a listing based upon user profiling mixed with information given by the user: a user may choose to offer personal information, including demographic and geographic information associated with the user, such as the user’s age or age range, educational level or range, income level or range, language preferences, marital status, geographic location (e.g., the city, state and country in which the user resides, and possibly also including additional information such as street address, zip code, and telephone area code), cultural background or preferences, or any subset of these.
Compared with other types of personal information such as a user’s favorite sports or movies that are often time varying, this personal information is more static and more difficult to infer from the user’s search queries and search results, but may be crucial in correctly interpreting certain queries submitted by the user.
For example, if a user submits a query containing “Japanese restaurant”, it is very likely that he may be searching for a local Japanese restaurant for dinner. Without knowing the user’s geographical location, it is hard to order the search results so as to bring to the top those items that are most relevant to the user’s true intention. In certain cases, however, it is possible to infer this information. For example, users often select results associated with a specific region corresponding to where they live.
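The idea of inferring a location from the results a user tends to select can be sketched in a few lines of Python. This is only an assumption about how such an inference might work: the patent excerpt does not describe a method, and the function names, the 50% threshold and the 1.5x boost are all invented for the example.

```python
from collections import Counter
from typing import List, Optional, Tuple

# Hypothetical result record: (title, score, region or None if not regional)
Result = Tuple[str, float, Optional[str]]


def infer_region(clicked_regions: List[str], min_share: float = 0.5) -> Optional[str]:
    """Return the region that dominates past clicks, or None if no region does."""
    if not clicked_regions:
        return None
    region, count = Counter(clicked_regions).most_common(1)[0]
    return region if count / len(clicked_regions) >= min_share else None


def rerank(results: List[Result], region: Optional[str], boost: float = 1.5) -> List[Result]:
    """Sort results by score, boosting those that match the inferred region."""
    def adjusted(item: Result) -> float:
        _, score, item_region = item
        if region is not None and item_region == region:
            return score * boost
        return score
    return sorted(results, key=adjusted, reverse=True)


# Made-up click history and results for the query "Japanese restaurant".
clicks = ["Seattle", "Seattle", "Portland", "Seattle"]
results = [
    ("Sushi bar in New York", 0.9, "New York"),
    ("Ramen house in Seattle", 0.7, "Seattle"),
    ("History of sushi", 0.8, None),
]

home = infer_region(clicks)
for title, _, _ in rerank(results, home):
    print(title)
```

Here the Seattle listing moves to the top even though it starts with the lowest score, because most of the user's past clicks point to Seattle.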
What about shared machines? If one computer is shared by several users with different web behavior, how is Google to define Profile Rank in its organic search results? Google has thought this through:
Sometimes, multiple users may share a machine, e.g., in a public library. These users may have different interests and preferences. In one embodiment, a user may explicitly login to the service so the system knows his identity. Alternatively, different users can be automatically recognized based on the items they access or other characteristics of their access patterns. For example, different users may move the mouse in different ways, type differently, and use different applications and features of those applications. Based on a corpus of events on a client and/or server, it is possible to create a model for identifying users, and for then using that identification to select an appropriate “user” profile. In such circumstances, the “user” may actually be a group of people having somewhat similar computer usage patterns, interests and the like.
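The patent excerpt does not say what kind of model would tell users apart, so the sketch below swaps in one of the simplest possible options, a nearest-centroid match, purely to show the shape of the idea. The feature names, the stored profiles and every number are hypothetical, and the features are assumed to be already scaled to the 0..1 range.

```python
import math
from typing import Dict, Sequence


def nearest_profile(session: Sequence[float],
                    centroids: Dict[str, Sequence[float]]) -> str:
    """Pick the stored profile whose average behavior is closest to this session."""
    return min(centroids, key=lambda name: math.dist(session, centroids[name]))


# Hypothetical features, each scaled to 0..1:
# (typing speed, mouse movement speed, share of time spent in a mail client)
centroids = {
    "profile_fast_typist": (0.9, 0.3, 0.8),
    "profile_slow_typist": (0.2, 0.7, 0.1),
}

# Made-up measurements from the current session on the shared machine.
current_session = (0.8, 0.4, 0.7)
print(nearest_profile(current_session, centroids))
```

Note that a match like this selects a profile, not a person. As the patent text says, the “user” behind a profile may really be a group of people with similar usage patterns.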
Users identified by the way they move a mouse or by their typing style? Amazing.
The patent, Personalization of placed content ordering in search results, is pretty detailed and deep. I suggest reading it over a couple of times, printing it out and breaking out the highlighter from college, because there is a lot to it and a handful of clues as to the future of Google and its ranking system.