Google Latent Semantic Indexing

Aaron Wall

Many people have noticed a wide shuffle in search relevancy scores recently. Some of those well in the know attribute this to latent semantic indexing, which Google has been using for a while but has recently begun weighting more heavily.

How Does Latent Semantic Indexing Work?
Latent semantic indexing allows a search engine to determine what a page is about beyond literally matching the text of the search query. A page about Apple computers, for example, will naturally tend to contain terms such as iMac or iPod.

Latent semantic indexing adds an important step to the document indexing process. In addition to recording which keywords a document contains, the method examines the document collection as a whole, to see which other documents contain some of those same words. LSI considers documents that have many words in common to be semantically close, and ones with few words in common to be semantically distant. This simple method correlates surprisingly well with how a human being, looking at content, might classify a document collection. Although the LSI algorithm doesn’t understand anything about what the words mean, the patterns it notices can make it seem astonishingly intelligent. source

By placing additional weight on related words in content, LSI has the net effect of lowering the value of pages that match only the specific term and do not back it up with related terms.
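To make the idea of "semantically close" documents concrete, here is a minimal sketch of the textbook form of LSI: build a term-document matrix, reduce it with a truncated singular value decomposition, and compare documents in the reduced space. This is only an illustration under stated assumptions. The toy corpus, the function names, and the choice of two dimensions are all invented for the example, and nothing here reflects how Google actually implements or weights the technique.

```python
import numpy as np

# Toy corpus: hypothetical documents invented for illustration only.
docs = [
    "apple imac ipod computer",
    "apple ipod music computer",
    "imac computer apple desktop",
    "banana fruit apple pie",
    "fruit pie recipe banana",
]


def build_term_document_matrix(documents):
    """Count how often each word (row) appears in each document (column)."""
    vocab = sorted({word for doc in documents for word in doc.split()})
    row_of = {word: i for i, word in enumerate(vocab)}
    matrix = np.zeros((len(vocab), len(documents)))
    for col, doc in enumerate(documents):
        for word in doc.split():
            matrix[row_of[word], col] += 1.0
    return vocab, matrix


def lsi_document_vectors(matrix, k=2):
    """Project documents onto the k strongest singular directions."""
    _, s, vt = np.linalg.svd(matrix, full_matrices=False)
    # Each row of the result is one document in the reduced k-dim space.
    return (np.diag(s[:k]) @ vt[:k, :]).T


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


vocab, tdm = build_term_document_matrix(docs)
vectors = lsi_document_vectors(tdm, k=2)

# Two pages about Apple computers versus a computer page and a recipe page.
sim_computers = cosine(vectors[0], vectors[1])
sim_mixed = cosine(vectors[0], vectors[4])
print(f"computer page vs. computer page: {sim_computers:.3f}")
print(f"computer page vs. recipe page:   {sim_mixed:.3f}")
```

The two computer pages score as much closer to each other than the computer page does to the recipe page, even though the word "apple" appears in both groups. The surrounding vocabulary, not the single shared keyword, decides where a document lands.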

Mix Your Anchor Text!
Latent semantic indexing can also be used to look at the link profile of your website. If all your links are heavy in a few particular phrases and light on other similar phrases, then your site may not rank as well.
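One rough way to see how concentrated a link profile is would be to count how often each anchor phrase occurs. The sketch below does that with a made-up list of anchor texts. The sample data, the function names, and the 50% cutoff are all assumptions chosen for illustration; no search engine publishes a threshold like this.

```python
from collections import Counter

# Hypothetical anchor texts of inbound links (invented sample data).
anchors = [
    "SEO Book", "SEO Book", "SEO Book", "SEO Book",
    "search engine optimization", "search engine marketing",
    "SEO Book", "search engines", "SEO Book", "search engine promotion",
]


def anchor_text_shares(anchor_texts):
    """Map each distinct anchor phrase to its share of all links."""
    counts = Counter(text.strip().lower() for text in anchor_texts)
    total = sum(counts.values())
    return {phrase: n / total for phrase, n in counts.most_common()}


def dominant_phrases(shares, max_share=0.5):
    """Return phrases above max_share (an arbitrary, assumed cutoff)."""
    return [phrase for phrase, share in shares.items() if share > max_share]


shares = anchor_text_shares(anchors)
flagged = dominant_phrases(shares)

for phrase, share in shares.items():
    print(f"{share:5.0%}  {phrase}")
print("Over-represented:", flagged)
```

If one phrase shows up as over-represented, that is a hint to vary the anchor text of future links with related phrases.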

Example Related Terms:
Many of my links to this site say “SEO Book,” but I also used various other anchor text combinations to make the linkage data appear less manipulative.

Instead of using SEO in all the links, some of them may use phrases like:
search engine optimization
search engine marketing
search engine
search engines
search engine promotion

Instead of using book in all the links some other good common words might be

How do I Know What Words are Related?
There are several ways to find out which words are related to one another.

* Search Google using a ~ to get results with related terms. For example, the Google search ~seo will return pages with terms matching or related to seo and will highlight some of the related words in the search results.
* Use a lexical database.
* Look at the keyword variations suggested by various keyword suggestion tools.
* Write a page and use the Google AdSense sandbox to see what type of ads they would try to deliver to that page.
* Read the page copy and analyze the backlinks of high ranking pages.

Google Sandbox and Latent Semantic Indexing:
The concept of “Google Sandbox” has become synonymous with “the damn thing won’t rank” or whatever. The Sandbox idea is based upon sites with inadequate perceived trust taking longer to rank well.

Latent semantic indexing is just another piece of the algorithm, though many sites will shift significantly in rankings because of it. Latent semantic indexing does not necessarily relate directly to the Google Sandbox theory.

Where do I learn more about Latent Semantic Indexing?
A while ago I read Patterns in Unstructured Data and found it was written in plain, easy-to-understand English.

Brian Turner also listed a good number of research papers in this thread.

Aaron Wall is the Author of SEO Book
