Reduced link graphs are a way for search engines to identify high-quality websites and remove low-quality spam sites from the link ranking calculation. Published research demonstrates that reduced link graphs are an effective way to rank links. This article discusses research and patents that explain how reduced link graph algorithms work.
As the tweet by Bill Slawski communicates, understanding how algorithms work gives you a more accurate idea of what is possible.
This works in reverse, as well. The lack of research or patents can be seen as evidence that a theory about how Google ranks a site may be incorrect.
Understanding what is possible and what is not possible helps you to better discern between dead-end ideas and strategies based on credible evidence.
Nobody at this time can definitively assert that reduced link graphs are used. But research shows that reduced link graphs are an effective strategy for catching spam. Thus, it is plausible that some form of reduced link graph is in use by the search engines.
Research into Reduced Link Graphs
Researchers discovered that they could improve search results by removing links from their link graph and then running the ranking algorithm on the links that survived the culling. This work noted there was room for improvement: more sophisticated classifiers could identify more links to remove and better links to include, while also avoiding false positives.
The initial research showed strong indications that this technique worked exceptionally well.
Measuring Similarity to Detect Qualified Links is the research study I will first cite. It discovered that removing certain kinds of links before the ranking process begins results in less spam and more accurate search results.
1. Advertising and Navigation Links and Link Authority
Two kinds of links they immediately removed from the ranking process were advertising links and navigational links.
“The prevalence of links that do not (or should not) confer authority is an important reason that makes link analysis less effective. Examples of such links are links that are created for the purpose of advertising or navigation…
…from the perspective of link analysis algorithms, these links are noisy information because they do not show the authors’ recommendation of the target pages. Traditional link analysis algorithms do not distinguish such noise from useful information. As a consequence, the target pages of these links could get unmerited higher ranking. Therefore, in order to provide better retrieval quality, the influence of such links needs to be reduced.”
The purpose of this research paper is to determine, “which links should be used in web link analysis.”
To answer that question they suggest using a reduced link graph, a map of the web with irrelevant and spam links thrown out.
The research paper suggests using what it calls “qualified links.” The researchers create signals of similarity between the web page linking out and the target page being linked to. The links that do not score high enough for similarity are filtered out.
The non-similar links that are filtered out are called unqualified links. The links that remain in the link graph are called qualified links. This results in a smaller link graph composed only of qualified links.
This is called a reduced link graph. From here, starting with the reduced link graph, the search engine can then run its link ranking algorithms.
Here is how the research paper describes the process:
“…the “unqualified links” are filtered out, which leaves only the “qualified links”. Link analysis algorithms are then performed on the reduced web graph and generate the resulting authority ranking.”
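To make the process concrete, here is a minimal sketch of the filter-then-rank idea. It is not the paper's implementation. I am assuming a simple cosine similarity over word counts as a stand-in for the paper's similarity signals, a hypothetical threshold of 0.2, a basic PageRank as the link analysis algorithm, and made-up page names and text.

```python
import math
from collections import Counter

def cosine_similarity(text_a, text_b):
    """Cosine similarity of two texts using raw word counts (an assumed signal)."""
    a, b = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def reduce_link_graph(links, page_text, threshold=0.2):
    """Keep only "qualified" links: source and target text are similar enough."""
    return [(src, dst) for src, dst in links
            if cosine_similarity(page_text[src], page_text[dst]) >= threshold]

def pagerank(pages, links, damping=0.85, iterations=50):
    """Basic PageRank; rank held by pages with no out-links is spread evenly."""
    out_links = {page: [] for page in pages}
    for src, dst in links:
        out_links[src].append(dst)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        dangling = sum(rank[p] for p in pages if not out_links[p])
        new_rank = {p: (1 - damping) / n + damping * dangling / n for p in pages}
        for src, targets in out_links.items():
            for dst in targets:
                new_rank[dst] += damping * rank[src] / len(targets)
        rank = new_rank
    return rank

# Hypothetical pages and links, for illustration only.
page_text = {
    "hiking-blog": "hiking trails boots mountain gear",
    "boot-review": "hiking boots review mountain gear",
    "casino-ad": "casino poker bonus jackpot",
}
pages = list(page_text)
links = [
    ("hiking-blog", "boot-review"),
    ("hiking-blog", "casino-ad"),   # off-topic link, expected to be filtered out
    ("boot-review", "hiking-blog"),
]

reduced_links = reduce_link_graph(links, page_text)
ranks_full = pagerank(pages, links)
ranks_reduced = pagerank(pages, reduced_links)

print("Qualified links:", reduced_links)
print("casino-ad rank on full graph:   ", round(ranks_full["casino-ad"], 4))
print("casino-ad rank on reduced graph:", round(ranks_reduced["casino-ad"], 4))
```

In this toy example the off-topic link to the casino page shares no words with the hiking page, so it is dropped before ranking and the casino page no longer receives authority from it. A real system would need far richer similarity signals than word overlap.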
This particular approach does not rely on the anchor text or the surrounding text. It is described as not being query-specific. While previous work, such as statistical analysis, has focused on identifying spam pages, what makes this approach stand apart is that it focuses on the links between pages.
Focusing on the links between sites and tossing out the “unqualified links” is a step in the evolution of modern link ranking algorithms. This is why anyone concerned with SEO should understand what a reduced link graph is.
Google and Reduced Link Graphs
Google described a link ranking algorithm that, in my opinion, could form the basis for Penguin. This algorithm differs from previous algorithms because it ranks the distances between links.
This link ranking algorithm identifies what it calls seed sites within a variety of niche topics. The algorithm then ranks the distances between links from one site to the next. The further away a site is from the seed sites, the likelier it is that the site is spam or irrelevant.
According to the patent, Producing a Ranking for Pages Using Distances in a Web-Link Graph:
“In a variation on this embodiment, the links associated with the computed shortest distances constitute a reduced link-graph.”
“A Reduced Link-Graph
Note that the links participating in the… shortest paths from the seeds to the pages constitute a sub-graph that includes all the links that are “flow” ranked from the seeds.
Although this sub-graph includes much less links than the original link-graph, the… shortest paths from the seeds to each page in this sub-graph have the same lengths as the paths in the original graph.”
As you can see, this new variation on PageRank, parts of which could be in use as Penguin, uses a reduced link graph. In this case, it creates a new link graph that is composed of the sites and pages with the shortest distances from the original seed set.
The sites with the shortest distances are the reduced link graph, from which rankings are then calculated. The sites with the longest distances are shut out and do not participate in the ranking calculation.
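The idea of a sub-graph built from shortest paths can be sketched in a few lines. This is only an illustration under my own assumptions, not the patent's method: the link lengths below are hypothetical numbers (the patent computes lengths in its own way, which I do not reproduce), the page names are invented, and I use a standard multi-source Dijkstra search to find the distances.

```python
import heapq

def shortest_distances(graph, seeds):
    """Multi-source Dijkstra: shortest distance from any seed to each reachable page."""
    dist = {seed: 0.0 for seed in seeds}
    heap = [(0.0, seed) for seed in seeds]
    heapq.heapify(heap)
    while heap:
        d, page = heapq.heappop(heap)
        if d > dist.get(page, float("inf")):
            continue  # stale heap entry, a shorter path was already found
        for target, length in graph.get(page, {}).items():
            candidate = d + length
            if candidate < dist.get(target, float("inf")):
                dist[target] = candidate
                heapq.heappush(heap, (candidate, target))
    return dist

def reduced_link_graph(graph, dist):
    """Keep only links that lie on some shortest path from the seeds."""
    return sorted(
        (src, dst)
        for src, targets in graph.items() if src in dist
        for dst, length in targets.items()
        if abs(dist[src] + length - dist[dst]) < 1e-9
    )

# Hypothetical link graph: graph[source][target] = assumed link length.
graph = {
    "seed": {"a": 1.0, "b": 4.0},
    "a": {"b": 1.0, "c": 5.0},
    "b": {"c": 1.0},
    "spam": {"c": 1.0, "spam2": 1.0},   # no path from the seed reaches these pages
    "spam2": {"spam": 1.0},
}

dist = shortest_distances(graph, ["seed"])
reduced = reduced_link_graph(graph, dist)

print("Distances from seed:", dist)
print("Reduced link graph:", reduced)
```

In this toy graph the direct link from the seed to "b" and the link from "a" to "c" are dropped because shorter routes exist, yet every reachable page keeps the same shortest distance it had in the full graph, which is the property the patent excerpt describes. The two spam pages link out but cannot be reached from the seed, so they receive no distance at all.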
Does Google Use a Reduced Link Graph?
Google doesn’t generally confirm which parts of any patent or research paper it uses in its algorithm. Since no one at Google has discussed reduced link graphs, we’re on our own to speculate whether Google uses them or not.
Given the successes researchers have experienced using the reduced link graph technique, it’s possible that a form of reduced link graph is indeed in use by Google and other search engines. But lacking confirmation from Google, it’s still a matter of conjecture.
Takeaway: How Reduced Link Graphs Affect SEO
It’s good to think about how reduced link graphs work. Understanding how modern algorithms work may help you better formulate a ranking strategy, as well as understand why some SEO strategies stop working. The important idea to take away is that reduced link graphs are within the realm of possibility, which is more than can be said about strategies based on nothing.