We’ve seen quite a few ways to search the web by similarity, including similarity-based image search, the Ambiently bookmarklet (which searches for “related” pages) and TouchGraph Google Browser (which is based on Google’s RELATED: advanced operator).
This week we are looking into an alternative app: Similar Pages.
The tool uses its own crawler to collect pages around the web and create The Map of Internet Similarity. The map is claimed to contain more than 3.2 billion pages.
It covers all domain extensions and types of webpages, regardless of their popularity. The degree of similarity between webpages is computed regularly, and the map is updated continuously with newly published sites.
That means that the tool goes “beyond the surface”, unlike regular search engines (e.g. Google), which mostly let people see “popular” pages and leave most of the web “in the dark”:
Only a very small % of users are browsing results after page 1 or page 2 of the search results pages. The result of this search behavior is that we mostly navigate across a small set of sites with high popularity: most of the web exists in the dark. Millions of interesting sites remain hidden at pages 2, 10, 50 or 100 of the results of a web search, because of their low popularity.
Similarity is determined with the help of PageAffinity, which analyzes both the content of pages and the linking structure of the web to determine the level of similarity between webpages.
In order to make this a reality, we developed PageAffinity a set of algorithms to identify and calculate the degree of similarity between webpages, bringing to light the ‘long tail’ of the web.
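The details of PageAffinity are not public, so the following is only a rough sketch of the general idea of mixing a content signal with a link signal. Everything in it is our own assumption rather than the tool's actual method: the function names, the word-count cosine measure, the Jaccard overlap of outgoing links, and the 50/50 weighting are all hypothetical choices made for illustration.

```python
import math
import re
from collections import Counter


def content_similarity(text_a, text_b):
    """Cosine similarity between simple word-count vectors (illustrative choice)."""
    words_a = Counter(re.findall(r"[a-z0-9]+", text_a.lower()))
    words_b = Counter(re.findall(r"[a-z0-9]+", text_b.lower()))
    shared = words_a.keys() & words_b.keys()
    dot = sum(words_a[w] * words_b[w] for w in shared)
    norm_a = math.sqrt(sum(c * c for c in words_a.values()))
    norm_b = math.sqrt(sum(c * c for c in words_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def link_similarity(links_a, links_b):
    """Jaccard overlap between the sets of URLs two pages link to (illustrative choice)."""
    union = links_a | links_b
    if not union:
        return 0.0
    return len(links_a & links_b) / len(union)


def combined_similarity(page_a, page_b, content_weight=0.5):
    """Weighted mix of both signals; the weight is a hypothetical parameter."""
    content = content_similarity(page_a["text"], page_b["text"])
    links = link_similarity(page_a["links"], page_b["links"])
    return content_weight * content + (1 - content_weight) * links


# Two made-up pages with similar wording and one shared outgoing link.
page_a = {"text": "SEO tools blog", "links": {"a.example", "b.example"}}
page_b = {"text": "seo tools blog", "links": {"b.example", "c.example"}}
score = combined_similarity(page_a, page_b)
```

A real system working on billions of pages would need far more than this (crawling, text normalization, scalable indexing), but the sketch shows why two pages can score as similar even when only one of the two signals is strong.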
The Firefox Addon
With the Similar Pages Firefox addon, you can quickly discover similar pages as well as improve your search efficiency:
1. As a discovery tool, it suggests up to 300 sites “similar in typology and topic” to the current page. When you land on any page and want to see which pages it is similar to, click the toolbar icon and get the list generated in the sidebar:
The same functionality is available via the right-click context menu:
2. As a search tool, it loads similar pages right in the Google search results page when you click its icon:
As the screenshots show, the tool does a good job of finding “similar” pages, but it gives preference to home pages over internal pages.
What are your thoughts?