We keep hearing about internal duplicate content issues, but many of us are genuinely unaware that our own websites suffer from them. Why?
- most websites are built with third-party site builders that create duplicate URLs for the same content without us knowing about it;
- sometimes webmasters just lack SEO knowledge. For example, they might be unaware that URL paths are case sensitive, so www.yoursite.com/page1 and www.yoursite.com/Page1 can be handled as two different pages with the same content.
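To make the case-sensitivity point concrete, here is a minimal Python sketch that takes a list of crawled URLs and reports those that differ only by letter case in the path. The function name `case_variants`, the sample list, and the `http://` scheme added to the example URLs are assumptions made for illustration.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def case_variants(urls):
    """Return groups of URLs that differ only by letter case in the path."""
    groups = defaultdict(set)
    for url in urls:
        parts = urlsplit(url)
        # Hostnames are case-insensitive; paths generally are not, which is
        # why the two spellings can end up indexed as separate pages.
        key = (parts.netloc.lower(), parts.path.lower(), parts.query)
        groups[key].add(url)
    return [sorted(group) for group in groups.values() if len(group) > 1]


# Hypothetical crawl result, for illustration only.
crawled = [
    "http://www.yoursite.com/page1",
    "http://www.yoursite.com/Page1",
    "http://www.yoursite.com/about",
]

for group in case_variants(crawled):
    print("Possible case duplicates:", group)
```

Any group it prints is a candidate for a redirect to a single, consistently cased URL.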
So what can create duplicate content?
- canonical issues (www and non-www versions);
- pagination, when different pages have identical titles and meta descriptions;
- various versions of the home page (e.g. www.site.com and www.site.com/index.php);
- incorrect internal navigation creating several URLs for one and the same page (e.g. www.site.com/page.php?id=567 and www.site.com/category/page.php?id=567), etc.
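Two of the causes above (www vs. non-www and default home page names) can be spotted from the URLs alone. The sketch below is one rough way to do it in Python, not a definitive method: the helper names `canonical_key` and `group_duplicates`, the `DEFAULT_PAGES` list, and the `http://` scheme on the sample URLs are all assumptions.

```python
from collections import defaultdict
from urllib.parse import urlsplit

# Assumed default document names; adjust to whatever your server uses.
DEFAULT_PAGES = ("index.php", "index.html", "index.htm")


def canonical_key(url):
    """Reduce a URL to a key that ignores www and default page names."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[len("www."):]
    path = parts.path
    for name in DEFAULT_PAGES:
        if path.endswith("/" + name):
            path = path[: -len(name)]  # keep the trailing slash
            break
    if not path:
        path = "/"
    return host, path, parts.query


def group_duplicates(urls):
    """Return lists of URLs that share the same canonical key."""
    groups = defaultdict(list)
    for url in urls:
        groups[canonical_key(url)].append(url)
    return [group for group in groups.values() if len(group) > 1]


# Hypothetical URLs, for illustration only.
urls = [
    "http://www.site.com",
    "http://site.com/",
    "http://www.site.com/index.php",
    "http://www.site.com/page.php?id=567",
]
for group in group_duplicates(urls):
    print("Same page under several URLs:", group)
```

Duplicates created by navigation, such as /page.php?id=567 and /category/page.php?id=567, cannot be detected this way because the paths really differ; those need a comparison of titles or page content.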
Why is it important to get rid of duplicate content issues?
Google has mostly figured out how to sort this out: it will drop one version, then index and rank another. Still, internal content duplication may cause a few issues:
- decreased crawl rate as Googlebot is kept busy crawling unnecessary identical pages;
- the wrong version of a page being ranked, which results in a bad user experience (e.g. page 2 is ranked instead of page 1);
- delayed ranking of newly launched sites.
What can help you to find internal duplicate content issues?
There are only a few free tools that can be of much help in identifying duplicate content on your site:
1. Duplicate content tool checks the following:
- www and non-www header response;
- Google cache check;
- Similarity check;
- Default page check;
- 404 header response;
- PageRank dispersion check (i.e. if www and non-www versions have different PR).

2. Xenu scans all the links on your site and returns a table of all available URLs; all you have to do is sort the list by title and look for pages with identical titles.

3. Google Webmaster Tools reports duplicate titles and meta descriptions on your site.
More information on that: 7 ways to Tame Duplicate Content by Dr. Pete.
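
The "sort by title and find identical titles" step described for Xenu can also be done with a few lines of Python once you have a list of URLs and their titles. This is only a sketch: the input format (a list of `(url, title)` pairs) and the function name `duplicate_titles` are assumptions, not something Xenu provides in this form.

```python
from collections import defaultdict


def duplicate_titles(pages):
    """Map each repeated title to the list of URLs that use it.

    `pages` is an iterable of (url, title) pairs.
    """
    by_title = defaultdict(list)
    for url, title in pages:
        # Ignore differences in case and whitespace when comparing titles.
        normalized = " ".join(title.split()).lower()
        by_title[normalized].append(url)
    return {title: urls for title, urls in by_title.items() if len(urls) > 1}


# Hypothetical crawl data, for illustration only.
pages = [
    ("http://www.site.com/products", "Our Products"),
    ("http://www.site.com/products?page=2", "Our Products"),
    ("http://www.site.com/contact", "Contact Us"),
]
for title, urls in duplicate_titles(pages).items():
    print(title, "->", urls)
```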

Comments
17 responses so far ↓
Yannis on Jul 2, 2008 at 11:02 am
Thanks for the useful tools. Could you explain, or have you already explained, more about “canonical issues (www and non-www version)” and how to fix them?
Ken Savage on Jul 2, 2008 at 11:50 am
I always thought that the Google webmaster console and Xenu were the 2 tools that were invaluable in finding duplicate content. Thx Anna for pointing out the others.
Ken on Jul 2, 2008 at 12:09 pm
Great article Amy. Thanks for the helpful tools in finding dup content!
Web Agency Chieti on Jul 2, 2008 at 12:27 pm
About similarity check and copyright there is this good search engine for duplicate content at http://www.copyscape.com/
Realtor on Jul 2, 2008 at 5:33 pm
That’s a good set of free tools for SEO. I will try out this tool.
Software Testing on Jul 3, 2008 at 12:59 am
@ Ann, Thank you!
@ Web Agency Chieti, Copyscape is good to find duplicate content outside your website. To check within your website, please use any one of the above tools.
Web Agency Chieti on Jul 3, 2008 at 1:13 am
@Software Testing
You are right. By the way, does anyone have an installation guide on how to easily install Xenu on Windows or possibly a Mac?
Jim on Jul 3, 2008 at 8:10 am
Ann - thanks again for pointing out the best online tools (and free to boot!) to help make our jobs a little easier.
I always look forward to your posts b/c I know there will be another tool to go play with!
Gidseo on Jul 3, 2008 at 11:50 am
Ann
Your knack of delivering great information at the right time is uncanny.
Thank you.
vignarajan on Jul 6, 2008 at 12:11 pm
Thank you very much Ann for introducing duplicate content tools to the world. It is very supportive for me.
The first duplicate content tool (virante.com) did not work for subdomain sites and blogs. Are there any duplicate content tools for subdomain sites and blogs?
Phil Condo on Aug 1, 2008 at 12:52 am
Very useful tool, thanks for the link.
Dilipprasad on Aug 5, 2008 at 12:17 am
Thank you very much Ann. It is a very useful article.
Bernard Savonet on Aug 14, 2008 at 4:25 pm
@Web Agency Chieti
Xenu on Windows really works without any manual! There are some options… but you will easily find them as you go.
As for install… basically there is none! Just copy the exe where you want and it is ready.
Web Agency Chieti on Aug 15, 2008 at 2:25 am
@Bernard
Maybe I didn’t find a binary for Windows. I honestly don’t remember what the matter was.
From what I remember, I read that a Perl installation or something like that was necessary.
Too expensive a setup.
But if you tell me there’s a Windows version, I certainly looked in the wrong place. Could you please suggest the right link?
Thanks.
Bernard Savonet on Aug 20, 2008 at 4:51 am
@web agency chieti
Xenu is at http://home.snafu.de/tilman/xenulink.html
Download is for Windows ONLY! see
http://home.snafu.de/tilman/xenulink.html#Download
No Perl. No linux. No Apache.
Cost: 2 minutes (install time)
The install manual is a two-liner. There is also a video that shows how to use it.
Web Agency Chieti on Aug 20, 2008 at 7:38 am
@Bernard
Geez … I must have been really blind to never see that link !!!! :S
ryan on Oct 5, 2008 at 4:08 pm
were is my awebsite i cant find my own website i made it an google