About URL Tracking Parameters and Duplicate Content Issues

December 7, 2007

Carsten Cumbrowski

How do you deal with links that have tracking parameters in the URL for referrer measurement for analytical reasons or in case of a partner program where the referring site gets some form of compensation for referred traffic and/or customers?

The webmaster's fear concerned the duplicate URLs that are generated for the same page and the possible negative consequences in Google or other search engines.

This question was asked by an attendee during the panel about SEO design and organic site structure on Wednesday at WebmasterWorld's “PubCon” 2007 in Las Vegas.

I was not 100% satisfied by the answers to this question by the panelists Mark Jackson, Paul Bruemer, Lyndsay Walker, Alan K’necht and moderator Todd Friesen from Range Online Media and approached the attendee after the session to provide some tips and options for their particular problem.

Paid search tracking links don’t create this problem in most cases, because those links are usually JavaScript or nofollowed and ignored by search engines. Partner links on the other hand are usually normal HTML links that are being recognized and followed by search engines.

Duplicate Content Filter; NOT Penalty
First I would like to address the fear of possible penalties because of the duplicate content that is being created as a result of those tracking links. Google and other search engines use duplicate content filters, not penalties, for duplicate pages within a single website. This means that the search engine will pick one URL for the page and suppress all others. In this example, I am pretty sure that search engines will pick the URL without the URL parameters (for a number of reasons that I will not elaborate on here, because they are not relevant to this particular issue). This is different from the duplicate content issue across multiple domains, caused by canonical URLs, redistribution of content or scraping.

Just to make sure, you might want to check manually whether any search engine returns the wrong version of the URL. A phrase search for the entire page title works best IMO. A search for the URL with the tracking code might result in a match because, as stated before, the duplicate content filter is not a penalty that blocks content from being indexed. For this reason, a search for the URL does not tell you which URL the search engine chooses in normal search results.

Search engines might reduce the frequency or number of pages they crawl if a site has excessive duplication issues, but the issues that create those kinds of problems are different from the one discussed here.

What Are the Real Issues?
If you don’t do anything, it will not be the end of the world, but two problems result from it.

The first problem is that “link juice” that should flow into one particular URL is wasted on a different one (the one with the tracking code).

The second and probably more severe problem is the possibility that a URL with the tracking code is returned for some queries on the search engines and clicked by users. This skews your referrer tracking statistics, because you record some visitors from the search engines as referrals from a specific partner site.

You have three general options available. Actually four options, but let me start with the first three and elaborate on option 4 at the end of this post.

Option 1 – Block URLs
Block URLs with tracking code from search engine spiders. You can do this via the robots.txt file, but you have to be very careful that you do not exclude the URL without the tracking code as well. You can also exclude the URL programmatically, if the landing page is a dynamic script. You could add code to the page that sets the META tag for ROBOTS in the HEAD section of the rendered HTML to NOINDEX if the page is called with a tracking parameter in the URL, and to INDEX if the call is made without the tracking parameter. Both methods will ensure that only one URL for the page is being indexed, but you lose the SEO benefit of the inbound links, because the PageRank or “Link Juice” flowing to the excluded URLs is not counted and does not help your page to rank better in the SERPs.
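The programmatic variant can be sketched in a few lines. This is only an illustration in Python, not code from my site: the parameter name `partner` and the function name `robots_meta_tag` are assumptions made for the example, so substitute whatever your partner links actually use.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical tracking parameter name, assumed for this example only.
TRACKING_PARAMS = {"partner"}


def robots_meta_tag(url: str) -> str:
    """Return the ROBOTS meta tag to render for the page requested as `url`."""
    # parse_qs turns "a=1&b=2" into {"a": ["1"], "b": ["2"]}.
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    has_tracking = any(name in TRACKING_PARAMS for name in query)
    # NOINDEX for the tracking variant, INDEX for the clean URL.
    content = "noindex" if has_tracking else "index"
    return f'<meta name="robots" content="{content}">'
```

The landing page script would call this once per request and write the returned tag into the HEAD section.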

Option 2 – Redirect URLs
You redirect all requests for URLs with the tracking parameter to the URL without the tracking parameter via a 301 redirect. This can be accomplished via a mod_rewrite rule in the .htaccess file of your site if you use Apache as web server, via a redirect rule specified in your rewrite software for Microsoft IIS (this does not come with IIS, it is separate software) or via code in the dynamic script of the landing page. This would allow you to keep the SEO benefit of the link, but might prevent the tracking of the referrer with your current tracking solution. More on that in a second.
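For the script-based variant, the core of the work is computing the clean URL to redirect to. Here is a minimal sketch in Python under the assumption that the tracking parameter is called `partner`; both that name and the function name `redirect_target` are hypothetical and only serve the example.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical tracking parameter name, assumed for this example only.
TRACKING_PARAMS = {"partner"}


def redirect_target(url: str):
    """Return the URL to 301-redirect to, or None if no redirect is needed."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    # Keep every parameter except the tracking ones.
    kept = [(k, v) for k, v in pairs if k not in TRACKING_PARAMS]
    if len(kept) == len(pairs):
        return None  # no tracking parameter present, serve the page normally
    return urlunsplit(parts._replace(query=urlencode(kept)))
```

If the function returns a URL, the script answers with status 301 and that URL in the Location header. Note that the remaining parameters are re-encoded by `urlencode`, so their escaping may be normalized along the way.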

Option 3 – Do Nothing
Leave everything as it is today and do nothing. As said before, it would not be the end of the world and only create the problems described in the paragraph about the “real issues”.

Notes to Option 1 and 2
If you decide on option 1 to eliminate the problem of having URLs with tracking code indexed by the search engines and accept the loss of the value those inbound links provide, check out the free resources on robots.txt, META tags, .htaccess and 301 redirects available on my site at Cumbrowski.com.

I also have source code in classic ASP available for the detection and removal of URL parameters, followed by a 301 redirect to the URL without the parameter. The code examples can be translated easily into other script languages, such as PHP, Python, Perl or .NET.

While the code option is the best, because it allows tracking of referrers while also doing a 301 redirect to benefit from the inbound link, it does not mean that this is the only way for you to go.

Possible Tracking Issue with Redirection
You want to make sure that the tracking still works. This will be a problem if your current tracking is done via a 3rd party provider where you added a piece of JavaScript code to your HTML pages and that’s it. The 301 redirect is a server side redirect, so no HTML is rendered and the tracking code never gets executed for the URL with the tracking parameter. If the tracking solution is custom and supports server side scripting to log the hit of the page with the tracking code prior to the 301 redirect, perfect.

A solution in between those two would be possible if the analytics provider allows the upload of custom tracking data into their system for processing and reporting. In this case your programmer could write a simple logging script that tracks the hits prior to the redirect and, in addition to that, provide a tool to download those hits for the upload into the analytics software.
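To illustrate the "log first, redirect second" idea, here is a rough sketch as a small Python WSGI application. Everything specific in it is an assumption for the example: the parameter name `partner`, the CSV columns and the helper name `make_app`. Your analytics provider will dictate the upload format it actually accepts.

```python
import csv
from datetime import datetime, timezone
from urllib.parse import parse_qsl, urlencode

TRACKING_PARAM = "partner"  # hypothetical parameter name


def make_app(log_file):
    """Build a WSGI app that logs tracked hits to `log_file` as CSV rows."""
    writer = csv.writer(log_file)

    def app(environ, start_response):
        pairs = parse_qsl(environ.get("QUERY_STRING", ""), keep_blank_values=True)
        codes = [v for k, v in pairs if k == TRACKING_PARAM]
        if not codes:
            # Clean URL: serve the landing page as usual (placeholder body).
            start_response("200 OK", [("Content-Type", "text/plain")])
            return [b"landing page"]
        # Log the hit before redirecting; the browser never renders this URL.
        writer.writerow([
            datetime.now(timezone.utc).isoformat(),
            codes[0],
            environ.get("HTTP_REFERER", ""),
        ])
        kept = urlencode([(k, v) for k, v in pairs if k != TRACKING_PARAM])
        # Relative Location for brevity; an absolute URL works as well.
        location = environ.get("PATH_INFO", "/") + ("?" + kept if kept else "")
        start_response("301 Moved Permanently", [("Location", location)])
        return [b""]

    return app
```

The resulting CSV file (timestamp, tracking code, referrer) is the kind of data that could later be downloaded and uploaded into the analytics software.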

Conclusion
Those are your options, and which one is the right one for you does not only depend on the technical abilities of your team and your analytics solution provider. You have to decide if the gain from implementing any of the possible solutions outweighs the cost and effort needed to get it done. If the SEO benefit is only marginal and you don’t expect that any of the affected pages would increase significantly in ranking, the whole ordeal of implementing the server side tracking, URL parsing and redirecting might not be worth it. Blocking or excluding the URL with the tracking code might be enough to ensure that your tracking stats are correct.

Option 4 – The Very Best Solution
The perfect solution would of course be to configure your analytics solution to report referrer traffic based on the referring website URL, which eliminates the need for a special tracking parameter in the URL altogether.

This is not possible if the referring partner links from an SSL-secured page to your site or if the URL is also used in promotional emails sent by the partner on your behalf. If those two things are not a problem, this would be the way to go. It would avoid the duplicate URLs issue and your page would get the SEO benefit of the link automatically.

I assumed for this post that this option had already been considered and found not to be usable for your tracking purposes.
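Referrer-based attribution boils down to mapping the host name in the Referer header to a partner. A minimal Python sketch, assuming a hand-maintained lookup table; the host names, partner names and the function name `partner_from_referrer` are all made up for the example.

```python
from urllib.parse import urlsplit

# Hypothetical partner sites, assumed for this example only.
PARTNER_HOSTS = {
    "partner-a.example": "Partner A",
    "blog.partner-b.example": "Partner B",
}


def partner_from_referrer(referer):
    """Return the partner name for a Referer header value, or None."""
    if not referer:
        # No referrer was sent, e.g. the cases described above
        # (link on an SSL-secured page, link clicked in an email).
        return None
    host = (urlsplit(referer).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]  # treat www.example and example as the same site
    return PARTNER_HOSTS.get(host)
```

A result of None covers both unknown sites and visits that arrive without any referrer, which is exactly the gap that makes this option unusable for some partner programs.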

I hope this answers some of the questions that people might have had regarding this subject and provides answers that allow you to make the right decisions about what to do or not to do in your specific case.

Cheers!

Carsten Cumbrowski
Affiliate Marketer, Internet Marketing Strategy Consultant, Blogger at ReveNews.com and Editor for SearchEngineJournal.com. More free resources for marketers are available at Cumbrowski.com.
