Thin and duplicate content can be fatal for e-commerce sites because Google strongly discourages both. E-commerce offers universal reach: a store is accessible from any location at any time. These advantages have allowed e-commerce sites to grow over the years and have made commercial transactions smoother.
However, there are serious concerns regarding the content used by many e-commerce businesses on their sites.
Many e-commerce sites are hurting their own business prospects. Not every site owner follows Google's guidelines on original content, which the Panda update enforces, and some may not even realize their content falls short of them.
Let’s find out how thin or duplicate content damages the prospects of an eCommerce site.
What is The Concern?
Google's stance is a cautionary note for e-commerce site owners, because thin and duplicate content has a heavy impact on SEO.
Duplicate content is a real setback for sites because it is not considered reliable or valuable for users. As a result, Google may leave your pages out of the SERPs, or in serious cases remove the site from its index, for users' sake.
What is Duplicate Content?
If you have any difficulty in understanding what duplicate content exactly is, this is how Google’s Matt Cutts explains it: “If the vast majority or all of your content is the same content that appears everywhere else, and there’s nothing else to really distinguish it or to add value, that’s something I would try to avoid if you can.”
Duplicate content cannot be regarded as reliable for many reasons. If you have picked up your content from someone else's site, or it belongs to someone else, you are putting your e-commerce business at risk.
Duplicate Content and SEO
You may be missing out on huge SEO benefits if you ignore the disadvantages of duplicate content. Successful e-commerce sites pay attention to their content and take care to share useful information about their products or services.
“Duplicate content issues are rarely a penalty. It is more about Google knowing which page they should rank and which page they should not. Google doesn’t want to show the same content to searchers for the same query; they do like to diversify the results to their searchers”, said Google’s Matt Cutts.
Now you understand why duplicate content hurts the ranking of your e-commerce site in SERPs.
Getting around the search engines is like trying to sneak something through airport security screening. It most likely isn't going to work, and the consequences of getting caught are severe.
Bad content can lead directly to:
- Bad or no search engine rankings
- Low web traffic
- Thumbs-down user experience
Therefore, e-commerce sites need to be extremely careful about publishing thin or duplicate content.
Let's look at what causes duplicate content.
On-Site Duplicate Content
On-site duplicate content occurs when the same content appears on the homepage and internal pages of an e-commerce site. Below are a few patterns of on-site duplicate content that can hurt SEO rankings.
If the same content is reachable at two or more URLs, Google treats the pages as duplicates of each other. For example, both http://yourfoodhouse.com/breakfastfoods.html and http://yourfoodhouse.com/organics/breakfastfoods.html can lead to the same product page. In such cases, canonical URLs help stop Google from reading this as duplicate content, because Google treats the canonical URL as the original source. Set canonical URLs for content, products, and categories in your platform's admin panel, which typically adds a rel="canonical" link to each page. The same fix applies when a tracking code suffixed to the end of the URL creates a new address for the same page.
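As a rough illustration of the canonical URL idea, the sketch below builds the link element that would go in a page's head. The `CANONICAL_FOR` map and the function name are hypothetical, and which of the two example URLs is preferred is an assumption; most e-commerce platforms generate this tag for you.

```python
from html import escape

# Hypothetical map of duplicate URLs to the one preferred (canonical) URL.
# Which URL is preferred is an assumption made for this example.
CANONICAL_FOR = {
    "http://yourfoodhouse.com/organics/breakfastfoods.html":
        "http://yourfoodhouse.com/breakfastfoods.html",
}


def canonical_link_tag(requested_url: str) -> str:
    """Return the <link rel="canonical"> element for the page's <head>."""
    # A URL with no entry in the map is treated as its own canonical.
    target = CANONICAL_FOR.get(requested_url, requested_url)
    return f'<link rel="canonical" href="{escape(target, quote=True)}">'


print(canonical_link_tag("http://yourfoodhouse.com/organics/breakfastfoods.html"))
# prints: <link rel="canonical" href="http://yourfoodhouse.com/breakfastfoods.html">
```

Both URLs then declare the same canonical address, so search engines have a clear signal about which one to rank.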
Duplicate content also results when session IDs get suffixed to the URL and generate new URLs. This is a byproduct of how some systems track visitors: each session gets its own copy of the source URL with the session ID attached. To avoid this, track user sessions another way, such as with cookies, so the session ID never appears in the URL.
If session IDs still get attached to URLs, point those URLs at the canonical URL. You can also stop crawling of session ID URLs with a robots.txt file.
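One way to work out the canonical form of a session ID URL is to drop the session parameter and keep everything else. This is a minimal sketch; the parameter names in `SESSION_PARAMS` are assumptions and should be replaced with whatever your platform actually appends.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed names of session parameters; adjust to what your platform appends.
SESSION_PARAMS = {"sid", "sessionid", "phpsessid"}


def strip_session_id(url: str) -> str:
    """Return the URL without session parameters, keeping other query values."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


print(strip_session_id("http://yourfoodhouse.com/breakfastfoods.html?sid=abc123&page=2"))
# prints: http://yourfoodhouse.com/breakfastfoods.html?page=2
```

The cleaned URL is what a canonical tag or redirect for the session ID version would point to.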
Shopping Cart URLs
Shopping cart pages are a common feature of an e-commerce site. To keep search engines from indexing them, mark shopping cart URLs as noindex with a meta robots tag or an X-Robots-Tag HTTP header. A robots.txt file can block crawling of these URLs, but it does not carry a noindex instruction.
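As a sketch of the header approach, the function below decides which extra response headers a request path should get. The path prefixes are hypothetical, and how the headers are attached to the response depends on your server or framework, which is not shown here.

```python
from typing import Dict

# Hypothetical path prefixes for cart and checkout pages.
NOINDEX_PREFIXES = ("/cart", "/checkout")


def robots_headers(path: str) -> Dict[str, str]:
    """Return extra response headers for the given request path."""
    for prefix in NOINDEX_PREFIXES:
        # Match "/cart" and "/cart/..." but not "/cartoon-books".
        if path == prefix or path.startswith(prefix + "/"):
            # Same effect as <meta name="robots" content="noindex"> in the HTML.
            return {"X-Robots-Tag": "noindex"}
    return {}


print(robots_headers("/cart/view"))
# prints: {'X-Robots-Tag': 'noindex'}
```

The header is useful for responses where editing the HTML is awkward; for ordinary pages the meta robots tag does the same job.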
Internal Search Results
Google's Panda algorithm deals severely with sites that expose too many internal search results pages. Since Google wants to send users to original content pages, fix the problem by marking internal search results pages as noindex.
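After marking pages as noindex, it is worth checking that the tag really is in the HTML that gets served. The sketch below is one possible check using Python's standard HTML parser; the class and function names are made up for this example, and it only looks at the meta robots tag, not at HTTP headers.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collect the directives of every <meta name="robots"> tag in a page."""

    def __init__(self) -> None:
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            # A tag may list several directives, e.g. "noindex, follow".
            self.directives.update(
                part.strip().lower() for part in content.split(",")
            )


def is_noindexed(html: str) -> bool:
    """Return True if the page carries a noindex meta robots directive."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return "noindex" in finder.directives


search_page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
print(is_noindexed(search_page))
# prints: True
```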
Duplicate URL Paths
Duplicate URL paths are generated when a product is listed under two or more categories. This becomes a problem when the CMS cannot prevent the extra paths from being created. For example:
- Category A – Product A – /category-A/product-A/
- Category B – Product A – /category-B/product-A/
To fix this problem, you can do one of the following:
1. Remove the category level and use only the base product page URL, though this makes category-level tracking harder in analytics software.
- Category A – Product A – /product-A/
- Category B – Product A – /product-A/
2. Revise the first option by adding a fixed product directory, which makes the pages easier to track in analytics software.
- Category A – Product A – /product/product-A/
- Category B – Product A – /product/product-A/
3. Canonicalize all product URLs.
- Category A – Product A – /category-A/product-A/ – Canonical = /category-A/product-A/
- Category B – Product A – /category-B/product-A/ – Canonical = /category-A/product-A/
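The third option can be sketched as a small mapping from every category URL of a product to one canonical URL. Treating the first listed category as the primary one is an assumption made for this example, not a rule; choose whichever category suits your catalog.

```python
from typing import Dict, List


def canonical_map(product_slug: str, categories: List[str]) -> Dict[str, str]:
    """Map every category URL of a product to one canonical URL.

    Assumption: the first category in the list is the primary one.
    """
    if not categories:
        raise ValueError("a product needs at least one category")
    canonical = f"/{categories[0]}/{product_slug}/"
    return {f"/{category}/{product_slug}/": canonical for category in categories}


for path, canonical in canonical_map("product-A", ["category-A", "category-B"]).items():
    print(path, "->", canonical)
# prints:
# /category-A/product-A/ -> /category-A/product-A/
# /category-B/product-A/ -> /category-A/product-A/
```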
Product Review Pages
The product review pages on an e-commerce site should be canonicalized or noindexed via an X-Robots-Tag header or a meta robots tag, as in the earlier cases.
WWW or Non-WWW URLs
Using one kind of URL consistently across an e-commerce site helps avoid duplicate content. For example, http://www.bookstore.com and http://bookstore.com can both serve the same pages. Pick one form and use 301 redirects to send the other form to it. Uppercase and lowercase versions of a URL should be handled in the same manner.
To keep category pages from looking dull, write interesting content describing the products featured in the product grid. This also keeps the page from turning into duplicate content, because sometimes product names can be the same across pages.
Homepage content affects the visibility of a site in the search engines. Therefore, your homepage content should be unique and should shape a good user experience. Make sure it is not duplicated and is different from the content of your internal pages.
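A minimal sketch of that redirect logic is shown below. It assumes the www form is the preferred one and that the site serves only lowercase paths; both are assumptions for this example, and in practice the rule usually lives in the web server or platform configuration.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit

PREFERRED_HOST = "www.bookstore.com"  # assumption: the www form is preferred
ALTERNATE_HOSTS = {"bookstore.com"}


def redirect_target(url: str) -> Optional[str]:
    """Return the URL to 301-redirect to, or None if no redirect is needed."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host in ALTERNATE_HOSTS:
        host = PREFERRED_HOST
    # Assumption: the site serves lowercase paths only.
    normalized = urlunsplit(parts._replace(netloc=host, path=parts.path.lower()))
    return None if normalized == url else normalized


print(redirect_target("http://bookstore.com/Fiction/"))
# prints: http://www.bookstore.com/fiction/
print(redirect_target("http://www.bookstore.com/fiction/"))
# prints: None
```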
Off-Site Duplicate Content
When similar content is found on two or more e-commerce sites, it is considered duplicate content. This can happen for many reasons: scraped or pirated content, guest blogs, syndicated content, and so on. Let's examine a few patterns of off-site duplicate content.
Very often, a product description is used exactly as it comes from the manufacturer. As a result, many e-commerce sites end up with near-identical product descriptions. To avoid this, write your own unique descriptions and include multiple product photos. Also add product benefits or consumer reviews to set your pages apart and improve SERP rankings.
When search engines index a testing website, it competes with the live site as a duplicate. Avoid this by adding a noindex meta robots tag to the testing site. You can also keep crawlers out of the testing site with a “Disallow: /” rule in its robots.txt file.
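To get a rough idea of how close your description is to the manufacturer's copy, you can compare the two word by word. The sketch below uses Python's standard difflib; the score is only a rough indicator, not how Google measures duplication, and the sample texts are invented.

```python
from difflib import SequenceMatcher


def description_similarity(own: str, manufacturer: str) -> float:
    """Return a 0.0-1.0 similarity score based on matching word sequences."""
    own_words = own.lower().split()
    manufacturer_words = manufacturer.lower().split()
    return SequenceMatcher(None, own_words, manufacturer_words).ratio()


# Invented sample texts for illustration only.
manufacturer_text = "Crunchy organic granola with honey and almonds"
own_text = "Our small-batch granola pairs toasted almonds with wildflower honey"

score = description_similarity(own_text, manufacturer_text)
print(f"similarity: {score:.2f}")
```

A score near 1.0 suggests the description is still mostly the manufacturer's wording and is worth rewriting.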
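The effect of a “Disallow: /” rule can be checked with Python's standard robots.txt parser. In this sketch the staging hostname is hypothetical, and the rules shown must only ever be deployed on the testing site, never on the live one.

```python
from urllib.robotparser import RobotFileParser

# robots.txt for the testing site only; never deploy this to the live site.
STAGING_ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_crawlable(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt rules allow crawling the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical staging hostname, used only for this check.
print(is_crawlable(STAGING_ROBOTS_TXT, "http://staging.yourfoodhouse.com/breakfastfoods.html"))
# prints: False
```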
Product Feed Specification
For marketing reasons, e-commerce owners share their product feeds with shopping websites. This puts the same product details on two different sites, which is off-site duplicate content. To deal with this, give the shopping site different product copy from what appears on your own site, and keep the better, fuller description on your own pages.
If your original content appears on other sites with no mention of the original source, it may not work in your favor, because Google does not know which copy is the original. If you do syndicate your content on other websites, make sure each copy includes a link back to the original.
If other sites envy the popularity of your business, it is not unusual for them to scrape your content. Although you are not at fault in this case, Google may not be able to tell, because the scraped copies carry no link to the original source.
The other common problem of e-commerce sites is thin content. Shallow or low-quality web pages are considered to have thin content. Google deems such pages unnecessary because they create a bad user experience. These pages have high bounce rates because users leave them within a few seconds. For that reason, Google pushes 'thin sites' down in the search results.
To avoid this, make sure every page of your site carries interesting, well-written content.
Panda’s Concerns Over Duplicate Content
Cutts has mentioned that the Google Panda update is “designed to reduce the rankings of low quality sites”. Such sites either “copy content from other websites” or are lacking in “original content”. Panda takes a number of factors into account when identifying such content. These include:
- Any site or page with duplicate content (off-site and on-site duplicate content)
- Low or meager original content
- Repeated or identical content on every page
Low-quality content can crash a site's ranking. Therefore, Google warns such sites and recommends avoiding thin and duplicate content completely.
Increasing Content Quality
If you are adding content to the web for business promotion, you should know how to manage that content well. On the subject of duplicate content, the Google Webmaster Central Blog has provided a few guidelines for producing high-quality sites and lifting them in search rankings. In the context of Panda, that means using high-quality, original content to get better rankings.
- Use Webmaster Tools, or test manually by searching for your site in Google, to see how Google crawls and indexes your site. If you find any identical content, remove it from your site.
- Use tools like Screaming Frog, which alert you to duplication and also flag wrong redirects.
- Apply 301 redirects. Redirect all duplicate URLs to their canonical URLs and save yourself from any trouble.
- Link to the original source of the content if you have tried all other options and failed. This also helps with link building for the content.
- Report any breach of copyright to Google. To do this, fill out the ownership form and register your complaint.
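The duplicate check mentioned in the list above can be approximated with a few lines of code once you have the text of each page. This sketch only catches pages that match exactly after whitespace and case are normalized; dedicated crawlers such as Screaming Frog do far more, and the sample pages here are invented.

```python
import hashlib
import re
from collections import defaultdict
from typing import Dict, List


def fingerprint(text: str) -> str:
    """Hash the text after lowercasing and collapsing whitespace."""
    normalized = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def find_duplicates(pages: Dict[str, str]) -> List[List[str]]:
    """Return groups of URLs whose body text is identical after normalization."""
    groups = defaultdict(list)
    for url, text in pages.items():
        groups[fingerprint(text)].append(url)
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]


# Invented sample pages: URL path -> extracted body text.
pages = {
    "/category-A/product-A/": "Organic  oats, 500 g",
    "/category-B/product-A/": "organic oats, 500 g",
    "/product-B/": "Rye bread, sliced",
}
print(find_duplicates(pages))
# prints: [['/category-A/product-A/', '/category-B/product-A/']]
```

Each group it reports is a candidate for a 301 redirect or a canonical tag.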
There is a clear warning from Google: thin or duplicate content is bad for any e-commerce business. The same content appearing at different URLs counts as duplicate content, and both thin and duplicate content hurt your e-commerce site's SEO rankings. Now that you know about Google's warnings and the consequences of keeping thin and duplicate content on your site, you should avoid it completely. In its place, adopt sound content strategies for your e-commerce site and aim for good returns.
It is extremely important to identify thin, duplicate, or low-quality content on your site. There is no point in replicating content that search engines do not reward. Instead, focus on creating unique, good-quality content.
Featured Image: Chris Harvey via Shutterstock
Image #1: Created by author for Search Engine Journal