Thin or Duplicate Content: Not a Safe Bet for E-Commerce SEO

Thin and duplicate content can do serious damage to e-commerce sites because Google disapproves of both practices. E-commerce offers universal reach: a store is accessible from any location at any time. These advantages have allowed e-commerce sites to grow over the years and have made commercial transactions smooth.

However, there are serious concerns regarding the content used by many e-commerce businesses on their sites.

Many e-commerce sites are hurting their own business prospects. Not every site owner follows Google’s guidelines on original content, which the Panda update enforces, and some may not even realize their content violates these policies.

Let’s find out how thin or duplicate content damages the prospects of an e-commerce site.

What Is the Concern?

Google’s stance is a warning to e-commerce site owners: thin and duplicate content can hit SEO hard.

Duplicate content is a real setback for sites because it is not considered reliable or valuable for users. For users’ sake, Google therefore bans such content, either by not displaying your site on SERPs or by removing it completely.

What is Duplicate Content?

If you have any difficulty in understanding what duplicate content exactly is, this is how Google’s Matt Cutts explains it: “If the vast majority or all of your content is the same content that appears everywhere else, and there’s nothing else to really distinguish it or to add value, that’s something I would try to avoid if you can.”

Duplicate content cannot be regarded as reliable for many reasons. If you have picked up your content from someone else’s site, or it belongs to someone else, you are likely to undermine your e-commerce business.

Duplicate Content and SEO

You may be missing out on significant SEO benefits if you ignore the disadvantages of duplicate content. E-commerce sites should pay attention to their content and take care to share useful information about their products or services.

“Duplicate content issues are rarely a penalty. It is more about Google knowing which page they should rank and which page they should not. Google doesn’t want to show the same content to searchers for the same query; they do like to diversify the results to their searchers”, said Google’s Matt Cutts.

Now, you understand why duplicate content is not very good for ranking your e-commerce site in SERPs.

Getting around the search engines is like trying to sneak something through airport security screening. It most likely isn’t going to work, and the consequences when you get caught are high.

Direct impact of bad content can lead to:

  1. Bad or no search engine rankings
  2. Low web traffic
  3. Thumbs-down user experience

Therefore, e-commerce sites need to be extremely careful not to publish thin or duplicate content.

Let’s further understand how duplicate content is caused.

On-Site Duplicate Content

On-site duplicate content occurs when the same content appears on more than one page of an e-commerce site, such as the homepage and internal pages. Below are a few patterns of on-site duplicate content that can affect SEO rankings.

Non-Canonical URLs

If the same webpage content is reachable at two or more URLs, Google treats those URLs as duplicate content. For example, both http://yourfoodhouse.com/breakfastfoods.html and http://yourfoodhouse.com/organics/breakfastfoods.html can lead to the same product page. In such cases, a canonical URL helps stop Google from reading the pages as duplicates, because Google recognizes the canonical URL as the original source of the content. Set canonical URLs for content, products, and categories in the admin panel; in the page itself, the canonical URL is declared with a rel="canonical" link element in the head. The same fix applies when the duplicate URL is created by a tracking code suffixed to the end of the URL.
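As a rough way to check which canonical URL a page declares, the sketch below parses a page’s HTML with Python’s standard library and returns the href of its rel="canonical" link element. The markup and the helper names (CanonicalFinder, find_canonical) are made up for illustration and are not taken from any particular platform.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Remembers the href of the first <link rel="canonical"> element."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        # rel may hold several space-separated values; compare case-insensitively.
        rel_values = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attr.get("href")


def find_canonical(html):
    """Return the declared canonical URL, or None if the page has none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Hypothetical markup for the /organics/ copy of the breakfast foods page.
page = """
<html><head>
<title>Breakfast Foods</title>
<link rel="canonical" href="http://yourfoodhouse.com/breakfastfoods.html">
</head><body></body></html>
"""
print(find_canonical(page))
```

Running the same check against every URL that serves the page should return one and the same canonical address.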

Session IDs

Duplicate content also results when session IDs are appended to URLs, generating a new URL for every session. This is a byproduct of how the system tracks visitors: each URL carrying a session ID is a duplicate of the source URL it was attached to. To fix this, stop attaching session IDs to URLs and track user sessions another way, for example with cookies.

If session IDs still get attached to URLs, point each session ID URL at the canonical URL with a canonical tag. You can also stop search engines from crawling session ID URLs with the help of a robots.txt file.
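To work out which canonical URL a session ID URL should point to, the session parameter can be stripped from the query string. This is a minimal sketch: the parameter names in SESSION_PARAMS are assumptions, since every platform names its session parameter differently.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed session parameter names; check what your own platform appends.
SESSION_PARAMS = {"sid", "sessionid", "phpsessid"}


def strip_session_id(url):
    """Return the URL with any session ID query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )


print(strip_session_id("http://yourfoodhouse.com/breakfastfoods.html?sid=abc123&page=2"))
```

Other query parameters, such as page in the example, are kept because they may change what the page shows.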

Shopping Cart URLs

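Shopping cart pages are a common feature on an e-commerce site. To keep search engines away from these pages, either block the shopping cart URLs from crawling in the robots.txt file or mark them noindex with the X-Robots-Tag HTTP header or the meta robots tag. Choose one approach per URL: a crawler that is blocked by robots.txt never fetches the page, so it never sees a noindex directive on it.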

Internal Search Results

Google’s Panda Algorithm deals severely with the problem of too many internal search results pages on an e-commerce site. Since Google wants to send users to original content pages, you need to fix the problem by setting internal search results as noindex pages.
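The noindex advice for shopping cart and internal search pages could be applied at the server level roughly as follows. This is only a sketch: the path prefixes are assumptions about how a shop might lay out its URLs, and robots_headers is a hypothetical helper, not part of any web framework.

```python
# Assumed path prefixes for pages that should stay out of the index.
NOINDEX_PREFIXES = ("/cart", "/checkout", "/search")


def needs_noindex(path):
    """True if the path equals one of the prefixes or sits beneath one."""
    return any(
        path == prefix or path.startswith(prefix + "/")
        for prefix in NOINDEX_PREFIXES
    )


def robots_headers(path):
    """Extra HTTP response headers to send for the given URL path."""
    if needs_noindex(path):
        return {"X-Robots-Tag": "noindex"}
    return {}


print(robots_headers("/search/red-shoes"))
print(robots_headers("/product/product-A/"))
```

Matching on the whole path segment keeps an unrelated URL such as /cartoons/ from being caught by the /cart prefix.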

Duplicate URL Paths

Duplicate URL paths are generated when a product is listed under two or more categories and the CMS builds a separate URL for each category. For example:

  • Category A – Product A – /category-A/product-A/
  • Category B – Product A – /category-B/product-A/

To fix this problem, you have three options:

1. Remove the category level and use only the base product page URL, though this limits category-level tracking in analytics software.

  • Category A – Product A – /product-A/
  • Category B – Product A – /product-A/

2. Revise the first option as follows, adding a common /product/ directory that analytics software can track.

  • Category A – Product A – /product/product-A/
  • Category B – Product A – /product/product-A/

3. Canonicalize all product URLs.

  • Category A – Product A – /category-A/product-A/ – Canonical = /category-A/product-A/
  • Category B – Product A – /category-B/product-A/ – Canonical = /category-A/product-A/
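The third option can be sketched as a lookup from each product to one primary category, so that every category path for that product reports the same canonical URL. The PRIMARY_CATEGORY table is a hypothetical stand-in for whatever the CMS stores.

```python
# Hypothetical catalogue data: product slug -> primary category slug.
PRIMARY_CATEGORY = {"product-A": "category-A"}


def canonical_path(path):
    """Map a /<category>/<product>/ path to the product's canonical path."""
    segments = [segment for segment in path.split("/") if segment]
    if len(segments) != 2:
        return path  # not a category/product URL, leave it unchanged
    _category, product = segments
    primary = PRIMARY_CATEGORY.get(product)
    if primary is None:
        return path  # unknown product, leave it unchanged
    return f"/{primary}/{product}/"


print(canonical_path("/category-B/product-A/"))
```

Both /category-A/product-A/ and /category-B/product-A/ resolve to /category-A/product-A/, matching the canonical values in the list above.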

Product Review Pages

Product review pages on an e-commerce site should be canonicalized or set to noindex via the X-Robots-Tag header or meta robots tag, as in the earlier cases.

WWW or Non-WWW URLs

Serving the site from a single form of URL helps avoid duplicate content. For example, http://www.bookstore.com and http://bookstore.com can both serve the same pages. Pick one version and 301 redirect the other to it. Uppercase and lowercase variants of the same URL should be handled in the same manner.
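A minimal sketch of the redirect decision follows. It assumes the www host is the preferred one and that the site’s paths are safe to lowercase; both are assumptions to adjust for a real store, and redirect_target is a hypothetical helper name.

```python
from urllib.parse import urlsplit, urlunsplit

PREFERRED_HOST = "www.bookstore.com"  # assumed preferred version
BARE_HOST = "bookstore.com"


def redirect_target(url):
    """Return the URL to 301 redirect to, or None if no redirect is needed."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host == BARE_HOST:
        host = PREFERRED_HOST
    # Lowercasing the path assumes the site treats paths case-insensitively.
    path = parts.path.lower() or "/"
    target = urlunsplit((parts.scheme, host, path, parts.query, parts.fragment))
    return None if target == url else target


print(redirect_target("http://bookstore.com/Fiction/"))
print(redirect_target("http://www.bookstore.com/fiction/"))
```

A server would answer with status 301 and a Location header whenever the function returns a URL, and serve the page normally when it returns None.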

Category Pages

To keep category pages from looking dull, write engaging content that describes the products featured in the product grid. This also keeps the page from turning into duplicate content, since product names alone can be the same across pages.

Homepage Content

Homepage content affects a site’s visibility in the search engines. Therefore, it should be unique and should shape the user experience. Make sure your homepage content is not duplicated and differs from the content of your internal pages.

Off-Site Duplicate Content

When the same or very similar content is found on two or more e-commerce sites, it is considered duplicate content. This can happen for many reasons: scraped or pirated content, guest blogs, syndicated content, etc. Let’s examine a few patterns of off-site duplicate content.

Product Narration

Very often, a product description is published exactly as it comes from the manufacturer, so many e-commerce sites end up with identical descriptions. To avoid this, write your own unique description and include multiple product photos. Also add product benefits or consumer reviews to set your page apart and improve SERP rankings.
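One rough way to see how far a rewritten description has moved from the manufacturer’s text is to compare the two word by word. The sketch below uses difflib from Python’s standard library; the sample descriptions are invented, and the score is only an indication, not a measure Google is known to use.

```python
from difflib import SequenceMatcher


def similarity(text_a, text_b):
    """Word-level similarity between two descriptions, from 0.0 to 1.0."""
    words_a = text_a.lower().split()
    words_b = text_b.lower().split()
    return SequenceMatcher(None, words_a, words_b).ratio()


# Invented sample descriptions.
manufacturer = "Crunchy organic oat granola"
copied = "Crunchy  Organic oat granola"
rewritten = "Start mornings with toasted clusters"

print(similarity(manufacturer, copied))     # identical apart from case and spacing
print(similarity(manufacturer, rewritten))  # no words in common
```

A score close to 1.0 suggests the description is still essentially the manufacturer’s copy.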

Testing Sites

When search engines index a testing website, it becomes a duplicate of the live site. Avoid this by adding a noindex meta robots tag to the testing site’s pages. You can also keep crawlers out of the testing site by setting a “Disallow: /” rule in its robots.txt file.
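Whether a robots.txt file really blocks the testing site can be checked with the robots.txt parser in Python’s standard library. In this sketch the staging hostname is a made-up example and crawl_allowed is a hypothetical helper.

```python
from urllib.robotparser import RobotFileParser

# robots.txt for the testing site: block every crawler from every path.
STAGING_ROBOTS = """\
User-agent: *
Disallow: /
"""

# robots.txt for the live site: an empty Disallow allows everything.
LIVE_ROBOTS = """\
User-agent: *
Disallow:
"""


def crawl_allowed(robots_txt, url, agent="Googlebot"):
    """True if the given robots.txt text lets the agent fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# staging.example.com is a made-up hostname.
print(crawl_allowed(STAGING_ROBOTS, "http://staging.example.com/product-A/"))
print(crawl_allowed(LIVE_ROBOTS, "http://www.example.com/product-A/"))
```

Note that a Disallow rule only stops crawling. To keep already discovered testing URLs out of the index, a noindex directive on pages that crawlers can fetch is the usual tool.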

Product Feed Specification

For marketing reasons, e-commerce owners share their product feed specifications with shopping websites. This results in the same product details appearing on two different sites, which is off-site duplicate content. To deal with this, give the shopping site a product feed that differs from the descriptions on your own site, and keep the better description on your own site.

Content Syndication

If your original content appears on other sites with no mention of the original source, it may not work in your favor, because Google cannot tell which copy is the original. Therefore, make sure every syndicated copy includes a link to the original source. It is best to publish the content on your own site first and only then syndicate it to other websites.

Content Scraping

If other sites envy the popularity of your business, it is not unusual for them to scrape your content. Although you are innocent in this case, Google is likely to miss that because the scraped copies carry no links to the original source.

Thin Content

The other common problem on e-commerce sites is thin content: shallow or low-quality pages. Google deems such pages unnecessary and inappropriate because they create a bad user experience. They have high bounce rates because users leave them within a few seconds. For that reason, Google comes down hard on ‘thin sites’ in its search results.

To avoid this, use interesting, well-written content on every page of your site.

Panda’s Concerns Over Duplicate Content

Cutts has mentioned that the Google Panda update is “designed to reduce the rankings of low quality sites”. Such sites either “copy content from other websites” or are lacking in “original content”. Panda takes a number of factors into account when identifying such content, including:

  • Any site or page with duplicate content (off-site and on-site duplicate content)
  • Low or meager original content
  • Reiteration or identical content on every page

Low quality content can crash a site’s ranking. Therefore, Google has a warning for such sites and recommends avoiding thin and duplicate content completely.

Increasing Content Quality

If you are adding content to the web for business promotion, you should know how to manage it well. On the subject of duplicate content, the Google Webmaster Central Blog has provided a few guidelines for building high-quality sites and lifting them in search. In Panda terms, that means using high-quality, original content to get better rankings.

  1. Use Webmaster Tools, or test manually by searching for the name of your site in Google. This shows you how Google has crawled your site. If you find any identical content, remove it from your site completely.
  2. Use a tool such as Screaming Frog, which alerts you to duplication. It also flags wrong redirects.
  3. Apply 301 redirects. Redirect all duplicate content to its canonical URLs and save yourself trouble.
  4. Link to the original source of the content if you have tried all other options and failed. This also helps with link building.
  5. Report any breach of copyright to Google by filling in the ownership form and registering your complaint.
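The duplicate check in the first two steps can be approximated with a short script that fingerprints the text of each page and groups URLs whose text is identical. This is an illustrative sketch, not how Webmaster Tools or Screaming Frog work internally; the pages dictionary stands in for text you have already extracted from your own site.

```python
import hashlib
from collections import defaultdict


def fingerprint(text):
    """Hash the text after normalizing case and whitespace."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def group_duplicates(pages):
    """Return lists of URLs whose page text is identical after normalizing."""
    groups = defaultdict(list)
    for url, text in pages.items():
        groups[fingerprint(text)].append(url)
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]


# Invented sample data: URL path -> extracted page text.
pages = {
    "/category-A/product-A/": "Crunchy organic oat granola, 500 g.",
    "/category-B/product-A/": "Crunchy organic oat granola,  500 g.",
    "/product/product-B/": "Dark roast coffee beans, 1 kg.",
}
print(group_duplicates(pages))
```

Each group it reports is a candidate for a 301 redirect or a canonical URL, as described in step 3. Exact hashing only catches identical text, so near-duplicates need a fuzzier comparison.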

Summing Up

There is a clear warning from Google: thin or duplicate content is bad for any e-commerce business. Similar content on different URLs is considered duplicate. Both thin and duplicate content hurt your e-commerce site’s SEO rankings. Now that you know about Google’s cautions and the consequences of keeping thin and duplicate content on your site, try to avoid it completely. In its place, adopt sound content strategies for your e-commerce site and aim for good results.

It is extremely important to identify thin, duplicate, or low-quality content on your site. There is no point in replicating content that search engines are not very supportive of. Instead, focus on creating good quality and unique content.

 

Image Credits

Featured Image: Chris Harvey via Shutterstock
Image #1: Created by author for Search Engine Journal

Alan Smith

Tech Blogger & Web Designer at SPINX Inc.
Alan Smith is an avid tech blogger with vast experience in various IT domains, currently associated with SPINX Inc., a Los Angeles, California-based website design, web development, and internet marketing company.
  • Sandy @ Workado

    Good article on how to avoid falling off or to the bottom of the SERP. Thanks for explaining the canonical URLs.

    Sandy @ WorkadoApp

    • http://www.spinxwebdesign.com Alan Smith

      Thanks, Sandy. Glad you liked the write up and the stuff about Canonical URLs. Thin or duplicate content can be a major threat to SEO results. It’s best to avoid using it on our site, always.

  • Thomas

    Hey Alan. Liked your article, especially the way you have explained what causes duplicate content and how it works to spoil our efforts to get our site up on the SERPs. It has actually helped me calm down. You have cleared my fears about it. There’s a lot I had heard about it from different sources. But this one was with absolute clarity. You did a great job!

    • http://www.spinxwebdesign.com Alan Smith

      Thanks for the comment. I agree with you. There are people who freak out about thin or duplicate content. But, if you can get a little careful about your content, you’d never go wrong. Honestly, duplicate content has become a hot topic of discussion in recent times. So, one must learn about solutions to this problem.

  • Mihai Aperghis

    I think the article is a bit misleading and doesn’t offer accurate information. Google won’t “crash” your rankings or “ban” your content because it simply appears on other websites.

    John Mueller has repeatedly communicated in the Webmaster Office Hours hangouts that they don’t penalize websites or their pages for duplicated content, saying that the penalty issues are a “myth” and you shouldn’t really worry about it (barring scenarios where you do this maliciously or to an extreme extent, like copying a full website).

    You can check out the full video where he explains how Google detects and deals with duplicated content here: http://youtu.be/XzzY2T2Ph8w?t=8m14s

    Basically, when dealing with off-site duplicate content, such as product descriptions for e-commerce websites, they would simply take other factors into consideration (such as location) when choosing which of the websites that feature the same content to display for a certain query.

    Using the same product descriptions as 5 other websites isn’t ideal, sure, but it isn’t necessarily a setback. Writing unique descriptions can be helpful, but only as long as the information presented in them is still useful to the user. Otherwise, the other 5 websites will outperform you simply because they feature content that the user is searching for (i.e., writing a unique description for an HDTV without specifying that it supports a 4K resolution won’t improve your rankings for queries that feature terms related to that feature).

    Also, adding extra content to the duplicated descriptions, such as reviews or other unique content will also not necessarily help you rank better as long as the user is searching for content that is found inside the descriptions, in which case Google will use its same decision process to decide which websites to show in the results.

    And regarding on-site duplicated content, such as accessing the same product from multiple URLs, Google is fairly able to detect which pages to take into consideration and which ones to ignore, and will not affect your rankings (at least not directly). Using canonicals and redirects is useful, sure, but mainly for consolidating page quality signals and to avoid wasting Googlebot’s bandwidth.

    TL;DR: Avoiding or resolving duplicated content can be helpful, but will not otherwise negatively impact your website (unless it’s done to an extreme extent). Source: http://youtu.be/XzzY2T2Ph8w?t=8m14s

    • http://www.spinxwebdesign.com Alan Smith

      Hi Mihai,

      Thanks for sharing your insights in such a detailed way. Matt Cutts has said that duplicate content affects SEO space if it is spammy and keyword stuffing. You could be right, but, people often want to know whether duplicate content will hurt their SEO rankings. It is decided by Google which page to rank up or down. Let me quote directly from Google Webmaster guidelines:

      Don’t create multiple pages, subdomains, or domains with substantially duplicate content.

      Avoid… “cookie cutter” approaches such as affiliate programs with little or no original content.

      If your site participates in an affiliate program, make sure that your site adds value. Provide unique and relevant content that gives users a reason to visit your site first.

      Another interesting thing that I found on the same page was that: Search engines strive for a certain level of variety; they want to show you ten different results on a search results page, not ten different URLs that all have the same content. To this end, Google tries to filter out duplicate documents so that users experience less redundancy.

      Now, this suggests that Google is able to detect ‘all the duplicates of a particular page’, which we may possibly not. I suppose based on this Google decides to rank content in the search engine result pages.

      Thanks,
      Alan

      • http://www.vertify.ro Mihai Aperghis

        Hey Alan,

        Your quote saying that “Matt Cutts has said that duplicate content affects SEO space if it is spammy and keyword stuffing” is completely true. I would go further to emphasize the “spammy and keyword stuffing” part, since this is not the type of duplicate content that e-commerce sites usually have (which is what this article focuses on).

        The duplicate content types typically found on online shops are the ones mentioned in your post, such as multiple URLs for the same product (on-site) and similar or identical product descriptions for products that are featured on other shops as well (off-site). These are not the types of duplicate content that will attract Google penalties.

        As I mentioned, I do agree that creating original unique content can definitely help, but only as long as it still provides useful information that users are searching for, and doesn’t guarantee higher rankings.

  • http://seo-kerala.com Aju

    Duplicate URL path has been explained really well. I tried to explain the same issue to many of my clients, but this is really simple. These tips are really helpful, especially home page content, canonical and reviews related tips.

    • http://www.spinxwebdesign.com Alan Smith

      Hi Aju,

      Thanks for liking my article. It’s quite simple: if you place a product under two or more categories, duplicate URLs are generated. This internal site duplication can be the worst kind, as none of the duplicate URLs will get full value.

  • Nate Somsen

    Thanks for your in-depth article, Alan! As I read through it, the one thought that kept coming to my mind is “Write content for the users”. It seems so clear that there is no point to creating a page without having original content for visitors; they don’t want to see the same generic junk.

    I liked the mention of noindexing shopping carts, for whatever reason, that idea didn’t occur to me. I also wanted to mention where you discussed the canonicalization of www vs non-www. I think you could get bonus points with Google if you set the preferred domain via Google Webmaster Tools. Once again, thanks for your article, I enjoyed reading and learning from it.

    • http://www.spinxwebdesign.com Alan Smith

      Hi Nate,

      Glad that you liked the article. If your content is original and is written with users in mind, it can be great and earn a lot of admiration. You are right that the ‘same generic junk’ you mention is just not worthy. I have often noticed how much people like to read and share original and interesting content, be it on social channels or elsewhere.

  • http://www.alseoblog.com/ Al Gomez

    Nice points there, Alan. But I would slightly disagree with thin content. Not ALL thin content is bad. A few words describing what your product is really about, a helpful video or even an original image could add character and value to a page. However, when it comes to product descriptions and company services, I would suggest more detail because the user likes to focus on these areas. It really depends on the niche, the market demographics you’re targeting, and your overall goal with SEO.

    • http://www.spinxwebdesign.com Alan Smith

      Hi Gomez,

      Thanks for sharing your views. Your points are valid. It indeed depends on the niche, and the market demographics, besides SEO goals. If you can describe your product by using interesting description, users will find it attractive. Adding a product photo further enhances the product value. Thin content is of little value as Google disapproves of it. Google search results ensure that content with substantial value, relevant keywords and rich information appears up in the rankings.