WTF exactly is thin content?
New clients who suffer from little engagement or few conversions often ask me this after my agency conducts a content audit.
The short answer?
The content has no value.
It lacks quality and resolves nothing for the reader. Most of it isn’t remotely relevant to the user’s intent, or what initially brought them to the particular page.
With thousands of $10-a-page copywriters available and content strategies that lack a long-term focus, thin content is one of the largest epidemics in modern digital marketing.
Publishing thin content on your website can quickly damage your brand’s image.
It can also destroy any possibility of engagement and stop users from taking any profitable action.
For some – especially SMBs with smaller websites – the solution is simple.
For others, such as huge ecommerce sites with hundreds of products and endless category pages, the solution is also simple – but it’s going to take much more time to implement.
This piece will focus on three elements:
- What thin content is, with examples of the types of content considered thin.
- How to analyze and diagnose current content.
- How to fix pages/posts with thin content (including some technical SEO elements), and prep for a future of thin-free content.
WTF Exactly Is Thin Content?
Back in the days of keyword stuffing and cloaking, it was simple for low-value websites to rank high for competitive keyword queries in the search engines.
All it took was scraping content from value-driven websites and using shady link building tactics to get that content ranking higher than the original creators’ pages.
The Panda update had one simple goal: to stop low-quality websites from ranking high in the search results.
The update was built upon penalizing sloppy content practices, including duplicate content and poor quality copywriting that failed to provide a relevant solution to a user’s intended search query.
Thin content provides little to no value, and typically resides on pages with:
- Duplicate content (though some is not duplicate, but perceived duplicate because of a few technical mistakes, such as not properly redirecting HTTP to HTTPS).
- Scraped content from another website (copy/paste, typically with little rewriting or re-arrangement).
- Auto-generated content.
- Low-value affiliate pages.
- Doorway pages.
The last item is a big one, and easy to recognize. Google describes these doorway pages as:
- Having multiple domain names or pages targeted at specific regions or cities that funnel users to one page.
- Pages generated to funnel visitors into the actual usable or relevant portion of your site(s).
- Substantially similar pages that are closer to search results than a clearly defined, browseable hierarchy.
Any of the above instances of thin content can cause your website’s rankings to tank.
This is why it’s always smart to start with an audit of your current content, first by thinking like a reader and then by moving on to some technical elements.
How to Analyze & Diagnose Current Content
Now that you understand what thin content is and which pages it usually lives on, it’s time to take a closer look at how to analyze a website’s current content and diagnose any issues.
Get the Big Picture
Start by taking a human approach:
- Use the site: search operator (for example, site:example.com) for a quick overview of what you’re getting into.
- Observe how many pages are indexed.
- Take a quick glance at what title tags and URL structures are being used.
Remember the site operator results are not in order of importance, and sometimes the SERPs look much different than they would for a search query based on a specific keyword.
Now push the 80/20 principle into play so you can provide the most value to the client up front.
Ask the business owner what the highest ROI pages are, and focus on them.
Using Google Analytics or a third-party tool, check the highest-trafficked pages. You’ll want the focus to be on these for the beginning of any content audit.
Read the Content
Take some time to read the content.
Focus your attention on the quality and relevance of that particular page (not the number of words).
Long content doesn’t necessarily rank better; it’s a matter, again, of quality and relevance.
A sharp 250-word page can say more than a sloppily written 2,500-word article.
The above process will quickly diagnose the major issues – the ones that are basically costing the most money for the client.
Address Any Duplicate Content Issues
Two tools make this task easy: Copyscape and Screaming Frog.
With Copyscape, you can enter the domain and quickly recognize any threats of duplicate content.
Duplicate content is a constant problem for websites with quality blogs, because good posts get scraped.
Things get much worse for news-based organizations that post hundreds of stories a month.
For one of my news-based clients, which sometimes creates up to 50 posts per month, Copyscape has been the go-to tool to immediately recognize others who scrape content.
I check it once a week for any scraping so I can immediately address the issue (typically through a cease and desist email to the site owner).
The other tool is Screaming Frog, which crawls the website and provides data for each URL, from page titles to meta descriptions to canonical elements to redirects (the free version provides 500 URLs, typically enough for SMBs).
Here you can check for duplicate title tags, which send a signal of duplicate content to the search engines.
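The duplicate title check can also be run on exported crawl data. The following is a minimal sketch, assuming you have (URL, title) pairs from a crawl; the function name `find_duplicate_titles` and the sample rows are hypothetical and not part of any Screaming Frog export format.

```python
from collections import defaultdict

def find_duplicate_titles(pages):
    """Group URLs by normalized title and return titles shared by 2+ URLs."""
    by_title = defaultdict(list)
    for url, title in pages:
        # Collapse whitespace and ignore case so near-identical titles match.
        key = " ".join(title.split()).casefold()
        by_title[key].append(url)
    return {title: urls for title, urls in by_title.items() if len(urls) > 1}

# Hypothetical crawl rows: (URL, title tag).
crawl = [
    ("https://www.example.com/red-widgets", "Widgets | Example"),
    ("https://www.example.com/blue-widgets", "widgets |  Example"),
    ("https://www.example.com/about", "About Us | Example"),
]

duplicates = find_duplicate_titles(crawl)
for title, urls in duplicates.items():
    print(f"{title!r} is used on {len(urls)} URLs: {urls}")
```

Normalizing case and whitespace is a design choice here: it catches titles that differ only cosmetically, which are the ones most likely to be copy/paste leftovers.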
Another worthy trick with Screaming Frog is sorting by word count. Focus first on the pages with the fewest words, and compare the word count with the performance of that page.
You can reverse engineer here, and prioritize the fixes from the worst-performing to the best-performing pages.
Typically, the pages with the least text rank the worst, but sometimes you’ll find a proverbial gem that already ranks well. Beefing up that content with a strong keyword strategy can help push the page even higher, and quickly.
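Here is a rough sketch of that word count versus performance comparison. It assumes each page is a dict with `url`, `word_count`, and `sessions` keys; those field names, the sample numbers, and the 300-word cutoff are all assumptions for illustration, not a standard.

```python
def prioritize_thin_pages(pages, max_words=300):
    """Return pages at or under max_words, worst performers first."""
    thin = [p for p in pages if p["word_count"] <= max_words]
    # Fewest sessions first; ties are broken by the shorter page.
    return sorted(thin, key=lambda p: (p["sessions"], p["word_count"]))

# Hypothetical data merged from a crawl export and an analytics report.
pages = [
    {"url": "/a", "word_count": 50, "sessions": 10},
    {"url": "/b", "word_count": 1200, "sessions": 900},
    {"url": "/c", "word_count": 120, "sessions": 640},
    {"url": "/d", "word_count": 80, "sessions": 3},
]

queue = prioritize_thin_pages(pages)
for page in queue:
    print(page["url"], page["word_count"], page["sessions"])
```

In this made-up data, `/c` is the "gem": it sits at the end of the queue because it already performs well despite having very little text.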
As for pages with auto-generated content and doorway pages, do what’s needed to abolish them.
If that means a total rewrite of an auto-generated page or not pointing those other domains at your main website, just do it. It will help in the long term.
Also, if you have weak affiliate pages, make it a point to work with that affiliate to create stronger content, explaining that the work is best for both parties when it comes to boosting revenue.
How to Fix Pages/Posts with Thin Content (Including Some Technical SEO Elements)
Once you’ve diagnosed which pages have thin content, create a prioritized list based on ROI and get to work.
Yes, it may be a huge task, but look at it as an opportunity to further strengthen your website’s overall SEO because you’ll be adhering to an updated strategy.
How much fixing you do is a page-by-page decision, which I base on budget.
If my agency finishes a content audit and recognizes and prioritizes 100 pages with thin content, I’ll work out an engagement based on that client’s budget. Sometimes this will be month-to-month, other times a one-time project.
And I’d argue that 90 percent of the time, you’ll find some major issues where you can provide serious value for the client.
One client had a few pages with outstanding content, but those pages weren’t using keywords properly, used no subheadings, and had zero internal links throughout the text. The other pages had about 50 words of text that was seemingly written by a toddler.
Again, every situation is unique and needs its own strategy, working from the highest-ROI pages down.
Besides actual content creation, here are a few of the top tech issues that can send signals of thin content to search engines, and how to resolve them.
www vs. non-www URLs
There should be only one preferred URL, which seems super basic to SEO pros – but sometimes it quickly gets overlooked. Always have the unwanted version 301 redirect to the preferred canonical version.
HTTP vs. HTTPS
Same as above, but also make sure all internal HTTP links are redirected to the HTTPS version.
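Both the www rule and the HTTPS rule come down to mapping every variant of a URL to one preferred version. The sketch below shows that mapping in Python, assuming the www hostname is the preferred one; `PREFERRED_HOST`, `preferred_url`, and `needs_redirect` are hypothetical names, and the actual 301 would still be configured on your server or CDN.

```python
from urllib.parse import urlsplit, urlunsplit

PREFERRED_HOST = "www.example.com"  # assumption: the www version is preferred

def _strip_www(host):
    """Drop a leading 'www.' so both host variants compare equal."""
    return host[4:] if host.startswith("www.") else host

def preferred_url(url):
    """Return the single HTTPS URL that this variant should 301 to."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if _strip_www(host) != _strip_www(PREFERRED_HOST):
        return url  # a different site; leave it untouched
    # Note: any explicit port in the original URL is dropped here.
    return urlunsplit(
        ("https", PREFERRED_HOST, parts.path or "/", parts.query, parts.fragment)
    )

def needs_redirect(url):
    """True when the URL is not already the preferred version."""
    return preferred_url(url) != url

for variant in (
    "http://example.com/pricing",
    "http://www.example.com/pricing",
    "https://example.com/pricing",
    "https://www.example.com/pricing",
):
    print(variant, "->", preferred_url(variant), needs_redirect(variant))
```

Running a list of internal links through `needs_redirect` is one way to spot links that still point at an HTTP or non-preferred version.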
Thin Category Pages
Some product-based companies have hundreds of category pages, and some may only feature a few items, which may appear like thin content to search engines. You can either chop the category itself, or noindex it.
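If you go the noindex route, the decision can be automated in your templates. This is only a sketch: the function name and the three-product threshold are assumptions, and the right cutoff depends on the store.

```python
def robots_meta_for_category(product_count, min_products=3):
    """Return a noindex robots meta tag for thin categories, else an empty string."""
    if product_count < min_products:
        # "follow" still lets crawlers follow the links on the page.
        return '<meta name="robots" content="noindex, follow">'
    return ""

# Hypothetical category sizes.
for name, count in {"widgets": 48, "left-handed-widgets": 1}.items():
    print(name, repr(robots_meta_for_category(count)))
```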
Print Pages
If the website provides print-friendly pages, the separate print-friendly URLs can create duplicate content. Make sure to block the print URLs with robots.txt or a robots meta tag.
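Before deploying a robots.txt rule, it helps to confirm it blocks what you expect. The sketch below uses Python's standard `urllib.robotparser` against an example rule set; the `/print/` path is an assumption about how the print URLs are structured, so adjust it to match the site.

```python
from urllib.robotparser import RobotFileParser

# Example rules; the /print/ directory is a hypothetical URL pattern.
ROBOTS_TXT = """\
User-agent: *
Disallow: /print/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

def is_crawlable(url, user_agent="*"):
    """True if the parsed robots.txt rules allow this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)

print(is_crawlable("https://www.example.com/print/red-widgets"))
print(is_crawlable("https://www.example.com/red-widgets"))
```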
Comment Pagination
There are 172 million websites online – and 75 million of those are on WordPress, according to WhoIsHostingThis.
The main problem with WordPress is that it allows comment pagination, which means a new URL is created for every additional page of comments on the same article.
Never allow comment pagination, or make sure there’s a canonical tag in place that points to the URL of the main article.
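To illustrate the canonical approach, here is a sketch that maps a paginated comment URL back to the main article URL. It assumes the common WordPress patterns of a `/comment-page-N/` path segment or a `cpage` query parameter; the function names are hypothetical, and in practice an SEO plugin or the theme would usually output the tag.

```python
import html
import re
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed WordPress patterns: /comment-page-2/ and ?cpage=2
_COMMENT_PAGE = re.compile(r"/comment-page-\d+/?$")

def article_canonical(url):
    """Strip comment pagination (and any fragment) to get the article URL."""
    parts = urlsplit(url)
    path = _COMMENT_PAGE.sub("/", parts.path)
    query = urlencode([(k, v) for k, v in parse_qsl(parts.query) if k != "cpage"])
    return urlunsplit((parts.scheme, parts.netloc, path, query, ""))

def canonical_tag(url):
    """Build the <link rel="canonical"> tag for a (possibly paginated) URL."""
    href = html.escape(article_canonical(url), quote=True)
    return f'<link rel="canonical" href="{href}">'

print(canonical_tag("https://www.example.com/thin-content/comment-page-2/#comments"))
print(article_canonical("https://www.example.com/?p=42&cpage=3"))
```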
Mobile Version of Website
Though this is (mostly) a thing of the past, because (almost) everyone now has a mobile-first design, some websites still have a subdomain for mobile users (m.example.com), which can cause an onslaught of duplicate content issues.
If this is the case, make sure the proper canonical tags are in place that point back to the desktop version.
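As a sketch of that mapping, the snippet below turns a mobile URL into its desktop canonical. It assumes the mobile and desktop sites use identical paths, which is not always the case; `MOBILE_HOST`, `DESKTOP_HOST`, and `desktop_canonical` are hypothetical names.

```python
from urllib.parse import urlsplit, urlunsplit

MOBILE_HOST = "m.example.com"      # assumption: the mobile subdomain
DESKTOP_HOST = "www.example.com"   # assumption: the desktop hostname

def desktop_canonical(mobile_url):
    """Map a mobile URL to the desktop URL with the same path and query."""
    parts = urlsplit(mobile_url)
    if parts.hostname != MOBILE_HOST:
        return mobile_url  # not a mobile URL; leave it untouched
    return urlunsplit((parts.scheme, DESKTOP_HOST, parts.path, parts.query, ""))

url = "https://m.example.com/widgets?color=red"
print(f'<link rel="canonical" href="{desktop_canonical(url)}">')
```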
Concluding Thoughts
Thin content is an epidemic. It can cause some major problems for websites, whether you have a few dozen pages or thousands of pages.
Now that you know exactly what thin content is, and what to do about it, you can diagnose your issues and get to work.
While expanding on those thin pages, it’s also a great time to create an updated content strategy based on up-to-date keyword research.
Also, make content audits a yearly ritual. The first audit may surface some serious problems that need to be addressed, but regular follow-ups will help prevent issues in the future.
And with constant analyzing, you’ll be able to recognize any major thin content threats before they damage your rankings – and your website’s reputation.