SEO 101: An Expert’s Guide to Auditing a Website’s Onsite SEO Health

I just had to audit one of my websites, and while it was a long and painful process (as always), I discovered and fixed many problems I didn’t know existed.

There are a number of reasons why you might want to audit the SEO health of your own site. Maybe you own the site and you haven’t checked in a while, and you’re looking for a DIY solution rather than an expensive external audit.

Maybe you just bought the website and you want to make sure everything is in order before you proceed with your plans. Maybe you’re the third-party auditor and you’ve been contracted to check on the health of a client site.

No matter the reason, there will be a lot of factors to check, and you have to be comprehensive. Missing one factor can be a hit to SEO moving forward, and an old, lingering problem can compound if left alone. Here’s what to check, and how to check it.

Author Note: This is all about SEO on your own site. For off-site SEO or competitive analysis, you’ll have to check other guides.

Check for Broken Links with Screaming Frog

Screaming Frog will provide you with a lot of information that will be useful for a number of other steps on this list as well as this one. For this particular step, you just want to run a crawl of your site and check on the integrity of your links. Any link on your site that points to a page that doesn’t exist needs to be changed or removed. Find an updated version of the previous destination, or remove the link entirely.
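If you want to spot-check a single page by hand, the idea can be sketched in a few lines of Python. This is only an illustration, not how Screaming Frog works: `status_by_url` is a hypothetical dictionary of HTTP status codes you have already gathered (for example, from a crawl export), and the names `LinkCollector` and `broken_links` are made up for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the absolute URL of every <a href="..."> on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page they appear on.
                self.links.append(urljoin(self.base_url, href))


def broken_links(page_url, html, status_by_url):
    """Return links on the page whose recorded status is 404 or 410.

    status_by_url is assumed to come from a crawl you already ran.
    """
    collector = LinkCollector(page_url)
    collector.feed(html)
    collector.close()
    return [url for url in collector.links if status_by_url.get(url) in (404, 410)]
```

Anything this returns is a link to update or remove, exactly as described above.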

Check Sitemap Integrity

Your sitemap is essentially a list of every page you want the search engines to find on your site. You need to make sure it’s well-formed and lists every page on your site. The exact process can be a little complex, so you have two options. Your first option is to check your current sitemap. Your second option is to just generate a new sitemap, preferably once the rest of your audit is complete and any changes have been made.
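For the first option, a rough check is to compare the URLs in your sitemap against the URLs a crawl actually found. The sketch below assumes a standard sitemaps.org XML sitemap (not a sitemap index file) and a list of crawled URLs you supply yourself; the function names are invented for the example.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard XML sitemaps.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Return the set of <loc> URLs listed in a sitemap document."""
    root = ET.fromstring(xml_text)  # raises ParseError if the XML is malformed
    return {
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    }


def compare_sitemap(xml_text, crawled_urls):
    """Report pages missing from the sitemap and sitemap entries the crawl never saw."""
    listed = sitemap_urls(xml_text)
    crawled = set(crawled_urls)
    return {
        "missing_from_sitemap": sorted(crawled - listed),
        "not_found_in_crawl": sorted(listed - crawled),
    }
```

If the XML fails to parse at all, the sitemap is not well-formed, which is a finding in itself.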

Check for 404s in Google Webmaster Tools

Assuming your site is linked to your Google Webmaster Tools account, you can go to your Crawl -> Crawl Errors report. This report will show you any point where the Googlebot has attempted to crawl a page only to find that the page doesn’t exist. If the page does exist, the error may be old, or you may be blocking the bot. If the page doesn’t exist, it’s a good opportunity to either create a real page for that spot, or to redirect to the page it should be loading.

Check for Duplicate Content with Copyscape

Duplicate content, either on your own site or on other sites, can hurt your site ranking. Copyscape is the premier source for checking for duplications online. You’ll find a few potential issues.

  • Duplicate content on your own site. If the same content is registering in more than one place, you either need to fix it – as in the case of duplicate product descriptions on thin pages – or canonicalize it, such as when a dynamic search makes multiple URLs for the same page.
  • Duplicate content on another site, part 1. This is when the other site was the originator of the content, such as when you copied manufacturer’s descriptions for products, or a content writer blatantly copied content. Remove the offending content and replace it with original content.
  • Duplicate content on another site, part 2. This is when another site has copied your content. Chances are this won’t penalize you, but you may want to report it to Google nonetheless.

Check for Thin Content

The exact definition of thin content is nebulous, but you can guess that any page with under 300 words of content is probably going to be considered thin. Any page with more navigational, header, or footer content than actual body content is going to be a thin page. Thin pages should be merged with similar pages, removed entirely, or expanded to become valuable pages.
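As a rough first pass, you can count the words on a page while ignoring navigation, header, and footer markup. The sketch below is one way to do that, assuming your templates use the `<nav>`, `<header>`, and `<footer>` elements; the 300-word threshold is the rule of thumb from above, not an official number.

```python
from html.parser import HTMLParser

# Elements whose text should not count as body content (an assumption
# about how the site's templates are marked up).
SKIPPED = {"nav", "header", "footer", "script", "style"}


class BodyTextCounter(HTMLParser):
    """Counts words that appear outside the skipped elements."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0
        self.words = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIPPED and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.words += len(data.split())


def count_body_words(html):
    counter = BodyTextCounter()
    counter.feed(html)
    counter.close()  # flush any text still buffered by the parser
    return counter.words


def is_thin(html, min_words=300):
    """True when the page has fewer body words than the threshold."""
    return count_body_words(html) < min_words
```

Treat the result as a list of pages to look at, not a verdict; a short page can still be the best answer to a query.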

Check for Content Errors

There are two types of errors that can crop up in content: grammatical and factual. Grammatical errors take some proofreading to find, but they are simple to fix. Just make sure you update the “last updated” date (the lastmod value) in your sitemap, so Google knows to reindex the page with the error-free version.

Factual errors are a bit harder to deal with. If your page is old and what it says was factual at the time, you don’t need to do anything except maybe add a disclaimer that the advice is out of date. It’s also an opportunity to create new, updated content, if it’s still relevant. The choice is yours.

Check the Number of Indexed Pages

Once again, go into your Google Webmaster Tools. This time, pull the Google Index -> Index Status report. At the same time, go to Google and run a search for site:yourURL. Each will provide you with a number of pages.

Do those numbers match? If so, Google is indexing everything you want indexed. If the index count is smaller, you might have an issue with accessibility or with robots directives. If the index count is larger, you may have duplicate content issues coming from dynamic URLs, as mentioned above. More on fixing both of those later.

Check for Well-Formed Meta Tags

There are three types of meta code you should check for each page on your site.

  • Meta title. Your title should be succinct, and under 70 characters whenever possible. You should append brand information to the end, not the beginning. Your title should also be descriptive, to avoid a bait and switch scenario. Likewise, it should include a keyword, but not an overly optimized keyword. Finally, every page should have a unique title, to avoid duplication errors.
  • Meta description. Your description isn’t directly an SEO factor, but it is important for clicks and attractiveness in previews. Keep it short and relevant, with a keyword that matches the title and content.
  • Meta robots directives. Typically, you can handle most robots directives in the main robots.txt file. Using them at the page level is asking for contradictions and trouble.

You should also make sure paginated pages use the rel=prev/next tags, and that you have proper canonicalization. Also, if any of your pages have meta keywords, remove them. Only spam sites use the keywords tag these days.
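The title rules are easy to check in bulk once you have every page title in one place. Here is a minimal sketch, assuming a hypothetical `titles_by_url` dictionary exported from your crawler; it only covers the length and uniqueness rules, not keyword quality.

```python
from collections import defaultdict


def audit_titles(titles_by_url, max_length=70):
    """Flag missing, overlong, and duplicated titles.

    titles_by_url maps each URL to its <title> text (assumed input).
    Returns a list of (url, problem) tuples.
    """
    issues = []
    seen = defaultdict(list)
    for url, title in titles_by_url.items():
        cleaned = (title or "").strip()
        if not cleaned:
            issues.append((url, "missing title"))
            continue
        if len(cleaned) > max_length:
            issues.append((url, f"title is {len(cleaned)} characters"))
        seen[cleaned.lower()].append(url)
    # Any title shared by more than one URL is reported for every URL using it.
    for urls in seen.values():
        if len(urls) > 1:
            for url in urls:
                issues.append((url, "duplicate title"))
    return issues
```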

Check for an Optimized 404 Page

Ideally, no one visiting your site will land on the 404 page, but it will happen no matter how much care you put into things. Optimize your 404 page. Don’t just redirect it to your homepage.

Check for Proper Canonicalization

Canonicalization can be complicated. When you have multiple URLs for the same page, such as when URL parameters are involved, session data is stored in the URL or the page is dynamically generated, you can end up with dozens or hundreds of URLs all pointing to the same page. Google, however, operates by the URL. This means any two URLs are assumed to be different pages. A dozen dynamic URLs for the same page, to Google, look like different pages with duplicate content. Avoid this issue by canonicalizing the pages.
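To make that concrete, here is a small sketch of how a dozen parameterized URLs can collapse into one canonical URL and the matching `rel="canonical"` tag. The list of ignored parameters is purely an assumption for illustration; your site will have its own session and tracking parameters.

```python
import html
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Parameters that never change the page content (hypothetical list).
IGNORED_PARAMS = {"sessionid", "sid", "utm_source", "utm_medium", "utm_campaign"}


def canonical_url(url):
    """Drop ignored parameters, sort the rest, and lowercase the host."""
    parts = urlsplit(url)
    kept = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    )
    return urlunsplit(
        (parts.scheme, parts.netloc.lower(), parts.path, urlencode(kept), "")
    )


def canonical_tag(url):
    """Build the <link> element to place in the page's <head>."""
    href = html.escape(canonical_url(url), quote=True)
    return f'<link rel="canonical" href="{href}">'
```

Every variant of a page should emit the same tag, which tells Google which URL to treat as the real one.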

Check Site Load Times with Pingdom

Users don’t like to wait, so don’t make them. Use Pingdom to check for the load times and performance of your site in general and specific pages in particular. Ideally, none of your pages will take longer than 2-3 seconds to load. The fastest pages should be measured in milliseconds, while the slowest should rarely take longer than five seconds. If anything takes longer than ten seconds to load, you have a dangerous error you need to take care of.

There can be any number of problems leading to slow response times. You may have to make a drastic change in your server architecture or your web hosting, or you may just be able to remove a broken plugin or fix a broken script.

Check Robots.txt for Errors

Your robots.txt file is important for telling well-behaved search engines what to do. Make sure you’re not accidentally blocking the bots from your entire site. If you have any blocked pages, make sure they’re blocked for a reason.
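One quick way to test this is to run a list of important URLs through a robots.txt parser and see which ones come back blocked. The sketch below uses Python’s standard `urllib.robotparser`; note that its rule matching is simpler than Googlebot’s, so treat it as a sanity check rather than a definitive answer. The function name and the default user agent are choices made for this example.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that the given robots.txt text disallows."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]
```

If your homepage shows up in the result, you are almost certainly blocking the whole site by accident.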

Check for Proper Redirects

Moz comes to the rescue for this one. There are several different types of redirects, each serving a particular purpose. If you have any redirects on your site – and you might, if you’ve changed your site architecture or your URL structure for another step – make sure the redirects are properly implemented.
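As a rough illustration of what “properly implemented” can mean, the sketch below flags two common problems: redirects that are not marked permanent, and redirects that point at another redirect. It assumes the moves are meant to be permanent and that you have a hypothetical `redirects` mapping of source URL to status code and target, exported from your server config or a crawl.

```python
# Status codes that signal a permanent move.
PERMANENT = {301, 308}


def audit_redirects(redirects):
    """Flag non-permanent redirects and redirect chains.

    redirects maps source URL -> (status_code, target_url) (assumed input).
    Returns a list of (source_url, problem) tuples.
    """
    issues = []
    for source, (status, target) in redirects.items():
        if status not in PERMANENT:
            issues.append((source, f"uses {status}, which is not a permanent redirect"))
        if target in redirects:
            # The target itself redirects, so visitors and bots take extra hops.
            issues.append((source, "redirect chain: target redirects again"))
    return issues
```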

Check URL Format

A sane, human-readable URL structure is an important aspect of SEO. Called human-readable URLs or semantic URLs, these are very common today. Any time you see a site with www.example.com/blog/2015/title-of-the-blog, you’re looking at a semantic URL. This is opposed to strings of letters and numbers that don’t make any sense.

If you don’t currently use semantic URLs, you have a very important change to make, and it’s a major change. It will likely involve a lot of redirects and a lot of work, so be careful when you’re implementing the changes.
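Most platforms generate these slugs for you, but if you ever need to build them yourself, the transformation looks roughly like this. It is a simplified sketch: it strips accents and drops any character that has no ASCII equivalent, which may not suit every language.

```python
import re
import unicodedata


def slugify(title):
    """Turn a page title into a lowercase, hyphen-separated URL slug."""
    # Decompose accented characters, then drop anything that is not ASCII.
    ascii_text = (
        unicodedata.normalize("NFKD", title).encode("ascii", "ignore").decode("ascii")
    )
    # Replace every run of other characters with a single hyphen.
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")
```

For example, `slugify("Title of the Blog!")` gives `title-of-the-blog`, matching the URL shown above.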

Check Image Alt Tags

Every time an image is used on your site, it needs an alt description, even if it’s just your logo up in the corner. Alt text is important for usability, because any time an image doesn’t load, the text loads in its place. This helps users with accessibility issues and slow connections. Alt text also helps an image rank in Google’s image search, which can be important for bringing in traffic as well. With that in mind, try to craft a keyword-optimized alt description for every image you use.
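Finding the images that lack alt text is simple to automate. The sketch below lists every `<img>` on a page whose alt attribute is missing or empty, following the rule above that every image gets a description; the class name is invented for the example.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Records the src of every <img> with a missing or empty alt attribute."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            attributes = dict(attrs)
            # A bare `alt` attribute parses as None, so treat it as empty.
            if not (attributes.get("alt") or "").strip():
                self.missing.append(attributes.get("src", "(no src)"))


def images_missing_alt(html):
    finder = MissingAltFinder()
    finder.feed(html)
    finder.close()
    return finder.missing
```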

Check Proper H Tag Utilization

Every page should have an H1 tag for the primary title. You don’t necessarily need to use any other H tags, though using H2 for subtitles is a good idea. Avoid common mistakes, like using H2 for the first subhead, H3 for the second, and so on. Think of them as nested elements, not as numbered lists.

While you’re at it, make sure that you don’t skip a number. Never have an H3 on a page without an H2 before it. Likewise, never use any H# tags if you don’t have an H1 at the top. It might seem like a minor factor, but every little bit helps, and properly formed code is important.
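Both rules can be checked mechanically. The sketch below pulls the heading levels out of a page in order and reports a missing leading H1 or a skipped level. It uses a regular expression rather than a full HTML parser, so headings inside comments or scripts would be counted too; that is a deliberate simplification.

```python
import re


def heading_issues(html):
    """Report heading-structure problems on a single page."""
    levels = [int(n) for n in re.findall(r"<h([1-6])\b", html, flags=re.IGNORECASE)]
    issues = []
    if not levels or levels[0] != 1:
        issues.append("page does not start with an H1")
    # Going deeper by more than one level at a time skips a number.
    for previous, level in zip(levels, levels[1:]):
        if level > previous + 1:
            issues.append(f"H{level} follows H{previous}, skipping a level")
    return issues
```

Moving back up the hierarchy (an H2 after an H3) is fine, which is why only jumps downward are flagged.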

Check for Keyword Cannibalism

Keyword cannibalism is a phenomenon that happens more often on large, old sites and less often on small, new sites. The idea is there are only a limited number of valuable keywords in a given niche, so blogs end up repeating themselves. However, if a keyword is targeted multiple times on the same site, the cumulative SEO value of that keyword is split up amongst those pages. This means that each individual page is less potent than one combined page would be.

The result is that you may end up holding the rank 6, 7, and 9 spaces on Google with three cannibalized pages rather than holding the rank 1 spot with a single focused page.

To find keyword cannibalization, you will need to figure out what the targeted keywords are for every relevant piece of content on your site and make sure there are as few duplicates or overlaps as possible.
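Once you have that list, spotting the overlaps is a simple grouping exercise. A minimal sketch, assuming you have recorded one primary keyword per URL in a hypothetical `target_keyword_by_url` dictionary; it only catches exact matches after normalizing case and spacing, so near-duplicates still need a human eye.

```python
from collections import defaultdict


def cannibalized_keywords(target_keyword_by_url):
    """Return keywords targeted by more than one page, with the pages involved."""
    pages = defaultdict(list)
    for url, keyword in target_keyword_by_url.items():
        # Normalize case and whitespace so trivial variations group together.
        normalized = " ".join(keyword.lower().split())
        pages[normalized].append(url)
    return {keyword: urls for keyword, urls in pages.items() if len(urls) > 1}
```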

Check on Proper Site Architecture

If you drew your homepage as a circle on a piece of paper, with every other page as another circle and every link as a line, how sprawling a spider’s web would your site be? There’s actually a science to it, and if your site violates some of the basic rules, like hiding content too many clicks away from the homepage, you may have a redesign in your future.
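The “clicks away from the homepage” idea maps neatly onto a breadth-first search over your internal links. The sketch below assumes a hypothetical `links` dictionary (page to the pages it links to) taken from a crawl, and the three-click limit is just a common rule of thumb, not a figure from this guide.

```python
from collections import deque


def click_depths(links, home):
    """Return the minimum number of clicks from the homepage to each reachable page."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def deep_pages(links, home, max_depth=3):
    """Return (pages deeper than max_depth, known pages unreachable from home)."""
    depths = click_depths(links, home)
    too_deep = sorted(page for page, depth in depths.items() if depth > max_depth)
    orphans = sorted(set(links) - set(depths))
    return too_deep, orphans
```

Pages in either list are candidates for better internal linking before you commit to a full redesign.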

Check for Search Penalties

If your site is old and ill-monitored or freshly purchased, you may have a lingering search penalty.

  • First, check for hints of a penalty, like significantly lower than expected search ranks or missing indexing.
  • Second, check for an actual penalty. If the pages you suspect are penalized are just noindexed, it’s an easy fix. If you’re actually penalized, you have some work to do.
  • Fix the problem. It might be links, it might be content, it might be code; whatever it is, you need to fix it to get your ranking restored.
  • Request reconsideration. Algorithmic penalties usually lift on their own once Google recrawls the fixed pages, but a manual action stays in place until you file a reconsideration request and Google approves it.

Check for External Link Quality

Pull a profile of all of the links on your site pointing to other domains. By now, you should have already fixed any broken links. For the rest, you need to go through and determine if you want to keep them.

Are they pointing at sites that have since changed, been parked or hacked? If so, you may want to remove them and replace them. Are they pointing at sites you consider low quality? If so, consider removing them or adding the nofollow attribute. If they are high-quality sites you trust, leave them as they are.
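Pulling that profile for a single page might look something like the sketch below. It lists each outbound link along with whether it already carries nofollow. It compares hostnames exactly, so `www.example.com` and `example.com` count as different sites; adjust that if your own domain uses both. The names are made up for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class ExternalLinkCollector(HTMLParser):
    """Collects (url, is_nofollow) for every link that leaves the site."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.site_host = urlsplit(page_url).netloc.lower()
        self.external = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        if not href:
            return
        absolute = urljoin(self.page_url, href)
        parts = urlsplit(absolute)
        # Skip mailto:, tel:, and links that stay on the same host.
        if parts.scheme in ("http", "https") and parts.netloc.lower() != self.site_host:
            rel_values = (attributes.get("rel") or "").lower().split()
            self.external.append((absolute, "nofollow" in rel_values))


def external_links(page_url, html):
    collector = ExternalLinkCollector(page_url)
    collector.feed(html)
    collector.close()
    return collector.external
```

The judgment call about each destination is still yours; the script only gives you the list to review.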

In Summary

Your page stats in Google Analytics will be a strong indicator of which pages are affected the most, and after auditing your old content, you may realize your content creation habits have improved and you have some old content that needs to go. The only way to properly audit a website is to leave no stone unturned and audit every aspect of it. Don’t forget to use tools and software to make your life easier.

