If you want the Ultimate SEO Audit Checklist, you’ve come to the right place.
Did you know that Google has approximately 200 ranking factors? Google has always maintained to SEOs that chasing the algorithm has minimal effect and provides little, if any, return for most ranking factors. Of course, they don’t want everybody gaming the system!
This SEO audit checklist provides checkpoints in your auditing process that will help you better improve certain areas of your site as they relate to these ranking factors.
Please note: This list is not intended to be the official list of ranking factors from Google. It is intended to pool together all potential SEO audit items that you may want to investigate if it’s an existing site.
Has the site fallen under penalty? If it has, this checklist will give you the tools and checks you need to get the site working again. If it has not, this checklist will help you identify where your site’s weakest links are (no pun intended).
Other articles and graphics have talked about the 200 factors, but do you have methods and points of investigation to check? The goal of this checklist is to give you all of the applicable methods of checking and points of investigation to help take the hard work out of identifying each ranking factor.
The bulk of each check is provided in the form of Screaming Frog audit steps, or steps using another audit tool or technique, along with some paraphrased background information and a walk-through of what you should check within each section.
If you are looking for a comprehensive checklist of SEO audit items, it’s here.
Matt Cutts has stated in a webmaster video that domain age does not play “very much of a role” in search ranking. In addition, John Mueller has stated that domain age does not play a role in search rankings. Cutts’s wording, however, suggests that Google may use domain age in some minimal way.
If you are curious about the age of any domain, here’s a good, cheap method to use to check it. You can also use a bulk WHOIS checker if you are feeling especially adventurous.
- Go to godaddy.com/whois
- Type in the domain.
- Look up the age of the domain.
It can be helpful to create a spreadsheet of the domains you’re analyzing for your specific marketing campaign. For bulk WHOIS checkers, Whibse is an excellent tool for scraping domain WHOIS and web analytics information.
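If you keep that spreadsheet of domains, the age calculation itself is easy to script. The following is a minimal sketch, assuming you have already copied each domain’s creation date from a WHOIS lookup; the domain names and dates are hypothetical placeholders.

```python
from datetime import date

def domain_age_years(created: date, today: date) -> int:
    """Whole years elapsed between the WHOIS creation date and today."""
    years = today.year - created.year
    # Subtract one if this year's anniversary has not happened yet.
    if (today.month, today.day) < (created.month, created.day):
        years -= 1
    return years

# Hypothetical creation dates, as you might copy them from WHOIS results.
domains = {
    "example.com": date(1995, 8, 14),
    "example.org": date(2016, 3, 1),
}
today = date(2017, 6, 1)
for name, created in domains.items():
    print(name, domain_age_years(created, today))
```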
Keyword Appears in Top Level Domain
Did you notice that, until recently, Google bolded keywords that appeared in a domain name? This isn’t conclusive proof that it’s used in ranking, but it makes sense that having the keyword in the domain would be a relevancy signal.
Because this no longer occurs, it probably isn’t something you need to worry too much about. Just keep it in mind when purchasing a new domain name. Use it as a relevancy signal more than anything else.
Keyword as First Word in Domain
In Moz’s 2015 Search Engine Ranking Factors Survey, most SEOs rated domain-level keyword usage as having minimal influence on rankings. By comparison, in Moz’s 2011 survey, SEOs rated domain-level keyword use as having a high influence on rankings. While this appears to no longer be a high-influence ranking factor, it may be worthwhile to check when creating a new site.
Just use your eyes. Eyeball the domain to see if there is a keyword as the first word of the domain or not. It probably isn’t something that will negatively impact you.
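If you are reviewing a long list of domains, the eyeball check can be approximated in a few lines. This is only a sketch: it assumes simple domains of the form name.tld (optionally prefixed with www) and ignores hyphens, and the function name is made up for illustration.

```python
def keyword_position(domain: str, keyword: str) -> str:
    """Return 'first', 'later', or 'absent' for the keyword in the domain name."""
    parts = domain.lower().split(".")
    if parts[0] == "www":
        parts = parts[1:]
    # Assumption: the first remaining label is the name we care about.
    label = parts[0].replace("-", "")
    needle = keyword.lower().replace(" ", "").replace("-", "")
    if label.startswith(needle):
        return "first"
    if needle in label:
        return "later"
    return "absent"

print(keyword_position("www.widgets-georgia.com", "widgets"))
print(keyword_position("georgiawidgets.com", "widgets"))
print(keyword_position("example.com", "widgets"))
```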
Domain Registration Length
Google’s patent states the following about domain registration lengths: “Valuable (legitimate) domains are often paid for several years in advance, while doorway (illegitimate) domains rarely are used for more than a year. Therefore, the date when a domain expires in the future can be used as a factor in predicting the legitimacy of a domain.”
While this isn’t conclusive proof that domain registration length is used as a ranking factor, it’s still a relatively quick and painless check that lets you examine the state of competition in your industry and will allow you to adjust your SEO strategy. For checking domain registration length, use Whibse or Godaddy’s WHOIS.
Keyword in Subdomain Name
Moz’s panel in the 2015 Search Engine Ranking Factors survey agrees that keywords in the subdomain name have a smaller impact on search rankings than they had in the past. Still, it is a good idea to check for the keyword in the subdomain name to reinforce the topical relevancy of your domain.
It’s likely that a site with a volatile domain history can negatively impact your SEO efforts if you purchase that domain. When purchasing a new domain, investigate factors such as its link profile, previous owners, and previous activity.
For this section of the audit, we will need several tools: Whibse.com, HosterStats.com, and several link checking tools like Ahrefs, Majestic, and SEMrush.
- Using Whibse.com, you can check the current state of the domain WHOIS.
- Using HosterStats.com, you can check domain ownership history, including things like domain hosting history, DNS history, and much more. This is useful information for determining the domain’s track record before you buy it.
- Using Ahrefs, Majestic, and SEMrush, you can check the domain’s link profile and ensure that the domain’s linking history isn’t spammy. Nothing’s worse than buying a domain and figuring out later that you need to perform additional link cleanup as a result.
Exact Match Domain
While in the past, having an exact match domain with the exact keyword you want to use was a heavy ranking factor, an EMD update was launched in September 2012. The intent of this update was to stop poor quality sites from obtaining higher rankings just because they had domain names that matched their primary targeted keywords. Sites that have an exact match domain but are a higher quality site will likely see a benefit from this.
Look at the domain name. Ask yourself the following: Is it exact match? Do other factors of the site have high enough quality so as to mitigate the exact match nature of the domain? If so, you probably can leave the domain alone. If not, then it’s time to think about a domain name overhaul.
Public vs. Private WHOIS
Google’s John Mueller has stated that using private registration won’t hurt your rankings.
Use a WHOIS tool like https://www.godaddy.com/whois. Check whether the WHOIS record is public or private. This can be useful in finding spammy links in a bad link profile and identifying spammy blog networks, so it could be a good check to use later. But in this context, registering your domain with a public vs. private WHOIS is purely personal preference and shouldn’t influence your rankings much.
Penalized WHOIS Owner
If you purchase a domain and aren’t seeing much benefit from it despite using non-spammy Google practices, it’s possible that your site could have been penalized in the past by an unscrupulous owner who decided to dump the domain. Because you own the domain and have access to its Google Analytics profiles, use the Panguin tool as an overlay to check whether you’ve been penalized by any major algorithm changes.
- Using the Panguin tool, investigate the domain you purchased and whether any algorithm updates match your traffic drops in Google Search Console.
- As a bonus, it’s a good idea to also check Google Analytics to see whether the correlations are there. If the three-point check reveals a correlation in traffic drops, it’s likely that domain was impacted by an algorithm update, and you have more work to do to get that domain up to speed.
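The overlay idea behind these checks can be sketched in code: compare average daily sessions before and after each known update date and flag large drops. This is not how the Panguin tool works internally; the function name, the seven-day window, the 30 percent threshold, and the update names and dates below are all assumptions for illustration.

```python
from datetime import date, timedelta

def drops_near_updates(sessions, updates, window=7, threshold=0.3):
    """Flag updates after which average daily sessions fell by `threshold` or more.

    sessions: dict mapping date -> daily sessions
    updates:  dict mapping update name -> update date
    """
    flagged = []
    for name, day in updates.items():
        before = [sessions[day - timedelta(days=i)]
                  for i in range(1, window + 1)
                  if day - timedelta(days=i) in sessions]
        after = [sessions[day + timedelta(days=i)]
                 for i in range(1, window + 1)
                 if day + timedelta(days=i) in sessions]
        if not before or not after:
            continue  # not enough data around this update
        avg_before = sum(before) / len(before)
        avg_after = sum(after) / len(after)
        if avg_before > 0 and (avg_before - avg_after) / avg_before >= threshold:
            flagged.append(name)
    return flagged

# Hypothetical data: traffic halves from day 15 onward.
start = date(2017, 1, 1)
sessions = {start + timedelta(days=i): (1000 if i < 15 else 500) for i in range(30)}
updates = {
    "Hypothetical update A": start + timedelta(days=14),
    "Hypothetical update B": start + timedelta(days=5),
}
print(drops_near_updates(sessions, updates))
```

A flagged update is only a correlation. Confirm it against Google Search Console and Google Analytics before drawing conclusions.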
Using Screaming Frog, perform a crawl of your site. For this crawl example, I used Screaming Frog SEO Spider 7.2. I also used CNN.com as an example site for all these bits and pieces.
- Fire up Screaming Frog.
- For most basic audits, you can use the following settings by going to Configuration > Spider:
Screaming Frog Basic Settings:
Screaming Frog Advanced Settings:
You’ll want to check the following to make sure all of these elements are up to date and they are implemented according to your SEO strategy. If not, then you’ve identified fixes you’ll need to perform after this audit:
- Keyword in the title tag
- Title tag starts with keyword
- Keyword in description tag
- Keyword appears in the H1 tag
- Keyword is most frequently used phrase in the document
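For a single page, the first four checks in this list can be approximated with Python’s standard library. This is a rough sketch rather than a replacement for the crawl: the class and function names are invented here, the sample HTML is hypothetical, and matching is a simple case-insensitive substring test.

```python
from html.parser import HTMLParser

class OnPageParser(HTMLParser):
    """Collects the title text, first-level heading text, and meta description."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.h1 = ""
        self.description = ""
        self._current = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("title", "h1"):
            self._current = tag
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current == "title":
            self.title += data
        elif self._current == "h1":
            self.h1 += data

def keyword_checks(html, keyword):
    parser = OnPageParser()
    parser.feed(html)
    kw = keyword.lower()
    title = parser.title.strip().lower()
    return {
        "in_title": kw in title,
        "title_starts_with": title.startswith(kw),
        "in_description": kw in parser.description.lower(),
        "in_h1": kw in parser.h1.lower(),
    }

page = """<html><head><title>Blue Widgets for Sale</title>
<meta name="description" content="Shop our range of blue widgets."></head>
<body><h1>Our Widgets</h1></body></html>"""
print(keyword_checks(page, "blue widgets"))
```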
Most SEO audits tend to identify thin content to avoid penalties, but what about other items that are technically necessary? Let’s take a look.
From thin content to image links on a particular page, images that are missing alt text or images that have too much alt text, Flash implementations, and how content is organized site wide, it would be helpful to know about everything that can impact rankings, right?
Identifying Pages With Thin Content
After Screaming Frog has finished crawling, just click on the Internal tab, click on the arrow next to “Filter”, and select “HTML”. If you scroll to the right, you will see a Word Count column.
While this isn’t quite as precise as other methods, it will help you identify thin content pages. In most instances, you should know roughly how many words are taken up by navigation, and you can mentally assess from there whether the page has thin content.
If the site has a heavy menu like 150 words, has a heavier footer at 200 words but not much else, and there are 3,500 words of content on the page, you can generally assume that 3,150 of those words belong to the meaty article on that page. If you aren’t sure, dive deeper into the page with the Word Count extension from Google Chrome, and count how many words belong just to the meaty article.
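The arithmetic above is easy to apply across a whole crawl export. Here is a minimal sketch using the figures from the example; the function name is made up for illustration.

```python
def estimated_article_words(total_words, nav_words, footer_words):
    """Subtract navigation and footer boilerplate from the crawler's word count."""
    return max(total_words - nav_words - footer_words, 0)

# Figures from the example: 3,500 total words, 150-word menu, 200-word footer.
print(estimated_article_words(3500, 150, 200))  # 3150
```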
Page Load Speed via HTML
Page speed has recently become a critical ranking factor to get right. In fact, Google’s John Mueller recently gave a specific number for Google’s page speed recommendations: He recommends that you keep your load time to less than 2-3 seconds.
Relying on any single tool to identify bottlenecks and page speed issues can hurt you, because Google’s Page Speed tool is not always accurate. That’s why I recommend using at least three tools to check this metric.
Using more than one tool will help you identify multiple issues that aren’t always identified by Google’s Page Speed Tool. Other tools can uncover issues even when Google’s tool says your site is 100 percent optimized.
Another tool like https://www.webpagetest.org/ will help you identify server bottlenecks, including issues like a slow time to first byte, which can cause longer load times if you aren’t careful.
You can also identify page speed issues in Google Analytics. If you wanted to get more granular and prioritize pages based on traffic, you can identify pages that need work in Google Analytics.
Simply sign into Google Analytics, navigate to Your View, open up Reports, then click on Behavior > Site Speed. The page timings report gives you a fully detailed analysis of individual page performance. Making the case for prioritized pages and site-wide overhauls will be easier with all of this data at your fingertips.
Different types of content issues can plague a site — from URL-based content issues to physical duplicate content, actually replicated from page to page without many changes. Identifying duplicate content issues is only half the battle. The other half of the battle comes when you need to fix these issues (in some cases this is much easier said than done).
Using the tool Siteliner.com (made by Copyscape) can help identify duplicate content issues on your site quickly. It gives an easy-to-see view that shows you which pages have a match percentage, and which pages match other pages.
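To get a feel for what a match percentage between two pages means, here is a small sketch that compares overlapping three-word sequences (shingles). This is one common way to estimate text overlap and is an assumption on my part, not a description of how Siteliner or Copyscape calculate their figures.

```python
def shingles(text, size=3):
    """Set of overlapping word sequences of the given size."""
    words = text.lower().split()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}

def match_percentage(a, b, size=3):
    """Jaccard overlap of the two texts' shingle sets, as a percentage."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa or not sb:
        return 0.0
    return 100.0 * len(sa & sb) / len(sa | sb)

page_a = "our blue widgets are the best widgets in georgia"
page_b = "our blue widgets are the cheapest widgets in texas"
print(round(match_percentage(page_a, page_b), 1))
```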
Also, use Copyscape to check and see which pages of your site have been duplicated across the web. Copyscape is considered one of the standard audit tools in SEO circles. This tool can help you identify duplicate content sitewide by using the private index functionality of their premium service.
To cover all your bases, check Google’s index for plagiarized copy of your site’s content from around the web. Select a section of text that you want to check, and simply copy/paste it into Google’s search bar. This should help you identify instances where it has been stolen.
Identifying duplicate content isn’t just limited to text content on the page. Checking for URLs leading to duplicate content can also reveal issues that cause Google great confusion when they crawl your site. Check and investigate the following:
- How recent content updates are
- Magnitude of content updates
- Historical trend of page updates
In Screaming Frog, scroll all the way to the right, and you’ll find a Last Modified column. This can help you determine how recent content updates are and the magnitude of content updates on the site. This can also help you develop historical trends of page updates.
If you’re obsessed with your competitors, you could go as far as performing a crawl on them every month and keeping this data on hand to determine what they’re doing.
It would be pretty easy to analyze and keep this data updated in an Excel table, and identify historical trends if you want to see what competitors are doing in terms of developing their content. This can be invaluable information.
- Syndicated content
- Helpful supplementary content
Understanding how content is segmented within a site, or somehow syndicated, is useful for divvying up original content on a site from syndicated content on a site, especially when syndicated content is a heavy site feature.
This trick is especially useful for identifying thin content and creating custom filters for finding helpful supplementary content.
The above trick for creating custom filters can also help you identify keyword prominence — where the keyword appears in the first 100 words of a page’s content.
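Outside of Screaming Frog, the keyword prominence check can be sketched directly against a page’s extracted body text. The function name is hypothetical, and the 100-word limit comes from the definition above.

```python
def keyword_in_first_words(text, keyword, limit=100):
    """True if the keyword phrase appears within the first `limit` words."""
    opening = " ".join(text.lower().split()[:limit])
    phrase = " ".join(keyword.lower().split())
    return phrase in opening

early = "Blue widgets are our specialty. " + "filler " * 150
late = "filler " * 150 + "Blue widgets are our specialty."
print(keyword_in_first_words(early, "blue widgets"))
print(keyword_in_first_words(late, "blue widgets"))
```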
Keyword in H1, H2, H3 Tags
In Screaming Frog, click on the H1 tab then take a look at the H1, H2, and H3 tags. Alternatively, you can also click on the H2 tab. In addition, you can set up a custom filter to identify H3 tags on the site.
- Keyword word order
- Grammar and spelling
- Reading level
Identifying poor grammar and spelling issues on your site during a site audit isn’t ideal, and can be painful, but doing so before posting content is a good step towards making sure your site is a solid performer.
If you aren’t a professional writer, use the Hemingway App to edit and write your content. It can help identify major issues before you publish.
Number of Outbound Links
The number of outbound links on a page can interfere with a page’s performance, but Matt Cutts has stated that the requirement of 100 links per page has been removed. If you want to perform a bonus check, you could always check this in Screaming Frog, although generally it isn’t required anymore.
In Screaming Frog, after you identify the page you want to check outbound links on, click on the URL in the main window, then click on the Outlinks tab.
Alternatively, you can click on Bulk Export > All Outlinks if you want a faster way to identify site-wide outbound links.
Number of Internal Links Pointing to Page
To identify the number of internal links pointing to a page, click on the URL in the main Screaming Frog window then click on the Inlinks tab. You can also click on Bulk Export > All Inlinks to identify site-wide inlinks to all site pages.
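Once you have the bulk export, counting inlinks per page takes only a few lines. This sketch assumes the export is a CSV with Source and Destination columns; check the header row of your own export, since column names can differ between versions. The sample rows are hypothetical.

```python
import csv
import io
from collections import Counter

def inlink_counts(csv_text):
    """Count how many rows point at each Destination URL."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row["Destination"] for row in reader)

# Hypothetical excerpt of an inlinks export (column names are an assumption).
EXPORT = """Source,Destination
https://example.com/,https://example.com/about
https://example.com/news,https://example.com/about
https://example.com/,https://example.com/news
"""
for url, count in inlink_counts(EXPORT).most_common():
    print(count, url)
```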
Quality of Internal Links Pointing to Page
Using the exported Excel document from the step where we bulk exported the links, it’s easier to judge the quality of internal links pointing to each page on the site:
Identifying broken links in an SEO audit can help you find pages that are showing up as broken to Google, and will give you an opportunity to fix them before they become major issues.
Once Screaming Frog has finished your site crawl, click on the Internal tab, select HTML from the Filter: dropdown menu, and sort the pages by status code. This will organize pages in a descending order so you can see all of the error pages before the live 200 OK pages.
In this check, we want to identify all of the 400 errors, 500 errors, and other page errors. For some links, depending on their context, it is safe to ignore 400 errors and let them drop out of the Google index, especially if it has been a while and you don’t find them in the Google index. But if they are indexed and have been for a while, you’ll probably want to redirect them to the proper destination.
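If you prefer to work from an export instead of sorting in the interface, the same triage can be sketched as follows. The URLs and status codes are hypothetical, and the function name is made up for illustration.

```python
def error_pages_first(rows):
    """Return (url, status) pairs with status >= 400, highest codes first."""
    errors = [(url, status) for url, status in rows if status >= 400]
    return sorted(errors, key=lambda pair: pair[1], reverse=True)

crawl = [
    ("https://example.com/", 200),
    ("https://example.com/old-page", 404),
    ("https://example.com/api", 500),
    ("https://example.com/moved", 301),
]
for url, status in error_pages_first(crawl):
    print(status, url)
```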
If the goal of your audit is to identify and remove affiliate links from an affiliate-heavy website, then the next tip is a good path to follow.
Affiliate links tend to have a common referrer or portion of their URL that is identifiable across many different websites. Utilizing a custom filter can help you find these links. In addition, using conditional formatting in Excel, you can filter out affiliate links and identify where they are in the bulk exports from Screaming Frog.
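The same pattern-matching idea can be applied to a bulk export outside of Excel. In this sketch the query parameters (aff_id, ref, tag) are hypothetical examples of an affiliate signature; substitute whatever identifier the affiliate programs on your site actually use.

```python
import re

# Hypothetical affiliate markers; replace with the ones found on your site.
AFFILIATE_PATTERN = re.compile(r"[?&](?:aff_id|ref|tag)=", re.IGNORECASE)

def find_affiliate_links(urls):
    """Return the URLs whose query string contains an affiliate marker."""
    return [u for u in urls if AFFILIATE_PATTERN.search(u)]

outlinks = [
    "https://shop.example.com/item?tag=partner-20",
    "https://example.com/about",
    "https://example.com/p?x=1&aff_id=9",
]
print(find_affiliate_links(outlinks))
```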
To identify URLs over 115 characters in Screaming Frog, click on the URL tab, click on Filter then click on Over 115 Characters. This will give you all the URLs on-site that are more than 115 characters and can help you identify issues with overly long URLs.
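The same filter is trivial to reproduce on any list of URLs, for example one pulled from a sitemap. A minimal sketch, with a made-up function name and hypothetical URLs:

```python
def over_length(urls, limit=115):
    """Return URLs longer than `limit` characters."""
    return [u for u in urls if len(u) > limit]

urls = [
    "https://example.com/a",
    "https://example.com/" + "a" * 95,   # exactly 115 characters
    "https://example.com/" + "a" * 100,  # 120 characters
]
for url in over_length(urls):
    print(len(url), url)
```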
For a high-level overview of page categories, it’s useful to identify the top pages of the site via Screaming Frog’s site structure section, located on the far right of the spider tool.
Using the site structure tab, you can identify the top URLs on the site, as well as which categories they fall into. In addition, you can identify page response time issues in the response times tab.
Content Provides Value and Unique Insights
Identifying content that provides value and uniqueness can only be done via a thorough competitor analysis. For the extreme white hat folks, if you don’t want to do anything to upset the Google folks, I suggest performing this check manually.
Type your keyword in Google and perform a check of the top 10 organic competitors that you find. Assuming Screaming Frog isn’t blocked, we can identify this kind of content through quick Screaming Frog checks.
Pull the competitors from Google, and add them to a list in Excel. Do a Screaming Frog crawl on each site, and export the CSVs.
To identify the content-heavy pages, scroll to the right in Excel and log all of the word counts for the top 20-50 pages. In a new Excel sheet, track all of these results in a single tab per site, identifying the content that provides value and unique insights.
This, of course, will need to be judged on a case-by-case basis and a competitor basis.
For those who prefer to use more efficient means to identify these competitors, you could use Scrape Box’s Competition Finder. I’ll leave instructions out for this here, because if you don’t know what you’re doing, you could block your own internet from being able to access Google.
Contact Us Page
Identifying a contact page on the site and ensuring that it has appropriate contact information is a good idea.
Eyeball the site and take a look at their contact info. If it is thorough and matches the WHOIS information, it’s likely good. If not, then you will probably want to make this a point of your audit.
Website architecture, or how your site is organized, can help Google better crawl and organize your content.
There are different schools of thought on this topic. Some SEOs believe that having a flat site architecture is the best (where you have no more than one click to arrive at internal pages from the home page). Other SEOs believe that a siloed architecture is best. A silo structure is where content is siloed and organized per content topic.
It has been observed by some SEOs that a silo structure tends to enhance topical focus, which in turn enhances Google’s understanding of the website’s architecture.
In Screaming Frog, check the far right window. Click on the site structure tab. Here, it will be possible to spot issues with the top 20 URLs on-site.
At first glance, you can identify whether too much content is far too deep for the user or search engine. Utilizing this check, you can identify URLs that may need some changes as to where they may be within the site structure.
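How deep a page sits can also be computed from the crawl’s link data as the minimum number of clicks from the home page. The sketch below uses a breadth-first search over a hypothetical link graph; the paths and function name are illustrative only.

```python
from collections import deque

def click_depths(links, home):
    """Minimum number of clicks from `home` to each reachable page."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths

# Hypothetical internal link graph: page -> pages it links to.
links = {
    "/": ["/news", "/about"],
    "/news": ["/news/2017"],
    "/news/2017": ["/news/2017/story"],
}
for page, depth in click_depths(links, "/").items():
    print(depth, page)
```

Pages with a high depth value are candidates for moving closer to the home page or for additional internal links.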
In an SEO audit, identifying site uptime issues can help determine problems with the server. If you own the site, it’s a good idea to have a tool like Uptime Robot that will email you every time it identifies the site as being down.
Identifying the server location can be an ideal check to determine location relevance and geolocation.
Using a tool such as site24x7.com or iplocation.net can help you identify the physical location of the server for a specific domain.
Terms of Service and Privacy Pages
Quite simply, you can use Screaming Frog’s search function (on the right side) to identify terms of service and privacy pages showing up in your Screaming Frog crawl. If they don’t show up on the crawl, check on-site and make sure they are actually there and not hosted elsewhere (this can happen sometimes).
Duplicate Meta Information On-Site
In Screaming Frog, it’s quite easy to find duplicate meta information on-site.
After the crawl, click the Meta Description tab. You can also check for duplicate titles by clicking on the Page titles tab. In addition, this information is easily visible and can be filtered in the Excel export.
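If you are working from the export, duplicates can be grouped with a short script. This sketch assumes you have loaded the export into a mapping of URL to meta description (or title); the sample values are hypothetical.

```python
from collections import defaultdict

def duplicate_values(pages):
    """Group URLs that share the same non-empty value, ignoring case and padding."""
    by_value = defaultdict(list)
    for url, value in pages.items():
        by_value[value.strip().lower()].append(url)
    return {v: urls for v, urls in by_value.items() if v and len(urls) > 1}

descriptions = {
    "/widgets": "Shop our range of widgets.",
    "/widgets?page=2": "Shop our range of widgets.",
    "/about": "Learn about our company.",
    "/contact": "",
}
print(duplicate_values(descriptions))
```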
Identifying breadcrumb navigation can be simple or complex, depending on the site. If a developer is doing their job correctly, breadcrumb navigation will be easily identifiable, usually with a comment indicating the navigation is a breadcrumb menu. In these cases, it is easy to create a custom extraction that will help you crawl and extract all the breadcrumb navigation on the site.
Click on configuration > custom > extraction to bring up the extraction menu. Customize the settings and set up your extraction where appropriate, and it will help you identify all the breadcrumb navigation on the site.
Google’s mobile index is coming soon, so it is important now more than ever to make sure your site is optimized for mobile.
Use Google’s Mobile Friendly Testing tool to find out whether your site is mobile friendly.
Usability is important to get right among your users. The easier a site is to use, the better. There are user testing services available that will help you figure out what reactions are really happening when users use your site.
First, check and see how users are really using your site through a service like UserTesting.com. This will give you invaluable information you can use to identify where your site’s weaknesses lie.
Heatmaps are also invaluable tools showing you where your users are clicking most. When you use heatmaps to properly test your sites among your users, you may be surprised. Users may be clicking where you aren’t thinking they are. One of the best tools that provides heatmap testing functionality is Crazy Egg.
Use of Google Analytics and Google Search Console
When using Google Analytics, there is a specific analytics ID (UA-#####…) that shows up when it’s installed. In addition, Google Search Console (formerly Google Webmaster Tools) has its own signature code.
Using Screaming Frog, it’s possible to create a custom extraction for these lines of code, and you can identify all pages that have proper Google Analytics installs on your site. Click on configuration > custom > extraction, and use the proper CSSPath, XPath, or Regex to figure out which pages have Google Analytics installed.
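As a rough illustration of the Regex option, the sketch below looks for classic Google Analytics tracking IDs, which start with UA- followed by an account number and a property number. The exact digit ranges in the pattern are an assumption, and the sample snippet is hypothetical. Test the pattern against your own page source before using it in a custom extraction.

```python
import re

# Assumed shape of a classic tracking ID: UA-<account digits>-<property digits>.
UA_ID = re.compile(r"\bUA-\d{4,10}-\d{1,4}\b")

def analytics_ids(html):
    """Return the distinct tracking IDs found in the page source."""
    return sorted(set(UA_ID.findall(html)))

snippet = "<script>ga('create', 'UA-12345678-1', 'auto'); ga('send', 'pageview');</script>"
print(analytics_ids(snippet))
print(analytics_ids("<p>No tracking code here.</p>"))
```

A page that returns an empty list is a candidate for a missing or broken install.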
For Google Search Console, it can be as simple as logging into the GSC account of the owner of the domain, and identifying whether the site still has access.
Penguin has become a critical part of Google’s algorithm, so it’s important to have an ongoing link examination and pruning schedule. This helps to identify any potential links that will harm you before they cause trouble and before significant issues arise because of their linking to your site.
The factors that you can check using a Link Detox link profile audit include:
- # of linking root domains
- # of links from separate C-class IPs
- # of linking pages
- Alt text (for Image Links)
- Links from .EDU or .GOV domains
- Authority of linking domain
- Links from competitors
- Links from bad neighborhoods
- Diversity of link types
- Backlink anchor text
- No follow links
- Excessive 301 redirects to page
- Link location in content
- Link location on page
- Linking domain relevancy
Using Majestic SEO, it’s possible to get some at-a-glance identification of the issues surrounding your link profile pretty much instantaneously.
For example, let’s examine ABC7.com’s link profile.
We can see a healthy link profile at more than 1 million backlinks coming from more than 20,000 referring domains. Referring IPs cap out at more than 14,000 and referring subnets are at around 9,700.
The charts below for the backlink history give you a good idea of link velocity and acquisition factors over the past 90 days.
Here is where we can begin to see more in-depth data points about ABC7’s link profile. Utilizing in-depth reports on Majestic, it’s possible to get every possible backlink. Let’s get started and identify what we can do to perform a link profile audit.
Next, compile them into a single report and run them through Link Detox. Don’t forget to download and import your disavow file. This will help keep your already-disavowed links ignored and Link Detox won’t count them when it performs its audit. In addition, Link Detox will automatically de-dup all of the duplicate entries, so you shouldn’t have to identify any duplication issues.
Go through Link Detox and begin training the tool’s AI to learn about your link profile by upvoting and downvoting links. It’s also a good idea to visit the sites and figure out what Link Detox considers spam and not spam. By using this upvoting/downvoting process, you can train the tool to include or exclude links when it reprocesses your report.
It’s also a good idea to classify your link anchor text if you’re prompted to do so. Doing this will greatly increase the accuracy of your Link Detox audit reports. The more you train the tool’s AI, the better it will get at identifying the bad links in your link profile, as well as the good ones.
There are several things to watch out for when investigating issues with your link profile. These are all my opinion and may not necessarily apply to your site, but it has been my experience that these items are crucial to watch out for and can impact your link profile positively or negatively, depending on how they are implemented. As per usual, anything used super excessively could be interpreted as spam.
Positive Link Velocity: This refers to how much your link profile is expanding and how fast it is expanding. A link profile that is expanding too fast could trigger a red flag to Google, as this could be seen as a link manipulation tactic.
Negative Link Velocity: Or too many links being removed at once.
Contextual Links: Links that are strategically placed within content that has context on the actual page. Bonus points if that content is natural, not spammy-sounding, and has some significant meat to it. Contextual links are good, but yet again, these could be manipulated, too — and you should keep an eye out for excessive contextual linking.
“Sponsored Links” or Other Words Around Link: Links that are obviously paid, or otherwise sponsored.
Links from “Hub” Pages: Links that are from top resources on the topic.
Links from Authority Sites: These are links from typically large authority sites in any one niche. Having many links from great authority sites is a good thing, and is something that shouldn’t count against you.
Natural Link Profile: A natural link profile has most link techniques kept to a minimum and a more natural distribution of link anchor text that doesn’t lean in any way toward significantly large amounts of any one type of anchor text. In other words, it looks like many people from many different niches (or within the same niche) linked to the site, rather than one person linking over and over.
Reciprocal Links: Basically “link to me and I will link to you” for the sake of cross-linking, which is a type of linking scheme that is prohibited by Google in their Google Webmaster Guidelines.
User-generated Content Links: These are basically links appearing in the comment portion of blogs.
Excessive Blog Comments: Blog comments are easily manipulated, and thus excessive blog commenting could be interpreted as coming from a black hat program like Scrape Box, which is known for its blog commenting capabilities.
Links from 301 Redirects: 301 redirects by themselves are OK and links from 301 redirects should not hurt your site, per Matt Cutts. But having excessive redirects can weigh down your site, cause bandwidth issues (may not be an issue if you can afford significant amounts of traffic hitting your 301 redirects), and additional issues with crawling if they aren’t managed properly. They can become so excessive that management of them could be a challenge.
Internal Link Anchor Text: Any excessive use of any one type of internal link anchor text is considered spam, and something that should be avoided. For example, linking to “Georgia widgets” over and over shouldn’t be done. Instead, your link profile should feature varied anchor text. Try to diversify it so it appears natural rather than manipulative.
Backlink Age: Some SEOs are of the opinion that backlink age is a ranking factor based on a Google patent.
Number of Outbound Links on the Page: Some SEOs believe that a page with many outbound links performs significantly worse than a page with a healthy amount of outbound links. However, this is no longer true. Matt Cutts has stated that they dropped the 100 links per page guideline, but they may take action if the links are too spammy.
Site-wide Links: Matt Cutts talks about this in his video, “How Does Google Consider Site-Wide Backlinks”. He confirmed that they compress the links together into one link but that they may take into consideration different actions from a manual webspam analyst perspective if they are too spammy.
Excessive Focus of Link Anchor Text on Any One Keyword: A generally accepted guideline is a link profile where no more than 20 percent of your links come from one type of site, use one type of technique, or focus on one single keyword. The more variety, the better.
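To see how your anchor text stacks up against that 20 percent guideline, you can tally anchors from a backlink export. This is a sketch only: the anchors below are hypothetical, and the threshold is the rule of thumb above, not a figure published by Google.

```python
from collections import Counter

def overused_anchors(anchors, max_share=0.20):
    """Return anchors whose share of all links exceeds `max_share`."""
    counts = Counter(a.strip().lower() for a in anchors)
    total = sum(counts.values())
    return {a: n / total for a, n in counts.items() if n / total > max_share}

anchors = ["Georgia widgets"] * 3 + [
    "Example Co", "click here", "example.com", "widgets guide",
    "homepage", "read more", "this site",
]
print(overused_anchors(anchors))
```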
Too Much Link Velocity, Too Little Link Velocity: If too many links are being acquired in too short of a time, this can negatively affect your site’s performance in the SERPs. This is especially true if it becomes a weighty part of your link profile (approaching 50 percent or much more).
Too little link velocity isn’t necessarily something to be concerned about. Link velocity becomes a concern if it’s from one source coming too much, too fast and destroying your link profile too rapidly for any other techniques to actually help.
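Link velocity can be approximated by counting newly discovered links per month and flagging months that stand far above the typical level. In this sketch the first-seen dates are hypothetical, and the rule of flagging months above three times the median is an arbitrary assumption for illustration, not a known Google threshold.

```python
from collections import Counter
from datetime import date
from statistics import median

def links_per_month(first_seen_dates):
    """Count new links by calendar month (YYYY-MM)."""
    return Counter(d.strftime("%Y-%m") for d in first_seen_dates)

def spike_months(monthly, factor=3.0):
    """Months whose new-link count exceeds `factor` times the median month."""
    if not monthly:
        return []
    typical = median(monthly.values())
    return sorted(m for m, n in monthly.items() if n > factor * typical)

# Hypothetical first-seen dates from a backlink export.
dates = (
    [date(2017, 1, 5)] * 10 + [date(2017, 2, 9)] * 12 +
    [date(2017, 3, 20)] * 11 + [date(2017, 4, 2)] * 90
)
monthly = links_per_month(dates)
print(dict(monthly))
print(spike_months(monthly))
```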
Too Many Links Coming From One Domain Too Fast (part of link velocity, but an investigative factor): This could send a signal to Google that this domain is actually helping manipulate linking factors on the target site. The idea here is that too much of anything is a bad thing, and most linking techniques should follow linking in moderation — if it’s done too much and too fast, it could be considered spam.
Excessive Forum Profile Links: These types of links could point to a program like XRUMER or Scrape Box being used on the black hat side to manipulate links. Forum profiles can easily be manipulated and created with these programs.
Links From Real Sites vs. Splogs: A splog, or spam blog, is pretty quickly identifiable these days. If blogs are created as part of a major network, they may have the same template, they may have similar content, and all the content may look like gibberish. If the splog network is sufficiently advanced, they may even take care in creating high-quality content. But most factors between these splog networks will be similar, and they will be easily identifiable.
Guest Posts: Guest posts get a bad rap these days because again, they are easily manipulated and can go the way of article marketing if they aren’t properly executed. But if they are executed correctly, they can be a good benefit, especially if they add unique content with significantly high value. Guest posts can become bad when it’s the only thing you’re doing to build your link profile.
Unnatural Influx of Links: Already covered in the link velocity section of this document.
Google Penalty: Barracuda’s Panguin tool can help you identify not only whether you have a penalty, but which penalty you have. It overlays your Google Analytics data with the dates of specific penalties and updates (not just Penguin, but Panda and many other algorithmic and manual penalties levied by Google), which makes diagnosing your problem much easier.
If you have identified exactly when you incurred a penalty, it’s a good idea to go into Google Search Console and Google Analytics to find supporting data before taking action. If you don’t have supporting data that occurred at exactly the time of the penalty, it’s likely that some other major change has affected your site.
Link Profile With High Percentage of Low-Quality Links: Links from low-quality sites, such as those frequently used by spammers, typically include the following:
- Excessive blog comment links
- Sites with excessive
- Blog Networks
- Article Marketing sites
The following are all specifically called out in Google’s Webmaster Guidelines as spam:
- Links with optimized anchor text in articles
- Press releases distributed on other sites done solely for the link
- Any low-quality directories or low-quality social bookmarking site links
- Any PPC advertising links that pass PageRank to the buyer of the ad
Linking Domain Relevancy
Relevant links from sites in the same/similar niche are all known as being more powerful than links from sites that are unrelated.
During your link profile checks, check that the domain itself is relevant and doesn’t cause any major spam issues. If it’s spam and there’s no way to contact the webmaster to ideally have them remove the link, consider adding it to your disavow file.
“Poison” Anchor Text
Having significant amounts of this kind of anchor text (basically anything from well-known spam niches like gambling or viagra) could negatively impact your site and could be considered spam. This is a popular negative SEO technique, where a competitor will point a bunch of links with spammy anchor text at another competitor’s site in an attempt to decrease its rankings. Google, however, continues to maintain that negative SEO will rarely do any harm.
Identify any major instances of significant amounts of poison anchor text happening. If you catch it and disavow the links, it is likely they will not cause any significant issues for your site. If you don’t and you find that you have been the victim, it is a good idea to contact a member of the Google Webspam team (e.g., John Mueller or Gary Illyes) and have them investigate.
Time to Compile the Disavow File
After you’ve gone through your entire link profile and identified all the bad links that you want to disavow, it’s time to compile your disavow file and upload it using Google’s Disavow Tool for your site.
Check that your disavow file is free of errors before you upload it. Stray www prefixes and formatting issues are common. To be safe, copy and paste your disavow file into Notepad, save it as a .txt file, and upload that.
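The cleanup step above can be sketched in a few lines. This example assumes the standard disavow format (one URL or one domain: entry per line, with # marking comments) and assumes you want whole domains disavowed, so it strips a leading www. from domain: entries. The function name and sample entries are hypothetical.

```python
def clean_disavow_lines(lines):
    """Trim whitespace, strip 'www.' from domain: entries, and drop duplicates."""
    cleaned, seen = [], set()
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith("#"):
            cleaned.append(line)  # keep comments exactly as written
            continue
        if line.lower().startswith("domain:"):
            host = line[len("domain:"):].strip().lower()
            if host.startswith("www."):
                host = host[len("www."):]
            line = "domain:" + host
        if line not in seen:
            seen.add(line)
            cleaned.append(line)
    return cleaned

# Hypothetical sample entries gathered during a link audit.
raw_lines = [
    "# spam network found in link audit",
    "domain:www.spam-example.test",
    "domain:spam-example.test ",
    "",
    "http://bad-example.test/page.html",
]
cleaned = clean_disavow_lines(raw_lines)
print("\n".join(cleaned))
```

Writing the result out as a plain UTF-8 .txt file serves the same purpose as the Notepad step.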
Schema.org has become somewhat of a cult phenomenon. It isn’t quite as popular among SEOs as other techniques since it is difficult to prove its effectiveness, but some SEOs believe that pages supporting these microformats may rank above pages without it. But there is no doubt that it adds support for Google’s featured snippets, which may help rankings when it comes to featured snippet results and the Google Carousel for different niches.
Identifying Schema.org microformats in your audit will help you figure out how to optimize next, and where to go from here. You can establish a baseline of Schema.org optimization and identify what needs to be further optimized and other opportunities for growth.
You can use Screaming Frog to identify Schema.org microformat code that exists on-site.
To identify pages that have Schema.org coding, you’ll want to use Custom Filters. Simply bring up Screaming Frog and click on Configuration > Custom > Search.
Enter the following code in the Custom Filter Configuration to identify whether that page has Schema markup: itemtype=http://schema.org. Depending on how it’s coded (don’t forget to check for this first!) you may want to enter itemtype="http://schema.org". This will identify any main parent code items that have Schema.org coding, but won’t identify items specifically.
If you wanted to identify a specific type of Schema markup, you would use that exact Schema. For example, you would have to use itemprop="name" to find any coding that contained that particular Schema value.
It all depends on your niche and the Schema that you want to identify. Be sure to visit Schema.org for a comprehensive listing of all available Schema formats.
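If you want to spot-check a single page outside of Screaming Frog, the same itemtype search can be sketched with a regular expression. This is a minimal example that only looks at the itemtype attribute discussed above; it will not detect Schema.org markup written in other syntaxes such as JSON-LD. The function name and sample HTML are hypothetical.

```python
import re

# Matches itemtype=http://schema.org/Type with or without quotes, over http or https.
ITEMTYPE_RE = re.compile(
    r'itemtype\s*=\s*["\']?https?://schema\.org/([A-Za-z]+)',
    re.IGNORECASE,
)

def schema_types(html):
    """Return the sorted, de-duplicated Schema.org types declared via itemtype."""
    return sorted(set(ITEMTYPE_RE.findall(html)))

# Hypothetical page fragment with one quoted and one unquoted itemtype.
sample_html = """
<div itemscope itemtype="http://schema.org/Product">
  <span itemprop="name">Blue Widget</span>
  <div itemscope itemtype=https://schema.org/Offer>
    <span itemprop="price">9.99</span>
  </div>
</div>
"""
found_types = schema_types(sample_html)
print(found_types)
```

The list of types gives you the baseline mentioned above: what is already marked up, and by omission, what is not.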
Technical SEO is incredibly important. You need a technical foundation in order to be successful.
Let’s examine some of the more common technical SEO issues and get some checks and balances going so we can fix them.
The presence of a sitemap file on your site will help Google better understand its structure, where pages are located, and more importantly, will help give it access to your site, assuming it’s set up correctly. XML sitemaps can be simple, with one URL of the site per entry. They don’t have to be pretty. HTML sitemaps can benefit from being “prettier” with a bit more organization to boot.
This is a pretty simple check. Since the sitemap is typically installed in the root directory, you can check for the presence of the sitemap file by searching for it in Screaming Frog, or you can check it in the browser by adding /sitemap.xml or /sitemap.html to the end of the domain.
Also, be sure to check the sitemaps section in Google Search Console. It will tell you if a sitemap has previously been submitted, how many URLs were successfully indexed, whether there are any problems, and other issues.
If you don’t have one, you’ll have to create one.
Using Screaming Frog, it’s quite simple to create an XML Sitemap. Just click on Sitemaps > Create XML Sitemap.
Go to the Last modified tab and uncheck it. Go to the Priority tab and uncheck it. Go to the Change Frequency tab and uncheck it. These tags don’t provide much benefit for Google, and thus the XML sitemap can be submitted as is.
Any additional options (e.g., images, noindex pages, canonicalized URLs, paginated URLs, or PDFs) can all be checked if they apply to your site.
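For reference, a sitemap with the optional last modified, priority, and change frequency tags left out is just a list of loc entries. The sketch below builds one with Python’s standard library; it is an illustration of the resulting file structure, not a replacement for Screaming Frog’s generator, and the function name and example URLs are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(urls):
    """Build a minimal XML sitemap: one <url><loc> entry per URL, no optional tags."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url  # special characters are escaped on output
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body

# Hypothetical URLs for illustration.
sitemap_xml = build_sitemap([
    "https://www.example.com/",
    "https://www.example.com/products?color=blue&size=m",
])
print(sitemap_xml)
```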
It’s also a good idea to check your sitemap for errors before submitting it. Use an XML validator tool like CodeBeautify.org and XMLValidation.com. Using more than one validator will help ensure your sitemap doesn’t have errors and that it is 100 percent correct the first time it is submitted.
In addition, uploading the URL list to Screaming Frog using list mode is a good way to check that every URL in your sitemap returns a 200 OK status code. Strip out all the formatting and ensure it’s only a list of URLs. Then click on Mode > List > upload > Crawl and make sure all pages in the sitemap return 200 OK.
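Stripping the formatting by hand gets tedious on larger sitemaps. As a minimal sketch, the following pulls the bare URL list out of a standard XML sitemap so it can be pasted into list mode. It assumes a regular urlset sitemap (not a sitemap index), and the function name and sample content are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace prefix mapping for the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def sitemap_urls(xml_text):
    """Return the plain list of URLs found in the <loc> tags of an XML sitemap."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.findall("sm:url/sm:loc", NS) if loc.text]

# Hypothetical sitemap content for illustration.
sample_sitemap = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://www.example.com/</loc></url>"
    "<url><loc> https://www.example.com/about </loc></url>"
    "</urlset>"
)
url_list = sitemap_urls(sample_sitemap)
print("\n".join(url_list))
```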
Identifying whether robots.txt exists on-site is a good way to check the health of your site. The robots.txt file can make or break a website’s performance in search results.
For example, if you set robots.txt to “disallow: /”, you’re telling Google never to crawl the site because “/” is root! It’s important to make this one of the first checks in SEO because so many site owners get this wrong. To let all user agents crawl the entire site, the directive should read “disallow: ” without the forward slash.
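The difference between the two directives is easy to verify with Python’s built-in robots.txt parser. This is a small sketch that parses robots.txt content passed in as text; the helper name, the example URL, and the Googlebot user agent default are illustrative choices.

```python
from urllib.robotparser import RobotFileParser

def can_crawl(robots_txt, url, user_agent="Googlebot"):
    """Return True if the given robots.txt content lets the user agent fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

# "Disallow: /" blocks the whole site; an empty "Disallow:" blocks nothing.
blocked_site = "User-agent: *\nDisallow: /"
open_site = "User-agent: *\nDisallow:"

page = "https://www.example.com/some-page"
print(can_crawl(blocked_site, page))  # False
print(can_crawl(open_site, page))     # True
```

Pasting in the live file’s contents and testing a handful of important URLs is a quick sanity check before digging into the robots.txt Tester.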
Check Google Search Console for the presence of a robots.txt file. You can go to Crawl > robots.txt Tester to do this. It will help you see what is currently live on-site, and if any edits will improve that file.
It’s also a good idea to maintain records of the robots.txt file. Monthly screenshots will help you identify whether changes were made and when, and help you pinpoint errors in indexation if any were to arise. Checking the link “See live robots.txt” will let you investigate the currently live state of the site’s robots.txt file.
The Crawl Errors section of GSC will help you identify whether crawl errors currently exist on-site. Finding crawl errors and fixing them are an important part of any website audit because the more crawl errors a site has, the more issues Google has finding pages and indexing them. Ongoing technical SEO maintenance of these items is crucial for having a healthy site.
In Google Search Console, identify any 4xx errors (client errors, including not found) and 5xx errors (server errors) found on-site. All of these types of errors should be called out and fixed.
In addition, you can use Screaming Frog to find and identify 4xx and 5xx error codes. Simply click on Bulk Export > Response Codes > Client Error (4xx) Inlinks and Server Error (5xx) Inlinks.
This issue can cause Google to see two or more versions of a page as the source of a single piece of content on your site. Multiple versions can exist, from uppercase URLs to lowercase URLs, to URLs with dashes and URLs with underscores. Sites with severe URL issues can even have the following:
What’s wrong with this picture? In this case, seven different URL versions exist for one piece of content. This is awful from Google’s perspective, and we don’t want to have such a mess on our hands.
The easiest way to fix this is to point the rel=canonical of all of these pages to the one version that should be considered the source of the single piece of content. However, the existence of these URLs is still confusing. The ideal fix is to consolidate all seven URLs down to one single URL, and set the rel=canonical tag to that same single URL.
Another situation that can happen is that URLs can have trailing slashes that don’t properly resolve to their exact URLs. Example:
In this case, the ideal situation is to redirect the URL back to the original, preferred URL and make sure the rel=canonical is set to that preferred URL. If you aren’t in full control over the site updates, keep a regular eye on these.
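To find candidate duplicates in a crawl export, one approach is to reduce every URL to a comparison key that ignores the variations described above (scheme, www, letter case, underscores versus dashes, trailing slash) and group the URLs that share a key. The sketch below is for detection only; the key is not a URL you should redirect to, and the function names and sample URLs are hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlsplit

def variant_key(url):
    """Collapse scheme, 'www.', letter case, underscores vs. dashes, and trailing slash."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[len("www."):]
    path = parts.path.lower().replace("_", "-").rstrip("/")
    return host + path

def duplicate_groups(urls):
    """Group URLs by comparison key and keep only keys with more than one URL."""
    groups = defaultdict(list)
    for url in urls:
        groups[variant_key(url)].append(url)
    return {key: group for key, group in groups.items() if len(group) > 1}

# Hypothetical crawl output for illustration.
crawled = [
    "http://example.com/Blue-Widgets",
    "https://www.example.com/blue-widgets/",
    "https://www.example.com/blue_widgets",
    "https://www.example.com/contact",
]
duplicates = duplicate_groups(crawled)
print(duplicates)
```

Each group that comes back is a set of URLs to check by hand: pick the preferred version, redirect the rest to it, and confirm the rel=canonical points at that same URL.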
Does the Site Have an SSL Certificate (Especially in E-Commerce)?
Ideally, an e-commerce site implementation will have an SSL certificate. But with Google’s recent moves toward preferring sites that have SSL certificates for security reasons, it’s a good idea to determine whether a site has a secure certificate installed.
If a site has https:// in its URL, it has a secure certificate, although the check at this level may reveal issues.
If a red X appears next to the https:// in a domain, it is likely that the secure certificate has issues. Screaming Frog can’t identify security issues such as this, so it’s a good idea to check variations such as https://www, https://blog, or the bare https:// domain. If two of these have an X across them, as opposed to the main domain (if the main domain has https://), it is likely that errors were made during the purchase process of the SSL certificate.
In order to make sure that all subdomain variations of https:// resolve properly, it’s necessary to get a wildcard secure certificate. A wildcard certificate covers every first-level subdomain (www, blog, and so on); the bare domain typically needs to be listed on the certificate as well.
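To see why some variations can fail while others work, it helps to look at how certificate names are matched against hostnames: a wildcard entry matches exactly one subdomain label. The sketch below reproduces that matching rule in simplified form so you can test a list of names from a certificate against the hostnames you care about. It is not a full certificate validator, and the function name and example names are hypothetical.

```python
def cert_covers(hostname, cert_names):
    """Return True if any certificate name matches the hostname.

    Simplified rule: a leading "*." matches exactly one subdomain label.
    """
    host = hostname.lower()
    for name in cert_names:
        name = name.lower()
        if name == host:
            return True
        if name.startswith("*."):
            suffix = name[1:]  # e.g. ".example.com"
            prefix = host[: -len(suffix)]
            # The part replaced by "*" must be a single, non-empty label.
            if host.endswith(suffix) and prefix and "." not in prefix:
                return True
    return False

# Hypothetical certificate that lists only the wildcard name.
wildcard_only = ["*.example.com"]
for host in ["www.example.com", "blog.example.com", "example.com"]:
    print(host, cert_covers(host, wildcard_only))
```

In this example the bare domain is not covered by the wildcard alone, which is one way a site ends up with some https:// variations working and others showing an error.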
Continuing our audit of ABC7.com’s website, we can identify the following:
There are at least 5 CSS files and 11 script files that may need minification. Further study into how they interact with each other will likely be required to identify any issues that may be happening.
Since this site is pretty video-heavy, it’s also a good idea to figure out how the video implementations are impacting the site from a server perspective as well as from a search engine perspective.
Identifying images that are heavy on file size and causing increases in page load time is a critical optimization factor to get right. This isn’t a be-all, end-all optimization factor, but it can deliver quite a decrease in page load time if managed correctly.
Using our Screaming Frog spider, we can identify the image links on a particular page. When you’re done crawling your site, click on the URL in the page list, and then click on the Image Info tab in the window below it:
You can also right-click on any image in the window to either copy or go to the destination URL.
In addition, you can click on Bulk Export > All Images or you can go to Images > Images missing alt text. This will export a full CSV file that you can use to identify images that are missing alt text or images that have lengthy alt text.
HTML Errors / W3C Validation
Correcting HTML errors and W3C validation by themselves doesn’t increase ranking, and having a fully W3C valid site doesn’t help your ranking, per Google’s John Mueller. But correcting these types of errors can help lead to better rendering in various browsers, and if the errors are bad enough, these corrections can help lead to better page speed. But it is on a case-by-case basis. Just doing these by themselves won’t automatically lead to better rankings for every site.
In fact, mostly it is a contributing factor, meaning that it can help enhance the main factor — site speed. For example, one area that may help includes adding width + height to images. Per W3.org, if height and width are set, the “space required for the image is reserved when the page is loaded”. This means that the browser doesn’t have to waste time guessing about the image size, and can just load the image right then and there.
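Finding images without width and height attributes can be sketched with Python’s built-in HTML parser. This example scans an HTML string and lists the src of each img tag that is missing either attribute. It does not account for dimensions set through CSS, and the class name, function name, and sample markup are hypothetical.

```python
from html.parser import HTMLParser

class ImageSizeAudit(HTMLParser):
    """Collect the src of every <img> tag that lacks a width or height attribute."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        names = {name for name, _ in attrs}
        # Flag the image unless both attributes are present.
        if not {"width", "height"} <= names:
            self.missing.append(dict(attrs).get("src", "(no src)"))

def images_missing_dimensions(html):
    """Return the src values of images missing width and/or height."""
    audit = ImageSizeAudit()
    audit.feed(html)
    return audit.missing

# Hypothetical page fragment for illustration.
sample_page = (
    '<img src="/a.jpg" width="600" height="400" alt="A">'
    '<img src="/b.jpg" alt="B">'
    '<img src="/c.jpg" width="300"/>'
)
missing = images_missing_dimensions(sample_page)
print(missing)
```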
Using the W3C validator at W3.org can help you identify HTML errors and fix them accordingly.
Be sure to always use the appropriate DOCTYPE that matches the language of the page being analyzed by the W3C validator. If you don’t, you will receive errors all over the place. For example, you cannot simply change the DOCTYPE from XHTML 1.0 to HTML5 and expect the page to validate.
Why Certain “Signals” Were Not Included
Some SEOs believe that social signals can impact rankings positively and negatively. Other SEOs do not. Correlation studies, while they have been done, continue to ignore the major factor: correlation does not equal causation. Just because there is improvement in correlation between social results and rankings doesn’t always mean that social improves ranking.
There could be a number of additional links being added at the same time, or there could be one insanely valuable authority link that was added, or any other number of improvements. Gary Illyes of Google continues to officially maintain that they do not use social media for ranking.
The goal of this SEO audit checklist is to put together on-site and off-site checks to help identify any issues, along with actionable advice on fixing these issues. Of course, there are a number of ranking factors that can’t easily be determined by simple on-site or off-site checks and require time, long-term tracking methods, as well as in some cases, custom software to run. These are beyond the scope of this article.
Hopefully you found this checklist useful. Have fun and happy website auditing!
Featured image and in-post images: Paulo Bobita
Screenshots by Brian Harnish. Taken May 2017.