Learn More
  1. SEJ
  2.  » 
  3. SEO

How to Find Every Orphan Page on Your Website

In this post, you'll learn what orphan pages are, why fixing them is important for SEO, and how to find all of your orphan pages.

How to Find Every Orphan Page on Your Website

How do you find pages that have no links?

It’s nearly impossible.

If you have pages on your website that users and search engines can’t reach, you have a problem you need to fix. Fast.

These types of pages have a name: orphan pages.

In this post, you’ll learn what orphan pages are, why fixing them is important for SEO, and how to find every orphan page on your site.

What Is an Orphan Page?

A page without any links to it is called an orphan page.

In order for Google and other search engines to index your pages, they need to know they exist and where.

This is usually accomplished in one of two ways:

  • The crawler follows a link from another page.
  • The crawler finds the URL listed in your XML sitemap.

Why Are Orphan Pages an SEO Issue?

Because search engines can’t find an orphan page through other links on the end, orphan pages often go unindexed and never show up in search results.

Continue Reading Below

Even if your orphan pages are listed in your XML sitemap, they are still a problem for SEO.

With no internal links, no authority is passed to the pages, and search engines have no semantic or structural context in which to evaluate the page.

Without any way of knowing where the page fits into your site as a whole, it can be more difficult to determine which queries the page is relevant for.


Orphan vs. Dead-End Pages

Before we dive into orphan pages, I wanted to take a moment to briefly clarify the difference between orphan and dead-end pages, as there has been some confusion between these SEO terms.

As we’ve already established, an orphan page is a webpage that isn’t linked to by, or reachable from, any other page on the same website.

A dead-end page, on the other hand, is a webpage that doesn’t link to any other internal webpages or any external websites, thus creating a “dead end.”

Continue Reading Below

When people land on this page, they can either hit back or just abandon the site. When search engine crawlers land on the page, they have nowhere to go, and no link equity can be passed.

Today, with so many templates and themes available, it’s more difficult to create a dead end – but hardly impossible. A dead end can easily be remedied by adding links to your on-page content, or in the sidebar or footer navigation.

Al clear? Good.

Now let’s find your orphan pages.

1. Identify Your Crawlable Pages

First, you’ll need a list of all of the URLs that currently can be reached by crawling your site’s links.

You will need an SEO spider to do this. I recommend ScreamingFrog.

Whatever crawler you use, make sure it is set to crawl only pages that are indexable by search engines, meaning that it should not crawl pages that are noindexed or pages that are hidden from search engines by robots.txt.

Start the crawl from the homepage of the site, making sure to use the canonical URL, including the proper https or http and www versus non-www.

Once you have crawled your site, export the URLs to a spreadsheet like this:

Export the URLs to a spreadsheet

2. Resolve 2 Common Causes of Orphan Pages

Before checking any tools or sources to find orphan pages, there are two common causes of orphan pages that should be immediately addressed and dealt with.

What both of these causes have in common is that they are essentially page duplicates that should automatically redirect consistently to only one URL.

If they don’t, it’s likely that some versions of the page are not linked to and as a result are orphans.

In this case, the fact that they are orphans isn’t the primary issue, the fact that they are duplicates is.

Continue Reading Below

Still, these will come up later while you are looking for orphan pages, and need to be dealt with, so it’s a good idea to get these out of the way beforehand.

Non-canonical https/http or www/non-www

Every public page on your site should ideally use http or https consistently (preferably https), and www or non-www consistently.

To check if this is the case, try typing all of these variations of your site’s homepage into your browser:

  • https://www.example.com
  • http://www.example.com
  • https://example.com
  • http://example.com

All four variations should redirect automatically to the exact same URL.

You should test this on a few other pages of your site, and check your site’s .htaccess file to make sure that redirects for these are set up properly.

Here is how to force https in .htaccess. If you do this, verify that every page on your site has SSL capabilities, or your users will get a scary browser warning.

Continue Reading Below

Here is how to force www or non-www. Again, verify that this won’t create any server errors.

Trailing Slashes

Another thing to watch out for is consistent use of trailing slashes.

For example, these two URLs may produce the same content, but the URLs are not identical:

  • https://example.com/page1/
  • https://example.com/page1

Check a few pages on your site both with and without the trailing slash, and make sure that they redirect automatically to the same URL, and that they do so consistently.

Verify that this is set up properly in .htaccess. Here is how to force a trailing slash in .htaccess.

3. Get a List of URLs from Google Analytics

Crawlers, by definition, will have a difficult time finding orphan pages.

So using any SEO tool to find one is bound to be problematic.

Continue Reading Below

The best place to start looking for orphan pages, then, is your own Google Analytics data (or any other analytics packages you use).

As long as the pages in question have Google Analytics installed, if the page has ever been visited, there is a record of it somewhere in Google Analytics.

To get a comprehensive list of URLs, from the left sidebar, select “All Pages” under “Site Content” from the “Behavior” section:

All Pages in Behaviour

Since our orphan pages are difficult to find, the number of times they have been visited is likely to be quite low.

Click “Pageviews” so that the arrow is pointing upward, indicating that the list of URIs is sorted in ascending order from least to most pageviews.

This will move the pages most likely to be orphans to the top:

Continue Reading Below


To make sure our list is as comprehensive as possible, go to the time range at the top right and set the starting date back to a time before Google Analytics was in place, and click the “Apply button:

Date Picker in Google Analytics

Now we will need to expand our list of URLs as much as possible.

In the bottom right, click the “Show rows” dropdown menu and select the highest number of rows.

Continue Reading Below

Our biggest obstacle is that Analytics can only list up to 5,000 URLs at a time:

Display number of rows in Google Analytics

If you have more than this, you will have to export 5,000 pages at a time until you have all you Google Analytics visitor data.

However, we are sorting pageviews by ascending, so our list should hopefully include all, and will most likely include most orphan URLs that have had a visitor.

It will likely take a bit of time Analytics to fetch all of the data. Be patient and don’t try to rush things or you will risk crashing your browser.

Once the URLs are loaded, head up to the top right, select export, and export a Google Sheet, Excel file, or CSV spreadsheet to get your URLs.

Export Sheet

Now copy the URLs from your exported analytics file into your orphan page spreadsheet, like so:

Continue Reading Below

Copy the URIs

We will need to get these into URL format in order for them to be useful. To do this, insert a new column and paste down the homepage URL, like so:

Insert a new column for Root URL

And use the concat() formula to combine these together into a URL in the next column over:

Combine both the columns

Then just drag the formula down to get the full list of URLs:

Full list for combining columns

4. Identify Your Orphan URLs

To identify our orphan URLs, we will need to compare the list of “Crawlable URLs” and the list of “Analytics URLs” in our spreadsheet.

Continue Reading Below

In our hypothetical example, it’s obvious that https://example.com/11 is an orphan page, but in reality you will almost always have far more URLs to sift through, and we will need to automate the process of identifying our orphan URLs.

To do this, we need a formula that checks if each URL in our Analytics list is also found in our list of Crawlable URLs.

Here is an example of a formula that will accomplish this:

Formula to get Crawlable URLs

The “match” formula we have used in cell E2 here is:


This formula checks if the URL in cell D2 is in the range $A$2:$A$11. (If you’re not too familiar with spreadsheets, the dollar signs are there to make sure that when we drag the formula down the column, the range won’t change.)

Continue Reading Below

The value “0” tells Google Sheets that the columns aren’t necessarily sorted. (See the Google Sheets documentation.)

If there is a match, the formula returns its position in the range, which in this case is the first position in the range.

What we’re more interested in, however, is if there isn’t a match.

As you can see, the formula returns the error #N/A for https://example.com/11, because it is not found in our list of Crawlable URLs. This means it is an orphan page.

To get a list of our orphan pages, then, all we need to do is sort our “Match” column to collect all of the #N/A results in one place.

Sort results

We can then copy our list of orphan URLs and paste them to a new sheet where we can address how to fix them.

Continue Reading Below

5. Other Places to Look for Orphan URLs

You can repeat this process for identifying orphan URLs using data sources other than Google Analytics.

Any of the following tools will have a list of pages crawled from your site:

  • SEMrush
  • Ahrefs
  • Moz Link Explorer
  • Raven Tools

I would not recommend signing up for any of them exclusively to look for orphan pages, because they will need to somehow crawl these pages in order to find them.

However, it is possible that in some cases these tools will find pages that aren’t directly crawlable because they were found using other means, usually at some point in history when the page was crawlable:

Also, it’s a good idea to work with your dev team to see if they can get the complete list of URLs on the site directly from the server, since this should be the most complete list available anywhere.

Continue Reading Below

Finally, you can get a list of URLs from the Google Search Console’s Search Analytics report.

Even though these pages are obviously indexed if they are showing up here, you may still find pages that aren’t crawlable from your internal links that will need to be fixed.


Orphan pages can’t be indexed by search engines if they don’t show up in your sitemap – and they can create other SEO issues even if they do.

Use the methods outlined in this post to find your orphan pages and get this issue resolved.

More Resources:

Image Credits

Featured Image: E2M Solutions
All screenshots taken by author, November 2018


Subscribe to SEJ

Get our daily newsletter from SEJ's Founder Loren Baker about the latest news in the industry!


Manish Dudharejia

Founder & President at E2M Solutions Inc.

Manish Dudharejia is the President and Founder of E2M Solutions Inc, a San Diego Based Digital Agency that specializes in ... [Read full bio]

Read the Next Article
Read the Next