Sitemaps and Internal Links – the 10000 Page Test

A claim from someone with a huge following isn’t good enough for me when it comes to deciding whether to implement an SEO strategy. I’m not a sheep to slaughter, no matter how many other sheep are around me or how “trusted” the sheep-herder. I want to find out for myself. So I did. In a BIG way!

To Sitemap Or Not?

One thing that some SEO people say is “You don’t need a sitemap.xml file,” because eventually Googlebot will find your pages and index them. According to this theory, sitemaps belong in the “boondoggle” category. But do they really? What about a site that has poor internal linking? How well do sitemaps do at overcoming that issue?
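
For readers who haven’t looked inside one, a sitemap.xml file is just a list of page URLs in a small XML wrapper. Here is a minimal sketch in Python of how one could be generated. The function name and the example.com URLs are hypothetical placeholders for illustration, not anything taken from the site discussed in this article.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap XML (as a string) listing the given page URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical URLs, for illustration only.
pages = [
    "https://www.example.com/",
    "https://www.example.com/arizona/tucson/",
]
print(build_sitemap(pages))
```

Each page gets one `<url>` entry with a `<loc>` child. The protocol also allows optional fields such as a last-modified date, which this sketch leaves out.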

What About Internal Links?

Can a site with tens of thousands of pages be properly indexed without full HTML navigation? And when it comes to linking, is having a “funnel” really as important as has been billed in our community? Or do your really important links have to be at the top of pages? Or can they be in the footer?

The Client and the Scope

Late last year, one of my web development clients, WebSight Design, got a gig building out a directory of storage facilities across the United States. The site is As a directory with listings in tens of thousands of locations across the country, this particular site would need to compete on so many levels it could make your head spin.

National, regional, and multiple types of local search aspects all come into play. Because this site is an aggregate data site, and does not itself have local addresses in every location, any typical local search optimization methods would not be able to be fully implemented here. Some other method would be called for at the local level.

From the Top – Competitive Analysis

When I got this assignment, one of my first tasks was to do a detailed review of the competition. And let me tell you – in the self storage market, the competition is ugly. There are two very important factors: site depth and back links.


From the chart above, you can see what the competition is like in this field. The very high page count is due to the fact that every storage facility has its own page, or at the very least, every city in a database has its own page. And the competitors with far fewer pages have spent a lot of time getting links back. (The really high link counts usually come from a partnership with a complementary site that also has city based listings.)

Bigger Competitive Base

Some of the competition in this market is bleed-through, because some sites that come up are actually large moving companies, offering both moving and storage solutions. That’s a funny thing about the nature of our industry, and it explains why a client’s competition is more often bigger than they think.

You’re Also Competing Against Google

Where it gets even more challenging is when you do a search and include a city name. Not only do you get all of the usual suspects in the competitive arena, you’re also now competing directly with Google’s Local listings.

Now, I understand why Google would want to present these results to someone doing a search. But it means there’s a much bigger problem for some companies trying to compete for the first page of Google results.

To business directory owners in markets like the storage comparison service, Google is now a direct competitor. I’ll leave it to you to decide whether that fits with the whole “Google is Evil” mantra or not. Instead, it just becomes one more factor for me to work into my analysis and action plan.

I don’t LIKE the fact that I have to work it into my analysis, but heck that’s what I get paid for. And as much as it pained me when I first realized this was an issue, I thought it would be exponentially more painful to the client. And it was. He was shell-shocked, because he never previously thought he’d be going up against Google, of all entities…

He’s getting into an arena that is about as entrenched as it gets. So even if he could match his top competitors page for page, he’d be in for a heck of a ride given the volume of back-links. And the potential for his site’s pages to be viewed as duplicate content because all those other sites have been around a lot longer…

But Wait – There’s More!

Not only are you competing against Google at the Local level, for some phrases, you’re even competing against Google on the national level!

That’s right – thanks to the fact that Google wants to provide what THEY consider the most relevant content to a search, for some keywords that you enter, even when you DON’T enter a geographic location, they ASSUME you may want to see that information.

In the screen shot, notice how there are three organic listings above Google’s Local results. That means you HAVE to be in the top three now to be sure you’re seen both above the fold and before mass confusion sets in for the person doing the search when they get that map and all those local links thrown into that eye space.

I’m sure many of you reading this article already knew this stuff. I’m just mentioning it here though because some readers may not have considered the implication or ramifications of this before.

And as it relates to my client’s site, and what has to be done to overcome such insanity, it’s completely relevant.

As you can see in the above screen-shot, my client is already in the top three for this particular search.

They’re not in such a stellar position for every non-city related phrase yet. But they are for a few. And they’re in the top five results for a few phrases targeting their prospective customers, the facility owners.


At the City Level

At the City level, for “find storage facilities Tucson”, they’re the first organic result that shows up just below the map and individual facilities that come from Google Local. That’s crucial, because if I were doing a search, and saw all those choices, I would surely get really annoyed really fast at having to click on every one of them to do a comparison on my own. Having an alternative to compare results would be really helpful.

A Word About Best Practices Page Titles

Note how my client site’s Title in this Tucson screencap doesn’t say “find storage facilities Tucson”. Instead, it says “Compare Self Storage Tucson”…

That’s magic when it comes to helping an end user who wants a quick way to compare all those results. Most of the time, we shoot for having an exact match in a page title so the person using Google sees the same phrase repeated, and bold.

In this case, it’s actually better having what I consider a more appropriate title given the need to differentiate from the Google Local listings confusion.

So what’s the key to my initial modest success?

Back Links – It’s Back Links I Tell You!


It’s actually not back links. The site currently has about 200 links coming back, mostly low level stuff so far. Compared to the tens of thousands of back links the competition has, that’s peanuts.

No, back links are not what’s done it. Sure, the next level of success will require acquiring more back links – but that’s chicken feed compared to what’s been accomplished already without giving a second’s thought to obtaining more links. And I’ve saved all that time and effort in quality back-link hunting…

Alan Bleiweiss
Alan Bleiweiss is a Forensic SEO audit consultant with audit client sites consisting of upwards of 50 million pages and tens of millions of visitors a month. A noted industry speaker, author and blogger, his posts are quite often as much controversial as they are thought provoking.

20 thoughts on “Sitemaps and Internal Links – the 10000 Page Test”

  1. Minor edit: in the article, in regard to individual facility pages, I state
    “the sitemap.xml files don’t include any links to any of the individual facility pages. Yet. So Google is essentially ignoring them, even though they are in the sitemap.xml files, because there are no internal site links to support them, while Yahoo thinks they’re perfectly valid pages that deserve indexing.”

    The first part of that should read:
    “So Google is essentially ignoring them because the only links to those pages are at the city level, most of them are very similar (duplicate content), and they’re not in the sitemap.xml files. Yet.”

    (Confused statements happen when I make changes to my massive articles at 3AM. )

      1. Thanks Ann. I really do review and re-edit my articles before submission. It’s just that I do that late at night mostly. And after six re-reads, I sometimes miss things.

    1. Great article, very interesting read…what’s the reason for not including individual facility links in the xml map right away? I’m doing something very similar but the site is much larger than any I’ve ever worked on so it’s presenting me with some challenges.

      And about yahoo showing that many pages indexed. Y! shows yellow pages has 47 million pages indexed while site: on G shows 680 K – but they only have 137 xml sitemaps (which cover < 137*50K= 6.85 million links). I don’t know what the actual number of pages is for YP, but 6.85 mil does sound low, and 47 mil indexed sounds v high.

      1. Thanks DiscoStu!

        This site is a step by step see how it goes project. Very rare to have such a great opportunity. It’s the only way to see how effective each step is.

        The whole Yahoo vs. Google showing X thing is a bit muddy. On the one hand, there could be X URLs in sitemap files. Google will then index all or some of them. They will then display all or some of THOSE in the results on any given day. It does go up and down when I check.

        Yahoo seems to give more value to links on pages regardless of a duplicate content issue.

        Then again, many times when I am reviewing client or competitor sites in the Yahoo link: view, I see pages that link back ONLY because of AdSense ads. Isn’t that sweet?

        Honestly though, I don’t know enough about exactly how either engine handles indexing vs. displaying in results vs. link factoring to speak more accurately on the subject, so I invite SEJ readers who have done granular testing on those topics to chime in…

  2. Very interesting post. I do wonder, however, how your test would have turned out if you hadn’t used a sitemap at all first – but had simply created the internal links first. If you’d done the state / city pages first, with the one sitewide footer link to the states page, without submitting a sitemap at all, would the results have basically been the same?

    Personally, I like sitemaps. It gives me a warm, fuzzy feeling to have a feeling of some modicum of control over what gets indexed, even if it’s a bogus feeling. :) But I just wonder if the test would have shown the same results even without one.

  3. Hi Alan,

    Great article and good to see straight “best practice” SEO working out :). One thing, you mention that you haven’t submitted the sitemap to Bing, why not use robots.txt to link to a sitemap of the various sitemaps, effectively auto-submitting everything?

    Would be great to see a progress report in a few months too.


  4. Great post Alan.

    It is definitely now the time to start optimizing for Bing too. You said you are also working on “inbound links” to, then why have you not linked to it from this article?

  5. Awesome post, really in-depth and very clear. I’m glad you went through the whole background and how you went through the experiment! Great information.

  6. @DazzlinDonna

    That’s a very good question. One reality is that it’s impossible to test on the exact same site, so I can only talk about other sites and how not having sitemap.xml files while still doing the funnel DOES work.

    As relates to this article, I needed to test in a big way whether a sitemap.xml file can get at least some pages indexed, especially for those sites that we don’t have the luxury of doing things like controlling the HTML based links in that way. (Sites that use AJAX, JavaScript, or off-page CSS for links can cause the search engines serious problems).

    @millerian, one of the sitemap files is referenced in the robots.txt file – I need to update that to include an index of sitemaps actually – (We’re rolling out a few others as well)…

    @db – I intentionally did not add the link to the site because I felt a direct link, even with a nofollow, would possibly imply ulterior motives on why I wrote the article.

  7. Excellent article. I’ll be sprucing up my sitemaps and internal linking tonight.

    Thanks for the motivation :)

  8. This was outstanding. I’ve taken the easy way out with clients when talking about sitemaps. I’ll say “The bigger the site the more a sitemap matters”. But I’m seeing that even small sites, especially if they have poor nav links, can benefit as well. thank you for giving me a new easy way out, as this article is now required reading for my clients :)

    1. Eric,

      I’m glad you find the article so helpful. I’m not so sure a non-SEO type would be able to understand some of it. One of my clients, a professional quilter, read it and said “I understand about as much of this as you probably do about my stitching. What’s SEO? What’s backlinks?…”

      Just sayin…

  9. Thanks for the research and posting. It was especially helpful to hear about loading multiple sitemaps, as this is more rare for the sites I’ve worked with. For Yahoo, the engine seems to have a higher threshold for duplicated content on pages and also has trouble accounting for 301s. Some of this was fixed when we all saw some big drops in Yahoo backlinks a few months back, but it still reports some inflated numbers.

  10. @Adam

    This was the first time I’ve needed to use multiple sitemaps. I’ve since needed to do so on other client projects as well. I can’t always control the stability of a client web server, so keeping each file small mitigates the risk when Google tries to fetch them. That’s why I stay ridiculously under the 50,000 URL limit.

  11. While it’s not the post I was originally looking for, it was worth the read as you have highlighted several common issues and how to work through them with a focus around increased conversions/CTR, not just rankings.

  12. @Alan

    Loved the article, great advice and insight from a true pro. A question I have for you: my blog articles do not have the word blog in the URL string, will Google still recognize them?
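
For anyone wanting to try the multiple-sitemap approach discussed in the comments above, here is a rough Python sketch of the mechanics: splitting a large URL list into files that stay well under the 50,000 URL limit, listing those files in an index of sitemaps, and pointing to that index from robots.txt. The 10,000-per-file figure, the file names, and the example.com URLs are assumptions made up for illustration; the article does not say what values were actually used.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000  # limit per sitemap file, as cited in the comments


def chunk_urls(urls, per_file=10_000):
    """Split a URL list into groups, each at or under the per-file limit."""
    if not 0 < per_file <= MAX_URLS_PER_FILE:
        raise ValueError("per_file must be between 1 and 50,000")
    return [urls[i:i + per_file] for i in range(0, len(urls), per_file)]


def build_sitemap_index(sitemap_urls):
    """Return a sitemap index (XML string) listing each sitemap file."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for url in sitemap_urls:
        entry = ET.SubElement(root, "sitemap")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(root, encoding="unicode")


# Hypothetical site and URL pattern, for illustration only.
base = "https://www.example.com"
urls = [f"{base}/facility/{n}/" for n in range(25_000)]

chunks = chunk_urls(urls)
sitemap_files = [f"{base}/sitemap-{i + 1}.xml" for i in range(len(chunks))]
index_xml = build_sitemap_index(sitemap_files)

# Line to add to robots.txt so crawlers can discover the index.
robots_line = f"Sitemap: {base}/sitemap-index.xml"

# Upper bound on URLs that 137 sitemap files could hold (see DiscoStu's comment).
capacity = 137 * MAX_URLS_PER_FILE
```

With these made-up numbers, 25,000 URLs end up in three sitemap files, and `capacity` works out to the 6.85 million figure quoted in the comments.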