
Five Steps to SEO-Friendly Site URL Structure

Some people say there is no such thing as SEO-friendly URL structure. They claim that search engines are perfectly capable of making sense of any type of URL and pretty much any URL structure. In most cases, the people who say this are web developers (just so you know, I love web devs).

I’ve noticed that sometimes web developers and SEOs live in two parallel universes, each with its own center of gravity. While web developers basically care about crawlability, site speed, and other technical things, SEOs are mostly focused on what constitutes their holy grail: website rankings and ROI.

Hence, what may be an OK site URL structure to a web dev could be a totally SEO-unfriendly URL architecture to an SEO manager.

What is an SEO-friendly URL structure?

First of all, let me start by saying that it is always better to call in an SEO manager early in the development stage, so that no hard-to-implement tweaks are needed afterwards.

From an SEO point of view, a site’s URL structure should be:

  • Straightforward: URLs with duplicate content should have canonical URLs specified for them; there should be no confusing redirects on the site, etc.
  • Meaningful: URL names should have keywords in them, not meaningless strings of numbers and punctuation marks.
  • With emphasis on the right URLs: SEO-wise, not all URLs on a site are of equal importance as a rule. Some even should be concealed from the search engines. At the same time, it is important to check that the pages that ought to be accessible to the search engines are actually open for crawling and indexing.

So, here is what one can do to achieve an SEO-friendly site URL structure:

1. Consolidate your www and the non-www domain versions

As a rule, there are two major versions of your domain indexed in the search engines, the www and the non-www version of it. These can be consolidated in more than one way, but I’d mention the most widely accepted practice.

Most SEOs (in my experience) use the 301 redirect to point one version of their site to the other.

Alternatively (for instance, when you can’t do a redirect), you can specify your preferred version in Google Webmaster Tools in Configuration >> Settings >> Preferred Domain. However, this has certain drawbacks:

  • This takes care of Google only.
  • This option is restricted to root domains only. If you have an example.wordpress.com site, this method is not for you.

But why worry about the www vs non-www issue in the first place? Thing is, some of your backlinks may be pointing to your www version, while some could be going to the non-www version.

So, to make sure that both versions’ SEO value is consolidated, it’s better to explicitly establish this link between the two (either via the 301 redirect, or in Google Webmaster Tools, or by using a canonical tag – I’ll talk about that one a bit further).
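
For the technically inclined, the 301 consolidation can be sketched in a few lines of Python. This is only a minimal illustration of the decision a server or framework makes: the host names are placeholders, the choice of www as the preferred version is an assumption, and `redirect_target` is a hypothetical helper, not part of any particular web server.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: the www version is the preferred one; swap the values if you prefer non-www.
PREFERRED_HOST = "www.example.com"
ALTERNATE_HOSTS = {"example.com"}


def redirect_target(url):
    """Return the URL a 301 redirect should point to, or None if no redirect is needed."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host in ALTERNATE_HOSTS:
        # Keep scheme, path, query and fragment; only the host changes.
        # Note: an explicit port in the original URL is dropped in this sketch.
        return urlunsplit(
            (parts.scheme, PREFERRED_HOST, parts.path, parts.query, parts.fragment)
        )
    return None


print(redirect_target("http://example.com/topic-name?x=1"))
print(redirect_target("http://www.example.com/topic-name"))
```

In practice the redirect is usually configured in the web server or CMS itself; the point here is only that every non-preferred URL maps to exactly one preferred counterpart.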

2. Avoid dynamic and relative URLs

Depending on your content management system, the URLs it generates may be “pretty” like this one:

www.example.com/topic-name

or “ugly” like this one:

www.example.com/?p=578544

As I said earlier, search engines have no problem with either variant, but for certain reasons it’s better to use static (prettier) URLs rather than dynamic (uglier) ones. Thing is, static URLs contain your keywords and are more user-friendly, since one can figure out what the page is about just by looking at the static URL’s name.

Besides, Google recommends using hyphens (-) instead of underscores (_) in URL names, since a phrase in which the words are connected using underscores is treated by Google as one single word, e.g. one_single_word is onesingleword to Google.
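
Here is a minimal Python sketch of how a “pretty”, hyphen-separated URL name could be derived from a page title. The `slugify` function is a hypothetical helper written for this example (most content management systems ship their own), and it assumes titles in plain ASCII.

```python
import re


def slugify(title):
    """Turn a page title into a lowercase, hyphen-separated URL name."""
    slug = title.lower()
    # Replace every run of characters that is not a letter or a digit
    # (spaces, punctuation, underscores) with a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", slug)
    # Drop hyphens left over at either end.
    return slug.strip("-")


print(slugify("Five Steps to SEO-Friendly Site URL Structure"))
print(slugify("one_single_word"))
```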

And, to check what other elements of your page should have the same keywords as your URLs, have a look at screenshot 3 of the “On-Page SEO for 2013: Optimize Pages to Rank and Perform” guide that we released recently.

Besides, some web devs make use of relative URLs. The problem with relative URLs is that they are dependent on the context in which they occur. Once the context changes, the URL may not work. SEO-wise, it is better to use absolute URLs instead of relative ones, since the former are what search engines prefer.
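
To see how a relative URL depends on its context, here is a small Python sketch using the standard library’s `urljoin`. The page addresses are made-up examples and `to_absolute` is just a hypothetical wrapper name.

```python
from urllib.parse import urljoin


def to_absolute(page_url, href):
    """Resolve a link found on page_url into an absolute URL."""
    return urljoin(page_url, href)


# The same relative link resolves differently depending on the page it sits on.
print(to_absolute("http://www.example.com/blog/post", "topic-name"))
print(to_absolute("http://www.example.com/shop/", "topic-name"))

# An absolute link stays the same wherever it is used.
print(to_absolute("http://www.example.com/blog/post", "http://www.example.com/topic-name"))
```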

Now, sometimes different parameters can be added to the URL for analytics tracking or other reasons (such as sid, utm, etc.). To make sure that these parameters don’t make the number of URLs with duplicate content grow out of control, you can do either of the following:

  • Ask Google to disregard certain URL parameters in Google Webmaster Tools in Configuration > URL Parameters.
  • See if your content management system allows you to consolidate URLs with additional parameters with their shorter counterparts.
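
If you handle this in your own code rather than in a CMS setting, the idea can be sketched in Python as follows. The parameter names below (sid and anything starting with utm_) are assumptions for illustration, and `strip_tracking` is a hypothetical helper; adjust the list to whatever your analytics setup actually adds.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed tracking parameters; extend as needed.
TRACKING_PARAMS = {"sid"}
TRACKING_PREFIXES = ("utm_",)


def strip_tracking(url):
    """Return the URL without tracking parameters, keeping all other parameters."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in TRACKING_PARAMS and not key.startswith(TRACKING_PREFIXES)
    ]
    # urlencode re-encodes the remaining values, so the escaping may be normalized.
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )


print(strip_tracking("http://www.example.com/topic-name?utm_source=news&sid=42&page=2"))
```

The shorter URL that comes out is the one you would point a redirect or a canonical URL at.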

3. Create an XML Sitemap

An XML Sitemap is not to be confused with the HTML sitemap. The former is for the search engines, while the latter is mostly designed for human users.

What is an XML Sitemap? In plain words, it’s a list of your site’s URLs that you submit to the search engines. This serves two purposes:

  1. This helps search engines find your site’s pages more easily;
  2. Search engines can use the Sitemap as a reference when choosing canonical URLs on your site.

The word “canonical” simply means “preferred” in this case. Picking a preferred (canonical) URL becomes necessary when search engines see duplicate pages on your site.

So, as they don’t want any duplicates in the search results, search engines use a special algorithm to identify duplicate pages and pick just one URL to represent the group in the search results. Other webpages just get filtered out.

Now, back to sitemaps … One of the criteria search engines may use to pick a canonical URL for the group of webpages is whether this URL is mentioned in the website’s Sitemap.

So, what webpages should be included into your sitemap, all of your site’s pages or not? In fact, for SEO-reasons, it’s recommended to include only the webpages you’d like to show up in search.
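
To give an idea of what such a file contains, here is a minimal Python sketch that builds a bare-bones XML Sitemap from a list of URLs. It uses only the required `urlset`, `url` and `loc` elements of the sitemaps.org format; the URLs are placeholders, and `build_sitemap` is a hypothetical helper, not the output of any particular Sitemap generator.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Build a minimal XML Sitemap (as a string) listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        url_element = ET.SubElement(urlset, "url")
        ET.SubElement(url_element, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Only the pages you would like to show up in search go into the list.
pages = [
    "http://www.example.com/",
    "http://www.example.com/topic-name",
]
print(build_sitemap(pages))
```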

4. Close off irrelevant pages with robots.txt

There may be pages on your site that should be concealed from the search engines. These could be your “Terms and conditions” page, pages with sensitive information, etc. It’s better not to let these get indexed, since they usually don’t contain your target keywords and only dilute the semantic whole of your site.

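The robots.txt file contains instructions for the search engines as to which pages of your site should be ignored during the crawl. Strictly speaking, this blocks crawling rather than indexing: to keep a page out of the search results, the noindex directive (a robots meta tag on the page itself) is used, and the page has to stay crawlable for that directive to be seen.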
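
A quick way to double-check which URLs a set of robots.txt rules closes off is Python’s standard `urllib.robotparser` module. The rules and URLs below are made-up examples, and `is_crawlable` is a hypothetical helper name; treat this as a sketch rather than a full audit tool.

```python
from urllib.robotparser import RobotFileParser

# Example rules (an assumption for this sketch, not taken from a real site).
ROBOTS_TXT = """\
User-agent: *
Disallow: /terms-and-conditions
Disallow: /private/
"""


def is_crawlable(robots_txt, url, user_agent="Googlebot"):
    """Check whether the given robots.txt rules allow user_agent to crawl url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(is_crawlable(ROBOTS_TXT, "http://www.example.com/topic-name"))
print(is_crawlable(ROBOTS_TXT, "http://www.example.com/terms-and-conditions"))
```

Running such a check against the pages that ought to rank is a cheap way to catch a rule that closes off more than intended.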

Sometimes, however, unsavvy webmasters use noindex on pages where it should not be used. Hence, whenever you start doing SEO for a site, it is important to make sure that no pages that should be ranking in search have the noindex attribute.

5. Specify canonical URLs using a special tag

Another way to highlight canonical URLs on your site is by using the so-called canonical tag. In geek speak, it’s not the tag itself that is canonical, but the value of the tag’s rel attribute, but we’ll just call it the canonical tag by metonymy.

Note: the canonical tag should be applied only with the purpose of helping search engines decide on your canonical URL. For redirection of site pages, use redirects. And, for paginated content, it makes sense to employ rel="next" and rel="prev" tags in most cases.

For example, on Macy’s website, I can go to the Quilts & Bedspreads page directly, or I can take different routes from the homepage:

  • I can go to Homepage >> Bed & Bath >> Quilts & Bedspreads. The following URL, with my path recorded, is generated:

http://www1.macys.com/shop/bed-bath/quilts-bedspreads?id=22748&edge=hybrid&cm_sp=us_catsplash_bed-%26-bath-_-row6-_-quilts-%26-bedspreads

  • Or I can go to Homepage >> For the Home >> Bed & Bath >> Bedding >> Quilts & Bedspreads. The following URL is generated:

http://www1.macys.com/shop/bed-bath/quilts-bedspreads?id=22748&edge=hybrid

Now, all three URLs lead to the same content. And, if you look into the code of each page, you’ll see the following tag in the head element:
[Screenshot: the canonical link tag in the page’s head element]

As you see, for each of these URLs, a canonical URL is specified, which is the cleanest version of all the URLs in the group:

http://www1.macys.com/shop/bed-bath/quilts-bedspreads?id=22748

This funnels the SEO value each of these three URLs might have down to one single URL that should be displayed in the search results (the canonical URL). Normally search engines do a pretty good job identifying canonical URLs themselves, but, as Susan Moskwa once wrote at Google Webmaster Central:

“If we aren’t able to detect all the duplicates of a particular page, we won’t be able to consolidate all of their properties. This may dilute the strength of that content’s ranking signals by splitting them across multiple URLs.”
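
To check which canonical URL a page declares, you can read the tag out of the page’s HTML. Below is a minimal Python sketch using the standard `html.parser` module. The sample markup is my assumption of what such a tag typically looks like (a link element with rel="canonical"), built around the Macy’s canonical URL quoted above; `CanonicalFinder` and `find_canonical` are hypothetical names.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> tag seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        rel = (attributes.get("rel") or "").lower()
        if tag == "link" and rel == "canonical" and self.canonical is None:
            self.canonical = attributes.get("href")


def find_canonical(html):
    """Return the canonical URL declared in the HTML, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Assumed sample markup for illustration.
sample = (
    "<html><head>"
    '<link rel="canonical" '
    'href="http://www1.macys.com/shop/bed-bath/quilts-bedspreads?id=22748" />'
    "</head><body></body></html>"
)
print(find_canonical(sample))
```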

Conclusion

Having an SEO-friendly URL structure on a site means having a URL structure that helps the site rank higher in the search results. While, from the point of view of web development, a particular site’s architecture may seem crystal-clear and error-free, for an SEO manager this could mean missing out on certain ranking opportunities.

Image credits: skyfish81 via Flickr, Arbyreed via Flickr, Jo Dooher via Flickr


Alesia Krush

Alesia is an SEO and a digital marketer at Link-Assistant.Com, a major SEO software provider and the maker of SEO PowerSuite tools. Link-Assistant.Com is a group of SEO professionals with almost a decade of SEO experience.


20 thoughts on “Five Steps to SEO-Friendly Site URL Structure”

  1. Thank you for the explanation of rel-canonical. If I understand it correctly, it is a message to the search engine saying “Choose me, I’m the one. Not the other ones pointing here. Just me, just me…”

    Is that right?

  2. Nice article, but I have to point out that only steps one, two and, to an extent, five are relevant for SEO-friendly site URL structure. While the Sitemap and robots.txt are key onsite optimization factors, they do not have much to do with URL structure.

    I am not 100% on this but URLs over 2000 characters are not really recommended and you might even get a “414 Request-URI Too Long” message from some servers. This is worth a read: http://support.google.com/webmasters/bin/answer.py?hl=en&answer=76329

    1. Hey, Saijo!

      Good point regarding URL length. It’s true that certain servers and browsers won’t handle URLs that are over 2000 characters long (see this clarification by Tom Boutell – http://www.boutell.com/newfaq/misc/urllength.html).

      As for Google, it simply says it doesn’t recommend “overly complex URLs, especially those containing multiple parameters”. As you see, no specific limit on the number of characters one can have in a URL is mentioned. But I’d imagine an SEO-friendly URL would hardly be hundreds of characters long anyway, as this wouldn’t be good for the user in the first place.

  3. Really great stuff for having a perfect URL structure for the website to rank better for the keywords. The canonical issue is one of the most important things to consider to avoid getting penalized by Google.

  4. Good post, Alesia. For an SEO-friendly website you also have to make sure that your site is search engine compatible. Apart from what you mention in your article, I would say avoid the use of web design techniques that would affect the capability of your site to be indexed and crawled by search engines, such as the use of Flash, JavaScript, frames, and so on.
    Let’s never forget about the importance of original and high quality content. Content is king!

    1. Thanks, Jose!

      You are spot on on frames, flash and JavaScript. But I thought I’d mention that, sometimes, the developer is put in a position when he/she MUST have them. In which case it’s not the end of the world, but I’d avoid using those in general.

  5. Good post, Alesia. Short, descriptive and keyword-optimized URLs really make a lot of sense, as they are user-friendly and the URLs themselves let users understand exactly what the page is about. Search engines give more importance to keywords in URLs when they check the relevancy of a page before ranking it higher in the SERPs for a related keyword, and this also has a fruitful impact on the users visiting the page.

  6. Nice post Alesia. Almost every website I’ve worked on or done an audit has some sort of URL or navigation issue covered above. I’ve seen SEOs over-optimize URLs based on #2 and not address other areas like #1 www and non-www domain URLs.

    It seems to me like programmers know about these issues but tend to ignore them since it’s not their job to do SEO or it’s not in the budget during a redesign. The result is these poor URLs have weeks or months to bake in search engines, requiring additional work to fix and may result in short term drops in search. Has anyone else experienced the same?

    -Sean

  7. People who are into SEO need guides like this especially if they don’t know which URL could be helpful in SEO. Static URL is highly recognized and preferred by bots.

  8. I was also unaware about the Canonical issue until reading your site. I’m very active in SEO with many sites which I will be looking at now from this angle. I haven’t been penalised yet so I assume my structure is ok, but it’s still very useful to know these things, Good post, thanks.

  9. I would say avoid the use of web design techniques that would affect the capability of your site to be indexed and crawled by search engines, such as the use of Flash, JavaScript, frames, and so on.

  10. Hi,

    Great article. I’m not an SEO specialist, but a Project Manager for a Mobile website, I was wondering if you may be able to help clarify something for me –

    We have a desktop site and a mobile site. They are both served under the www. domain instead of having a separate mobile-only domain like m., and in most cases the content served for a mobile device under a URL is the same as the content served under that URL for desktop devices.

    However, there is one URL on the desktop site that has been split across a few URLs for mobile user agents. These new URLs are used to separate content out onto separate pages and reflect different types of info for users, e.g. a Google map page, a text-heavy description page, a page for an image gallery. They exist as separate pages for both technical and UX reasons (e.g. the desktop URL is http://www.example/item-1234.html, but on the same URL for mobile the map is linked to on the page as a separate page with a new URL, http://www.example/map-view/item-1234.html).

    There are concerns from some people in the organisation regarding the SEO ramifications of splitting the original desktop URL and its content over several extra urls for mobile.
    Are these concerns valid?
    Will proper use of canonical tags prevent any issues with rankings and link juice?
    Any insights greatly appreciated

    Thanks
    Alistair
    Are there any meta tags I should be using to tell the mobile Googlebot that these extra URLs on my mobile site, when looked at together, have the same content as one URL on my desktop site? I can’t show all the same content as the desktop site on the mobile version of the page as there is not enough room. What should I do to avoid indexing and ranking problems?

  11. So am I right in thinking that it doesn’t matter how many levels the urls stretch to (within reason obviously), as long as the site is structured right, the urls say what they do and the products are named in the url?

    1. That’s right, Tom.
      It’s not really about some rules one must follow to a T.
      It’s more about user experience, and whether the search engines have no problem interpreting the URL structure of your site.

  12. Very basic SEO must-dos, but nonetheless very important, and things a lot of marketers forget! Thanks for the post. Canonical is the one most people don’t know about. It takes some work, but in my experience it has a high impact on results.