Five Steps to SEO-Friendly Site URL Structure

By Alesia Krush

Some people say there is no such thing as SEO-friendly URL structure. They claim search engines are perfectly capable of making sense of any type of URL and pretty much any URL structure. In most cases, the people who say this are web developers (just so you know, I love Web devs).

I’ve noticed sometimes web developers and SEOs live in two parallel universes, each with its own center of gravity. While web developers basically care about crawlability, site speed, and other technical things, SEOs are mostly focused on what constitutes their sacred grail: website rankings and ROI.

Hence, what may be an OK site URL structure to a web dev could be a totally SEO-unfriendly URL architecture to an SEO manager.

What is an SEO-friendly URL structure?

First of all, let me start by saying that it is always better to call in an SEO manager early in the development stage, so there is no need to make sometimes hard-to-implement tweaks afterwards.

From an SEO point of view, a site’s URL structure should be:

  • Straightforward: URLs with duplicate content should have canonical URLs specified for them; there should be no confusing redirects on the site, etc.
  • Meaningful: URL names should have keywords in them, not meaningless numbers and punctuation marks.
  • With emphasis on the right URLs: SEO-wise, not all URLs on a site are of equal importance. Some should even be concealed from the search engines. At the same time, it is important to check that the pages that ought to be accessible to the search engines are actually open for crawling and indexing.

So, here is what one can do to achieve an SEO-friendly site URL structure:

1. Consolidate your www and non-www domain versions

As a rule, there are two major versions of your domain indexed in the search engines: the www and the non-www version. These can be consolidated in more than one way, but I’ll mention the most widely accepted practice.

Most SEOs (in my experience) use a 301 redirect to point one version of their site to the other.

Alternatively (for instance, when you can’t do a redirect), you can specify your preferred version in Google Webmaster Tools in Configuration >> Settings >> Preferred Domain. However, this has certain drawbacks:

  • This takes care of Google only.
  • This option is restricted to root domains only. If you have an example.wordpress.com site, this method is not for you.

But why worry about the www vs non-www issue in the first place? Thing is, some of your backlinks may be pointing to your www version, while some could be going to the non-www version.

So, to make sure that both versions’ SEO value is consolidated, it’s better to explicitly establish this link between the two (either via the 301 redirect, or in Google Webmaster Tools, or by using a canonical tag – I’ll talk about that one a bit further).
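To make the redirect idea concrete, here is a minimal sketch of the decision a 301 rule makes. It assumes the non-www host is the preferred one and uses a hypothetical helper name, redirect_target; in practice this logic usually lives in the web server configuration rather than in application code.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit

# Assumption: the non-www host is the preferred version of the site.
PREFERRED_HOST = "example.com"


def redirect_target(url: str) -> Optional[str]:
    """Return the URL a 301 redirect should point to, or None if none is needed."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host == "www." + PREFERRED_HOST:
        # Keep the scheme, path, query and fragment; only the host changes.
        # Note: any explicit port in the original URL is dropped in this sketch.
        return urlunsplit(
            (parts.scheme, PREFERRED_HOST, parts.path, parts.query, parts.fragment)
        )
    return None


print(redirect_target("http://www.example.com/topic-name"))
print(redirect_target("http://example.com/topic-name"))
```

The first call returns the non-www address to redirect to; the second returns None because the URL is already on the preferred host.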

2. Avoid dynamic and relative URLs

Depending on your content management system, the URLs it generates may be “pretty” like this one:

www.example.com/topic-name

or “ugly” like this one:

www.example.com/?p=578544

As I said earlier, search engines have no problem with either variant, but for certain reasons it’s better to use static (prettier) URLs rather than dynamic (uglier) ones. Thing is, static URLs contain your keywords and are more user-friendly, since one can figure out what the page is about just by looking at the static URL’s name.

Besides, Google recommends using hyphens (-) instead of underscores (_) in URL names, since a phrase in which the words are connected using underscores is treated by Google as one single word, e.g. one_single_word is onesingleword to Google.
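If your CMS lets you generate URL names yourself, the idea can be sketched roughly like this. The slugify function below is a hypothetical helper, not part of any particular CMS; it lowercases a title and joins the words with hyphens, converting underscores along the way.

```python
import re
import unicodedata


def slugify(title: str) -> str:
    """Turn a page title into a lowercase, hyphen-separated URL name."""
    # Convert accented characters to their closest ASCII form where possible.
    ascii_title = (
        unicodedata.normalize("NFKD", title).encode("ascii", "ignore").decode("ascii")
    )
    # Replace every run of characters that are not letters or digits
    # (spaces, punctuation, underscores) with a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_title.lower())
    # Drop hyphens left over at the start or end.
    return slug.strip("-")


print(slugify("Five Steps to SEO-Friendly Site URL Structure"))
print(slugify("one_single_word"))
```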

And, to check what other elements of your page should have the same keywords as your URLs, have a look at screenshot 3 of the “On-Page SEO for 2013: Optimize Pages to Rank and Perform” guide that we released recently.

Besides, some web devs make use of relative URLs. The problem with relative URLs is that they are dependent on the context in which they occur. Once the context changes, the URL may not work. SEO-wise, it is better to use absolute URLs instead of relative ones, since the former are what search engines prefer.
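Here is a small illustration of that context dependence, using Python's standard urljoin function. The page addresses are made up for the example: the same relative link resolves to two different absolute URLs depending on the page it appears on.

```python
from urllib.parse import urljoin

# Hypothetical page addresses, used only for illustration.
page_a = "http://www.example.com/blog/topic-name"
page_b = "http://www.example.com/blog/2013/topic-name"

relative_link = "other-topic"

# The same relative link points to different places on different pages.
print(urljoin(page_a, relative_link))
print(urljoin(page_b, relative_link))

# An absolute URL stays the same regardless of where it is used.
absolute_link = "http://www.example.com/blog/other-topic"
print(urljoin(page_b, absolute_link))
```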

Now, sometimes different parameters can be added to the URL for analytics tracking or other reasons (such as sid, utm, etc.). To keep these parameters from multiplying the number of URLs with duplicate content, you can do either of the following:

  • Ask Google to disregard certain URL parameters in Google Webmaster Tools in Configuration > URL Parameters.
  • See if your content management system allows you to consolidate URLs that carry additional parameters with their shorter counterparts.
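If you end up doing this consolidation yourself, the core of it can be sketched as a function that removes tracking parameters and keeps everything else. This is only a sketch: strip_tracking is a hypothetical helper, and the list of parameter names (sid and anything starting with utm_) is an assumption you would adjust to your own site.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: these are the parameters that never change the page content.
TRACKING_NAMES = {"sid"}
TRACKING_PREFIXES = ("utm_",)


def strip_tracking(url: str) -> str:
    """Return the URL without tracking parameters, keeping all other parameters."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name not in TRACKING_NAMES and not name.startswith(TRACKING_PREFIXES)
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )


print(strip_tracking("http://www.example.com/topic-name?utm_source=news&sid=42&page=2"))
```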

3. Create an XML Sitemap

An XML Sitemap is not to be confused with the HTML sitemap. The former is for the search engines, while the latter is mostly designed for human users.

What is an XML Sitemap? In plain words, it’s a list of your site’s URLs that you submit to the search engines. This serves two purposes:

  1. This helps search engines find your site’s pages more easily;
  2. Search engines can use the Sitemap as a reference when choosing canonical URLs on your site.

The word “canonical” simply means “preferred” in this case. Picking a preferred (canonical) URL becomes necessary when search engines see duplicate pages on your site.

So, as they don’t want any duplicates in the search results, search engines use a special algorithm to identify duplicate pages and pick just one URL to represent the group in the search results. Other webpages just get filtered out.

Now, back to sitemaps … One of the criteria search engines may use to pick a canonical URL for the group of webpages is whether this URL is mentioned in the website’s Sitemap.

So, what webpages should be included in your sitemap: all of your site’s pages or not? In fact, for SEO reasons, it’s recommended to include only the webpages you’d like to show up in search.
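For a sense of what such a file looks like, here is a minimal sketch that builds a Sitemap from a list of URLs with Python's standard XML library. The build_sitemap helper and the example URLs are made up for illustration; most content management systems and plugins generate this file for you.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return an XML Sitemap (as a string) listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        url_element = ET.SubElement(urlset, "url")
        # Special characters such as & are escaped automatically.
        ET.SubElement(url_element, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Only list the pages you would like to show up in search.
pages = [
    "http://www.example.com/",
    "http://www.example.com/topic-name",
]
print(build_sitemap(pages))
```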

4. Close off irrelevant pages with robots.txt

There may be pages on your site that should be concealed from the search engines. These could be your “Terms and conditions” page, pages with sensitive information, etc. It’s better not to let these get indexed, since they usually don’t contain your target keywords and only dilute the semantic whole of your site.

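The robots.txt file contains instructions for the search engines as to what pages of your site should be ignored during the crawl. Note that blocking a page from crawling is not the same as giving it a noindex attribute: noindex is set on the page itself (for example, in a robots meta tag), and it is the noindex attribute that keeps a page out of the search results.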
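To see which URLs a given set of robots.txt rules closes off from crawling, you can test them with Python's standard robots.txt parser. The rules and URLs below are hypothetical examples, and is_crawlable is just a convenience wrapper for this sketch.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules, used only for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /terms-and-conditions
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_crawlable(url: str, user_agent: str = "*") -> bool:
    """Return True if the rules above allow the given crawler to fetch the URL."""
    return parser.can_fetch(user_agent, url)


print(is_crawlable("http://www.example.com/terms-and-conditions"))
print(is_crawlable("http://www.example.com/topic-name"))
```

Running a check like this over the pages you expect to rank is a quick way to catch a rule that blocks more than it should.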

Sometimes, however, unsavvy webmasters use noindex on pages where it should not be used. Hence, whenever you start doing SEO for a site, it is important to make sure that no pages that should be ranking in search have the noindex attribute. Or else you may end up like this guy here:

5. Specify canonical URLs using a special tag

Another way to highlight canonical URLs on your site is by using the so-called canonical tag. In geek speak, it’s not the tag itself that is canonical, but the tag’s parameter, but we’ll just call it the canonical tag by metonymy.

Note: the canonical tag should be applied only with the purpose of helping search engines decide on your canonical URL. For redirection of site pages, use redirects. And, for paginated content, it makes sense to employ rel="next" and rel="prev" tags in most cases.

For example, on Macy’s website, I can go to the Quilts & Bedspreads page directly, or I can take different routes from the homepage:

  • I can go to Homepage >> Bed & Bath >> Quilts & Bedspreads. The following URL, with my path recorded, is generated:

http://www1.macys.com/shop/bed-bath/quilts-bedspreads?id=22748&edge=hybrid&cm_sp=us_catsplash_bed-%26-bath-_-row6-_-quilts-%26-bedspreads

  • Or I can go to Homepage >> For the Home >> Bed & Bath >> Bedding >> Quilts & Bedspreads. The following URL is generated:

http://www1.macys.com/shop/bed-bath/quilts-bedspreads?id=22748&edge=hybrid

Now, all three URLs lead to the same content. And, if you look into the code of each page, you’ll see the following tag in the head element:
[Screenshot: the canonical link tag in the page’s head element]

As you see, for each of these URLs, a canonical URL is specified, which is the cleanest version of all the URLs in the group:

http://www1.macys.com/shop/bed-bath/quilts-bedspreads?id=22748

What this does is funnel the SEO value each of these three URLs might have down to one single URL that should be displayed in the search results (the canonical URL). Normally search engines do a pretty good job identifying canonical URLs themselves, but, as Susan Moskwa once wrote at Google Webmaster Central:

“If we aren’t able to detect all the duplicates of a particular page, we won’t be able to consolidate all of their properties. This may dilute the strength of that content’s ranking signals by splitting them across multiple URLs.”
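If you want to check which canonical URL a page declares, a rough sketch with Python's standard HTML parser could look like this. The CanonicalFinder class and find_canonical function are hypothetical names, and the sample markup is my own reconstruction of what such a tag typically looks like, not a copy of the code on Macy's pages.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> tag it sees."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        # rel may hold several space-separated values, so split before checking.
        rel_values = (attributes.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel_values and self.canonical is None:
            self.canonical = attributes.get("href")


def find_canonical(html: str):
    """Return the canonical URL declared in the HTML, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Assumed sample markup, for illustration only.
sample = (
    "<html><head>"
    '<link rel="canonical" '
    'href="http://www1.macys.com/shop/bed-bath/quilts-bedspreads?id=22748" />'
    "</head><body></body></html>"
)
print(find_canonical(sample))
```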

Conclusion

Having an SEO-friendly URL structure on a site means having a URL structure that helps the site rank higher in the search results. While, from the point of view of web development, a particular site’s architecture may seem crystal-clear and error-free, for an SEO manager this could mean missing out on certain ranking opportunities.

Image credits: skyfish81 via Flickr, Arbyreed via Flickr, Jo Dooher via Flickr