Mar 09 2010

W3C Validation for SEO – Myth and Reality


A topic that has been discussed extensively in the SEO community is whether having a W3C compliant web site is critical to SEO. Most recently, Edward Lewis and I debated back and forth on Twitter, with Edward holding firm to his long-standing position that compliance is necessary and me taking the stand that 100% compliance is not necessarily a factor in ranking.

This latest back and forth was prompted by a tweet I posted while listening to Matt Cutts being interviewed by WebProNews during SMX West. In that tweet, I summed up what Matt had said regarding SEO and having a site that passes code validation. My tweet stated:

W3C compliance is NOT an SEO factor to Google #MattCuttsQuote #SMX

Edward then followed up on our Twitter conversation by writing a very lengthy and quite detailed article backing up his position on the matter. I’ll leave it up to you to take the time to read that article yourselves. Here, I would like to speak to my own position and the logic behind it.

How Much Compliance Is Important?

If the goal is to ensure that a client site has the best possible chance at its highest organic rankings, then we need to acknowledge that even just factoring in Google, there are about 200 indicators to consider. Now, I don’t know about you, but here’s the reality. With 200 factors to consider, I have a responsibility to my clients to focus on those factors that I believe will yield the most results for the investment of time and resources.

Because of this, I do not check client web sites for 100% W3C compliance as part of my audits.  Even though having a site 100% compliant with W3C standards is a best practice concept, I am not a web developer.  And I have not been hired to ensure that W3C compliance is being met in every way possible.

So if a page has an opening paragraph tag without its subsequent closing tag, I don’t ultimately care much about documenting such findings.

Sure, it might only take me a few minutes to run compliance tests. But as Edward is so capable of doing, documenting a site’s shortcomings at that level could just as easily take hours if you’re going to include details on how to resolve those specific problems. Multiply that by cross-browser testing work, because we all know perfectly well that not one single web browser truly complies 100% with W3C standards either.

So just because a site is 100% compliant doesn’t even mean it’s going to be properly displayed across web browsers. And that means compliance isn’t so straightforward either.

Maximized Return On Recommendations

There are so many other fish to fry that I need to use my time wisely. Telling clients that their site is not properly validating due to some P tag problem might earn me a pat on the back for being thorough. Yet if they fix that issue, I do not believe it’s going to have enough of a positive impact on their SEO compared to focusing on any one of the dozens of other action items I usually come up with that are direct SEO issues.

Competitive Reality

Here’s where the two camps differ.  In one camp, if a site is not 100% compliant, it’s not a truly optimized site.  In the other camp, if a site, when held up to the competitive landscape, is just as fast, and has just enough compliance aspects to get by, then an SEO focus is better spent on quality content depth, site architecture from a content topic relationship perspective, internal and external link depth and relationships.

Limited Budgetary Resources

How many clients do you have where the budget for web initiatives is unlimited? Maybe I don’t work with the right clients, because none of my clients has EVER fit that bill. Instead, they’ve got specific budgetary parameters within which they can work. And if a 5,000 page site has to have its entire dynamically generated URL structure rebuilt from the ground up in order to address just the Page Title to URL relationship for SEO, I guarantee you that this task is infinitely more critical, being specific to SEO, than ensuring there’s a proper closing to a paragraph tag.

No Disrespect

Now this isn’t about disrespect to people who believe that strict W3C compliance is important.  Let’s face it – while most of us in the SEO industry know that we can make use of image alternate attributes, personally, I make sure I’m very careful in clarifying to clients that making use of them is first and foremost an issue of providing content to visually impaired people that helps them understand the purpose of an image they may not see.  With that caveat in place, I then go on to communicate how this is an opportunity for SEO because the search engines interpret images on a page to be one indicator as to the purpose of that page.

So if I have to choose between instructing my clients to work on the alternate attributes of their site’s photos or alternately, making sure paragraph tags are closed, given those aforementioned budgetary limits, I’m always going to go with the one that’s going to help the search engines learn more about the page’s focus or purpose.  Whether a paragraph tag is closed or not does not help or hurt this either way.
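For anyone who wants to put the alternate attribute recommendation into practice, here is a minimal sketch of how a quick check might look, using only Python's standard library. The class name `AltAudit`, the function `images_missing_alt`, and the sample markup are all made up for illustration, and this is not a description of how any search engine reads a page. Keep in mind that an empty alt is legitimate for purely decorative images, so the result is a list to review, not a list of errors.

```python
from html.parser import HTMLParser


class AltAudit(HTMLParser):
    """Collects the src of every <img> whose alt is missing or blank."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr_map = dict(attrs)
        alt = attr_map.get("alt")
        # alt is None when the attribute is absent entirely
        if alt is None or not alt.strip():
            self.missing.append(attr_map.get("src", "(no src)"))


def images_missing_alt(html):
    """Return the src values of images to review (hypothetical helper)."""
    parser = AltAudit()
    parser.feed(html)
    parser.close()
    return parser.missing


# Note the unclosed <p>: the lenient parser still finds every image.
sample = (
    '<p>Intro<img src="a.jpg" alt="Tuna can">'
    '<img src="b.jpg"><img src="c.jpg" alt="">'
)
print(images_missing_alt(sample))  # ['b.jpg', 'c.jpg']
```

The sample deliberately leaves its paragraph tag open. The check still works, which is the kind of trade-off described above: the missing alt text is the actionable finding, not the unclosed tag.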

W3C Standards For SEO

In addition to the image alternate attribute, there are several other standard HTML tags that, when properly used (in accordance with W3C guidelines for valid coding of web pages), are clearly SEO best practices as well. Whether it’s proper use of header tags, bolding, bullet point implementation, or any one of dozens upon dozens of HTML elements, it’s clear that a site IS better off from an SEO perspective when it meets those standards.
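As a rough illustration of checking header tag usage, the sketch below pulls the heading outline out of a page with Python's standard library. The names `HeadingOutline`, `heading_outline`, and `has_h1`, along with the sample markup, are hypothetical and exist only for this example. It assumes headings are not nested inside one another, and it says nothing about how a search engine weighs headings.

```python
from html.parser import HTMLParser


class HeadingOutline(HTMLParser):
    """Records (tag, text) for each h1-h6 element, in document order."""

    HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.outline = []
        self._current = None  # [tag, text so far] while inside a heading

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADINGS:
            self._current = [tag, ""]

    def handle_data(self, data):
        if self._current is not None:
            self._current[1] += data

    def handle_endtag(self, tag):
        if self._current is not None and tag == self._current[0]:
            self.outline.append((tag, self._current[1].strip()))
            self._current = None


def heading_outline(html):
    """Return the list of (tag, text) headings found in the markup."""
    parser = HeadingOutline()
    parser.feed(html)
    parser.close()
    return parser.outline


def has_h1(html):
    """True if the markup contains at least one h1 heading."""
    return any(tag == "h1" for tag, _ in heading_outline(html))


# A page title styled with a span instead of a real heading element.
sample = (
    '<span class="title">Our Products</span>'
    "<h2>Tuna</h2><p>Text</p><h2>Salmon <em>fillets</em></h2>"
)
print(heading_outline(sample))
print(has_h1(sample))
```

In the sample, the page title sits in a styled span, so the outline contains only the two h2 entries and `has_h1` reports no h1 at all. A validator would accept that markup, which is why a check like this is a separate exercise from validation.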

Heck, if a site is so botched up in the validation process, it’s even possible that Google won’t even be able to index it.

The Elephant In The Room We Can’t Ignore

Any discussion about validation and SEO must, by the nature of this arena, address the primary reason 100% validation is not a “have to”. It’s one that validation evangelists refuse to acknowledge as a serious consideration.

There are many millions of sites that already exist,  and millions more being deployed all the time, that are NOT 100% compliant. I’m not here to debate the cause of that or rail against anyone who might have caused it.  Heck.  My own sites fail to meet complete validation.  Because, as stated earlier, I am NOT a developer.  I use OFF THE SHELF solutions.  Programmed by other people.  Because that is what I felt was justifiable in costs to create my own sites.

And that’s just reality.

If any of the top search engines were to put more emphasis on W3C compliance than on those aspects that help the engine learn the purpose of a site, indicate the importance of specific content, or help indicate 3rd party verification of a specific site’s authority, then the SERPs would be spitting out even more garbage than they do now. Honestly.

At The End Of The Day

That’s what it really comes down to at the end of the day.  Because even if they DID say compliance is a factor, they could NEVER say it’s more important than those factors that are currently most important.  Because at the end of the day, if two sites are both fully compliant, that’s a one-time thing.  Either they are compliant, or they’re not.  But if two sites have different depth of content, the amount on one site or the other can change at any time.  And the relationships between pages can continually change. And the number of other sites that mention or link to either can change.  Frequently.

And even if I tell my clients they have to get their sites to validate, in cases other than ones similar to those I describe here that are clearly specific to SEO, that validation is NOT going to move my client’s site up in the SERPs.  Because while they were busy revamping their P tags, sixty of their competitors were adding new content.  Or building quality inbound links.

Yeah, But

Okay so I just said that ensuring a site validates for things that are not specific to current SEO isn’t necessary. Well, if that’s the case, then why is half the SEO industry freaking out over page speed?

Clearly, on the surface, page speed has nothing to do with SEO. Except Google has given enough indication now that we are recommending page speed issues be addressed. Because Google has changed their tune. They say that a faster page load is providing a better user experience. Well surely then, if that’s the case, what’s to stop them from saying that 100% validation is also providing a better user experience?

Simple. It’s that pesky P tag.

Let me know when that changes.

Alan Bleiweiss has been an Internet professional since 1995, managing client projects valued at upwards of $2,000,000.00.  Just a few of his most notable clients through the years have included PCH.com, WeightWatchers.com, and Starkist.com.  Follow him on Twitter @AlanBleiweiss , read his blog at Search Marketing Wisdom, and be sure to read his column here at SearchEngineJournal.com the 2nd and 4th Tuesday each month.

  • w3cvalidation
    Nice information, I really appreciate the way you presented.
  • It is nice to have perfect code but not necessary as the author says. It is the declaration that tells Google it is up to date and that implies that the page will work in the browser. Whether it does or not does not matter. However the user experience can be so bad you will want to have good code anyway.
  • W3C compliance is an SEO factor. I had a client in Mexico whose website's PHP code was terrible. We changed everything and rebuilt the website from scratch, but we kept the same URLs. The same day we changed the code, his visits jumped 500% and stayed there. Later I had a problem with one of his managers and we stopped working with them. After that they changed back to the old code, and now they have very few visits.
    This proved to me that clean code does matter.
  • Rafael,

    Your experience further proves that "clean code" matters, not validation.

    Alan
  • I'd hate to miss that one little thing that was invalid that Google did come to a full stop on and miss a ranking that I could have had, if I had simply done a little more. Furthermore, from a troubleshooting perspective, it feels nice to check off one more thing that if it doesn't help will at least prevent things from getting worse
  • Brett,

    If you feel the need to be that micro-focused it's perfectly valid for you to do so. All of this comes down to risk/reward decisions at every step.
  • First of all, thank you for bringing this up with such a well presented post!

    I struggle with this as a both a developer and SEO, and I'm going to say that, with the exception of certain glaringly bad cases, I pretty much agree -

    Simply said: Just because the code VALIDATES doesn't necessarily mean it's SEMANTIC and ready to be best understood by the extracting powers that be.

    You touched on this when you mention what W3C standards *are* beneficial - and the thing is, the validator doesn't tell you to USE those elements; it only flags you when you've typed them out wrong.

    I can't count how many times I've rewritten VALID code because they used span tags where h2's should be or p tags with line breaks for items more suited to an unordered list. Why? Because semantic markup that correctly describes the content between the tags is of much greater benefit than a poorly nested div is a detriment.

    Like you said, it's about the best return you can provide on the time and budget given. When the main goal is return on SEO investment, fighting the phantom unescaped ampersand isn't going to win my attention when I have 200 or even 20 pages of content with ugly links and no H1 tags.
  • Uh, that should, of course, have said "Casse", not "Cassie" but then I'm not all that swift on the uptake. So please, Casse, forgive me for that.
  • You're forgiven.

    ...this time :)
  • Cassie,

    OMG yes! CSS used instead of H1 tags, CSS based links where the URL is even in the off-page style sheet - two of the most common examples I deal with a LOT!
  • I agree with you Alan. Although validation is useful, there are so many other factors that are more relevant than whether the page validates or not.
  • Google webmaster guidelines say that W3 validation is not necessary to rank but encourage it
  • Studies on SEO’s 16 years and never felt that there is something to learn. I worked and still work a part-time as a freelancer on RentACoder (User Creative Trend, where currently 52) to many optimization projects, most of the content. Personally, I think it’s a complex puzzle, made of many small pieces, interdependent. I thought to write a short guide that summarized all these pieces, which are necessary in a comprehensive SEO plan. Being late, it could be forgotten a few items, so please, contribute and correct me. http://stylishfirst.com/seo-plan-completed/
  • I was just checking out your link and it wasn't showing up, the only thing showing up was the Adwords, the whole page was black. Just trying to help.
  • hahaha now that you showed me that I start to see how it's hypocritical of Google to push a standard when they're not even following the one they're pushing.
  • Michael - where do you see the HTML 5 doctype? When I view source on the home page all I see is . It's not surprising they have it though. They determined they're going to push 5 through no matter what, and no matter who else jumps on board or not. Kind of like when Microsoft was the first big browser company to decide they were going to go rogue on compliance with W3C, thinking that others would have no choice but to join them.
  • Alan,

    I didn't see it in the code either but with the W3C automatic detection it went with the HTML5 Doctype

    http://validator.w3.org/check?uri=www.google.co...
  • Well Michael,

    It is ironic that they themselves don't validate. Yet Google never does what they say we should do, unless that happens to fit their business profit model.
  • Google not validating is old news, as is pretty much every well-ranking site not validating - ironically the few sites that do COMPLETELY validate even under Strict generally don't rank :)

    More interesting is that Google.com has an HTML5 Doctype - anyone know how long that's been, since it's still not an "official" standard yet?
  • Next Matt Cutts will do a Google W3C report card

    46 W3C errors & 2 warnings under its declaration of being HTML5 doctype
  • Rob,

    I'm sorry to learn you feel that way. It clearly shows you don't grasp the importance of...

    Oh. Wait.

    I'm Alan, not Edward.

    Nevermind. :-)
  • Hi Alan, I have to agree that while it may be desirable to be 100% compliant I just can't see it being worth the effort for most sites. I can build a site that still serve the needs of users who cannot browse the web in the usual ways the average person can, without being 100% compliant. In a world of scarce resources to work on a site I'd much rather put my efforts into driving traffic to my site and making that site more usable (and higher converting) for the vast majority of users rather than chasing down those last few compliance issues that really don't affect anyone anyway. Honestly, until it either has a material impact on my business or in some way affects the usability of the site, 100% compliance just isn't that big of a priority.
  • I´m completely agreed with you Alan, one of the most important factors that makes a successfully website is the user's browsing experience and a site that complies with the standards of the W3C code validation is a great opportunity to get to display the entire contents of a web page at the largest possible number of users regardless of operating system, browser or device they use.
    Thanks for your articles and sorry about my poor english, i´m spanish, see you.
  • Anuncios

    no need to be concerned about your English - your comments are appreciated!
  • Hi Alan,

    I love this topic and I completely agree - if code validation was a factor, 90% of the sites on the internet would fail and in some cases, they have to use "invalid" code just to get the site to display in all the browsers. That said, I've seen some sites that are obviously broken to some degree in the eyes of Google - I'm basing this on what seems to be indexed. I've seen pages where Google did not appear to be able to read their meta tags properly or that won't come up for the text on the page even in an exact match search. While valid code may not be a factor in the algorithm, it's like the robots.txt file. If you screw it up, it doesn't matter how many other factors you're compliant with - Google can't see it and you're not in the game.

    Just sayin'
  • Marjory,

    Not seeing a page come up in the SERPs for a specific phrase or sentence can be an indicator that there's a code level problem. Yet given the complexity of the ranking process it could also be due to other factors, such as penalties incurred in other ways.

    In my audits, I do as much random testing of individual pages as I feel is appropriate given the overall scope of my tasking. If that testing doesn't reveal validation errors that I think might cause pages to not be indexed, I leave it up to my later review of the total volume of pages indexed for that site compared to the total number of pages that exist.

    At that point I go ahead and look for reasons why that might be and typically find several other causes beyond validation.

    More often than not, if there's a coding problem serious enough to cause the issue you describe, it's going to be present in enough pages on a site that my methods will uncover it, given that most of my client sites are template driven.
  • Mike


    Would I like to see every client site validate? Yes. Clearly a properly validating site ensures better markup.

    But a 5,000 page web site that is maintained by multiple departments across a corporate organization, where entire sections of the site have been implemented using different 3rd party solutions, is definitely a pain to deal with regarding validation.

    Adam,
    As I've said elsewhere, yes - certain aspects of W3C compliance do come into play. Yet not all do. So compliance as far as those factors that DO is what I care most about.
  • You forgot to introduce the fact that when a page isn't W3 compliant often browsers go into debugger mode making the pages load slower. Slow loading pages equate to higher bounce rates by both users, and by limitation bots under certain circumstances of time spent on page downloading. While by definition SEO is for the search engines, ultimately it's for the conversion of getting people through your site. If a site isn't usable even the best SEO isn't going to help you. If there's no budget for repairs then absolutely there's not much point other than to at the very least do the due diligence to inform the client that the developer that built their site didn't take the time to check their code. While I can appreciate Google says W3 compliance isn't a requirement it does come into play at times, and we shouldn't downplay that. Let's say you didn't close an h1 tag. It could in theory get your site flagged for spam, thus dropping your ranking. It is for the above reasons that I believe W3 compliance IS in fact important, but of course not as important as some other factors like title tags, domains, headers, content
  • W3C validation is just good practice as mentioned, and it isn't a pain at all once you're experienced developing within the compliance guidelines. I believe the underlying benefit to compliance really comes down to semantics. A W3C valid website is often (but not always) a more semantically correct site, thus adding to its SEO value. The benefits of a semantically structured website, I'm sure we can all agree on.
  • I agree with you Alan, I've seen sites that are not W3C compliant above others, which are compliant, numerous times in the Google rankings
  • I honestly think that SEOs in the 'your website must be 100% W3C compliant' camp are stunningly ignorant on how information retrieval works. Search engine crawlers do NOT render a page - they retrieve the HTML code and parse it, but they never render it as a browser does. For SEO purposes the HTML code needs to be suitable for parsing, NOT for rendering in a browser. Thus W3C compliance really isn't an issue. The code just needs to be clean enough for a search engine to parse it and distinguish content, navigation, and style.
  • I also agree with Alan, actually all Google's product pages are not validated. It's about time we take what we read online with a pinch of salt - especially regarding SEO.
  • I've always wondered about that. Thanks for finally clearing that out - fully :)
  • Jacob

    I don't know that I've FULLY cleared that up, but hopefully I've cleared it up ENOUGH :-)
  • i totally agree with Alan, After all google home page is not validated :)
  • LOL Google home page not meeting validation is something we do get a good laugh over. But seriously - that only throws 40 errors. Which is pretty close to being valid. :-)
  • Interesting summary and I tend to agree with your view. It would be surprising to think that Google would run a full compliance check of all websites rather than just some standard settings, but I guess the underlying issue is the usability and functionality of a site - which can include load times - and so website should aim for compliance anyway, or as close to it, as possible.
  • Clive, I definitely agree that usability and functionality are vital to a web site's success. And to the degree that I believe these are factors, I will include these in my audit review. However the degree to which they matter is what is really the core of this debate. Anyone who claims anything less than 100% validation will definitely harm a web site is being myopic in their position.
  • Great article. W3C compliance can be a pain. It's good to know that sites are ok if they are not 100% W3C compliant. Thanks!
  • Jasmine,

    Yes - as long as it's understood that some aspects of the W3C standards do directly affect SEO, we're okay. Whether a paragraph has a closing "p" tag or not, or if you use Tables instead of CSS Divs however, are just two examples of when compliance is not a factor. :-)