W3C Validation for SEO – Myth and Reality

A topic that has been discussed extensively in the SEO community is whether having a W3C compliant web site is critical to SEO. Most recently, Edward Lewis and I debated back and forth on Twitter, with Edward holding firm to his long-standing position that compliance is necessary and me taking the stand that 100% compliance is not necessarily a factor in ranking.

This latest back and forth was prompted by a tweet I posted while listening to Matt Cutts being interviewed by WebProNews during SMX West. In that tweet, I summed up what Matt had said regarding SEO and having a site that passes code validation. My tweet stated:

W3C compliance is NOT an SEO factor to Google #MattCuttsQuote #SMX

Edward then followed up to our Twitter conversation on this by writing a very lengthy and quite detailed article backing up his position on the matter.  I’ll leave it up to you to take the time to read that article yourselves.   And here, I would like to speak to my own position and the logic behind it.

How Much Compliance Is Important?

If the goal is to ensure that a client site has the best possible chance at its highest organic rankings, then we need to acknowledge that even just factoring in Google, there are about 200 indicators to consider. Now, I don’t know about you, but here’s the reality. With 200 factors to consider, I also have just as much of a responsibility to my clients to focus on those factors that I believe will yield the most results for the investment of time and resources.

Because of this, I do not check client web sites for 100% W3C compliance as part of my audits.  Even though having a site 100% compliant with W3C standards is a best practice concept, I am not a web developer.  And I have not been hired to ensure that W3C compliance is being met in every way possible.

So if a page has an opening paragraph tag without its subsequent closing tag, I don’t ultimately care as much to document such findings.

Sure, it might only take me a few minutes to run compliance tests. But as Edward is so capable of doing, documenting a site’s shortcomings at that level could, potentially, just as easily take hours – if you’re going to include details on how to resolve those specific problems. Multiply that by cross-browser testing work. Because we all know perfectly well that not one single web browser truly complies 100% with W3C standards either.

So just because a site is 100% compliant doesn’t even mean it’s going to be properly displayed across web browsers. And that means compliance isn’t so straightforward either.

Maximized Return On Recommendations

There are so many other fish to fry, that I need to use my time wisely.  Telling clients that their site is not properly validating due to some P tag problem might earn me a pat on the back for being thorough.  Yet if they fix that issue, I do not believe it’s going to have enough of a positive impact on their SEO as compared to them focusing on any one of the dozens of other action items I usually come up with that are direct SEO issues.

Competitive Reality

Here’s where the two camps differ.  In one camp, if a site is not 100% compliant, it’s not a truly optimized site.  In the other camp, if a site, when held up to the competitive landscape, is just as fast, and has just enough compliance aspects to get by, then an SEO focus is better spent on quality content depth, site architecture from a content topic relationship perspective, internal and external link depth and relationships.

Limited Budgetary Resources

How many clients do you have where the budget for web initiatives is unlimited? Maybe I don’t work with the right clients, because none of my clients has EVER fit that bill.  Instead, they’ve got specific budgetary parameters from within which they can work.  And if a 5,000 page site has to have its entire dynamically generated URL structure rebuilt from the ground up in order to address just the Page Title to URL relationship for SEO, I guarantee you that this task is infinitely more critical, being specific to SEO, than ensuring there’s a proper closing to a paragraph tag.

No Disrespect

Now this isn’t about disrespect to people who believe that strict W3C compliance is important.  Let’s face it – while most of us in the SEO industry know that we can make use of image alternate attributes, personally, I make sure I’m very careful in clarifying to clients that making use of them is first and foremost an issue of providing content to visually impaired people that helps them understand the purpose of an image they may not see.  With that caveat in place, I then go on to communicate how this is an opportunity for SEO because the search engines interpret images on a page to be one indicator as to the purpose of that page.

So if I have to choose between instructing my clients to work on the alternate attributes of their site’s photos or alternately, making sure paragraph tags are closed, given those aforementioned budgetary limits, I’m always going to go with the one that’s going to help the search engines learn more about the page’s focus or purpose.  Whether a paragraph tag is closed or not does not help or hurt this either way.
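To make the paragraph tag point concrete, here is a minimal sketch, assuming Python's standard html.parser as a stand-in for any tolerant HTML parser. It says nothing about how a search engine actually processes pages; the class name TextExtractor, the helper extract_text and the two sample strings are made up for illustration. It only shows that a forgiving parser can pull the same text out of a page whether or not the paragraph tags are closed.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Hypothetical helper: collects text runs, ignoring tag balance."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Keep only non-empty text; tags themselves are never inspected.
        text = data.strip()
        if text:
            self.chunks.append(text)


def extract_text(html):
    """Return the list of text runs found in an HTML string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered at end of input
    return parser.chunks


# Illustrative samples: the same content with and without closing tags.
closed = "<p>First paragraph.</p><p>Second paragraph.</p>"
unclosed = "<p>First paragraph.<p>Second paragraph."

same_text = extract_text(closed) == extract_text(unclosed)
print(same_text)
```

The comparison comes out true here because the parser never needs the closing tag to find the text. That is the narrow sense in which the missing tag neither helps nor hurts content discovery in this sketch.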

W3C Standards For SEO

In addition to the image alternate attribute, there are several other HTML standard tags, that when properly used (in accordance with W3C guidelines for valid coding of web pages), are clearly SEO best practices as well. Whether it’s proper use of header tags, bolding, bullet point implementation, or any one of dozens upon dozens of elements to HTML, it’s clear that a site IS better off from an SEO perspective, when that site meets those standards.

Heck, if a site is so botched up in the validation process, it’s even possible that Google won’t even be able to index it.
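For the standards that do overlap with SEO, a small script can flag the basics. The following is a rough sketch, assuming Python's standard html.parser; the MarkupAudit class and the sample markup are hypothetical, and the checks are my own shorthand for "images have alternate text" and "real header tags are present", not a validator and not a ranking test.

```python
from html.parser import HTMLParser

HEADING_TAGS = ("h1", "h2", "h3", "h4", "h5", "h6")


class MarkupAudit(HTMLParser):
    """Hypothetical audit: images lacking alt text, plus heading tag counts."""

    def __init__(self):
        super().__init__()
        self.images_missing_alt = []
        self.heading_counts = {}

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "img":
            # Valueless or absent alt both end up as an empty string here.
            # Note: an empty alt can be legitimate for purely decorative
            # images, so treat this list as items to review, not as errors.
            alt = (attr_map.get("alt") or "").strip()
            if not alt:
                self.images_missing_alt.append(attr_map.get("src") or "(no src)")
        elif tag in HEADING_TAGS:
            self.heading_counts[tag] = self.heading_counts.get(tag, 0) + 1


# Illustrative page fragment, not taken from any real site.
sample = """
<h1>Tuna Recipes</h1>
<img src="salad.jpg" alt="Tuna salad in a bowl">
<img src="banner.jpg">
<span class="big">Styled to look like a heading</span>
<h2>Ingredients</h2>
"""

audit = MarkupAudit()
audit.feed(sample)
audit.close()

print(audit.images_missing_alt)
print(audit.heading_counts)
```

Note that the styled span is not counted as a heading. A validator would accept that span without complaint, which is why this kind of check is a separate exercise from validation.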

The Elephant In The Room We Can’t Ignore

Any discussion about validation and SEO must, by nature of this arena, include addressing the primary reason 100% validation is not a “have to”. It is one that validation evangelists refuse to acknowledge as being a serious consideration.

There are many millions of sites that already exist,  and millions more being deployed all the time, that are NOT 100% compliant. I’m not here to debate the cause of that or rail against anyone who might have caused it.  Heck.  My own sites fail to meet complete validation.  Because, as stated earlier, I am NOT a developer.  I use OFF THE SHELF solutions.  Programmed by other people.  Because that is what I felt was justifiable in costs to create my own sites.

And that’s just reality.

If any of the top search engines were to put more emphasis on W3C compliance than on those aspects that help the engine learn the purpose of a site or indicate importance of specific content, or help indicate 3rd party verification of a specific site’s authority, then the SERPs would be spitting out even more garbage than they do now. Honestly.

At The End Of The Day

That’s what it really comes down to at the end of the day.  Because even if they DID say compliance is a factor, they could NEVER say it’s more important than those factors that are currently most important.  Because at the end of the day, if two sites are both fully compliant, that’s a one-time thing.  Either they are compliant, or they’re not.  But if two sites have different depth of content, the amount on one site or the other can change at any time.  And the relationships between pages can continually change. And the number of other sites that mention or link to either can change.  Frequently.

And even if I tell my clients they have to get their sites to validate, in cases other than ones similar to those I describe here that are clearly specific to SEO, that validation is NOT going to move my client’s site up in the SERPs.  Because while they were busy revamping their P tags, sixty of their competitors were adding new content.  Or building quality inbound links.

Yeah, But

Okay so I just said that ensuring a site validates for things that are not specific to current SEO isn’t necessary. Well, if that’s the case, then why is half the SEO industry freaking out over page speed?

Clearly, on the surface, page speed has nothing to do with SEO. Except Google has given enough indication now that we are recommending page speed issues be addressed. Because Google has changed their tune. They say that a faster page load is providing a better user experience. Well surely then, if that’s the case, what’s to stop them from saying that 100% validation is also providing a better user experience?

Simple. It’s that pesky P tag.

Let me know when that changes.

Alan Bleiweiss has been an Internet professional since 1995, managing client projects valued at upwards of $2,000,000.00.  Just a few of his most notable clients through the years have included PCH.com, WeightWatchers.com, and Starkist.com.  Follow him on Twitter @AlanBleiweiss , read his blog at Search Marketing Wisdom, and be sure to read his column here at SearchEngineJournal.com the 2nd and 4th Tuesday each month.

Alan Bleiweiss is a Forensic SEO audit consultant with audit client sites consisting of upwards of 50 million pages and tens of millions of visitors a month. A noted industry speaker, author and blogger, his posts are quite often as much controversial as they are thought provoking.

59 thoughts on “W3C Validation for SEO – Myth and Reality”

  1. Nicely done, Alan. I love and respect Edward, but I definitely fall into your camp.

    The reality is that for most medium to large enterprise sites, CMS elements make 100% validation impossible, but that doesn’t mean that these sites are SEO unfriendly.

    I could go on and on, but that’s what Twitter is for ; )

    1. I agree with you. If I remember right, Google, to maintain the market share it has, must serve up the most relevant results, not the most perfectly designed page. Honestly, I would think the most relevant page would not be compliant; it would be more likely for a mass produced SPAM page to be compliant in my eyes. I have several pages on page 1, and unless my hand coding is somehow perfect on each page, I sure don’t think that this is a factor. If it is, I don’t see it being a large factor. The fact is people searching the internet couldn’t care less about the code and care more about relevancy, so to say that this is important, and even further for Google to weight this, would then make their search results less relevant. For instance, to punish Adobe, who gets 18 errors on validation and is ranked number one for Adobe Reader, and put a more compliant site in the number one position wouldn’t make a lot of sense.

  2. Thanks Hugo

    It’s really a matter of capacity to implement. For those who make a living doing so, it’s incomprehensible that anyone else would not want to do so as well, and rather than acknowledge that there are other factors just as valid, they do what they can to prove they are right. Better that than admit their logic is flawed. Which is a human trait we all express from time to time.

  3. I can’t say that as a web designer I don’t spend a lot of time on validation, not necessarily for SEO, but for overall quality. Consider this, as far as “optimization” goes, wouldn’t you want to make that site the “best it can be?” Making sure everything is up to par or “standards?” To me it’s not about the SEO, it’s about the craftsmanship of the site, that helps motivate validation standards. Great article, nice to see someone questioning the myths out there!

  4. Interesting topic. I read Edward’s informative article yesterday and so it’s nice to read the other side of the coin here.

    That said, I’m not a developer either. Despite that sad fact, I have been able to quickly fix plenty of validation errors without being an authority on how to actually fix them. You might have to do a little digging but I do feel it’s important to have your website as clean as possible with regard to these errors because as you said, Google may not even be able to index your pages depending on which errors are being returned.

    I’m going to continue working to keep my website (and those of my clients) as compliant as possible because I think it’s good SEO practice. I don’t blame those of you for being on the other side of the fence, as it really comes down to a difference of opinion. I’ve seen websites with 0 errors not make a dent in the SERP landscape, and I’ve seen websites with SO MANY errors ranking #1.

    The most recent article on my website is about how to fix validation errors without knowing what you’re doing. I’d like to see you and Edward do something together to at least find acceptable errors in validation so those of us who don’t know what they’re doing can at least tell the real problem errors from the harmless ones!

    1. Nathan,

      It really would help a lot of people if there were a how-to regarding validation errors that can prevent indexing. Since I’m nowhere near the expert Edward is, I’ll leave it to him to decide if he would like to write this up. Heck, he may have already done so.

  5. I am with you Alan, I think it is a good thing to have in terms of quality but not necessary for search rankings to succeed. Cheers and great meeting you, if ever briefly, at SMX West!

  6. Was good meeting you as well Ross. Hopefully next time I can actually attend the whole conference. In the meantime, you can find me here de-myth-ifying :-)

  7. Very good read Alan.

    I was once a web developer and have since jumped ship to SEO. I used to find W3C validation to be very important, but since I’ve dealt with clients in my freelance work and at my previous day job I have found it to be an extremely wasteful use of my time, especially if you’re handing the site over to someone else to manage and update.

    To me, in a perfect world every person would not go hungry and every site would validate. But that is simply not the case. It’s a waste of time to focus your efforts on W3C validation when the only people who are going to drool over it are other web developers. The real value and ROI comes from focusing your efforts on SEO.

    If someone wants to argue W3C validation is important, let that ball fall in the web developers field.

  8. Alan it’s funny that I agree with you, but I agree with almost everything Edward says too. I think that shows that you two are taking some things out of context if not completely exaggerating each other’s words in order to keep your battle/debate/egos alive! I will sit back and enjoy the next rant from either one of you and stoke the fire whenever possible.

    1. David

      So your comment has arrived! Edward and I are both passionate about the work we do, and though I can’t speak to his motives for what he says in this dialogue, personally, I focus exclusively on expressing my opinion in the community because I feel this is a very important topic that needs the dialogue.

      Sure, I’m over the top in some of my writing. Yet I honestly don’t believe that I’ve ever intentionally taken anything out of context or exaggerated things, at least not just to keep something going. It really is about giving voice to concerns and perspective.

      Yet at the end of the day, it really is just perspective.

    1. Jasmine,

      Yes – as long as it’s understood that some aspects of the W3C standards do directly affect SEO, we’re okay. Whether a paragraph has a closing “p” tag or not, or if you use Tables instead of CSS Divs however, are just two examples of when compliance is not a factor. :-)

  9. Interesting summary and I tend to agree with your view. It would be surprising to think that Google would run a full compliance check of all websites rather than just some standard settings, but I guess the underlying issue is the usability and functionality of a site – which can include load times – and so websites should aim for compliance anyway, or as close to it as possible.

    1. Clive, I definitely agree that usability and functionality are vital to a web site’s success. And to the degree that I believe these are factors, I will include these in my audit review. However the degree to which they matter is what is really the core of this debate. Anyone who claims anything less than 100% validation will definitely harm a web site is being myopic in their position.

    1. LOL Google home page not meeting validation is something we do get a good laugh over. But seriously – that only throws 40 errors. Which is pretty close to being valid. :-)

  10. I also agree with Alan, actually all Google’s product pages are not validated. It’s about time we take what we read online with a pinch of salt – especially regarding SEO.

  11. I honestly think that SEOs in the ‘your website must be 100% W3C compliant’ camp are stunningly ignorant on how information retrieval works. Search engine crawlers do NOT render a page – they retrieve the HTML code and parse it, but they never render it as a browser does. For SEO purposes the HTML code needs to be suitable for parsing, NOT for rendering in a browser. Thus W3C compliance really isn’t an issue. The code just needs to be clean enough for a search engine to parse it and distinguish content, navigation, and style.

  12. W3C validation is just good practice as mentioned, and it isn’t a pain at all once you’re experienced developing within the compliance guidelines. I believe the underlying benefit to compliance really comes down to semantics. A W3C valid website is often (but not always) a more semantically correct site, thus adding to its SEO value. The benefits of a semantically structured website, I’m sure we can all agree on.

  13. You forgot to introduce the fact that when a page isn’t W3 compliant often browsers go into debugger mode making the pages load slower. Slow loading pages equate to higher bounce rates by both users, and by limitation bots under certain circumstances of time spent on page downloading. While by definition SEO is for the search engines, ultimately it’s for the conversion of getting people through your site. If a site isn’t usable even the best SEO isn’t going to help you. If there’s no budget for repairs then absolutely there’s not much point other than to at the very least do due diligence to inform the client that the developer that built their site didn’t take the time to check their code. While I can appreciate Google says W3 compliance isn’t a requirement it does come into play at times, and we shouldn’t downplay that. Let’s say you didn’t close an h1 tag. It could in theory get your site flagged for spam, thus dropping your ranking. It is for the above reasons that I believe W3 compliance IS in fact important, but of course not as important as some other factors like title tags, domains, headers, content.

  14. Mike

    Would I like to see every client site validate? Yes. Clearly a properly validating site ensures better markup.

    Yet a 5,000 page web site that is maintained by multiple departments across a corporate organization, where entire sections of the site have been implemented using different 3rd party solutions, is definitely a pain to deal with regarding validation.

    Adam,
    As I’ve said elsewhere, yes – certain aspects of W3C compliance do come into play. Yet not all do. So compliance as far as those factors that DO is what I care most about.

  15. Hi Alan,

    I love this topic and I completely agree – if code validation was a factor, 90% of the sites on the internet would fail and in some cases, they have to use “invalid” code just to get the site to display in all the browsers. That said, I’ve seen some sites that are obviously broken to some degree in the eyes of Google – I’m basing this on what seems to be indexed. I’ve seen pages where Google did not appear to be able to read their meta tags properly or that won’t come up for the text on the page even in an exact match search. While valid code may not be a factor in the algorithm, it’s like the robots.txt file. If you screw it up, it doesn’t matter how many other factors you’re compliant with – Google can’t see it and you’re not in the game.

    Just sayin’

    1. Marjory,

      Not seeing a page come up in the SERPs for a specific phrase or sentence can be an indicator that there’s a code level problem. Yet given the complexity of the ranking process it could also be due to other factors, such as penalties incurred in other ways.

      In my audits, I do as much random testing of individual pages as I feel is appropriate given the overall scope of my tasking. If that testing doesn’t reveal validation errors that I think might cause pages to not be indexed, I leave it up to my later review of the total volume of pages indexed for that site compared to the total number of pages that exist.

      At that point I go ahead and look for reasons why that might be and typically find several other causes beyond validation.

      More often than not, if there’s a coding problem serious enough to cause the issue you describe, it’s going to be present in enough pages on a site that my methods will uncover it, given that most of my client sites are template driven.

  16. I completely agree with you Alan. One of the most important factors that makes a successful website is the user’s browsing experience, and a site that complies with the W3C code validation standards has a great opportunity to display the entire contents of a web page to the largest possible number of users regardless of the operating system, browser or device they use.
    Thanks for your articles and sorry about my poor English, I’m Spanish. See you.

  17. Hi Alan, I have to agree that while it may be desirable to be 100% compliant I just can’t see it being worth the effort for most sites. I can build a site that still serves the needs of users who cannot browse the web in the usual ways the average person can, without being 100% compliant. In a world of scarce resources to work on a site I’d much rather put my efforts into driving traffic to my site and making that site more usable (and higher converting) for the vast majority of users rather than chasing down those last few compliance issues that really don’t affect anyone anyway. Honestly, until it either has a material impact on my business or in some way affects the usability of the site, 100% compliance just isn’t that big of a priority.

  18. Well Michael,

    It is ironic that they themselves don’t validate. Yet Google never does what they say we should do, unless that happens to fit their business profit model.

    1. Google not validating is old news, as is the case for pretty much all well ranking sites – ironically the few sites that do COMPLETELY validate even under Strict generally don’t rank :)

      More interesting is that Google.com has an HTML5 Doctype – anyone know how long that’s been, since it’s still not an “official” standard yet?

  19. Michael – where do you see the HTML 5 doctype? When I view source on the home page all I see is . It’s not surprising they have it though. They determined they’re going to push 5 through no matter what, and no matter who else jumps on board or not. Kind of like when Microsoft was the first big browser company to decide they were going to go rogue on compliance with W3C, thinking that others would have no choice but to join them.

  20. hahaha now that you showed me that I start to see how it’s hypocritical of Google to push a standard when they’re not even following the one they’re pushing.

  21. Studies on SEO’s 16 years and never felt that there is something to learn. I worked and still work a part-time as a freelancer on RentACoder (User Creative Trend, where currently 52) to many optimization projects, most of the content. Personally, I think it’s a complex puzzle, made of many small pieces, interdependent. I thought to write a short guide that summarized all these pieces, which are necessary in a comprehensive SEO plan. Being late, it could be forgotten a few items, so please, contribute and correct me. http://stylishfirst.com/seo-plan-completed/

  22. First of all, thank you for bringing this up with such a well presented post!

    I struggle with this as a both a developer and SEO, and I’m going to say that, with the exception of certain glaringly bad cases, I pretty much agree –

    Simply said: Just because the code VALIDATES doesn’t necessarily mean it’s SEMANTIC and ready to be best understood by the extracting powers that be.

    You touched on this when you mention what W3C standards *are* beneficial – and the thing is, the validator doesn’t tell you to USE those elements; it only flags you when you’ve typed them out wrong.

    I can’t count how many times I’ve rewritten VALID code because they used span tags where h2s should be or p tags with line breaks for items more suited to an unordered list. Why? Because semantic markup that correctly describes the content between the tags is of much greater benefit than a poorly nested div is a detriment.

    Like you said, it’s about the best return you can provide on the time and budget given. When the main goal is return on SEO investment, fighting the phantom unescaped ampersand isn’t going to win my attention when I have 200 or even 20 pages of content with ugly links and no H1 tags.

    1. Cassie,

      OMG yes! CSS used instead of H1 tags, CSS based links where the URL is even in the off-page style sheet – two of the most common examples I deal with a LOT!

  23. I’d hate to miss that one little thing that was invalid that Google did come to a full stop on and miss a ranking that I could have had, if I had simply done a little more. Furthermore, from a troubleshooting perspective, it feels nice to check off one more thing that, if it doesn’t help, will at least prevent things from getting worse.

    1. Brett,

      If you feel the need to be that micro-focused it’s perfectly valid for you to do so. All of this comes down to risk/reward decisions at every step.

  24. The W3C compliance is a SEO factor. I had a client in Mexico whose website’s PHP code was terrible. We changed everything. We built the website from 0 but we kept the same URLs. The same day we changed the code, his visits jumped 500% and stayed there. Later I had some problem with one of his managers and we stopped working with them. After they changed back to the old code, they now have very few visits.
    This proved to me that clean code does matter.

  25. It is nice to have perfect code but not necessary as the author says. It is the declaration that tells Google it is up to date and that implies that the page will work in the browser. Whether it does or not does not matter. However the user experience can be so bad you will want to have good code anyway.

  26. I don’t understand why everyone excludes all the other search engines from consideration. Google may be number one, but there are OTHER search tools out there (Yahoo for instance amongst others). This attitude of treating Google as if it were the ONLY valid search tool out there is the equivalent of coding exclusively for one browser and ignoring all the others.

    If you’re at all professional, you adhere to standards and make sure your work is solid on all platforms, all browsers. Thus it seems only logical that if you’re a SEO professional you optimise to get decent results for all search engines and all robots.

  27. You said “I do not check client web sites for 100% W3C compliance as part of my audits.” My experience shows that if something is hard to do, it will not be done as much. So maybe that factors into it?

  28. You said that one missing tag is, well, OK; this doesn’t fall into W3C Compliance, but into sloppy programming and coding practices…

    1. Devhead,

      That is correct – if code is not W3C compliant, it can very well be considered sloppy programming and coding practices. However, the focus of the article was on whether sloppy programming and coding is a concern specifically from an SEO perspective. And at the time this article was written, it was not an SEO concern. However I can tell you that now, almost three years after this article was posted, clean code is more important than ever. Not because search algorithms devalue a site that does not validate to W3C standards though. Instead, it’s because that sloppy code may cause a page crawl speed to be problematic, or that sloppy code could confuse crawler content discovery or lastly, that sloppy code could confuse search algorithms that need to evaluate page and cross-page content intent, focus and relationships. So it’s wise to get code to be as clean as possible for those reasons, if we’re strictly talking about SEO and not best practices web development.

  29. So Alan, you’re warming up to validation? I agree it is important for SEO, along with 199 other things. Another specific factor you touched on was Pagespeed. There again you have what was a bad experience for the crawler measured as a bad experience for the user. I agree, so it is too with validation. If the crawler can’t find the links, whether due to malformed HTML or other shenanigans, neither will the user.

    One thing I hear you mention is “time” and “at the end of the day” you have bigger fish to fry. I agree 100%. Keeping after your users with training and correction can become utterly futile. So the best time expenditure for a dev is, well, during development. I try to find those major errors during template development. These errors will multiply, so it’s a good time-spend to knock them down quick. No matter what, the client is going to mess it all up with their content, so might as well be sure what we ship is correct or 99% correct. I just absolutely refuse to worry about it after that. My time is valuable too.

    I ended up developing an html5 validator to try to save myself some time. It’s called html.validator.pro and its use should be to proof websites, or a seed set of pages from a site. It doesn’t do anything special, multi-page validators have been around (the recipe is easy: 1 part validator, 1 part spider, UX to taste). Early detection will save the most hair pulling (and time suckage) later. I hope I’m not wrong dropping this link on you, Would you give it a click and see if it saves you anything at all?

    1. Scott,

      First, I normally do not prefer to allow dropping links in comments however I do believe it’s relevant to this discussion, so no worries there. And yes, I’m totally warming up to validation. Personally, I’ve actually always believed that from a best practices perspective validation fits right in. It’s just that until Google baked in page speed I wasn’t so concerned about it. Then, as they moved into the whole “intent understanding” process and started pushing microformats (and now Schema) it became clear that code matters more than ever.

      I really like how you approach it – deal with validation challenges during the development cycle – seriously that should ALWAYS be the policy. I think where the flaw in that really came along was the rush by so many companies to come out with their own CMS or their own themes to be layered onto other CMS’s, the proliferation of widgets, and most recently, the explosion of CDN based data off-loading. Every one of those has allowed too many coders to just get sloppy. And every one has added to validation problems and speed problems as financial gain by CMS, theme, widget and CDN vendors took priority over clean code.

  30. “…while they were busy revamping their P tags, sixty of their competitors were adding new content. Or building quality inbound links.” Hit the nail on the head! Great review, a competitor who is no. 1 on Google for our search term has W3C for CSS and XHTML and I viewed it as a priority but after reading this insightful article will carry on making content and prioritising more useful and productive work.