SEO Terminology: Boilerplate (How Google May be Treating Repeated Content)


SEOs have been discussing how Google may treat page elements differently depending on where they sit on the page. Identifying so-called “boilerplate” is part of this process: it means finding and analyzing “repeated non-content across [web] pages”.

Last year Bill Slawski, describing the related Google patent, did a really good job of explaining what “boilerplate” means in other spheres of life:

Computer programmers will sometimes use the term “boilerplate” code to refer to standard stock code that they often insert into programs. Lawyers use legal boilerplate in contracts – often the small print on the back of a contract that doesn’t change regardless of what a contract is about.

Boilerplate is classified as follows:

  • (Sitewide) global navigation (home, about us, etc.)
  • Certain spatial areas, especially if they include links (blogroll, navbar)
  • Markup (JavaScript, CSS id/class names such as header, footer)
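
To make “repeated non-content across pages” concrete, here is a minimal sketch in Python. It is not how Google does it; it simply assumes each page has already been split into text blocks and flags any block that shows up on most pages of the site. The function names, the 0.8 threshold, and the three-page sample site are all made up for illustration.

```python
from collections import Counter


def find_boilerplate(pages, min_share=0.8):
    """Return text blocks that appear on at least `min_share` of the pages.

    `pages` maps a URL to the list of text blocks found on that page.
    The 0.8 default is an arbitrary illustrative threshold.
    """
    counts = Counter()
    for blocks in pages.values():
        # Count each block once per page, however often it repeats there.
        counts.update(set(blocks))
    needed = min_share * len(pages)
    return {block for block, seen in counts.items() if seen >= needed}


def main_content(blocks, boilerplate):
    """Keep only the blocks that were not flagged as boilerplate."""
    return [block for block in blocks if block not in boilerplate]


# Hypothetical three-page site, already split into text blocks.
site = {
    "/": ["Home | About us | Contact", "Welcome to our shop", "Copyright Example Ltd"],
    "/about": ["Home | About us | Contact", "We sell handmade widgets", "Copyright Example Ltd"],
    "/contact": ["Home | About us | Contact", "Write to us at our office", "Copyright Example Ltd"],
}

boilerplate = find_boilerplate(site)
print(sorted(boilerplate))
print(main_content(site["/about"], boilerplate))
```

On this sample the navigation line and the copyright line are flagged, and only the page-specific sentence survives as content. A real system would also have to cope with near-duplicates (for example a navbar that highlights the current page), which exact string matching misses.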

How Google may be treating boilerplate:

  • Ignore it completely (e.g. never waste time following each link on each page, never store pages like “Contact us” in its indices unless the page contains some valuable data like the physical address of the business);
  • Index the links within the boilerplate, understanding that they are repeated links, and use them for better or for worse (e.g. combined with other factors, those links may be identified as paid);
  • Identify the boilerplate and use it to understand the overall structure of the site (e.g. identifying duplicate content issues, adjusting PageRank flow, etc.).

Related discussions on boilerplate and how it can impact the algorithm:

Cre8aSite Forums

I think it goes back to what M*tt C*tts said a long time ago. If we penalize all the websites out there that don’t have W3C compliant code, we’d lose 40% of the Internet. Translation: Google knows about crappy coding, duplicate content, and boilerplate phrases and they “TRY” to adjust to it, but don’t let it affect their results. They may in fact put out what they prefer to see on websites in the form of doctrine, but if they act on that doctrine they lose valuable results searchers need.


And soon to come HTML5 with new named markup – article, section, header, footer, nav… html markup that specifies content might just be what the SEs ordered
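
If pages did use the HTML5 elements mentioned above, separating boilerplate from content could be sketched roughly like this. It is only an illustration using Python's standard `html.parser`; the choice of which tags count as boilerplate, the `ContentSplitter` class name, and the sample page are assumptions, not anything a search engine has confirmed.

```python
from html.parser import HTMLParser

# Elements treated as boilerplate containers in this sketch (an assumption).
BOILERPLATE_TAGS = {"header", "footer", "nav", "aside"}


class ContentSplitter(HTMLParser):
    """Sort text into 'boilerplate' or 'content' by the enclosing element."""

    def __init__(self):
        super().__init__()
        self.depth = 0  # how many boilerplate elements we are currently inside
        self.content = []
        self.boilerplate = []

    def handle_starttag(self, tag, attrs):
        if tag in BOILERPLATE_TAGS:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in BOILERPLATE_TAGS and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return  # skip whitespace between tags
        target = self.boilerplate if self.depth else self.content
        target.append(text)


# Hypothetical page using the new named markup.
page = """
<header><nav><a href="/">Home</a> <a href="/about">About us</a></nav></header>
<article><h1>Boilerplate</h1><p>Repeated non-content across pages.</p></article>
<footer>Copyright Example Ltd</footer>
"""

splitter = ContentSplitter()
splitter.feed(page)
splitter.close()
print(splitter.content)
print(splitter.boilerplate)
```

One caveat: HTML5 also allows `header` and `footer` inside an `article`, where they belong to the content, so a blanket rule like this one would misfile them.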

WebmasterWorld Forums

The first time I noticed “boilerplate” issues, though I didn’t know what to call them then, was during the horrendous Florida update of November 2003. At that time I suspected it with on-page text, and it was the first time I suspected that excessive use of keywords in internal anchor text could be a problem. After all the years since then, and now that others have been seeing the effect of navigation repetition issues, it kind of figures, since this patent was applied for soon after that.

SEObytheSea – comment by Michael Martinez

…boilerplate text is part of the public record. Excluding those embedded terms and conditions that are placed in footer text misrepresents the content of the page. If a search engine is claiming to make the Web searchable, then morally it MUST provide some means for searching boilerplate content even if that requires the user to stipulate bypassing a filter.

Boilerplate text is not any less relevant or important to a user’s query simply because people are tired of seeing it. Boilerplate text helps you identify which site you’re actually looking at, and that is always very important to know.

Related Patents:

Ann Smarty
Ann Smarty is the blogger and community manager at Internet Marketing Ninjas. Ann's expertise in blogging and tools serves as a base for her writing, tutorials and her guest blogging project,
  • Mercylivi

    Interesting read, Ann! This is indeed a must-have to improve efficiency and relevancy in SERPs.

  • CD Designer

    Definitely something to watch!!

  • Frank Marcel

    Hi Ann! I believe these boilerplate links/content are treated by Search Engines through Page Segmentation:

    just by identifying the template/design of the site.

    And, well, there’s a lot of contact.html pages indexed by Google:

    not to mention .php, .asp, …

    Thanks for sharing what’s going on in those forums! =)

  • Fábio Ricotta

    As Frank said above I really think this is a common way to analyze content across the web. My opinion is that this is a kind of page segmentation.