Stale vs Fresh Document as Defined by Google

Google highly values the “freshness” of content. But how does Google define it against the “staleness” of content? Google’s Historical Data Patent, reviewed at WebmasterWorld and SEroundtable, sheds some light on what Google considers fresh content and what it defines as stale documents.

Bill Slawski did a great job of summarizing the patent with the help of two examples:

The Constitution of the United States is an old document, but it’s not stale. A news article about the “World Series” from 1918 may not be what a baseball fan wants to see when searching for “World Series” this October.

According to Google itself:

Stale content refers to documents that have not been updated for a period of time and, thus, contain stale data (documents that are “no longer updated, diminished in importance, superceded by another document”).

The staleness of a document may be based on:

  • document creation date,
  • anchor growth,
  • traffic,
  • content change,
  • forward/back link growth, etc.

The Google patent explains how the search engine can spot stale content using four factors:

  • Query-based factor;
  • Link-based factor;
  • Traffic-based criteria;
  • User-behavior-based criteria.

Stale content: 4 factors

1. Query-based factor refers to analyzing which pages in the SERPs users actually select.

In addition, the search engine tracks which queries a single document ranks for: a “discordant set of queries” might mean the page is spammy.

2. Link-based factor analyzes the page's backlinks, monitoring the dates on which new links to a document appear (i.e. “indexed by Google or the date the linking page was created”) and the dates on which existing links disappear. By looking at the rate at which links appear or disappear over time, and at how many links appear or disappear during a given time period, the search engine can conclude whether there is a trend toward the appearance of new links versus the disappearance of existing links to the document, or vice versa:

  • downward trend => stale document (more links disappear than appear);
  • decrease in links => stale content (either sudden or significant link disappearance).
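The link trend described above can be illustrated with a minimal sketch. It is not the patent's method, only one simple reading of it: count the links that appeared and disappeared inside a time window and look at the net change. The function names and the "net change below zero" rule are assumptions made for illustration.

```python
from datetime import date


def net_link_change(appeared, disappeared, start, end):
    """Count links gained and lost between start and end (inclusive).

    `appeared` and `disappeared` are lists of dates on which a link
    to the document was first seen or was found to be gone.
    Returns a tuple (gained, lost, net).
    """
    gained = sum(1 for d in appeared if start <= d <= end)
    lost = sum(1 for d in disappeared if start <= d <= end)
    return gained, lost, gained - lost


def looks_stale_by_links(appeared, disappeared, start, end):
    """Hypothetical rule: a negative net change is a downward trend."""
    _, _, net = net_link_change(appeared, disappeared, start, end)
    return net < 0


# Example data (invented for the illustration).
appeared = [date(2008, 1, 5), date(2008, 2, 1)]
disappeared = [date(2007, 12, 1), date(2008, 1, 10),
               date(2008, 1, 20), date(2008, 2, 15)]
window_start, window_end = date(2008, 1, 1), date(2008, 2, 28)

print(net_link_change(appeared, disappeared, window_start, window_end))
print(looks_stale_by_links(appeared, disappeared, window_start, window_end))
```

A real system would presumably compare several windows to tell a sudden drop from a slow decline; this sketch only covers a single window.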

3. Traffic-based criteria: a large reduction in traffic may indicate that a document is stale.
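As a rough sketch of what a "large reduction in traffic" check could look like, the snippet below compares visits in two periods. The patent summary gives no numbers, so the 50% threshold and the function names are purely assumptions.

```python
def traffic_drop_ratio(previous_visits, recent_visits):
    """Return the fractional drop in visits between two periods.

    0.0 means no drop (or growth); 1.0 means traffic fell to zero.
    """
    if previous_visits <= 0:
        # No baseline traffic, so no drop can be measured.
        return 0.0
    return max(0.0, (previous_visits - recent_visits) / previous_visits)


def looks_stale_by_traffic(previous_visits, recent_visits, threshold=0.5):
    """Hypothetical rule: flag the page when the drop reaches the threshold."""
    return traffic_drop_ratio(previous_visits, recent_visits) >= threshold


print(looks_stale_by_traffic(1000, 400))
print(looks_stale_by_traffic(1000, 900))
```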

4. User-behavior-based criteria: if people spend too little time on the page (compared with similar, closely relevant pages), that might mean the document is stale.
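The comparison with similar pages can be sketched as a ratio of median time on page. This is an illustration only: the use of the median, the 0.5 cutoff, and the function names are assumptions, not details taken from the patent.

```python
from statistics import median


def dwell_time_ratio(page_seconds, peer_seconds):
    """Median time on this page divided by median time on similar pages."""
    if not page_seconds or not peer_seconds:
        raise ValueError("both samples must be non-empty")
    peer_median = median(peer_seconds)
    if peer_median == 0:
        raise ValueError("peer median is zero; ratio is undefined")
    return median(page_seconds) / peer_median


def looks_stale_by_behavior(page_seconds, peer_seconds, cutoff=0.5):
    """Hypothetical rule: visitors stay less than `cutoff` times as long."""
    return dwell_time_ratio(page_seconds, peer_seconds) < cutoff


# Example samples in seconds (invented for the illustration).
page_visits = [10, 20, 30]
similar_page_visits = [60, 80, 100, 120]

print(looks_stale_by_behavior(page_visits, similar_page_visits))
```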

Ann Smarty
Ann Smarty is the blogger and community manager at Internet Marketing Ninjas. Ann's expertise in blogging and tools serves as a base for her writing, tutorials and her guest blogging project,


8 thoughts on “Stale vs Fresh Document as Defined by Google”

  1. I think it’s interesting that while document creation date/date updated is listed as a factor in the stale/freshness quotient, it’s not one of the primary four metrics analyzed in the patent.

    I guess it stands to reason that a frequently updated document no one likes is less relevant than a timeless work which people continue to find relevance in.

  2. Hi,

    Interesting information. I always put a related articles or latest articles section at the end of a post so that the page looks a little different to the search engines. Of course, the main content remains unchanged, but it’s worth a shot.

  3. Google contradicts itself – because how does something that is ranked high fall off? It continues to get clicked, which in turn never makes it stale. Even though the story is from old info.

  4. @afewtips – yes, this is true in a way, but if you have ever found your site ranking for something really random to your main subject, you’ll find that it does drop down over time as people choose the more relevant results around it in preference.