SEO

Let’s Try to Find All 200 Parameters in Google Algorithm

I am sure Googlers must be enjoying this: they can hardly say a word before a wealth of guesses and speculation follows. This time Matt Cutts is said to have mentioned that there are 200 variables in the Google algorithm, and plenty of people have already started looking for them.

Anyway, I stumbled across this forum thread and made up my mind to continue the discussion at SEJ by providing my own list of variables and asking you to contribute. Please note the SEO perspective: as one of my best friends pointed out, this post is not intended as a list of search algorithm variables but rather as a list of SEO parameters.

Currently there are fewer than 130 variables in the list; let's try to make it 200 :)

Update: I created a Google Wave for that: please Tweet or email me to get in there and participate!

Update: Mark Nunney, a very awesome and smart SEO who also blogs at the Wordstream blog and has joined our collaborative Wave, has put together a great table:

I added an 'importance' column (we could vote on that) and a 'degree of dependence' column. Poor phrase but important point: lists like these miss the point that many 'big' 'parameters' do nothing on their own. E.g., brand or site reputation are useless on their own: CNN and the BBC do not come top for everything.

I've published the copy here (we are thinking of setting up a poll to vote on the variables in the empty columns):


Parameters we are almost sure (with varying levels of confidence) are included in the algorithm (for your convenience I have linked some of them to our previous discussions on the topic):

Domain: 13 factors

  1. Domain age;
  2. Length of domain registration;
  3. Domain registration information hidden/anonymous;
  4. Site top level domain (geographical focus, e.g. com versus co.uk);
  5. Site top level domain (e.g. .com versus .info);
  6. Sub domain or root domain?
  7. Domain past records (how often it changed IP);
  8. Domain past owners (how often the owner was changed)
  9. Keywords in the domain;
  10. Domain IP;
  11. Domain IP neighbors;
  12. Domain external mentions (non-linked)
  13. Geo-targeting settings in Google Webmaster Tools

Server-side: 2 factors

  1. Server geographical location;
  2. Server reliability / uptime

Architecture: 8 factors

  1. URL structure;
  2. HTML structure;
  3. Semantic structure;
  4. Use of external CSS / JS files;
  5. Website structure accessibility (use of inaccessible navigation, JavaScript, etc);
  6. Use of canonical URLs;
  7. “Correct” HTML code (?);
  8. Cookies usage;

Content: 14 factors

  1. Content language
  2. Content uniqueness;
  3. Amount of content (text versus HTML);
  4. Unlinked content density (links versus text);
  5. Pure text content ratio (without links, images, code, etc)
  6. Content topicality / timeliness (for seasonal searches for example);
  7. Semantic information (phrase-based indexing and co-occurring phrase indicators)
  8. Content flag for general category (transactional, informational, navigational)
  9. Content / market niche
  10. Flagged keywords usage (gambling, dating vocabulary)
  11. Text in images (?)
  12. Malicious content (possibly added by hackers);
  13. Rampant mis-spelling of words, bad grammar, and 10,000 word screeds without punctuation;
  14. Use of absolutely unique /new phrases.
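Several of the content factors above (amount of text versus HTML, link density, pure-text ratio) boil down to simple ratios anyone can compute for their own pages. Here is a minimal sketch in Python using only the standard library; the exact definitions of these ratios are my own assumptions, not anything Google has confirmed:

```python
from html.parser import HTMLParser

class TextRatioParser(HTMLParser):
    """Collects visible text, tracking whether it sits inside a link."""
    def __init__(self):
        super().__init__()
        self.in_link = 0
        self.text_chars = 0       # all visible text characters
        self.link_text_chars = 0  # visible text characters inside <a> tags

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.in_link += 1

    def handle_endtag(self, tag):
        if tag == "a" and self.in_link:
            self.in_link -= 1

    def handle_data(self, data):
        n = len(data.strip())
        self.text_chars += n
        if self.in_link:
            self.link_text_chars += n

def content_ratios(html):
    """Return rough text-to-HTML and link-density ratios for a page."""
    p = TextRatioParser()
    p.feed(html)
    total = len(html)
    return {
        "text_to_html": p.text_chars / total if total else 0.0,
        "link_density": p.link_text_chars / p.text_chars if p.text_chars else 0.0,
    }
```

For example, `content_ratios("<p>Hello <a href='/x'>world</a></p>")` reports a link density of 0.5, since half of the visible text sits inside a link.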

Internal Cross Linking: 5 factors

  1. # of internal links to page;
  2. # of internal links to page with identical / targeted anchor text;
  3. # of internal links to page from content (instead of navigation bar, breadcrumbs, etc);
  4. # of links using “nofollow” attribute; (?)
  5. Internal link density.

Website factors: 7 factors

  1. Website Robots.txt file content
  2. Overall site update frequency;
  3. Overall site size (number of pages);
  4. Age of the site since it was first discovered by Google
  5. XML Sitemap;
  6. On-page trust flags (Contact info ( for local search even more important), Privacy policy, TOS, and similar);
  7. Website type (e.g. blog instead of informational sites in top 10)

Page-specific factors: 9 factors

  1. Page meta Robots tags;
  2. Page age;
  3. Page freshness (frequency of edits and % of the page affected by edits);
  4. Content duplication with other pages of the site (internal duplicate content);
  5. Page content reading level; (?)
  6. Page load time (many factors in here);
  7. Page type (About-us page versus main content page);
  8. Page internal popularity (how many internal links it has);
  9. Page external popularity (how many external links it has relative to other pages of this site);

Keywords usage and keyword prominence: 13 factors

  1. Keywords in the title of a page;
  2. Keywords in the beginning of page title;
  3. Keywords in Alt tags;
  4. Keywords in anchor text of internal links (internal anchor text);
  5. Keywords in anchor text of outbound links (?);
  6. Keywords in bold and italic text (?);
  7. Keywords in the beginning of the body text;
  8. Keywords in body text;
  9. Keyword synonyms relating to theme of page/site;
  10. Keywords in filenames;
  11. Keywords in URL;
  12. No “Randomness on purpose” (placing “keyword” in the domain, “keyword” in the filename, “keyword” starting the first word of the title, “keyword” in the first word of the first line of the description and keyword tag)
  13. The use (abuse) of keywords utilized in HTML comment tags

Outbound links: 8 factors

  1. Number of outbound links (per domain);
  2. Number of outbound links (per page);
  3. Quality of the pages the site links to;
  4. Links to bad neighborhoods;
  5. Relevancy of outbound links;
  6. Links to 404 and other error pages.
  7. Links to SEO agencies from clients site
  8. Hot-linked images

Backlink profile: 21 factors

  1. Relevancy of sites linking in;
  2. Relevancy of pages linking in;
  3. Quality of sites linking in;
  4. Quality of web page linking in;
  5. Backlinks within network of sites;
  6. Co-citations (which sites have similar backlink sources);
  7. Link profile diversity:
    1. Anchor text diversity;
    2. Different IP addresses of linking sites,
    3. Geographical diversity,
    4. Different TLDs,
    5. Topical diversity,
    6. Different types of linking sites (blogs, directories, etc);
    7. Diversity of link placements
  8. Authority Link (CNN, BBC, etc) Per Inbound Link
  9. Backlinks from bad neighborhoods (absence / presence of backlinks from flagged sites)
  10. Reciprocal links ratio (relative to the overall backlink profile);
  11. Social media links ratio (links from social media sites versus overall backlink profile);
  12. Backlinks trends and patterns (like sudden spikes or drops of backlink number)
  13. Citations in Wikipedia and Dmoz;
  14. Backlink profile historical records (ever caught for link buying/selling, etc);
  15. Backlinks from social bookmarking sites.

Each Separate Backlink: 6 factors

  1. Authority of TLD (.com versus .gov)
  2. Authority of a domain linking in
  3. Authority of a page linking in
  4. Location of a link (footer, navigation, body text)
  5. Anchor text of a link (and Alt tag of images linking)
  6. Title attribute of a link (?)

Visitor Profile and Behavior: 6 factors

  1. Number of visits;
  2. Visitors’ demographics;
  3. Bounce rate;
  4. Visitors’ browsing habits (which other sites they tend to visit)
  5. Visiting trends and patterns (like sudden spikes in incoming traffic)
  6. How often the listing is clicked within the SERPs (relative to other listings)

Penalties, Filters and Manipulation: 12 factors

  1. Keyword over usage / Keyword stuffing;
  2. Link buying flag
  3. Link selling flag;
  4. Spamming records (comment, forums, other link spam);
  5. Cloaking;
  6. Hidden Text;
  7. Duplicate Content (external duplication)
  8. History of past penalties for this domain
  9. History of past penalties for this owner
  10. History of past penalties for other properties of this owner (?)
  11. Past hackers’ attacks records
  12. 301 flags: double re-directs/re-direct loops, or re-directs ending in 404 error

More Factors (6):

  1. Domain registration with Google Webmaster Tools;
  2. Domain presence in Google News;
  3. Domain presence in Google Blog Search;
  4. Use of the domain in Google AdWords;
  5. Use of the domain in Google Analytics;
  6. Business name / brand name external mentions.
Ann Smarty is the blogger and community manager at Internet Marketing Ninjas. Ann's expertise in blogging and tools serve as a base for her writing, tutorials and her guest blogging project, MyBlogGuest.com.


89 thoughts on “Let’s Try to Find All 200 Parameters in Google Algorithm”

  1. That’s a one long list :)

    You might want to create a Google Wave and make it a collaborative experience for everyone who wants to add their $0.02 :)

  2. Great List Ann:

    If a campaign can cover all of those with definitive solutions, then the competition doesn’t stand much of a chance (aside from domain authority and having a head start)…

  3. You have “page load time” as one line item but there are several factors that influence this. I’m not sure if listing all of them would count toward the 200.

  4. I’m pretty sure that Google also trains a classifier with the answers selected by a user after a search for some keywords was performed.
    To make it a bit clearer, here’s what I mean:
    1) You search for some keywords
    2) Google offers you some links to pages of interest
    3) You select some of these links
    4) Based on what you clicked, a classifier (e.g. a neural network) is trained
    5) If someone else searches for similar keywords, the classifier makes a suggestion
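A toy version of the feedback loop this commenter describes can be sketched in a few lines. This is purely illustrative: it uses simple click counts per (query term, result) pair as a stand-in for the neural network mentioned in step 4, and none of it reflects Google's actual implementation:

```python
from collections import defaultdict

class ClickRanker:
    """Toy click-feedback model: count clicks per (term, url) pair and
    rerank candidate results for queries sharing those terms."""
    def __init__(self):
        self.clicks = defaultdict(int)  # (term, url) -> click count

    def record_click(self, query, url):
        """Step 3-4: a user clicked this result; update the model."""
        for term in query.lower().split():
            self.clicks[(term, url)] += 1

    def rerank(self, query, candidates):
        """Step 5: for a similar query, prefer previously clicked results."""
        terms = query.lower().split()
        def score(url):
            return sum(self.clicks[(t, url)] for t in terms)
        return sorted(candidates, key=score, reverse=True)
```

After recording a few clicks on one result for queries containing "tv", a new query sharing that term would surface that result first.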

  5. Domain registration information hidden/anonymous can be a negative flag
    Load time “check labs in webmaster tools”
    Self promotional signs/links to SEO agencies from clients site
    Use of canonical URLs is positive?

  6. I perform blackhat SEO and I regularly engage in linkspam on hundreds of sites. It’s manual and subtle, not the autospam atrocities of viagra promoters etc.

    This is a nice list of parameters, but I can tell you that Google’s algo is nowhere near as sophisticated as all this. I have sites which have 100% comment spam links and hit no.1, and I’ve been able to rank highly with terrible auto-spun content.

    Why do I do what I do?

    I focus on things which are a simple search -> click -> buy operation. I’m looking for money off a tv, I find no.1 for that TV’s coupon, and I buy and save money. My content is just as good as anyone’s in this regard.

    Why should all the money go to amazon, nextag or the other big guys?

    Google just seems to want to make the rich a lot richer.

  7. Hmm. And by your own admission the list is not complete :).
    Wonder how many website builders/developers take all these factors into account. It is mind boggling, I must say.

    Thanks for these guidelines.

    Srinath

  8. Nothing about outright malicious content and past hacks yet… it seems that since they track (and warn users) of that, they’d use it for ranking.

    I’d also suggest that rampant mis-spelling of words, bad grammar, and 10,000 word screeds without punctuation would likely affect quality. While this is touched on by some existing sections, perhaps some of the “magical 200” could be as simple as “does this site have a period on the page to ensure it isn’t garbage text pulled from a phrase engine?”

    Put another way, the “readability” score could be many scores.

    Another one I didn’t see was origination of unique phrases. If somebody wrote a unique phrase, knowing its first appearance (versus the scrapers’ copies) would cut down on scrapers. Frosbie Scroodle Jetsam. Since this is the first blog to ever have that phrase, it should rank higher than scrapers, as this would be the “first place” it was seen, if the comments are scraped and re-used.
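The phrase-origination idea in this comment is essentially a first-seen index over rare word sequences (shingles). A hypothetical sketch, assuming crawl timestamps are available and trustworthy; the shingle size and data structures are my own choices, not a known search engine design:

```python
class PhraseOriginIndex:
    """Toy first-seen index: the earliest document observed containing
    a phrase is credited as its originator."""
    def __init__(self, shingle_size=3):
        self.shingle_size = shingle_size
        self.first_seen = {}  # phrase -> (timestamp, url)

    def _shingles(self, text):
        """All consecutive word n-grams of the configured size."""
        words = text.lower().split()
        n = self.shingle_size
        return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

    def index(self, url, text, timestamp):
        """Record the earliest (timestamp, url) for every shingle."""
        for phrase in self._shingles(text):
            if phrase not in self.first_seen or timestamp < self.first_seen[phrase][0]:
                self.first_seen[phrase] = (timestamp, url)

    def originator(self, phrase):
        """Return the URL first seen containing this phrase, if any."""
        entry = self.first_seen.get(phrase.lower())
        return entry[1] if entry else None
```

Index the original post first and a scraper copy later, and `originator("frosbie scroodle jetsam")` points back at the original URL.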

  9. Ah, some more : On-site trust factors : Contact info ( for local search even more important), Privacy policy, TOS, and similar. At least some claim these have effect, haven’t tested them separately yet.

  10. Great project!

    This reminds me however of something I learned in Physics class: ‘with enough parameters you can fit an elephant’

    This means that you can prove anything as long as you have enough parameters, and is a reminder to focus on the most important ones.

    The value of identifying the 10 most important parameters is therefore generally much more valuable than identifying the 200 most important ones (if that means you don’t know the 10 most important ones….)

  11. Great Ann,
    something more: when Google (and Jeffrey Dean) presented “Big Table” (the storage system for Google’s data) in 2006, it was explicit that all the parameters of the crawl are divided into 16 family groups (see table 2 on page 11 of the paper: http://labs.google.com/papers/bigtable.html; download the PDF). So you need 2 more groups and many more parameters per group.

    In the same table you can also see confirmation of the number of parameters stored for each document (around 200): the approximately 5 billion documents crawled in 2006 were stored in 1,000 billion cells of the database… something like 200 cells per document ;-)

    In 2006 I wrote an email to Jeffrey to ask him:

    “…is it correct to say: with the “BigTable system”, Google stored (and “normalized”) the world wide web (something like 10 billion pages indexed) in a “single” table of 800 TB, with 1,000 billion cells and 16 families of data?”

    and the answer was: “…The first table description above is pretty much correct….”

    All the story here (in italian): http://www.giorgiotave.it/forum/seo-focus/23768-google-bigtable-e-l-archiviazione-dei-dati-storici-e-altre-congetture.html#post182014

    bye! Nicola

  12. Thanks for initiating this list.

    A few additional items for discussion:

    H1-H6 tags.

    There’s a range of underutilized tags: abbr, acronym (deprecated in HTML5), optgroup, etc. that enable additional info to be placed.

    Character counts in the title element, description meta tag, alt and title attributes, etc. Also, to differentiate between the number of characters displayed on a search engine results page (SERP) and the number Google may actually index in a given field.

    There are also a few tags that are only observed, at present, for a few geographic regions. This will, of course, be expanded as products such as Google Maps Street View are expanded. A few such tags are the geo.placename, geo.position, and geo.region meta tags.

    Each factor needs to be understood relative to the value Google assigns. Also, the value Google assigns is variable. Using keywords as an example, many factors for which keyword placement is weighed have negligible value if the given long tail keyword phrase isn’t placed in contextually relevant page content — not to mention also being placed in alt and title attributes, etc.

    Given that Google has begun integrating search results from Local Search, Image Search, etc. on Web SERPs, understanding how the values assigned differs has become more complex.

    How Personalized Search, Page Speed, Rich Snippets, and the new elements being introduced in HTML5 will all influence the values, effectively changing the playing field.

  13. @Robert
    I use the acronym tag all the time on my website about Corporate Social Responsibility (CSR).

    Unfortunately, CSR can mean many things, most notably “Customer Service Representative.” The acronym has at least 161 different interpretations:
    http://acronyms.thefreedictionary.com/Csr

    I always make sure that the first instance in which the acronym appears is marked up with the acronym tag. I also try to include it within a link whenever possible.

    The other relatively unused SEO markup that I always include is the title attribute on anchors. Generally, I fill this attribute with the linked page’s title.

    I’m not sure if the engines are parsing this markup or using them to weight results, but I figure it certainly can’t hurt to include them. If nothing else, they improve the usability of my websites.

  14. LOL

    While we’re at it we could also search for the Ark of the Covenant, cure cancer, and find out who killed JFK as well.

    Problem is, even if you come up with 200 variables, there’s no way to tell if you’re even close to being right. Not to mention that there are probably many more than 200 variables, since they made that statement a couple of years ago.

    1. We are not trying to prove anything or find any solution here.

      We are just having fun listing factors that we believe are considered by search engines.

      “200″ is just another way to make it even more fun :)

      1. Nice. Send newbie and unknowing SEOs and search marketing professionals on a wild goose chase of misinformation. And people wonder why there are so many bad/spammy SEOs in our industry.

  15. There’s a lot of stuff that has a big effect that you cannot control. For instance, how many people search for your exact domain name?

    I had a radio interview for a site of mine, alright. And this was a brand new site with very few backlinks. I just happened to have a pre-existing connection on a radio show. A lot of people heard the interview, and most came direct. But a LOT searched for my domain name. Within a day, I saw HUGE traffic boosts. It’s hard to isolate the cause of anything, but in this case it was clear to me, given other factors, that this was a big part.

    Worry about the other stuff all you want, but 90% of that list I’d say really doesn’t make a ton of difference.

  16. Nice idea Ann, though indeed I quit reading right after the “Domain Factors” section.

    Length of domain registration:
    Matt Cutts and John Mueller confirmed this is NOT taken into account at all (source)

    Domain external mentions (non-linked)
    Again, we have an official confirmation of this NOT being a factor (source)

    If you have knowledge of any counter-evidence to the above, please share!

    I understand we don’t have to take John Mueller and Matt Cutts at their word; still, they are official and reliable sources of information (to be read with a grain of salt, anyway).
    So in absence of any counterproof, I think that (critically) trusting them is not a bad idea at all.

  17. Excellent work Ann! I think the presence of the keyphrase at the top of the page and in the first 100 words of the body text is also very important.

  18. I think it’s worth mentioning that the 80/20 rule probably applies here.

    10-20% of these factors contribute to 80-90% of the rankings.

    I know that a) external links and b) internal stuff like the title tag and some keywords are probably among them.

  19. No mention of keywords in heading tags (h1-h6)?

    Has anyone seen any evidence of page segmentation affecting rankings? I mean boilerplate content and nav menus etc.

  20. Good list Ann! But I think you should add “Status Updates” in the above list as well. Real time search was & will be a hot topic as per the current Google Behavior.

  21. Hi, very good article, though it is perhaps useful to add the importance of tags… and regarding the title, I would say it should be of adequate length for Google to show us the full title.

    Regards

  22. I think you should include in the list that the meta title needs to have exactly the right number of characters if you want it done well. Additionally, I would put keyword links in the footer.

  23. This is some good food for thought. Thanks for the list. Very helpful to those of us trying to start new sites or rehab old ones that aren’t doing too well :)

  24. In a recent video Matt Cutts explicitly said domain age is not relevant and did not recommend buying old domains just for ranking’s sake.

  25. What about query-related parameters, such as topic, popularity (overall and trending), etc.? Or parameters of universal search (“do we have video/image/news/maps/local/etc. results to show?”)

  26. This is definitely a great list of potential elements that are taken into account when ranking. As some have suggested there may be some myth elements, and then others that used to matter more in the past that have now lost weight. Overall, I think this is a great list and helps dissect the many elements that are being taken into account and weighed upon.

  27. Great post Ann. There were definitely a few factors I had not considered. What would be interesting would be a weight scale of how these factors are ranked in terms of importance.

  28. That’s a useful list and good for discussion. It would be very difficult to reverse engineer the algorithm, but this is still useful for discussion purposes. Some of the things mentioned don’t seem to be in the current wisdom of SEO but stand up to reason. In my experience there is always some truth to any factor that can stand up to reason.

  29. I definitely think there is something in the algorithm to identify the balance of the content, going deeper into how unique it is… For example, multiple concepts running along the same theme. It’s hard to explain, but I had two sites on the same topic: one I worked hard on (backlinks, multiple pages worth of content…), the other I did very little on but it had extremely unique content looking at the topic from various viewpoints; the latter shot to the top of the search results and remains there still.

  30. That’s a great idea to put it in an excel file, how about adding 1. Backlink Profile: Social media links from high profile/social influencer’s 2. Backlink Profile: backlink quantity history (increasing at a certain speed)

  31. Ann,

    Love this post and had to check back to see if it has progressed. I don’t see a whole lot of activity. I wish I could contribute but I think you’ve covered everything I would have been able to come up with, and more.

    Thanks for the great info, as always.

  32. Wow, thank you for listing the 200 elements of Google’s Algorithm! I just have a question: how does algorithm frequency play a part in these 200+ elements? Great, by the way, Ann! Is that really your last name?