As everyone knows, the Google AdWords Keyword Tool “KWT” is one of the best free research tools available online. But I thought it would be interesting to see how its data consistency holds up if you change a few external factors and then cross check against a real campaign. I have seen variances in data previously, but never took the time to examine the issue in much detail until now.

Since Google pushes users to log in when using the keyword tool, I assumed that there must be a significant advantage in doing so. My test would discover whether the quality of the keyword suggestion data, specifically the estimated average CPC rates, differs depending on whether you are logged into a Google AdWords account or not.

I assumed prior to gathering the data that there should be little difference in Google’s suggestions whether you are signed in to your AdWords account or using the semi-anonymous external keyword tool. However, after seeing the estimated average CPC amounts, I was surprised at the range of variance shown.

As I already had an idea of what the actual average CPC rates were, I quickly realized there was more to this and that I should conduct a larger test, this time using around 150 keywords.

Google Keyword Tool Match Bug?

One interesting aside I discovered during the research was that while the Keyword platform shows “Broad” match type as the default setting, if you also select “Exact” and “Phrase” match you get a different estimated average CPC amount.

The example below demonstrating this strange variance is using the keyword wynn las vegas. The Keyword Tool shows the estimated average CPC rate is $7.03 for the default result, but if you select to view all the match types you are now shown:

  • Broad Match $0.00
  • Phrase Match $8.09
  • Exact Match $10.39

The $7.03 does not appear to be an average of the three match types. To confirm this I picked the first four related keywords, which showed a variance of between 92% and 114%. It would be interesting to know where exactly the default average CPC rate data comes from, if not from averages. These small variations in the data make it almost impossible to accurately compare Google’s AdWords data at any scale, so you have to take their average CPC data at face value. If you refine it too much for low traffic terms you will get little or no data, as you can see from the broad match result.
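The averaging check described above can be reproduced with a few lines of Python. This is only a sketch using the three match-type figures quoted for wynn las vegas; the variable names are made up for illustration, and treating "variance" as the default figure divided by the simple mean is an assumption, not something the tool documents.

```python
# Estimated average CPC values quoted above for "wynn las vegas".
default_cpc = 7.03
match_type_cpc = {"broad": 0.00, "phrase": 8.09, "exact": 10.39}


def mean(values):
    """Plain arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


# Mean across all three match types, including the $0.00 broad result.
mean_all = mean(match_type_cpc.values())

# Mean ignoring match types that returned no data ($0.00).
mean_nonzero = mean(v for v in match_type_cpc.values() if v > 0)

# Assumed definition: default estimate as a percentage of the three-way mean.
default_vs_mean_pct = default_cpc / mean_all * 100

print(f"Mean of all three match types: ${mean_all:.2f}")
print(f"Mean ignoring the $0.00 result: ${mean_nonzero:.2f}")
print(f"Default as % of three-way mean: {default_vs_mean_pct:.0f}%")
```

Neither simple mean ($6.16 with the $0.00 broad result, $9.24 without it) matches the $7.03 default, which is the point being made: the default figure is not a straightforward average of the three match types.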

Adwords keywords

Actual Average CPC vs KWT Estimated Average CPC

I had to remove 13 keywords from the graph data shown below as they skewed it beyond readability. These keywords recorded average CPC differences ranging from 1420% to 13240%, far higher than the average variance of 826%.
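For anyone repeating this kind of comparison, the outlier step might look something like the sketch below. It assumes "variance" means the estimated CPC expressed as a percentage of the actual CPC, which is my reading rather than a stated definition. The keyword rows are hypothetical placeholders, not the campaign data, and the function names are invented for the example. The 1420% cut-off is simply the lowest excluded value mentioned above.

```python
def cpc_variance_pct(estimated_cpc, actual_cpc):
    """Estimated CPC as a percentage of actual CPC (assumed definition)."""
    if actual_cpc <= 0:
        raise ValueError("actual CPC must be positive")
    return estimated_cpc / actual_cpc * 100


def split_outliers(rows, threshold_pct=1420):
    """Separate rows whose variance is at or above the threshold.

    rows: iterable of (keyword, estimated_cpc, actual_cpc) tuples.
    Returns (kept, removed), each a list of (keyword, variance_pct).
    """
    kept, removed = [], []
    for keyword, estimated, actual in rows:
        pct = round(cpc_variance_pct(estimated, actual))
        target = removed if pct >= threshold_pct else kept
        target.append((keyword, pct))
    return kept, removed


# Hypothetical rows for illustration only: (keyword, estimated, actual).
sample = [
    ("keyword a", 1.20, 1.00),
    ("keyword b", 4.50, 0.50),
    ("keyword c", 8.00, 0.10),
]

kept, removed = split_outliers(sample)
print("Charted:", kept)
print("Excluded as outliers:", removed)
```

Keeping the excluded rows in a separate list, rather than dropping them silently, makes it easy to report how many keywords were removed and how extreme they were.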

The variance between the estimated average CPC amounts was large enough to cause headaches and could cause campaign estimates to be scaled back dramatically.

Estimated Average CPC Logged-in vs External Access

I uncovered another interesting point: the variances did not seem to correlate easily with any other data points such as actual campaign clicks, impressions or actual campaign CPC amounts. This inability to cross-check estimates when planning an AdWords campaign should concern marketers who rely purely on the Google KWT, as it shows there is a fairly large variance.

Lastly, I discovered that you can’t assume that just because a term has a high number of clicks or impressions, the Google AdWords Keyword Tool estimate for it will be more accurate. The only upside is that the higher the estimated average CPC rate, the more consistent the KWT estimate was between external and logged-in data.

There seemed to be a sweet spot where if the estimated average CPC rate was around $1.20 I found it to be fairly accurate. Below $1, the estimated average CPC rate was usually incorrect. Over $1.20, my data showed that Google’s estimated average CPC rates are far higher than the actual average.
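The rule of thumb above can be written down as a small helper for sanity-checking a keyword list. It is a rough sketch of my observations, not a model of how Google calculates anything. The function name, the $0.10 tolerance around $1.20, and the handling of the gap between $1.00 and the sweet spot are all assumptions added for illustration.

```python
def expected_reliability(estimated_cpc, sweet_spot=1.20, tolerance=0.10):
    """Classify a KWT estimated average CPC using the rule of thumb above.

    sweet_spot comes from the observation in the post; tolerance is an
    assumed width for "around $1.20" and should be tuned on your own data.
    """
    if estimated_cpc < 1.00:
        return "usually incorrect"
    if abs(estimated_cpc - sweet_spot) <= tolerance:
        return "fairly accurate"
    if estimated_cpc > sweet_spot + tolerance:
        return "likely overstated"
    # Between $1.00 and the sweet-spot band the post makes no claim.
    return "unverified"


for cpc in (0.80, 1.05, 1.25, 3.00):
    print(f"${cpc:.2f}: {expected_reliability(cpc)}")
```

With a different campaign or vertical the thresholds may well move, so treat the defaults as a starting point to test rather than fixed values.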

My conclusion: It’s important that you occasionally cross-check Google’s Keyword Tool estimates against real data from Google Analytics or existing AdWords campaigns data, especially if you are using it to form a business case for a new budget in 2011 for your PPC campaigns.

About the data: Because of the effort involved in using a larger set of keywords, there was some contamination of data, and the primary match type selected on the AdWords keyword tool was set to “broad” match. This means you need to run your own tests to explore in more detail, but you can assume that you will get a similar variance between Keyword Tool data and actual campaigns. If you are using KWT data for a proposal, audit or presentation, you should disclose this source of information to cover any variances. The bulk of the AdWords campaign data used in this analysis had a fairly high quality score of 6 or higher, which does explain why the actual CPC rates are lower and the variance is greater, but even when the quality score was below 4 the trends seemed fairly consistent.

David Iwanow

SEO Product Manager at Marktplaats
David is now located in Amsterdam, Netherlands, working with the eBay Classifieds Group. He was previously the marketing director of The Lost Agency, a web analytics focused search agency. His rants, interviews, research and thoughts on digital marketing can be found on his blog Lost Press Marketing.
  • John

    Good post. We need to examine all the tools available to us.

    SEO Expert and Marketing Strategist

    • David Iwanow

      Thanks John, this might be something WordTracker, SEMRush or AdGooroo might be able to confirm based on their datasets.

  • CheetahDeals Blog

    Wow that’s a great post. Sample size is eh…, but I believe the general idea is shown pretty clearly. I’ve never “trusted” the KWT, even as I use it frequently.

    • David Iwanow

      Thank you, yes the sample size is always a problem but the fact that a chunk of it had to be manually matched did cause a bit of a headache.

  • goodnewscowboy

    Really nice work David. Statistical purity aside, it definitely shows a marked difference in the data that Google provides us. It reinforces that while we can use the Google keyword tool for general direction, we can’t depend upon it for accurate numbers.

    • David Iwanow

      Thank you, yes, let’s say statistics is the course/area of focus for 2011 for self improvement. I would love to hear about other data points where people have noticed that Google seems to cripple the data to make it less than accurate.

  • YZ

    Hi David,

    Thanks for sharing this. As many in the industry have found, the estimates out of the Google Keyword Tool vary greatly. It is great that you have been able to identify a sweet spot of close matching and verify to some extent the differences in the data. I was just reading a discussion at the AdWords API forums, where members were discussing the differences between the API and the web tool for stats on keywords.

    A moderator mentioned that using the same match type modifiers in both the API request and the web application request at the Google Keyword Tool would result in the same data outputs. Obviously that is a different comparison from an estimated result set versus true paid-for delivery data, but interesting nonetheless.



    • David Iwanow

      Hello Damien,

      Thanks for the detailed comment, I had seen a few of those threads over the past year but shrugged it off until I had noticed some stranger results in some recent research for new projects.

      Actually, it would be interesting to see how the same data fetched via the API might vary again, as just because it didn’t impact the output last month doesn’t mean it is the same this month. This issue could potentially affect the results of SEO toolsets that calculate both ROI and cost based on average CPC using the AdWords data.

      Statistically it would have been great to both include much more data and be able to confirm that sweet spot occurs consistently across both generic and brand terms and even via vertical.

      I assume that some of the variance might be due to what time of the year the search was conducted, but I did notice that a number of keyword bid estimates did not change over the 5 weeks after the data was first gathered. So the question is, after what time period would your AdWords data no longer be valid? Does it need to be refreshed/updated weekly, monthly or just yearly?

  • Miroslav Varga

    Hi David,

    thanks for your interesting post but there are 3 things I would point out:

    1. What was the impression share lost in the real campaign? The loss of data can influence the difference!

    2. Why did you use linear correlation when from the graph it’s obvious that for the estimated curve there is a parabolic or hyperbolic correlation?

    3. The difference of the linear correlation is (as I could figure out from your diagram):
    For estimate correlation: y = -0.374x + 2,7$
    For actual correlation: Y = -0.204X + 3,7$
    In the interval in question, the actual data have a statistical difference of +6$ to -1,8$ More than 1:3
    The correlation factors have a difference of 1:1,5. Therefore I think that the Keyword Tool gave you quite accurate results. If you try the hyperbolic correlation the difference is even lower – almost 10%.

    Have a nice 2011!

    • David Iwanow

      Hello Miroslav,

      Sorry for the delayed response, and thank you for the crazy cool analysis of the data. I really wanted to do a better analysis, but I had already spent a lot of time looking at the data. It started as one post, but the results were not really significant and were not something you could put into practice.

      The impression share lost in the real campaign for one was around 15-25% and the other might have been slightly higher, around 35-45%. It’s very hard to trace that data back to keyword level as impression share is only available at an ad group level.

      There were some other interesting factors I saw in this data: estimates for 1st Page bids are also not consistent between real/actual and login/public data.

      I didn’t really look at many correlations other than linear as it’s mostly the accepted standard, but using a platform other than Excel, such as SPSS, allows for this.

      Even with a 10% variance that is still enough to know that you need to include a disclosure of +/- 10% of budget/costs in any budget plans or proposals. With more data it might be possible to reduce that statistical difference but it’s still a worrying concern at 10% for something most people take for granted.

      Happy new year to you also

  • Dejan Petrovic

    Brilliant post David!

  • Anonymous

    Great post David. You may have already done this but was there any variance in the data when logging into different AdWords accounts?

    • David Iwanow

      Technically no as 3 different accounts were used but each account was used for the same group of keywords.

      I tried to limit any external influences if possible, so if the AdWords account was running keywords on “tennis shoes”, I used that same account login to do research in the KWT on “tennis shoes”.

      I assume based on this limited test that there might be a variance across different logins but the theory was that if I was using an account that already had access to that data then Google should show a more accurate estimate.

    • SEO

      I applaud your honesty!

    • Vman

      So what is the good keyword research SEO tool that you use?

