This is part 2 of 2 in a series of articles where I am laying out a vision to bring more objectivity to SEO, by applying a search engine engineer’s perspective to the field. If you haven’t yet, please read the first article in this series, where I present a step-by-step way to statistically model any search engine.
In this installment, I will show you how to use your new search engine model to uncover exciting opportunities that have never been possible before, including accurately predicting when you will pass your competition (or when they will pass you!), precisely forecasting your traffic and revenue months before it happens, and even efficiently budgeting between SEO and Pay-Per-Click (PPC) by solving a variant of the “Traveling Salesman Problem”.
Full Disclosure: I am the Co-Founder and CTO of MarketBrew, a company that develops and hosts a SaaS-based commercial search engine model.
So What? What Can I Do with This Thing?
For an outsider to the SEO industry, or even a conventional CMO who hasn’t been exposed to newer technical SEO approaches, this is certainly not a stupid question. In part 1 of this series, you learned how to build a search engine model and self-calibrate it to any search engine environment.
So now what? How does one benefit from having this technical capability?
Well, it turns out that since you have a model, you now have the ability to “machine learn” each local search engine environment’s quirks (and fluctuations in algorithmic weightings). This can really help you predict what will happen in that environment, even when we make the tiniest of changes. We call this statistical inference.
This statistical inference gives you both a tremendous financial and operational advantage, when it comes to SEO (and even budgeting and targeting Pay-Per-Click, as you will learn shortly).
Here are just some questions this approach can solve:
- How much optimization do I have to do, to pass my competitor for a specific keyword?
- Will my competitor pass me in ranking for a specific keyword, and if so, when?
- Which specific optimizations will cause the biggest shift in revenue? I’ve got 1,000 top keywords and a hefty number (1M+) of web pages, so how do I solve for this? And which optimizations should I avoid because they are pitfalls?
- Which keywords should I be targeting with Pay-Per-Click (PPC) and which should be targeted using SEO? Are there keywords where I should absolutely be targeting both?
Developing a Real-Time 60-Day Forecast of Your Traffic (And Revenue)
First things first: the most important benefit of this approach is that it can accurately predict search results months ahead of traditional ranking systems, because those systems are based on months-old scoring data from their respective search engines.
Remember: our model, once calibrated, is decoupled from the live search engine. You can run changes through it and shortcut all of the lengthy processing and query-layer noise that you are subjected to when using standard ranking platforms.
For example, on a standard ranking platform, rankings are reported as they show up on the search engine, naturally. Some ranking platforms will tell you that they update every week (or even every day!) but all that is measuring is how synchronized they are with Google or whatever search engine. However, that search engine data and its many generations of scoring are inherently months old — having been through the process of crawling, indexing, scoring, and shuffling with thousands of other parallel calculations to other web pages.
With your own search engine model, you can easily simulate changes, instantly. Need to test out a change and see what will happen to your traffic and/or revenue once the search engine has had the time to process and integrate this new data into its search results? No problem, fire up the model, test, repeat, test, confirm, and push to production.
The Perfect Set of Moves Is Right There. You Just Can’t See It, Yet.
Those heading up their Fortune 1000 SEO division are responsible for trillions of potential combinations of keyword / web page optimizations that, if made in the right order, will determine their future career success (or failure). No wonder SEO carries such personal emotions for so many of its practitioners.
“In chess, as it is played by masters, chance is practically eliminated.” — Emanuel Lasker
However, with a highly correlated statistical model of their target search engine environment, the series of “right moves” becomes a matter of execution rather than mystery. Much more control = much more sleep for SEOs and their managers.
To accurately and precisely determine WHERE to start optimizing, we must solve a very big computer science problem called The Traveling Salesman Problem. The Traveling Salesman Problem asks: Given a list of cities and the distances between each pair of cities, what is the shortest possible route that visits each city exactly once and returns to the origin city?
Remember, you cannot simply point out one keyword/web page combination that might benefit from a particular isolated optimization. Why? Because it could actually have an overall negative effect on your traffic/revenue! Instead, each pairing and combination of optimizations must be solved with respect to each other. This is very similar to The Traveling Salesman Problem, with a couple of twists.
In our case, the “cities” are optimizations or “changes to the website”, and the “distances” between those “cities” are represented by the amount of effort it takes to implement those optimizations.
We now have a well-defined computer science problem, and a straightforward analogy for creating our inputs. How do we determine the “effort” it takes to carry out each proposed optimization? With your new search engine model, it’s actually quite simple.
First, we need to measure each optimization’s benefit and cost. Those ROI calculations, or “roads between cities,” give us the “cost” of implementing each optimization.
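Under this framing, the path search itself can be sketched with the classic nearest-neighbor heuristic. This is only a minimal illustration: the task names and effort values below are hypothetical, and a production solver would use a stronger heuristic than greedy nearest-neighbor.

```python
# Sketch: ordering optimization "cities" with a nearest-neighbor heuristic.
# Task names and effort values are hypothetical illustrations.

def greedy_path(tasks, effort):
    """Order tasks greedily, always taking the lowest-effort next step.

    tasks:  list of task identifiers (the "cities")
    effort: dict mapping (task_a, task_b) -> implementation effort (the "distance")
    """
    remaining = set(tasks)
    path = [tasks[0]]            # start at the first task
    remaining.remove(tasks[0])
    while remaining:
        current = path[-1]
        # pick the cheapest next task from wherever we are now
        nxt = min(remaining, key=lambda t: effort[(current, t)])
        path.append(nxt)
        remaining.remove(nxt)
    return path

tasks = ["fix-titles", "add-schema", "internal-links"]
effort = {
    ("fix-titles", "add-schema"): 2, ("fix-titles", "internal-links"): 5,
    ("add-schema", "fix-titles"): 2, ("add-schema", "internal-links"): 1,
    ("internal-links", "fix-titles"): 5, ("internal-links", "add-schema"): 1,
}
print(greedy_path(tasks, effort))  # a low-effort ordering of the tasks
```

Greedy nearest-neighbor will not always find the globally optimal tour, but it makes the “cities and distances” framing concrete.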
Creating the ROI Model
Before we determine what are the “right” optimizations, we first must define what is “right” — in this instance, we are going to be defining success by either traffic, revenue, or both.
To find the highest-ROI moves, we will need to measure traffic/revenue increase versus the amount of optimization needed (cost of optimization). This turns out to be a perfect problem for our new search engine model.
To measure this, the search engine model must:
- Determine ranking distances of each search term, including the overall query score and all individual scores that go into the overall query score.
- Simulate all possible ranking changes for every web page / keyword combination.
- Order simulations by most efficient (i.e., the fewest changes for the most gain in reach).
- Recommend best path of optimizations / tasks, based on user preferences, taking into account the classical Traveling Salesman Problem.
Step 1: Determine Ranking Distances
Here’s where our transparent search engine model leaps into action. Not only can you see the order of the search results for a particular query, but you can also see the distances between those results, and how those distances were calculated.
Knowing the distances between search results, in terms of query score, is an important step towards automating SEO. We can now define the cost / work that each optimization requires, and begin to simulate all possible moves with great precision.
Step 2: Simulate All Possible Ranking Changes
Once your search engine model has established a highly correlated environment, it can begin to simulate, for each web page and query, all the possible optimizations and subsequent ranking changes that can be made.
Within each simulation, we first determine how much efficiency there is within each ranking change. To determine efficiency, we must create a metric, for each simulated ranking change, that allows us to see how much a particular web page stands to gain in terms of reach versus how much it must increase its query score (i.e. cost) to gain that additional reach.
Reach Potential (RP)
Given the estimated search volume K for a query, the click-through rate CTRw for the position occupied by web page w, and the query score QSw for the current web page w, the Reach Potential (RP) can be represented by:
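One plausible formulation (the original presents this as an image), assuming x is the current position, t is the simulated target position, and QSt is the query score required to hold position t:

```latex
% Reach gained by moving from position x to position t,
% per unit of query score that must be gained to get there.
RP_w = \frac{K \cdot (CTR_t - CTR_x)}{QS_t - QS_w}
```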
We must also take into account any negative side effects of this ranking change on the overall traffic to a website. To do this, we must create a metric called “Potential Reach Loss”.
Potential Reach Loss (PRL)
Given the estimated search volume K for a query and the click-through rate CTRw for the position occupied by web page w, we remove web page w from its current position x and rely on the next best web page on the website to rank. With x as the current position and t as the new (lower) position, the Potential Reach Loss (PRL) can be represented by:
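One plausible formulation (the original presents this as an image): the reach captured at position x, minus the reach recovered by the next best page at position t:

```latex
% Reach lost if web page w is displaced from position x and the site's
% next best page ranks at the lower position t.
PRL_w = K \cdot (CTR_x - CTR_t)
```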
Expected Reach Potential (ERP)
Once we have calculated the Reach Potential (RP) and Potential Reach Loss (PRL), we can combine these two metrics and all of the simulations, for a specific query, into a statistical expected value.
Given the estimated search volume K for a query, the reach potential RP and potential reach loss PRL of the web page we are simulating, and the total number of simulations S of all possible ranking changes, the Expected Reach Potential (ERP) can be described as:
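One plausible formulation (the original presents this as an image), averaging the net gain of each simulated change over all S simulations:

```latex
% Expected net reach gain across all S simulated ranking changes.
ERP = \frac{1}{S} \sum_{i=1}^{S} \left( RP_i - PRL_i \right)
```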
This gives us a metric that we can then order by, giving us the most efficient and least risky ranking changes for a given website.
Step 3: Order Simulations by Most Efficient
Next, we simply order all possible ranking simulations by Expected Reach Potential (ERP). For simplicity’s sake, I will call this the “Optimization Score”.
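As a minimal sketch, the ordering itself is a simple descending sort on ERP. The simulation records below are hypothetical stand-ins for the model’s output.

```python
# Sketch: ranking simulated changes by Expected Reach Potential
# (the "Optimization Score"). These records are hypothetical.
simulations = [
    {"page": "/pricing", "keyword": "crm software", "erp": 420.0},
    {"page": "/blog/guide", "keyword": "what is crm", "erp": 1180.0},
    {"page": "/features", "keyword": "crm tools", "erp": 75.5},
]

# Highest Optimization Score first: most efficient, least risky changes on top.
by_score = sorted(simulations, key=lambda s: s["erp"], reverse=True)

for sim in by_score:
    print(f'{sim["erp"]:>8.1f}  {sim["page"]}  ({sim["keyword"]})')
```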
Here, each row represents a potential ranking change, due to a precise set of optimizations that can be applied for that web page/keyword query combination.
Step 4: Recommend Best Path of Optimizations
Once your search engine model has identified and ordered all possible ranking simulations, it can begin to simulate all possible optimization paths (one particular path is outlined in red in the screenshot above). Again, this is similar to the Traveling Salesman Problem, where the “cities” are represented by the different optimization simulations, and the “distance” is represented by the Optimization Score.
Once the Traveling Salesman Problem is solved, your model will have a list of the best “path” of optimizations to make. In other words, which “set” of optimizations to make, and in which order.
But that only gets us so far. We also need to know WHAT to optimize; some components might actually work AGAINST this simulation becoming reality. To find out, we must first break down each optimization simulation into individual tasks.
How do we know which types of optimizations will make our simulation turn into reality the quickest? We must first know each algorithmic sub-score behind the model’s overall query scoring algorithm.
Query Score Breakdown
Because your search engine model is fully transparent (you can drill down all the way to the scoring of individual links), the query score itself can be broken down into further algorithmic sub-scores that lead to a precise map of which type of optimization to start focusing on.
For instance, you might imagine something like this: your website is currently ranking second in the model, but it is handily beating the number one position in semantic score. Knowing that you should focus on ranking power instead of semantic markup changes the optimization score. Why? Remember, the optimization score was defined by what and how much work there was to do. That is directly related to what specific tasks you will be required to implement.
By breaking down the sub-scores, we can better identify and score each task, which leads us to finer-grained control over how the optimal path is executed. It also gives us the ability to personalize this process even more, since we can define which types of tasks are more costly to your organization.
You can now clearly see that in this particular optimization simulation, the model told us that we should focus both on semantic (content) and link flow boost (ranking power). But it also told us exactly how much each to focus on for each component.
The model also contains exactly what needs to be done to carry out each part, because we have the query sub-scores.
Semantic Score Breakdown
For the semantic scoring, we can dig deeper to find what aspects should be optimized first.
Link Flow Boost Breakdown
The same can be done for optimizing the ranking power. The search engine model will have modeled the core family of algorithms (outlined in my first installment of this series), and therefore it will know exactly how much additional ranking power can be gained by eliminating those inefficiencies.
Your search engine model will then output a combination of selected optimization tasks that, combined together, represent the best possible set of optimizations and subsequent ranking changes for a given website.
Remember our Reach Potential calculation that determined our Optimization Score? One could imagine a tremendous amount of personalization that could be integrated into this basic equation. For instance, these sets of optimizations could incorporate conversion metrics and revenue figures for specific combinations of keywords and web pages. To do this, we could personalize the relative weighting K of all keywords, as well as the click-through or conversion rates CTRw of specific web pages.
We could also specify which types of optimization tasks are available within a given amount of resources. Let’s say your team doesn’t really do off-page SEO, and you’d rather focus on on-page SEO and the high-ROI tasks associated with those actions. You could adjust your model to weight those tasks more heavily, revealing a high-ROI list of on-page tasks only.
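A minimal sketch of both personalizations together: weighting each simulation’s expected reach by per-keyword revenue, and filtering to the task types the team can execute. All field names, keywords, and dollar figures here are hypothetical.

```python
# Sketch: personalizing the Optimization Score with revenue weighting and
# task-type filtering. Field names and figures are hypothetical.

def personalized_score(sim, revenue_per_visit, allowed_types=("on-page",)):
    """Score a simulation as expected visits times revenue per visit,
    zeroing out task types the team has chosen not to do."""
    if sim["task_type"] not in allowed_types:
        return 0.0                       # team doesn't do this kind of work
    value = revenue_per_visit.get(sim["keyword"], 0.0)
    return sim["erp"] * value            # expected visits * revenue per visit

revenue_per_visit = {"crm software": 3.20, "what is crm": 0.40}
sims = [
    {"keyword": "crm software", "erp": 420.0, "task_type": "on-page"},
    {"keyword": "crm software", "erp": 900.0, "task_type": "off-page"},
    {"keyword": "what is crm", "erp": 1180.0, "task_type": "on-page"},
]

ranked = sorted(sims, key=lambda s: personalized_score(s, revenue_per_visit),
                reverse=True)
print(ranked[0]["keyword"])
```

Note how the off-page simulation with the largest raw ERP drops to the bottom once the team’s constraints are applied: this is the “high-ROI list of on-page tasks only” in miniature.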
Determining Where Keywords Go: SEO, PPC, or Both?
We can also sort the optimization score the other way. This gives us a clear indicator of your worst optimization moves. You’d be surprised how often a seemingly obvious ranking situation calls for you to back away…slowly…and put down your keyword and mouse!
You know the scenario: you have been ranking number two for the last year for some query phrase, and you know if you can just get to the number one spot, huge gains will be made. But little did you know, it was fool’s gold. THAT particular optimization, if implemented, would, in reality, take tremendous amounts of ranking power improvements, at the expense of all of your other optimizations. The model will tell you these scenarios, and allow you to safely course-correct.
With a search engine model, you now know it’s better to leave those keywords with the PPC team, as the search engine model is showing large amounts of work for little respective gain.
In this way, we are using the organic search engine model to determine the budget split between SEO and PPC. The keywords that are cost-prohibitive in SEO are targeted on other channels, including PPC.
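As a rough sketch of that budget split, each keyword can be routed by comparing the model’s effort against its projected reach gain. The threshold, function name, and keyword data below are all hypothetical; a real implementation would use the ERP and cost figures from the model directly.

```python
# Sketch: routing keywords to SEO or PPC by the model's effort-vs-gain ratio.
# Threshold and example data are hypothetical.

def channel_for(keyword, reach_gain, effort, max_cost_per_reach=0.5):
    """Send a keyword to PPC when organic gains are too expensive."""
    cost_per_reach = effort / reach_gain if reach_gain else float("inf")
    return "SEO" if cost_per_reach <= max_cost_per_reach else "PPC"

print(channel_for("crm software", reach_gain=1200, effort=300))  # cheap organic gain
print(channel_for("best crm", reach_gain=80, effort=400))        # cost-prohibitive in SEO
```

Keywords that clear the threshold stay with the SEO team; the rest are handed to PPC, giving the budget split a defensible, model-driven basis.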
As you can see, we are just scratching the surface of what can be done with a calibrated search engine model. To recap, we can use this statistical search engine model to:
- Provide answers, months in advance of current technology
- Precisely identify WHERE to optimize, and WHAT type of optimization to make
- Couple specific optimization tasks directly to revenue change
- Plan out a budget between SEO and PPC
This is certainly not an exhaustive list. And this statistical approach has the ability to dramatically streamline the day-to-day work of an entire marketing division, not just the individual that uses it.
In addition, because of the personalization to our model, we can now forecast revenue as a part of each small optimization that is made — a boon to CMOs and SEO Managers alike.
It continues to be my opinion that organic search should be treated like any other marketing channel – a clearly defined risk versus reward environment that gives CMOs useful metrics that help them determine the viability of their SEO campaigns.