Visit our Marketing Nerds archive to listen to other Marketing Nerds podcasts!
In this week’s episode of Marketing Nerds, we are joined by Eric Siegel, founder of Predictive Analytics World and author of the award-winning book Predictive Analytics, to take a deeper look at what Predictive Analytics really is and how businesses can use it to make important decisions and improve success.
Here are a few transcribed excerpts from our discussion, but make sure to listen to the podcast to hear everything:
What is Predictive Analytics?
It’s learning from data to make predictions. In the case of marketing, it’s predicting response, predicting churn, or any other consumer behavior that would help target marketing efforts. It’s a direct way to decide, per individual prospect or customer, whether to contact them, which offer to contact them with, whether to extend a retention offer, and so on. Predictive analytics is quite straightforward and specific to define, which stands in stark contrast to the more general hype we hear about data science and big data. Both of those umbrella terms refer to a culture of doing smart things with data but don’t actually allude to any specific method or technology.
Outside of Marketing, how is Predictive Analytics being utilized?
The most widely publicized use in government is fraud detection, which closely parallels the law enforcement angle. In law enforcement, we’re hearing of more and more police precincts around the US doing predictive policing, and we have every reason to surmise that the National Security Agency also uses predictive analytics to triage the hunt for terrorism suspects. In all those cases (fraud, law enforcement, terrorism) you’re trying to find needles in the haystack. You’re trying to find the anomalous individuals, the unusual cases, that are actual perpetrators or connected to crime in some way, the individuals that are worth spending time on.
Just like in marketing, it’s the same core technology and the same value proposition. It’s about triaging and targeting resources en masse over millions of cases. Is it worth spending $2 contacting this customer in marketing? Is it worth expending the cost of a retention discount to try to keep this customer because we believe they’re at risk of leaving? Is it worth expending the time of an auditor to consider this individual transaction that may be more likely than average to be fraudulent? Is this individual likely to commit a crime again, and should we therefore keep them in prison longer? That last application is called recidivism prediction, and these predictive models actually help determine how long people stay in prison.
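To make the "is it worth $2?" question concrete, here is a minimal Python sketch of the arithmetic behind it: contact a customer only when the predicted response probability times the profit from a response exceeds the cost of the contact. The $2 cost comes from the discussion above; the function name, the profit figure, and the probabilities are hypothetical values chosen for illustration.

```python
def worth_contacting(response_probability: float,
                     profit_if_response: float,
                     contact_cost: float = 2.0) -> bool:
    """Return True when the expected gain from contacting exceeds the cost.

    contact_cost defaults to the $2 figure from the interview;
    profit_if_response is a hypothetical per-response profit.
    """
    expected_gain = response_probability * profit_if_response
    return expected_gain > contact_cost


# Hypothetical scores from a predictive model, with an assumed $50 profit.
likely_responder = worth_contacting(0.10, 50.0)    # expected gain $5.00
unlikely_responder = worth_contacting(0.01, 50.0)  # expected gain $0.50
```

The same comparison applies to the other cases mentioned: swap the brochure cost for the cost of a retention discount or an auditor's time.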
What about False Positives when using Predictive Analytics to make important decisions?
I mean, you’re asking the most important question and, yeah, there is a catch system. In the case of law enforcement and even fraud detection, it’s not decision automation. It’s decision support. A judge making a sentencing decision or a parole board deciding whether to release an inmate is not depending on the machine. They’re using the machine’s input. The output of the predictive model—the probability score that the machine derives—they’re using that as one of whatever ad hoc considerations they already used, even before technology existed, to render a judgment about the individual.
On the other hand, despite the fact that it’s just to support a human decision, there of course will naturally be cases where the machine’s output that’s taken into consideration is a determining factor and, for some of those cases, the machine will be wrong. On the one hand, you know, scientifically, we’re potentially improving accuracy on average of these human judgments, but at the same time, by virtue of bringing that into the system, that means machines can now commit injustice.
What about using Predictive Analytics to automate everyday marketing tasks?
Definitely, automation is okay. The ethical stakes aren’t nearly as high. A false positive means that you just wasted $2 sending a brochure. That’s what you’re already doing in mass marketing. It’s a numbers game. False positives are the name of the game (not literally), and in marketing, decision automation is the standard way predictive models are used, because you have a list of a million customers or prospects and you’re just making a million yes/no decisions: to contact or not to contact, for example. That is the question, like Hamlet in Shakespeare. Whether it’s deciding to contact with an acquisition or sales offer or a brochure, or to extend a retention offer, these decisions are fully automated.
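A rough Python sketch of what that automation can look like: every customer with a score at or above a cutoff gets contacted, and each false positive simply costs one wasted mailing. The $2 cost is from the discussion; the customer ids, scores, cutoff, and responder set are invented for illustration.

```python
CONTACT_COST = 2.0  # dollars per brochure, the figure quoted in the interview


def decide_contacts(scores: dict, threshold: float) -> list:
    """Automated yes/no decision: contact everyone scoring at or above threshold."""
    return [customer_id for customer_id, score in scores.items()
            if score >= threshold]


def wasted_spend(contacted: list, responders: set) -> float:
    """Cost of false positives: contacted customers who did not respond."""
    false_positives = [c for c in contacted if c not in responders]
    return len(false_positives) * CONTACT_COST


# Hypothetical model scores and, after the campaign, the actual responders.
scores = {"c1": 0.30, "c2": 0.05, "c3": 0.22, "c4": 0.01}
contacted = decide_contacts(scores, threshold=0.20)
wasted = wasted_spend(contacted, responders={"c1"})
```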
Now, on the other hand, in the marketing arena, there are places where ethics comes back. The most poignant and well-known example is when retailer Target predicted which individual customers were pregnant for marketing-related targeting.
Netflix itself predicts which movie you’re likely to rate highly, in order to determine the recommendations of movies. They enhanced that capability with a public contest where they released a large amount of data, allowed anybody who wanted to try to make a better predictive model to participate, and they awarded a $1 million prize. This was several years ago.
It turned out that, even though the publicly released data was cleansed and didn’t include the name or identifier of the individual movie-watcher, there was a way, for some records, to compare them against what people had rated on the Internet Movie Database and determine, “Oh, well, this belongs to John Smith,” and then ascertain things John Smith would not necessarily want made public, such as the full suite of movies that he had viewed and rated.
So how do I turn my data into Predictive Analytics?
There are a lot of software solutions that embody what is a fairly standardized, public set of methods. The methods are things like decision trees, neural networks, and log-linear regression. Log-linear regression, for example, has been used for targeting direct mail for decades. We now refer to that as one example of predictive analytics, where predictive analytics is the broader class of using that type of technology across all different kinds of sectors, private and public. But yeah, you have to get your data together and then use some software tool.
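As a rough illustration of regression-based scoring, the following Python sketch fits a small logistic regression by gradient descent using only the standard library. Logistic regression is an assumption here about the family of method being described, and the two features, the six historical cases, and the training settings are all invented for illustration. In practice you would rely on an established analytics tool rather than hand-written code.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(weights, bias, x):
    """Predicted probability of a positive outcome for feature vector x."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


def train_logistic(rows, labels, lr=0.3, epochs=5000):
    """Fit weights and bias by batch gradient descent on the average log loss."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            error = predict_proba(weights, bias, x) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Hypothetical history: [responded to a previous mailing (0/1), purchases last year]
history = [[1, 5], [1, 3], [0, 0], [0, 1], [1, 4], [0, 2]]
responded = [1, 1, 0, 0, 1, 0]

weights, bias = train_logistic(history, responded)
likely = predict_proba(weights, bias, [1, 4])    # resembles past responders
unlikely = predict_proba(weights, bias, [0, 1])  # resembles past non-responders
```

The output of `predict_proba` is the kind of per-customer score that a targeting decision would then use.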
There’s a very popular free software tool called “R”, there are a lot of other less popular free tools, and then there’s a large and constantly increasing number of analytics software vendors that provide tools. But in fact, the most daunting and time-consuming part of a project is getting your data prepared so that those software tools can use it. What that really means is taking the data you already have today, which is extremely valuable because it’s predictive. The big data hype really is big excitement because this data is super predictive. It has predictive patterns; you just need to derive them.
Whatever the form of your data today, across tables and siloed databases and whatever, you need to pull it together so that you’ve essentially got a row of data for each individual historical case. You might have a row of data that corresponds to a customer: everything you know about the customer, and then whether they did or did not respond to the last marketing offer, for example. That’s something you would have liked to predict at some point in the past, but now you don’t need to predict it; it already happened, and it’s an example from which to learn. It’s a learning case. So you need to pull the data together, clean it up, and get it into a meaningful form that represents those cases. Usually it’s one row per example. It’s a simple concept, but that doesn’t mean it’s easy to get your data cleanly into that format. That ends up being 80% of the technical project.
The actual rocket science—the core analytics process of learning from that data—although it needs expertise to use that type of software tool usually, is not nearly as intense or as important a part of the project as getting the data ready for it in the first place.
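The "one row per example" idea can be sketched in a few lines of Python. This is only an illustration under assumed inputs: the three sources (a customer table, an orders table, and a record of who responded to the last offer) and all field names are hypothetical stand-ins for whatever siloed data a business actually has.

```python
from collections import defaultdict

# Hypothetical siloed sources.
customers = [
    {"customer_id": 1, "tenure_months": 24},
    {"customer_id": 2, "tenure_months": 3},
    {"customer_id": 3, "tenure_months": 12},
]
orders = [
    {"customer_id": 1, "amount": 40.0},
    {"customer_id": 1, "amount": 25.0},
    {"customer_id": 2, "amount": 10.0},
]
# Known outcomes of the last marketing offer (customer 3 was never contacted).
campaign_responses = {1: True, 2: False}


def build_training_rows(customers, orders, responses):
    """Flatten the sources into one row per customer with a known outcome."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for order in orders:
        totals[order["customer_id"]] += order["amount"]
        counts[order["customer_id"]] += 1

    rows = []
    for customer in customers:
        cid = customer["customer_id"]
        if cid not in responses:
            continue  # no recorded outcome, so this is not a learning case
        rows.append({
            "customer_id": cid,
            "tenure_months": customer["tenure_months"],
            "order_count": counts[cid],
            "total_spend": totals[cid],
            "responded": int(responses[cid]),  # the thing to be predicted
        })
    return rows


training_rows = build_training_rows(customers, orders, campaign_responses)
```

Each resulting row holds what is known about one customer plus the outcome that already happened, which is the format a modeling tool expects.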
Can I get Predictive Analytics out of Google Analytics data?
Yes, absolutely, and pulling things out of web analytics solutions and merging them with other data is common practice, especially if the visitor has authenticated. In general, whatever you’re trying to predict, whether it’s “this customer’s going to cancel” or “they will respond to this offer,” there are so many different types of demographic and behavioral data, including what they did on your website, that could tip the balance and inform those odds. You want to pull together whatever might be available and let the system, the analytics procedure, determine what’s more predictive and how to use those different elements or variables.
For example, a major telecommunications company in the United States was trying to predict who was going to cancel their cell phone subscription. It turned out that if you were observed going on the website, logging in, and checking when your current contractual obligation is up for renewal, that’s a big indicator that you’re more likely to cancel, because you’re thinking, “Oh, when am I free to cancel?” That’s just one little example. There are so many things that you do online. So yes, you do want to pull in that data if possible.
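Here is a minimal Python sketch of merging web behavior into customer rows, in the spirit of the telecom example. It assumes page-view events have already been exported from the web analytics tool as records keyed by an authenticated user id; the export format, the field names, and the page path are all hypothetical.

```python
# Hypothetical path of the page where a customer checks their contract end date.
CONTRACT_PAGE = "/account/contract-end-date"

# Hypothetical exported page-view events, keyed by authenticated user id.
web_events = [
    {"user_id": "u1", "page": "/account/contract-end-date"},
    {"user_id": "u1", "page": "/home"},
    {"user_id": "u2", "page": "/home"},
]
# Hypothetical customer rows from an internal database.
customer_rows = [
    {"user_id": "u1", "tenure_months": 23},
    {"user_id": "u2", "tenure_months": 8},
    {"user_id": "u3", "tenure_months": 40},
]


def add_web_features(customer_rows, web_events):
    """Return copies of the customer rows enriched with web-behavior features."""
    page_views = {}
    checked_contract = set()
    for event in web_events:
        uid = event["user_id"]
        page_views[uid] = page_views.get(uid, 0) + 1
        if event["page"] == CONTRACT_PAGE:
            checked_contract.add(uid)

    return [
        {
            **row,
            "page_views": page_views.get(row["user_id"], 0),
            "checked_contract_end": int(row["user_id"] in checked_contract),
        }
        for row in customer_rows
    ]


enriched = add_web_features(customer_rows, web_events)
```

The modeling step then decides how much weight a variable like `checked_contract_end` deserves; the preparation step only makes it available.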
Do any companies assist in gathering, preparing, and analyzing your data?
That’s a great question. In general, the data aggregation and preparation phase is very much specialized for your business, so there’s not usually a catch-all solution off the shelf that you buy and that kind of does it for you. It depends very much on the nature of your business, the nature of the databases, and historically how that data came to be and what it means. Because of that, typically, although the prepping of the data itself is not rocket science, it can be elusive exactly how to define that project. Typically, it’s very valuable to get experienced consultants in on that phase in determining that, even if most of the actual database programming or the data manipulation steps are taken internally by people you already have on-staff.
There are a great number of analytics service providers and consultants. If anyone wants a referral, I have a lot of colleagues who do this stuff all the time; I’d be glad to provide a referral.
What’s some advice for people just getting into Predictive Analytics?
Rather than the core technology, ironically, the first thing to understand is what it means to define your objective on the business level. What is it you want to improve with a predictive model? This is particularly important today with all the hype around big data and data science, where people are saying that data’s really valuable; you have to get this stuff going; you better do it; all your competitors are doing it. You better hustle your bustle because this is the way the world is moving.
None of that describes the purpose or value or how to get started. In the case of predictive modeling, you need to understand “what am I going to do?” For example, I’m going to improve this marketing campaign. I’m going to improve my retention offers. Let’s take retention, for example. Well, the best way to streamline retention is to target those at risk of leaving. That is to say, predictively model who’s most likely to cancel or defect. You need to think about what data you have on historical customers who did turn out to cancel and others who did not. You need both positive and negative examples in your data from which to learn.
You start with the carrot at the end of the stick. How am I going to use these predictive scores? What’s the operation, such as the targeting of retention offers, that I’m going to improve by way of the predictive scores? To that end, what exactly do I need to predict? Deciding to predict who’s going to cancel is still not quite specific enough. It has to be something like: which customers who have been tenured for this long are going to cancel within a 2-month window. You need to define it extremely precisely and make sure you’ve got the right data representing both positive and negative examples of that. You also need organizational buy-in: I’m going to predict this, I’m going to model this, and then we’re going to use the predictions to change our existing operations, such as who receives these retention discounts, in this particular way, using these scores. You need to get that buy-in; that needs to be the plan from the get-go.
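One way to pin down a prediction target that precisely is to write it as a labeling rule. The Python sketch below is an illustration under stated assumptions: the 2-month window from the discussion is approximated as 60 days, and the one-year minimum tenure, the snapshot date, and the function name are hypothetical choices.

```python
from datetime import date, timedelta
from typing import Optional


def churn_label(signup: date,
                cancel: Optional[date],
                snapshot: date,
                min_tenure_days: int = 365,
                window_days: int = 60) -> Optional[int]:
    """Label a customer as of the snapshot date.

    Returns 1 if they cancelled within window_days after the snapshot,
    0 if they did not, and None if they are outside the defined population
    (already cancelled by the snapshot, or tenure too short).
    """
    if cancel is not None and cancel <= snapshot:
        return None  # already gone, nothing left to predict
    if (snapshot - signup).days < min_tenure_days:
        return None  # not yet tenured long enough
    window_end = snapshot + timedelta(days=window_days)
    return int(cancel is not None and cancel <= window_end)


snapshot = date(2015, 1, 1)  # hypothetical scoring date
positive = churn_label(date(2013, 6, 1), date(2015, 2, 10), snapshot)
negative = churn_label(date(2013, 6, 1), None, snapshot)
too_new = churn_label(date(2014, 11, 1), None, snapshot)
```

Applying a rule like this to historical customers yields the positive and negative examples the model learns from, and makes the definition something the organization can review and agree on.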
It’s very much not just a technology process, but it’s an organizational process where you’re kind of improving the organizational operations; it’s no longer business as usual and you’re changing those operations with the predictions.
To listen to this Marketing Nerds podcast with Eric Siegel and Brent Csutoras:
- Download and listen to the full episode at the bottom of this post
- Subscribe via iTunes
- Sign up on IFTTT to receive an email whenever the Marketing Nerds podcast RSS feed has a new episode
- Listen on Stitcher
Think you have what it takes to be a Marketing Nerd? If so, message Kelsey Jones on Twitter, or email her at kelsey [at] searchenginejournal.com.