How YouTube Generates & Ranks Suggested Videos

Ever wondered how YouTube's suggested videos work? A Google research paper sheds light on how it uses deep learning to generate and rank suggested videos.

VIP CONTRIBUTOR Greg Jarboe

May 20, 2020

VIP CONTRIBUTOR Greg Jarboe President and co-founder at SEO-PR


Back in 2008, I was one of the first writers in the industry to notice that YouTube had passed Yahoo! to become the second largest search engine in the world, behind only Google.

Back then, I was writing for Search Engine Watch and my article was entitled, Has YouTube Passed Yahoo in expanded searches?

(Spoiler alert: The answer to the rhetorical question in the headline was: “Yes.”)

Today, I want to ask a related question: “Is YouTube about to pass Amazon as the largest-scale and most sophisticated industrial recommendation system in existence?”

This question isn’t rhetorical – because I don’t know the answer.

But, I do know that suggested videos are a force multiplier for YouTube’s search algorithm that you’ll want to understand.

I kinda, sorta hinted at this last year in a Search Engine Journal article, YouTube Algorithm: 7 Key Findings You Must Know.

I said, “To maximize your presence in YouTube search and suggested videos, you still need to make sure your metadata is well-optimized. This includes your video’s title, description, and tags.”

Now, I apologize, because I then went on to explain how to optimize your video’s title, description, and tags.

I totally slid on past the phrase, “your presence in YouTube search and suggested videos.” But, let me correct that oversight right now.

Most SEOs focus on the search results – because that’s what matters in Google.

But, most YouTube marketers know that appearing in suggested videos can generate almost as many views as appearing in YouTube’s search results.

Why?

Because viewers tend to watch multiple videos during sessions that last about 40 minutes, on average.

So, a viewer might conduct one search, watch a video, and then go on to watch a suggested video.

In other words, you might get two or more videos viewed for each search that’s conducted on YouTube.

That’s what makes suggested videos a force multiplier for YouTube’s search algorithm.

Sheepishly, I admit that I took advantage of this phenomenon back in 2008.

One of our clients back then was STACK Media, the nation’s leading producer and distributor of sports performance, training, and lifestyle content for high school athletes.

We optimized 137 videos on the STACKVids, STACK Football, STACK Baseball, and STACK Basketball channels on YouTube, which presented expert sports workout tips and inside stories from some of the world’s premier athletes.

For example, we had a video that featured Will Bartholomew, who talked about the dumbbell bench press workout that Peyton Manning used in the off-season.

Which keywords did we use in the title?

Well, if you look at the video’s title, the answer is pretty obvious: Peyton Manning Workout.

And the video’s description won’t leave anyone puzzled about the search terms we were targeting:

“Peyton Manning training at D1 during the off-season. See Manning’s full workout (with a tracking link to a related article on STACK’s website).”

But, which tags did we use?

Well, back then, YouTube still showed which tags a video was using.

That’s no longer the case. But, I shared this case study at SES San Jose 2008, so I got permission from my client to disclose that we used the following tags:

“Peyton Manning”
“Peyton Manning workout”
“quarterback workout”
“quarterback training”
“Peyton Manning training”
“bench press”
“quarterback bench press”
“dumbbell bench press”
“Manning workout”

How did we come up with these tags?

We looked at the top ranking video back then for the term, “Peyton Manning workout”, and then used as many of the tags as we could that were also relevant for our video.

That way, we improved our odds of becoming the top suggested video after someone watched that top ranking video.
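
The tag-matching step described above can be sketched in a few lines of Python. This is only an illustration of that manual process, not anything YouTube itself ran. The competitor tag list is invented for the example (the real 2008 list is no longer available), and reusable_tags is a hypothetical helper name.

```python
def reusable_tags(competitor_tags, relevant_tags):
    """Return the competitor's tags that also describe our video, in their order."""
    relevant = {tag.lower() for tag in relevant_tags}
    return [tag for tag in competitor_tags if tag.lower() in relevant]


# Invented stand-in for the tags on the top-ranking video (assumption).
competitor_tags = [
    "Peyton Manning",
    "Peyton Manning workout",
    "Colts",
    "quarterback workout",
    "NFL highlights",
]

# Tags that honestly describe our own video (taken from the list above).
our_relevant_tags = [
    "peyton manning",
    "peyton manning workout",
    "quarterback workout",
    "quarterback training",
    "dumbbell bench press",
]

shared = reusable_tags(competitor_tags, our_relevant_tags)
overlap = len(shared) / len(competitor_tags)
print(shared)
print(f"{overlap:.0%} of the competitor's tags reused")
```

The relevance filter is the important part: tags that do not describe the video (here, "Colts" and "NFL highlights") are left out rather than copied wholesale.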

These days, it’s hard to find the video that was top ranked for that term back in 2008.

But, it is worth noting that STACK’s video currently ranks #1 for “Peyton Manning workout”, #1 for “Manning workout”, and #5 for “Peyton Manning training”.

How Does YouTube Discover & Rank Suggested Videos Today?

That was how suggested videos worked back when users were uploading 13 hours of video content to YouTube every minute.

So, how does YouTube discover and rank suggested videos now that more than 500 hours of video content are uploaded to YouTube every minute?

Until recently, the only answer that I could find came from a video on the YouTube Creators channel entitled, How YouTube’s Suggested Videos Work.

As the video’s 300-word description explains:

“Suggested Videos are a personalized collection of videos that an individual viewer may be interested in watching next, based on prior activity.”

There’s no way that creators can influence a viewer’s prior behavior, but this also means that a sports channel can tap into sports fans.

“They are shown to viewers on the right side of the watch page under ‘Up next’, below the video on the mobile app, and as the next video in autoplay.”

More than 70% of YouTube watch time comes from mobile devices, so you need a mobile-first strategy for suggested videos.

“Studies of YouTube consumption have shown that viewers tend to watch a lot more when they get recommendations from a variety of channels, and suggested videos do just that. Suggested Videos are ranked to maximize engagement for the viewer.”

So, optimizing your metadata still helps, but you also need to create a compelling opening to your videos, maintain and build interest throughout the video, as well as engage your audience by encouraging comments and interacting with your viewers as part of your content.

According to the description, suggested videos are more likely to be:

“Videos … that are topically related. They could be videos from the same channel, or from a different channel.” In other words, sports videos for sports fans either from your channel or a different sports channel.
“Videos from a viewer’s past watch history.” Unless you have a DeLorean time machine, there’s no way you can influence a viewer’s past watch history.

The video’s description also tells creators:

“You can see which videos bring viewers to your channel from Suggested Videos in the Traffic sources report (in YouTube Analytics) by clicking on the ‘Suggested videos’ box.”

Um, yes. But, don’t the vast majority of YouTube creators already know that?

Finally, the description includes the following tips for creators:

Include strong calls-to-action in your videos to watch another video in your series.
Persuade viewers why they should go and watch another video in your series.
Be mindful of how your videos end because long endings may discourage viewers from watching more videos.
Use playlists, links, cards, and end-screens to suggest the next video to watch.
Develop a series of videos that are organically connected.
Make videos that are related to popular formats on YouTube such as challenges or lists.

Now, this video has 394,000 views.

So, it’s safe to assume that several hundred thousand graduates of what was formerly known as the YouTube Creator Academy know at least this much about how YouTube’s suggested videos work.

So, this won’t give you much of a competitive edge.

However, there is more detailed information available – although it was safely hidden in plain sight until an anonymous source, who may or may not be a Bothan, sent me a link to where I could find it.

The link took me to a paper that had been published on September 15, 2016, and is now archived on Google Research.

This old research paper was written by Paul Covington, Jay Adams, and Emre Sargin of Google. It is entitled, “Deep Neural Networks for YouTube Recommendations.”

How Do YouTube’s Recommendation Systems Generate & Rank Suggested Videos?

If you’re looking for a serious competitive edge, then you’re going to want to download the PDF and read this research paper for yourself.

But, if you need to be convinced that reading an 8-page academic document that’s more than three-and-a-half years old is worth your time and attention, then let me share some of the highlights that I found squirreled away in “Deep Neural Networks for YouTube Recommendations.”

For starters, Covington, Adams, and Sargin reveal that YouTube’s massive recommendation system is comprised of “two neural networks: one for candidate generation and one for ranking.”

That’s important.

Or, as Mon Mothma (Caroline Blakiston) solemnly says in Star Wars: Episode VI – Return of the Jedi (1983), “Many Bothans died to bring us this information.”

Their paper says:

“The candidate generation network takes events from the user’s YouTube activity history as input and retrieves a small subset (hundreds) of videos from a large corpus. These candidates are intended to be generally relevant to the user with high precision.”
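
To make the two-stage split concrete, here is a minimal Python sketch of a "shortlist, then rank" pipeline. It is a toy under stated assumptions, not YouTube's system: the real stages are neural networks, whereas this version uses simple topic matching for candidate generation and a count of prior channel watches as the ranking score. The Video class, both function names, and the sample data are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Video:
    video_id: str
    channel: str
    topic: str


def generate_candidates(corpus, watch_history, limit=200):
    """Stage 1: cut a large corpus down to unwatched videos on familiar topics."""
    watched_ids = {video.video_id for video in watch_history}
    watched_topics = {video.topic for video in watch_history}
    candidates = [
        video for video in corpus
        if video.topic in watched_topics and video.video_id not in watched_ids
    ]
    return candidates[:limit]


def rank_candidates(candidates, watch_history, top_n=3):
    """Stage 2: order the shortlist by a toy score (prior watches of the channel)."""
    def score(video):
        return sum(1 for seen in watch_history if seen.channel == video.channel)
    return sorted(candidates, key=score, reverse=True)[:top_n]


# Made-up sample data for illustration only.
corpus = [
    Video("v1", "STACK Football", "sports"),
    Video("v2", "STACK Baseball", "sports"),
    Video("v3", "Some Music Channel", "music"),
    Video("v4", "STACK Football", "sports"),
]
history = [Video("v1", "STACK Football", "sports")]

shortlist = generate_candidates(corpus, history)
suggested = rank_candidates(shortlist, history)
print([video.video_id for video in suggested])
```

Even in this toy, the music video never reaches the ranking stage for a viewer whose history is all sports: a video that misses the shortlist cannot be rescued by the ranker.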

Now, we can’t optimize our videos for a viewer’s past watch history – unless we have a time machine.

But, we can create videos that are targeted at audiences that YouTube also uses for targeting video ad campaigns.

In other words, your video won’t end up in a small subset (hundreds) of videos if it’s about a totally different topic than other videos on your channel, or if it targets a totally different demographic group than you have in the past.

Oh, and don’t even think about creating a new video targeted at “music fans” if all of the other videos that your channel’s subscribers have watched were targeted at “sports fans.”

As I pointed out in an article entitled, Platform Trends: How the Verticalization of Content Increases Reach on YouTube and Facebook, which was published on Tubular Insight in September 2018, seven digital-first publishers are already pursuing a vertical strategy.

This includes: Axel Springer SE, Group Nine, BuzzFeed, UNILAD, Jungle Creations, The LADbible Group, and 9GAG.

Here’s the rhetorical question that I asked in that article:

“So, why would all these publishers segment their properties into several verticals instead of just stuffing a broad range of content into giant, horizontal … YouTube channels? Because in an increasing competitive online video ecosystem, you’re more likely to engage audiences with content that is narrowly targeted at their special interests than you are with a random collection of content that may or may not appeal to their general interests. In other words, it’s smarter to go deep than it is to go wide.”

This brings us to the second neural network for ranking.

Covington, Adams, and Sargin acknowledge that there are many ways to rank suggested videos. But they disclose:

“Ranking by click-through rate (CTR) often promotes deceptive videos that the user does not complete (‘clickbait’) whereas watch time better captures engagement.”

So, avoid using misleading, clickbaity, or sensational titles and thumbnails.

Yes, they worked in the past.

But, they went the way of the dodo once YouTube replaced “views” with “watch time” in its algorithm back in October 2012.

Okay, so the second neural network isn’t optimized for CTR. What signals does it use?

The paper’s authors observe that “the most important signals” include:

What was the user’s previous interaction with the video itself and other similar videos?
How many videos has the user watched from this channel?
When was the last time the user watched a video on this topic?

Covington, Adams, and Sargin say:

“These continuous features describing past user actions on related items are particularly powerful because they generalize well across disparate items. We have also found it crucial to propagate information from candidate generation into ranking in the form of features, e.g. which sources nominated this video candidate? What scores did they assign?”

They add:

“Features describing the frequency of past video impressions are also critical for introducing ‘churn’ in recommendations (successive requests do not return identical lists). If a user was recently recommended a video but did not watch it, then the model will naturally demote this impression on the next page load. Serving up-to-the-second impression and watch history is an engineering feat onto itself outside the scope of this paper, but is vital for producing responsive recommendations.”
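
The signals listed above can be pictured as a handful of numbers computed from a viewer's history. The Python sketch below is an assumption-laden illustration: the Event and Candidate classes, the feature names, and the sample log are all hypothetical, and the paper does not publish its actual feature definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    kind: str         # "watch" or "impression" (shown, not necessarily clicked)
    video_id: str
    channel: str
    topic: str
    timestamp: float  # seconds since an arbitrary reference point


@dataclass(frozen=True)
class Candidate:
    video_id: str
    channel: str
    topic: str


def ranking_features(history, candidate, now):
    """Summarize a viewer's past actions relative to one candidate video."""
    watches = [e for e in history if e.kind == "watch"]
    topic_times = [e.timestamp for e in watches if e.topic == candidate.topic]
    return {
        "watched_this_video_before": any(
            e.video_id == candidate.video_id for e in watches
        ),
        "channel_watch_count": sum(
            1 for e in watches if e.channel == candidate.channel
        ),
        "seconds_since_topic_watch": (now - max(topic_times)) if topic_times else None,
        "prior_impressions": sum(
            1 for e in history
            if e.kind == "impression" and e.video_id == candidate.video_id
        ),
    }


# Made-up viewer log for illustration only.
history = [
    Event("watch", "a1", "STACK Football", "quarterback training", 100.0),
    Event("watch", "a2", "STACK Football", "bench press", 400.0),
    Event("impression", "a3", "STACK Football", "quarterback training", 450.0),
    Event("impression", "a3", "STACK Football", "quarterback training", 500.0),
]
candidate = Candidate("a3", "STACK Football", "quarterback training")
features = ranking_features(history, candidate, now=600.0)
print(features)
```

In this log the candidate has already been shown twice without a watch, which is the kind of "prior impressions" count the paper says helps demote a video on the next page load.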

Covington, Adams, and Sargin divulge:

“Our goal is to predict expected watch time given training examples that are either positive (the video impression was clicked) or negative (the impression was not clicked). Positive examples are annotated with the amount of time the user spent watching the video. To predict expected watch time we use the technique of weighted logistic regression, which was developed for this purpose.”
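
For readers who want to see what "weighted logistic regression" means in practice, here is a small pure-Python sketch. It follows the weighting the paper describes (clicked impressions weighted by watch time, unclicked impressions given a weight of 1), but everything else is an assumption for illustration: the single yes/no feature, the tiny data set, the plain gradient-descent fit, and the function names. YouTube's real model is a deep network, not this.

```python
import math


def fit_weighted_logistic(rows, steps=5000, learning_rate=0.5):
    """Fit logistic regression by gradient descent.

    rows: list of (features, clicked, watch_minutes).
    Clicked rows are weighted by watch time; unclicked rows get weight 1.
    Returns the learned weights, bias first.
    """
    size = len(rows[0][0]) + 1
    weights = [0.0] * size
    total = sum(minutes if clicked else 1.0 for _, clicked, minutes in rows)
    for _ in range(steps):
        gradient = [0.0] * size
        for features, clicked, minutes in rows:
            inputs = [1.0] + list(features)
            logit = sum(w * x for w, x in zip(weights, inputs))
            probability = 1.0 / (1.0 + math.exp(-logit))
            sample_weight = minutes if clicked else 1.0
            error = sample_weight * (probability - (1.0 if clicked else 0.0))
            for index, x in enumerate(inputs):
                gradient[index] += error * x
        weights = [w - learning_rate * g / total for w, g in zip(weights, gradient)]
    return weights


def expected_watch_minutes(weights, features):
    """Score an impression with exp(logit), i.e. the learned odds."""
    logit = sum(w * x for w, x in zip(weights, [1.0] + list(features)))
    return math.exp(logit)


# Made-up impressions. Feature: 1.0 if the viewer has watched this channel before.
rows = (
    [([0.0], True, 2.0), ([0.0], True, 4.0)] + [([0.0], False, 0.0)] * 3
    + [([1.0], True, 10.0), ([1.0], True, 5.0), ([1.0], True, 5.0)]
    + [([1.0], False, 0.0)] * 2
)

weights = fit_weighted_logistic(rows)
print(round(expected_watch_minutes(weights, [0.0]), 2))  # about 2.0 (6 minutes / 3 unclicked)
print(round(expected_watch_minutes(weights, [1.0]), 2))  # about 10.0 (20 minutes / 2 unclicked)
```

The learned odds equal total watch time divided by the number of unclicked impressions. That is close to expected watch time per impression only when clicks are rare, which holds at YouTube's scale but not in this tiny example. The practical point is that a click followed by a short watch contributes very little, which is why clickbait loses out.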

In other words, if you want to optimize your video for YouTube’s recommendation systems, then you need to help viewers find the videos they want to watch, and then maximize their long-term engagement and satisfaction.

That’s hard.

But, with more than 500 hours of video content being uploaded to YouTube every minute, that’s what you need to do these days.

What Does This Mean to You?

But wait, there’s more!

The paper’s authors also revealed that YouTube has been using “deep learning” to design, iterate, and maintain “a massive recommendation system” since at least 2016.

And they saw “dramatic performance improvements” with “enormous user-facing impact” even back then.

Now, that may not keep you up at night.

But, if Google rolls out what they’ve learned to, oh, Google Shopping for example, then I’ll bet that it will create nightmares for researchers and developers at Amazon.

Now, what does this mean to you?

I realize that your focus is on digital marketing, SEO, content marketing, and paid search. Well, that got you this far.

What about the next four years?

Well, if you or someone on your team already understands TensorFlow (the open-source machine learning platform that came out of the Google Brain team), then you are ready to rock and roll.

But, if you don’t have a researcher or developer on your team who understands how to use TensorFlow’s comprehensive, flexible ecosystem of tools, libraries, and community resources to push the state-of-the-art in machine learning (ML) to build and deploy ML-powered applications for your organization or clients, then you need to find one … fast.

Why?

Because down the road, your fate – and the fate of your organization or clients – will be increasingly in the hands of recommendation systems.

That’s why it’s worth your time and attention to read “Deep Neural Networks for YouTube Recommendations” today.

Just like one of those periscope spy toys that let kids “see around corners and over walls,” this 8-page academic document can help you see what’s been hidden in plain sight for more than three-and-a-half years.
