Native YouTube Ranking Factors and Rank Correlations 2016 gives marketers and YouTubers the insight needed to gain organic (non-paid) visibility within the YouTube SERPs (Search Engine Result Pages, the pages that show the results of any given search on YouTube). As any type of data analysis can get rather complicated, a shorter summary is available in the YouTube Ranking Factors Key Learnings and Takeaways post.
What is meant by “Native”? YouTube ranking factors can largely be divided into two categories: one that provides power to the video (“Native”), which you have no direct control over (such as Likes or Watch Time), and another that is about “Relevancy” (such as the name of the video, keywords in the description, and so forth). Ranking correlations on Relevancy are expected to be published in about a month.
The data was gathered from roughly 250 different high-volume search terms (the exact number varies slightly for some factors) during January 2016. The study was limited to high-volume searches so that the algorithm would be ranking videos under similar conditions across all terms. For terms with very low monthly volume there might be very little content, so the algorithm wouldn’t be “forced to work in full effect”, so to speak. Hence terms with an assumed high volume were chosen.
Examples of these searches are: “pewdiepie”, “science”, “cute animals”, “america”, “how to cook”, “photoshop tutorial”, and “call of duty”.
For each of these search terms, data was gathered on the top 40 ranking videos.
A total of 33 different data points were gathered for each video, and an additional 8 were calculated from the gathered data (such as viewer retention and total engagement).
In total, over 400,000 individual data points were used in the calculation of the data shown below.
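As a rough sanity check on that figure, the arithmetic can be sketched as follows. The constant and function names are purely illustrative, and the result is approximate, since some factors were gathered for slightly fewer than 250 search terms.

```python
# Figures taken from the study description; the names are illustrative only.
SEARCH_TERMS = 250       # roughly; varies slightly on some factors
VIDEOS_PER_TERM = 40     # top 40 ranking videos per search term
GATHERED_POINTS = 33     # data points gathered per video
CALCULATED_POINTS = 8    # data points calculated from the gathered ones


def total_data_points(terms: int, videos_per_term: int, points_per_video: int) -> int:
    """Return the number of individual data points in the full dataset."""
    return terms * videos_per_term * points_per_video


videos = SEARCH_TERMS * VIDEOS_PER_TERM
total = total_data_points(
    SEARCH_TERMS, VIDEOS_PER_TERM, GATHERED_POINTS + CALCULATED_POINTS
)
print(videos, total)
```

With these figures the sketch gives 10,000 videos and 410,000 data points, which is in line with the total stated above.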
While this sounds like a lot, it’s important to remember two key things: this is based on 250 search terms (billions exist), and on top 40 rankings. Both of these numbers greatly limit the data and how correlations may play out. This was released under the mantra “A good plan executed today is better than a great plan executed in a week.”
More data than shown was gathered, but was removed from the final findings either because the data turned out to be incomplete, or because arranging the data correctly from the way it was gathered would take too long. This data includes total words in subtitles (used to calculate Words per Minute), Exact Match Keywords in Title, and more. This data is expected to be included in future studies.
Last, it’s incredibly important to remember that: Correlation Does Not Always Equal Causation.
The data below is correlation. All that means is that, throughout the data, there is a mathematical connection between the ranking and the corresponding data. However, it is not possible to determine whether the data (such as comments or likes) resulted in higher rankings, or whether higher rankings resulted in more comments and likes. This is discussed again further down.
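The study does not state which correlation coefficient was used. As a minimal sketch, assuming a Spearman rank correlation (a common choice when one of the variables is a ranking position), the calculation for a single factor might look like this. All function names are hypothetical, and the position is negated so that a positive score means "higher value, better rank", which is also an assumption about how the scores were oriented.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Plain Pearson correlation of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(xs), average_ranks(ys))


def factor_vs_position(positions, factor_values):
    """Correlate a factor with search position (1 = top result)."""
    return spearman([-p for p in positions], factor_values)


# Made-up example: five results where likes fall steadily with position.
positions = [1, 2, 3, 4, 5]
likes = [500, 400, 300, 200, 100]
score = factor_vs_position(positions, likes)
```

A score near 1 would mean the factor rises almost perfectly with better positions, a score near -1 the opposite, and a score near 0 no monotonic relationship. None of these outcomes says anything about which way the causation runs.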
The graph below shows all the factors measured, ordered from the highest correlation score at the top to the lowest correlation score at the bottom.
“Likes” represents the total amount of likes a given video has gotten, over its lifetime on YouTube.
It was quite the surprise to see this gaining a top spot in this list, as this is not something YouTube has ever seemed to put particular weight on. However, it could (perhaps more likely) be that Likes are simply the biggest player in total engagement, which comes in second – meaning that Likes itself is not important, but is just the most common form of engagement on a video.
The difference in correlation between Likes and Total Engagement is so small that it is not possible to say which of the two would rank higher on the list with a larger dataset.
“Total Engagement” is the total amount of Likes, Dislikes, Comments, Shares, and Subscriptions driven by a video, over its lifetime on YouTube.
It was surprising to see it ranking higher than factors such as Watch Time, but it’s not surprising at all to see it high on the list. As you’ll notice with Dislikes slightly lower on the list, YouTube seems to care more about how engaging a video is than about whether that engagement is negative or positive, and that makes sense. If negative engagement (such as Dislikes) made a video rank lower, Rebecca Black’s “Friday” video would never be able to rank for the keyword “Friday”, even though it’s highly relevant.
Instead, we can assume that “negative engagement” is viewed by YouTube as “provocative” or “controversial”, instead of “bad”.
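The Total Engagement definition above can be sketched as a simple sum. The `VideoStats` class and its field names are hypothetical, and the numbers are made up to show that a heavily disliked video and a heavily liked one can end up with the same total.

```python
from dataclasses import dataclass


@dataclass
class VideoStats:
    """Lifetime engagement counts for one video (hypothetical field names)."""
    likes: int
    dislikes: int
    comments: int
    shares: int
    driven_subscriptions: int


def total_engagement(video: VideoStats) -> int:
    """Sum all engagement types, positive and negative alike."""
    return (
        video.likes
        + video.dislikes
        + video.comments
        + video.shares
        + video.driven_subscriptions
    )


# Two made-up videos with mirrored like/dislike counts.
loved = VideoStats(likes=900, dislikes=20, comments=60, shares=15, driven_subscriptions=5)
disliked = VideoStats(likes=20, dislikes=900, comments=60, shares=15, driven_subscriptions=5)
```

Under this definition both videos score the same, which is one way to read the idea that negative engagement counts as “controversial” rather than “bad”.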
“Total Likes/Dislikes” is the total amount of Likes and Dislikes a video has received, over its lifetime on YouTube.
After understanding that Engagement overall is one of the more important factors in the list, it’s a given that the total amount of Likes and Dislikes would also be very high on the list, as they are one of the biggest factors in Total Engagement, and Dislikes appears just a few steps further down.
“Views” is the total amount of views a video has gotten, over its lifetime on YouTube.
It’s unlikely that views are actually a core ranking factor in and of themselves; rather, they are an indicator of what is needed to rank highly. Without many views, a video would most likely never get the Watch Time needed to rank highly in competitive search results.
Since these were measured only on high-volume keywords, it could also simply be that any video that ends up ranking highly automatically gains many views relatively fast from the organic search traffic it receives.
“Dislikes” is the total amount of dislikes a video has gotten, over its lifetime on YouTube.
The fact that Dislikes ranks this highly on the list suggests, at the very least, that YouTube does not penalise a video for having gotten many dislikes. It is most likely not a ranking factor in and of itself, but rather part of total engagement.
“Comments” is the total amount of comments a video has gotten, over its lifetime on YouTube.
As with the rest of the engagement factors, it’s not surprising to see this at a high spot. That it ranks lower seems to indicate that the amount of comments on a video varies much more than, for instance, the amount of likes. Where many likes are very often an indicator of a “rank worthy” video, comments fluctuate more.
“Watch Time” is the total amount of minutes a video has been watched, over its lifetime on YouTube.
It’s particularly interesting that what YouTube has touted as the main ranking factor is not found on the list until now. Watch Time was specifically put in place to replace Views as the main ranking factor. Ironically, at least in correlation, it is found considerably lower on the list than Views or Engagement.
This could potentially indicate that Watch Time, while important, is not the be-all and end-all that YouTube themselves often tout it to be.
Total Channel Subscribers
“Total Channel Subscribers” is the total amount of subscribers the YouTube channel of a video had when the data for the video was collected.
Whether the total amount of channel subscribers is a ranking factor or not is hard to say. It could potentially be used as a way to say “many people regularly like the content from this creator, so we assume it is generally good”.
Channel subscribers could also simply play a big part in driving engagement much higher, and also result in many more views (as each new video is automatically sent out to all the subscribers), which in turn would result in a higher Watch Time.
Nonetheless, it shows that large channels have a very clear advantage over smaller channels, when it comes to ranking videos.
“Viewer Retention” is the average percentage of how much of a video is viewed, over its lifetime on YouTube.
Viewer Retention is the second “official” ranking factor from YouTube, but it would seem it is considerably less important than Watch Time. In fact, Watch Time has a correlation that is almost 4 times stronger than Viewer Retention.
This could mean that Viewer Retention is used more as a “tie breaker”, when two videos reach the same amount of watch time, rather than a strong ranking factor in and of itself.
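The study says Viewer Retention was calculated from the gathered data but does not give the formula. One plausible reconstruction, offered only as an assumption, divides total Watch Time by the maximum possible watch time (views multiplied by video length). The function name and the example numbers below are hypothetical.

```python
def viewer_retention(watch_time_minutes: float, views: int, length_minutes: float) -> float:
    """Average fraction of the video watched per view (assumed formula)."""
    if views <= 0 or length_minutes <= 0:
        raise ValueError("views and length must be positive")
    return watch_time_minutes / (views * length_minutes)


# Two made-up 10-minute videos with identical Watch Time (5,000 minutes).
retention_a = viewer_retention(5000, views=1000, length_minutes=10)
retention_b = viewer_retention(5000, views=2000, length_minutes=10)
```

Here the first video works out to 50% retention and the second to 25%. If retention really acted as a “tie breaker”, this is the kind of case where it would separate two videos that are equal on Watch Time.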
“Resolution” is the maximum resolution in which a video is available on YouTube.
Unsurprisingly, people prefer to watch a video in a High Definition format. As internet speeds have gotten faster all over the world, it allows people to watch video in a much crisper format, and this is a clear indicator that (obviously) people prefer to do so.
It should serve as a warning to video creators, that while “studio production” quality is still not needed on YouTube, there is definitely a tendency for people to prefer videos of a higher quality.
Total Channel Videos
“Total Channel Videos” is the total amount of videos the channel of a video had uploaded when the data for the video was collected.
As with the total amount of subscribers, it is difficult to say whether the total amount of videos is a ranking factor. YouTube could use it as an indicator that a channel that has uploaded many videos is most likely more trustworthy than a very small channel.
But it could also be that channels that have uploaded many videos have simply perfected their craft more, and as such tend to upload videos of a higher quality and have most likely also garnered more subscribers. Neither of these might help in and of itself, but both would help boost other factors.
“Driven Subscriptions” is the total amount of people who have subscribed to the channel through the video itself.
It was rather surprising to see this as low as it is, considering the quality signal it sends. If a person subscribes to a channel after having watched a video, most would agree that the video must have been of such high quality that the user says “I’ll gladly watch whatever else this channel sends my way”. By that logic, it should be one of the strongest factors.
That it ranks as low as it does might mean that YouTube isn’t counting this as a factor at all, and that it’s simply natural for videos of a high quality to garner more subscribers than “bad” videos.
As we will see with other numbers further down, this could indicate that the search algorithm is considerably less advanced or intricate than one might assume.
“Length” is the length (time) of a video.
It is highly unlikely that the length of a video is a ranking factor in itself. However, as we know that Watch Time is a ranking factor, it is only natural that longer videos (meaning people watch more minutes) would fare better than much shorter ones.
Total Likes/Dislikes per View
“Total Likes/Dislikes per View” is the total amount of Likes and Dislikes a video has received, divided by the total amount of views, over its lifetime on YouTube.
As you will notice, anything on this list that is measured “per view” has an outright negative correlation. This should not be understood as “YouTube ranks my video lower because of this”. Rather, it is likely that the true correlation is so close to zero that, with somewhat limited data, it ends up slightly negative.
This is surprising because one would assume that, instead of simply looking at totals, looking at how the average viewer interacts with the video would be more important. But as other numbers indicate, the ranking algorithm for YouTube might be surprisingly simple.
“Shares” is the total amount of times a video has been shared (Twitter etc.), over its lifetime on YouTube.
Again, it is incredibly unlikely that many people sharing your video has a negative impact on its ranking. Instead, as above, the true correlation is most likely non-existent, and it only shows up as slightly negative within this data.
“Video Age” is how many days the video has been on YouTube, compared to its ranking. It should be understood that the data shows that the older a video is, the worse it ranks.
Here we see that new videos have a substantial advantage over old videos. This is much like how it is on Google, where it has also been shown many times that Google most likely has a “freshness algorithm”.
However, as this data was gathered on highly competitive searches, it could be that this is only apparent in searches that receive a constant stream of new videos. This makes a lot of sense, particularly in “newsworthy” searches.
Total Engagement Per View
“Total Engagement per View” is the total amount of Likes, Dislikes, Comments, Shares, and Subscriptions driven by a video, over its lifetime on YouTube, divided by the amount of views it has received.
As with all other data that compares “per view”, we see that it seemingly doesn’t matter at all to YouTube, based on the data.
Subscriptions per View
“Subscriptions per View” is the total amount of people, who have subscribed to the channel through the video itself, divided by the amount of views it has received.
In this pie chart, I’ve gone ahead and made a few changes to give an easier overview.
First, all negative factors have been removed, EXCEPT Video Age (which has been replaced with “Freshness”, and now shows the opposite value – that is, how much does being a “young” video correlate). Second, I’ve removed “duplicate” numbers, such as “Total Engagement”. Last, the data is shown as percentage based on the correlation, compared to the total group.
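The three steps described above can be sketched as follows. The correlation values in the example are made up for illustration and are not the study's figures, and the function name and the default list of “duplicate” factors are assumptions.

```python
def pie_chart_shares(correlations, duplicates=("Total Engagement",)):
    """Turn raw correlation scores into percentage shares for a pie chart."""
    cleaned = dict(correlations)

    # Replace Video Age with its opposite, "Freshness".
    if "Video Age" in cleaned:
        cleaned["Freshness"] = -cleaned.pop("Video Age")

    # Drop aggregate factors that would double-count other slices.
    for name in duplicates:
        cleaned.pop(name, None)

    # Drop every factor that still has a zero or negative correlation.
    positive = {name: value for name, value in cleaned.items() if value > 0}

    # Express each remaining correlation as a percentage of the group total.
    total = sum(positive.values())
    return {name: 100 * value / total for name, value in positive.items()}


# Hypothetical scores, for illustration only.
example = {
    "Likes": 0.30,
    "Total Engagement": 0.29,
    "Views": 0.20,
    "Video Age": -0.10,
    "Shares": -0.02,
}
shares = pie_chart_shares(example)
```

Note that shares calculated this way describe each factor's slice of the summed correlations, not how much of a video's ranking the factor explains.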
When presented in this light, it becomes more obvious how much lower the correlation of Viewer Retention is compared to many other factors. It also becomes very clear how little the total amount of videos a channel has uploaded matters, which further suggests that it isn’t actually a ranking factor, but instead a reflection of experience in making good videos.
To further simplify the data and provide a different way of looking at the factors, here I’ve added several data groups together.
“Official Factors” refers to Watch Time and Viewer Retention. “Total Engagement” remains the same, as that data was already added together. “Video Strength” is Freshness, Length, Resolution, and Views. “Channel Strength” is total channel subscribers and videos.
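The grouping above can be sketched as a simple sum per group, followed by the same percentage calculation. The group membership follows the description above; the correlation values are hypothetical placeholders rather than the study's numbers, and the names `GROUPS` and `group_shares` are illustrative.

```python
# Group membership as described in the text.
GROUPS = {
    "Official Factors": ["Watch Time", "Viewer Retention"],
    "Total Engagement": ["Total Engagement"],
    "Video Strength": ["Freshness", "Length", "Resolution", "Views"],
    "Channel Strength": ["Total Channel Subscribers", "Total Channel Videos"],
}


def group_shares(correlations):
    """Sum correlations per group and return each group's percentage share."""
    sums = {
        group: sum(correlations[factor] for factor in factors)
        for group, factors in GROUPS.items()
    }
    total = sum(sums.values())
    return {group: 100 * value / total for group, value in sums.items()}


# Hypothetical scores, for illustration only.
example = {
    "Watch Time": 0.20,
    "Viewer Retention": 0.05,
    "Total Engagement": 0.25,
    "Freshness": 0.10,
    "Length": 0.05,
    "Resolution": 0.05,
    "Views": 0.20,
    "Total Channel Subscribers": 0.07,
    "Total Channel Videos": 0.03,
}
grouped = group_shares(example)
```

One consequence of this design is that a group with more members, such as Video Strength with four factors, can end up with a large share simply because more scores are added together.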
It becomes interesting when looking at these factors grouped together. First, we see that YouTube’s own official factors make up only 27.01% of the correlation between a video and its rank in the search results. This is largely the same share that “Total Engagement” takes up, with a difference of only 0.02%.
The second very interesting thing we see is that the “strength” of a video itself has now become the largest positive correlation, though it’s important to remember that this is primarily fueled by “Views”. Still, one would have assumed that “Views” and “Watch Time” would at the very least be more closely linked, so that Video Strength would not gain a particular advantage here.
The biggest surprise in this data was that practically all “per view” data that was measured had little or negative correlation. The main reason this is so surprising is that measuring data “per view” would give a much better indicator of whether a video was good or not than simply looking at “total engagement”, for instance.
It is only natural that a video that has more views, is also likely to have more Likes, for example. An example of two videos: You have a video that has gotten 500 views and 5 likes, and a video that has 100 views and 10 likes. Which video do you think people enjoyed the most? The video with 100 views and 10 likes, obviously.
Chances are that the first video simply hit a lucky keyword, maybe got shared on some kind of social media, or had a bit of advertising money thrown at it. But once people saw it, few felt it was touching or good enough to leave a “Like” on it. On the other hand, the video with 100 views most likely just never got shown to a lot of people, but out of the people that DID see it, many more thought it was worth leaving a “Like” on it.
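The two-video example works out as follows. This is only the arithmetic from the example above; the function name is illustrative.

```python
def likes_per_view(likes: int, views: int) -> float:
    """Share of viewers who left a Like."""
    if views <= 0:
        raise ValueError("views must be positive")
    return likes / views


first_video = likes_per_view(likes=5, views=500)    # 1 like per 100 views
second_video = likes_per_view(likes=10, views=100)  # 10 likes per 100 views
```

Per view, the second video collects ten times as many Likes as the first, even though it has only a fifth of the views.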
This could mainly mean two things:
A) That engagement is not a ranking factor whatsoever in any way (read further down).
B) That instead of “Per View”, the data is used in another way. For instance, engagement could be measured “per day since upload” (this might explain why newer videos rank higher).
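Option B is speculation, but it can be illustrated with a small sketch. The function name, the dates, and the engagement totals below are all hypothetical; nothing here is a confirmed part of how YouTube ranks videos.

```python
from datetime import date


def engagement_per_day(total_engagement: int, uploaded: date, collected: date) -> float:
    """Engagement divided by the number of days the video has been online."""
    # Treat videos uploaded on the collection day as one day old.
    days_online = max((collected - uploaded).days, 1)
    return total_engagement / days_online


collected = date(2016, 1, 15)

# An older video with a large lifetime total (730 days online).
old_video = engagement_per_day(10_000, uploaded=date(2014, 1, 15), collected=collected)

# A newer video with a much smaller total (30 days online).
new_video = engagement_per_day(2_000, uploaded=date(2015, 12, 16), collected=collected)
```

In this made-up case the newer video earns far more engagement per day than the older one despite having a fifth of the lifetime total, which is the kind of effect that could make newer videos rank higher under such a measure.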
Engagement has been the big winner in this data collection – in almost any way it is measured (with the exception of “per view” and “Shares”), User Engagement has some of the largest correlations compared to the rankings of a video.
However, as with all calculations based on correlation, it’s perfectly possible that they have nothing in common with causation for higher rankings.
One possible alternative explanation for the high correlation is that videos that are good enough to receive a high watch time and viewer retention are also naturally videos that engage people. After all, why would you sit and watch a 10-minute video if it didn’t engage you at all?
Or put another way, the more engaged you are with any form of content, the more likely it is that you are willing to spend more time watching that content.
Two key changes will happen in the future when gathering and analysing this data:
1) Aiming to gather data for top 100 results, instead of top 40.
2) Based on roughly 500 searches, instead of 250.
Additionally, the data will be gathered differently from a software perspective. This will allow all the gathered data to be used more efficiently, and will in turn yield much more reliable insight into the importance of Exact Match appearances in the title, description, and tags, the general importance of keywords, the importance of subtitles on videos, and much more.
Further, either at the same time or slightly down the road, data will be released in categories, such as “Gaming” and “Education”. When data is presented in this way, it becomes possible to look at the optimal length of a high ranking video, and whether certain groups of people prefer talkative or quiet videos (as examples).
For questions or enquiries, see the contact page.