Play Head Time Analysis On OTT Video At Scale

May 26, 2021 04:25 PM (PT)


Play Head Time (PHT) is the pointer representing the exact point in a video’s play-span that is currently being watched by the user. We are all familiar with the play head pointer displayed as a slider bar on the screen of a running video. Play head time can apply to regular media content as well as to ads. When measured and analyzed at large scale, with accuracy, in near-real-time, across players and publisher environments, it enables us to solve some very interesting and practical business problems.

For example: 

  1. Product placement analysis and measurement.
  2. Content viewer engagement analysis.
  3. Ad Content mutual impact analysis.

At Conviva, with its sensor software present on billions of devices on the planet, across hundreds of publishers/players – we are able to analyze video PHT at scale to solve the above use-cases, and more.  There are many challenges to collecting and analyzing Play head time.      

  1. Very large data volume.
  2. Need for high precision, at least seconds level data accuracy.   
  3. Complex and diverse environment.  We need to analyze different user behaviors when watching videos, including pause, buffering, seek, rewind etc.  We need to understand different player behaviors across different publisher environments and across different video types (Live, SVOD, …).
  4. The final challenge is data sanitization.

In this talk we will present how Conviva collects PHT data from billions of devices across players and publishers; how we use the Databricks technology stack to analyze, sanitize, store and process play head time from billions of devices in real time; and how we use the large volume of processed play head time data to solve real-world business problems.

In this session watch:
Adam Liu, Data Scientist, Conviva
Biplab Chattopadhyay, Architect, Conviva



Biplab Chattopa…: Hi, everyone. Welcome to the talk on Play Head Time Analysis on OTT Video at Scale. I am Biplab Chattopadhyay, here with my friend and colleague, Adam Liu.
First, we start with the agenda. We will go over the introduction and the definition of play head time, and then we’ll go over some of the technical challenges that we face when collecting play head time and handling real-world problems with it. And finally, we’ll go over some of the solutions and some of the business use cases that we can solve using play head time.
But first, we will carefully watch a video, a clip from the very popular blockbuster movie, “Cast Away”. Let’s watch this carefully.

Tom Hanks: Happy birthday. The most beautiful thing in the world is, of course the world itself. Johnny, have the happiest birthday ever. Score. Your gran.

Biplab Chattopa…: All right. The most beautiful thing in the world is the world itself. However, we are talking about play head time here. So, let’s move on to the next slide and I hope everyone watched this carefully.
So, here’s a question for you and I need you to think back and rescan the video in your mind and tell which products were placed inside that video clip? And here are a few of the options from there. And I’ll give you five seconds to think. 1, 2, 3, 4, 5. Let’s see who got it right. So, for those who got it right, congratulations. The answer is FedEx, Wilson, and Riedell.
If you are one of the brands that placed a product in a very popular blockbuster video like that, you will have certain questions in your mind. One would be, how many and what kind of people or households, on what type of devices, watched my brand in this movie worldwide? You can also have the question, what actions did the individuals take after watching my brand placement in the movie?
Solving these kinds of questions requires good quality tracking of the play head time. So, accurate play head time tracking of the video from many players at a global scale, with identity. Now, here the word identity is important, because in order to track what the people are doing at that time or before, or what kind of people are watching this video, you need to have the identity.
But before we get into details of that, let’s get into what play head time is. Play head time is the pointer representing the exact point in a video’s play span that is currently being watched by the user. We represent that using time in milliseconds from the beginning of the video content. Now, on the left side of the diagram, you can see that the slider bar is pointing at 51 seconds. You can also see that the content length, or the length of the video, is two minutes, 35 seconds.
Now, on the right-hand side, what you see is some data captured from Conviva’s tiny sensor. A piece of software that runs with the publisher’s player is sending data back to Conviva’s backend. And that data typically contains a lot of qualitative data, but there is also a field called PHT, which is pointing to 51 seconds, which matches exactly with the 51 seconds here.
We also have different types of PHTs, and in this slide we are defining them. One is what we call the Progress-bar PHT, meaning the play head time value that is exactly shown on the video. The other is the Reported PHT. Now, when you are collecting play head time as a third party, you are calling a certain API on the player to get the play head time value. So, the value that you get when you call that API is called the Reported PHT. However, oftentimes the Reported PHT is not something that’s usable for us. That is not what we intend to get. So, we need to adjust that accordingly to get the real value of the play head time, and that is what we call the Adjusted PHT.
In this slide, we see a schematic of how we collect and process play head time values. On the left side, you can see multiple publishers, these rectangles, and typically every big publisher will be present on multiple player platforms. And the green dot here is a small piece of software that Conviva calls its sensor. So, a small sensor runs with each player, and this sensor collects the play head time values continuously and sends them to Conviva’s backend. Now, our video analytics backend processes this play head time data and finally makes it available to build applications. So, we can build very interesting applications out of it, and we’ll go over some of those interesting use cases at the end of the presentation.
But before we get into that, let’s see some of the challenges that we face when we are even talking about the technology of play head time. Now, the challenges can be categorized into two different types. One is challenges related to collection of the play head time data, and the other is, how do you solve the real-world problems once we have the correct play head time data?
Now, on the collection side, of course we have data precision, diversity of data sources, impact of advertising, diversity of user behaviors, and data verification at large scale, that is, how to verify that when you are collecting the play head time values, you are doing it correctly. Then there are the challenges to solving real-world problems. Here comes data volume and coverage, and also identifying the user and their post-view behavior. Now, that last point is one that we will not get into in too much detail; it is out of scope for this particular presentation. However, that is what enables this very interesting use case.
First, data precision. To solve most real-world play head time-based use cases, we need second-level, if not millisecond-level, resolution of the play head time data. Now, how do we collect it? This collection is happening across publishers and players. So, we cannot have a continuous stream of the play head time coming to us, like the slider bar you see continuously sliding over the video. So, it depends on the sampling frequency, and we typically have two types of sampling here. One is event-based sampling, the other is periodic sampling. Event-based sampling happens when certain events happen in the video’s timeline. These events can be pause, or user seek, back or forward. It can also be an advertisement starting or an advertisement ending. When any of these things happens, we collect the play head time value by calling the API and we send it to the backend. So, that’s the event-based sampling.
The other is periodic sampling. Our sensor, every 20 seconds or 40 seconds, sends a heartbeat message to Conviva’s backend. When that happens, we also collect the play head time value and send it. So, the backend knows the play head time data of what is happening inside the video only at those points when the events happen, or at the periodic points.
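Because the backend only sees PHT at event and heartbeat points, any value in between has to be estimated. The following is a minimal Python sketch of one way that could be done, not Conviva’s implementation. The `PhtSample` record, its `playing` flag and the `estimate_pht` helper are hypothetical names introduced for illustration, and the sketch assumes playback at normal speed between samples.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class PhtSample:
    wall_ms: int    # wall-clock time at which the sample was taken
    pht_ms: int     # play head time returned by the player API
    playing: bool   # assumed field: True if playback continues after this sample
    source: str     # "event" (pause, seek, ad start/end, ...) or "heartbeat"


def estimate_pht(samples: List[PhtSample], wall_ms: int) -> Optional[int]:
    """Estimate PHT at wall_ms from the latest sample at or before it."""
    ordered = sorted(samples, key=lambda s: s.wall_ms)
    idx = bisect_right([s.wall_ms for s in ordered], wall_ms) - 1
    if idx < 0:
        return None  # nothing has been sampled yet
    last = ordered[idx]
    if not last.playing:
        return last.pht_ms  # paused or buffering: the play head does not move
    # Assumption: 1x playback speed, so PHT advances with wall-clock time.
    return last.pht_ms + (wall_ms - last.wall_ms)


# Hypothetical session mixing event-based and periodic (20 s) samples.
samples = [
    PhtSample(0, 0, True, "event"),              # video start
    PhtSample(20_000, 20_000, True, "heartbeat"),
    PhtSample(31_000, 31_000, False, "event"),   # pause
    PhtSample(40_000, 31_000, False, "heartbeat"),
    PhtSample(45_000, 90_000, True, "event"),    # seek forward and resume
]
```

The estimate is only as good as the event coverage: if a player omits an event, the estimate stays wrong until the next sample arrives.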
Then comes the diversity of data sources. Now, on the diversity of data sources, we have the player-level diversity. Different players report different types of video events, and different players report different data in those video events. We’ll see an example on the right side, but the video events that can happen range from video start, seek forward, seek backward, buffering, ad start, ad end, to video end, and actually, there are quite a large number of events that can happen.
But on the right side, you can see what happens from player to player. There’s an example of a Roku player, where the seek start event, which you can see in this rectangle, contains the correct PHT value, and here it’s pointing to 35 seconds. However, you can see the rest of the heartbeat message here in the lower rectangle, and what it shows is that there is no seek end event. So, the Roku player is not sending us any seek end event. The PHT value that you see here is actually the PHT value of the heartbeat, but the heartbeat does not contain an event called seek end.
Then we come to the impact of advertising. So, typically most videos, or a large chunk of videos, have advertising. Now, advertisement on video is typically inserted in two popular ways. One is client-side ad insertion, the other is server-side ad insertion. These two methods are not very different in how they play on the screen, but the internal methods are very different. In client-side ad insertion, ads are managed and added on the end user’s device. From the delivery point of view, content and ads are separated. The result of that is that the collected PHT value goes over the content only and does not increase while the ad is playing. However, with server-side ad insertion, the ad becomes a part of the content itself. And in that case, we notice that the play head time value is increasing even when the ad is playing, and that is not something we intend. So, on the backend side, we need to adjust that part.
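The difference described above suggests a simple check: if the reported PHT moves forward while an ad is on screen, the stream behaves like server-side ad insertion and needs adjustment. The Python sketch below illustrates that check only. The function name, the sample layout and the 500 ms tolerance are assumptions for illustration and do not come from the talk.

```python
from typing import List, Tuple

# (wall_ms, reported_pht_ms) samples taken while an ad was on screen
AdSamples = List[Tuple[int, int]]


def pht_advances_during_ad(ad_samples: AdSamples, tolerance_ms: int = 500) -> bool:
    """Return True if reported PHT moved forward while the ad played.

    True suggests server-side-like behavior (ad time is counted in PHT);
    False suggests client-side-like behavior (PHT is frozen during the ad).
    """
    if len(ad_samples) < 2:
        raise ValueError("need at least two samples inside the ad")
    ordered = sorted(ad_samples)  # sorts by wall-clock time first
    pht_delta = ordered[-1][1] - ordered[0][1]
    return pht_delta > tolerance_ms


# Hypothetical samples from a 20-second stretch of an ad.
client_side_like = [(60_000, 60_000), (80_000, 60_000)]
server_side_like = [(60_000, 60_000), (80_000, 80_000)]
```

With these inputs, `pht_advances_during_ad(client_side_like)` is `False` and `pht_advances_during_ad(server_side_like)` is `True`.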
Now, with this, I will stop here and let my colleague Adam take you through the rest of the presentation.

Adam Liu: Thanks Biplab, and this is Adam. I will finish the rest of the presentation. The other challenge of play head time data collection is the diversity of user behavior. The behavior of users watching videos includes watching pre-roll ads, mid-roll ads and post-roll ads, seeking or rewinding content, seeking or rewinding over a single ad, seeking or rewinding over multiple ads, skipping ads, pausing, and buffering. The player may also affect user behavior. For example, the ads in some players are forced to play. So, if the user seeks over the ads, the progress bar will automatically jump back to the starting point of the ad and force the user to watch the ad.
Some players let the user skip the ads after playing a few seconds, but some players do not allow it. So, different players have different effects on user behavior, which makes this problem more complicated. For each behavior, whether it is user [inaudible] or player control, we need to test and analyze the impact of the behavior on the play head time value and ensure that the collected data is accurate.
This is Conviva’s data analysis pipeline. Conviva’s sensor collects play head time from the publisher’s player and periodically sends a package to the Conviva gateway. We call this package a heartbeat package. The raw heartbeat package data is stored in Conviva’s data lake. Next, we use Databricks to read the raw heartbeat data, perform data analysis and visualize the play head time.
So, we are using the Databricks technology stack to analyze, sanitize, store and process play head time from billions of OTT devices. Specifically, we’re focusing on analyzing and testing play head time collection coverage and accuracy for all players and publishers; analyzing the diversity of user behavior and mapping it to the play head time; understanding the play head time offset caused by advertising; and implementing the play head time adjustment algorithm and verifying the adjusted play head time data. The advertisement will make the [inaudible] play head time different from the true play head progression value. So, we need detailed analysis to understand how to adjust the play head time data.
So, this is a real example for one player. This video contains a 30-second pre-roll ad and a two-minute middle ad. Suppose the user watching this video has finished the 30-second pre-roll ad. After a few seconds, the user performs a seek action. After a few minutes, the user seeks over the middle ad. Because the ad is forced to play in this player, the progress bar automatically jumps to the starting point of the ad and makes the user watch the ad. When the user finishes watching the ad, the progress bar automatically jumps again to the position that the user wanted to seek to. After a few minutes, the user rewinds over the middle ad. When the user watches the middle ad again, the player allows them to skip the ad, so the user performs a skip action.
So, every time a user behavior occurs, we need to analyze the offset between the collected play head time and the true play head progression on pure content. In this particular case, at the end of the 30-second pre-roll ad, we’ll find that the collected play head time value is 30 seconds, but the true play head progression is zero, because the video has not yet started to play. After the middle ad plays, the offset between the collected play head time and the play head progression becomes two minutes, 30 seconds, which is equal to the ad length of the 30-second pre-roll plus the two-minute middle ad.
When the user [inaudible], the offset returns to 30 seconds immediately, but when the user skips the ad, the gap changes from 30 seconds to two minutes, 30 seconds again. In summary, for this player, no matter what behavior the user performs between the pre-roll and the middle ad, the offset is always 30 seconds. No matter what behavior the user performs after the middle ad, the offset is always two minutes, 30 seconds. So, based on these findings, we designed [inaudible] adjustment algorithm on top of the collected play head time to calculate the adjusted play head time for this player. Remember that this example is only for one player, one publisher. To get a general play head time adjustment algorithm, we need to conduct the detailed analysis for all players and all publishers.
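The finding for this one player, that the offset depends only on which ads sit behind the play head, can be sketched as a position-based adjustment. The Python below is an illustration under stated assumptions, not the algorithm used in production. The `AdBreak` type and `adjust_pht` function are hypothetical, and the middle ad is assumed to start at 330 seconds on the reported timeline because the talk does not give its position.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class AdBreak:
    start_ms: int     # ad position on the reported (content + ads) timeline
    duration_ms: int


def adjust_pht(reported_ms: int, ad_breaks: List[AdBreak]) -> int:
    """Map a reported PHT that counts ad time to a position on pure content."""
    offset = 0
    for ad in sorted(ad_breaks, key=lambda a: a.start_ms):
        ad_end = ad.start_ms + ad.duration_ms
        if reported_ms >= ad_end:
            offset += ad.duration_ms             # ad lies fully behind the play head
        elif reported_ms > ad.start_ms:
            offset += reported_ms - ad.start_ms  # inside the ad: content is frozen
            break
        else:
            break                                # this ad and later ones are ahead
    return reported_ms - offset


# 30 s pre-roll and a 2-minute middle ad; the 330 s start is an assumed value.
breaks = [AdBreak(0, 30_000), AdBreak(330_000, 120_000)]
```

With these breaks, a reported PHT of 30 seconds maps to 0 seconds of content, any position between the two ads is shifted by 30 seconds, and any position after the middle ad is shifted by two minutes, 30 seconds, matching the offsets in the example.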
The next question is how to verify the adjusted play head time data. We cannot directly verify the data against the true content progression, because we have no way to collect it. So in this case, we have to verify the adjusted play head time data indirectly, by using video content lengths. Here’s our solution. Firstly, for each video content, we calculate the total content length. The content length of a video can be provided by the publisher or inferred from all playback sessions for that content. Secondly, for each video content, we calculate the offset between the content length and the maximum value of play head time, then plot the offset distribution for all video content.
So, we can see two offset distributions on the right side. The first one is the offset distribution of collected play head time and content length, where the median is minus 800 seconds. The second one is the offset distribution of adjusted play head time and content length, where the median is zero. The standard deviation of the second one is also smaller than the first one, which proves that the adjusted play head time is accurate.
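The indirect check above can be sketched in a few lines of Python. This is a hedged illustration: `offset_summary` is a hypothetical helper, the offset is taken as content length minus maximum PHT (a sign convention assumed here so that PHT inflated by ads gives a negative offset), and the numbers are synthetic values chosen for the example, not Conviva data.

```python
from statistics import median, pstdev
from typing import Dict, Tuple


def offset_summary(
    max_pht_s: Dict[str, float], content_length_s: Dict[str, float]
) -> Tuple[float, float]:
    """Return (median, standard deviation) of content_length - max PHT."""
    offsets = [
        content_length_s[cid] - max_pht_s[cid]
        for cid in content_length_s
        if cid in max_pht_s
    ]
    return median(offsets), pstdev(offsets)


# Synthetic example: three pieces of content, lengths in seconds.
content_length = {"a": 3600, "b": 1800, "c": 2400}
collected_max = {"a": 4400, "b": 2600, "c": 3190}  # still includes ad time
adjusted_max = {"a": 3600, "b": 1800, "c": 2395}   # after adjustment

collected_median, collected_std = offset_summary(collected_max, content_length)
adjusted_median, adjusted_std = offset_summary(adjusted_max, content_length)
```

A median near zero and a tighter spread for the adjusted values is the pattern the talk describes. In practice the maximum PHT would come from all playback sessions for each piece of content.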
So, the next step is to apply the analytical results in the real application. Here is the Conviva play head time processing and application pipeline. Similar to the analytics pipeline, the Conviva sensor collects the play head time and periodically sends the heartbeat package to the Conviva gateway. The backend constructs video session data from the raw heartbeat data, then calculates the adjusted play head time. After a few ETL jobs, the adjusted play head time is stored in distributed databases like [inaudible]. Play head time data is then enriched by other Conviva data or third-party data sources. From content and device metadata, which is also collected by Conviva’s sensor, we can gather information such as video series, episode name, absolute name, genre, category, or device ID, device hardware type, device name, device operating system, et cetera.
By ingesting ad server logs, we gather data on ad unit, advertiser name, campaign, line item and creative. Through cooperation with third-party demographic data providers, we can get age, gender, marital status, education level, income information, et cetera. All data is put together in the distributed database, which then supports the visualization of the product placement dashboard. Because of the data enrichment, we can see not only the product placement exposure, but also detailed insight on advertisers, devices and household demographics in our product placement dashboard. So, here’s an example.
There are a lot of business problems that can be solved with this dashboard, and we give three main use cases here. The first use case is viewer content engagement analysis. Through this analysis, we will know which second of the video is the most attractive, or the most boring, where people may churn. For the left chart, the x-axis represents the timeline of the video and the y-axis represents the number of people watching at that point of the timeline. We can see that the [inaudible] has two popular scenes. The first scene is the actor and actress seen together by the worker concern. The second one is the actress singing a popular song, “Always Remember Us This Way”. From the curve, we can see that people really liked these two parts of the movie.
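An engagement curve of the kind described, viewers per second of the content timeline, can be built from adjusted play head time once sessions are reduced to watched intervals. The Python sketch below is illustrative only. The `WatchInterval` layout and the `viewers_per_second` function are assumptions, and a production version would run as a distributed job rather than an in-memory loop.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# (viewer_id, start_s, end_s): a stretch of content watched, end exclusive
WatchInterval = Tuple[str, int, int]


def viewers_per_second(intervals: Iterable[WatchInterval]) -> Dict[int, int]:
    """Count distinct viewers for every second of the content timeline."""
    seen = set()
    for viewer, start, end in intervals:
        for sec in range(start, end):
            seen.add((viewer, sec))  # a rewatch counts the viewer only once
    return dict(Counter(sec for _, sec in seen))


# Hypothetical intervals: viewer "a" rewatches seconds 3 and 4.
intervals = [("a", 0, 5), ("b", 2, 4), ("a", 3, 6)]
curve = viewers_per_second(intervals)
```

Plotting `curve` with the second on the x-axis and the count on the y-axis gives the kind of chart the speaker describes. Counting raw views instead of distinct viewers would turn the same data into a rewatch signal.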
The chart on the right is a skip or rewatch heat map. From the chart, we can see which parts of the video users don’t like and fast-forward through, and which parts of the video users really like and watch again and again. The second use case is analysis of the impact of advertising. [Inaudible] in this video. From our viewership chart, we can see that users have different degrees of churn when each advertisement starts. The Conviva play head time analysis helps detect this impact and empowers publishers and advertisers to take the right action to mitigate these effects.
Our last use case answers the first question, product placement measurement. Product placement is also known as embedded marketing. It’s a marketing technique where references to a specific brand or product are incorporated into a publisher’s video content. For example, the 10-second Nike product placement shown in this video. By using Conviva play head time data, we can easily query and get the total number of viewers who have watched this 10-second video segment. So, this kind of measurement will help the advertiser to better understand their brand exposure.
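The segment query described above reduces to an interval-overlap test on adjusted play head time. The following Python sketch shows the idea on in-memory data. `segment_viewers` and the interval layout are hypothetical, and the real query would run against the distributed database mentioned in the talk.

```python
from typing import Iterable, Set, Tuple

# (viewer_id, start_s, end_s): content positions watched, end exclusive
WatchInterval = Tuple[str, int, int]


def segment_viewers(
    intervals: Iterable[WatchInterval], seg_start_s: int, seg_end_s: int
) -> Set[str]:
    """Return the viewers whose watched intervals overlap the segment."""
    return {
        viewer
        for viewer, start, end in intervals
        if start < seg_end_s and end > seg_start_s
    }


# Hypothetical placement between 60 s and 70 s of the content.
watched = [("a", 0, 100), ("b", 50, 62), ("c", 70, 200), ("a", 300, 400)]
reach = segment_viewers(watched, 60, 70)
```

Here `reach` contains viewers "a" and "b". Viewer "c" starts exactly where the segment ends and is not counted. A stricter definition of exposure could require a minimum overlap, for example several seconds inside the segment.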
Okay, so that’s all for our presentation. If you’re interested in video play head time, product placement measurement or our other use cases, please reach out to me and Biplab. Thank you.

Adam Liu

Master of computer science at the University of Queensland. 10 years data analytics/data scientist and 3 years product management working experience. Hands-on big data and AI skills. Experience in data...

Biplab Chattopadhyay

Biplab is a software architect with long experience of designing, developing large scale software in the fields of Ad-Tech, OTT video, programmatic, audience management systems, location services, wir...