Summarizing Entertainment Video Using Color and Dialogue

By Fred Hohman, Sandeep Soni, Ian Stewart, and John Stasko

Data is Coming

The hit TV show Game of Thrones has won critical acclaim and the attention of millions of fans with its inspiring cinematography and brilliant dialogue. Dedicated fans of the show can recall critical scenes and key quotes with ease, but can they pick out high-level trends that emerge over the course of the series? For instance, in which season was the frozen North the focus of the plot? With all the violent twists in the plot, how often does the show's dialogue tend toward the topic of death? Our system seeks to highlight these large-scale patterns that might otherwise be missed.

Our visualization highlights the high-level aspects of each Game of Thrones episode, separating the data into visual and textual components. We process over 60 hours of video data and its subtitles (dialogue only) into nominal and quantitative variables of color and text frequencies as follows.

The Color

Color quantization is a process that reduces the number of distinct colors used in an image, usually with the intention that the new image should be as visually similar as possible to the original image. We use the well-studied median cut quantization clustering algorithm from computer graphics (Heckbert 1982) on our data in order to extract the top ten most dominant colors in an image. Here's an example of color quantization applied to the title screen:
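
Separately from the title-screen figure, the median cut idea itself can be sketched in a few lines. This is a minimal pure-Python illustration under stated assumptions, not the authors' implementation: the function names `median_cut` and `channel_ranges` are hypothetical, pixels are assumed to be plain `(R, G, B)` integer tuples, and the rule for choosing which box to split (the one with the widest single-channel spread) is one common variant of the algorithm.

```python
from typing import List, Tuple

Color = Tuple[int, int, int]


def channel_ranges(box: List[Color]) -> List[int]:
    """Spread (max - min) of the R, G and B channels within one box."""
    return [max(p[c] for p in box) - min(p[c] for p in box) for c in range(3)]


def median_cut(pixels: List[Color], n_colors: int = 10) -> List[Color]:
    """Reduce a list of RGB pixels to at most n_colors representative colors."""
    if not pixels:
        return []
    boxes = [list(pixels)]
    while len(boxes) < n_colors:
        # Pick the box with the widest spread along any single channel.
        idx = max(range(len(boxes)), key=lambda i: max(channel_ranges(boxes[i])))
        ranges = channel_ranges(boxes[idx])
        spread = max(ranges)
        if spread == 0:
            break  # every box holds one distinct color; nothing left to split
        channel = ranges.index(spread)
        # Sort along that channel and cut at the median pixel.
        ordered = sorted(boxes[idx], key=lambda p: p[channel])
        mid = len(ordered) // 2
        boxes[idx:idx + 1] = [ordered[:mid], ordered[mid:]]
    # Represent each box by the (integer) mean of its pixels.
    return [
        tuple(sum(p[c] for p in box) // len(box) for c in range(3))
        for box in boxes
    ]
```

Calling `median_cut(pixels, 10)` on the pixels of one frame would return up to ten palette colors; fewer come back when the frame contains fewer distinct colors.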

Since this technique requires stills, we segment the video data into frames at a rate of one frame per second, producing on average 3,600 images for a 60-minute episode. Now we divide each episode into 60 time slices of equal size and extract the top ten colors from each slice to get one color palette. For six seasons of ten episodes each, that means our color data distillation looks like:

60 video hours → 216,000 images → 3,600 color palettes
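
The totals above follow directly from the sampling rate and the slice count. The sketch below reproduces that arithmetic and shows one way a frame could be assigned to a time slice. The helper `slice_index` and the constant names are hypothetical, and a nominal 60-minute episode length is assumed for every episode.

```python
SEASONS = 6
EPISODES_PER_SEASON = 10
SECONDS_PER_EPISODE = 60 * 60  # assumption: a nominal 60-minute episode
SLICES_PER_EPISODE = 60


def slice_index(frame_number: int, n_frames: int,
                n_slices: int = SLICES_PER_EPISODE) -> int:
    """Map a 0-based frame number to its 0-based, equal-sized time slice."""
    return frame_number * n_slices // n_frames


episodes = SEASONS * EPISODES_PER_SEASON
total_images = episodes * SECONDS_PER_EPISODE   # one frame per second
total_palettes = episodes * SLICES_PER_EPISODE  # one palette per slice
```

With 3,600 frames per episode, frames 0 through 59 fall into slice 0, frames 60 through 119 into slice 1, and so on, so each palette summarizes roughly one minute of video.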

In total we are crunching approximately 75GB of video data into 0.7MB of RGB color values. Our color extraction technique highlights the most visually salient colors within the data, which are often different from the most frequent colors. An example of this is seen below in the second time slice of Episode 1 of Season 1 (S1E1), "Winter is Coming":

The Text

Similar to the color extraction, we first segment each episode into 60 equal-sized time slices and group all dialogue within each slice. We then annotate the dialogue for words that fall into one of the following categories: anger, death, family, home, humans, negative affect, positive affect, religion, swearing, and sexual.

These categories are a subset of those provided by the Linguistic Inquiry and Word Count (LIWC) lexicon, which is commonly used for automated textual analysis (Tausczik and Pennebaker 2010). We selected these categories due to their relevance to the Game of Thrones dialogue: for instance, Tyrion Lannister is known for his colorful swearing habit, which is captured in the "swearing" category.

From the annotated dialogue, we extract the raw frequency of each category at each time slice, as well as the individual word frequencies in each episode. An example of the full pipeline is provided below.
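
Independently of the authors' pipeline figure, the per-slice counting step can be sketched as follows. Everything here is an assumption made for illustration: the LIWC word lists are not reproduced, so `TOY_LEXICON` is a tiny hypothetical stand-in with exact-match words, subtitle lines are assumed to arrive as `(start_time_in_seconds, text)` pairs, and `category_counts` is a hypothetical helper name.

```python
import re
from collections import Counter
from typing import Dict, List, Set, Tuple

# Hypothetical toy lexicon; the real LIWC categories are far larger.
TOY_LEXICON: Dict[str, Set[str]] = {
    "death": {"dead", "die", "kill", "killed"},
    "family": {"father", "brother", "sister", "son"},
    "anger": {"hate", "rage", "kill"},
}


def category_counts(
    lines: List[Tuple[float, str]],
    episode_seconds: float,
    n_slices: int = 60,
    lexicon: Dict[str, Set[str]] = TOY_LEXICON,
) -> List[Counter]:
    """Count category hits per time slice for one episode's subtitle lines."""
    counts = [Counter() for _ in range(n_slices)]
    for start, text in lines:
        # Assign the whole line to the slice containing its start time.
        idx = min(int(start * n_slices / episode_seconds), n_slices - 1)
        for word in re.findall(r"[a-z']+", text.lower()):
            # A word may belong to several categories (e.g. "kill").
            for category, words in lexicon.items():
                if word in words:
                    counts[idx][category] += 1
    return counts
```

Each element of the returned list is the raw category frequency for one time slice, which is the quantity the text describes plotting.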

Case Study 1: Wildfire

Color + Text Analysis

The bright green of "wildfire" is one of the most memorable colors in the entire Game of Thrones series, because of the fire's hue and its importance in the plot as a deadly weapon.

Tyrion holding a small jar of wildfire.

We explore the visual impact of the wildfire, as well as its corresponding impact on the dialogue, in the two episodes where it appears: S2E9 "Blackwater" and S6E10 "The Winds of Winter."

S2E9: "Blackwater"

Time slices 28 through 30.

Season 2 Episode 9 portrays the first of only two occurrences where viewers witness a wildfire explosion. At this point in the series, King's Landing prepares for an attack from an oncoming yet unsuspecting fleet of ships. Wildfire changes the tide of battle and, ultimately, the rest of the plot. Our visualization clearly reveals the visually prominent wildfire:

Notice here that the bright green color sticks out in the episode's color palettes and all dialogue ceases, a rather unusual scenario in Game of Thrones. We can inspect the image data to take a peek at the scene itself to illustrate the lack of dialogue and the vibrant colors.

S6E10: "The Winds of Winter"

Time slice 15.

Viewers witness the second wildfire explosion during an important gathering in King's Landing, where the event directly leads to a change in rulership. We show below the resulting visualization for this episode. Once again, we note the bright green colors that dominate time slice 15.

Here we see that the wildfire gets less screen time than before, but its impact on the dialogue is the same. Furthermore, a lack of dialogue builds up to the explosion (it is unusual to see so little dialogue for 15 straight minutes), and the dialogue immediately afterward shows a burst of anger words, reflecting the characters' angry reaction to the explosion. Once again, we can inspect the image data around this scene to clarify the wildfire's impact on color.

In both episodes, we see that the wildfire represents the most vivid color within the episode and even silences all dialogue when it appears.

Case Study 2: White Walkers

Color Analysis

Fans of Game of Thrones will also recognize the "white walkers" as a major plot point throughout the series, as they represent a major threat to all citizens of Westeros. These creatures are characterized by bright blue eyes and skin that corresponds with their habitat in the frozen North.

The Night King leading an army of hooded white walkers.

The walkers appear periodically throughout the series to terrorize the main characters, and we provide visual evidence of their key appearances in one episode from each of the six seasons. For each episode, we present its color visualization and three key images from its white walker scene.

S1E1: "Winter is Coming"

The beginning of the episode.

In S1E1, before any characters are introduced, the series opens by showing the destruction white walkers are capable of, foreshadowing their importance for seasons to come.

S2E10: "Valar Morghulis"

The end of the episode.

In S2E10, the last scene of the season, the viewers are presented with the first good look at the white walkers and the size of their force after they spare a defenseless character.

S3E8: "Second Sons"

The end of the episode.

In S3E8, Sam battles a white walker and learns their weakness. This scene falls on the darker side of the blue spectrum, as seen by the deeper blues in the color visualization.

S4E4: "Oathkeeper"

The end of the episode.

In S4E4, the leader of the white walkers, the Night King, is revealed. It recruits a new follower to the cause, as seen in the characteristic bright blue eyes.

S5E8: "Hardhome"

The last third of the episode.

In S5E8, Jon and his new comrades are ambushed by the largest white walker force seen to date. The Night King demonstrates its powers and forces the human characters to flee by boat.

S6E5: "The Door"

The last third of the episode.

In S6E5, more white walker leaders are revealed and attack Bran, as he accidentally revealed his location when exploring the extent of his powers and knowledge of the future. Hodor captures fans' hearts. Notice the bright white and red colors from the fire explosions toward the very end of the episode.

Although the walkers do not speak, their presence on-screen is often the most prominent color combination of blues, whites, and grays in the episodes during which they appear. Furthermore, exploring the interactive color plot above reveals that their blue/white colors occur more frequently toward the end of the series, which could be the producers' way to foreshadow the walkers' increased prominence in the upcoming seasons, and ultimately the series finale.

Case Study 3: Rise and Fall of the Houses

Text Analysis

We've looked at the frequency of word categories like "anger" and "death," but what about the frequency of mentioning characters by name? Which House is discussed the most in the series?

We first compute the frequency of character names mentioned in each episode (restricted to main characters), then group the characters by their representative House. For instance, if Arya and Jon Snow are mentioned once each, then House Stark gets a count of 2. We then plot these frequencies using the same bubble plot as above to compare the relative importance of Houses over time.
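
The grouping step described above can be sketched in a few lines. This is an illustration under assumptions rather than the authors' code: the `HOUSE_OF` mapping is a hypothetical and deliberately tiny subset of main characters keyed by first name, and `house_counts` is a hypothetical helper name. Following the example in the text, Jon Snow is counted toward House Stark.

```python
import re
from collections import Counter
from typing import Dict

# Hypothetical first-name-to-House mapping; a real one would cover
# every main character.
HOUSE_OF: Dict[str, str] = {
    "arya": "Stark",
    "jon": "Stark",
    "ned": "Stark",
    "tyrion": "Lannister",
    "cersei": "Lannister",
    "stannis": "Baratheon",
}


def house_counts(dialogue: str, house_of: Dict[str, str] = HOUSE_OF) -> Counter:
    """Count character-name mentions in dialogue, grouped by House."""
    counts: Counter = Counter()
    for token in re.findall(r"[a-z]+", dialogue.lower()):
        house = house_of.get(token)
        if house is not None:
            counts[house] += 1
    return counts
```

For instance, `house_counts("Arya and Jon Snow rode north.")` yields a count of 2 for House Stark, matching the example in the text. Matching on first names alone is a simplification; surnames and nicknames would need their own entries.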

As expected, the Starks and Lannisters dominate throughout the series, with the Lannisters garnering less interest in the 5th and 6th seasons as their numbers dwindle. Interestingly, examining the histogram reveals that Ned Stark enjoys popularity throughout the series despite his death in the first season. Meanwhile, the Baratheons enjoy modest popularity until the end of season 2, when the Baratheon uprising is quashed by the Lannisters and only Stannis remains as the living Baratheon heir to the throne. The user is encouraged to look for connections between character names and color patterns in the interactive plot above (e.g., more Stark references in snowy seasons).

A Viz of Ice and Fire

As we've shown, visualizing data from Game of Thrones reveals interesting insights into the audiovisual experience. We hope that fans of other TV shows can apply these ideas to find other hidden patterns in their favorite shows. We do not own any of the imagery displayed here. All rights belong to HBO.