Practical Magic: AI’s Potential in Video Production

So what exactly can neural networks, machine learning, computer vision, and natural language processing do for you?

By Greg Scoblete

According to the Gartner Hype Cycle, several critical components of what we collectively call “artificial intelligence” (AI)—namely neural networks, machine learning, computer vision and natural language processing—have passed the “peak of inflated expectations.” The next stop on Gartner’s cycle? A looming “trough of disillusionment,” followed by a “slope of enlightenment.”


Yet “disillusionment” is about the last word you’ll hear manufacturers and developers use to describe AI’s capabilities in the video market today. Even if expectations for the technology are drifting back down to Earth, there appears to be plenty of optimism left that AI will continue to unlock a host of practical innovations for video professionals. Here are just a few:

AI-Powered Transcription

Jim Tierney, CEO of software developer Digital Anarchy, says the company has incorporated machine learning and natural language processing in its Transcriptive AI plug-in to deliver video transcriptions automatically with an accuracy rate of 97 percent. The software can be used to create video captions and subtitles and can be integrated with Adobe Premiere to give users a library of video clips that are searchable via text keywords.

“I don’t think we’ll be completely replacing all our algorithms with ML-based algorithms any time soon.” — Jim Tierney, Digital Anarchy

“AI is not going to magically make everything a ‘one-click’ solution in post,” Tierney says. “Users are still going to have to help it along and, as developers, we need to build the tools that enable users to do that.” To that end, Digital Anarchy is researching areas where AI “might be helpful… both to complement Transcriptive with features like translation or to add features to our visual effects plug-ins that are pushing pixels around. For example, creating better skin tone masks for Beauty Box.”


You Say You Want More Resolution

One emerging use case for machine learning is image upscaling. Companies such as Denmark’s Pixop are leveraging techniques inspired by those Google created for upscaling lower-resolution still images, applying them to produce higher-resolution video from low-res source material.

The benefit, according to Chief Technology Officer and co-founder Jon Frydensbjerg, isn’t simply the ability to modernize—and monetize—archival video content for the seemingly unquenchable demands of streaming video providers, but to do so at a lower cost than existing restoration techniques.

Pixop’s video upscaling comparison

Pixop’s current state of the art delivers a scaling factor of four to any content, sufficient to upscale a high-definition video source to 4K resolution. According to Frydensbjerg, to go from a standard-definition file to a 4K file would require a scaling factor of five. “While doable, this is admittedly not the most practical solution,” Frydensbjerg notes. “We will be adding support for higher upscaling factors in the future, though.”
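The scaling factors quoted above are easier to compare with the arithmetic written out. The article does not say whether a "factor" counts total pixels or the multiplier along each dimension, so the minimal Python sketch below computes both. The frame sizes are commonly used values for each format and are an assumption here, as are the `RESOLUTIONS` table and the `scaling_factors` helper; none of this comes from Pixop.

```python
# Commonly used frame sizes (width, height). These are assumptions for
# illustration; the article names the formats but not their dimensions.
RESOLUTIONS = {
    "SD (NTSC)": (720, 480),
    "SD (PAL)": (720, 576),
    "HD": (1920, 1080),
    "4K UHD": (3840, 2160),
}


def scaling_factors(src, dst):
    """Return (horizontal, vertical, pixel_count) factors from src to dst."""
    src_w, src_h = RESOLUTIONS[src]
    dst_w, dst_h = RESOLUTIONS[dst]
    horizontal = dst_w / src_w
    vertical = dst_h / src_h
    pixel_count = (dst_w * dst_h) / (src_w * src_h)
    return horizontal, vertical, pixel_count


if __name__ == "__main__":
    for src in ("HD", "SD (NTSC)", "SD (PAL)"):
        h, v, px = scaling_factors(src, "4K UHD")
        print(f"{src} -> 4K UHD: {h:.2f}x wide, {v:.2f}x tall, {px:.2f}x pixels")
```

With these assumed frame sizes, HD to 4K UHD is a factor of two along each dimension and a factor of four in total pixel count, while a 480-line SD source needs 4.5 times as many lines to reach 4K UHD.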

“There is such an enormous potential to be creative and innovate in this space with the building blocks and technology already available. New research is coming out all the time, which redefines what is possible on even very degraded footage.” — Jon Frydensbjerg, Pixop

The AI Asset Manager

One of the more mature use cases for AI in video is the automatic extraction of important metadata from video files. Imagen does just that, using machine learning for facial tagging, auto-transcriptions, descriptive tagging and automatic video translations.

Charlie Horrell, Imagen’s CEO, explains that AI-derived tools could soon be used for other aspects of content management, including content onboarding, search improvements, workflow optimization and preparing video formats for delivery.

“Our goal is to use AI to bring order to unstructured data, help bring content to market more quickly and substantially reduce cost of ownership. Meanwhile we see enormous scope for AI under the hood, which is sometimes more difficult to get excited about—but can deliver enormous business benefits—such as predicting demand or personalizing the user experience.” — Charlie Horrell, Imagen

Adobe’s Auto Reframe in action

Adobe has been steadily integrating machine learning-powered features into a range of its Creative Cloud products under the collective brand “Sensei.” Sensei is currently enabling Auto Reframe in Premiere Pro, Color Match in Premiere Pro and Auto Ducking in Premiere Pro and Audition, said Patrick Palmer, principal product manager for video editing, Adobe. “Content Aware Fill for Video in After Effects, unveiled last year at the NAB Show, is also powered by intelligent algorithms.”


For Adobe, AI’s early virtue is in peeling away the grunt work and inefficiencies in a user’s workflow. “We are creating these features to automate tedious, repetitive tasks, but they aren’t black boxes,” Palmer said. “To achieve this, we test and tweak functionality and the user interface, discovering where we need to allow for more transparency and visibility, and where it makes sense to minimize.”

“We’ve seen artificial intelligence and machine learning integrate into more and more editing workflows in the last year, and these technologies will continue to be a game changer in video production and post production.” — Patrick Palmer, Adobe

So don’t worry, the robots aren’t coming for your job—only the boring parts.

Copyright 2020 NAB

This story comes from the NAB Show Daily Special Edition.
