Beyond deep fakes: Automatically transforming video content into another video's style

Published 19 September 2018

Researchers have created a method that automatically transforms the content of one video into the style of another. For instance, footage of Barack Obama can be rendered in the style of Donald Trump.

Researchers at Carnegie Mellon University have devised a way to automatically transform the content of one video into the style of another, making it possible to transfer the facial expressions of comedian John Oliver to those of a cartoon character, or to make a daffodil bloom in much the same way as a hibiscus would.

Because the data-driven method does not require human intervention, it can rapidly transform large amounts of video, making it a boon to movie production, as well as to the conversion of black-and-white films to color and to the creation of content for virtual reality experiences.

“I think there are a lot of stories to be told,” said Aayush Bansal, a Ph.D. student in CMU’s Robotics Institute. Film production was his primary motivation for helping devise the method, he explained; it could enable movies to be produced faster and more cheaply. “It’s a tool for the artist that gives them an initial model that they can then improve,” he added.

The method has many other potential uses, such as helping autonomous vehicles learn how to drive safely at night.

It also has the potential to be used for so-called “deep fakes,” videos in which a person’s image is inserted without permission, making it appear that the person has done or said things that are out of character, Bansal acknowledged.

“It was an eye opener to all of us in the field that such fakes would be created and have such an impact,” he said. “Finding ways to detect them will be important moving forward.”

Bansal presented the method at ECCV 2018, the European Conference on Computer Vision, held 8-14 September in Munich, Germany. His co-authors include Deva Ramanan, a CMU associate professor of robotics.

CMU notes that transferring content from one video to the style of another relies on artificial intelligence. In particular, a class of algorithms called generative adversarial networks, or GANs, has made it easier for computers to understand how to apply the style of one image to another, even when the two images have not been carefully matched.
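To give a feel for the adversarial idea behind such methods, here is a minimal sketch written with PyTorch (an assumption; the article does not show the researchers' code). A generator learns to map frames toward the target style while a discriminator learns to tell generated frames from real ones. The tiny networks, dummy data, and single training step are illustrative stand-ins, not the authors' actual models.

```python
# Illustrative GAN training step for unpaired style transfer (hypothetical;
# not the CMU researchers' implementation).
import torch
import torch.nn as nn

class Generator(nn.Module):
    """Maps an image from the source domain toward the target style."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1), nn.Tanh(),
        )
    def forward(self, x):
        return self.net(x)

class Discriminator(nn.Module):
    """Scores how much an image looks like it belongs to the target domain."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(32, 1, 4, stride=2, padding=1),
        )
    def forward(self, x):
        return self.net(x).mean(dim=(1, 2, 3))  # one realism logit per image

G, D = Generator(), Discriminator()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

source = torch.randn(4, 3, 64, 64)  # frames from the source video (dummy data)
target = torch.randn(4, 3, 64, 64)  # frames from the style video (dummy data)

# Discriminator step: learn to tell real target frames from generated ones.
fake = G(source).detach()  # detach so this step does not update the generator
loss_d = bce(D(target), torch.ones(4)) + bce(D(fake), torch.zeros(4))
opt_d.zero_grad()
loss_d.backward()
opt_d.step()

# Generator step: fool the discriminator so outputs look like the target style.
loss_g = bce(D(G(source)), torch.ones(4))
opt_g.zero_grad()
loss_g.backward()
opt_g.step()
```

Published approaches for video add further constraints on top of this basic adversarial game, such as consistency losses across frames; the sketch shows only the core idea of two networks training against each other.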