The spread of so-called fake news on social media has led to a decline in trust of these platforms – now a new form of media manipulation has the potential to shake our belief in videos

German computer scientists use deepfake technology to map their facial expressions to a video of Russian president Vladimir Putin (Credit: YouTube)

Deepfakes pose a growing threat of privacy invasion, fraud and defamation of character, as artificial intelligence moves into dangerous new territory by giving con artists the cyber tools to convincingly mimic another person.

But a university professor and his team of students are currently on the front-line in the battle to tackle these videos, which have grown in popularity over recent years.


What are deepfakes?

A deepfake uses deep learning AI to merge two videos – most commonly swapping one person’s face for another to create a completely new, artificial video.

The phenomenon was brought to many people’s attention in April 2018 by a public service announcement from BuzzFeed, in which Get Out director Jordan Peele used the technology to make Barack Obama appear to call Donald Trump a “complete dipshit”.

There are fears that, as the technology becomes more sophisticated, deepfakes could be used to commit fraud, undermine political opponents or damage someone’s reputation – leading it to be described as “fake news on steroids”.

Siwei Lyu, an associate professor in the Department of Computer Science at the University at Albany, State University of New York, specialises in deep learning and media forensics, and fears the impact deepfake videos could have.

“The deepfakes create a fake media that can create a kind of synthetic reality – although these videos aren’t real, they look real,” he says.

“They are being produced by algorithms so they don’t correspond to any real event happening in the physical world and that’s the scariest part.

“At the highest level, they could actually shake our belief in online visual media.”


How are deepfake videos made?

Deepfake face swapping was first used to place celebrities in porn videos, which were circulated on the social media platform Reddit.

In order to produce a deepfake video, a deep learning neural network – which is a form of AI – has to be trained with visual data of a person’s face.

According to Prof Lyu, a convincing video normally requires 500 to 1,000 high-quality images, or 10 to 30 seconds of video footage.

He adds: “The data requirement for training this model is not excessively high but the algorithm itself requires some time to run.

“On a commercial computer, it could typically take between three and seven days – so within a week, you could train this model.”

Once trained, the AI programme can take that person’s face and superimpose it on to another source video.
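Under the hood, most deepfake tools use an autoencoder with a shared encoder and one decoder per identity. The sketch below is purely illustrative – the random matrices stand in for weights that a real system would learn from the hundreds of training images described above:

```python
import numpy as np

rng = np.random.default_rng(0)

# Shared encoder: maps any face image (flattened) to a compact latent code.
# Two decoders: one trained to reconstruct person A, one for person B.
# (Weights here are random stand-ins; a real system learns them from
# hundreds of face images per person over days of training.)
D, LATENT = 64 * 64, 256
encoder = rng.standard_normal((LATENT, D)) * 0.01
decoder_a = rng.standard_normal((D, LATENT)) * 0.01
decoder_b = rng.standard_normal((D, LATENT)) * 0.01

def swap_face(face_of_a: np.ndarray) -> np.ndarray:
    """Encode a frame of person A, then decode with B's decoder.

    After training, the latent code captures pose and expression while
    each decoder supplies its own person's identity -- so this would
    produce person B wearing A's expression.
    """
    latent = encoder @ face_of_a.ravel()
    return (decoder_b @ latent).reshape(64, 64)

frame = rng.standard_normal((64, 64))  # stand-in for one video frame's face crop
fake = swap_face(frame)
print(fake.shape)  # (64, 64)
```

The key design point is the shared encoder: because both identities pass through the same latent space, a pose learned from one person can be rendered with the other person's face.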

Avengers actress Scarlett Johansson was one of the targets of the deepfake porn videos shared on Reddit.

Commenting on the use of her image, she told the Washington Post in December: “The fact is that trying to protect yourself from the internet and its depravity is basically a lost cause, for the most part.”

However, Prof Lyu believes that he and his team of student researchers have found a way to detect – and possibly prevent – the insidious and deceptive use of people’s images for deepfakes.


How to spot deepfakes

Detecting deepfakes in the blink of an eye

Prof Lyu’s team have so far developed three methods for identifying deepfake videos.

The first looks for a lack of blinking from the subjects in the videos.

Prof Lyu noticed that he was losing staring contests with the subjects in deepfakes and explains this unusual quirk is a result of the training data the AI is given.

“The deep neural network searches for images from the internet,” he says.

“When photographers upload these images online, they tend to filter out images where the subjects’ eyes are closed.

“This gives bias to the training data. The models that are trained on it have trouble understanding the action of a blinking eye and can’t synthesise a closed eye very well.”

Prof Lyu adds: “It’s a simple and intuitive method but unfortunately it is also very fragile because anyone who understands how we detect this can intentionally add in the blinking eye to their training data.”
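Blink detection of this kind is commonly built on the “eye aspect ratio” computed from facial landmarks – a standard technique in the field, not necessarily Prof Lyu’s exact implementation. A minimal sketch, assuming a landmark detector (such as dlib or MediaPipe) has already supplied six points per eye, and using an illustrative closed-eye threshold of 0.2:

```python
import numpy as np

def eye_aspect_ratio(eye: np.ndarray) -> float:
    """eye: six (x, y) landmarks around one eye, ordered as in the
    common 68-point face-landmark scheme. The ratio of vertical to
    horizontal eye openings collapses toward zero when the eye closes."""
    v1 = np.linalg.norm(eye[1] - eye[5])
    v2 = np.linalg.norm(eye[2] - eye[4])
    h = np.linalg.norm(eye[0] - eye[3])
    return (v1 + v2) / (2.0 * h)

def blink_count(ear_series, closed_thresh=0.2):
    """Count open-to-closed transitions in a per-frame EAR series.
    A long clip with zero blinks is a deepfake warning sign."""
    closed = [e < closed_thresh for e in ear_series]
    return sum(1 for a, b in zip(closed, closed[1:]) if not a and b)

# Simulated EAR trace: an open eye (~0.3) with two brief blinks.
trace = [0.31, 0.30, 0.12, 0.29, 0.30, 0.31, 0.10, 0.30]
print(blink_count(trace))  # 2
```

In practice the EAR series would be computed per frame across the whole video, and the blink rate compared against the roughly 15 to 20 blinks per minute typical of a real person.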


Detecting deepfakes by analysing head angles and 2D faces

The second method analyses the inconsistencies between the angle of the head and the face.

Prof Lyu says: “We have techniques that can estimate where the head is pointing to in three-dimensional space on a two-dimensional video.

“In deepfake videos, the face is spliced into the video, so when the head is pointing in a different direction to the camera the producers of the deepfake have to perform a 2D transformation to warp the face so that it matches up with the orientation of the head.

“Unfortunately, that transformation is a 2D transformation, but with a real person this will be in 3D.

“That introduces a lot of artefacts or imperfections when the subject of the video looks away from the camera.”

Prof Lyu claims the artefacts are often so obvious that he can use this method to manually determine whether a video he is watching is fake by simply pausing the video when the subject is turning their head.

However, this method also has its flaws.

Prof Lyu says: “Compared to blinking, this is a much harder problem to fix for people who make deepfakes, as it’s one of the fundamental issues with deepfake algorithms.

“But on the other hand, it doesn’t work all the time – if the subject is filmed while they are always looking directly at the camera, then this method can’t be applied very well.”
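One way to picture this check: estimate a 3D direction for the whole head and another for the central facial region, then flag frames where the two disagree. The sketch below assumes those direction vectors have already been estimated (in practice, from landmark subsets via a pose solver such as OpenCV’s solvePnP) and uses an illustrative 10-degree threshold:

```python
import numpy as np

def angle_deg(u: np.ndarray, v: np.ndarray) -> float:
    """Angle in degrees between two 3D direction vectors."""
    u = u / np.linalg.norm(u)
    v = v / np.linalg.norm(v)
    return float(np.degrees(np.arccos(np.clip(np.dot(u, v), -1.0, 1.0))))

def looks_spliced(head_dir, face_dir, thresh_deg=10.0):
    """Flag a frame when the whole-head orientation and the
    inner-face orientation disagree -- the artefact left by a
    2D warp of the pasted face."""
    return angle_deg(head_dir, face_dir) > thresh_deg

# Real face: both orientation estimates roughly agree.
print(looks_spliced(np.array([0.0, 0.0, 1.0]),
                    np.array([0.02, 0.0, 1.0])))  # False
# Spliced face: the 2D-warped inner face no longer matches the head pose.
print(looks_spliced(np.array([0.3, 0.0, 1.0]),
                    np.array([0.0, 0.0, 1.0])))   # True
```

The inconsistency is largest when the subject turns away from the camera, which is exactly where Prof Lyu says the artefacts become visible – and why the check fails when the subject faces the camera throughout.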

This motivated the computer sciences team to come up with a third solution for identifying deepfake videos.


An overly smooth face could be key to detecting deepfakes

This final method targets another fundamental imperfection with the production of these fake clips – because of the time-consuming process of creating a deepfake, producers cut corners by reducing the quality of the image.

Prof Lyu says: “The number of pixels on the face of the subject in the original video varies depending on the distance from the camera and the size of the original image, but the fake face is generally a fixed size of 64 by 64 pixels or 128 by 128 pixels.

“To accommodate that variation, the fixed-size faces need to be transformed – either enlarged, shrunk or rotated – to match the original video.

“This kind of scaling and rotation, when put together, will leave some artefacts such as an overly smooth face or a loss of detail.”

These artefacts can be detected by training a deep neural network to differentiate between the changes in detail in the facial region – essentially using AI to detect the use of AI.

So far, Prof Lyu claims the algorithm has proved to be “very effective” and “works very well”.
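The effect of that fixed-size resizing step can be simulated directly: shrink a detailed image to 64 by 64 pixels and enlarge it back, and the high-frequency detail measurably drops. A minimal sketch, using random noise as a stand-in for a detailed face crop and a simple Laplacian-variance measure of detail (the real detector is a trained neural network, not this hand-built statistic):

```python
import numpy as np

def laplacian_variance(img: np.ndarray) -> float:
    """Variance of a simple 4-neighbour Laplacian: a rough measure of
    high-frequency detail. Over-smoothed (resized) faces score lower."""
    lap = (-4 * img[1:-1, 1:-1] + img[:-2, 1:-1] + img[2:, 1:-1]
           + img[1:-1, :-2] + img[1:-1, 2:])
    return float(lap.var())

rng = np.random.default_rng(1)
face = rng.standard_normal((128, 128))   # stand-in for a detailed face crop

# Simulate the deepfake pipeline: shrink to a fixed 64x64 (block average),
# then enlarge back to the original size (nearest-neighbour repeat).
small = face.reshape(64, 2, 64, 2).mean(axis=(1, 3))
resized = np.repeat(np.repeat(small, 2, axis=0), 2, axis=1)

# The round trip strips fine detail -- the artefact the detector learns.
print(laplacian_variance(face) > laplacian_variance(resized))  # True
```

A neural network trained on pairs like these learns to spot the detail mismatch between the pasted facial region and the rest of the frame.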


Keeping pace with deepfake technology and finding a method of prevention

Despite three effective methods and a published paper on the subject, Prof Lyu promises he will keep pursuing better methods for deepfake detection.

He notes that even in the space of a few months between the end of last year and the start of 2019, he has seen noticeable advancements in the production of deepfakes.

“Papers from NVIDIA have shown that it’s able to generate highly realistic human faces and those techniques could potentially be combined with deepfakes to synthesise the fine facial details,” he says.

“That will make the image more visually realistic and harder for us to detect.”

Prof Lyu is already working on ways to target the use of this new technology in fake videos, noting that the limited quality of the technology was one of the main reasons deepfakes did not proliferate during the 2018 midterm elections in the US.

He says: “At the time, the technology was not as good as people previously thought and to create good deepfake videos, it can take a lot of manual processing work.

“It was not ready for large-scale deployment by a government-backed group.

“With enough resources and the right technology, the ability to create very believable high-quality deepfakes will be within reach in the next few years.

“Although it wasn’t a problem for the last round of elections, in 2021 the concerns will become more real and people should be taking action to prevent this problem.”

Prof Lyu believes the big tech companies and platforms, such as YouTube, Facebook and Twitter, should be the first line of defence against deepfakes.

However, his team has another card up its sleeve when it comes to preventing deepfakes.

It is currently working on a proactive form of protection for people who could become targets of deepfakes.

“Deepfake producers rely on the availability of a large amount of images of faces so they can train their AI model,” he says.

“This is the lifeline of this technology and we are providing images of our face as a free training resource every time we upload a selfie online.

“Even though the images we post are innocuous, it could still pose a threat to our identity being stolen and it’s creating a new privacy issue.

“My lab is currently working on methods to protect us from this sort of malicious attack.”

While his team have found ways of identifying deepfakes, Prof Lyu has also begun trialling a method that adds a protective layer to images uploaded to the internet.

This would have no visual impact on the image itself but can hide the person’s face from facial recognition programmes.

He claims that preliminary results are promising and it has the potential to slow down the entire production pipeline for creating deepfakes.
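Protections of this kind typically work by adding a tiny adversarial perturbation to the image – too small to see, but enough to throw off a recognition model and so poison the deepfake training data. The sketch below is a hypothetical illustration using a toy linear scorer in place of a real face-recognition network; it is not Prof Lyu’s actual method:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy stand-in for a face recogniser: a fixed linear scorer over pixels.
# (A real attack would use the gradient of an actual recognition network.)
w = rng.standard_normal(64 * 64)

def recognised(img: np.ndarray, thresh: float = 0.0) -> bool:
    """Does the toy recogniser's match score clear the threshold?"""
    return float(w @ img.ravel()) > thresh

def cloak(img: np.ndarray, eps: float = 0.01) -> np.ndarray:
    """FGSM-style perturbation: nudge each pixel a visually negligible
    amount in the direction that lowers the match score."""
    grad = w.reshape(img.shape)  # gradient of the score w.r.t. pixels
    return img - eps * np.sign(grad)

selfie = 0.001 * w.reshape(64, 64)  # an image the toy scorer matches
print(recognised(selfie))           # True
print(recognised(cloak(selfie)))    # False
```

Because the perturbation is bounded per pixel, the cloaked image looks unchanged to a person while the recogniser’s score collapses – which is what would starve a deepfake model of usable training faces.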

It could prove to be a crucial form of defence in the new battle with deepfakes.