In its most basic sense, a deepfake is a combination of face- and voice-cloning AI technologies that allow for the creation of life-like, computer-generated videos of a real person.
In order to develop a high-quality deepfake of an individual, developers need tens of hours of video footage of the person whose face or voice is to be cloned, as well as a human imitator who has learned the target’s facial mannerisms and voice.
Two people are therefore involved in the creation of a deepfake: the target, typically a famous person whose face and voice are cloned, and the imitator, an unknown individual who is generally closely associated with the project.
From tech to reality
From a technical standpoint, visual deepfakes are devised through the use of machine learning tools that break down images of the two individuals’ facial expressions into a matrix of key attributes, such as the position of the nose, eyes and mouth. Finer details, such as skin texture and facial hair, are given less importance and can be thought of as secondary.
The deconstruction is generally performed in such a way that it is almost always possible to fully recreate the original image of each face from its stripped-down elements. The quality of a deepfake depends largely on how well the final image is reconstructed, so that any movement in the imitator’s face is reproduced in the target’s face as well.
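The encode-and-reconstruct idea can be illustrated with a deliberately tiny sketch. It is not how real deepfake tools work (those learn the encoding with neural networks); here a "face" is just a handful of made-up landmark coordinates, and the names `Identity`, `encode` and `decode` are hypothetical. The expression is stored as offsets from a person's neutral pose, so a face can be rebuilt exactly from its stripped-down form, and the imitator's movements can be replayed on the target's face.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

Point = Tuple[int, int]
Landmarks = Dict[str, Point]  # hypothetical key attributes: eyes, nose, mouth


@dataclass(frozen=True)
class Identity:
    """A person, reduced to the resting position of each landmark."""
    name: str
    neutral: Landmarks


def encode(face: Landmarks, who: Identity) -> Landmarks:
    """Strip a face down to its expression: per-landmark offsets from neutral."""
    return {
        key: (face[key][0] - nx, face[key][1] - ny)
        for key, (nx, ny) in who.neutral.items()
    }


def decode(expression: Landmarks, who: Identity) -> Landmarks:
    """Rebuild a full face by applying an expression to a person's neutral pose."""
    return {
        key: (nx + expression[key][0], ny + expression[key][1])
        for key, (nx, ny) in who.neutral.items()
    }


# Invented coordinates, for illustration only.
imitator = Identity("imitator", {
    "left_eye": (35, 40), "right_eye": (65, 40), "nose": (50, 60), "mouth": (50, 80),
})
target = Identity("target", {
    "left_eye": (30, 38), "right_eye": (66, 38), "nose": (48, 58), "mouth": (48, 82),
})

# One video frame of the imitator: eyes raised slightly, mouth opened.
imitator_frame = {
    "left_eye": (35, 38), "right_eye": (65, 38), "nose": (50, 60), "mouth": (50, 86),
}

expression = encode(imitator_frame, imitator)
reconstructed = decode(expression, imitator)  # same person: the original comes back
swapped = decode(expression, target)          # other person: the movement is transferred
```

In this toy, `reconstructed` equals `imitator_frame`, while `swapped` shows the target's landmarks shifted by the imitator's movements.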
To elaborate on the matter, Matthew Dixon, an assistant professor and researcher at the Illinois Institute of Technology’s Stuart School of Business, told Cointelegraph that both face and voice can be easily reconstructed through certain programs and techniques, adding that:
“Once a person has been digitally cloned it is possible to then generate fake video footage of them saying anything, including speaking words of malicious propaganda on social media. The average social-media follower would be unable to discern that the video was fake.”
Similarly, speaking on the finer aspects of deepfake technology, Vlad Miller, CEO of Ethereum Express (a cross-platform solution that is based on an innovative model with its own blockchain and uses a proof-of-authority consensus protocol), told Cointelegraph that deepfakes are simply a way of synthesizing human images by making use of a machine learning technique called a generative adversarial network (GAN), an algorithm that deploys a combination of two neural networks.
The first generates the image samples, while the second distinguishes the real samples from the fake ones. A GAN’s operation can be compared to the work of two people, one of whom is engaged in counterfeiting while the other tries to distinguish the copies from the originals. If the first network offers an obvious fake, the second will immediately detect it, after which the first will improve its work by offering a more realistic image.
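The counterfeiter-versus-inspector loop can be sketched in a few lines. This is a toy, assuming one-dimensional "images" (plain numbers), a generator with a single learnable shift `b`, and a logistic discriminator with parameters `w` and `c`. All names and numbers are invented for illustration, and real GANs use deep networks trained with automatic differentiation rather than the hand-derived gradients below.

```python
import math

REAL_SAMPLES = [3.5, 4.0, 4.5]  # stand-in for "real images" (made-up numbers)
NOISE = [-0.5, 0.0, 0.5]        # fixed noise inputs, kept deterministic on purpose


def sigmoid(t: float) -> float:
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def discriminate(x: float, w: float, c: float) -> float:
    """Discriminator: estimated probability that x is a real sample."""
    return sigmoid(w * x + c)


def generate(z: float, b: float) -> float:
    """Generator: turns noise z into a fake sample by shifting it by b."""
    return z + b


def train(rounds: int, lr: float = 0.05):
    """Alternate discriminator and generator updates; return (b, w, c)."""
    b, w, c = 0.0, 0.0, 0.0
    n_real, n_fake = len(REAL_SAMPLES), len(NOISE)
    for _ in range(rounds):
        fakes = [generate(z, b) for z in NOISE]
        # Discriminator step: gradient ascent on log D(real) + log(1 - D(fake)).
        grad_w = (sum((1 - discriminate(x, w, c)) * x for x in REAL_SAMPLES) / n_real
                  - sum(discriminate(g, w, c) * g for g in fakes) / n_fake)
        grad_c = (sum(1 - discriminate(x, w, c) for x in REAL_SAMPLES) / n_real
                  - sum(discriminate(g, w, c) for g in fakes) / n_fake)
        w += lr * grad_w
        c += lr * grad_c
        # Generator step: gradient ascent on log D(fake), i.e. try to fool D.
        grad_b = sum((1 - discriminate(g, w, c)) * w for g in fakes) / n_fake
        b += lr * grad_b
    return b, w, c


if __name__ == "__main__":
    print(train(200))
```

After the first round the discriminator has already learned that larger values look "real", which pushes the generator's shift `b` upward toward the real data. Simple setups like this one can oscillate around the real data rather than settle, which is one reason practical GAN training is delicate.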
Regarding the negative social and political implications that deepfake videos can have on the masses, Steve McNew, an MIT-trained blockchain/cryptocurrency expert and senior managing director at FTI Consulting, told Cointelegraph:
“Online videos are exploding as a mainstream source of information. Imagine social media and news outlets frantically and perhaps unknowingly sharing altered clips — of police bodycam video, politicians in unsavory situations or world leaders delivering inflammatory speeches — to create an alternate truth. The possibilities for deepfakes to create malicious propaganda and other forms of fraud are significant.”
Examples of deepfakes being used for nefarious purposes
Since deepfake technology is able to manipulate and imitate the facial features and personality characteristics of real-world individuals, it raises many legitimate concerns, especially in relation to its use for various shady activities.
Additionally, for many years now, the internet has been flooded with simple tutorials that teach people how to create digitally altered audio/video data that can fool various facial recognition systems.
Not only that, but some truly disturbing instances of audio/video manipulation have recently surfaced that have called into question the utility of deepfakes. For example, a recent article claims that since 2014, deepfake technology has advanced to such levels that today, it can be used to produce videos in which the target can not only be made to express certain emotions but also bear resemblance to certain ethnic groups as well as look a certain age. On the subject, Martin Zizi, CEO of Aerendir, a physiological biometric technology provider, pointed out to Cointelegraph:
“AI does not learn from mistakes, but from plain statistics. It may seem like a small detail, but AI-based on plain statistics — even with trillion bytes of data — is just that, a statistical analysis of many dimensions. So, if you play with statistics, you can die by statistics.”
Zizi then went on to add that another key facet of facial recognition is that it is based on neural networks that are quite fragile in nature. From a structural standpoint, these networks can be thought of as cathedrals, wherein once you remove one cornerstone, the whole edifice crumbles. To further elaborate on the subject, Zizi stated:
“By removing 3 to 5 pixels from a 12 million pixels image of someone’s face brings recognition to zero! Researchers have found that adversarial attacks on neural net attacks can find those 3 to 5 pixels that represent the ‘cornerstones’ in the image.”
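The "cornerstone" idea can be made concrete with a toy example. The sketch below does not reproduce the research Zizi refers to and does not use a neural network: it assumes a hypothetical linear recognizer whose score is a weighted sum of pixel intensities, and greedily blanks whichever pixel currently contributes most until the score falls below the recognition threshold. The image, weights and threshold are invented for illustration.

```python
from typing import List, Tuple


def score(pixels: List[float], weights: List[float]) -> float:
    """Toy recognizer: weighted sum of pixel intensities."""
    return sum(p * w for p, w in zip(pixels, weights))


def find_cornerstones(
    pixels: List[float], weights: List[float], threshold: float
) -> Tuple[List[int], List[float]]:
    """Greedily blank the most influential pixel until recognition fails."""
    attacked = list(pixels)
    removed: List[int] = []
    while score(attacked, weights) >= threshold:
        contributions = [p * w for p, w in zip(attacked, weights)]
        worst = max(range(len(attacked)), key=lambda i: contributions[i])
        if contributions[worst] <= 0:
            break  # nothing left whose removal would lower the score
        attacked[worst] = 0.0
        removed.append(worst)
    return removed, attacked


# A 4x4 "image" flattened to 16 pixels, all fully lit.
pixels = [1.0] * 16
# Most pixels barely matter; three hypothetical cornerstones carry the decision.
weights = [0.1] * 16
weights[5], weights[6], weights[10] = 3.0, 2.5, 2.0
threshold = 3.0

removed, attacked = find_cornerstones(pixels, weights, threshold)
```

Here `removed` lists the three heavily weighted pixels, and blanking only those is enough to push the score under the threshold. Attacks on real neural networks have to search for such pixels in a far larger, non-linear space.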
A final prominent example of deepfake tech being misused for financial gain came when the CEO of an unnamed United Kingdom-based energy firm was recently scammed into transferring 220,000 euros ($243,000) to an unknown bank account because he believed he was on the phone with his boss, the chief executive of the firm’s parent company. In reality, the voice belonged to a scammer who had made use of deepfake voice technology to spoof the executive.
Blockchain may help against deepfakes
A recent 72-page report issued by Witness Media Lab cites blockchain as a legitimate tool for countering the various digital threats posed by deepfake technology.
In this regard, using blockchain, people can digitally sign and confirm the authenticity of various video or audio files that are directly or indirectly related to them. Thus, the more digital signatures that are added to a particular video, the more likely it will be considered authentic.
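A rough sketch of how several parties might vouch for one file is shown below. Python's standard library has no public-key signing, so HMAC with a per-signer secret key stands in here for a real digital signature; an actual system would use public-key signatures so that anyone can verify them without knowing a secret. The signer names, keys and helper functions are hypothetical.

```python
import hashlib
import hmac
from typing import Dict


def fingerprint(media: bytes) -> bytes:
    """SHA-256 digest that identifies the exact bytes of a media file."""
    return hashlib.sha256(media).digest()


def sign(media: bytes, secret_key: bytes) -> str:
    """A signer's attestation for a file (HMAC used as a stand-in for a signature)."""
    return hmac.new(secret_key, fingerprint(media), hashlib.sha256).hexdigest()


def count_valid(media: bytes, attestations: Dict[str, str], keys: Dict[str, bytes]) -> int:
    """Count how many signers' attestations still match this exact file."""
    valid = 0
    for signer, tag in attestations.items():
        key = keys.get(signer)
        if key is not None and hmac.compare_digest(tag, sign(media, key)):
            valid += 1
    return valid


# Hypothetical signers and keys, for illustration only.
keys = {"speaker": b"speaker-secret", "broadcaster": b"broadcaster-secret"}
original = b"raw bytes of the original video"
attestations = {name: sign(original, key) for name, key in keys.items()}

tampered = original + b" with one altered frame"
```

With these inputs, both attestations validate against `original` and none validate against `tampered`, because any change to the bytes changes the fingerprint that was signed.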
Commenting on the matter, Greg Forst, director of marketing for Factom Protocol, told Cointelegraph that when it comes to deepfakes, blockchain has the potential to offer the global tech community a unique solution, or at least a major part of it. He pointed out:
“If video content is on the blockchain once it has been created, along with a verifying tag or graphic, it puts a roadblock in front of deepfake endeavors. However, this hinges on video content being added to the blockchain from the outset. From there, digital identities must underline the origins and creator of the content. Securing data at source and having some standardization for media will go a long way.”
McNew also believes that owing to the blockchain’s overall immutability, once a particular data block has been confirmed by the network, its contents cannot be altered. Thus, if videos (or even photos, for that matter) are made to flow immediately into a blockchain verification application before being made available for sharing, altered videos could be easily identified as fake.
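The verification flow McNew describes can be approximated with a hash-linked list. This is only a sketch under strong assumptions: it is a single in-memory list with no network or consensus, whereas a real blockchain's immutability comes from many nodes agreeing on the chain. The function names are hypothetical. Each block stores the SHA-256 digest of a video plus the hash of the previous block, so an altered video no longer matches any registered digest, and an edited block breaks the links.

```python
import hashlib
import json
from typing import Dict, List

Block = Dict[str, str]
GENESIS = "0" * 64  # placeholder "previous hash" for the first block


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def block_hash(block: Block) -> str:
    """Hash of a block's contents, including the link to the previous block."""
    payload = json.dumps(
        {"prev": block["prev"], "media_hash": block["media_hash"]}, sort_keys=True
    )
    return sha256_hex(payload.encode("utf-8"))


def register(chain: List[Block], media: bytes) -> None:
    """Append a block recording the digest of a newly created media file."""
    prev = chain[-1]["hash"] if chain else GENESIS
    block = {"prev": prev, "media_hash": sha256_hex(media)}
    block["hash"] = block_hash(block)
    chain.append(block)


def chain_is_intact(chain: List[Block]) -> bool:
    """Check every block's own hash and its link to the block before it."""
    prev = GENESIS
    for block in chain:
        if block["prev"] != prev or block["hash"] != block_hash(block):
            return False
        prev = block["hash"]
    return True


def is_registered(chain: List[Block], media: bytes) -> bool:
    """True only if the chain is intact and contains this exact file's digest."""
    digest = sha256_hex(media)
    return chain_is_intact(chain) and any(b["media_hash"] == digest for b in chain)


chain: List[Block] = []
register(chain, b"bodycam clip, original bytes")
register(chain, b"press conference, original bytes")
```

Checking `is_registered(chain, b"bodycam clip, original bytes")` succeeds, while the same call on altered bytes fails. Overwriting a stored `media_hash` after the fact makes `chain_is_intact` return `False`.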
Lastly, a similar idea was shared by Miller, who is of the opinion that blockchain technology in conjunction with artificial intelligence can help solve many of the privacy and security concerns put forth by deepfakes. He added:
“AI perfectly copes with the collection, analysis, sorting and transmission of data, improving the speed and quality of execution of internal processes. The blockchain, in turn, ‘makes sure’ that no one intervenes in the work of AI — it protects data and its sequence from any encroachment.”
Blockchain technology has its own limitations
As things stand, a few small drawbacks are preventing blockchain technology from being actively used to monitor deepfakes on the internet. For starters, the technology is limited in its overall scalability, as the computational resources and memory required to combat digitally manipulated A/V data in real time are substantial.
Another potential issue that could arise as a result of blockchain being used for deepfake detection is a substantial curbing of crowdsourced video content (such as the material that is currently available on YouTube). On the issue, Dixon pointed out:
“How does someone in a poor country reach the world with their message if they have to be approved by a Silicon Valley-based company? Should we be entrusting tech companies with such power? Liberty is always at stake when trust weakens.”
A similar opinion is shared by Hibryda, creator and founder of Bitlattice, a distributed ledger system that uses a multidimensional lattice structure to address issues such as scalability, security, timing, etc. In his view:
“The biggest drawback of blockchain tech lies in its inability to determine whether the signed media is really genuine or not. But that isn't an internal issue of blockchain or related technologies — they only provide ledgers that are extremely hard to manipulate. It's external and there's no good way to solve that. While crowd-powered verification could be a partial solution, given crowds can be manipulated it's rather impossible to build a system that provides reliable and objective fact-checking.”
However, Forst told Cointelegraph that while the majority of people tend to believe that leveraging blockchain might be too expensive for deepfake detection, there are several open-source solutions that seek to do this. Forst then added: “The biggest drawback is that blockchain doesn't solve the problem with deepfakes in its entirety, rather it can be a piece of the solution.”