Deep-Live-Cam Goes Viral, Allowing Anyone To Become a Digital Doppelganger

Posted by BeauHD on Tuesday August 13, 2024 @06:40PM from the world-we-live-in dept.

An anonymous reader quotes a report from Ars Technica: Over the past few days, a software package called Deep-Live-Cam has been going viral on social media because it can take the face of a person extracted from a single photo and apply it to a live webcam video source while following pose, lighting, and expressions performed by the person on the webcam. While the results aren’t perfect, the software shows how quickly the tech is developing — and how the capability to deceive others remotely is getting dramatically easier over time. The Deep-Live-Cam software project has been in the works since late last year, but example videos that show a person imitating Elon Musk and Republican Vice Presidential candidate J.D. Vance (among others) in real time have been making the rounds online. The avalanche of attention briefly made the open source project leap to No. 1 on GitHub’s trending repositories list (it’s currently at No. 4 as of this writing), where it is available for download for free. […]

Like many open source GitHub projects, Deep-Live-Cam wraps together several existing software packages under a new interface (and is itself a fork of an earlier project called “roop“). It first detects faces in both the source and target images (such as a frame of live video). It then uses a pre-trained AI model called “inswapper” to perform the actual face swap and another model called GFPGAN to improve the quality of the swapped faces by enhancing details and correcting artifacts that occur during the face-swapping process. The inswapper model, developed by a project called InsightFace, can guess what a person (in a provided photo) might look like using different expressions and from different angles because it was trained on a vast dataset containing millions of facial images of thousands of individuals captured from various angles, under different lighting conditions, and with diverse expressions.

During training, the neural network underlying the inswapper model developed an “understanding” of facial structures and their dynamics under various conditions, including learning the ability to infer the three-dimensional structure of a face from a two-dimensional image. It also became capable of separating identity-specific features, which remain constant across different images of the same person, from pose-specific features that change with angle and expression. This separation allows the model to generate new face images that combine the identity of one face with the pose, expression, and lighting of another.