WP1 – Deepfake Attribution and Recognition

Leader: Rocco De Nicola (rocco.denicola@imtlucca.it)

Detecting falsified content is often not enough to limit the diffusion of fake media and prosecute the parties who created and spread it. In many cases, it is also necessary to identify the origin of the fake content and, possibly, trace back its processing history. To this end, identifying the tools, in particular the network architectures and the specific models, that were used to create the fake content plays a central role. The goal of this WP is to identify the traces, a.k.a. fingerprints, left within deepfakes by the models that generated them, and to use such fingerprints to i) distinguish deepfakes from genuine content, ii) attribute fake content to a specific network architecture, and iii) attribute fake content to a specific model.

Task 1.1 – Deepfake “Fingerprint” Modeling (OC)

Partners involved in the execution of the task, with an estimate of the resources involved (number of person-months, including those of the RTDs to be recruited): OPEN CALL
TASK DESCRIPTION: Task 1.1 aims to identify unique fingerprints in deepfake images and videos generated by neural networks. The noiseprint extracted from such content can reveal a signature that identifies the network that created it, but its robustness and universality need investigation. This task seeks to develop techniques to distinguish between real and fake content, using both model-based and data-driven methods. It also considers extending the analysis to temporal signatures in video signals. The goal is to identify unique markers that can aid in detecting and distinguishing between real and fake media content.
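As a toy illustration of the fingerprint idea, the sketch below uses 1-D signals standing in for images: a model fingerprint is estimated as the average high-pass residual of that model's outputs, and content is matched to fingerprints by correlation. The residual and correlation functions are simplified stand-ins for learned noiseprint extractors, not the project's actual method.

```python
def residual(signal):
    """High-pass residual: each sample minus the mean of its two neighbours."""
    return [signal[i] - (signal[i - 1] + signal[i + 1]) / 2
            for i in range(1, len(signal) - 1)]

def fingerprint(samples):
    """Estimate a model fingerprint by averaging residuals of its outputs."""
    residuals = [residual(s) for s in samples]
    n = len(residuals)
    return [sum(col) / n for col in zip(*residuals)]

def correlation(a, b):
    """Normalised correlation between a test residual and a fingerprint."""
    num = sum(x * y for x, y in zip(a, b))
    den = (sum(x * x for x in a) * sum(y * y for y in b)) ** 0.5
    return num / den if den else 0.0
```

Given fingerprints for several candidate generators, a test item would be attributed to the one with the highest correlation.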

Task 1.2 – Deepfake Attribution (OC)

Partners involved in the execution of the task, with an estimate of the resources involved (number of person-months, including those of the RTDs to be recruited): OPEN CALL
TASK DESCRIPTION: Task 1.2 aims to identify the specific network architecture or model used to generate fake content. This can be approached as either a classification or a verification problem. In the classification scenario, the classifier must determine which among a predefined set of networks created the fake content. In open-set conditions, the classifier must include a rejection option to be activated when the content was generated by an unknown network. In the verification scenario, the verifier is given a fake content and a suspect network and must decide whether the content was generated by the suspect network. The task will consider both closed-set and open-set verification scenarios. We expect that Task 1.2 will leverage the results provided by Task 1.1 regarding the identification of deepfake fingerprints.
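The open-set rejection option can be sketched as a simple threshold on the best similarity score. The candidate model names and score values below are hypothetical; a real system would derive the similarities from learned fingerprints.

```python
def attribute(similarities, threshold=0.5):
    """Return the best-matching model, or None (reject) in open-set mode.

    similarities: dict mapping candidate model name -> similarity score.
    """
    best_model = max(similarities, key=similarities.get)
    if similarities[best_model] < threshold:
        return None  # unknown generator: reject rather than misattribute
    return best_model
```

The same scheme covers the verification scenario as a special case: a single suspect model whose score is compared against the threshold.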

Deliverables & Milestones of the WP

Each task will provide three deliverables, one at the end of each year. The deliverables will consist of a technical report describing the findings obtained during the previous year, and a software implementation (made available on GitHub) of the best-performing techniques.
Milestones are placed at the end of every year, when the results obtained in the previous 12 months will be reviewed and the work to be carried out in the subsequent year will be revised in light of the results obtained so far.
Note: for the tasks to be covered by open calls, the first deliverable will cover a shorter reporting period due to the time necessary to issue the call and select the partner among the applicants.





WP2 – Passive Deepfake Authentication Methods

Leader: Gian Luca Marcialis (marcialis@unica.it)

In this project, “deepfake detection” is referred to as “passive deepfake authentication”, so as to include all three related topics: deepfake fingerprinting, media authentication, and marking. The focus of this work package is on verifying the authenticity of content and preventing malicious deep learning-based processing. This is important because fake content poses a risk to security applications and fuels the spread of fake news. However, media compression, size, and resolution present challenges to the robustness of authentication methods. The work package aims to explore the problem from different angles, including biometric applications, multimodal detection, and advanced techniques exploiting context and semantic content.

Task 2.1 – Deepfake and Biometric Recognition (UNICA, ENI)

Partners involved in the execution of the task, with an estimate of the resources involved (number of person-months, including those of the RTDs to be recruited): UNICA (1 PA x 9 PMs, 1 PA x 9 PMs, 1 PA x 1 PM, 1 RTD x 36 PMs), ENI
TASK DESCRIPTION: The task aims to develop effective methods and models to authenticate media content for personal verification by extracting robust features. These features must be able to represent different deepfake reproduction methods and be resilient to compression, resolution variations, and occlusions in the media. Features will be designed in the spatial, frequency, and temporal domains and represented by a formal model. The methods will employ textural descriptors (e.g. LBP and BSIF), frequency and wavelet transforms (e.g. DCT and DWT), quality measurements (e.g. PSNR), and specific neural architectures. The novelty lies in employing explainable AI (XAI) methods in the design process to identify appropriate features and data for training the networks. Realistic use-case conditions will be considered for the available data. The final output is a proof of concept of the developed methods, together with an analysis of their pros and cons.
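As an example of the textural descriptors mentioned above, a minimal 8-neighbour Local Binary Patterns (LBP) histogram can be computed in plain Python on a 2-D list of grey levels. This is a bare illustration; practical pipelines would use optimised implementations (e.g. in scikit-image).

```python
# Offsets of the 8 neighbours, ordered clockwise from top-left.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_histogram(img):
    """256-bin histogram of LBP codes over all interior pixels of img."""
    hist = [0] * 256
    for y in range(1, len(img) - 1):
        for x in range(1, len(img[0]) - 1):
            code = 0
            for bit, (dy, dx) in enumerate(NEIGHBOURS):
                # Set the bit when the neighbour is at least as bright
                if img[y + dy][x + dx] >= img[y][x]:
                    code |= 1 << bit
            hist[code] += 1
    return hist
```

The histogram then serves as a feature vector for the authentication classifier.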

Task 2.2 – Audio-Video Deepfake (OC)

Partners involved in the execution of the task, with an estimate of the resources involved (number of person-months, including those of the RTDs to be recruited): OPEN CALL
TASK DESCRIPTION: To improve deepfake detection in videos, analyzing both the audio and visual tracks can expose inconsistencies between them and reveal fakes. In addition to lip and eye motion, anomalies in the emotional state conveyed by the audio and visual tracks, reverberation effects incompatible with the framed scene, lack of synchronization between video and audio, and other semantic clues can be used to identify fakes. Task 2.2 aims to investigate the effectiveness of audio-visual deepfake analysis, addressing challenges such as the need for suitable datasets, efficient forensic tools, and proper fusion techniques that merge audio and visual clues at various processing levels: data-level, feature-level, score-level, and decision-level fusion.
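The last two fusion levels can be sketched in a few lines. The weights and the OR rule below are illustrative choices for the sketch, not values chosen by the project.

```python
def score_level_fusion(audio_score, video_score, w_audio=0.4, w_video=0.6):
    """Score-level fusion: weighted sum of per-modality fake probabilities."""
    return w_audio * audio_score + w_video * video_score

def decision_level_fusion(audio_is_fake, video_is_fake):
    """Decision-level fusion: flag content when either modality alarms (OR rule)."""
    return audio_is_fake or video_is_fake
```

Data-level and feature-level fusion would instead combine the raw signals or intermediate representations before any detector is applied.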

Task 2.3 – Advanced Methods for Deepfake Detection (OC)

Partners involved in the execution of the task, with an estimate of the resources involved (number of person-months, including those of the RTDs to be recruited): OPEN CALL
TASK DESCRIPTION: Current deepfake detection methods are vulnerable to low-level quality-impairment operations. One solution is to use high-level semantic features, such as inconsistencies in individual biometric traits or facial movements, to build reliable models. Semantic-based analysis can also be applied to non-facial content, exploiting inconsistencies in motion patterns or shadows. Another solution is to analyze content within its context, which provides valuable priors to the authentication tools and helps disambiguate uncertain results. Contextual information can be obtained through metadata analysis or by analyzing the wider document or web page where the content is used. These methods can also improve the interpretability of forensic analysis results.

Deliverables & Milestones of the WP

Each task will provide three deliverables, one at the end of each year. The deliverables will consist of a technical report describing the findings obtained during the previous year, and a software implementation (made available on GitHub) of the best-performing techniques.
Milestones are placed at the end of every year, when the results obtained in the previous 12 months will be reviewed and the work to be carried out in the subsequent year will be revised in light of the results obtained so far.
Note: for the tasks to be covered by open calls, the first deliverable will cover a shorter reporting period due to the time necessary to issue the call and select the partner among the applicants.






WP3 – Deepfake Detection Methods in Realistic Scenarios

Leader: Irene Amerini (amerini@diag.uniroma1.it)

The detection of deepfakes has garnered significant interest in recent years, resulting in several proposed solutions to address the growing issue of fake media content. However, most of these solutions only perform well in controlled settings, such as laboratory experiments, and fail to provide reliable results in real-world scenarios. This WP aims to go beyond the research conducted in earlier work packages and develop solutions that work effectively in highly realistic environments. The package comprises three tasks, which address specific challenges encountered in real-world applications, including dealing with limited data, open set conditions, interpretability requirements, working in social media settings, and adversarial settings where an informed adversary aims to defeat the detection tools.

Task 3.1 – Deepfake Detection of Images and Videos in the Wild (Sapienza)

Partners involved in the execution of the task, with an estimate of the resources involved (number of person-months, including those of the RTDs to be recruited): OPEN CALL
TASK DESCRIPTION: Deep Learning (DL) models show promise in detecting and attributing deepfake media, but their use in real-life scenarios is impeded by several shortcomings. DL models require large amounts of labeled data for training, which can be difficult to obtain in practical applications. They also struggle with unforeseen situations and risk overfitting to the training data. Additionally, the black-box nature of DL techniques can hinder the interpretation of analysis results, which is often necessary for accountability. FF4ALL addresses these issues with one-class classifiers trained on pristine images, classifiers with rejection options, and methods for improving generalization.
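A one-class classifier trained only on pristine content can be sketched as follows: fit the pristine feature distribution, then flag anything too far from it. The feature vectors and the z-score threshold are hypothetical simplifications of what an actual detector would use.

```python
from statistics import mean, pstdev

class OneClassDetector:
    """Toy one-class detector: per-feature z-score against pristine statistics."""

    def __init__(self, threshold=3.0):
        self.threshold = threshold  # z-score beyond which we flag "fake"

    def fit(self, pristine):
        """pristine: list of feature vectors from genuine content only."""
        cols = list(zip(*pristine))
        self.means = [mean(c) for c in cols]
        self.stds = [pstdev(c) or 1.0 for c in cols]  # avoid division by zero

    def is_fake(self, x):
        """Flag x when any feature deviates too strongly from pristine data."""
        z = max(abs(v - m) / s for v, m, s in zip(x, self.means, self.stds))
        return z > self.threshold
```

Because no fake examples are needed for training, such a detector is not tied to any particular generation technique.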

Task 3.2 – Deepfake and Social Media (OC)

Partners involved in the execution of the task, with an estimate of the resources involved (number of person-months, including those of the RTDs to be recruited): OPEN CALL
TASK DESCRIPTION: Forensic analysis of multimedia data on social media channels is complex, and a binary classification into “real” or “manipulated” may not adequately describe the object under investigation. The challenge is to develop more sophisticated authenticity indicators that capture its different aspects. Task 3.2 aims to adapt general forensic tools to these needs and test them on data gathered from popular social media channels. This requires the application of various tools and disciplines, working in synergy to advance the field.

Task 3.3 – Detection of Deepfake Images and Videos in Adversarial Setting (OC)

Partners involved in the execution of the task, with an estimate of the resources involved (number of person-months, including those of the RTDs to be recruited): OPEN CALL
TASK DESCRIPTION: Task 3.3 addresses adversarial attacks on deepfake authentication tools, which can easily be deceived when the adversary knows their inner details. The task aims to develop tools with enhanced security against intentional attacks, starting from the definition of security models that describe the framework in which the race between the forensic analyst and the attacker takes place, including the goals, constraints, and information available to both players. Solutions may include adversarial training, hybrid data-driven and model-based detectors, and multi-clue, multimodal analysis to harden detectors against deliberate attacks already during the training phase.
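A minimal sketch of adversarial training, assuming a linear detector over toy 1-D features and FGSM-style perturbations (illustrative only; real detectors are deep networks trained on images, and eps, lr, and the feature space are assumptions of this sketch):

```python
import math

def sigmoid(z):
    z = max(-60.0, min(60.0, z))  # clamp for numerical safety
    return 1.0 / (1.0 + math.exp(-z))

def adversarial_train(data, epochs=300, lr=0.5, eps=0.1):
    """Train a logistic detector on FGSM-perturbed samples.

    data: list of (feature_vector, label) pairs, label 1 = fake.
    """
    dim = len(data[0][0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        for x, y in data:
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            g0 = p - y  # loss gradient w.r.t. the pre-sigmoid score
            # FGSM step: move the input in the loss-increasing direction
            x_adv = [xi + eps * (1.0 if g0 * wi > 0 else -1.0 if g0 * wi < 0 else 0.0)
                     for xi, wi in zip(x, w)]
            # then update the detector on the perturbed sample
            g = sigmoid(sum(wi * xi for wi, xi in zip(w, x_adv)) + b) - y
            w = [wi - lr * g * xi for wi, xi in zip(w, x_adv)]
            b -= lr * g
    return w, b

def predict(w, b, x):
    """Decision rule of the trained detector: True means 'fake'."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) > 0.5
```

Training on worst-case perturbed inputs rather than clean ones is what gives the detector a margin against the informed attacker described above.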

Deliverables & Milestones of the WP

Each task will provide three deliverables, one at the end of each year. The deliverables will consist of a technical report describing the findings obtained during the previous year, and a software implementation (made available on GitHub) of the best-performing techniques.
Milestones are placed at the end of every year, when the results obtained in the previous 12 months will be reviewed and the work to be carried out in the subsequent year will be revised in light of the results obtained so far.
Note: for the tasks to be covered by open calls, the first deliverable will cover a shorter reporting period due to the time necessary to issue the call and select the partner among the applicants.






WP4 – Active Authentication

Leader: Roberto Caldelli (roberto.caldelli@unifi.it, roberto.caldelli@cnit.it)

Active media authentication techniques work in a preemptive way to ease subsequent analysis, while passive methods work after the forged content has been generated. Deepfake detection methods based on DNN watermarking and unique fingerprints inserted within the content are examples of active techniques. Blockchain can be used to trace the processing chain of images and videos. WP4 aims to study active authentication techniques as a more reliable alternative or complement to passive methods, where operating conditions permit.

Task 4.1 – Active Fingerprinting for Deepfake Detection and Attribution (CNIT)

Partners involved in the execution of the task, with an estimate of the resources involved (number of person-months, including those of the RTDs to be recruited): CNIT
TASK DESCRIPTION: Recently, watermarking has been proposed as a way to protect the intellectual property rights (IPR) of DNNs. FF4ALL envisions using DNN watermarking to link deepfakes to the model that generated them, achieved by requiring that a predefined watermark be present in all content generated by the network. This idea requires advances in watermark robustness, security against adversarial attacks, and imperceptibility. Task 4.1 aims to develop robust and secure DNN watermarking tools for deepfake detection/attribution and media authentication. This approach differs from current solutions based on multimedia forensics and requires the active help of the party that trained the network.
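The watermark check behind this idea can be sketched with a keyed spread-spectrum sequence added to the generated signal and detected by correlation. The 1-D signals, key strings, embedding strength, and detection threshold below are illustrative assumptions, not the project's actual scheme.

```python
import random

def watermark_sequence(key, length):
    """Deterministic pseudo-random ±1 sequence derived from the owner's key."""
    rng = random.Random(key)
    return [rng.choice((-1.0, 1.0)) for _ in range(length)]

def embed(signal, key, strength=0.5):
    """Add the keyed sequence to the signal at low amplitude."""
    wm = watermark_sequence(key, len(signal))
    return [s + strength * w for s, w in zip(signal, wm)]

def detect(signal, key, threshold=0.2):
    """Decide watermark presence by normalised correlation with the sequence."""
    wm = watermark_sequence(key, len(signal))
    corr = sum(s * w for s, w in zip(signal, wm)) / len(signal)
    return corr > threshold
```

A generator constrained to always emit watermarked content would thus let any verifier holding the key attribute its outputs.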

Task 4.2 – Authentication of Devices for the Acquisition and Processing of Content (OC)

Partners involved in the execution of the task, with an estimate of the resources involved (number of person-months, including those of the RTDs to be recruited): OPEN CALL
TASK DESCRIPTION: To prevent the spread of fake content, the authentication of IoT devices used for data acquisition and content processing is crucial. Traditional methods based on certificates issued by Certification Authorities (CAs) are obsolete and vulnerable to cyber-attacks, especially when IoT devices are installed in remote areas. New protocols are needed to provide additional security in the communication between IoT devices and edge and cloud systems. The project aims to develop and prototype decoupling systems and new protocols for communication between IoT devices, so as to create a reliable ecosystem for authenticating acquired and processed data.
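One possible CA-free building block is a pre-shared-key challenge-response exchange, sketched below with Python's standard hmac module. This is one of many candidate schemes, and key distribution itself is out of scope for the sketch.

```python
import hashlib
import hmac
import os

def new_challenge():
    """Edge node sends a fresh random nonce to the device."""
    return os.urandom(16)

def device_response(shared_key, challenge):
    """Device proves possession of the key without revealing it."""
    return hmac.new(shared_key, challenge, hashlib.sha256).digest()

def verify_device(shared_key, challenge, response):
    """Edge node recomputes the MAC and compares in constant time."""
    expected = hmac.new(shared_key, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)
```

The fresh nonce per session prevents replay of captured responses, a key concern for devices deployed in remote, physically exposed locations.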

Task 4.3 – Trusted Remote Media Processing on Cloud and Edge Computing Systems (OC)

Partners involved in the execution of the task, with an estimate of the resources involved (number of person-months, including those of the RTDs to be recruited): OPEN CALL
TASK DESCRIPTION: Task 4.3 aims to address challenges related to media processing authentication in cloud and edge computing systems. The objective is to design secure strategies for multimedia data transmission, processing, and storage with robust authentication and verification mechanisms to prevent the spread of fake digital media. Active and passive techniques will be integrated into a scalable cloud/edge framework with technologies such as Blockchain and IPFS (InterPlanetary File System). The project will also explore the use of AI and machine learning algorithms to improve accuracy and efficiency. Experiments and simulations will evaluate the practicality and effectiveness of the developed strategies in real-world scenarios.
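The provenance-tracing mechanism can be sketched as a hash-chained log of media processing steps; the in-memory list below is an illustrative stand-in for a blockchain/IPFS backend, and the entry fields are assumptions of the sketch.

```python
import hashlib
import json

def add_entry(chain, media_bytes, operation):
    """Append a processing step, chaining it to the previous entry's hash."""
    prev = chain[-1]["entry_hash"] if chain else "0" * 64
    entry = {
        "operation": operation,
        "media_hash": hashlib.sha256(media_bytes).hexdigest(),
        "prev_hash": prev,
    }
    payload = json.dumps(entry, sort_keys=True).encode()
    entry["entry_hash"] = hashlib.sha256(payload).hexdigest()
    chain.append(entry)
    return entry

def verify_chain(chain):
    """Recompute every hash; any tampered entry breaks the chain."""
    prev = "0" * 64
    for entry in chain:
        body = {k: entry[k] for k in ("operation", "media_hash", "prev_hash")}
        payload = json.dumps(body, sort_keys=True).encode()
        if entry["prev_hash"] != prev or \
           hashlib.sha256(payload).hexdigest() != entry["entry_hash"]:
            return False
        prev = entry["entry_hash"]
    return True
```

Anchoring the head of such a chain on a distributed ledger would make the whole processing history of a media item externally verifiable.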

Deliverables & Milestones of the WP

Each task will provide three deliverables, one at the end of each year. The deliverables will consist of a technical report describing the findings obtained during the previous year, and a software implementation (made available on GitHub) of the best-performing techniques.
Milestones are placed at the end of every year, when the results obtained in the previous 12 months will be reviewed and the work to be carried out in the subsequent year will be revised in light of the results obtained so far.
Note: for the tasks to be covered by open calls, the first deliverable will cover a shorter reporting period due to the time necessary to issue the call and select the partner among the applicants.