TrustML project
Project coordinator
Prof. Battista Biggio
PRA Lab – University of Cagliari
Duration
25 months
September 2022 – October 2024
Budget
Total budget: 52777.78 €
FDS funded budget: 52777.78 €
People involved
Prof. Giorgio Giacinto
Prof. Giorgio Fumera
Prof. Luca Didaci
Prof. Matteo Fraschini
Project context and challenges
Machine learning (ML) and artificial intelligence (AI) have achieved unprecedented success in many applications, including computer vision, speech recognition, and natural language processing.
Despite providing accurate predictions, such models suffer from several limitations that prevent their adoption in security-sensitive and safety-critical domains.
In this project, we aim to address three main challenges that hinder the development of trustworthy AI/ML models in these domains.
1. The challenge of adversarial robustness
2. The challenge of model explainability
3. The challenge of uncertainty estimation
The first is the challenge of adversarial robustness. AI/ML models have been shown to be vulnerable to adversarial examples, i.e., carefully optimized perturbations added to input images, text, or audio. Evaluating robustness in other domains, such as malware detection, remains an open challenge due to the lack of a proper formalization of the feasible, structural, and more complex input manipulations specific to those domains.
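As an illustration of how such perturbations can be crafted, the sketch below applies the fast gradient sign method to a differentiable classifier. It is a minimal example for intuition only, not the methodology developed in the project; the model, labels, and perturbation budget are assumed placeholders.

import torch
import torch.nn.functional as F

def fgsm_perturb(model, x, y, epsilon=0.03):
    # Craft an adversarial example with one fast-gradient-sign step:
    # move the input in the direction that increases the classification loss.
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    # Keep the perturbed input in the valid pixel range [0, 1].
    return (x_adv + epsilon * x_adv.grad.sign()).clamp(0, 1).detach()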
The second is the challenge of model explainability. AI/ML models output predictions that are hard to interpret; it is difficult to identify which features of an input sample contribute to assigning it to a given class, or which training samples support that decision. Explainability is a desirable property not only for building users’ trust in AI/ML, but also for understanding what models learn, enabling the detection and mitigation of potential dataset and model biases.
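A common way to probe which input features drive a prediction is a gradient-based saliency map. The sketch below only illustrates this general idea, under the assumption of a differentiable classifier returning one row of class scores per sample; it is not a method produced by the project.

import torch

def saliency_map(model, x, target_class):
    # Importance of each input feature, measured as the absolute gradient
    # of the target-class score with respect to the input.
    x = x.clone().detach().requires_grad_(True)
    score = model(x)[0, target_class]  # assumes a single-sample batch
    score.backward()
    return x.grad.abs().squeeze(0)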
The third is the challenge of uncertainty estimation. It arises from the fact that AI/ML models can output highly confident predictions even when the input data lie outside the support of the training distribution, as in the case of adversarial examples.
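One widely used heuristic for exposing such overconfidence is Monte Carlo dropout, which averages predictions over several stochastic forward passes. The sketch below is a minimal illustration, assuming a model that contains dropout layers; it is not the estimation technique studied in the project.

import torch

def mc_dropout_predict(model, x, n_samples=30):
    # Keep dropout active at inference time and average softmax outputs
    # over several stochastic passes; the spread across passes gives a
    # rough measure of predictive uncertainty.
    model.train()
    with torch.no_grad():
        probs = torch.stack([torch.softmax(model(x), dim=-1)
                             for _ in range(n_samples)])
    return probs.mean(dim=0), probs.std(dim=0)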
The project addressed these problems by developing appropriate scientific methodologies that significantly advanced the state of the art. The methodological advances concerned:
– the development of techniques for evaluating the robustness/safety of machine learning algorithms against inputs perturbed to compromise their decisions, known as adversarial examples;
– the development of techniques for improving the robustness of algorithms against such inputs;
– the development of techniques for improving the interpretability of the decisions provided by these algorithms.
A large collection of datasets was assembled for experimentation on the use cases, followed by the development and testing of prototype systems on applications of interest, including image recognition and the automatic detection of malware.
Project results and impact
The project produced significant tangible results, including (i) scientific publications in international journals and in the proceedings of conferences among the most important in the field of machine learning and artificial intelligence; (ii) the release of benchmark datasets and associated experimental prototypes, made available as open-source projects; (iii) intensive dissemination and communication activities targeting stakeholders and the scientific community.
The project also fostered numerous scientific collaborations at the national and international level, as demonstrated by the number of publications with co-authors external to the project team. This led to the consolidation and formalization of agreements for joint activities with companies and international institutions.