# Topic: Shapley

Collaborative learning, Cooperative computing, Cooperative learning, Cooperative networks, Distributed approaches, Distributed environments, Distributed information, Federated learning, Shapley

2026

  1. Campanile, L., de Biase, M. S., & Marulli, F. (2026). Design and evaluation of a privacy-preserving multi-level federated learning architecture for airport biometric check-in. Future Generation Computer Systems, 176, 108217. https://doi.org/10.1016/j.future.2025.108217
    Abstract
    The rapid adoption of automated airport check-in systems using facial recognition raises significant privacy concerns due to their reliance on centralized deep learning models that store and transmit biometric data from edge devices. While Federated Learning (FL) is a promising approach for privacy preservation, its effectiveness in biometric identification remains underexplored, particularly in real-world environments like airports. This study assesses the privacy implications of FL in facial recognition by comparing three architectures. The first is a centralized system, where biometric data is sent to a central server for model training and inference, posing significant privacy risks. The second is a one-level FL architecture, where biometric data remains on local devices and only model updates are shared with a central aggregator. The third is a two-level FL architecture, which introduces an additional aggregation layer among airlines to enhance model generalization while preserving privacy. To ensure a rigorous privacy-preservation evaluation, we integrate both quantitative and qualitative metrics. For the quantitative assessment, we leverage the Privacy Meter Tool, which enables simulations of Membership Inference Attacks and the application of Differential Privacy (DP) as a mitigation technique. For the qualitative evaluation, we conduct a Data Protection Impact Assessment to analyze potential privacy risks from a regulatory perspective. Additionally, we assess model accuracy, computational efficiency, and communication overhead to determine FL’s feasibility in large-scale airport environments. Our results show that while FL significantly reduces privacy risks, the two-level FL approach introduces new vulnerabilities, such as model poisoning risks and privacy-utility trade-offs, requiring further mitigation strategies like DP.
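A minimal sketch of the two-level aggregation idea described in the abstract above, assuming plain weighted averaging (FedAvg-style) at both levels; the function name, sample weights, and toy parameter vectors are illustrative and not taken from the paper.

```python
import numpy as np

def weighted_average(models, weights):
    """Weighted average of parameter vectors (a plain FedAvg step)."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    return sum(w * m for w, m in zip(weights, models))

# Level 1: each airline aggregates the updates of its own check-in devices
# (toy 2-parameter models; weights are the number of local samples).
airline_a = weighted_average([np.array([0.9, 1.1]), np.array([1.0, 1.0])], [200, 100])
airline_b = weighted_average([np.array([1.3, 0.7])], [150])

# Level 2: the central aggregator averages the airline-level models.
global_model = weighted_average([airline_a, airline_b], [300, 150])
print(global_model)
```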

2025

  1. Campanile, L., de Biase, M. S., & Marulli, F. (2025). Edge-Cloud Distributed Approaches to Text Authorship Analysis: A Feasibility Study [Book chapter]. Lecture Notes on Data Engineering and Communications Technologies, 250, 284–293. https://doi.org/10.1007/978-3-031-87778-0_28
    Abstract
    Automatic authorship analysis, often referred to as stylometry, is a captivating yet contentious field that employs computational techniques to determine the authorship of textual artefacts. In recent years, the importance of author profiling has grown significantly due to the proliferation of automatic text generation systems. These include both early-generation bots and the latest generative AI-based models, which have heightened concerns about misinformation and content authenticity. This study proposes a novel approach to evaluate the feasibility and effectiveness of contemporary distributed learning methods. The approach leverages the computational advantages of distributed systems while preserving the privacy of human contributors, enabling the collection and analysis of extensive datasets of “human-written” texts in contrast to those generated by bots. More specifically, the proposed method adopts a Federated Learning (FL) framework, integrating readability and stylometric metrics to deliver a privacy-preserving solution for Authorship Attribution (AA). The primary objective is to enhance the accuracy of AA processes, thus achieving a more robust “authorial fingerprint”. Experimental results reveal that while FL effectively protects privacy and mitigates data exposure risks, the combined use of readability and stylometric features significantly increases the accuracy of AA. This approach demonstrates promise for secure and scalable AA applications, particularly in privacy-sensitive contexts and real-time edge computing scenarios. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.
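The feature side of the authorship-attribution approach above can be illustrated with a few simple readability and stylometric measures; the feature set below is a hypothetical example, not the metrics used in the study.

```python
import re

def style_features(text: str) -> dict:
    """Toy stylometric/readability features computed locally on a text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    n_words = max(len(words), 1)
    return {
        "avg_sentence_len": n_words / max(len(sentences), 1),
        "avg_word_len": sum(len(w) for w in words) / n_words,
        "type_token_ratio": len({w.lower() for w in words}) / n_words,
        "comma_rate": text.count(",") / n_words,
    }

print(style_features("Short sentences, short words. A simple style, easy to read."))
```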
  2. Di Bonito, L. P., Campanile, L., Iacono, M., & Di Natale, F. (2025). An eXplainable Artificial Intelligence framework to predict marine scrubbers performances [Article]. Engineering Applications of Artificial Intelligence, 160. https://doi.org/10.1016/j.engappai.2025.111860
    Abstract
    This study presents an eXplainable Artificial Intelligence (XAI) framework to predict the performance of marine scrubbers used for sulfur dioxide (SO2) removal from marine diesel engine flue gases. Using an aggregated dataset from a roll-on/roll-off (Ro-Ro) cargo ship equipped with an open-loop scrubber, combined with satellite data, the study constructs and evaluates multiple artificial intelligence models, including ensemble models, which were benchmarked against each other using standard regression metrics such as the coefficient of determination (R²), mean absolute error (MAE), and mean squared error (MSE). Results achieve high accuracy (R² > 0.92) and offer insights for optimizing scrubber operations. Nevertheless, artificial intelligence models lack transparency. To overcome this problem, this research integrates post-hoc explainability techniques to elucidate the contributions of various features to model predictions, thereby enhancing interpretability and reliability. The integration of SHapley Additive exPlanations (SHAP) and Explain Like I’m 5 (ELI5) not only confirmed the consistency of feature importance rankings (e.g. seawater acidity level, SO2 inlet concentration, outlet temperature) but also aligned with the physical-chemical principles of SO2 absorption. Quantitative comparisons with theoretical expectations demonstrated the reliability of the XAI insights, enhancing both model transparency and interpretability. This can improve the current capability of designing scrubber units by defining more efficient and less expensive options for environmental regulation compliance. © 2025 The Authors
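Since SHAP attributions are grounded in Shapley values, a small Monte Carlo estimator makes the underlying idea concrete: average each feature's marginal contribution over random orderings. This is a didactic sketch, not the shap library's implementation or the paper's pipeline; `model`, `x`, and `background` are placeholder names.

```python
import numpy as np

def shapley_values(model, x, background, n_perm=200, rng=None):
    """Estimate per-feature Shapley values for one prediction by sampling permutations."""
    rng = rng or np.random.default_rng(0)
    d = len(x)
    phi = np.zeros(d)
    for _ in range(n_perm):
        order = rng.permutation(d)
        z = background.copy()          # start from a baseline sample
        prev = model(z)
        for j in order:                # switch features on one at a time
            z[j] = x[j]
            curr = model(z)
            phi[j] += curr - prev      # marginal contribution of feature j
            prev = curr
    return phi / n_perm

# Toy linear model: attributions should equal coefficient * (x - baseline).
coeffs = np.array([2.0, -1.0, 0.5])
model = lambda v: float(coeffs @ v)
x = np.array([1.0, 2.0, 3.0])
baseline = np.zeros(3)
print(shapley_values(model, x, baseline))   # approx. [2.0, -2.0, 1.5]
```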

2024

  1. Marulli, F., Campanile, L., Marrone, S., & Verde, L. (2024). Combining Federated and Ensemble Learning in Distributed and Cloud Environments: An Exploratory Study [Book chapter]. Lecture Notes on Data Engineering and Communications Technologies, 203, 297–306. https://doi.org/10.1007/978-3-031-57931-8_29
    Abstract
    Conventional modern Machine Learning (ML) applications involve training models in the cloud and then transferring them back to the edge, especially in an Internet of Things (IoT) enabled environment. However, privacy-related limitations on data transfer from the edge to the cloud raise challenges: among various solutions, Federated Learning (FL) could satisfy privacy-related concerns and accommodate the power and energy constraints of edge devices. This paper proposes a novel approach that combines FL and Ensemble Learning (EL) to address both security and privacy challenges. The presented methodology introduces an extra layer, the Federation Layer, to enhance security. It uses Bayesian Networks (BNs) to dynamically filter untrusted or insecure federation clients. This approach presents a solution for increasing the security and robustness of FL systems, while also considering privacy and performance aspects. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.
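The Federation Layer idea can be sketched as a screening step before aggregation. The paper infers client trustworthiness with Bayesian Networks; the snippet below substitutes a simple distance-from-median heuristic purely to show where such a filter sits in a FedAvg-style pipeline, so the scoring rule and threshold are assumptions.

```python
import numpy as np

def filtered_fedavg(updates, weights, max_dev=2.0):
    """Drop client updates far from the median, then weighted-average the rest."""
    updates = np.asarray(updates, dtype=float)
    weights = np.asarray(weights, dtype=float)
    median = np.median(updates, axis=0)
    dists = np.linalg.norm(updates - median, axis=1)
    cutoff = max_dev * (np.median(dists) + 1e-12)
    keep = dists <= cutoff                      # screen out suspicious clients
    w = weights[keep] / weights[keep].sum()
    return (w[:, None] * updates[keep]).sum(axis=0), keep

updates = [[1.0, 1.0], [1.1, 0.9], [8.0, -5.0]]   # third client looks poisoned
agg, kept = filtered_fedavg(updates, weights=[100, 120, 90])
print(agg, kept)
```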

2022

  1. Campanile, L., Marrone, S., Marulli, F., & Verde, L. (2022). Challenges and Trends in Federated Learning for Well-being and Healthcare [Conference paper]. Procedia Computer Science, 207, 1144–1153. https://doi.org/10.1016/j.procs.2022.09.170
    Abstract
    Currently, research in Artificial Intelligence, both in Machine Learning and Deep Learning, paves the way for promising innovations in several areas. In healthcare especially, where large amounts of quantitative and qualitative data are transferred to support studies, early diagnosis, and disease monitoring, potential security and privacy issues cannot be underestimated. Federated Learning is an approach in which privacy issues related to sensitive data management can be significantly reduced, since algorithms can be trained without exchanging data. The main idea behind this approach is that learning models can be trained in a distributed way, where multiple devices or servers with decentralized data samples contribute without having to exchange their local data. Recent studies provide evidence that prototypes trained with Federated Learning strategies achieve reliable performance, generating robust models without sharing data and, consequently, limiting the impact on security and privacy. This work proposes a literature overview of Federated Learning approaches and systems, focusing on their application to healthcare. The main challenges, implications, issues, and potentials of this approach in healthcare are outlined. © 2022 The Authors. Published by Elsevier B.V.
  2. Marulli, F., Verde, L., Marrone, S., & Campanile, L. (2022). A Federated Consensus-Based Model for Enhancing Fake News and Misleading Information Debunking [Conference paper]. Smart Innovation, Systems and Technologies, 309, 587–596. https://doi.org/10.1007/978-981-19-3444-5_50
    Abstract
    Misinformation and Fake News are hard to dislodge. According to experts on this phenomenon, fighting disinformation requires a less credulous public; current AI techniques can therefore support misleading-information debunking, given the human tendency to believe “facts” that confirm biases. Much effort has recently been spent by the research community on this plague: several AI-based approaches for automatic detection and classification of Fake News have been proposed; unfortunately, Fake News producers have refined their ability to elude automatic ML- and DL-based detection systems. Debunking false news therefore represents an effective weapon to counter users’ reliance on false information. In this work, we propose a preliminary study approaching the design of effective fake news debunking systems by harnessing two complementary federated approaches. Firstly, a federation of independent classification systems accomplishes the debunking process by applying a distributed consensus mechanism. Secondly, a federated learning task involving several cooperating nodes produces a unique merged model, including features of the single participants’ models trained on different and independent data fragments. This preliminary work aims to point out the feasibility and comparability of the proposed approaches, paving the way for an experimental campaign on real data and providing evidence for an effective and feasible model for detecting heterogeneous fake news. Debunking misleading information is mission-critical to increase news consumers’ awareness of facts. © 2022, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
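The first of the two approaches, a federation of independent classifiers reaching a verdict by distributed consensus, can be sketched as a weighted majority vote; the labels and weights below are illustrative and do not reproduce the paper's consensus protocol.

```python
from collections import defaultdict

def consensus(verdicts, weights=None):
    """Weighted majority vote over labels ('fake'/'real') from federation members."""
    weights = weights or [1.0] * len(verdicts)
    tally = defaultdict(float)
    for label, w in zip(verdicts, weights):
        tally[label] += w
    return max(tally, key=tally.get), dict(tally)

label, tally = consensus(["fake", "fake", "real"], weights=[0.9, 0.7, 0.6])
print(label, tally)   # 'fake', {'fake': 1.6, 'real': 0.6}
```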

2021

  1. Campanile, L., Iacono, M., Marulli, F., & Mastroianni, M. (2021). Designing a GDPR compliant blockchain-based IoV distributed information tracking system [Article]. Information Processing and Management, 58(3). https://doi.org/10.1016/j.ipm.2021.102511
    Abstract
    Blockchain technologies and distributed ledgers enable the design and implementation of trustable data logging systems that can be used by multiple parties to produce a non-repudiable database. The Internet of Vehicles may greatly benefit from such a possibility to track the chain of responsibility in case of accidents or damage due to bad or omitted maintenance, improving the safety of circulation and helping to ensure correct handling of related legal issues. However, there are privacy issues to consider, as tracked information potentially includes data about private persons (position, personal habits), commercially relevant information (state of a company’s fleet, freight movement and related planning, logistic strategies), or even more critical knowledge (e.g., vehicles belonging to police, public authorities, governments, or officers in sensitive positions). In the European Union, all this information is covered by the General Data Protection Regulation (GDPR). In this paper we propose a reference model for a system that manages relevant information, to show how blockchain can support GDPR-compliant solutions for the Internet of Vehicles, taking as a reference an integrated scenario based on Italy, and analyze a subset of its use cases to show its viability with respect to privacy issues. © 2021 Elsevier Ltd
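The non-repudiation property that the proposed tracking system builds on can be illustrated with a minimal hash-chained log; this toy chain is only a sketch of tamper evidence under simplified assumptions and omits the paper's reference model, consensus, access control, and GDPR-specific logic.

```python
import hashlib, json, time

def append_block(chain, payload):
    """Append a block whose hash covers the payload and the previous block's hash."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    body = {"ts": time.time(), "payload": payload, "prev": prev_hash}
    body["hash"] = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
    chain.append(body)
    return chain

def verify(chain):
    """Recompute every hash and check the prev-links; any edit breaks the chain."""
    for i, block in enumerate(chain):
        expected_prev = chain[i - 1]["hash"] if i else "0" * 64
        record = {k: block[k] for k in ("ts", "payload", "prev")}
        recomputed = hashlib.sha256(json.dumps(record, sort_keys=True).encode()).hexdigest()
        if block["prev"] != expected_prev or block["hash"] != recomputed:
            return False
    return True

chain = []
append_block(chain, {"vehicle": "ABC123", "event": "maintenance"})
append_block(chain, {"vehicle": "ABC123", "event": "inspection"})
print(verify(chain))   # True; editing any earlier block makes verify() return False
```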
  2. Marulli, F., Balzanella, A., Campanile, L., Iacono, M., & Mastroianni, M. (2021). Exploring a Federated Learning Approach to Enhance Authorship Attribution of Misleading Information from Heterogeneous Sources [Conference paper]. Proceedings of the International Joint Conference on Neural Networks, 2021-July. https://doi.org/10.1109/IJCNN52387.2021.9534377
    Abstract
    Authorship Attribution (AA) is currently applied in several applications, including fraud detection and anti-plagiarism checks; this task can leverage stylometry and Natural Language Processing techniques. In this work, we explored strategies to enhance the performance of an AA task for the automatic detection of false and misleading information (e.g., fake news). We set up a text classification model for AA based on stylometry, exploiting recurrent deep neural networks, and implemented two learning tasks trained on the same collection of fake and real news, comparing their performance: one based on a Federated Learning architecture, the other on a centralized architecture. The goal was to discriminate potentially fake information from true information when the fake news comes from heterogeneous sources with different styles. Preliminary experiments show that the distributed approach significantly improves recall with respect to the centralized model. As expected, precision was lower in the distributed model. This aspect, coupled with the statistical heterogeneity of the data, represents an open issue that will be further investigated in future work. © 2021 IEEE.
