Topic: Data mining
Published:
2026
- Campanile, L., de Biase, M. S., & Marulli, F. (2026). Design and evaluation of a privacy-preserving multi-level federated learning architecture for airport biometric check-in. Future Generation Computer Systems, 176, 108217. https://doi.org/10.1016/j.future.2025.108217
Abstract
The rapid adoption of automated airport check-in systems using facial recognition raises significant privacy concerns due to their reliance on centralized deep learning models that store and transmit biometric data from edge devices. While Federated Learning (FL) is a promising approach for privacy preservation, its effectiveness in biometric identification remains underexplored, particularly in real-world environments like airports. This study assesses the privacy implications of FL in facial recognition by comparing three architectures. The first is a centralized system, where biometric data is sent to a central server for model training and inference, posing significant privacy risks. The second is a one-level FL architecture, where biometric data remains on local devices, and only model updates are shared with a central aggregator. The third is a two-level FL architecture, introducing an additional aggregation layer among airlines to enhance model generalization while preserving privacy. To ensure a rigorous privacy preservation evaluation, we integrate both quantitative and qualitative metrics. For the quantitative assessment, we leverage the Privacy Meter Tool, which enables simulations of Membership Inference Attacks and the application of Differential Privacy (DP) as a mitigation technique. For the qualitative evaluation, we conduct a Data Protection Impact Assessment to analyze potential privacy risks from a regulatory perspective. Additionally, we assess model accuracy, computational efficiency, and communication overhead to determine FL’s feasibility in large-scale airport environments. Our results show that while FL significantly reduces privacy risks, the two-level FL approach introduces new vulnerabilities, such as model poisoning risks and privacy-utility trade-offs, requiring further mitigation strategies like DP.
- Napoli, F., Castaldo, M., Marrone, S., & Campanile, L. (2026). Comparing Emerging Technologies in Image Classification: From Quantum to Kolmogorov [Conference paper]. Lecture Notes in Computer Science, 15886 LNCS, 260–273. https://doi.org/10.1007/978-3-031-97576-9_17
Abstract
The rapid evolution of Artificial Intelligence has led to significant advancements in image classification, with novel approaches emerging beyond traditional deep learning paradigms. This paper presents a comparative analysis of three distinct methodologies for image classification: classical Convolutional Neural Networks (CNNs), Kolmogorov-Arnold Networks (KANs) and KAN-based CNNs, and Quantum Machine Learning using Quantum Convolutional Neural Networks. The study evaluates these models on the Labeled Faces in the Wild dataset, implementing the different classifiers with existing, well-assessed technologies. Given the fundamental differences in computational paradigms, performance assessment extends beyond traditional accuracy metrics to include computational efficiency, interpretability, and, for quantum models, gate depth and noise. As a summary of the results, the proposed Quantum Convolutional Neural Network (QCNN) model achieves an accuracy of 75% on the target image classification task, indicating promising performance within current quantum computational limits. All the experiments strongly suggest that Convolutional Kolmogorov-Arnold Networks (CKANs) exhibit increased accuracy as image resolution decreases and that QCNN performance changes meaningfully in relation to noise level, while CNNs still retain strong discriminative capabilities. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2026.
2025
- Campanile, L., de Biase, M. S., & Marulli, F. (2025). Edge-Cloud Distributed Approaches to Text Authorship Analysis: A Feasibility Study [Book chapter]. Lecture Notes on Data Engineering and Communications Technologies, 250, 284–293. https://doi.org/10.1007/978-3-031-87778-0_28
Abstract
Automatic authorship analysis, often referred to as stylometry, is a captivating yet contentious field that employs computational techniques to determine the authorship of textual artefacts. In recent years, the importance of author profiling has grown significantly due to the proliferation of automatic text generation systems. These include both early-generation bots and the latest generative AI-based models, which have heightened concerns about misinformation and content authenticity. This study proposes a novel approach to evaluate the feasibility and effectiveness of contemporary distributed learning methods. The approach leverages the computational advantages of distributed systems while preserving the privacy of human contributors, enabling the collection and analysis of extensive datasets of “human-written” texts in contrast to those generated by bots. More specifically, the proposed method adopts a Federated Learning (FL) framework, integrating readability and stylometric metrics to deliver a privacy-preserving solution for Authorship Attribution (AA). The primary objective is to enhance the accuracy of AA processes, thus achieving a more robust “authorial fingerprint”. Experimental results reveal that while FL effectively protects privacy and mitigates data exposure risks, the combined use of readability and stylometric features significantly increases the accuracy of AA. This approach demonstrates promise for secure and scalable AA applications, particularly in privacy-sensitive contexts and real-time edge computing scenarios. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.
- Napoli, F., Campanile, L., De Gregorio, G., & Marrone, S. (2025). Quantum Convolutional Neural Networks for Image Classification: Perspectives and Challenges [Conference paper]. International Conference on Internet of Things, Big Data and Security, IoTBDS - Proceedings, 509–516. https://doi.org/10.5220/0013521500003944
Abstract
Quantum Computing is becoming a central point of discussion in both academic and industrial communities. Quantum Machine Learning is one of the most promising subfields of this technology, in particular for image classification. In this paper, the model of Quantum Convolutional Neural Networks and some related implementations are explored in their potential for a non-trivial task of image classification. The paper presents some experiments and discusses the limitations and the strengths of these approaches when compared with classical Convolutional Neural Networks. Furthermore, an analysis of the impact of the noise level on the quality of the classification task has been performed. This paper reports a substantial equivalence of the performance of the model with respect to the level of noise. Copyright © 2025 by SCITEPRESS - Science and Technology Publications, Lda.
- Campanile, L., Zona, R., Perfetti, A., & Rosatelli, F. (2025). An AI-Driven Methodology for Patent Evaluation in the IoT Sector: Assessing Relevance and Future Impact [Conference paper]. International Conference on Internet of Things, Big Data and Security, IoTBDS - Proceedings, 501–508. https://doi.org/10.5220/0013519700003944
Abstract
The rapid expansion of the Internet of Things has led to a surge in patent filings, creating challenges in evaluating their relevance and potential impact. Traditional patent assessment methods, relying on manual review and keyword-based searches, are increasingly inadequate for analyzing the complexity of emerging IoT technologies. In this paper, we propose an AI-driven methodology for patent evaluation that leverages Large Language Models and machine learning techniques to assess patent relevance and estimate future impact. Our framework integrates advanced Natural Language Processing techniques with structured patent metadata to establish a systematic approach to patent analysis. The methodology consists of three key components: (1) feature extraction from patent text using LLM embeddings and conventional NLP methods, (2) relevance classification and clustering to identify emerging technological trends, and (3) an initial formulation of impact estimation based on semantic similarity and citation patterns. While this study focuses primarily on defining the methodology, we include a minimal validation on a sample dataset to illustrate its feasibility and potential. The proposed approach lays the groundwork for a scalable, automated patent evaluation system, with future research directions aimed at refining impact prediction models and expanding empirical validation. Copyright © 2025 by SCITEPRESS - Science and Technology Publications, Lda.
2024
- Barzegar, A., Campanile, L., Marrone, S., Marulli, F., Verde, L., & Mastroianni, M. (2024). Fuzzy-based Severity Evaluation in Privacy Problems: An Application to Healthcare [Conference paper]. Proceedings - 2024 19th European Dependable Computing Conference, EDCC 2024, 147–154. https://doi.org/10.1109/EDCC61798.2024.00037
Abstract
The growing diffusion of smart pervasive applications is starting to undermine personal privacy: from Internet of Things to Machine Learning, the opportunities for privacy loss are many. As for other concerns involving people and goods, such as financial, safety and security ones, researchers and practitioners have defined over time different risk assessment procedures to have repeatable and accurate ways of detecting, quantifying and managing the (possible) sources of privacy loss. This paper defines a methodology to deal with privacy risk assessment, overcoming the traditional dichotomy between qualitative (easy to apply) and quantitative (accurate) approaches. The present paper introduces an approach based on fuzzy logic, able to combine the benefits of both techniques. The feasibility of the proposed methodology is demonstrated using a healthcare case study. © 2024 IEEE.
- Marulli, F., Campanile, L., Marrone, S., & Verde, L. (2024). Combining Federated and Ensemble Learning in Distributed and Cloud Environments: An Exploratory Study [Book chapter]. Lecture Notes on Data Engineering and Communications Technologies, 203, 297–306. https://doi.org/10.1007/978-3-031-57931-8_29
Abstract
Conventional modern Machine Learning (ML) applications involve training models in the cloud and then transferring them back to the edge, especially in an Internet of Things (IoT) enabled environment. However, privacy-related limitations on data transfer from the edge to the cloud raise challenges: among various solutions, Federated Learning (FL) could satisfy privacy-related concerns and accommodate power and energy issues of edge devices. This paper proposes a novel approach that combines FL and Ensemble Learning (EL) to address both security and privacy challenges. The presented methodology introduces an extra layer, the Federation Layer, to enhance security. It uses Bayesian Networks (BNs) to dynamically filter untrusted/unsecure federation clients. This approach presents a solution for increasing the security and robustness of FL systems, considering also privacy and performance aspects. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.
- Verde, L., Marulli, F., De Fazio, R., Campanile, L., & Marrone, S. (2024). HEAR set: A ligHtwEight acoustic paRameters set to assess mental health from voice analysis [Article]. Computers in Biology and Medicine, 182. https://doi.org/10.1016/j.compbiomed.2024.109021
Abstract
Background: Voice analysis has significant potential in aiding healthcare professionals with detecting, diagnosing, and personalising treatment. It represents an objective and non-intrusive tool for supporting the detection and monitoring of specific pathologies. By calculating various acoustic features, voice analysis extracts valuable information to assess voice quality. The choice of these parameters is crucial for an accurate assessment. Method: In this paper, we propose a lightweight acoustic parameter set, named HEAR, able to evaluate voice quality to assess mental health. In detail, this consists of jitter, spectral centroid, Mel-frequency cepstral coefficients, and their derivatives. The choice of parameters for the proposed set was influenced by the explainable significance of each acoustic parameter in the voice production process. Results: The reliability of the proposed acoustic set to detect the early symptoms of mental disorders was evaluated in an experimental phase. Voices of subjects suffering from different mental pathologies, selected from available databases, were analysed. The performance obtained from the HEAR features was compared with that obtained by analysing features selected from toolkits widely used in the literature, as well as with that obtained using learned procedures. The best performance in terms of MAE and RMSE was achieved for the detection of depression (5.32 and 6.24 respectively). For the detection of psychogenic dysphonia and anxiety, the highest accuracy rates were about 75 % and 97 %, respectively. Conclusions: The comparative evaluation was carried out to assess the performance of the proposed approach, demonstrating a reliable capability to highlight affective physiological alterations of voice quality due to the considered mental disorders. © 2024 The Author(s)
- Campanile, L., Di Bonito, L. P., Natale, F. D., & Iacono, M. (2024). Ensemble Models for Predicting CO Concentrations: Application and Explainability in Environmental Monitoring in Campania, Italy [Conference paper]. Proceedings - European Council for Modelling and Simulation, ECMS, 38(1), 558–564. https://doi.org/10.7148/2024-0558
Abstract
Monitoring of non-linear phenomena, such as pollution dynamics, which are the result of several combined factors and the evolution of environmental conditions, greatly benefits from AI tools; a larger benefit derives from the application of explainable solutions, which are capable of providing elements to understand those dynamics for better informed decisions. In this paper we discuss a case with real data in which a posteriori explanations have been produced after the application of ensemble models. © ECMS Daniel Grzonka, Natalia Rylko, Grazyna Suchacka, Vladimir Mityushev (Editors) 2024.
2023
- Campanile, L., de Fazio, R., Di Giovanni, M., Marrone, S., Marulli, F., & Verde, L. (2023). Inferring Emotional Models from Human-Machine Speech Interactions [Conference paper]. Procedia Computer Science, 225, 1241–1250. https://doi.org/10.1016/j.procs.2023.10.112
Abstract
Human-Machine Interfaces (HMIs) are getting more and more important in a hyper-connected society. Traditional HMIs are built considering cognitive features while emotional ones are often neglected, sometimes leading to misuse of such interfaces. As part of a long-running research effort, oriented to the definition of an HMI engineering approach, this paper concretely proposes a method to build an emotion-aware explicit model of the user starting from the behaviour of the human with a virtual agent. The paper also proposes an instance of this model inference process in voice assistants in an automatic depression context, which can constitute the core phase to realize a Human Digital Twin of a patient. The case study generated a model composed of Fluid Stochastic Petri Net sub-models, achieved after the data analysis by a Support Vector Machine. © 2023 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (https://creativecommons.org/licenses/by-nc-nd/4.0)
- Campanile, L., Di Bonito, L. P., Gribaudo, M., & Iacono, M. (2023). A Domain Specific Language for the Design of Artificial Intelligence Applications for Process Engineering [Conference paper]. Lecture Notes of the Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering, LNICST, 482 LNICST, 133–146. https://doi.org/10.1007/978-3-031-31234-2_8
Abstract
Processes in chemical engineering are frequently enacted by one-of-a-kind devices that implement dynamic processes with feedback regulations designed according to experimental studies and empirical tuning of new devices after the experience obtained on similar setups. While application of artificial intelligence based solutions is largely advocated by researchers in several fields of chemical engineering to face the problems deriving from these practices, few actual cases exist in literature and in industrial plants that leverage currently available tools as much as other application fields suggest. One of the factors that is limiting the spread of AI-based solutions in the field is the lack of tools that support the evaluation of the needs of plants, be those existing or to-be settlements. In this paper we provide a Domain Specific Language based approach for the evaluation of the basic performance requirements for cloud-based setups capable of supporting chemical engineering plants, with a metaphor that attempts to bridge the two worlds. © 2023, ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering.
- Di Bonito, L. P., Campanile, L., Napolitano, E., Iacono, M., Portolano, A., & Di Natale, F. (2023). Analysis of a marine scrubber operation with a combined analytical/AI-based method [Article]. Chemical Engineering Research and Design, 195, 613–623. https://doi.org/10.1016/j.cherd.2023.06.006
Abstract
This paper describes the performances of a marine SO2 absorption scrubber installed onboard a large Ro-Ro cargo ship. The study is based on the reconstruction of an extensive dataset from one-year continuous monitoring of the scrubber’s performances and operating conditions. The dataset has been interpreted with a conventional analytical, physical-mathematical, model for absorbers’ rating and its combination with an Artificial Intelligence (AI) one. First, the analytical model has been used to provide a deterministic mathematical framework for the interpretation and the prediction of the scrubber’s performances in terms of absorbed SO2 molar flow and SO2 concentration at the scrubber exit. Then, data mining and AI techniques have been applied to develop an Artificial Neural Network able to predict the error between the actual SO2 concentration at the scrubber exit and the corresponding analytical model predictions. The final result is a combined model providing superior robustness and accuracy in the prediction of the scrubber performance while preserving a rationale for process design and operation. This interesting outcome suggests that the development of combined, or hybrid, Analytical/AI models can be a reliable and cost-effective way to improve chemical engineers’ ability to design and control marine scrubbers, as well as other chemical equipment. © 2023 Institution of Chemical Engineers
- Campanile, L., Di Bonito, L. P., Iacono, M., & Di Natale, F. (2023). Prediction of chemical plants operating performances: a machine learning approach [Conference paper]. Proceedings - European Council for Modelling and Simulation, ECMS, 2023-June, 575–581. https://www.scopus.com/inward/record.uri?eid=2-s2.0-85163436467&partnerID=40&md5=2e96d04affd9bb4a126b224d7cc8d75a
Abstract
Modern environmental regulations require rigorous optimization of operations in process engineering to reduce waste, pollution, and risks while maximizing efficiency. However, the nature of chemical plants, which include components with non-linear behavior, challenges the use of consolidated tuning and control techniques. Instead, ad-hoc, self-adapting, and time-variant controls, with a balanced tuning of parameters at both the subsystem and system level, may be necessary. The computing processes needed may require significant resources and high performance systems, if managed by means of traditional approaches and with exact solution methods. In this regard, domain experts suggest instead the use of integrated techniques based on Artificial Intelligence (AI), which include Explainable AI (XAI) and Trustworthy AI (TAI), which are unique in this industry and still in the early stages of development. To pave the way for a real-time, cost-effective solution for this problem, this paper proposes an AI-based approach to model the performance of a real chemical plant, i.e. a marine scrubber installed on a Ro-Ro ship. The study aims to investigate Machine Learning (ML) techniques which can be used to model such processes. Notably, this analysis is the first of its kind, to the best of the authors’ knowledge. Overall, the study highlights the potential of using ML-based techniques to optimize environmental compliance in the shipping industry. © ECMS Enrico Vicario, Romeo Bandinelli, Virginia Fani, Michele Mastroianni (Editors) 2023.
2022
- Campanile, L., Forgione, F., Mastroianni, M., Palmiero, G., & Sanghez, C. (2022). Evaluating the Impact of Data Anonymization in a Machine Learning Application [Conference paper]. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 13380 LNCS, 389–400. https://doi.org/10.1007/978-3-031-10542-5_27
Abstract
The Data Protection Impact Assessment (DPIA) is used to verify the necessity, proportionality and risks of data processing. Our work is based on the data processed by the technical support of a Wireless Internet Service Provider (WISP). The WISP tech support team uses a machine learning system to predict failures. The goal of our experiments was to evaluate the DPIA with personal data and without personal data. In a first scenario, the experiments were conducted using a machine learning application powered by non-anonymous personal data. In the second scenario, instead, the data was anonymized before feeding the machine learning system. In this article we evaluate how much the DPIA changes when moving from a scenario with raw data to a scenario with anonymized data. © 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.
- Verde, L., Campanile, L., Marulli, F., & Marrone, S. (2022). Speech-based Evaluation of Emotions-Depression Correlation. Proceedings of the 2022 IEEE International Conference on Dependable, Autonomic and Secure Computing, International Conference on Pervasive Intelligence and Computing, International Conference on Cloud and Big Data Computing, International Conference on Cyber Science and Technology Congress, DASC/PiCom/CBDCom/CyberSciTech 2022. https://doi.org/10.1109/DASC/PiCom/CBDCom/Cy55231.2022.9927758
Abstract
Early detection of depression symptoms is fundamental to limit the onset of further associated behavioural disorders, such as psychomotor or social withdrawal. The combination of Artificial Intelligence and speech analysis revealed the existence of objectively measurable physical manifestations for early detection of depressive symptoms, constituting a valid support to evaluate these signals. To push forward the research state of the art, the aim of this paper is to understand quantitative correlations between emotional states and depression by proposing a study across different datasets containing speech of both depressed/non-depressed people and emotion-related samples. The relationship between affective measures and depression can, in fact, be a support to evaluate the presence of a depression state. This work constitutes a preliminary step of a study whose final aim is to pursue AI-powered personalized medicine by building sophisticated Clinical Decision Support Systems for depression, as well as other psychological disorders. © 2022 IEEE.
- Campanile, L., Marrone, S., Marulli, F., & Verde, L. (2022). Challenges and Trends in Federated Learning for Well-being and Healthcare [Conference paper]. Procedia Computer Science, 207, 1144–1153. https://doi.org/10.1016/j.procs.2022.09.170
Abstract
Currently, research in Artificial Intelligence, both in Machine Learning and Deep Learning, paves the way for promising innovations in several areas. In healthcare, especially, where large amounts of quantitative and qualitative data are transferred to support studies and early diagnosis and monitoring of any diseases, potential security and privacy issues cannot be underestimated. Federated learning is an approach where privacy issues related to sensitive data management can be significantly reduced, due to the possibility to train algorithms without exchanging data. The main idea behind this approach is that learning models can be trained in a distributed way, where multiple devices or servers with decentralized data samples can provide their contributions without having to exchange their local data. Recent studies provided evidence that prototypes trained by adopting Federated Learning strategies are able to achieve reliable performance, thus generating robust models without sharing data and, consequently, limiting the impact on security and privacy. This work proposes a literature overview of Federated Learning approaches and systems, focusing on their application to healthcare. The main challenges, implications, issues and potentials of this approach in healthcare are outlined. © 2022 The Authors. Published by Elsevier B.V.
2021
- Campanile, L., Forgione, F., Marulli, F., Palmiero, G., & Sanghez, C. (2021). Dataset Anonimyzation for Machine Learning: An ISP Case Study [Conference paper]. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 12950 LNCS, 589–597. https://doi.org/10.1007/978-3-030-86960-1_42
Abstract
Internet Service Providers technical support needs personal data to predict potential anomalies. In this paper, we performed a comparative study of forecasting performance using raw data and anonymized data, in order to assess how much performance may vary, when plain personal data are replaced by anonymized personal data. © 2021, Springer Nature Switzerland AG.
- Campanile, L., Cantiello, P., Iacono, M., Lotito, R., Marulli, F., & Mastroianni, M. (2021). Applying Machine Learning to Weather and Pollution Data Analysis for a Better Management of Local Areas: The Case of Napoli, Italy [Conference paper]. International Conference on Internet of Things, Big Data and Security, IoTBDS - Proceedings, 2021-April, 354–363. https://www.scopus.com/inward/record.uri?eid=2-s2.0-85135227609&partnerID=40&md5=5a7c117fa01d0ba8d779b0e092bc0f63
Abstract
Local pollution is a problem that affects urban areas and has effects on the quality of life and on health conditions. In order to not develop strict measures and to better manage territories, the national authorities have applied a vast range of predictive models. Actually, the application of machine learning has been studied in the last decades in various cases with various declination to simplify this problem. In this paper, we apply a regression-based analysis technique to a dataset containing official historical local pollution and weather data to look for criteria that allow forecasting critical conditions. The method was applied to the case study of Napoli, Italy, where the local environmental protection agency manages a set of fixed monitoring stations where both chemical and meteorological data are recorded. The joining of the two raw datasets was handled with a maximum inclusion strategy, performing the join in “outer” mode. Among the four different regression models applied, namely the Linear Regression Model calculated with Ordinary Least Square (LN-OLS), the Ridge regression Model (Ridge), the Lasso Model (Lasso) and Supervised Nearest Neighbors Regression (KNN), the Ridge regression model was found to perform best, with an R2 (Coefficient of Determination) value equal to 0.77 and low values for both MAE (Mean Absolute Error) and MSE (Mean Squared Error), equal to 0.12 and 0.04 respectively. © 2021 by SCITEPRESS - Science and Technology Publications, Lda.
- Marulli, F., Verde, L., & Campanile, L. (2021). Exploring data and model poisoning attacks to deep learning-based NLP systems [Conference paper]. Procedia Computer Science, 192, 3570–3579. https://doi.org/10.1016/j.procs.2021.09.130
Abstract
Natural Language Processing (NLP) has recently been explored also for its application in supporting the detection of malicious activities and objects. Furthermore, NLP and Deep Learning have become targets of malicious attacks too. Very recent research has shown that adversarial attacks are able to affect also NLP tasks, in addition to the more popular adversarial attacks on deep learning systems for image processing tasks. More precisely, while small perturbations applied to the data set adopted for training typical NLP tasks (e.g., Part-of-Speech Tagging, Named Entity Recognition, etc.) could be easily recognized, model poisoning, performed by means of altered data models, typically provided in the transfer learning phase to deep neural networks (e.g., poisoning attacks by word embeddings), is harder to detect. In this work, we preliminarily explore the effectiveness of a poisoned word embeddings attack aimed at a deep neural network trained to accomplish a Named Entity Recognition (NER) task. By adopting the NER case study, we aimed to analyze the severity of such a kind of attack on accuracy in recognizing the right classes for the given entities. Finally, this study represents a preliminary step to assess the impact and the vulnerabilities of some NLP systems we adopt in our research activities, and to further investigate some potential mitigation strategies, in order to make these systems more resilient to data and model poisoning attacks. © 2021 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (https://creativecommons.org/licenses/by-nc-nd/4.0) Peer-review under responsibility of the scientific committee of KES International.
- Campanile, L., Marulli, F., Mastroianni, M., Palmiero, G., & Sanghez, C. (2021). Machine Learning-aided Automatic Calibration of Smart Thermal Cameras for Health Monitoring Applications [Conference paper]. International Conference on Internet of Things, Big Data and Security, IoTBDS - Proceedings, 2021-April, 343–353. https://www.scopus.com/inward/record.uri?eid=2-s2.0-85137959400&partnerID=40&md5=eb78330cb4d585e500b77cd906edfbc7
Abstract
In this paper, we introduce a solution aiming to improve the accuracy of the surface temperature detection in an outdoor environment. The temperature sensing subsystem relies on Mobotix thermal camera without the black body, the automatic compensation subsystem relies on Raspberry Pi with Node-RED and TensorFlow 2.x. The final results showed that it is possible to automatically calibrate the camera using machine learning and that it is possible to use thermal imaging cameras even in critical conditions such as outdoors. Future development is to improve performance using computer vision techniques to rule out irrelevant measurements. © 2021 by SCITEPRESS - Science and Technology Publications, Lda.
- Marulli, F., Balzanella, A., Campanile, L., Iacono, M., & Mastroianni, M. (2021). Exploring a Federated Learning Approach to Enhance Authorship Attribution of Misleading Information from Heterogeneous Sources [Conference paper]. Proceedings of the International Joint Conference on Neural Networks, 2021-July. https://doi.org/10.1109/IJCNN52387.2021.9534377
Abstract
Authorship Attribution (AA) is currently applied in several applications, among which fraud detection and anti-plagiarism checks: this task can leverage stylometry and Natural Language Processing techniques. In this work, we explored some strategies to enhance the performance of an AA task for the automatic detection of false and misleading information (e.g., fake news). We set up a text classification model for AA based on stylometry exploiting recurrent deep neural networks and implemented two learning tasks trained on the same collection of fake and real news, comparing their performances: one is based on Federated Learning architecture, the other on a centralized architecture. The goal was to discriminate potential fake information from true ones when the fake news comes from heterogeneous sources, with different styles. Preliminary experiments show that a distributed approach significantly improves recall with respect to the centralized model. As expected, precision was lower in the distributed model. This aspect, coupled with the statistical heterogeneity of data, represents some open issues that will be further investigated in future work. © 2021 IEEE.
2020
- Mainenti, G., Campanile, L., Marulli, F., Ricciardi, C., & Valente, A. S. (2020). Machine learning approaches for diabetes classification: Perspectives to artificial intelligence methods updating [Conference paper]. IoTBDS 2020 - Proceedings of the 5th International Conference on Internet of Things, Big Data and Security, 533–540. https://www.scopus.com/inward/record.uri?eid=2-s2.0-85089519717&partnerID=40&md5=bf7cc36e86c1988dd85e04c2fce06de1
Abstract
In recent years, the application of Machine Learning (ML) and Artificial Intelligence (AI) techniques in healthcare has helped clinicians improve the management of chronic patients. Diabetes is among the most common chronic illnesses in the world, yet early detection and correct classification of an individual's type of diabetes are often still challenging. In fact, classification often depends on the circumstances present at the time of diagnosis, and many diabetic individuals do not easily fit into a single class. The aim of this paper is to apply ML techniques to classify the occurrence of different types of diabetes mellitus on the basis of clinical data obtained from diabetic patients during daily hospital activities. Copyright © 2020 by SCITEPRESS - Science and Technology Publications, Lda. All rights reserved.
2026
- DetailsCampanile, L., de Biase, M. S., & Marulli, F. (2026). Design and evaluation of a privacy-preserving multi-level federated learning architecture for airport biometric check-in. Future Generation Computer Systems, 176, 108217. https://doi.org/https://doi.org/10.1016/j.future.2025.108217
Abstract
The rapid adoption of automated airport check-in systems using facial recognition raises significant privacy concerns due to their reliance on centralized deep learning models that store and transmit biometric data from edge devices. While Federated Learning (FL) is a promising approach for privacy preservation, its effectiveness in biometric identification remains underexplored, particularly in real-world environments like airports. This study assesses the privacy implications of FL in facial recognition by comparing three architectures. A first centralized system, where biometric data is sent to a central server for model training and inference, posing significant privacy risks. The second is a one-level FL architecture, where biometric data remains on local devices, and only model updates are shared with a central aggregator. The third is a two-level FL architecture, introducing an additional aggregation layer among airlines to enhance model generalization while preserving privacy. To ensure a rigorous privacy preservation evaluation, we integrate both quantitative and qualitative metrics. For the quantitative assessment, we leverage the Privacy Meter Tool, which enables simulations of Membership Inference Attacks and the application of Differential Privacy as a mitigation technique. For the qualitative evaluation, we conduct a Data Protection Impact Assessment to analyze potential privacy risks from a regulatory perspective. Additionally, we assess model accuracy, computational efficiency, and communication overhead to determine FL’s feasibility in large-scale airport environments. Our results show that while FL significantly reduces privacy risks, the two-level FL approach introduces new vulnerabilities, such as model poisoning risks and privacy-utility trade-offs, requiring further mitigation strategies like DP. - DetailsNapoli, F., Castaldo, M., Marrone, S., & Campanile, L. (2026). 
Comparing Emerging Technologies in Image Classification: From Quantum to Kolmogorov [Conference paper]. Lecture Notes in Computer Science, 15886 LNCS, 260–273. https://doi.org/10.1007/978-3-031-97576-9_17
Abstract
The rapid evolution of Artificial Intelligence has led to significant advancements in image classification, with novel approaches emerging beyond traditional deep learning paradigms. This paper presents a comparative analysis of three distinct methodologies for image classification: classical Convolutional Neural Networks (CNNs), Kolmogorov-Arnold Networks (KANs) and KAN-based CNNs and Quantum Machine Learning using Quantum Convolutional Neural Networks. The study evaluates these models on the Labeled Faces in the Wild dataset, implementing the different classifiers with existing, well-assessed technologies. Given the fundamental differences in computational paradigms, performance assessment extends beyond traditional accuracy metrics to include computational efficiency, interpretability, and, for quantum models, gate depth and noise. As a summary of the results, the proposed Quantum Convolutional Neural Network (QCNN) model achieves an accuracy of 75% on the target images classification task, indicating promising performance within current quantum computational limits. All the experiments strongly suggest that Convolutional Kolmogorov-Arnold Networks (CKANs) exhibit increased accuracy as image resolution decreases, QCNN performance meaningfully changes in relation to noise level, while CNNs still keeping strong discriminative capabilities. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2026.
2025
- DetailsCampanile, L., de Biase, M. S., & Marulli, F. (2025). Edge-Cloud Distributed Approaches to Text Authorship Analysis: A Feasibility Study [Book chapter]. Lecture Notes on Data Engineering and Communications Technologies, 250, 284–293. https://doi.org/10.1007/978-3-031-87778-0_28
Abstract
Automatic authorship analysis, often referred to as stylometry, is a captivating yet contentious field that employs computational techniques to determine the authorship of textual artefacts. In recent years, the importance of author profiling has grown significantly due to the proliferation of automatic text generation systems. These include both early-generation bots and the latest generative AI-based models, which have heightened concerns about misinformation and content authenticity. This study proposes a novel approach to evaluate the feasibility and effectiveness of contemporary distributed learning methods. The approach leverages the computational advantages of distributed systems while preserving the privacy of human contributors, enabling the collection and analysis of extensive datasets of “human-written” texts in contrast to those generated by bots. More specifically, the proposed method adopts a Federated Learning (FL) framework, integrating readability and stylometric metrics to deliver a privacy-preserving solution for Authorship Attribution (AA). The primary objective is to enhance the accuracy of AA processes, thus achieving a more robust “authorial fingerprint”. Experimental results reveal that while FL effectively protects privacy and mitigates data exposure risks, the combined use of readability and stylometric features significantly increases the accuracy of AA. This approach demonstrates promise for secure and scalable AA applications, particularly in privacy-sensitive contexts and real-time edge computing scenarios. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025. - DetailsConference Quantum Convolutional Neural Networks for Image Classification: Perspectives and ChallengesNapoli, F., Campanile, L., De Gregorio, G., & Marrone, S. (2025). Quantum Convolutional Neural Networks for Image Classification: Perspectives and Challenges [Conference paper]. 
International Conference on Internet of Things, Big Data and Security, IoTBDS - Proceedings, 509–516. https://doi.org/10.5220/0013521500003944
Abstract
Quantum Computing is becoming a central point of discussion in both academic and industrial communities. Quantum Machine Learning is one of the most promising subfields of this technology, in particular for image classification. In this paper, the model of Quantum Convolutional Neural Networks and some related implementations are explored in their potential for a non-trivial task of image classification. The paper presents some experimentations and discusses the limitations and the strengths of these approaches when compared with classical Convolutional Neural Networks. Furthermore, an analysis of the impact of the noise level on the quality of the classification task has been performed. This paper reports a substantial equivalence of the perfomance of the model with respect the level of noise. Copyright © 2025 by SCITEPRESS - Science and Technology Publications, Lda. - DetailsCampanile, L., Zona, R., Perfetti, A., & Rosatelli, F. (2025). An AI-Driven Methodology for Patent Evaluation in the IoT Sector: Assessing Relevance and Future Impact [Conference paper]. International Conference on Internet of Things, Big Data and Security, IoTBDS - Proceedings, 501–508. https://doi.org/10.5220/0013519700003944
Abstract
The rapid expansion of the Internet of Things has led to a surge in patent filings, creating challenges in evaluating their relevance and potential impact. Traditional patent assessment methods, relying on manual review and keyword-based searches, are increasingly inadequate for analyzing the complexity of emerging IoT technologies. In this paper, we propose an AI-driven methodology for patent evaluation that leverages Large Language Models and machine learning techniques to assess patent relevance and estimate future impact. Our framework integrates advanced Natural Language Processing techniques with structured patent metadata to establish a systematic approach to patent analysis. The methodology consists of three key components: (1) feature extraction from patent text using LLM embeddings and conventional NLP methods, (2) relevance classification and clustering to identify emerging technological trends, and (3) an initial formulation of impact estimation based on semantic similarity and citation patterns. While this study focuses primarily on defining the methodology, we include a minimal validation on a sample dataset to illustrate its feasibility and potential. The proposed approach lays the groundwork for a scalable, automated patent evaluation system, with future research directions aimed at refining impact prediction models and expanding empirical validation. Copyright © 2025 by SCITEPRESS - Science and Technology Publications, Lda.
2024
- DetailsBarzegar, A., Campanile, L., Marrone, S., Marulli, F., Verde, L., & Mastroianni, M. (2024). Fuzzy-based Severity Evaluation in Privacy Problems: An Application to Healthcare [Conference paper]. Proceedings - 2024 19th European Dependable Computing Conference, EDCC 2024, 147–154. https://doi.org/10.1109/EDCC61798.2024.00037
Abstract
The growing diffusion of smart pervasive applications is starting to mine personal privacy: from Internet of Things to Machine Learning, the opportunities for privacy loss are many. As for other concerns involving people and goods as financial, safety and security, researchers and practitioners have defined in time different risk assessment procedures to have repeatable and accurate ways of detecting, quantifying and managing the (possible) source of privacy loss. This paper defines a methodology to deal with privacy risk assessment, overcoming the traditional dichotomy between qualitative (easy to apply) and quantitative (accurate) approaches. The present paper introduces an approach based on fuzzy logic, able to conjugate the benefits of both techniques. The feasibility of the proposed methodology is demonstrated using a healthcare case study. © 2024 IEEE. - DetailsBook Chapter Combining Federated and Ensemble Learning in Distributed and Cloud Environments: An Exploratory StudyMarulli, F., Campanile, L., Marrone, S., & Verde, L. (2024). Combining Federated and Ensemble Learning in Distributed and Cloud Environments: An Exploratory Study [Book chapter]. Lecture Notes on Data Engineering and Communications Technologies, 203, 297–306. https://doi.org/10.1007/978-3-031-57931-8_29
Abstract
Conventional modern Machine Learning (ML) applications involve training models in the cloud and then transferring them back to the edge, especially in an Internet of Things (IoT) enabled environment. However, privacy-related limitations on data transfer from the edge to the cloud raise challenges: among various solutions, Federated Learning (FL) could satisfy privacy related concerns and accommodate power and energy issues of edge devices. This paper proposes a novel approach that combines FL and Ensemble Learning (EL) to improve both security and privacy challenges. The presented methodology introduces an extra layer, the Federation Layer, to enhance security. It uses Bayesian Networks (BNs) to dynamically filter untrusted/unsecure federation clients. This approach presents a solution for increasing the security and robustness of FL systems, considering also privacy and performance aspects. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024. - DetailsVerde, L., Marulli, F., De Fazio, R., Campanile, L., & Marrone, S. (2024). HEAR set: A ligHtwEight acoustic paRameters set to assess mental health from voice analysis [Article]. Computers in Biology and Medicine, 182. https://doi.org/10.1016/j.compbiomed.2024.109021
Abstract
Background: Voice analysis has significant potential in aiding healthcare professionals with detecting, diagnosing, and personalising treatment. It represents an objective and non-intrusive tool for supporting the detection and monitoring of specific pathologies. By calculating various acoustic features, voice analysis extracts valuable information to assess voice quality. The choice of these parameters is crucial for an accurate assessment. Method: In this paper, we propose a lightweight acoustic parameter set, named HEAR, able to evaluate voice quality to assess mental health. In detail, this consists of jitter, spectral centroid, Mel-frequency cepstral coefficients, and their derivates. The choice of parameters for the proposed set was influenced by the explainable significance of each acoustic parameter in the voice production process. Results: The reliability of the proposed acoustic set to detect the early symptoms of mental disorders was evaluated in an experimental phase. Voices of subjects suffering from different mental pathologies, selected from available databases, were analysed. The performance obtained from the HEAR features was compared with that obtained by analysing features selected from toolkits widely used in the literature, as with those obtained using learned procedures. The best performance in terms of MAE and RMSE was achieved for the detection of depression (5.32 and 6.24 respectively). For the detection of psychogenic dysphonia and anxiety, the highest accuracy rates were about 75 % and 97 %, respectively. Conclusions: The comparative evaluation was carried out to assess the performance of the proposed approach, demonstrating a reliable capability to highlight affective physiological alterations of voice quality due to the considered mental disorders. © 2024 The Author(s) - DetailsCampanile, L., Di Bonito, L. P., Natale, F. D., & Iacono, M. (2024). 
Ensemble Models for Predicting CO Concentrations: Application and Explainability in Environmental Monitoring in Campania, Italy [Conference paper]. Proceedings - European Council for Modelling and Simulation, ECMS, 38(1), 558–564. https://doi.org/10.7148/2024-0558
Abstract
Monitoring of non-linear phenomena, such as pollution dynamics, which is the result of several combined factors and the evolution of environmental conditions, greatly benefits by AI tools; a larger benefit derives by the application of explainable solutions, which are capable of providing elements to understand those dynamics for better informed decisions. In this paper we discuss a case with real data in which a posteriori explanations have been produced after the application of ensemble models. © ECMS Daniel Grzonka, Natalia Rylko, Grazyna Suchacka, Vladimir Mityushev (Editors) 2024.
2023
- DetailsCampanile, L., de Fazio, R., Di Giovanni, M., Marrone, S., Marulli, F., & Verde, L. (2023). Inferring Emotional Models from Human-Machine Speech Interactions [Conference paper]. Procedia Computer Science, 225, 1241–1250. https://doi.org/10.1016/j.procs.2023.10.112
Abstract
Human-Machine Interfaces (HMIs) are getting more and more important in a hyper-connected society. Traditional HMIs are built considering cognitive features while emotional ones are often neglected, bringing sometimes such interfaces to misuse. As a part of a long run research, oriented to the definition of an HMI engineering approach, this paper concretely proposes a method to build an emotional-aware explicit model of the user starting from the behaviour of the human with a virtual agent. The paper also proposes an instance of this model inference process in voice assistants in an automatic depression context, which can constitute the core phase to realize a Human Digital Twin of a patient. The case study generated a model composed of Fluid Stochastic Petri Net sub-models, achieved after the data analysis by a Support Vector Machine. © 2023 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (https://creativecommons.org/licenses/by-nc-nd/4.0) - DetailsCampanile, L., Di Bonito, L. P., Gribaudo, M., & Iacono, M. (2023). A Domain Specific Language for the Design of Artificial Intelligence Applications for Process Engineering [Conference paper]. Lecture Notes of the Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering, LNICST, 482 LNICST, 133–146. https://doi.org/10.1007/978-3-031-31234-2_8
Abstract
Processes in chemical engineering are frequently enacted by one-of-a-kind devices that implement dynamic processes with feedback regulations designed according to experimental studies and empirical tuning of new devices after the experience obtained on similar setups. While application of artificial intelligence based solutions is largely advocated by researchers in several fields of chemical engineering to face the problems deriving from these practices, few actual cases exist in literature and in industrial plants that leverage currently available tools as much as other application fields suggest. One of the factors that is limiting the spread of AI-based solutions in the field is the lack of tools that support the evaluation of the needs of plants, be those existing or to-be settlements. In this paper we provide a Domain Specific Language based approach for the evaluation of the basic performance requirements for cloud-based setups capable of supporting chemical engineering plants, with a metaphor that attempts to bridge the two worlds. © 2023, ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering. - DetailsDi Bonito, L. P., Campanile, L., Napolitano, E., Iacono, M., Portolano, A., & Di Natale, F. (2023). Analysis of a marine scrubber operation with a combined analytical/AI-based method [Article]. Chemical Engineering Research and Design, 195, 613–623. https://doi.org/10.1016/j.cherd.2023.06.006
Abstract
This paper describes the performances of a marine SO2 absorption scrubber installed onboard a large Ro-Ro cargo ship. The study is based on the reconstruction of an extensive dataset from one-year continuous monitoring of the scrubber’s performances and operating conditions. The dataset has been interpreted with a conventional analytical, physical-mathematical, model for absorbers’ rating and its combination with an Artificial Intelligence (AI) one. First, the analytical model has been used to provide a deterministic mathematical framework for the interpretation and the prediction of the scrubber’s performances in terms of absorbed SO2 molar flow and SO2 concentration at the scrubber exit. Then, data mining and AI techniques have been applied to develop an Artificial Neural Network able to predict the error between the actual SO2 concentration at the scrubber exit and the corresponding analytical model predictions. The final result is a combined model providing superior robustness and accuracy in the prediction of the scrubber performance while preserving a rationale for process design and operation. This interesting outcome suggests that the development of combined, or hybrid, Analytical/AI models can be a reliable and cost-effective way to improve chemical engineers’ ability to design and control marine scrubbers, as well as other chemical equipment. © 2023 Institution of Chemical Engineers - DetailsCampanile, L., Di Bonito, L. P., Iacono, M., & Di Natale, F. (2023). Prediction of chemical plants operating performances: a machine learning approach [Conference paper]. Proceedings - European Council for Modelling and Simulation, ECMS, 2023-June, 575–581. https://www.scopus.com/inward/record.uri?eid=2-s2.0-85163436467&partnerID=40&md5=2e96d04affd9bb4a126b224d7cc8d75a
Abstract
Modern environmental regulations require rigorous optimization of operations in process engineering to reduce waste, pollution, and risks while maximizing efficiency. However, the nature of chemical plants, which include components with non-linear behavior, challenges the use of consolidated tuning and control techniques. Instead, ad-hoc, self-adapting, and time-variant controls, with a balanced tuning of parameters at both the subsystem and system level, may be necessary. Needed computing processes may require significant resources and high performance systems, if managed by means of traditional approaches and with exact solution methods. In this regard, domain experts suggest instead the use of integrated techniques based on Artificial Intelligence (AI), which include Explainable AI (XAI) and Trustworthy AI (TAI), which are unique in this industry and still in the early stages of development. To pave the way for a real-time, cost-effective solution for this problem, this paper proposes an AI-based approach to model the performance of a real chemical plant, i.e. a marine scrubber installed on a Ro-Ro ship. The study aims to investigate Machine Learning (ML) techniques which can be used to model such processes. Notably, this analysis is the first of its kind, at the best of the authors’ knowledge. Overall, the study highlights the potential of using ML-based techniques, to optimize environmental compliance in the shipping industry. © ECMS Enrico Vicario, Romeo Bandinelli, Virginia Fani, Michele Mastroianni (Editors) 2023.
2022
- DetailsCampanile, L., Forgione, F., Mastroianni, M., Palmiero, G., & Sanghez, C. (2022). Evaluating the Impact of Data Anonymization in a Machine Learning Application [Conference paper]. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 13380 LNCS, 389–400. https://doi.org/10.1007/978-3-031-10542-5_27
Abstract
The data protection impact assessment is used to verify the necessity, proportionality and risks of data processing. Our work is based on the data processed by the technical support of a Wireless Service Provider. The team of WISP tech support uses a machine learning system to predict failures. The goal of our the experiments was to evaluate the DPIA with personal data and without personal data. In fact, in a first scenario, the experiments were conducted using a machine learning application powered by non-anonymous personal data. Instead in the second scenario, the data was anonymized before feeding the machine learning system. In this article we evaluate how much the Data Protection Impact Assessment changes when moving from a scenario with raw data to a scenario with anonymized data. © 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG. - DetailsVerde, L., Campanile, L., Marulli, F., & Marrone, S. (2022). Speech-based Evaluation of Emotions-Depression Correlation. Proceedings of the 2022 IEEE International Conference on Dependable, Autonomic and Secure Computing, International Conference on Pervasive Intelligence and Computing, International Conference on Cloud and Big Data Computing, International Conference on Cyber Science and Technology Congress, DASC/PiCom/CBDCom/CyberSciTech 2022. https://doi.org/10.1109/DASC/PiCom/CBDCom/Cy55231.2022.9927758
Abstract
Early detection of depression symptoms is fundamental to limit the onset of further associated behavioural disorders, such as psychomotor or social withdrawal. The combination of Artificial Intelligence and speech analysis revealed the existence of objectively measurable physical manifestations for early detection of depressive symptoms, constituting a valid support to evaluate these signals. To push forward the research state-of-art, this aim of this paper is to understand quantitative correlations between emotional states and depression by proposing a study across different datasets containing speech of both depressed/non-depressed people and emotional-related samples. The relationship between affective measures and depression can, in fact, a support to evaluate the presence of depression state. This work constitutes a preliminary step of a study whose final aim is to pursue AI-powered personalized medicine by building sophisticated Clinical Decision Support Systems for depression, as well as other psychological disorders. © 2022 IEEE. - DetailsCampanile, L., Marrone, S., Marulli, F., & Verde, L. (2022). Challenges and Trends in Federated Learning for Well-being and Healthcare [Conference paper]. Procedia Computer Science, 207, 1144–1153. https://doi.org/10.1016/j.procs.2022.09.170
Abstract
Currently, research in Artificial Intelligence, both in Machine Learning and Deep Learning, paves the way for promising innovations in several areas. In healthcare, especially, where large amounts of quantitative and qualitative data are transferred to support studies and early diagnosis and monitoring of any diseases, potential security and privacy issues cannot be underestimated. Federated learning is an approach where privacy issues related to sensitive data management can be significantly reduced, due to the possibility to train algorithms without exchanging data. The main idea behind this approach is that learning models can be trained in a distributed way, where multiple devices or servers with decentralized data samples can provide their contributions without having to exchange their local data. Recent studies provided evidence that prototypes trained by adopting Federated Learning strategies are able to achieve reliable performance, thus by generating robust models without sharing data and, consequently, limiting the impact on security and privacy. This work propose a literature overview of Federated Learning approaches and systems, focusing on its application for healthcare. The main challenges, implications, issues and potentials of this approach in the healthcare are outlined. © 2022 The Authors. Published by Elsevier B.V.
2021
- DetailsCampanile, L., Forgione, F., Marulli, F., Palmiero, G., & Sanghez, C. (2021). Dataset Anonimyzation for Machine Learning: An ISP Case Study [Conference paper]. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 12950 LNCS, 589–597. https://doi.org/10.1007/978-3-030-86960-1_42
Abstract
Internet Service Providers technical support needs personal data to predict potential anomalies. In this paper, we performed a comparative study of forecasting performance using raw data and anonymized data, in order to assess how much performance may vary, when plain personal data are replaced by anonymized personal data. © 2021, Springer Nature Switzerland AG. - DetailsCampanile, L., Cantiello, P., Iacono, M., Lotito, R., Marulli, F., & Mastroianni, M. (2021). Applying Machine Learning to Weather and Pollution Data Analysis for a Better Management of Local Areas: The Case of Napoli, Italy [Conference paper]. International Conference on Internet of Things, Big Data and Security, IoTBDS - Proceedings, 2021-April, 354–363. https://www.scopus.com/inward/record.uri?eid=2-s2.0-85135227609&partnerID=40&md5=5a7c117fa01d0ba8d779b0e092bc0f63
Abstract
Local pollution is a problem that affects urban areas and has effects on the quality of life and on health conditions. In order to not develop strict measures and to better manage territories, the national authorities have applied a vast range of predictive models. Actually, the application of machine learning has been studied in the last decades in various cases with various declination to simplify this problem. In this paper, we apply a regression-based analysis technique to a dataset containing official historical local pollution and weather data to look for criteria that allow forecasting critical conditions. The methods was applied to the case study of Napoli, Italy, where the local environmental protection agency manages a set of fixed monitoring stations where both chemical and meteorological data are recorded. The joining of the two raw dataset was overcome by the use of a maximum inclusion strategy as performing the joining action with”outer” mode. Among the four different regression models applied, namely the Linear Regression Model calculated with Ordinary Least Square (LN-OLS), the Ridge regression Model (Ridge), the Lasso Model (Lasso) and Supervised Nearest Neighbors Regression (KNN), the Ridge regression model was found to better perform with an R2 (Coefficient of Determination) value equal to 0.77 and low value for both MAE (Mean Absolute Error) and MSE (Mean Squared Error), equal to 0.12 and 0.04 respectively. © 2021 by SCITEPRESS - Science and Technology Publications, Lda. - DetailsMarulli, F., Verde, L., & Campanile, L. (2021). Exploring data and model poisoning attacks to deep learning-based NLP systems [Conference paper]. Procedia Computer Science, 192, 3570–3579. https://doi.org/10.1016/j.procs.2021.09.130
Abstract
Natural Language Processing (NLP) is being recently explored also to its application in supporting malicious activities and objects detection. Furthermore, NLP and Deep Learning have become targets of malicious attacks too. Very recent researches evidenced that adversarial attacks are able to affect also NLP tasks, in addition to the more popular adversarial attacks on deep learning systems for image processing tasks. More precisely, while small perturbations applied to the data set adopted for training typical NLP tasks (e.g., Part-of-Speech Tagging, Named Entity Recognition, etc..) could be easily recognized, models poisoning, performed by the means of altered data models, typically provided in the transfer learning phase to a deep neural networks (e.g., poisoning attacks by word embeddings), are harder to be detected. In this work, we preliminary explore the effectiveness of a poisoned word embeddings attack aimed at a deep neural network trained to accomplish a Named Entity Recognition (NER) task. By adopting the NER case study, we aimed to analyze the severity of such a kind of attack to accuracy in recognizing the right classes for the given entities. Finally, this study represents a preliminary step to assess the impact and the vulnerabilities of some NLP systems we adopt in our research activities, and further investigating some potential mitigation strategies, in order to make these systems more resilient to data and models poisoning attacks. © 2021 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (https://creativecommons.org/licenses/by-nc-nd/4.0) Peer-review under responsibility of the scientific committee of KES International. - DetailsCampanile, L., Marulli, F., Mastroianni, M., Palmiero, G., & Sanghez, C. (2021). Machine Learning-aided Automatic Calibration of Smart Thermal Cameras for Health Monitoring Applications [Conference paper]. 
International Conference on Internet of Things, Big Data and Security, IoTBDS - Proceedings, 2021-April, 343–353. https://www.scopus.com/inward/record.uri?eid=2-s2.0-85137959400&partnerID=40&md5=eb78330cb4d585e500b77cd906edfbc7
Abstract
In this paper, we introduce a solution that aims to improve the accuracy of surface temperature detection in an outdoor environment. The temperature sensing subsystem relies on a Mobotix thermal camera without a black body; the automatic compensation subsystem relies on a Raspberry Pi with Node-RED and TensorFlow 2.x. The final results show that the camera can be calibrated automatically using machine learning and that thermal imaging cameras can be used even in critical conditions such as outdoors. Future development will aim to improve performance by using computer vision techniques to rule out irrelevant measurements. © 2021 by SCITEPRESS - Science and Technology Publications, Lda.
- Marulli, F., Balzanella, A., Campanile, L., Iacono, M., & Mastroianni, M. (2021). Exploring a Federated Learning Approach to Enhance Authorship Attribution of Misleading Information from Heterogeneous Sources [Conference paper]. Proceedings of the International Joint Conference on Neural Networks, 2021-July. https://doi.org/10.1109/IJCNN52387.2021.9534377
Abstract
Authorship Attribution (AA) is currently used in several applications, including fraud detection and anti-plagiarism checks; this task can leverage stylometry and Natural Language Processing techniques. In this work, we explored some strategies to enhance the performance of an AA task for the automatic detection of false and misleading information (e.g., fake news). We set up a stylometry-based text classification model for AA that exploits recurrent deep neural networks, and implemented two learning tasks trained on the same collection of fake and real news, comparing their performance: one is based on a Federated Learning architecture, the other on a centralized architecture. The goal was to discriminate potentially fake information from true information when the fake news comes from heterogeneous sources with different styles. Preliminary experiments show that a distributed approach significantly improves recall with respect to the centralized model. As expected, precision was lower in the distributed model. This aspect and the statistical heterogeneity of the data remain open issues that will be further investigated in future work. © 2021 IEEE.
2020
- Mainenti, G., Campanile, L., Marulli, F., Ricciardi, C., & Valente, A. S. (2020). Machine learning approaches for diabetes classification: Perspectives to artificial intelligence methods updating [Conference paper]. IoTBDS 2020 - Proceedings of the 5th International Conference on Internet of Things, Big Data and Security, 533–540. https://www.scopus.com/inward/record.uri?eid=2-s2.0-85089519717&partnerID=40&md5=bf7cc36e86c1988dd85e04c2fce06de1
Abstract
In recent years, the application of Machine Learning (ML) and Artificial Intelligence (AI) techniques in healthcare has helped clinicians improve the management of chronic patients. Diabetes is among the most common chronic illnesses in the world, yet early detection and correct classification of the type of diabetes in an individual are often still challenging. In fact, classification often depends on the circumstances present at the time of diagnosis, and many diabetic individuals do not easily fit into a single class. The aim of this paper is to apply ML techniques to classify the occurrence of different types of diabetes mellitus on the basis of clinical data obtained from diabetic patients during daily hospital activities. Copyright © 2020 by SCITEPRESS - Science and Technology Publications, Lda. All rights reserved.
