# Topic: Fake detection

Activity detection, Automatic calibration, Classification (of information), Cluster analysis, Fake detection, Feature extraction, Fraud detection, Information detection, Information extraction, Intrusion detection, Keyword-based search, Named entity recognition, Object detection, Signal processing, Spectral centroid, Traffic information

2025

  1. Campanile, L., Zona, R., Perfetti, A., & Rosatelli, F. (2025). An AI-Driven Methodology for Patent Evaluation in the IoT Sector: Assessing Relevance and Future Impact [Conference paper]. International Conference on Internet of Things, Big Data and Security, IoTBDS - Proceedings, 501–508. https://doi.org/10.5220/0013519700003944
    Abstract
    The rapid expansion of the Internet of Things has led to a surge in patent filings, creating challenges in evaluating their relevance and potential impact. Traditional patent assessment methods, relying on manual review and keyword-based searches, are increasingly inadequate for analyzing the complexity of emerging IoT technologies. In this paper, we propose an AI-driven methodology for patent evaluation that leverages Large Language Models and machine learning techniques to assess patent relevance and estimate future impact. Our framework integrates advanced Natural Language Processing techniques with structured patent metadata to establish a systematic approach to patent analysis. The methodology consists of three key components: (1) feature extraction from patent text using LLM embeddings and conventional NLP methods, (2) relevance classification and clustering to identify emerging technological trends, and (3) an initial formulation of impact estimation based on semantic similarity and citation patterns. While this study focuses primarily on defining the methodology, we include a minimal validation on a sample dataset to illustrate its feasibility and potential. The proposed approach lays the groundwork for a scalable, automated patent evaluation system, with future research directions aimed at refining impact prediction models and expanding empirical validation. Copyright © 2025 by SCITEPRESS - Science and Technology Publications, Lda.

2024

  1. Verde, L., Marulli, F., De Fazio, R., Campanile, L., & Marrone, S. (2024). HEAR set: A ligHtwEight acoustic paRameters set to assess mental health from voice analysis [Article]. Computers in Biology and Medicine, 182. https://doi.org/10.1016/j.compbiomed.2024.109021
    Abstract
    Background: Voice analysis has significant potential in aiding healthcare professionals with detecting, diagnosing, and personalising treatment. It represents an objective and non-intrusive tool for supporting the detection and monitoring of specific pathologies. By calculating various acoustic features, voice analysis extracts valuable information to assess voice quality. The choice of these parameters is crucial for an accurate assessment. Method: In this paper, we propose a lightweight acoustic parameter set, named HEAR, able to evaluate voice quality to assess mental health. In detail, this consists of jitter, spectral centroid, Mel-frequency cepstral coefficients, and their derivatives. The choice of parameters for the proposed set was influenced by the explainable significance of each acoustic parameter in the voice production process. Results: The reliability of the proposed acoustic set to detect the early symptoms of mental disorders was evaluated in an experimental phase. Voices of subjects suffering from different mental pathologies, selected from available databases, were analysed. The performance obtained from the HEAR features was compared with that obtained by analysing features selected from toolkits widely used in the literature, as well as with those obtained using learned procedures. The best performance in terms of MAE and RMSE was achieved for the detection of depression (5.32 and 6.24 respectively). For the detection of psychogenic dysphonia and anxiety, the highest accuracy rates were about 75 % and 97 %, respectively. Conclusions: The comparative evaluation was carried out to assess the performance of the proposed approach, demonstrating a reliable capability to highlight affective physiological alterations of voice quality due to the considered mental disorders. © 2024 The Author(s)

2022

  1. Campanile, L., de Biase, M. S., Marrone, S., Marulli, F., Raimondo, M., & Verde, L. (2022). Sensitive Information Detection Adopting Named Entity Recognition: A Proposed Methodology [Conference paper]. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 13380 LNCS, 377–388. https://doi.org/10.1007/978-3-031-10542-5_26
    Abstract
    Protecting and safeguarding privacy has become increasingly important, especially in recent years. The increasing possibilities of acquiring and sharing personal information and data through digital devices and platforms, such as apps or social networks, have increased the risks of privacy breaches. In order to effectively respect and guarantee the privacy and protection of sensitive information, it is necessary to develop mechanisms capable of providing such guarantees automatically and reliably. In this paper we propose a methodology able to automatically recognize sensitive data. A Named Entity Recognition was used to identify appropriate entities. An improvement in the recognition of these entities is achieved by evaluating the words contained in an appropriate context window by assessing their similarity to words in a domain taxonomy. This, in fact, makes it possible to refine the labels of the recognized categories using a generic Named Entity Recognition. A preliminary evaluation of the reliability of the proposed approach was performed. In detail, texts of juridical documents written in Italian were analyzed. © 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.
  2. Marulli, F., Verde, L., Marrone, S., & Campanile, L. (2022). A Federated Consensus-Based Model for Enhancing Fake News and Misleading Information Debunking [Conference paper]. Smart Innovation, Systems and Technologies, 309, 587–596. https://doi.org/10.1007/978-981-19-3444-5_50
    Abstract
    Misinformation and Fake News are hard to dislodge. According to experts on this phenomenon, to fight disinformation a less credulous public is needed; so, current AI techniques can support misleading information debunking, given the human tendency to believe “facts” that confirm biases. Much effort has been recently spent by the research community on this plague: several AI-based approaches for automatic detection and classification of Fake News have been proposed; unfortunately, Fake News producers have refined their ability in eluding automatic ML and DL-based detection systems. So, debunking false news represents an effective weapon to contrast the users’ reliance on false information. In this work, we propose a preliminary study aiming to approach the design of effective fake news debunking systems, harnessing two complementary federated approaches. We propose, firstly, a federation of independent classification systems to accomplish a debunking process, by applying a distributed consensus mechanism. Secondly, a federated learning task, involving several cooperating nodes, is accomplished, to obtain a unique merged model, including features of single participants' models, trained on different and independent data fragments. This study is a preliminary work aiming to point out the feasibility and the comparability of these proposed approaches, thus paving the way to an experimental campaign that will be performed on effective real data, thus providing evidence for an effective and feasible model for detecting potential heterogeneous fake news. Debunking misleading information is mission critical to increase the awareness of facts on the part of news consumers. © 2022, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

2021

  1. Campanile, L., Iacono, M., Levis, A. H., Marulli, F., & Mastroianni, M. (2021). Privacy regulations, smart roads, blockchain, and liability insurance: Putting technologies to work [Article]. IEEE Security and Privacy, 19(1), 34–43. https://doi.org/10.1109/MSEC.2020.3012059
    Abstract
    Smart streets promise widely available traffic information to help improve people’s safety. Unfortunately, gathering that data may threaten privacy. We describe an architecture that exploits a blockchain and the Internet of Vehicles and show its compliance with the General Data Protection Regulation. © 2003-2012 IEEE.
  2. Marulli, F., Verde, L., & Campanile, L. (2021). Exploring data and model poisoning attacks to deep learning-based NLP systems [Conference paper]. Procedia Computer Science, 192, 3570–3579. https://doi.org/10.1016/j.procs.2021.09.130
    Abstract
    Natural Language Processing (NLP) has recently also been explored for its application in supporting the detection of malicious activities and objects. Furthermore, NLP and Deep Learning have become targets of malicious attacks too. Very recent research has shown that adversarial attacks are able to affect NLP tasks as well, in addition to the more popular adversarial attacks on deep learning systems for image processing tasks. More precisely, while small perturbations applied to the data set adopted for training typical NLP tasks (e.g., Part-of-Speech Tagging, Named Entity Recognition, etc.) could be easily recognized, model poisoning, performed by means of altered data models, typically provided in the transfer learning phase to a deep neural network (e.g., poisoning attacks by word embeddings), is harder to detect. In this work, we preliminarily explore the effectiveness of a poisoned word embeddings attack aimed at a deep neural network trained to accomplish a Named Entity Recognition (NER) task. By adopting the NER case study, we aimed to analyze the severity of such a kind of attack on the accuracy in recognizing the right classes for the given entities. Finally, this study represents a preliminary step to assess the impact and the vulnerabilities of some NLP systems we adopt in our research activities, and to further investigate some potential mitigation strategies, in order to make these systems more resilient to data and model poisoning attacks. © 2021 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (https://creativecommons.org/licenses/by-nc-nd/4.0) Peer-review under responsibility of the scientific committee of KES International.
  3. Campanile, L., Marulli, F., Mastroianni, M., Palmiero, G., & Sanghez, C. (2021). Machine Learning-aided Automatic Calibration of Smart Thermal Cameras for Health Monitoring Applications [Conference paper]. International Conference on Internet of Things, Big Data and Security, IoTBDS - Proceedings, 2021-April, 343–353. https://www.scopus.com/inward/record.uri?eid=2-s2.0-85137959400&partnerID=40&md5=eb78330cb4d585e500b77cd906edfbc7
    Abstract
    In this paper, we introduce a solution aiming to improve the accuracy of surface temperature detection in an outdoor environment. The temperature sensing subsystem relies on a Mobotix thermal camera without the black body; the automatic compensation subsystem relies on a Raspberry Pi with Node-RED and TensorFlow 2.x. The final results showed that it is possible to automatically calibrate the camera using machine learning and that it is possible to use thermal imaging cameras even in critical conditions such as outdoors. A future development is to improve performance using computer vision techniques to rule out irrelevant measurements. © 2021 by SCITEPRESS - Science and Technology Publications, Lda.
  4. Marulli, F., Balzanella, A., Campanile, L., Iacono, M., & Mastroianni, M. (2021). Exploring a Federated Learning Approach to Enhance Authorship Attribution of Misleading Information from Heterogeneous Sources [Conference paper]. Proceedings of the International Joint Conference on Neural Networks, 2021-July. https://doi.org/10.1109/IJCNN52387.2021.9534377
    Abstract
    Authorship Attribution (AA) is currently applied in several applications, among which fraud detection and anti-plagiarism checks: this task can leverage stylometry and Natural Language Processing techniques. In this work, we explored some strategies to enhance the performance of an AA task for the automatic detection of false and misleading information (e.g., fake news). We set up a text classification model for AA based on stylometry exploiting recurrent deep neural networks and implemented two learning tasks trained on the same collection of fake and real news, comparing their performances: one is based on Federated Learning architecture, the other on a centralized architecture. The goal was to discriminate potential fake information from true ones when the fake news comes from heterogeneous sources, with different styles. Preliminary experiments show that a distributed approach significantly improves recall with respect to the centralized model. As expected, precision was lower in the distributed model. This aspect, coupled with the statistical heterogeneity of data, represents some open issues that will be further investigated in future work. © 2021 IEEE.

2020

  1. Campanile, L., Iacono, M., Martinelli, F., Marulli, F., Mastroianni, M., Mercaldo, F., & Santone, A. (2020). Towards the Use of Generative Adversarial Neural Networks to Attack Online Resources [Conference paper]. Advances in Intelligent Systems and Computing, 1150 AISC, 890–901. https://doi.org/10.1007/978-3-030-44038-1_81
    Abstract
    The role of remote resources, such as the ones provided by Cloud infrastructures, is of paramount importance for the implementation of cost-effective, yet reliable software systems to provide services to third parties. Cost effectiveness is a direct consequence of a correct estimation of resource usage, to be able to define a budget and estimate the right price to put one's own services on the market. Attacks that overload resources with non-legitimate requests, be they explicit attacks or just malicious, non-harmful resource engagements, may push the use of Cloud resources beyond estimation, causing additional costs, or unexpected energy usage, or a lower overall quality of services, so intrusion detection devices or firewalls are set to avoid undesired accesses. We propose the use of Generative Adversarial Neural Networks (GANs) to set up a method for shaping request-based attacks capable of reaching resources beyond defenses. The approach is studied by using a publicly available traffic data set, to test the concept and demonstrate its potential applications. © 2020, Springer Nature Switzerland AG.
