Exploring data and model poisoning attacks to deep learning-based NLP systems
Venue & metadata
- Journal/Proceedings: Procedia Computer Science
- Volume: 192
- Pages: 3570–3579
- Note: Cited by: 27; All Open Access, Gold Open Access
- Author keywords: Data poisoning attacks; Deep learning vulnerabilities; Natural language processing; Poisoned word embeddings; Reliable machine learning
Abstract
Natural Language Processing (NLP) has recently been explored for its application in supporting the detection of malicious activities and objects. At the same time, NLP and deep learning have themselves become targets of malicious attacks. Very recent research has shown that adversarial attacks can also affect NLP tasks, in addition to the more widely studied adversarial attacks on deep learning systems for image processing. More precisely, while small perturbations applied to the data sets used to train models for typical NLP tasks (e.g., Part-of-Speech Tagging, Named Entity Recognition) can be easily recognized, model poisoning, performed by means of altered data models typically provided to a deep neural network in the transfer learning phase (e.g., poisoning attacks via word embeddings), is harder to detect. In this work, we present a preliminary exploration of the effectiveness of a poisoned word embeddings attack aimed at a deep neural network trained to perform a Named Entity Recognition (NER) task. By adopting the NER case study, we aim to analyze the severity of this kind of attack on the accuracy in recognizing the right classes for the given entities. Finally, this study represents a preliminary step toward assessing the impact and the vulnerabilities of some NLP systems we adopt in our research activities, and toward further investigating potential mitigation strategies, in order to make these systems more resilient to data and model poisoning attacks. © 2021 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (https://creativecommons.org/licenses/by-nc-nd/4.0). Peer-review under responsibility of the scientific committee of KES International.
Keywords
Computational linguistics; Deep neural networks; Embeddings; Object detection; Speech recognition; Activity detection; Data poisoning attack; Deep learning vulnerability; ITS applications; Malicious activities; Named entity recognition; Poisoned word embedding; Poisoning attacks; Reliable machine learning; Natural language processing systems
Suggested citation
Marulli, F., Verde, L., & Campanile, L. (2021). Exploring data and model poisoning attacks to deep learning-based NLP systems [Conference paper]. Procedia Computer Science, 192, 3570–3579. https://doi.org/10.1016/j.procs.2021.09.130