Dataset Anonimyzation for Machine Learning: An ISP Case Study
Venue & metadata
- Journal/Proceedings: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
- Volume: 12950 LNCS
- Pages: 589–597
- Note: Cited by: 1
- Author keywords: Attribute suppression; Character masking; Cryptography; Customer Premise Equipment; Data generalization; Hash; ISP; Logistic regression; Pseudo-anonymization; Random forest; WISP
Abstract
Internet Service Providers' technical support needs personal data to predict potential anomalies. In this paper, we performed a comparative study of forecasting performance using raw data and anonymized data, in order to assess how much performance may vary when plain personal data are replaced by anonymized personal data. © 2021, Springer Nature Switzerland AG.
Keywords
Logistic regression; Machine learning; Anonymization; Attribute suppression; Character masking; Customer premises equipment; Data generalization; Hash; ISP; Pseudo-anonymization; Random forests; WISP; Decision trees
Suggested citation
Campanile, L., Forgione, F., Marulli, F., Palmiero, G., & Sanghez, C. (2021). Dataset Anonimyzation for Machine Learning: An ISP Case Study [Conference paper]. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 12950 LNCS, 589–597. https://doi.org/10.1007/978-3-030-86960-1_42