Dataset Anonimyzation for Machine Learning: An ISP Case Study
Venue & metadata
- Journal/Proceedings: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
- Volume: 12950 LNCS
- Pages: 589–597
- Note: Cited by: 1
- Author keywords: Attribute suppression; Character masking; Cryptography; Customer Premise Equipment; Data generalization; Hash; ISP; Logistic regression; Pseudo-anonymization; Random forest; WISP
Abstract
Internet Service Providers' technical support needs personal data to predict potential anomalies. In this paper, we performed a comparative study of forecasting performance using raw data and anonymized data, in order to assess how much performance may vary when plain personal data are replaced by anonymized personal data. © 2021, Springer Nature Switzerland AG.
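To make the techniques named in the author keywords concrete (attribute suppression, character masking, hashing for pseudo-anonymization, data generalization), here is a minimal Python sketch. It is not the authors' pipeline: the record fields (name, customer_id, mac, uptime_days), the salt, and all function names are hypothetical and chosen only for illustration.

```python
import hashlib


def suppress(record, fields):
    """Attribute suppression: return a copy without the listed fields."""
    return {k: v for k, v in record.items() if k not in fields}


def mask(value, visible=4, fill="*"):
    """Character masking: hide all but the last `visible` characters."""
    hidden = max(len(value) - visible, 0)
    return fill * hidden + value[hidden:]


def pseudonymize(value, salt):
    """Pseudo-anonymization: replace a value with a truncated salted SHA-256 digest."""
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return digest[:16]


def generalize(value, bucket=10):
    """Data generalization: replace a non-negative integer with its range."""
    low = (value // bucket) * bucket
    return f"{low}-{low + bucket - 1}"


def anonymize(record, salt):
    """Apply all four techniques to one (hypothetical) customer record."""
    out = suppress(record, {"name"})
    out["mac"] = mask(out["mac"])
    out["customer_id"] = pseudonymize(out["customer_id"], salt)
    out["uptime_days"] = generalize(out["uptime_days"])
    return out


# Hypothetical record; the field names are assumptions, not taken from the paper.
record = {
    "name": "Jane Doe",
    "customer_id": "C-1029",
    "mac": "AA:BB:CC:DD:EE:FF",
    "uptime_days": 37,
}
anon = anonymize(record, salt="example-salt")
print(anon)
```

A comparison like the one the abstract describes would then train the same classifier (the keywords mention logistic regression and random forests) once on the raw records and once on the transformed ones, and compare the resulting scores. Note that a salted hash is pseudonymization rather than full anonymization: anyone holding the salt can recompute the mapping for a known identifier.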
Keywords
Logistic regression; Machine learning; Anonymization; Attribute suppression; Character masking; Customer premises equipment; Data generalization; Hash; ISP; Pseudo-anonymization; Random forests; WISP; Decision trees
Suggested citation
Campanile, L., Forgione, F., Marulli, F., Palmiero, G., & Sanghez, C. (2021). Dataset Anonimyzation for Machine Learning: An ISP Case Study [Conference paper]. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 12950 LNCS, 589–597. https://doi.org/10.1007/978-3-030-86960-1_42