Dataset Anonimyzation for Machine Learning: An ISP Case Study

Conference paper · Lelio Campanile, Fabio Forgione, Fiammetta Marulli, Gianfranco Palmiero, Carlo Sanghez · 2021 · Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Venue & metadata

  • Journal/Proceedings: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
  • Volume: 12950 LNCS
  • Pages: 589–597
  • Cited by: 1
  • Author keywords: Attribute suppression; Character masking; Cryptography; Customer Premise Equipment; Data generalization; Hash; ISP; Logistic regression; Pseudo-anonymization; Random forest; WISP

Abstract

Internet Service Providers' technical support needs personal data to predict potential anomalies. In this paper, we perform a comparative study of forecasting performance on raw data and on anonymized data, in order to assess how much performance varies when plain personal data are replaced by anonymized personal data. © 2021, Springer Nature Switzerland AG.

Keywords

Logistic regression; Machine learning; Anonymization; Attribute suppression; Character masking; Customer premises equipment; Data generalization; Hash; ISP; Pseudo-anonymization; Random forests; WISP; Decision trees

Links & artifacts

  • DOI: https://doi.org/10.1007/978-3-030-86960-1_42
  • Publisher: Springer Nature Switzerland AG

Suggested citation

Campanile, L., Forgione, F., Marulli, F., Palmiero, G., & Sanghez, C. (2021). Dataset Anonimyzation for Machine Learning: An ISP Case Study [Conference paper]. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 12950 LNCS, 589–597. https://doi.org/10.1007/978-3-030-86960-1_42
