Abstract

Automatic authorship analysis, often referred to as stylometry, is a captivating yet contentious field that employs computational techniques to determine the authorship of textual artefacts. In recent years, the importance of author profiling has grown significantly due to the proliferation of automatic text generation systems. These include both early-generation bots and the latest generative AI-based models, which have heightened concerns about misinformation and content authenticity. This study proposes a novel approach to evaluate the feasibility and effectiveness of contemporary distributed learning methods. The approach leverages the computational advantages of distributed systems while preserving the privacy of human contributors, enabling the collection and analysis of extensive datasets of “human-written” texts in contrast to those generated by bots. More specifically, the proposed method adopts a Federated Learning (FL) framework, integrating readability and stylometric metrics to deliver a privacy-preserving solution for Authorship Attribution (AA). The primary objective is to enhance the accuracy of AA processes, thus achieving a more robust “authorial fingerprint”. Experimental results reveal that while FL effectively protects privacy and mitigates data exposure risks, the combined use of readability and stylometric features significantly increases the accuracy of AA. This approach demonstrates promise for secure and scalable AA applications, particularly in privacy-sensitive contexts and real-time edge computing scenarios.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.
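As a rough illustration of the idea summarized in the abstract, the sketch below combines a few stylometric features with one readability score and then aggregates per-client model weights by weighted averaging. Everything here is an assumption for illustration: the abstract does not say which features, readability metrics, or FL aggregation rule the authors use. The feature set (average sentence length, average word length, type-token ratio, Flesch Reading Ease with a naive syllable counter), the function names `extract_features` and `federated_average`, and the use of FedAvg-style weighted averaging are all hypothetical choices.

```python
import re


def count_syllables(word):
    # Naive heuristic (assumption): one syllable per group of consecutive vowels.
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def extract_features(text):
    """Return [avg sentence length, avg word length, type-token ratio, Flesch Reading Ease]."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    n_words = len(words)
    if n_words == 0:
        return [0.0, 0.0, 0.0, 0.0]
    n_sentences = max(1, len(sentences))
    syllables = sum(count_syllables(w) for w in words)

    # Stylometric features (illustrative choice, not taken from the paper).
    avg_sentence_len = n_words / n_sentences
    avg_word_len = sum(len(w) for w in words) / n_words
    type_token_ratio = len({w.lower() for w in words}) / n_words

    # Readability feature: Flesch Reading Ease.
    flesch = 206.835 - 1.015 * avg_sentence_len - 84.6 * (syllables / n_words)
    return [avg_sentence_len, avg_word_len, type_token_ratio, flesch]


def federated_average(client_weights, client_sizes):
    """Weighted average of client weight vectors (FedAvg-style aggregation).

    Only the weight vectors and sample counts reach the server; the raw
    texts stay on each client.
    """
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


# Usage: each client would extract features locally and train a local model;
# the server only averages the resulting weights.
features = extract_features("The cat sat. The dog ran.")
global_weights = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

In this sketch each edge device would call `extract_features` on its own texts and fit a local classifier on the result, so only model weights, never the texts themselves, are sent for aggregation.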
@article{Campanile2025284,
 author = {Campanile, Lelio and de Biase, Maria Stella and Marulli, Fiammetta},
 title = {Edge-Cloud Distributed Approaches to Text Authorship Analysis: A Feasibility Study},
 year = {2025},
 journal = {Lecture Notes on Data Engineering and Communications Technologies},
 volume = {250},
 pages = {284 – 293},
 doi = {10.1007/978-3-031-87778-0_28}
}

Type: Book chapter
URL: https://www.scopus.com/inward/record.uri?eid=2-s2.0-105003007301&doi=10.1007%2f978-3-031-87778-0_28&partnerID=40&md5=616665181072a2879591530e2820efc1
Author keywords: Authorship Attribution; Cloud-Edge Computing; Distributed Models; Federated Learning; Text Analysis
Keywords: Adversarial machine learning; Contrastive Learning; Differential privacy; Distributed cloud; Authorship analysis; Authorship attribution; Cloud-edge computing; Distributed approaches; Distributed models; Edge clouds; Edge computing; Feasibility studies; Text analysis; Text authorship; Federated learning
Publication stage: Final
Source: Scopus
Note: Cited by: 0