Abstract

Background: Voice analysis has significant potential in aiding healthcare professionals with detecting, diagnosing, and personalising treatment. It is an objective and non-intrusive tool for supporting the detection and monitoring of specific pathologies. By calculating various acoustic features, voice analysis extracts valuable information to assess voice quality. The choice of these parameters is crucial for an accurate assessment.

Method: In this paper, we propose a lightweight acoustic parameter set, named HEAR, able to evaluate voice quality to assess mental health. In detail, it consists of jitter, spectral centroid, Mel-frequency cepstral coefficients, and their derivatives. The choice of parameters for the proposed set was guided by the explainable significance of each acoustic parameter in the voice production process.

Results: The reliability of the proposed acoustic set in detecting the early symptoms of mental disorders was evaluated in an experimental phase. Voices of subjects suffering from different mental pathologies, selected from available databases, were analysed. The performance obtained with the HEAR features was compared with that obtained by analysing features selected from toolkits widely used in the literature, as well as with that obtained using learned procedures. The best performance in terms of MAE and RMSE was achieved for the detection of depression (5.32 and 6.24, respectively). For the detection of psychogenic dysphonia and anxiety, the highest accuracy rates were about 75 % and 97 %, respectively.

Conclusions: The comparative evaluation was carried out to assess the performance of the proposed approach, demonstrating a reliable capability to highlight affective physiological alterations of voice quality due to the considered mental disorders.

© 2024 The Author(s)
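The abstract names jitter and spectral centroid among the HEAR features and reports MAE and RMSE for depression detection, but it does not give the exact formulas used. The following is a minimal sketch, assuming the common textbook definitions: local jitter as the mean absolute difference between consecutive glottal periods divided by the mean period, and spectral centroid as the magnitude-weighted mean frequency. All function names and the toy inputs are hypothetical and are not taken from the paper.

```python
import math


def local_jitter(periods):
    """Local jitter: mean absolute difference between consecutive
    glottal periods, divided by the mean period (assumed definition)."""
    if len(periods) < 2:
        raise ValueError("need at least two periods")
    diffs = [abs(b - a) for a, b in zip(periods, periods[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_period = sum(periods) / len(periods)
    return mean_diff / mean_period


def spectral_centroid(freqs, mags):
    """Magnitude-weighted mean frequency of one spectral frame."""
    total = sum(mags)
    if total == 0:
        return 0.0  # silent frame: centroid is undefined, return 0 by convention
    return sum(f * m for f, m in zip(freqs, mags)) / total


def mae(y_true, y_pred):
    """Mean absolute error between reference and predicted scores."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error between reference and predicted scores."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


if __name__ == "__main__":
    # Toy values for illustration only; these are not data from the study.
    print(local_jitter([0.0100, 0.0102, 0.0099, 0.0101]))
    print(spectral_centroid([100.0, 200.0, 300.0], [1.0, 2.0, 1.0]))
    print(mae([10, 12, 20], [11, 12, 17]), rmse([10, 12, 20], [11, 12, 17]))
```

In practice the periods would come from a pitch tracker and the spectrum from a short-time Fourier transform of each frame; both steps are omitted here to keep the sketch free of external dependencies.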
Citation (BibTeX)

@article{Verde2024,
  author  = {Verde, Laura and Marulli, Fiammetta and De Fazio, Roberta and Campanile, Lelio and Marrone, Stefano},
  title   = {HEAR set: A ligHtwEight acoustic paRameters set to assess mental health from voice analysis},
  year    = {2024},
  journal = {Computers in Biology and Medicine},
  volume  = {182},
  doi     = {10.1016/j.compbiomed.2024.109021}
}

URL: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85202842139&doi=10.1016%2fj.compbiomed.2024.109021&partnerID=40&md5=55e3b2beb75c8c24a863ba15a34f4f47

Author keywords: Acoustic features set; HEAR set; Mental disorders; Signal processing; Voice analysis

Index keywords: Acoustics; Adult; Female; Humans; Male; Mental Disorders; Mental Health; Middle Aged; Speech Acoustics; Voice; Voice Quality; Acoustic variables measurement; mHealth; Acoustic feature set; Acoustic features; Acoustic parameters; Features sets; HEAR set; Performance; Signal-processing; Voice analysis; anxiety; Article; climate change; comparative study; controlled study; convolutional neural network; electric potential; emotional stress; feature extraction; health care personnel; human; major clinical study; mental disease; mood change; prosody; reliability; root mean squared error; signal processing; spectral centroid; vocal cord; voice change; diagnosis; pathophysiology; physiology; speech; Personalized medicine

Publication stage: Final
Source: Scopus
Note: Cited by: 1