HMN 2025: How an AI model analyzes speech to detect early neurological disorders with high accuracy

A research team led by Prof. Li Hai from the Hefei Institutes of Physical Science of the Chinese Academy of Sciences has developed a novel deep learning framework that significantly improves the accuracy and interpretability of detecting neurological disorders through speech. The findings were recently published in Neurocomputing.

“A slight change in the way we speak may be more than just a slip of the tongue; it could be a warning sign from the brain,” said Prof. Li Hai, who led the team. “Our new model can detect early symptoms of neurological diseases such as Parkinson’s, Huntington’s, and Wilson disease by analyzing voice recordings.”

Dysarthria is a common early symptom of various neurological disorders. Since speech abnormalities often reflect underlying neurodegenerative processes, voice signals have emerged as promising noninvasive biomarkers for the early screening and continuous monitoring of such conditions.

Automated speech analysis offers high efficiency, low cost, and non-invasiveness. However, current mainstream methods often suffer from over-reliance on handcrafted features, limited capacity to model temporal-variable interactions, and poor interpretability.

To address these challenges, the researchers proposed the Cross-Time and Cross-Axis Interactive Transformer (CTCAIT) for multivariate time series analysis. This framework first employs a large-scale audio model to extract high-dimensional temporal features from speech, representing them as multidimensional embeddings along the time and feature axes. It then uses the InceptionTime network to capture multi-scale and multi-level patterns within the time series.
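To make the idea of multi-scale pattern extraction concrete, here is a minimal NumPy sketch of an Inception-style block applied to a (time, feature) embedding. It is an illustration only, not the authors' implementation: the random embedding stands in for the audio model's output, and the kernel sizes, filter count, and function names (`conv1d_same`, `multi_scale_block`) are hypothetical choices made for this example.

```python
import numpy as np

def conv1d_same(x, kernels):
    """Apply a bank of 1D filters along the time axis with zero padding.

    x:       (T, D) array with T time steps and D feature channels
    kernels: (K, D, F) array with kernel width K and F output filters
    returns: (T, F) array, same length in time as the input
    """
    k = kernels.shape[0]
    pad_left = (k - 1) // 2
    pad_right = k - 1 - pad_left
    padded = np.pad(x, ((pad_left, pad_right), (0, 0)))
    # windows has shape (T, D, K): every length-K slice along the time axis
    windows = np.lib.stride_tricks.sliding_window_view(padded, k, axis=0)
    return np.einsum("tdk,kdf->tf", windows, kernels)

def multi_scale_block(x, kernel_sizes=(3, 9, 19), filters=4, seed=0):
    """Run parallel convolutions of different widths and concatenate them.

    Weights are random here; in a trained network they would be learned.
    """
    rng = np.random.default_rng(seed)
    branches = []
    for k in kernel_sizes:
        w = rng.standard_normal((k, x.shape[1], filters)) / np.sqrt(k * x.shape[1])
        branches.append(np.maximum(conv1d_same(x, w), 0.0))  # ReLU activation
    return np.concatenate(branches, axis=1)

# Stand-in for an audio-model embedding: 50 time steps, 16 feature channels
rng = np.random.default_rng(42)
embedding = rng.standard_normal((50, 16))
features = multi_scale_block(embedding)
print(features.shape)  # (50, 12)
```

Each branch looks at the same sequence through a window of a different length, so short kernels respond to brief events and long kernels to slower trends. Concatenating the branches gives later layers access to all scales at once.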

By integrating cross-time and cross-channel multi-head attention mechanisms, CTCAIT effectively captures pathological speech signatures embedded across different dimensions.
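The difference between attending across time and attending across channels can be sketched with plain NumPy. The example below is an assumption-laden illustration rather than the published CTCAIT architecture: projection weights are random instead of learned, the head counts are arbitrary, and the additive fusion at the end is simply one plausible way to combine the two views.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(tokens, num_heads, rng):
    """Scaled dot-product self-attention over N tokens of width E.

    tokens:  (N, E) array
    returns: output of shape (N, E) and attention weights of shape (H, N, N)
    """
    n, e = tokens.shape
    assert e % num_heads == 0, "token width must be divisible by num_heads"
    head_dim = e // num_heads
    # Random projections stand in for learned weight matrices (assumption)
    wq, wk, wv, wo = (rng.standard_normal((e, e)) / np.sqrt(e) for _ in range(4))

    def split(m):  # (N, E) -> (H, N, head_dim)
        return m.reshape(n, num_heads, head_dim).transpose(1, 0, 2)

    q, k, v = split(tokens @ wq), split(tokens @ wk), split(tokens @ wv)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(head_dim)  # (H, N, N)
    weights = softmax(scores, axis=-1)
    heads = weights @ v                                     # (H, N, head_dim)
    merged = heads.transpose(1, 0, 2).reshape(n, e)
    return merged @ wo, weights

rng = np.random.default_rng(0)
x = rng.standard_normal((50, 16))  # (time steps, feature channels)

# Cross-time: each time step is a token described by its 16 channel values
time_out, time_attn = multi_head_self_attention(x, num_heads=4, rng=rng)

# Cross-channel: each channel is a token described by its 50-step trajectory
chan_out, chan_attn = multi_head_self_attention(x.T, num_heads=5, rng=rng)

# Simple additive fusion of the two views (an assumption for illustration)
fused = time_out + chan_out.T
print(time_attn.shape, chan_attn.shape, fused.shape)
# (4, 50, 50) (5, 16, 16) (50, 16)
```

Transposing the input is the only change between the two calls. The cross-time weights describe which moments of the recording inform one another, while the cross-channel weights describe which feature dimensions do, and weights of this kind are one common starting point for interpretability analysis.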

The method achieved a detection accuracy of 92.06% on a Mandarin Chinese dataset and 87.73% on an external English dataset, demonstrating strong cross-linguistic generalizability.

Furthermore, the researchers conducted interpretability analyses of the model’s internal decision-making processes and systematically compared the effectiveness of different speech tasks, offering valuable insights for its potential clinical deployment.

These efforts provide important guidance for potential clinical applications of the method in the early diagnosis and monitoring of neurological disorders.

More information:
Zhenglin Zhang et al, Multivariate time series approach integrating cross-temporal and cross-channel attention for dysarthria detection from speech, Neurocomputing (2025). DOI: 10.1016/j.neucom.2025.130708

Provided by
Chinese Academy of Sciences

Citation:
AI model analyzes speech to detect early neurological disorders with high accuracy ( 7)
10 July 2025
ai-speech-early-neurological-disorders.html
