

18 July 2019

Vocalic Markers of Cognitive Load Derived from Automated Verbal Neuropsychological Assessment and Machine Learning in a Large Scale Remote Sample

The effort participants exert to complete a task gives researchers a richer understanding of their level of functioning. At AAIC 2019, we shared how this cognitive load can be derived from voice recordings.


High-functioning individuals may perform in the normal range on cognitive tests despite significant existing pathology. Previous research has identified changes in voice features in Alzheimer’s disease (AD) and mild cognitive impairment (MCI). Here, we aim to detect cognitive effort during a verbal cognitive task, using automated voice analysis and machine learning. The goal is to develop a novel voice biomarker for detecting high-functioning patients with existing AD pathology.



A total of 2,868 participants aged 17-86 years (M=34.5, SD=12.32) completed a web-based implementation of the backwards verbal digit span task, scored using automated speech recognition (ASR) on the Cambridge Cognition Neurovocalix platform.

Cognitive load was calculated for each response relative to the participant’s maximum span. A response was categorised as “high load” if its span length was more than 0.6 of that participant’s maximum span.
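The labelling rule can be sketched in a few lines of Python. This is a minimal illustration, not the platform's implementation: it assumes load is the presented span length divided by the participant's maximum span, and the function and parameter names are hypothetical.

```python
HIGH_LOAD_THRESHOLD = 0.6  # boundary between low and high load, as stated in the text


def cognitive_load(span_length: int, max_span: int) -> float:
    """Relative load: span length as a fraction of the participant's maximum span.

    Assumption: load is a simple ratio; the source only says it was
    calculated "with respect to" the maximum span.
    """
    if max_span <= 0:
        raise ValueError("max_span must be positive")
    return span_length / max_span


def is_high_load(span_length: int, max_span: int,
                 threshold: float = HIGH_LOAD_THRESHOLD) -> bool:
    """Label a response as high load if its relative load exceeds the threshold."""
    return cognitive_load(span_length, max_span) > threshold


# Illustrative participant with a maximum span of 5 digits.
labels = {length: is_high_load(length, 5) for length in range(2, 6)}
```

Under these assumptions, a participant with a maximum span of 5 would have spans of 4 and 5 labelled high load, while spans of 2 and 3 would be labelled low load, because the comparison is strict.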

Data were divided into training (60%), test (20%) and validation (20%) datasets. The training set was used for model building and the test set for hyper-parameter tuning. The validation set was held out for final model evaluation.
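A 60/20/20 partition of this kind could look like the sketch below. It is an assumption-laden illustration: the source does not say whether the split was made over responses or over participants, nor how it was randomised, and the helper name `split_60_20_20` is hypothetical.

```python
import random


def split_60_20_20(items, seed=0):
    """Shuffle items and split them into training, test and validation sets.

    Naming follows the text: "test" is used for hyper-parameter tuning and
    "validation" is held out for the final evaluation.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible

    n = len(shuffled)
    n_train = n * 60 // 100  # integer arithmetic avoids float rounding
    n_test = n * 20 // 100

    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    validation = shuffled[n_train + n_test:]  # remainder, roughly 20%
    return train, test, validation


# Illustrative IDs only; these are not study data.
train, test, validation = split_60_20_20(range(100))
```

If each participant contributes several responses, splitting by participant ID rather than by response would keep one person's recordings from appearing in more than one set.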

Five models (logistic regression, naive Bayes, support vector machine, random forest and gradient boosting) were trained to classify responses as high or low cognitive load based on acoustic features.



The best-performing models were random forest and gradient boosting. Results for the gradient boosting classifier on the held-out validation data are shown in Figure 1, with respect to cognitive load and span length. Overall accuracy was 0.93.

Figure 1. (A) The relationship between model probability prediction of cognitive load and the observed load in the held-out data. The boundary between high and low cognitive load is 0.6. The decision boundary for the model is 0.5. (B) The relationship between load, span length and model probability prediction.
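The two thresholds in the caption play different roles: 0.6 defines the observed label, while 0.5 converts the model's predicted probability into a predicted label. A minimal sketch of that second step and of the accuracy calculation follows. The function names are hypothetical, the numbers are made up for illustration, and treating a probability of exactly 0.5 as high load is an assumption.

```python
def predict_high_load(probabilities, decision_boundary=0.5):
    """Turn predicted probabilities of high load into boolean labels."""
    return [p >= decision_boundary for p in probabilities]


def accuracy(predicted, observed):
    """Fraction of responses whose predicted label matches the observed label."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("cannot compute accuracy on empty input")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return correct / len(observed)


# Made-up values for illustration; not taken from the study.
example_probabilities = [0.9, 0.2, 0.7, 0.4]
example_observed = [True, False, False, False]

example_predicted = predict_high_load(example_probabilities)
example_accuracy = accuracy(example_predicted, example_observed)
```

In this toy example three of the four predictions match the observed labels, so the accuracy is 0.75.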



These data suggest that automatically administered and scored verbal cognitive tests can be used to generate both reliable measures of performance and useful vocal features.

Future work will aim to replicate these findings in patients with neurodegenerative disease, and examine the potential of these digital biomarkers in increasing sensitivity to the presence of neurodegenerative pathology.


Tags : poster | cognitive load | machine learning


Francesca Cormack and Nick Taptiklis