

24 October 2018

Using Bayesian methods to model normative CANTAB cognitive performance across adulthood

At CTAD 2018 we shared our new data on why a Bayesian approach is best to describe normative cognitive performance, especially when making scientifically robust comparisons to neurologically impaired groups. 


Knowledge of how the general population typically performs on cognitive tests is essential for the accurate interpretation of atypical performance across disease cohorts and clinical trials.

This 'normative data' is typically described by group means across age, gender and education level.

However, these grouped means may lead to unreliable estimates when sample sizes are small within certain demographic group combinations.
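To make the small-sample problem concrete, the sketch below computes the standard error of a group mean for several cell sizes. The within-group standard deviation of 10 errors is a made-up value chosen for illustration, not a figure from this study.

```python
import math


def standard_error(sd: float, n: int) -> float:
    """Standard error of a sample mean: sd / sqrt(n)."""
    if n <= 0:
        raise ValueError("n must be positive")
    return sd / math.sqrt(n)


# Hypothetical within-group SD of an error score (not from the study).
ASSUMED_SD = 10.0

for n in (5, 20, 100, 400):
    se = standard_error(ASSUMED_SD, n)
    # Approximate 95% interval half-width for the group mean.
    print(f"n={n:>3}  SE={se:5.2f}  95% half-width={1.96 * se:5.2f}")
```

Under these assumed numbers, a demographic cell with only a handful of participants has a mean that is several times less precise than a cell with a few hundred.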

To address this issue, we have applied Bayesian techniques to (i) incorporate information on age distribution and (ii) capture cognitive task structure.

The key benefit of this approach is that it produces robust and reproducible estimates of cognitive performance percentiles, which is particularly important when recruiting patients into clinical trials and tracking their performance.



  1. To describe cognitive performance of a healthy population sample using the CANTAB cognitive assessment battery across age, gender and education groups. 
  2. To utilise a Bayesian approach that can capture appropriate cognitive task structures for a robust estimation of performance percentiles.



728 participants aged 18 – 85 years (M = 39 years, SD = 13 years) were recruited into the study using Prolific (https://www.prolific.ac/), an online platform for advertising web-based studies. All participants provided their age, sex and highest level of education (later collapsed into 'high' vs 'low' education).

The online cognitive assessments lasted approximately thirty minutes and consisted of three CANTAB tasks. This study focuses on participant performance on the CANTAB Paired Associates Learning (PAL) task, as indicated by the Total Errors Adjusted (PALTEA) score.

Figure 1. CANTAB PAL 

PALTEA describes the number of errors, with an adjustment for the number of attempts and the level of difficulty completed.

The participants’ CANTAB test performance was then modelled as a function of age using a Bayesian General Linear Model (GLM).
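The write-up does not state the likelihood, link function or priors that were used, so the following is only a minimal sketch of what a Bayesian regression of error counts on age could look like. It assumes a Poisson likelihood with a log link, hypothetical normal priors on the two coefficients, made-up data, and a simple grid approximation of the posterior in place of a dedicated sampler.

```python
import math

# Made-up (age, error count) pairs, for illustration only.
DATA = [(20, 3), (25, 2), (30, 4), (40, 5), (50, 6), (60, 9), (70, 11), (80, 15)]
MEAN_AGE = sum(age for age, _ in DATA) / len(DATA)


def log_normal_pdf(x, mu, sigma):
    """Log density of a Normal(mu, sigma) prior."""
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))


def log_posterior(b0, b1):
    """Unnormalised log posterior: assumed priors plus Poisson log-likelihood."""
    # Assumed priors (hypothetical): b0 ~ Normal(1.5, 1), b1 ~ Normal(0, 0.1).
    total = log_normal_pdf(b0, 1.5, 1.0) + log_normal_pdf(b1, 0.0, 0.1)
    for age, errors in DATA:
        log_rate = b0 + b1 * (age - MEAN_AGE)  # log link, age centred
        total += errors * log_rate - math.exp(log_rate) - math.lgamma(errors + 1)
    return total


def posterior_means(steps=121):
    """Posterior means of (b0, b1) by grid approximation."""
    b0_grid = [3.0 * i / (steps - 1) for i in range(steps)]         # 0 to 3
    b1_grid = [-0.1 + 0.2 * i / (steps - 1) for i in range(steps)]  # -0.1 to 0.1
    cells = [(b0, b1, log_posterior(b0, b1)) for b0 in b0_grid for b1 in b1_grid]
    peak = max(lp for _, _, lp in cells)  # subtract the peak for numerical stability
    weights = [(b0, b1, math.exp(lp - peak)) for b0, b1, lp in cells]
    norm = sum(w for _, _, w in weights)
    mean_b0 = sum(b0 * w for b0, _, w in weights) / norm
    mean_b1 = sum(b1 * w for _, b1, w in weights) / norm
    return mean_b0, mean_b1


def expected_errors(age, b0, b1):
    """Plug-in expected error count at a given age."""
    return math.exp(b0 + b1 * (age - MEAN_AGE))


b0_hat, b1_hat = posterior_means()
for age in (30, 50, 70):
    print(age, round(expected_errors(age, b0_hat, b1_hat), 1))
```

A grid is only practical with a couple of parameters; a model with gender and education terms would normally be fitted with a probabilistic programming tool instead.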



Exploratory data analyses were used to investigate trends of performance by age within gender and education categories.

The four key demographic groups (high-education males, high-education females, low-education males and low-education females) showed different profiles of performance with age.

These differences were captured across the performance percentiles, as demonstrated by the 5th, 10th, 50th, 75th and 95th percentiles in Figure 2.

Figure 2. On PALTEA, both males and females with high education made fewer errors than those with low education, with highly educated females making the fewest errors.

Furthermore, the rate of change with age for females is greater than for males, with highly educated females showing the fastest cognitive decline (Figure 2).



This study demonstrates the benefits of using reproducible and robust Bayesian methods to describe normative cognitive performance on CANTAB PAL.

The Bayesian GLM was crucial because it allowed prior information about the age distribution to be incorporated. At the same time, the parametrisation of the response distribution allowed the appropriate test structure to be taken into account. For example, certain cognitive tests rely heavily on error-count response variables; for these, it is important to fit a distribution suited to counts, such as a zero-inflated distribution.

Therefore, as neurologically impaired groups are likely to show a high error count on cognitive assessments, it is particularly important to take this Bayesian approach when comparing their performance with a normative sample.
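As a rough illustration of the zero-inflated idea, the sketch below defines a zero-inflated Poisson distribution and reads percentiles off it. The choice of a Poisson base distribution and the parameter values (30% structural zeros, a rate of 8 errors) are assumptions made for this example and are not taken from the study.

```python
import math


def zip_pmf(k: int, pi_zero: float, rate: float) -> float:
    """P(Y = k) for a zero-inflated Poisson.

    With probability pi_zero the score is a structural zero (an error-free
    run); otherwise it is drawn from Poisson(rate).
    """
    poisson = math.exp(-rate + k * math.log(rate) - math.lgamma(k + 1))
    if k == 0:
        return pi_zero + (1.0 - pi_zero) * poisson
    return (1.0 - pi_zero) * poisson


def zip_percentile(q: float, pi_zero: float, rate: float) -> int:
    """Smallest error count whose cumulative probability reaches q."""
    if not 0.0 < q < 1.0:
        raise ValueError("q must be strictly between 0 and 1")
    cumulative, k = 0.0, 0
    while True:
        cumulative += zip_pmf(k, pi_zero, rate)
        if cumulative >= q:
            return k
        k += 1


# Hypothetical parameters: 30% structural zeros, Poisson rate of 8 errors.
for q in (0.05, 0.10, 0.50, 0.75, 0.95):
    with_zeros = zip_percentile(q, 0.3, 8.0)
    plain = zip_percentile(q, 0.0, 8.0)  # pi_zero = 0 is an ordinary Poisson
    print(f"{q:.2f}  zero-inflated={with_zeros}  plain Poisson={plain}")
```

Ignoring the excess zeros shifts the lower percentiles in particular, which matters when an individual score is located against the normative distribution.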



Tags : pal | modelling | bayesian | normative data


Pasquale Dente MSc, Data Scientist & Elizabeth Baker PhD, Statistical Scientist