jpad journal

V. Bloniecki1,2, G. Hagman1,3, M. Ryden3, M. Kivipelto1,3,4,5,6


1. Department of Neurobiology, Caring Sciences and Society (NVS), Division of Clinical Geriatrics, Center for Alzheimer Research, Karolinska Institute, Stockholm, Sweden; 2. Dermato-Venereology Clinic, Karolinska University Hospital, Stockholm, Sweden; 3. Theme Aging, Karolinska University Hospital, Stockholm, Sweden;
4. Ageing and Epidemiology (AGE) Research Unit, School of Public Health, Imperial College London, London, UK; 5. Institute of Public Health and Clinical Nutrition, University of Eastern Finland, Kuopio, Finland; 6. Research and Development Unit, Stockholms Sjukhem, Stockholm, Sweden.

Corresponding Author: Victor Bloniecki, Karolinska Institute, Karolinska University Hospital, Eugeniavägen 3, SE-17176, Stockholm, Sweden. Tel.: +46 70-726 82 20; Email:

J Prev Alz Dis 2021; Published online January 18, 2021.



Background: Due to an ageing population and a rapid increase in cognitive impairment and dementia, combined with potential disease-modifying drugs and other interventions in the pipeline, there is a need to develop accurate, accessible and efficient cognitive screening instruments focused on early-stage detection of neurodegenerative disorders.
Objective: In this proof-of-concept report, we examine the validity of a newly developed digital cognitive test, the Geras Solutions Cognitive Test (GSCT), and compare its accuracy against the Montreal Cognitive Assessment (MoCA).
Methods: 106 patients referred to the memory clinic at Karolinska University Hospital due to memory complaints were included. All patients were assessed for the presence of a neurodegenerative disorder in accordance with standard investigative procedures. 66% were diagnosed with subjective cognitive impairment (SCI), 25% with mild cognitive impairment (MCI) and 9% fulfilled criteria for dementia. All patients were administered both MoCA and GSCT. Descriptive statistics, sensitivity, specificity and ROC curves were established for both tests.
Results: Mean score differed significantly between all diagnostic subgroups for both GSCT and MoCA (p<0.05). GSCT total test time differed significantly between all diagnostic subgroups (p<0.05). Overall, MoCA showed a sensitivity of 0.88 and specificity of 0.54 at a cut-off of <=26 while GSCT displayed 0.91 and 0.55 in sensitivity and specificity respectively at a cut-off of <=45.
Conclusion: This report suggests that GSCT is a viable cognitive screening instrument for both MCI and dementia.

Key words: Dementia, MCI, cognitive test, MoCA, e-medicine.


Dementia is currently a global driver of health care costs, and with an ageing demographic, the disease burden of neurodegenerative disorders will increase exponentially in the future. The prevalence is estimated to double every two decades, reaching approximately 80 million affected patients worldwide in 2030 (1). In 2016, the global costs associated with dementia were 948 billion US dollars and are currently projected to increase to 2 trillion US dollars by 2030, corresponding to roughly 2% of the world's total current gross domestic product (GDP) (2, 3).
Dementia, or major neurocognitive disorder (MCD), is an umbrella term for neurodegenerative disorders typically characterized by memory dysfunction, with Alzheimer's disease (AD) constituting approximately 60% of all cases. Other common forms of dementia include vascular dementia, Lewy body dementia and frontotemporal dementia. Modern diagnostic tools, such as various imaging modalities and cerebrospinal fluid biomarkers (4, 5), have improved diagnostic accuracy substantially. These methods have also provided key insights into the pathological mechanisms associated with neurodegeneration and contributed to the development of concepts such as mild cognitive impairment (MCI) and "preclinical AD" (6, 7). Preclinical AD is defined by the presence of cerebral amyloid or tau pathology, identified by positron emission tomography (PET) imaging or cerebrospinal fluid (CSF) biomarkers, before the onset of clinical symptoms (8).
Nevertheless, assessment of cognitive function, the primary clinical outcome of interest, still largely relies on analogue "pen and paper" tests administered to patients by health care providers (9). Although some regional differences exist, two of the best-known and most widely used cognitive tests are the Montreal Cognitive Assessment (MoCA) and the Mini-Mental State Examination (MMSE) (10, 11). Both tests assess various cognitive domains, with some inter-test differences, including orientation, memory, concentration, executive functions, language and visuospatial abilities (9), with scores ranging from 0 to 30 points. Compared to the MMSE, which is mostly focused on memory deficits, the MoCA assesses more cognitive domains, increasing its diagnostic accuracy. Although optimal cut-off points vary somewhat between studies, a score lower than 26 on MoCA and 24 on MMSE is considered indicative of dementia (12–15). In a previous meta-analysis, MoCA was shown to have a sensitivity and specificity of 0.94 and 0.60, respectively, at a cut-off of 26 points (16). This indicates a good ability to detect dementia, but at the cost of a high number of false positives. MMSE has, in a meta-analysis, demonstrated a sensitivity of 0.85 and specificity of 0.9 (14). However, MMSE has limited value in distinguishing MCI and prodromal AD patients from healthy controls (17). That said, in the setting of cognitive screening a trade-off between sensitivity and specificity is necessary, and screening instruments should favor sensitivity over specificity.
Given the current scientific consensus that potential future disease-modifying drugs for AD need to be administered early on in the disease continuum, there is a clear need to develop accurate and widely available cognitive screening tests in order to facilitate early diagnosis of MCI patients in the future. In the European Union, there are currently approximately 20 million individuals over the age of 55 with MCI, most of whom have not undergone screening for cognitive impairment (18). A previous study investigating the treatment and diagnostic capacity of six European countries (France, Germany, Italy, Spain, Sweden, United Kingdom) estimated that over 1 million patients would progress from MCI to AD due to capacity constraints within current health care systems if a disease modifying treatment were to be available in 2020 (18). As such, digital cognitive screening instruments are likely to be a part of the diagnostic process in the future, especially when considering the advancement of digitalized health care in multiple facets of modern medicine (19).
Cognitive assessment instruments are available in different settings, including clinic-based and at-home testing (20, 21). Current cognitive evaluation methods include both pen-and-paper screening tools, the conventional method administered by a clinical neuropsychologist, and computerized cognitive tests (20, 21). Advances in technology have led clinical trials to move away from conventional methods and adopt validated digital cognitive tools that are sensitive to cognitive changes in early prevention stages (20, 22). Computerized cognitive assessment tools offer several benefits over traditional instruments: they record accuracy and speed of response precisely, minimize floor and ceiling effects, and eliminate examiner bias by offering a standardized format (20–22). Computerized cognitive assessments may also generate time and cost savings, as the test can be administered by the patient or by healthcare professionals other than neuropsychologists, as long as an appropriate professional is responsible for test interpretation and diagnosis (20, 22). Thus, unmonitored digital tools provide the practical advantages of a reduced need for trained professionals, self-administration, automated test scoring and reporting, and ease of repeated administration, which enable large-scale screening (22, 23). On the other hand, cognitive assessment tools are typically administered to an elderly population who might lack familiarity with digital tools, which can negatively affect their performance (22, 24). However, the attitudes and perceptions of patients using computerized cognitive assessment have been investigated in the elderly population, and individuals expressed growing acceptance of computerized cognitive assessments, rating them as understandable, easy to use and more acceptable than pen-and-paper tests (20, 22).
They also perceived them as having the potential to improve patient care quality and the relationship between the patient and clinician when human intervention is involved (20).
Currently, a number of computerized screening instruments are available; they are either digital versions of existing standardized tests or new computerized tests and batteries for cognitive function assessment (25). The pen-and-paper version of the MoCA test was recently adapted into an electronic version (eMoCA) (24). eMoCA was tested on a group of adults to compare its performance to MoCA, and most of the subjects performed comparably (24). For the detection of MCI, eMoCA (24, 25) and CogState (26) showed promising psychometric properties (25). The computerized test of Inoue (27), CogState (26) and CANS-MCI (28) showed good sensitivity in detecting AD (25). Unlike other computerized cognitive screening tools, Geras Solutions is a comprehensive tool that provides, besides the cognitive test, a medical history questionnaire administered by the patient and a symptom survey administered by the patient's relatives. Thus, it has the potential to save more time and cost compared to other digital assessment instruments by providing a more complete clinical evaluation.
The primary objective of this study is to investigate the accuracy and validity of a newly developed digital cognitive test (Geras Solutions Cognitive Test [GSCT]). The GSCT is a self-administered cognitive screening test provided by Geras Solutions predominantly based on MoCA. In this study, we intend to investigate the validity of GSCT, including psychometric properties, agreement with MoCA and diagnostic accuracy by establishing sensitivity, specificity, receiver operating characteristics (ROC), area under the curve values (AUC) and optimal cut-off levels, as well as compare performance with MoCA.


Materials and methods

Geras Solutions cognitive test

The GSCT is a newly developed digital screening tool for cognitive impairment and is included in the Geras Solutions App (GSA). The screening tool was developed in collaboration with the research and clinical teams at Theme Aging, Karolinska University Hospital, the Solna memory clinic and Karolinska Institutet. GSCT builds on existing cognitive assessment methods (MoCA and MMSE) and includes additional proprietary tests developed at the memory clinic, Karolinska University Hospital, Stockholm, Sweden. The test is suitable for digital administration through devices supporting iOS and Android.
The test is composed of 16 items assessing various aspects of cognition, developed to screen for cognitive deterioration in the setting of dementia and to ensure suitability for administration via mobile devices. The GSCT is scored from 0 to 59 points in total and has six main subdomains: memory (0-10 points), visuospatial abilities (0-11 points), executive functions (0-13 points), working memory (0-19 points), language (0-1 point) and orientation (0-5 points). Additionally, the time needed to complete the individual tasks is registered and presented as total test time and subdomain test times. The GSCT is automatically scored using a computer algorithm, and results are presented as a total score as well as subdomain scores. A detailed description of the GSCT test items and scoring is provided in supplement 1.

Study population
The included study population consisted of 106 patients referred to the memory clinic at Karolinska University Hospital, Solna, predominantly by primary care practitioners, due to memory complaints and suspicion of cognitive decline. All patients referred to the clinic between January 2019 and January 2020 were asked to participate in the study. No exclusion criteria were established a priori. If a patient fulfilled the criteria for inclusion (i.e. referred for investigation of suspected dementia at the memory clinic and provided informed consent), they were included in the study. A total of 106 patients accepted participation in the study. Five patients did not complete the GSCT (two with MCI, two with subjective cognitive impairment [SCI] and one with dementia) and three patients displayed test scores with evident irregularities (one with MCI, one with SCI and one with dementia) and were excluded from the final analysis, leaving 98 complete cases. Irregularities included two patients who started the test multiple times and one patient with a congenital cognitive deficiency resulting in test scores more than 2 SD below the mean on both MoCA and GSCT.
All patients included in the study underwent the standard investigative procedure for dementia assessment as conducted at the Karolinska University Hospital memory clinic. The investigative process is completed in its entirety in one week and includes brain imaging, lumbar puncture for analysis of CSF biomarkers and neuropsychological assessment, including administration of different cognitive rating scales, among them MoCA. Patients received a dementia or MCI diagnosis according to the ICD-10 classification, with the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5) criteria used as clinical support (29). If no evidence of neurodegeneration was observed, patients were given an SCI diagnosis based on the ICD-10 classification (30). The final diagnosis was determined by a specialist in geriatric medicine. In parallel, patients who accepted inclusion in the study completed the self-administered GSCT during the investigative process. GSCT was, in all cases, administered after MoCA, but not on the same day. Patients were provided a tablet and completed the GSCT alone, with a health care provider nearby in case any technical difficulties arose. The GSCT is self-administered and all test instructions are provided by the platform; the test is intended to be performed in a home environment without any assistance. Information regarding patients' GSCT scores, MoCA scores, age, gender and final diagnosis was collected for statistical analysis. All included patients provided informed consent, and the study was approved by the Regional Ethics Committee of Karolinska Institute, Stockholm, Sweden (registration number 2018/998-31/1).
The mean age of the included population (n=98) was 58 years. Five patients were below 50 years of age (5%), 58 patients were between 50 and 60 years of age (59%), 34 patients were between 61 and 70 years of age (35%) and one patient was over 70 years (1%). Altogether, 67% (n=65) of the patients were assessed as showing no signs of a neurodegenerative disorder and were diagnosed with SCI, 24% were diagnosed with MCI, and 9% received a dementia diagnosis. The dementia group consisted of 8 patients with AD and 1 patient with vascular dementia.

Statistical analysis
All statistical analyses were done using Statistica software (version 13). Baseline descriptive characteristics were calculated and are provided in Table 1. The rating scales (GSCT and MoCA) were treated as both continuous and dichotomous variables when identifying optimal cut-off levels based on sensitivity and specificity analysis. Both parametric and non-parametric tests were used to validate findings, and discrepancies are reported where seen. Agreement between test measures was analyzed using the standardized concordance correlation coefficient and analysis of a Bland-Altman plot. The association between GSCT and MoCA was assessed using Pearson correlation. The internal consistency of GSCT was analyzed using Cronbach's alpha.
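For readers who wish to reproduce the agreement and consistency measures, a minimal Python sketch of Lin's concordance correlation coefficient and Cronbach's alpha is given below. This is an illustrative implementation under standard formulations of both statistics, not the Statistica procedure actually used in the study.

```python
from statistics import mean

def concordance_ccc(x, y):
    """Lin's concordance correlation coefficient between two paired score lists.

    Unlike Pearson's r, the CCC also penalizes systematic shifts in
    location and scale between the two tests."""
    n = len(x)
    mx, my = mean(x), mean(y)
    # Population (biased) variances and covariance, as in Lin's formulation.
    sx = sum((v - mx) ** 2 for v in x) / n
    sy = sum((v - my) ** 2 for v in y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return 2 * sxy / (sx + sy + (mx - my) ** 2)

def cronbach_alpha(items):
    """Cronbach's alpha for internal consistency.

    `items` is a list of per-item score lists, all of length n_subjects."""
    k = len(items)
    def var(v):  # sample variance
        m = mean(v)
        return sum((a - m) ** 2 for a in v) / (len(v) - 1)
    totals = [sum(col) for col in zip(*items)]
    return k / (k - 1) * (1 - sum(var(it) for it in items) / var(totals))
```

Perfectly concordant scores give a CCC of 1, while a constant offset between two tests lowers the CCC even when Pearson's r remains 1; this is why agreement is assessed with the CCC and Bland-Altman analysis rather than correlation alone.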
ANOVA was used to assess the differences in cognitive test scores categorized by diagnostic subgroups. Post-hoc analysis was conducted using Fisher’s Least Significant Difference (LSD) method. Logistic regression of total test scores was done in order to compare odds ratio between the tests.
Validation of the GSCT total score against MoCA required the following to be established or calculated: ROC curves, area under the curve (AUC) values with 95% confidence intervals, and sensitivity/specificity levels. Analyses were performed to estimate optimal cut-off values based on the best combined outcome across a range of sensitivity and specificity levels when testing the continuous scale against a dichotomous reference (SCI vs dementia and SCI vs MCI). Adjustment for multiple comparisons was done using the FDR method, and the presented p-values are FDR-adjusted. An adjusted p-level of <0.05 was defined as statistically significant.
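The cut-off search described above can be illustrated with a small sketch: for a score where lower values indicate impairment (as with GSCT and MoCA), each observed score is tried as a threshold and the threshold maximizing Youden's J is taken as the optimal cut-off. This is an assumed, simplified reconstruction of such a procedure, not the study's actual analysis code.

```python
def roc_points(scores, labels):
    """Sensitivity/specificity at every candidate cut-off.

    labels: 1 = cognitively impaired, 0 = reference group (here, SCI).
    A score <= cut-off is classified as impaired, matching the
    '<=26' (MoCA) and '<=45' (GSCT) convention used in the text."""
    points = []
    for cut in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s <= cut and y == 1)
        fn = sum(1 for s, y in zip(scores, labels) if s > cut and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s <= cut and y == 0)
        tn = sum(1 for s, y in zip(scores, labels) if s > cut and y == 0)
        points.append((cut, tp / (tp + fn), tn / (tn + fp)))
    return points

def best_cutoff(scores, labels):
    """Cut-off maximizing Youden's J = sensitivity + specificity - 1."""
    return max(roc_points(scores, labels), key=lambda p: p[1] + p[2])
```

For example, with scores [20, 22, 25, 28, 29, 30] and labels [1, 1, 1, 0, 0, 0], the optimal cut-off is <=25, with sensitivity 1 and specificity 1. In practice, as noted above, a screening instrument may instead favor a cut-off with higher sensitivity at the expense of specificity.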


Results
Baseline data, psychometric properties and normative data

Baseline patient characteristics, including cognitive test scores, are provided in Table 1. The mean score for GSCT was 45 points in the SCI group, 36 points in the MCI group and 28 points in patients with dementia.

Table 1. Descriptive statistics

Descriptive data and test scores. Values are shown as means, standard deviations and minimum/maximum; A. p<0.05 compared to SCI. B. p<0.05 compared to MCI. C. p<0.05 compared to dementia; SCI = Subjective cognitive impairment. MCI = Mild cognitive impairment. GSCT = Geras Solutions cognitive test. MoCA = Montreal Cognitive Assessment.

Figure 1. Bland-Altman plot of standardized test scores

X-axis = mean of standardized MoCA and GSCT scores. Y-axis = difference between standardized MoCA and GSCT scores.

GSCT and MoCA were strongly correlated (r(96) = 0.82, p < 0.01). The standardized concordance correlation coefficient between GSCT and MoCA was 0.82, indicating a high level of agreement. Agreement between the GSCT and MoCA was also analyzed using a Bland-Altman plot with standardized values, showing that 97% of data points lie within ±2 SD of the mean difference (see Figure 1). Estimation of the internal consistency of GSCT showed a standardized Cronbach's alpha of α = 0.87.
Age was not significantly correlated with GSCT scores (r = -0.16, p = 0.1). Diagnostic subgroup was significantly associated with age (F(2, 95) = 4.8, p = 0.02), with post-hoc tests showing a significant difference between dementia and SCI patients (mean 63 vs 57 years, p = 0.01) but not between SCI and MCI (mean 57 vs 60 years, p = 0.08) or MCI and dementia patients (mean 60 vs 63 years, p = 0.2). No differences in GSCT scores were observed depending on gender (t(96) = -0.3, p = 0.74), with males having a mean score of 41 points and females 40.4. Finally, age, gender and education were included in an ANCOVA, showing that education (F(1, 93) = 5.4, p = 0.03) was significantly associated with GSCT scores, in contrast to age (F(1, 93) = 2.9, p = 0.1) and gender (F(1, 93) = 0.74, p = 0.4). Patients with more than 12 years of education showed higher mean test scores than patients with 12 years or less (mean 42.2 vs 37.6 points, p = 0.05). GSCT total test time differed significantly depending on diagnostic subgroup (F(2, 95) = 36.4, p < 0.01) (Figure 2). Post-hoc tests showed that the differences in mean test time were significant between all three subgroups, with SCI patients showing a mean test time of 1057 seconds compared to 1296 and 2065 seconds for MCI and dementia patients, respectively (SCI vs MCI, p < 0.01; SCI vs dementia, p < 0.01; MCI vs dementia, p < 0.01).

Figure 2. Differences in GSCT test time depending on diagnosis

Mean GSCT total test time and a 95% confidence interval for patients with SCI, MCI and Dementia. p<0.05 between all subgroups.


Between-group differences in GSCT and MoCA

Average GSCT scores differed significantly depending on diagnostic subgroup (F(2, 95) = 20.3, p < 0.01). Post-hoc tests showed that the differences in mean scores were significant between all three subgroups (SCI vs MCI, p < 0.01) (SCI vs dementia, p < 0.01) (MCI vs dementia, p = 0.02).
Mean MoCA scores were also significantly different depending on diagnosis (F(2, 95) = 29.5, p < 0.01) and the mean scores were significantly different for all three subgroups (SCI vs MCI, p < 0.01) (SCI vs dementia, p < 0.01) (MCI vs dementia, p < 0.01) (Table 1).

Box plots of test scores for both GSCT and MoCA categorized by diagnosis are shown in Figure 3. Odds ratios were calculated, showing that a one-unit increase on the GSCT increased the odds of being healthy by 1.15 (95% CI 1.07-1.22), while a one-unit increase on MoCA was associated with a 1.47 increase in odds (95% CI 1.22-1.76).
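The reported odds ratios follow from exponentiating the fitted logistic-regression coefficient. A sketch of that conversion, with a Wald-type 95% confidence interval, is shown below; the coefficient and standard error used in the example are illustrative assumptions, not the study's fitted estimates.

```python
import math

def odds_ratio_ci(beta, se, z=1.96):
    """Odds ratio and Wald 95% CI from a logistic-regression
    coefficient (log-odds change per one-point score increase)."""
    return (math.exp(beta),
            math.exp(beta - z * se),
            math.exp(beta + z * se))

# Illustrative only: a coefficient of ln(1.15) reproduces an OR of 1.15,
# as reported for GSCT; 0.033 is an assumed standard error.
or_point, or_low, or_high = odds_ratio_ci(math.log(1.15), 0.033)
```

Note that because the two tests use different score ranges (0-59 for GSCT vs 0-30 for MoCA), per-point odds ratios are not directly comparable between the instruments.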

Figure 3. Boxplots showing differences in test scores depending on diagnosis

Median GSCT and MoCA scores are represented by small squares; larger boxes represent the interquartile range, while whiskers show the non-outlier range.


Accuracy and comparison with MoCA

GSCT showed very good to excellent discriminative properties across a wide range of cut-off values. When including all patients, thus coding both MCI and dementia patients into a binary classification of healthy/cognitively impaired, the GSCT total score displayed an AUC value of 0.80 with 95% CI [0.70-0.90], as did MoCA (AUC 0.80, 95% CI [0.70-0.90]). MoCA showed a sensitivity of 0.88 and specificity of 0.54 at a cut-off of <=26, while the GSCT total score displayed a sensitivity of 0.91 and specificity of 0.55 at a cut-off of <=45. Figure 4 shows the respective ROC curves and Table 2 presents the summary statistics.

Figure 4. Comparison of ROC curves between cognitive tests

Receiver operating characteristic curves for GSCT and MoCA: top left, SCI vs (MCI + dementia); top right, SCI vs MCI; bottom left, SCI vs dementia.

When assessing accuracy in discriminating between SCI and MCI patients, GSCT showed an AUC value of 0.74 with 95% CI [0.62-0.85], whereas MoCA showed an AUC value of 0.74 with 95% CI [0.61-0.85]. Sensitivity and specificity at a cut-off level of <=45 were 0.88 and 0.55, respectively, for the GSCT total score, whereas MoCA, at the traditional cut-off of <=26, displayed a sensitivity of 0.83 and specificity of 0.54 (Table 2). Both tests were excellent at discriminating dementia patients from SCI. GSCT showed an AUC of 0.96 with 95% CI [0.92-1.0], while MoCA had an AUC of 0.98 with 95% CI [0.95-1.0]. At the traditional MoCA cut-off of <=26, sensitivity and specificity were 1 and 0.54, respectively, whereas GSCT, using a cut-off of <=35.5, showed a sensitivity of 1 and specificity of 0.9. As seen in Figure 5, both tests show good capabilities in discriminating between diagnostic subgroups in this material, although some overlap between MCI and SCI patients existed for both tests. GSCT was marginally better than MoCA at discriminating MCI from SCI patients. No patients with dementia scored within the normal range on either test.

Figure 5. Scatterplot of cognitive test scores depending on diagnosis

Scatter plot of GSCT and MoCA categorized by diagnosis. Marked lines represent cut-off points.

Table 2. Summary of accuracy for both tests

Summary statistics for the ROC analyses of GSCT and MoCA.


Discussion
In this study, we present the first results on a newly developed digital cognitive test provided by Geras Solutions. GSCT displayed good agreement with MoCA based on concordance correlation analysis and a Bland-Altman plot, indicating that both tests measure similar cognitive domains. Additionally, normative data regarding the influence of age, gender and education were analyzed, showing that education, but not age and gender, affected test scores. Individuals with more than 12 years of education had higher mean GSCT scores than individuals with 12 years or less of education, providing valuable information for score interpretation in different demographic groups. GSCT showed discriminative properties equal to those of the MoCA test. Both tests were excellent at discriminating dementia patients from SCI patients, with a sensitivity of 1 for both GSCT and MoCA and a specificity of 0.9 and 0.56, respectively. This result is similar to the discriminative capabilities of other digital cognitive tests, which have shown sensitivity and specificity scores ranging from 0.85-1 and 0.81-1, respectively (31–33). Both tests also showed similar capabilities when discriminating between SCI and MCI patients, with AUC scores of 0.74. GSCT was in this study slightly better at correctly identifying cognitive deterioration in MCI patients, with a sensitivity of 0.88 compared to 0.83 for MoCA, while both tests showed similar specificity (0.55 and 0.54, respectively). The GSCT showed somewhat better sensitivity in detecting MCI patients than other digital screening tools, such as CogState, which previously reported sensitivity scores ranging between 0.63 and 0.84, although those tests demonstrated higher specificity (31, 33, 34). Since GSCT is intended as a screening tool used early in the diagnostic process, we believe that a focus on high sensitivity is more important and must come at the cost of lower specificity.
Both tests demonstrated significant differences in mean test scores between all diagnostic subgroups. Additionally, total GSCT test time was significantly different between all subgroups, providing further valuable clinical information compared to current pen-and-paper cognitive screening instruments. GSCT showed very good internal consistency (α = 0.87). Based on this study, we suggest a cut-off level of <=45 for detection of MCI, while values <=35.5 indicate manifest dementia.

Overall, GSCT performed at least as well as currently available screening tools for dementia disorders (MoCA) while providing several advantages. First, the test is administered via a digital device, eliminating the time-consuming need for testing by health care practitioners and increasing the availability of cognitive screening. Given the estimated increase in dementia prevalence reported earlier, combined with possible disease-modifying drugs, there is an urgent need for increased accessibility. Additionally, the digital setup of the test eliminates administration bias from health care providers and creates a more homogeneous diagnostic tool, although future studies are needed to test the device in a setting without health care providers nearby. Furthermore, the possibility to register total and domain-specific test time may provide valuable clinical information, potentially increasing diagnostic capabilities, a hypothesis requiring further testing in future research. Given current trends, the development of an effective and accurate digital screening tool for cognitive impairment is of utmost importance. With a sufficiently accurate test, patients scoring in the normal range would not need to undergo further examination in the hospital setting. Instead, this digital screening instrument could identify the individuals in need of expanded testing, e.g. MRI, CSF analysis and detailed neuropsychological testing, thus saving resources for the health care system and allocating interventions to those in need.

Limitations
In this initial study, we were not able to include healthy subjects. Instead, SCI patients were used as "healthy controls". Although these patients had self-reported cognitive dysfunction, no objective findings indicating an ongoing neurodegenerative process could be identified. Future studies should include healthy subjects without any subjective symptoms. Additionally, future, larger normative studies are required to investigate how factors such as age, gender and education affect GSCT performance, in order to increase validity and diagnostic accuracy. Another limitation of the test is the lack of information regarding test-retest reliability. In this preliminary trial, we were unable to obtain longitudinal data, hindering such analysis. Future studies must include longitudinal measurements in order to determine the test-retest reliability of GSCT.
Another limitation of this study is the small sample size, especially in the MCI and dementia subgroups. These findings should therefore be interpreted with caution, and future studies including more patients with MCI and dementia disorders are necessary to improve the accuracy of the test. Although the small sample size increases the risk of type II errors, we found significant differences in mean GSCT scores between all groups, further supporting the robustness of the findings. Continuous collection of data from new individuals will improve test performance and provide normative information. Another limitation is that patients were administered the GSCT during the same week as MoCA, which could potentially generate practice effects. Furthermore, all testing in the study was conducted in Swedish and all included patients were living in close proximity to Stockholm, Sweden. Thus, there may be a potential selection bias in the study population, and future studies should investigate whether GSCT scores are affected by regional differences, as well as examine the suitability of different language versions in order to improve accessibility.


Conclusion
Overall, the Geras Solutions Cognitive Test performed very well with diagnostic capabilities equal to MoCA when tested on this study population.
This report suggests that GSCT could be a viable cognitive screening instrument for both MCI and dementia. Continued testing and the collection of normative data and test-retest reliability analysis is needed to improve the validity and diagnostic accuracy of the test. Additionally, future studies should explore the diagnostic value of total test time as well as item specific test time.

Funding: The Theme Aging Research Unit had a research collaboration with Geras Solutions during the study, and a grant from Geras Solutions was provided to support the study. The study was conducted independently at the memory clinic, Karolinska University Hospital, and the funding organizations were not involved in the analyses or writing. Other research support: Joint Program of Neurodegenerative Disorders, IMI, Knut and Alice Wallenberg Foundation, Center for Innovative Medicine (CIMED), Stiftelsen Stockholms sjukhem, Konung Gustaf V:s och Drottning Victorias Frimurarstiftelse, Alzheimer's Research and Prevention Foundation, Alzheimerfonden, Region Stockholm (ALF and NSV grants). Advisory board (MK): Geras Solutions, Combinostics, Roche. GH: Advisory board: Geras Solutions. VB: Consultant for Geras Solutions.

Conflict of Interest: MK: Advisory board: Combinostics, Roche; GH: Advisory board: Geras Solutions; VB: Consultant for Geras Solutions.

Ethical Standards: The study was approved by the Regional Ethics Committee of Karolinska Institute, Stockholm, Sweden. Registration number: 2018/998-31/1.

Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits use, duplication, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.


References
1. Prince M, Wimo A, Guerchet M, et al. World Alzheimer Report 2015. The global impact of dementia: an analysis of prevalence, incidence, cost and trends. London: Alzheimer's Disease International; 2015.
2. Wimo A, Guerchet M, Ali GC, Wu YT, Prina AM, Winblad B, et al. The worldwide costs of dementia 2015 and comparisons with 2010. Alzheimer’s Dement. 2017;13(1):1–7.
3. Xu J, Zhang Y, Qiu C, Cheng F. Global and regional economic costs of dementia: a systematic review. Lancet. 2017;390:S47.
4. Blennow K, Zetterberg H. Biomarkers for Alzheimer’s disease: current status and prospects for the future. J Intern Med. 2018;284(6):643–63.
5. Jack CR, Bennett DA, Blennow K, Carrillo MC, Dunn B, Haeberlein SB, et al. 2018 National Institute on Aging-Alzheimer’s Association (NIA-AA) Research Framework NIA-AA Research Framework: Toward a biological definition of Alzheimer’s disease. Alzheimer’s Dement. 2018;14(1):535–62.
6. Lopez OL. Mild cognitive impairment. Continuum (Minneap Minn). 2013;19(2 Dementia):411–24.
7. Sperling RA, Aisen PS, Beckett LA, Bennett DA, Craft S, Fagan AM, et al. Toward defining the preclinical stages of Alzheimer’s disease: Recommendations from the National Institute on Aging-Alzheimer’s Association workgroups on diagnostic guidelines for Alzheimer’s disease. Alzheimer’s Dement. 2011;7(3):280–92.
8. Dubois B, Hampel H, Feldman HH, Scheltens P, Aisen P, Andrieu S, et al. Preclinical Alzheimer’s disease: Definition, natural history, and diagnostic criteria. Vol. 12, Alzheimer’s and Dementia. Elsevier Inc.; 2016. p. 292–323.
9. Sheehan B. Assessment scales in dementia. Ther Adv Neurol Disord. 2012;5(6):349–58.
10. Nasreddine ZS, Phillips NA, Bédirian V, Charbonneau S, Whitehead V, Collin I, et al. The Montreal Cognitive Assessment, MoCA: A Brief Screening Tool For Mild Cognitive Impairment. J Am Geriatr Soc. 2005;53(4):695–9.
11. Folstein MF, Folstein SE, McHugh PR. “Mini-mental state”. A practical method for grading the cognitive state of patients for the clinician. J Psychiatr Res. 1975;12(3):189–98.
12. Carson N, Leach L, Murphy KJ. A re-examination of Montreal Cognitive Assessment (MoCA) cutoff scores. Int J Geriatr Psychiatry. 2018;33(2):379–88.
13. Milani SA, Marsiske M, Cottler LB, Chen X, Striley CW. Optimal cutoffs for the Montreal Cognitive Assessment vary by race and ethnicity. Alzheimer’s Dement Diagnosis, Assess Dis Monit. 2018;10:773–81.
14. Creavin ST, Wisniewski S, Noel-Storr AH, Trevelyan CM, Hampton T, Rayment D, et al. Mini-Mental State Examination (MMSE) for the detection of dementia in clinically unevaluated people aged 65 and over in community and primary care populations. Cochrane Database Syst Rev. 2016;2016(1):CD011145.
15. O’Bryant SE, Humphreys JD, Smith GE, Ivnik RJ, Graff-Radford NR, Petersen RC, et al. Detecting dementia with the mini-mental state examination in highly educated individuals. Arch Neurol. 2008;65(7):963–7.
16. Davis DH, Creavin ST, Yip JL, Noel-Storr AH, Brayne C, Cullum S. Montreal Cognitive Assessment for the diagnosis of Alzheimer’s disease and other dementias. Cochrane Database Syst Rev. 2015;
17. Mitchell AJ. A meta-analysis of the accuracy of the mini-mental state examination in the detection of dementia and mild cognitive impairment. J Psychiatr Res. 2009;43(4):411–31.
18. Hlavka JP, Mattke S, Liu JL. Assessing the Preparedness of the Health Care System Infrastructure in Six European Countries for an Alzheimer’s Treatment. Rand Heal Q. 2019;8(3):2.
19. Meskó B, Drobni Z, Bényei É, Gergely B, Győrffy Z. Digital health is a cultural transformation of traditional healthcare. mHealth. 2017;3:38–38.
20. Robillard JM, Lai JA, Wu JM, Feng TL, Hayden S. Patient perspectives of the experience of a computerized cognitive assessment in a clinical setting. Alzheimer’s Dement Transl Res Clin Interv. 2018;
21. Kim H, Hsiao CP, Do EYL. Home-based computerized cognitive assessment tool for dementia screening. J Ambient Intell Smart Environ. 2012;
22. Wild K, Howieson D, Webbe F, Seelye A, Kaye J. Status of computerized cognitive testing in aging: A systematic review. Alzheimer’s and Dementia. 2008.
23. Morrison RL, Pei H, Novak G, Kaufer DI, Welsh-Bohmer KA, Ruhmel S, et al. A computerized, self-administered test of verbal episodic memory in elderly patients with mild cognitive impairment and healthy participants: A randomized, crossover, validation study. Alzheimer’s Dement Diagnosis, Assess Dis Monit. 2018;
24. Berg JL, Durant J, Léger GC, Cummings JL, Nasreddine Z, Miller JB. Comparing the Electronic and Standard Versions of the Montreal Cognitive Assessment in an Outpatient Memory Disorders Clinic: A Validation Study. J Alzheimer’s Dis. 2018;
25. De Roeck EE, De Deyn PP, Dierckx E, Engelborghs S. Brief cognitive screening instruments for early detection of Alzheimer’s disease: A systematic review. Alzheimer’s Research and Therapy. 2019.
26. Maruff P, Lim YY, Darby D, Ellis KA, Pietrzak RH, Snyder PJ, et al. Clinical utility of the cogstate brief battery in identifying cognitive impairment in mild cognitive impairment and Alzheimer’s disease. BMC Psychol. 2013;
27. Inoue M, Jinbo D, Nakamura Y, Taniguchi M, Urakami K. Development and evaluation of a computerized test battery for Alzheimer’s disease screening in community-based settings. Am J Alzheimers Dis Other Demen. 2009;
28. Memória CM, Yassuda MS, Nakano EY, Forlenza O V. Contributions of the Computer-Administered Neuropsychological Screen for Mild Cognitive Impairment (CANS-MCI) for the diagnosis of MCI in Brazil. Int Psychogeriatrics. 2014;
29. Diagnostic and statistical manual of mental disorders : DSM-5 [Internet]. Fifth edition. Arlington, VA : American Psychiatric Publishing, [2013] ©2013;
30. The ICD-10 Classification of Mental and Behavioural Disorders Clinical descriptions and diagnostic guidelines World Health Organization.
31. Scharre DW, Chang SI, Nagaraja HN, Vrettos NE, Bornstein RA. Digitally translated Self-Administered Gerocognitive Examination (eSAGE): Relationship with its validated paper version, neuropsychological evaluations, and clinical assessments. Alzheimer’s Res Ther. 2017;9(1).
32. Onoda K, Yamaguchi S. Revision of the cognitive assessment for dementia, iPad version (CADi2). PLoS One. 2014;9(10).
33. Possin KL, Moskowitz T, Erlhoff SJ, Rogers KM, Johnson ET, Steele NZR, et al. The Brain Health Assessment for Detecting and Diagnosing Neurocognitive Disorders. J Am Geriatr Soc. 2018;66(1):150–6.
34. de Jager CA, Schrijnemaekers ACMC, Honey TEM, Budge MM. Detection of MCI in the clinic: Evaluation of the sensitivity and specificity of a computerised test battery, the Hopkins Verbal Learning Test and the MMSE. Age Ageing. 2009;38(4):455–60.



P.A. Amofa1, D.E.C. Locke2, M. Chandler3, J.E. Crook4, C.T. Ball4, V. Phatak5, G.E. Smith1


1. Department of Clinical and Health Psychology, University of Florida, Gainesville, FL, USA; 2. Department of Psychiatry and Psychology, Mayo Clinic Arizona, Scottsdale, AZ, USA; 3. Department of Psychiatry and Psychology, Mayo Clinic Florida, Jacksonville, FL, USA; 4. Division of Biomedical Statistics and Informatics, Mayo Clinic Florida, Jacksonville, FL, USA; 5. Department of Neurological Sciences, University of Nebraska Medical Center, Omaha, NE, USA; 6. Department of Psychiatry and Psychology, Mayo Clinic Minnesota, Rochester, MN, USA.

Corresponding Author: Dona E.C. Locke, Division of Psychology, Mayo Clinic, 13400 E. Shea Blvd., Scottsdale, AZ 85259; Ph: 480-301-8297; Fax: 480-301-6258; Email:

J Prev Alz Dis 2021;1(8):33-40
Published online October 26, 2020,



Background/Objective: Various behavioral interventions are recommended to combat the distress experienced by caregivers of those with cognitive decline, but their comparative effectiveness is poorly understood.
Design/Setting: Caregivers in a comparative intervention study randomly had one of five possible interventions suppressed while receiving the other four. Caregivers in a full clinical program received all five intervention components. Care partner outcomes in the study groups were compared with those of participants enrolled in the full clinical program.
Participants: Two hundred and seventy-two dyads of persons with amnestic mild cognitive impairment (pwMCI) and care partners enrolled in the comparative intervention study; 265 dyads participated in the full clinical program.
Intervention: Behavioral intervention components included: memory compensation training, computerized cognitive training, yoga, support group, and wellness education. Each was administered for 10 sessions over 2 weeks.
Measurements: A longitudinal mixed-effect regression model was used to analyze the effects of the interventions on partner burden, quality of life (QoL), mood, anxiety, and self-efficacy at 12 months follow-up.
Results: At 12 months, withholding wellness education or yoga had a significantly negative impact on partner anxiety compared to partners in the clinical program (ES=0.55 and 0.44, respectively). Although not statistically significant, withholding yoga had a negative impact on partner burden and mood compared to partners in the full clinical program (ES=0.32 and 0.36, respectively).
Conclusion: Our results support the benefits of wellness education and yoga for improving partner burden, mood, and anxiety at one year. Our findings provide the first exploration of the impact of multicomponent interventions on care partners of pwMCI.

Key words: Non-pharmacological interventions, MCI, dementia, caregiver, patient preferences.




As a medical community, we are increasingly able to identify dementia at an early stage, including the Mild Cognitive Impairment (MCI) stage. Amnestic MCI is defined as memory abnormality beyond normal age-related decline with relatively retained functional capacity (1). However, it is acknowledged in this definition that persons with MCI often have some mild problems with complex tasks they previously performed (such as paying bills) such that they may take longer or make more errors than in the past (2). Therefore, some care partner support is often needed, even if it is only minor reminders and supervision. Rates of depression and other psychological comorbidities (e.g., burden, anxiety, decreased quality of life) are elevated in family members of people with MCI as compared to the general population (but not as pronounced as in caregivers of persons with dementia) (3). The neurobehavioral symptoms, psychological wellbeing, cognitive and functional decline, executive functioning difficulties, and dependency present in persons with MCI (pwMCI) are common predictors of the psychological symptoms experienced by care partners (4–6). Thus, interventions to improve outcomes in patients with MCI could also potentially impact outcomes in care partners of those with MCI.
Mayo Clinic developed the HABIT (Healthy Action to Benefit Independence & Thinking®) program, a 50-hour behavioral intervention program with 5 components: physical exercise via yoga, computerized cognitive training (CCT), wellness education, patient and partner support groups, and cognitive rehabilitation with a compensatory memory support system (MSS). Each of these behavioral interventions has support in the literature for effectiveness for pwMCI across a variety of outcomes (e.g., cognitive functioning, quality of life, mood, partner burden) in comparison to no treatment (7–12). However, there is a dearth of literature comparing the effectiveness of behavioral interventions for those with MCI or their loved ones who participate as support partners. In a pilot study, we compared the outcomes of wellness education plus compensation-based cognitive rehabilitation and wellness education plus cognitive exercise, and we compared each combination to no treatment. We found that patient memory-related activities of daily living (ADLs) improved relative to no treatment in the cognitive rehabilitation condition but not in the cognitive exercise condition (13). We also found that partners in both treatment groups showed stable to improved mood and anxiety symptoms while partners in the untreated group showed worsening depression and anxiety symptoms over 6 months (14). There were no statistically significant differences between the impact of cognitive rehabilitation and cognitive exercise on patient self-efficacy or other partner outcomes (quality of life, burden); however, effect size estimates suggested the possibility of a greater impact of the cognitive rehabilitation intervention on several of these outcomes, which this small pilot study was underpowered to detect (Cohen’s d range 0.37 to 0.73) (13, 14).
The current study sought to compare the effectiveness of the five behavioral interventions that comprise HABIT®. The full details of our rationale, design, and initial enrollment of the comparative effectiveness study are outlined elsewhere (15). Briefly, we utilized a subtraction model, randomizing groups of dyads to have one of the five interventions withheld while receiving the other four. These groups were compared to a clinical dataset of dyads who received the full clinical HABIT® program. Patient-related outcomes of the comparative effectiveness study are outlined in a separate report (16). As delivery of the HABIT interventions requires a care partner to participate in the sessions, the interventions are as much geared towards the care partner as the pwMCI themselves. We hypothesized that the intervention components that promote self-care and resilience (as opposed to improving cognitive function) would encourage care partners to develop skills that directly impact their burden, mood, and overall quality of life. This report outlines outcomes for partners one year post-intervention. Patient/partner-advocated outcomes were determined by surveying alumni who had previously completed HABIT®. This survey asked patients and partners to identify the outcomes they were seeking in a behavioral intervention for MCI. Focusing on partner-related outcomes, burden was ranked as most important, followed by partner quality of life, partner self-efficacy, partner anxiety, and partner mood, respectively (17). These are the outcomes that are the focus of this analysis.




Two hundred seventy-two dyads were recruited through clinical services at Mayo Clinic in Minnesota, Arizona, and Florida, as well as the University of Washington, to take part in the comparative effectiveness intervention study. Consecutive candidates with diagnoses of amnestic MCI (single or multi-domain) were approached for the study, underwent further evaluation for study inclusion/exclusion criteria, and enrolled in the trial. Inclusion criteria included a Clinical Dementia Rating Scale (18) score of ≤0.5, a cognitively normal care partner (Mini-Mental State Examination, MMSE (19), score >24) who had at least twice-weekly contact with the pwMCI, either not taking or stable on nootropic medication for at least 3 months, and fluency in English. Exclusion criteria included current participation in another treatment-related clinical trial or significant auditory, visual, or motor impairment impacting ability to participate in the program.

HABIT Intervention and Randomization

The clinical HABIT program involves 10 days of intervention over the course of two weeks. All components are designed to help pwMCI and their care partners initiate new health behavior habits with the aim of sustaining these behaviors post-HABIT®. In the comparative effectiveness study, block randomization was utilized to suppress one of the five components for each session group of 10-20 dyads. Interventions for both the clinical HABIT program and the comparative effectiveness study were run by PhD clinical neuropsychologists, master’s-trained counselors, or cognitive rehabilitation and dementia education specialists. Certified yoga instructors conducted the yoga sessions. Components of HABIT® include:
1. Yoga: Partners and patients engaged in daily 45- to 60-minute sessions of physical exercise and relaxation/mindfulness training via yoga. They were provided a customized DVD to encourage continued practice post-HABIT.
2. Computerized Cognitive Training (CCT): Partners and patients completed 45- to 60-minute sessions of cognitive training via the commercially available Brain-HQ™ program (Posit Science; San Francisco CA). They were provided a one-year subscription to the program to encourage continued use post-HABIT.
3. Wellness: Partners and patients attended daily 45- to 60-minute lectures covering a range of health topics such as Living with MCI, Changes in Roles and Relationships, Sleep Hygiene, MCI and Depression, Nutrition, and Assistive Technologies. Dyads were given resources and written information to help engage behavioral changes post-HABIT.
4. Support Groups: pwMCI and care partners met separately in support groups for 45-60 minutes each day. The pwMCI support group centered on reminiscence, with the opportunity for psychotherapeutic discussion of MCI-related concerns as desired by patients. The partner support group focused on building resources for coping with the changes in their loved one. Dyads were encouraged to seek out community-based support groups (e.g., Alzheimer’s Association groups) for continued support post-HABIT.
5. Memory Support System (MSS): Patients received daily cognitive rehabilitation focused on compensatory MSS development. This involved structured-curriculum training in the use of a two-page-per-day written memory book to record compensatory written reminders for important appointments, tasks, and events. Patients and partners were provided the paper MSS materials on an ongoing basis to enable continued use of the system post-HABIT.
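The block randomization scheme described above (exactly one of the five components withheld per session group, so that suppression arms stay balanced) can be sketched as follows. This is a hypothetical illustration of the general technique; the function and variable names are our own, not the study's actual assignment code.

```python
import random

# The five HABIT components; in each session group of dyads, exactly one
# component is withheld while the other four are delivered.
COMPONENTS = ["yoga", "CCT", "wellness", "support group", "MSS"]

def assign_suppressed_components(session_ids, seed=42):
    """Block randomization sketch: cycle through shuffled blocks of the
    five components so each component is withheld once per block of five
    sessions. Hypothetical illustration, not the study's actual code."""
    rng = random.Random(seed)
    assignment = {}
    block = []
    for sid in session_ids:
        if not block:            # start a new randomized block of five
            block = COMPONENTS[:]
            rng.shuffle(block)
        assignment[sid] = block.pop()   # component withheld this session
    return assignment
```

Over any five consecutive sessions each component is withheld exactly once, which is consistent with the roughly balanced arm sizes (52-57 dyads per arm) reported below.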

The final sample for the comparative effectiveness study included 56 dyads who had the yoga component suppressed, 54 dyads who had the CCT suppressed, 52 dyads who had the wellness component suppressed, 53 dyads who had the support group suppressed, and 57 dyads who had the MSS suppressed.
To assess the impact of individual intervention components on partner outcomes, we compared data from the comparative effectiveness study to a clinical HABIT sample. The clinical HABIT sample was similar in makeup to the participants in the comparative effectiveness study except that the clinical HABIT sample patients: received all five interventions, completed their sessions prior to the PCORI trial, and did not include patients from the University of Washington. Data used in these analyses came from only those clinical HABIT program participants who had provided informed consent for the use of their data for research purposes, which included follow-up for 5 years post-HABIT. The final clinical HABIT sample included 265 dyads recruited through the clinical HABIT programs at Mayo Arizona, Mayo Florida, and Mayo Minnesota.

Measures: Partner

The care partner completed measures at baseline, end of treatment, 6-month follow-up, and 12-month follow-up.

Partner burden

Care partner burden at 12-month follow-up was our primary outcome measure. This was assessed with the short form of the Caregiver Burden Inventory (20). Scores range from 0-48 with higher scores suggesting more burden.

Partner mood and anxiety

Partners completed the Center for Epidemiological Studies Depression Scale (CES-D) (21) for measurement of depression-related symptoms and the Resources for Enhancing Alzheimer’s Caregiver Health (REACH) (22) scale for anxiety. CES-D scores range from 0-60, with higher scores suggestive of more symptoms of depression. REACH total scores range from 10-40, with higher scores suggestive of more symptoms of anxiety.

Partner quality of life

Quality of life (QoL) was measured using the Quality of Life AD (QOL-AD) scale (23). Scores range from 13-52 with higher scores representing better QOL.

Partner self-efficacy

Partners completed the Caregiving Competence and Mastery scales of Pearlin et al. (24). Scores range from 7-18 with higher scores indicating higher self-efficacy.


We compared baseline outcome measures of partners in the experimental groups to those in the clinical HABIT group using a mixed-effects regression model with a fixed effect for clinical HABIT group and random effects for site-dependent patient/partner group. For the primary analysis, a longitudinal mixed-effects regression model was used to compare primary partner outcome measures using data from four time points (baseline, end of treatment, 6-month, and 12-month) among the six groups – the five comparative intervention study groups and one clinical HABIT group. The clinical HABIT group also had outcome measures collected at 3 months, which were accounted for in the model. The primary analysis focused on changes in the measures from baseline to 12 months with burden as the primary outcome and QOL, mood, anxiety, and self-efficacy as secondary outcomes. Specifically, each outcome measure at baseline was modeled with fixed effects for age, sex, site, and group (clinical vs. experimental). The mean change in each outcome measure from baseline to follow-up time point was also modeled with fixed effects for group (clinical HABIT and each of the 5 experimental groups), age, and sex. We included random effects for partner to account for the multiple measurements over time. Comparisons of each experimental group to the clinical HABIT group on partner burden, QOL, mood, anxiety, and self-efficacy at 12 months were of primary interest. Furthermore, we obtained fitted trajectories over time for all experimental groups and the clinical HABIT group for each of the 5 outcome measures. We created 95% confidence intervals (CI) using the profile likelihood method and performed testing using corresponding likelihood ratio tests. Effect sizes (ES) were calculated as the fitted mean change from baseline for a hypothetical average partner divided by the standard deviation (SD) of the baseline measures of partners in the experimental groups. 
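The effect-size convention just described (model-fitted mean change divided by the baseline SD of the experimental groups) can be illustrated with a minimal sketch. The -2.17 change score below is a hypothetical value chosen for illustration; the 6.77 baseline SD for burden is taken from the Table 2 notes.

```python
def effect_size(fitted_mean_change, baseline_sd):
    """Standardized effect size: model-fitted mean change from baseline
    for a hypothetical average partner, divided by the SD of the
    baseline measures of partners in the experimental groups."""
    return fitted_mean_change / baseline_sd

# Illustrative only: a hypothetical fitted change of -2.17 burden points,
# scaled by the baseline burden SD of 6.77 from the Table 2 notes.
es = effect_size(-2.17, 6.77)
print(round(es, 2))  # -0.32
```

On this scale a 1-unit change in ES corresponds to a 1-SD change on the underlying measure, which is how the values in Table 2 and Figure 1 should be read.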
The Holm method was used to adjust for multiple comparisons. Data missing at random were accounted for by using longitudinal mixed models for our primary analysis. Analyses were performed using R statistical software, version 3.6.2 (R Foundation for Statistical Computing, Vienna, Austria).
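The Holm step-down adjustment is a standard procedure; a minimal sketch (our own illustration, not the authors' R code) is:

```python
def holm_adjust(pvals):
    """Holm step-down adjusted p-values for m comparisons: sort the raw
    p-values ascending, multiply the k-th smallest by (m - k + 1), then
    enforce monotonicity with a running maximum, capping at 1."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices, ascending p
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        running_max = max(running_max, min(1.0, (m - rank) * pvals[i]))
        adjusted[i] = running_max  # adjusted p in the original order
    return adjusted
```

With 5 tests, for example, the smallest raw p-value is multiplied by 5, so a raw p of 0.0026 would become an adjusted p of 0.013.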




Study partner baseline characteristics and outcome measures at baseline and 12-month follow-up per study arm are shown in Table 1. Two hundred seventy-two dyads entered the comparative effectiveness intervention study and 228 dyads (83.8%) completed the study through the 12-month follow-up. The most common reasons for withdrawal from the study included the presence of other significant medical concerns and changes in living situation. The clinical group included 265 dyads at baseline, and data from all 265 dyads were included in the 12-month outcome analysis. Table 1 shows the number of missing observations for each group. Mood, anxiety, and self-efficacy were worse at baseline for the clinical HABIT group compared to the partners in the experimental groups (p<0.001). This difference was adjusted for in the comparison analysis at 12 months outlined below. There was no evidence of baseline differences in burden or QoL between the clinical HABIT group and the experimental groups (p=0.21 and p=0.18, respectively).

Table 1. Partner Baseline and 12 Month Follow-up Characteristics

Abbreviations: CCT, computerized cognitive training; MSS, memory support system; HABIT, Healthy Action to Benefit Independence and Thinking; QOL-AD, Quality of Life in Alzheimer Disease; CES-D, Center for Epidemiological Studies Depression Scale; REACH, Resources for Enhancing Alzheimer Caregiver Health. Note: For Burden, higher scores indicate greater burden; for QOL, higher scores indicate greater QOL; for CES-D, higher scores indicate more symptoms of depression; for REACH, higher scores indicate more symptoms of anxiety; for Self-efficacy, higher scores indicate higher self-efficacy. Superscripts represent the number of missing observations. Asterisk (*) represents a significant (p<0.001) baseline difference in an outcome measure between the clinical HABIT group and the experimental groups, based on a likelihood ratio test from a mixed effects regression model with a fixed effect for group (clinical vs. experimental) and a random effect for patient/partner session.


Outcome Measures

At the end of treatment (see Figure 1), the clinical HABIT group saw an improvement from baseline in QOL (ES=0.20, 95% CI 0.09 to 0.30), mood (ES=0.16, 95% CI 0.04 to 0.29), and anxiety (ES=0.12, 95% CI 0.00 to 0.24), but not in burden or self-efficacy. All the experimental groups showed evidence of improvement in QOL from baseline to end of treatment except the no yoga group (ES=-0.07, 95% CI -0.06 to 0.32) and the no wellness group (ES=-0.04, 95% CI -0.24 to 0.17). However, the no wellness condition was the only experimental condition that was significantly lower than clinical HABIT (ES=-0.23, 95% CI -0.46 to -0.01). For the other 4 intervention groups, the difference in the change in QOL from baseline to end of treatment compared to clinical HABIT was not significant. There were no other notable differences between clinical HABIT and the intervention groups with respect to change from baseline to end of treatment on mood, anxiety, burden, or self-efficacy.
In the comparison analysis at 12 months (Table 2), there was a trend toward the no yoga group having a worsened outcome on our primary measure (partner burden) compared to the clinical HABIT sample (ES=-0.32; 95% CI, -0.58 to -0.06; adjusted p=0.080), although this was not statistically significant after adjusting p-values for multiple comparisons.

Table 2. Differences in Caregiver Outcomes at 12 Months Compared to Clinical HABIT Program Participants

Abbreviations: HABIT, Healthy Action to Benefit Independence and Thinking; CI, confidence interval; WE, wellness education; CCT, computerized cognitive training; SG, support group; MSS, memory support system. Differences in effect sizes are interpreted such that experimental groups with a negative difference had worse partner outcomes at 12 months compared to the clinical HABIT group. Effect sizes were estimated from longitudinal mixed effects regression models, in which a 1-unit increase in the effect size corresponds to a 1 standard deviation (SD) improvement in partner outcome from baseline. Baseline SDs from the 5 experimental groups were used for the effect sizes: 6.77 for burden, 5.45 for QOL (quality of life), 6.10 for mood, 4.88 for anxiety, and 3.21 for self-efficacy. Adjusted p values were computed using the Holm method for multiple comparisons based on 5 tests.


In analysis at 12 months of the other secondary outcome measures (Table 2), withholding wellness (ES=-0.55, 95% CI, -0.84 to -0.25, adjusted p=0.013) and withholding yoga (ES=-0.44, 95% CI, -0.73 to -0.16, adjusted p=0.013) each had a significant negative impact on partner anxiety compared to the clinical HABIT sample. There was a trend toward withholding yoga having a negative impact on mood compared to clinical HABIT (ES=-0.36, 95% CI, -0.66 to -0.06, adjusted p=0.10), however this was not statistically significant after adjusting for multiple comparisons. There were no significant differences in QOL or self-efficacy at 12 months when intervention components were withheld in comparison to the full HABIT program. The course of change over time for each of the 5 outcome measures and each study group are illustrated in Figure 1.

Figure 1. Effect Sizes

Effect sizes were estimated from longitudinal mixed effects regression models, in which a 1-unit increase in the effect size corresponds to a 1 standard deviation (SD) improvement in caregiver outcome. Baseline SDs from the 5 study arms with one HABIT component removed were 6.77 for burden (A), 5.45 for QOL (quality of life) (B), 6.10 for mood (C), 4.88 for anxiety (D), and 3.21 for self-efficacy (E). Abbreviations: EOT, end of treatment; CCT, computerized cognitive training; MSS, memory support system. Error bars represent 95% confidence intervals for the effect sizes.



Psychological distress and reduced quality of life among care partners of those with MCI are related to neuropsychiatric symptoms, executive functioning deficits, and memory dysfunction in their loved one with MCI (3, 5, 25). The HABIT program for pwMCI aims to impact these care partner symptoms as well as create healthy lifestyle and coping habits in pwMCI. We have reported on primary patient outcomes elsewhere (13). Our aim with this report is to compare the impact on care partners by providing five behavioral treatments (MSS, CCT, yoga, wellness, and support group) in a multicomponent program for pwMCI and their care partners.
Priority care partner outcomes measured in the study, as determined by previous participants, were burden (primary), quality of life, mood, anxiety, and self-efficacy (17, 26). Among all study arms, partner burden, mood, anxiety, quality of life, and self-efficacy remained stable or improved by end of treatment but varied by group at 12 months after the intervention (Figure 1). Across measures, partner outcomes were stable (or improved, for anxiety) at 12 months in the full clinical program (after adjustment for baseline differences between the intervention groups and the clinical sample) but showed variable worsening in the arms with an intervention withheld. For anxiety specifically, worsening was significant if either yoga or wellness was withheld. Partner burden and mood also trended toward worsening in the group with yoga withheld.
Our findings were partially supportive of our hypothesis. When compared to a clinical HABIT program inclusive of all five components, partners who did not receive wellness or yoga showed worsened anxiety (and a trend toward greater burden and worsened mood) at 12 months. However, there was no difference in partner QoL across interventions. The two intervention components that impacted anxiety, and to a lesser extent burden, provide knowledge and skills for ongoing self-care and resiliency directly to the care partner in addition to the pwMCI. The cognitive exercise and cognitive rehabilitation components of the program were mainly aimed at helping the pwMCI initiate new behavioral habits that promote independence and maintain cognitive functioning, while the support group component encouraged the pursuit of emotional support and community building.
The concepts of self-care and resilience, which are broadly applied in HABIT, are perceived as core aspects of yoga and wellness (27). Our yoga component, while teaching physical exercise with yoga poses, also taught relaxation and mindfulness practices. Wellness encouraged self-care practices such as eating well, exercising, getting enough sleep, monitoring moods, and staying connected socially. This helps explain why these interventions may have had more impact on care partners’ sense of burden and anxiety as time went on. Recent studies have shown yoga interventions to be associated with greater improvements in mood and wellbeing compared to other exercise regimens (27, 28). Likewise, engagement in meaningful activities combined with psychoeducational materials (covering information about MCI and what to expect, and healthy lifestyle materials) improves mood and decreases burden among caregivers of pwMCI (29, 30). The impact of both yoga and wellness activities on outcomes in both the pwMCI (16) and their care partners is promising, considering their easy accessibility. From a patient-centered behavioral intervention perspective, this is reassuring.
Baseline levels of anxiety and mood symptoms were higher, and self-efficacy was lower, in our clinical sample than in our experimental sample. It is possible that the clinical care partner sample had a higher distress level as a result of worse cognitive function among their persons with MCI. Regardless, this difference cannot entirely account for the observed results, as our statistical method factored these baseline differences into the study model.


Due to our innovative study design, interpretation of our results in comparison to other studies requires caution. The subtraction approach employed in the intervention study approximates the cost of not receiving an intervention. By comparison with the clinical sample data, we infer that an intervention component makes a significant contribution to the outcomes of interest given the impact on those outcomes when that intervention is absent. Although studies have shown that support groups and compensatory cognitive rehabilitation reduce burden, improve quality of life, and reduce psychological distress among care partners in the short term (12, 29, 30), it is possible that these interventions do not yield the same results on longer-term outcomes. This could be in part due to the increased demands of caring for a family member with progressive cognitive impairment. Additionally, our study sample was drawn from a predominantly non-Hispanic white population with a high educational level. Also, all outcomes of interest were subjective self-reports; no objective, performance-based measures were included to support our findings.
While our effect sizes are modest and our findings may not extend to other forms of behavioral interventions (e.g., different versions of wellness, physical activity, and support group therapy), we offer our results to encourage multimodal trials among care partners of pwMCI. Enhancing knowledge and skills early in the course of a progressive process may not only alter its trajectory but also increase hope. Further research can examine the effect of different modalities of physical activity (e.g., resistance training or a different form of yoga) combined with different types of group therapy and wellness interventions.


Support/Funding: Research reported in this manuscript was primarily funded through a Patient-Centered Outcomes Research Institute (PCORI) Award (CER-1306-01897). The statements in this publication are solely the responsibility of the authors and do not necessarily represent the views of the Patient-Centered Outcomes Research Institute (PCORI), its Board of Governors or Methodology Committee. Additional support for DECL: NIA P30AG19610, NIA R01 AG031581, the Arizona Alzheimer’s Research Consortium, the Ralph J. Wilson Foundation Development Gift to Mayo Clinic. Additional support for GES: NIA P50AG47266, State of Florida Ed and Ethel Moore program.

Sponsor’s Role: The sponsor had no role in the design and conduct of the study; in the collection, analysis, and interpretation of data; in the preparation of the manuscript; or in the review or approval of the manuscript.

Disclosure statement: No conflicts of interest were reported by any of the authors.

Author Contribution: Study concept and design: D.E.C.L., M.C., J.E.C., C.T.B., V.P., G.E.S. Data acquisition and interpretation: All authors; Statistical analysis: C.T.B., J.E.C. Manuscript draft: P.A.A., D.E.C.L. Critical revision of manuscript: P.A.A., D.E.C.L., M.C., J.E.C., C.T.B., G.E.S.

Approval of final manuscript: All authors.

Data availability statement: The data that support the findings of this study are available from the corresponding author, D.E.C.L., upon reasonable request.

Trial registration: Identifier: NCT02265757.

IRB: Institutional Review Boards at the Mayo Clinic (14-000885) and University of Washington (49235).



1. M. S. Albert et al., “The diagnosis of mild cognitive impairment due to Alzheimer’s disease: Recommendations from the National Institute on Aging-Alzheimer’s Association workgroups on diagnostic guidelines for Alzheimer’s disease,” Alzheimer’s and Dementia. 2011.
2. S. T. Farias, D. Mungas, B. R. Reed, D. Harvey, D. Cahn-Weiner, and C. DeCarli, “MCI is associated with deficits in everyday functioning,” Alzheimer Dis. Assoc. Disord., 2006.
3. K. Seeher, L. F. Low, S. Reppermund, and H. Brodaty, “Predictors and outcomes for caregivers of people with mild cognitive impairment: A systematic literature review,” Alzheimer’s and Dementia. 2013.
4. M. Paradise, D. Mccade, I. B. Hickie, K. Diamond, S. J. G. Lewis, and S. L. Naismith, “Caregiver burden in mild cognitive impairment,” Aging Ment. Heal., 2015.
5. C. Ikeda et al., “Difference in determinants of caregiver burden between amnestic mild cognitive impairment and mild Alzheimer’s disease,” Psychiatry Res., 2015.
6. D. Gallagher et al., “Dependence and caregiver burden in Alzheimer’s disease and mild cognitive impairment,” Am. J. Alzheimers. Dis. Other Demen., 2011.
7. J. W. Williams, B. L. Plassman, J. Burke, T. Holsinger, and S. Benjamin, “Preventing Alzheimer’s disease and cognitive decline,” Evidence Report/Technology Assessment, Number 193, Ann. Intern. Med., 2010.
8. K. Hepburn, M. Lewis, J. Tornatore, C.W. Sherman, and K.L. Bremer, “The savvy caregiver program: The demonstrated effectiveness of a transportable dementia caregiver psychoeducation program,” J. Gerontol. Nurs., 2007.
9. L. Rozzini, D. Costardi, V. Chilovi, S. Franzoni, M. Trabucchi, and A. Padovani, “Efficacy of cognitive rehabilitation in patients with mild cognitive impairment treated with cholinesterase inhibitors,” Int. J. Geriatr. Psychiatry, 2007.
10. S. Belleville, B. Gilbert, F. Fontaine, L. Gagnon, É. Ménard, and S. Gauthier, “Improvement of episodic memory in persons with mild cognitive impairment and healthy older adults: Evidence from a cognitive intervention program,” Dement. Geriatr. Cogn. Disord., 2006.
11. N. T. Lautenschlager et al., “Effect of physical activity on cognitive function in older adults at risk for Alzheimer disease: A randomized trial,” JAMA – J. Am. Med. Assoc., 2008.
12. M. C. Greenaway, N. L. Duncan, and G. E. Smith, “The memory support system for mild cognitive impairment: Randomized trial of a cognitive rehabilitation intervention,” Int. J. Geriatr. Psychiatry, 2013.
13. M. J. Chandler et al., “Computer versus compensatory calendar training in individuals with mild cognitive impairment: Functional impact in a pilot study,” Brain Sci., 2017.
14. A. V. Cuc et al., “A pilot randomized trial of two cognitive rehabilitation interventions for mild cognitive impairment: caregiver outcomes,” Int. J. Geriatr. Psychiatry, 2017.
15. G. Smith et al., “Behavioral Interventions to Prevent or Delay Dementia: Protocol for a Randomized Comparative Effectiveness Study,” JMIR Res. Protoc., 2017.
16. M. J. Chandler et al., “Comparative Effectiveness of Behavioral Interventions on Quality of Life for Older Adults With Mild Cognitive Impairment,” JAMA Netw. Open, 2019.
17. G. E. Smith, M. Chandler, J. A. Fields, J. Aakre, and D. E. C. Locke, “A Survey of Patient and Partner Outcome and Treatment Preferences in Mild Cognitive Impairment,” J. Alzheimer’s Dis., 2018.
18. J. C. Morris, “The Clinical Dementia Rating (CDR): Current version and scoring rules,” Neurology, 1993.
19. M. F. Folstein, S. E. Folstein, and P. R. McHugh, “‘Mini-mental state’. A practical method for grading the cognitive state of patients for the clinician,” J. Psychiatr. Res., 1975.
20. M. Bédard, D. W. Molloy, L. Squire, S. Dubois, J. A. Lever, and M. O’donnell, “The Zarit Burden Interview: A new short version and screening version,” Gerontologist, 2001.
21. L. S. Radloff, “The CES-D Scale: A Self-Report Depression Scale for Research in the General Population,” Appl. Psychol. Meas., 1977.
22. S. R. Wisniewski et al., “The Resources for Enhancing Alzheimer’s Caregiver Health (REACH): Project design and baseline characteristics,” Psychol. Aging, 2003.
23. R. G. Logsdon, L. E. Gibbons, S. M. McCurry, and L. Teri, “Assessing quality of life in older adults with cognitive impairment,” Psychosom. Med., 2002.
24. L. I. Pearlin, J. T. Mullan, S. J. Semple, and M. M. Skaff, “Caregiving and the stress process: An overview of concepts and their measures,” Gerontologist, 1990.
25. K. A. Ryan, A. Weldon, C. Persad, J. L. Heidebrink, N. Barbas, and B. Giordani, “Neuropsychiatric symptoms and executive functioning in patients with mild cognitive impairment: Relationship to caregiver burden,” Dement. Geriatr. Cogn. Disord., 2012.
26. P. G. Barrios et al., “Priority of Treatment Outcomes for Caregivers and Patients with Mild Cognitive Impairment: Preliminary Analyses,” Neurol. Ther., 2016.
28. C. C. Streeter et al., “Effects of Yoga Versus Walking on Mood, Anxiety, and Brain GABA Levels: A Randomized Controlled MRS Study,” J. Altern. Complement. Med., 2010.
29. N. S. Domingues, P. Verreault, and C. Hudon, “Reducing Burden for Caregivers of Older Adults With Mild Cognitive Impairment: A Systematic Review,” American Journal of Alzheimer’s Disease and other Dementias. 2018.
30. Ö. Küçükgüçlü, B. Akpınar Söylemez, G. Yener, and A. T. Işık, “The effects of support groups on dementia caregivers: A mixed method study,” Geriatr. Nurs. (Minneap)., 2018.

A Novel Study Paradigm for Long-term Prevention Trials in Alzheimer Disease: The Placebo Group Simulation Approach (PGSA). Application to MCI data from the NACC database

M. Berres1, W.A. Kukull2, A.R. Miserez3, A.U. Monsch4, S.E. Monsell1, R. Spiegel4 for the Alzheimer’s Disease Neuroimaging Initiative*

1. University of Applied Sciences Koblenz, RheinAhrCampus Remagen, Remagen, Germany; 2. National Alzheimer’s Coordinating Center (NACC), Department of Epidemiology, University of Washington, Seattle, USA; 3. diagene Laboratories Inc., Reinach, Switzerland; 4. University Hospital Department of Geriatrics, Memory Clinic, Basel, Switzerland.

Corresponding Author: René Spiegel, University Hospital Department of Geriatrics, Memory Clinic, Schanzenstrasse 55, CH 4031 Basel, Switzerland. Email:


J Prev Alz Dis 2014;1(2):99-109

Published online November 4, 2014,


INTRODUCTION: The PGSA (Placebo Group Simulation Approach) aims at avoiding problems of sample representativeness and ethical issues typical of placebo-controlled secondary prevention trials with MCI patients. The PGSA uses mathematical modeling to forecast the distribution of quantified outcomes of MCI patient groups based on their own baseline data established at the outset of clinical trials. These forecasted distributions are then compared with the distribution of actual outcomes observed on candidate treatments, thus substituting for a concomitant placebo group. Here we investigate whether a PGSA algorithm that was developed from the MCI population of ADNI 1* can reliably simulate the distribution of composite neuropsychological outcomes from a larger, independently selected MCI subject sample.

METHODS: Data available from the National Alzheimer’s Coordinating Center (NACC) were used. We included 1523 patients with single or multiple domain amnestic mild cognitive impairment (aMCI) and at least two follow-ups after baseline. In order to strengthen the analysis and to verify whether there was a drift over time in the neuropsychological outcomes, the NACC subject sample was split into 3 subsamples of similar size. The previously described PGSA algorithm for the trajectory of a composite neuropsychological test battery (NTB) score was adapted to the test battery used in NACC. Nine demographic, clinical, biological and neuropsychological candidate predictors were included in a mixed model; this model and its error terms were used to simulate trajectories of the adapted NTB.

RESULTS: The distributions of empirically observed and simulated data after 1, 2 and 3 years were very similar, with some over-estimation of decline in all 3 subgroups. By far the most important predictor of the NTB trajectories is the baseline NTB score. Other significant predictors are the MMSE baseline score and the interactions of time with ApoE4 and FAQ (functional abilities). These are essentially the same predictors as determined for the original NTB score.

CONCLUSION: An algorithm comprising a small number of baseline variables, notably cognitive performance at baseline, forecasts the group trajectory of cognitive decline in subsequent years with high accuracy. The current analysis of 3 independent subgroups of aMCI patients from the NACC database supports the validity of the PGSA longitudinal algorithm for a NTB. Use of the PGSA in long-term secondary AD prevention trials deserves consideration.

Key words: Placebo Group Simulation Approach (PGSA), clinical AD trials, phase 3 clinical trials, MCI, modelling AD trajectories, prodromal AD



We have recently proposed the PGSA (Placebo Group Simulation Approach) as a novel study design to partly substitute for long-term placebo-controlled secondary prevention trials in patients with mild cognitive impairment (MCI) (1). The PGSA uses study participants’ own demographic, baseline clinical, biological and neuropsychological data to model and forecast the distribution of the natural course of relevant cognitive outcomes of untreated patients included in clinical trials. These model-based outcomes are subsequently compared with the actual outcomes observed on the experimental treatment being tested, and may thus substitute for the real observed outcomes of a concomitant placebo group. Differences, positive or negative, between the forecasted outcomes and the outcomes on experimental treatment may then be ascribed to the effect of the drug being tested. According to our proposal, the PGSA should be considered in advanced stages of drug development, typically in Phase 3 long-term studies with candidate disease-course-altering compounds, assuming (1) that the expected biological and clinical effects of the experimental drug have been supported in preceding clinical trials on the target population (subjects with prodromal AD or MCI due to AD (2, 3)), and (2) that the experimental drug’s projected risk-benefit profile is suggestive of a successful completion of its clinical development.
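The core comparison step, forecasting a placebo distribution from the participants' own baseline data and then locating the observed treated outcome within it, can be sketched minimally as follows. All numbers, the one-step linear forecast, and the function names are hypothetical illustrations, not the PGSA algorithms themselves (those are the mixed models described in the Methods):

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_placebo_means(baseline, drift, noise_sd, n_runs=500):
    """Forecast the distribution of untreated group means from baseline
    scores. 'drift' and 'noise_sd' stand in for a fitted forecasting
    model; the values used below are illustrative, not ADNI 1 estimates."""
    n = len(baseline)
    # one simulated follow-up score per patient per run
    sims = baseline + drift + rng.normal(0.0, noise_sd, size=(n_runs, n))
    return sims.mean(axis=1)   # one simulated placebo group mean per run

# hypothetical trial: 100 MCI patients with composite z-scores at baseline
baseline = rng.normal(-0.5, 0.8, size=100)
sim_means = simulate_placebo_means(baseline, drift=-0.2, noise_sd=0.4)

# hypothetical observed group mean on the experimental treatment
observed_treated_mean = -0.55

# empirical one-sided p-value: how often the simulated (untreated) group
# performs at least as well as the group on treatment
p = float(np.mean(sim_means >= observed_treated_mean))
print(f"simulated placebo mean {sim_means.mean():.2f}, empirical p = {p:.3f}")
```

A small p-value here would correspond to a treated-group outcome that lies above almost the entire forecasted placebo distribution, i.e., a difference one might ascribe to the drug.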

Under such circumstances the PGSA can help to resolve two problems that are typically encountered in late stages of clinical development of disease-course-altering drugs against AD and other progressive degenerative brain disorders: (1) the problem of subject sample representativeness (some patients and/or their carers will not consent to participate in a placebo-controlled long-term trial once they fully understand what placebo control means), and (2) the ethical dilemma faced by investigators who are obliged to expose patients at high risk of developing dementia to months and years of foreseeably useless placebo treatment, although they have a potentially effective therapy at hand. For ethically problematic situations like this, the ICH E10 guideline (4) considers using an external (historical) control if the course of the disease is predictable in a given group of patients. Similar considerations apply if the new therapy is to be compared with a standard therapy that is known to have insufficient efficacy. Finally, if a standard therapy is available and a three-armed trial is desirable, but not ethically feasible, a simulated placebo group based on the PGSA could replace a real one.

The PGSA has raised some interest in the AD community (5), although questions regarding the generalizability of our published PGSA algorithms remain open. It has also been noted that trials using the PGSA and lacking a concomitant placebo group will not reliably answer questions of drug safety and tolerability, and that there is no randomized allocation of drug and placebo to study participants. Our first study (1) demonstrated the internal validity of the PGSA in the MCI population of the Alzheimer Disease Neuroimaging Initiative (ADNI 1 (6, 7)). The present study deals with the external validity of the PGSA, i.e., with the question of whether a PGSA model that was developed using ADNI 1 MCI data can reliably be applied to simulate post-baseline neuropsychological performance data of another large, independently selected MCI subject sample, such that the simulated data coincide in distribution with the actually observed data.


Material and Methods

A database from the Uniform Data Set (UDS) collected at 33 AD centers in the USA was available from the National Alzheimer’s Coordinating Center (NACC; data collected under cooperative agreement No. U01 AG016976; for details see Appendix). We used results from the data freeze of March 19, 2014, and, in a first step, excluded those participants who were also included in the ADNI 1 database. The remaining dataset contained 4602 subjects with a baseline diagnosis of single or multiple domain amnestic mild cognitive impairment (aMCI). Patients younger than 55 or older than 90 years were excluded to conform to the ADNI 1 database, reducing the dataset to 4466 patients. As we intended to apply a longitudinal model that includes the number of E4 alleles as a potential predictor, eventually 1523 patients who had been genotyped for ApoE and had at least one follow-up visit between 2 and 3.5 years after baseline were eligible for the analysis. Their demographic and other baseline characteristics are essentially the same as those of the whole NACC aMCI subject sample not restricted by ApoE4 availability (Table 1). Subjects underwent repeated evaluation of their cognitive function, about one year apart. Since our published PGSA models are based on ADNI 1 data not extending beyond 3 years (8), we did not consider NACC data from visits occurring more than 3.5 years after baseline.

Data available from the 1523 aMCI patients were collected between 2005 and 2014. In order to strengthen the analysis and to verify whether there was a drift over time in the neuropsychological outcomes, the subject sample was split into 3 subsamples of similar size (see Table 3 below): a group of “early” subjects (baseline data collected between July 29, 2005 and August 10, 2006), “mid” subjects (baseline data collected between August 11, 2006 and December 2, 2007) and “recent” subjects (baseline data collected between December 3, 2007 and September 30, 2011).

The ADAScog as a potential cognitive outcome is not available in the NACC database. For this reason the relevant outcome used in the current analyses is the trajectory of the composite score of a Neuropsychological Test Battery (NTB), computed from the z-scores of a set of standard tests of mental performance. Details on the UDS NTB and its factor structure are provided in Weintraub et al (9) and Hayden et al (10). In the absence of the ADAScog we preferred the NTB scores to the CDR-SB (Clinical Dementia Rating sum of boxes (11)), also contained in the NACC database, since the latter is based on a metric similar to an ordinal scale, which makes it less suitable for the linear regression models used in the current analysis. In addition, within MCI the numerical range of the CDR-SB is quite restricted, as seen, e.g., in Cedarbaum et al. (11, Table 1), and that lack of variability reduces the ability to detect small changes that can be represented in scores like the NTB.

It had been our intention to include the same set of 9 neuropsychological tests for analysis of the NACC NTB that were integrated in the composite NTB score originating from the ADNI 1 database. However, one of the tests (delayed recall from the Auditory Verbal Learning test, AVLT) was not available in the NACC database, and for one other test (Logical Memory delayed recall from the Wechsler Memory Scale, WMS) several procedural details used in NACC differed significantly from those used in ADNI 1 (see below). Therefore, these two tests of the ADNI NTB could not be included in the current analysis (Textbox 1).

Textbox 1. Tests included in NTB-7

Z-scores of the remaining 7 neuropsychological variables were computed by subtracting the mean and dividing the resulting difference by the standard deviation of the ADNI 1 participants who were cognitively normal at study entry (8). The z-scores for the Trail Making Test B (measured in seconds to complete) were inverted so that high values reflect high ability. The summary score NTB-7 was computed if at least 6 of the 7 variables had valid data. Some individuals in NACC reported more than 20 years of education; to conform to the ADNI 1 data, these values were truncated to 20 years. Nine demographic, clinical, biological and neuropsychological variables were included as candidate predictors in the mixed model used for analysis (Textbox 2).
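The scoring rules just described can be made concrete in a short sketch. The function names and the normative mean/SD values are hypothetical; the actual norms come from the cognitively normal ADNI 1 participants:

```python
def z_score(value, norm_mean, norm_sd, invert=False):
    """z-score against a normative sample; invert for tests where higher
    raw values mean worse performance (e.g. Trail Making B, in seconds)."""
    z = (value - norm_mean) / norm_sd
    return -z if invert else z

def ntb7(z_scores):
    """Composite NTB-7: mean of up to 7 component z-scores, computed only
    if at least 6 of the 7 tests have valid (non-None) data."""
    valid = [z for z in z_scores if z is not None]
    if len(valid) < 6:
        return None
    return sum(valid) / len(valid)

def truncate_education(years):
    """Cap reported years of education at 20, as in the ADNI 1 data."""
    return min(years, 20)

# Trail Making B: 120 s against hypothetical norms (mean 80 s, SD 30 s)
tmt_b = z_score(120, 80, 30, invert=True)       # -(120-80)/30, i.e. about -1.33
scores = [0.1, -0.5, tmt_b, 0.3, None, -0.2, 0.0]  # 6 of 7 valid: composite computed
print(ntb7(scores), truncate_education(24))
```

With only 5 valid components, `ntb7` returns `None` and the visit contributes no composite score.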

Textbox 2. Candidate Predictor Variables Available at Baseline

As a first step in the current validation of the longitudinal PGSA model, we had to adapt the original PGSA model for the NTB score with 9 tests to the NTB-7 score. Our published PGSA models were based on data from the ADNI database available in October 2009 (8). For the current analysis we downloaded the now completed ADNI database and extracted the data records of patients who were classified as MCI at baseline in the original ADNI study, now called ADNI 1. This data set comprises 397 patients at baseline, 380 at month 6, 370 at month 12, 324 at month 18, 302 at month 24 and 251 at month 36. The procedure to develop the adapted mixed model was the same as described in (1):

  • The sample consists of the patients in the ADNI 1 database with a diagnosis of MCI at baseline;

  • The response is the composite score of the NTB, now computed as the mean of seven z-scores, abbreviated as NTB-7;

  • The potential predictors are those listed in Textbox 2.


The starting model used all predictor variables, time, the square of time and interactions with time as fixed effects, and allowed for patient-specific random intercepts and slopes. Random effects take into account the inter-correlations of NTB-7 scores within the same patient. Random intercepts model each patient’s deviation from the overall mean predicted from the model’s fixed effects (a linear combination of all predictors and of time). Random slopes model each patient’s individual slope to describe change over time. Standard deviations of random intercepts and slopes are available from the mixed model analysis.

Variable selection was based on the AIC criterion, whereas in the previous process of model development (1) variable selection was based on Wald tests. Backward stepping was first done for interactions with time, then for all main effects that were not involved in an interaction. Finally, each of the two-way interactions not involving time was tested for inclusion in the model. The final model was checked for normality and variance homogeneity of residuals. The model was then used to simulate NTB-7 scores at follow-up visits for the NACC data. The simulation comprises all aspects of randomness in the model parameters: for each patient, fixed effects were sampled from the multivariate distribution of the regression coefficients, and random effects were sampled from a centered bivariate normal distribution; independent error terms were simulated for each visit. Five hundred simulation runs were performed to demonstrate the stability of the procedure. Boxplots comparing observed and simulated scores are broken down by rounded years of follow-up. This simulation model reproduces the variance in the observed NTB-7 scores, whereas the predicted values from the model have considerably smaller variance.
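The simulation machinery, sampling fixed effects from the coefficient distribution, random intercepts and slopes per patient, and independent error terms per visit, can be sketched as follows. The analysis itself was done in R with nlme; this Python sketch uses made-up coefficient values, variances and predictors purely to illustrate the structure:

```python
import numpy as np

rng = np.random.default_rng(42)

# Stand-ins for the mixed-model estimates (illustrative values only,
# not the coefficients fitted on ADNI 1)
beta_hat = np.array([0.1, 0.9, -0.08])    # intercept, baseline NTB-7, ApoE4 x time
beta_cov = np.diag([0.01, 0.005, 0.002])  # covariance of the coefficient estimates
re_cov = np.array([[0.04, 0.00],          # random intercept / random slope covariance
                   [0.00, 0.01]])
sigma = 0.3                               # residual (per-visit error) SD

def simulate_run(baseline, apoe4, times):
    """One simulation run: fixed effects drawn from the coefficient
    distribution, a random intercept/slope pair per patient, and an
    independent error term per visit."""
    n = len(baseline)
    betas = rng.multivariate_normal(beta_hat, beta_cov, size=n)
    rand_eff = rng.multivariate_normal([0.0, 0.0], re_cov, size=n)
    y = np.empty((n, len(times)))
    for j, t in enumerate(times):
        X = np.column_stack([np.ones(n), baseline, apoe4 * t])
        y[:, j] = (np.sum(X * betas, axis=1)
                   + rand_eff[:, 0] + rand_eff[:, 1] * t
                   + rng.normal(0.0, sigma, size=n))
    return y

# hypothetical sample of 50 patients followed at years 1, 2 and 3
baseline = rng.normal(-0.5, 0.8, size=50)   # baseline NTB-7 z-scores
apoe4 = rng.integers(0, 3, size=50)         # number of ApoE4 alleles (0, 1 or 2)
times = np.array([1.0, 2.0, 3.0])

runs = np.stack([simulate_run(baseline, apoe4, times) for _ in range(500)])
medians = np.median(runs.mean(axis=1), axis=0)  # median simulated group mean per year
print(medians)
```

The 500 runs yield a distribution of simulated group trajectories, from which percentiles (5%, 25%, 50%, 75%, 95%) can be read off and compared with the observed boxplots.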

Analysis was performed using R (12) and, specifically, its nlme package for mixed-effects models.



Results

Comparison of ADNI 1 and NACC data at baseline

Although the NACC sample showed wider variation in some baseline characteristics at the lower end, the ADNI 1 and the NACC MCI samples were remarkably similar with regard to average age, years of education, cerebrovascular risk, body mass index, cognitive status (MMSE) and functional abilities in daily life (FAQ) (Table 1). In contrast, the NACC population contains fewer male subjects and fewer individuals with one or two ApoE4 alleles than the ADNI 1 sample (45.0% vs. 53.4%).

Table 1. Subject demographics

Despite some minor, statistically significant differences (p values not shown), the ADNI 1 and the NACC MCI patient samples were quite similar at baseline with regard to their performance on most tests included in the NTB (Table 2). However, this did not hold true for the Logical Memory test of the WMS, where NACC subjects performed on average almost 60% better than the ADNI 1 subjects (data not shown). The likely reason for this discrepancy is twofold: NACC subjects, but not ADNI 1 participants, were given a cue before being asked to recount the story of the Wechsler test, and the maximum performance of the ADNI 1 subjects was truncated by the investigators to 8 recalled items (S. Weintraub, Personal Communication). Given these differences, this test was not included in the NTB-7 (see Textbox 1).

Table 2. Summary statistics of neuropsychological scores at baseline in “All NACC”, “NACC included” and ADNI 1 samples

“All NACC” (N=4466) was obtained after excluding patients also found in the ADNI 1 database and patients younger than 55 or older than 90 years. 2589 of these subjects were ApoE genotyped. “NACC included” (N=1523) were ApoE genotyped and had at least one follow-up visit between 2 and 3.5 years after baseline. The column to the right shows the respective characteristics of ADNI 1 MCI patients for comparison.

All subsequent analyses were performed separately for the 3 subgroups of NACC aMCI patients recruited at different times between 2005 and 2011 (early, mid and recent). Tables 3 and 4 show demographic and neuropsychological data of these 3 subgroups; they are comparable in all respects, with one exception: the percentage of carriers of one or two ApoE4 alleles is higher in the recent subjects than in the early and mid ones.

Table 3. Demographics of 3 NACC subgroups (“early”, “mid” and “recent” subjects)

Table 4. Summary statistics of neuropsychological scores at baseline in 3 NACC subgroups

The “NACC included” subject sample (N=1523) was split into 3 subsamples of similar size: groups of “early”, “mid” and “recent” subjects, depending on the time their baseline data were collected.

Adaptation of the ADNI 1 model

The mixed model for NTB-7 at follow-up visits contains time, the candidate predictor variables (Textbox 2) and the interactions of time with these predictors as fixed effects. Random intercepts and random slopes were included because they account for inter-correlations and provided significant improvements of the model. The first step of fixed effects selection eliminated the interactions of time with categorized BMI, age, Hachinski modified score, MMSE, education and sex, in this order. The interactions of time with the number of ApoE4 alleles, FAQ and NTB-7 scores at baseline remained in the model. In the second step the effects of age, categorized BMI, education and Hachinski score were eliminated, in this order. None of the pairwise interactions not involving time entered the model in the final step. The resulting model was checked for normality and homogeneity of variances. Since the variances decrease with increasing predicted values, a power transformation for the error variance was included in the model. This resolved the problem and also yielded normally distributed residuals. The final model is shown in Table 5.
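The backward elimination by AIC can be illustrated generically. This sketch deliberately uses ordinary least squares instead of a mixed model, with synthetic data and an illustrative AIC formula, purely to show the stepping mechanics (drop whichever term lowers the AIC most; stop when no removal helps):

```python
import numpy as np

rng = np.random.default_rng(1)

def aic_ols(X, y):
    """AIC of an OLS fit: n*log(RSS/n) + 2*(k + 1), with k predictors."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = np.sum((y - X @ beta) ** 2)
    n, k = X.shape
    return n * np.log(rss / n) + 2 * (k + 1)

def backward_step(X, y, names):
    """Remove predictors one at a time while doing so lowers the AIC."""
    cols = list(range(X.shape[1]))
    kept = list(names)
    while len(cols) > 1:
        current = aic_ols(X[:, cols], y)
        # AIC of each candidate model with one column removed
        trials = [aic_ols(X[:, [c for c in cols if c != d]], y) for d in cols]
        best = int(np.argmin(trials))
        if trials[best] >= current:
            break                 # no removal improves the AIC: stop
        kept.pop(best)
        cols.pop(best)
    return kept

# synthetic data: only x0 truly influences the response
n = 200
X = rng.normal(size=(n, 3))
y = 2.0 * X[:, 0] + rng.normal(0.0, 0.5, size=n)
result = backward_step(X, y, ["x0", "x1", "x2"])
print(result)
```

In the paper's actual procedure this stepping is applied first to the time interactions, then to the remaining main effects, with AIC values from the fitted mixed model rather than OLS.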

Table 5. Fixed effects of the final mixed model for the NTB-7 estimated with the ADNI 1 subpopulation with an initial diagnosis of MCI

Comparison of the previously published longitudinal model for the NTB-9 (1) with the adapted model used to simulate performance on the NTB-7 within the ADNI 1 MCI dataset shows high consistency: in both models the strongest single predictor of NTB outcomes, by far, is the baseline score of the respective NTB (coefficient 0.908 for the NTB-7 compared with 0.975 for the NTB-9). Another significant, but quantitatively much less important predictor in both models is the interaction between the number of ApoE4 alleles and time (-0.080 for the NTB-7 versus -0.071 for the NTB-9). The square of time entered only the new model, presumably because more information is now available on later visits in the completed ADNI 1 data. The MMSE score at baseline also emerged as a significant, although numerically weak, predictor. The effects of time are not comparable, because time is involved in some interactions in both models. Other predictors in the two models cannot be compared because they do not enter into interactions with the same variables. The ADAScog was an important predictor in the previous model (1) but not in the present one, as the ADAScog is not available in the NACC database.

Simulation of the NACC outcome data

The coefficients of the fixed effects of the ADNI 1 model for the NTB-7 and their distribution were then used to simulate outcomes in the NACC aMCI dataset. All available follow-up visits between 2 and 3.5 years post-baseline and with a complete set of predictor variables were extracted from the NACC sample (5078 of 5345 visits had a complete set of predictor variables for the final model). Predictions could not be computed for patients with missing values. Random effects and error terms were generated as centered normal variates with variances obtained from the mixed model estimates.

Comparing observed and simulated values 1, 2 and 3 years after baseline

The time between visits was quite variable in the NACC database: some follow-up visits took place slightly earlier than half a year after baseline. These were grouped into the “1-year follow-up” by applying a cut-point of 0.45 years. The other cut-points used to separate the strata were 1.5, 2.5 and 3.5 years.
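The stratification by rounded follow-up year can be written as a small helper. The cut-points are those given in the text; the exact handling of visits landing on a boundary is an assumption for illustration:

```python
def follow_up_stratum(years_since_baseline):
    """Assign a follow-up visit to a rounded year using the cut-points
    0.45, 1.5, 2.5 and 3.5; the half-open intervals used here are an
    assumed boundary convention, not taken from the paper."""
    cuts = [(0.45, 1.5, 1), (1.5, 2.5, 2), (2.5, 3.5, 3)]
    for lo, hi, year in cuts:
        if lo <= years_since_baseline < hi:
            return year
    return None   # earlier than 0.45 or later than 3.5 years: not analysed

visits = [0.4, 0.48, 1.1, 1.9, 2.6, 3.4, 3.6]
print([follow_up_stratum(v) for v in visits])   # [None, 1, 1, 2, 3, 3, None]
```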

Boxplots (Figures 1-3) show the observed values at baseline and the decreasing numbers of observed values per follow-up for the “early”, “mid” and “recent” subgroups separately. The observed values are compared with boxplots of all simulated values (500 times the number of available visits) and with boxplots of the 5%, 25%, 50%, 75% and 95% percentiles from the 500 simulation runs. Inspection of Figures 1-3 reveals a continuous decline of cognitive performance, as indicated by a downward shift of the observed median z-values, in all 3 subgroups. This continuous decline is also seen in the simulated values, most clearly illustrated by the median z-values and their distributions. It will also be noted in Figures 1-3 that, overall, the correspondence between observed and simulated values is high, and that there is a small and slightly increasing overestimation of the cognitive decline in the simulated as compared to the observed values. This is also seen in Table 6, which presents the mean z-values for the 3 strata at baseline and after 1, 2 and 3 years of follow-up. The mean overestimation increases from 0.015 (one year) to 0.117 (three years).

Table 6. Comparison between observed and simulated mean NTB-7 scores for 3 NACC subgroups



Discussion

Our original paper on the development of the PGSA models (1) presented two predictive algorithms: a univariate model to simulate ADAScog scores cross-sectionally, and a multivariable model to forecast the averaged z-scores of a NTB longitudinally, i.e., MCI subjects’ cognitive trajectory over several years. Both models used demographic, biological, neuropsychological and clinical data from the ADNI 1 MCI database as potential predictors. A first, internal validation was performed by comparison of observed and simulated ADAScog and NTB-9 scores within the ADNI MCI database (1). Since ADAScog data are not available in the NACC database, the present analysis focuses on the longitudinal PGSA model for the NTB and attempts a validation by applying the ADNI model for the NTB to a larger set of NACC aMCI data. The original PGSA longitudinal model for the NTB-9 of ADNI 1 needed to be adapted because two neuropsychological tests were either not available in the NACC database or had to be omitted owing to relevant procedural differences between ADNI 1 and NACC. As a consequence, we adapted the longitudinal model, starting again from the meanwhile completed ADNI 1 database.

Although the omission of two proven tests of learning and memory in the new NTB-7 represents a significant loss of information, the published PGSA model for the NTB-9 from ADNI 1 underwent little change when re-calculated for the NTB-7 and applied to the updated ADNI 1 data. In both versions of the model the strongest predictor of the NTB trajectories, by far, are the baseline scores of the respective NTB: the baseline NTB-9 scores for the trajectory of the NTB-9, and the baseline NTB-7 scores for the trajectory of the NTB-7. A second significant predictor in both models is the interaction between the number of ApoE4 alleles and time: The more E4 alleles are present, the faster is the decline in NTB-7 scores. The FAQ score, another significant predictor in the model for the NTB-9, also emerged as a significant predictor, interacting with time, of the NTB-7 scores.

The adapted model for the NTB-7 was then applied to a large aMCI dataset from NACC. In order to strengthen the analysis and to check whether there was a trend related to the time of study entry in the neuropsychological measures, the NACC aMCI sample was split into 3 subsamples of similar size, called “early”, “mid” and “recent” depending on their time of recruitment. Visual comparison of Figures 1, 2 and 3 and inspection of Tables 4 and 6 provided two crucial observations: (1) there is no time trend in either the baseline values or the changes from baseline with regard to the mean NTB-7 z-scores; and (2) in all 3 subsets of data there was a high level of correspondence between the empirically observed and the model-based simulated values, supporting the validity of the longitudinal PGSA algorithm for the NTB-7.

However, close scrutiny of Figures 1-3 and Table 6 reveals a small but potentially relevant difference between the observed and the modelled values: there is some overestimation of cognitive decline in the simulated as compared to the observed data, which tends to increase over time. Before we discuss the significance of this observation, another phenomenon, also seen in Figures 1-3, needs to be highlighted: the conspicuously slow decline over time of the NTB-7 z-scores in both the observed and the simulated data.

Little cognitive decline in MCI subjects on placebo after 2 and 3 years was also seen in clinical drug trials with galantamine (13) and rivastigmine (14), and an unexpectedly low conversion rate to dementia even led to an extension of the rivastigmine trial from 3 to 4 years (14). As for the slow average decline of the NTB-7 scores in the current study, one needs to consider some specifics of the NACC aMCI sample included in the analysis. These subjects’ median MMSE score at baseline was 27.0 (third quartile 29.0; see Table 1), suggesting that many of them were in an early stage of MCI or were possibly misclassified at baseline as having aMCI. A diagnosis of aMCI is even more doubtful for those subjects who had NTB-7 z-scores consistently around or clearly above zero at baseline or later, i.e., who were in the range typical of cognitively healthy subjects (Figures 1-3). As is known from other studies (e.g., 15-17), subjects in very early stages of MCI are unlikely to deteriorate, or may revert to normal on follow-up, an observation that could partly explain the slow average decline noted in our analysis. Another factor that needs consideration is selective dropout of subjects with relatively poor performance at baseline: in longitudinal studies of cognitive performance, subjects with lower scores at baseline tend to deteriorate further and are then likely to drop out prematurely, and this selective dropout of poorer performers results in seemingly better average performance on follow-up among the individuals who remain in the study. This mechanism was likely operating in the current study, too, as can be seen from the decreasing numbers of subjects available 1, 2 and particularly 3 years after baseline.
In summary, the slow decline of cognitive performance in all 3 subsamples can probably be ascribed to two factors: a high percentage of subjects with little or no cognitive impairment at baseline included in the NACC aMCI database, and selective dropout primarily of those subjects who had relatively poor cognitive performance at baseline.
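The selective-dropout mechanism can be illustrated with a small simulation (all thresholds, probabilities and effect sizes below are made up for illustration): if poor baseline performers are more likely to drop out before follow-up, the follow-up mean of the completers looks better than the true cohort mean, even though every subject declines by the same amount on average.

```python
import random

random.seed(1)

n = 10_000
baseline = [random.gauss(0.0, 1.0) for _ in range(n)]
# Hypothetical uniform true decline of 0.3 z-score units plus noise
followup = [b - 0.3 + random.gauss(0.0, 0.3) for b in baseline]

# Hypothetical dropout rule: poor baseline performers (z < -0.5)
# drop out before follow-up with 60% probability
retained = [f for b, f in zip(baseline, followup)
            if not (b < -0.5 and random.random() < 0.6)]

true_mean = sum(followup) / len(followup)      # close to -0.3
observed_mean = sum(retained) / len(retained)  # noticeably higher
```

The gap between `observed_mean` and `true_mean` is driven entirely by who is retained, not by any real difference in decline, which is exactly the bias described above.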

Figure 1. Observed (obs) and simulated (sim) NTB-7 scores for the subsample of early recruited individuals with complete predictor information in NACC. Boxplots of observed values at baseline and after 1, 2, and 3 years (+/- 6 months), to be compared with boxplots of simulated scores (500 simulations). Each percentile boxplot summarizes 500 percentiles

Figure 2. Observed (obs) and simulated (sim) NTB-7 scores for the subsample of mid-time recruited individuals with complete predictor information in NACC. Boxplots of observed values at baseline and after 1, 2, and 3 years (+/- 6 months), to be compared with boxplots of simulated scores (500 simulations). Each percentile boxplot summarizes 500 percentiles

Figure 3. Observed (obs) and simulated (sim) NTB-7 scores for the subsample of recently recruited individuals with complete predictor information in NACC. Boxplots of observed values at baseline and after 1, 2, and 3 years (+/- 6 months), to be compared with boxplots of simulated scores (500 simulations). Each percentile boxplot summarizes 500 percentiles

With regard to the small but monotonically increasing overestimation of cognitive decline in the simulated data, one has to assume that selective dropout also played its part, and that NTB scores are missing not at random, i.e., missingness depends on the unobserved values and cannot be predicted from (previous) observed values. To be more concrete, we suppose that some subjects did not attend a subsequent visit when their cognition had deteriorated after the previous visit. The numbers on top of Figures 1-3 suggest that this may have happened more frequently in the “mid” and “late” subgroups between the 2nd and the 3rd year, which would then explain why lower values on late visits are underrepresented.
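A missing-not-at-random pattern of this kind can be sketched in a few lines (the rule and its probability are arbitrary assumptions): each subject's year-3 score goes unobserved with elevated probability when that score itself shows a marked drop. The observed year-3 mean then understates the true decline, so a model calibrated on fuller data appears to overestimate decline relative to the observed values.

```python
import random

random.seed(2)

n = 10_000
baseline = [random.gauss(0.0, 1.0) for _ in range(n)]
year3 = [b - 0.3 + random.gauss(0.0, 0.5) for b in baseline]

# Missing not at random (hypothetical rule): a subject tends to skip the
# year-3 visit when the year-3 score itself has dropped markedly, so
# missingness depends on the unobserved value, not on earlier visits.
observed = [y for b, y in zip(baseline, year3)
            if not (y < b - 0.5 and random.random() < 0.7)]

true_decline = sum(year3) / n - sum(baseline) / n        # close to -0.3
observed_decline = sum(observed) / len(observed) - sum(baseline) / n
```

Under this rule, `observed_decline` is clearly smaller in magnitude than `true_decline`, mirroring the underrepresentation of low values on late visits noted in the figures.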

If these considerations apply, two consequences for the application of the longitudinal PGSA model for an NTB in future trials need to be drawn: one relates to the selection of aMCI patient samples with regard to their cognitive performance at baseline, the other to the proportion of subjects who are retained in a study. Thus, if many “very mild” aMCI patients (or subjects without aMCI) are included in a patient sample, then a clinical trial with a potential disease-course-altering drug that uses the current longitudinal PGSA model risks providing false-positive results (because there will be less cognitive decline on the drug than forecasted by the model). Consequently, our observations support the notion that a diagnosis of aMCI (or prodromal AD) needs to be backed by thorough testing of cognitive function (18) and, as far as acceptable to trial participants, by relevant biological markers (2, 19). The second consequence relates to patient retention and dropout in a clinical trial: the current PGSA algorithm for a composite NTB presupposes high retention rates (around 80 percent, see Figures). If, however, a significantly higher proportion of MCI patients with relatively poor cognitive performance at baseline are included in and then lost during a study, then NTB forecasts based on the current longitudinal PGSA algorithm will tend to overestimate cognitive decline.

Returning to the main objective of the current study, i.e. an examination of the generalizability of a PGSA algorithm for MCI patients originating from the ADNI 1 database, it is fair to say that the longitudinal PGSA model for the NTB was supported by 3 independent subsets of aMCI data from NACC.

What inferences can be made from this work with regard to the proposed use of the PGSA models in clinical studies with candidate anti-AD drugs? Can one consider the PGSA a valid alternative to traditional long-term RPCTs in advanced stages of clinical AD drug development? As noted earlier, development of the PGSA was undertaken with two major goals in mind: a scientific one and an ethical one. The scientific goal is to improve the representativeness of patient samples recruited for clinical trials with anti-AD drugs, i.e., not to exclude those study candidates who would decline participation in a trial with prolonged placebo administration. In this regard a trial using the PGSA will be closer than an RPCT to everyday clinical reality, where drug treatment is usually prescribed without randomization to active medication and placebo, blinded assessments, etc. The ethical goal is to avoid or minimize the extended use of placebo in aMCI or prodromal AD patients, given their high risk of developing dementia within a few years. As an additional benefit of the PGSA, one should also mention that recruitment of candidates into and retention of participants in PGSA-based trials will be easier than for RPCTs, an economic aspect of clinical studies that must not be neglected (20). As noted by Grill & Karlawish (21), spouses of potential AD study participants are closely involved in their decision as to whether to enter and continue in a trial; according to these authors the “possibility of receiving placebo is a barrier to participation”.

We have previously argued that the PGSA makes better use of large and carefully collected MCI and AD databases such as those available from ADNI and NACC (22) than merely treating these subjects as “historical controls” in clinical trials. Thus, rather than using historical data for plain descriptive comparisons (see, e.g., 23), deriving predictive models for disease trajectories and endpoints from such data – models that can be used to simulate relevant outcomes of new MCI and AD patient samples adjusted for their baseline data – appears to be a more promising approach. Since the PGSA simulation model incorporates all sources of variation, it has been able to reproduce the distribution of observed outcomes.
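As a sketch of what “incorporating all sources of variation” means in practice, the simulation step can be written as repeated draws around a model forecast, so that a full distribution of plausible outcomes is produced for each subject rather than a single point prediction. The slope and noise level below are placeholders, not the fitted PGSA parameters.

```python
import random

random.seed(3)

def simulate_outcomes(baseline_scores, n_sims=500, slope=-0.1, sigma=0.4, years=3):
    """Hypothetical sketch of the PGSA simulation step: for each subject,
    draw n_sims plausible follow-up scores around the model forecast.
    Residual noise (sigma) makes the simulated values reproduce an
    outcome distribution, not just its mean."""
    sims = []
    for b in baseline_scores:
        forecast = b + slope * years  # model-based expected value
        sims.append([forecast + random.gauss(0.0, sigma) for _ in range(n_sims)])
    return sims

cohort = [random.gauss(0.0, 1.0) for _ in range(200)]
sims = simulate_outcomes(cohort)  # 200 subjects x 500 simulated outcomes
```

Comparing the distribution of such simulated scores with the empirically observed scores, as in the boxplots of Figures 1-3, is then the basis for judging whether a treated sample deviates from its forecast natural course.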

What concrete applications can be foreseen for this proposed novel clinical study design? In our first publication on the PGSA (1) we suggested that the PGSA be preferably used in late stages of clinical drug development, and that PGSA-based studies may partly substitute for traditional long-term RPCTs. We maintain this proposition, but must add one more caveat: the results of the current analysis suggest that the longitudinal PGSA model (Table 5) tends to overestimate the cognitive decline of untreated aMCI subjects, and thus carries some risk of producing false-positive results in treatment studies. This risk obviously increases when interventions with only marginal effects are tested, as seems to be the case with some current developments (24, 25). As a consequence, one either has to wait until more effective interventions become available for long-term trials, or one will have to resort to so-called enriched (26) or even very-high-risk patient samples (27), where ethical concerns may render long-term placebo exposure of patients very difficult.

RPCTs of adequate size and duration are and will remain the pivotal elements in some critical phases of clinical drug development programs, because only an RPCT can provide unbiased information on the efficacy, safety and tolerability of a new treatment. On the other hand, RPCTs have serious limitations of their own, and not all clinical studies in late stages of drug development need to be RPCTs. Trials using the PGSA can provide important supportive evidence of a new drug’s effectiveness from large numbers of persons with a diagnosis of aMCI or prodromal AD who – for whatever reasons – decline participation in RPCTs or have to be excluded from them.



The NACC aMCI data set used in this validation study is about 3 times larger and less narrowly defined than the ADNI 1 MCI data set, and not all neuropsychological tests included in the NP-Batt9 for ADNI 1 are contained in the NTB-7 defined for NACC. Despite these differences, an algorithm comprising a small number of demographic, biological, neuropsychological and clinical variables routinely available at the outset of studies, notably cognitive performance at baseline, is able to forecast the average trajectory of cognitive decline over the subsequent three years in 3 independent aMCI subsamples from the NACC database. One notes some small, monotonically increasing overestimation of cognitive decline, which might result in false-positive outcomes when applying the PGSA algorithm in studies with only marginally effective drugs. Practical measures for dealing with this risk in future intervention trials are described. We suggest using the PGSA as part of late-stage anti-AD drug development programs where ethical problems may arise with prolonged placebo treatment.


Competing Interests: MB, ARM, AUM and RS are founders and owners of a small business that offers services to drug companies and CROs based on the PGSA. WAK and SEM declare no conflict of interest.

Authors’ contributions: MB developed the mathematical models underlying the PGSA. WAK and SEM were instrumental in making the NACC database available and answered many technical queries. ARM and AUM made important contributions to the development of the PGSA and to the current manuscript. RS is the originator of the principle of the PGSA and wrote major parts of the manuscript.

Authors’ information: MB is Professor of Mathematics and Statistics, RheinAhrCampus Remagen, University of Applied Sciences, Koblenz Germany. He acts as Consultant for study design and statistics at the Memory Clinic, Dept. of Geriatrics, Basel University Hospital. WAK is Professor of Epidemiology at the University of Washington, Seattle Washington. He is also Director and Principal Investigator of the National Alzheimer’s Coordinating Center, funded as a Cooperative Agreement from the National Institute on Aging [U01 AG016976]. SEM is a research scientist at the National Alzheimer’s Coordinating Center, funded as Cooperative Agreement from the National Institute on Aging [U01 AG016976]. ARM is Professor of Internal Medicine at the University of Basel Hospital. He is owner and CEO of diagene Laboratories Inc., in Reinach Switzerland. AUM is Professor of Psychology at the University of Basel and Director of the Memory Clinic, Dept. of Geriatrics, Basel University Hospital. RS is Professor emeritus of Clinical Psychology at the University of Basel. Till end 2003 he worked in Clinical R&D at Novartis Pharma in Basel. He acts as Senior Scientific Consultant at the Memory Clinic, Dept. of Geriatrics, Basel University Hospital.

Acknowledgment: We are most grateful to the staff members of the US National Alzheimer Coordinating Center (22) for their continuous help with the NACC database (Grant U01 AG016976). Data collection and sharing for parts of this project was funded by the Alzheimer’s Disease Neuroimaging Initiative (ADNI) (National Institutes of Health Grant U01 AG024904). ADNI is funded by the National Institute on Aging, the National Institute of Biomedical Imaging and Bioengineering, and through generous contributions from the following: Abbott, AstraZeneca AB, Bayer Schering Pharma AG, Bristol-Myers Squibb, Eisai Global Clinical Development, Elan Corporation, Genentech, GE Healthcare, GlaxoSmithKline, Innogenetics, Johnson and Johnson, Eli Lilly and Co., Medpace, Inc., Merck and Co., Inc., Novartis AG, Pfizer Inc, F. Hoffman-La Roche, Schering-Plough, Synarc, Inc., as well as non-profit partners the Alzheimer’s Association and Alzheimer’s Drug Discovery Foundation, with participation from the U.S. Food and Drug Administration. Private sector contributions to ADNI are facilitated by the Foundation for the National Institutes of Health. The grantee organization is the Northern California Institute for Research and Education, and the study is coordinated by the Alzheimer’s Disease Cooperative Study at the University of California, San Diego. ADNI data are disseminated by the Laboratory for Neuro Imaging at the University of California, Los Angeles. This research was also supported by NIH grants P30 AG010129, K01 AG030514, and the Dana Foundation.

Abbreviations: AD, Alzheimer Disease; ADC, Alzheimer Disease Center; ADAScog, Alzheimer Disease Assessment Scale cognitive; ADNI, Alzheimer Disease Neuroimaging Initiative; AIC, Akaike Information Criterion; aMCI, amnestic mild cognitive impairment; ApoE4, Apolipoprotein Epsilon 4; AVLT, Auditory Verbal Learning Test; BL, Baseline; BMI, Body Mass Index; CDR-SB, Clinical Dementia Rating sum of boxes; FAQ, Functional Assessment Questionnaire; max, Maximum; MCI, Mild Cognitive Impairment; min, Minimum; MMSE, Mini-Mental Status Examination; NACC, National Alzheimer’s Coordinating Center; NIA, National Institute on Aging; NTB, Neuro-Psychological Battery; obs, observed; PGSA, Placebo Group Simulation Approach; R&D, Research & Development; RPCT, Randomized Placebo-Controlled Trial; SD, Standard Deviation; sim, simulated; UDS, Uniform Data Set; WAIS, Wechsler Adult Intelligence Scale; WMS, Wechsler Memory Scale.

Ethical standards: Research using NACC data was approved by the University of Washington Institutional Review Board.

Appendix. Information on the NACC Database

* Data used in preparation of parts of this article were obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI 1) database [6]. As such, the investigators within the ADNI contributed to the design and implementation of ADNI and/or provided data but did not participate in analysis or writing of this report. ADNI investigators include (complete listing available at



  1. Spiegel R, Berres M, Miserez AR, Monsch AU: For debate: substituting placebo controls in long-term Alzheimer’s prevention trials. Alzheimer’s Research & Therapy 2011, 3:9-20

  2. Albert MS, DeKosky ST, Dickson D, Dubois B, Feldman HH, Fox NC, Gamst A, Holtzman DM, Jagust WJ, Petersen RC, Snyder PJ, Carrillo MC, Thies B, Phelps CH: The diagnosis of mild cognitive impairment due to Alzheimer’s disease: Recommendations from the National Institute on Aging-Alzheimer’s Association workgroups on diagnostic guidelines for Alzheimer’s disease. Alzheimer’s & Dementia 2011, 7:270-279

  3. Aisen PS, Andrieu S, Sampaio C, Carrillo M, Khachaturian ZS, Dubois B, Feldman HH, Petersen RC, Siemers E, Doody RS, Hendrix SB, Grundman M, Schneider LS, Schindler RJ, Salmon E, Potter WZ, Thomas RG, Salmon D, Donohue D, Bednar MM, Touchon J, Vellas B: Report of the task force on designing clinical trials in early (predementia) AD. Neurology 2011, 76:280–286

  4. International Conference on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use. Guideline E10 on Choice of Control Group and Related Issues in Clinical Trials, 20 July 2000.

  5. Cummings J, Gould H, Zhong K: Advances in designs for Alzheimer’s disease clinical trials. American Journal of Neurodegenerative Diseases 2012, 1:205-216

  6. Weiner MW, Veitch DP, Aisen PS, Beckett LA, Cairns NJ, Green RC, Harvey D, Jack CR, Jagust W, Liu Enchi, Morris JC, Petersen RC, Saykin AJ, Schmidt ME, Shaw L, Siuciak JA, Soares H, Toga AW, Trojanowski JQ: The Alzheimer’s Disease Neuroimaging Initiative: A review of papers published since its inception. Alzheimer’s & Dementia 2011, 7:1-67

  7. Aisen PS, Petersen RC, Donohue MC, Gamst A, Raman R, Thomas RG, Walter S, Trojanowski JQ, Shaw LM, Beckett LA, Jack CR, Jagust W, Toga AW, Saykin AJ, Morris JC, Green RC, Weiner MW and the Alzheimer’s Disease Neuroimaging Initiative: Clinical core of the Alzheimer’s disease neuroimaging initiative: Progress and plans. Alzheimer’s & Dementia 2010, 6:239-246

  8. Alzheimer’s Disease Neuroimaging Initiative. [ADNI]

  9. Weintraub S, Salmon DS, Mercaldo N, Ferris S, Graff-Radford NR, Chui H, Cummings J, DeCarli Ch, Foster NL, Galasko D, Peskind E, Dietrich W, Beekly DL, Kukull WA, Morris JC: The Alzheimer’s Disease Centers’ Uniform Data Set (UDS) The Neuropsychologic Test Battery. Alzheimer Disease Associated Disorders 2009, 23:91-101

  10. Hayden KM, Jones RM, Zimmer C, Plassman BL, Browndyke JN, Pieper C, Warren LH, Welsh-Bohmer KA: Factor Structure of the National Alzheimer’s Coordinating Centers Uniform Dataset Neuropsychological Battery. An Evaluation of Invariance Between and Within Groups Over Time. Alzheimer Disease Associated Disorders 2011, 25:128-137

  11. Cedarbaum JM, Jaros M, Hernandez C, Coley N, Andrieu S, Grundman M, Vellas B, and the Alzheimer’s Disease Neuroimaging Initiative: Rationale for use of the Clinical Dementia Rating Sum of Boxes as a primary outcome measure for Alzheimer’s disease clinical trials. Alzheimer’s & Dementia 2013, 9 Suppl: S45-S99

  12. R Development Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing. Vienna, Austria 2011. url =

  13. Winblad B, Gauthier S, Scinto L, Feldman H, Wilcock GK, Truyen L, Mayorga AJ, Wang D, Brashear HR, Nye JS, The GAL-INT-11/18 Study Group: Safety and efficacy of galantamine in subjects with mild cognitive impairment. Neurology 2008, 70:2024-2035

  14. Feldman HH, Ferris S, Winblad B, Sfikas N, Mancione L, He Y, Tekin S, Burns A, Cummings J, del Ser T, Inzitari D, Orgogozo JM, Sauer H, Scheltens P, Scarpini E, Herrmann N, Farlow M, Potkin S, Charles HC, Fox NC, Lane R: Effect of rivastigmine on delay to diagnosis of Alzheimer’s disease from mild cognitive impairment: the InDDEx study. Lancet Neurol 2007, 6:501-512

  15. Albert MS, DeKosky ST, Dickson D, Dubois B, Feldman HH, Fox N, Gamst A, Holtzman DM, Jagust WJ, Petersen RC, Snyder PJ, Carrillo MC, Thies B, Phelps CH: The diagnosis of mild cognitive impairment due to Alzheimer’s disease: recommendations from the National Institute on Aging–Alzheimer’s Association workgroups on diagnostic guidelines for Alzheimer’s disease. Alzheimers & Dementia 2011, 7:270–279

  16. Brodaty H: Sydney Memory and Ageing Study. Paper presented at the AAICAD 2011 meeting, Paris, July 16-21, 2011. Abstract F3-02-01, S486

  17. Roberts R: MCI Incidence, Progression to Dementia, and Reversion to Normal in a Population-Based Cohort: The Mayo Study of Aging. Paper presented at the AAICAD 2011 meeting, Paris, July 16-21, 2011. Abstract F3-02-02, S486

  18. Carter AF, Caine D, Burns A, Herholz K, Lambon Ralph MA: Staging of the cognitive decline in Alzheimer’s disease: insights from a detailed neuropsychological investigation of mild cognitive impairment and mild Alzheimer’s disease. International Journal of Geriatric Psychiatry 2012, 27: 423-432

  19. Okello A, Koivunen J, Edison P, Archer HA, Turkheimer FE, Någren K, Bullock R, Walker Z, Kennedy A, Fox NC, Rossor MN, Rinne JO, Brooks DJ: Conversion of amyloid positive and negative MCI to AD over 3 years. An 11C-PIB PET study. Neurology 2009, 73:754-760

  20. Vellas B, Pesce A, Robert PH, Aisen PS, Ancoli-Israel S, Andrieu S, Cedarbaum J, Dubois B, Siemers E, Spire JP, Weiner MW, May TS: AMPA workshop on challenges faced by investigators conducting Alzheimer’s disease clinical trials. Alzheimer’s & Dementia 2011, 7:109–117

  21. Grill JD & Karlawish J: Addressing the challenges to successful recruitment and retention in Alzheimer’s disease clinical trials. Alzheimer’s Research & Therapy 2010, 2:34-44

  22. Beekly DL, Ramos EM, Lee WW, Deitrich WD, Jacka ME, Wu J, Hubbard JL, Koepsel TD, Morris JC, Kukull WA: The National Alzheimer’s Coordinating Center (NACC) Database: The Uniform Data Set. Alzheimer Disease Associated Disorders 2007, 21:249-258

  23. Miller RG, Moore DH, Forshew DA, Katz JS, Barohn RJ, Valan M, Bromberg MB, Goslin KL, Graves MC, McCluskey LF, McVey AL, Mozaffar T, Florence JM, Pestronk A, Ross M, Simpson EP, Appel SH: Phase II screening of lithium carbonate in amyotrophic lateral sclerosis. Examining a more efficient trial design. Neurology 2011,77:973-979

  24. Doody RS, Thomas RG, Farlow M, Iwatsubo T, Vellas B, Joffe S, Kieburtz K, Raman R, Sun X, Aisen PS, Siemers E, Liu-Seifert H, Mohs R: Phase 3 trial of solanezumab for mild-to-moderate Alzheimer’s disease. The New England Journal of Medicine 2014, 370:311-321

  25. The A4 Study. Homepage of ADCS (accessed May 2014)

  26. McEvoy LK, Edland SD, Holland D, Hagler DJ, Roddey JC, Fennema-Notestine Ch, Salmon DP, Koyama AK, Aisen PS, Brewer JB, Dale AM: Neuroimaging enrichment strategy for secondary prevention trials in Alzheimer disease. Alzheimer Disease & Associated Disorders 2010, 24:269-277

  27. Moulder KL, Snider BJ, Mills SL, Buckles VD, Santacruz AM, Bateman RJ, Morris JC: Dominantly inherited Alzheimer Network: facilitating research and clinical trials. Alzheimer’s Research & Therapy 2013, 5:48-55