CHINESE VERSION OF THE BAYLOR PROFOUND MENTAL STATUS EXAMINATION: A BRIEF STAGING MEASURE FOR PATIENTS WITH SEVERE ALZHEIMER’S DISEASE

 
X. Fu1,*, W. Yu2,*, M. Ke2, X. Wang1, J. Zhang1, T. Luo1, P.J. Massman3,4, R.S. Doody3, Y. Lü1,*
 

1. Department of Geriatrics, The First Affiliated Hospital of Chongqing Medical University, Chongqing 400016, China; 2. Institute of Neuroscience, Chongqing Medical University, Chongqing 400016, China; 3. Department of Neurology, Baylor College of Medicine, Houston, TX USA at the time this work was done. Now Genentech/Roche, Basel, Switzerland; 4. Department of Psychology, University of Houston, Houston, TX USA; *Authors contributed equally and are co-first authors of the study.

Corresponding Authors: Prof. Yang Lü, 1 Youyi Road, Yuzhong District, Chongqing 400016, China, Tel: +86-23-89011622, Fax: +86-23-68811487, E-mail: yanglyu@hosptial.cqmu.edu.cn

J Prev Alz Dis 2020;
Published online December 21, 2020, http://dx.doi.org/10.14283/jpad.2020.72

 


Abstract

Background: A specialized instrument for assessing the cognition of patients with severe Alzheimer’s disease (AD) is needed in China.
Objectives: To validate the Chinese version of the Baylor Profound Mental Status Examination (BPMSE-Ch).
Design: The BPMSE is a simplified scale that has proved to be a reliable and valid tool for evaluating patients with moderate to severe AD; it is therefore worthwhile to extend its use to Chinese patients with AD.
Setting: Patients were assessed at an outpatient memory clinic.
Participants: All participants were diagnosed as having probable AD by assessment.
Measurements: The BPMSE was translated into Chinese and back-translated. The BPMSE-Ch was administered to 102 AD patients with a Mini-Mental State Examination (MMSE) score below 17. We assessed the internal consistency and reliability of the BPMSE-Ch, and its construct validity against the MMSE, Severe Impairment Battery (SIB), Global Deterioration Scale (GDS-1), Geriatric Depression Scale (GDS-2), Instrumental Activities of Daily Living (IADL), Physical Self-Maintenance Scale (PSMS), Neuropsychiatric Inventory (NPI) and Clinical Dementia Rating (CDR).
Results: The BPMSE-Ch showed good internal consistency (α = 0.87); inter-rater and test-retest reliability were both excellent, ranging from 0.91 to 0.99. The construct validity of the measure was also supported by significant correlations with the MMSE and SIB. Moreover, as expected, the BPMSE-Ch had a lower floor effect than the MMSE, but a ceiling effect existed for patients with MMSE scores above 11.
Conclusions: The BPMSE-Ch is a reliable and valid tool for evaluating cognitive function in Chinese patients with severe AD.

Key words: Alzheimer’s disease, Baylor Profound Mental Status Examination, Chinese version, severe dementia, validation.

Abbreviations: AD: Alzheimer’s disease; ADAS-cog: Alzheimer’s Disease Assessment Scale-Cognitive section; ANOVA: one-way analysis of variance; BPMSE: Baylor Profound Mental Status Examination; BPMSE-Ch: Chinese version of the Baylor Profound Mental Status Examination; BPMSE-Ch-cog: Cognition subscale of Chinese version of the Baylor Profound Mental Status Examination; BPMSE-Ch-behav: Behavior subscale of Chinese version of the Baylor Profound Mental Status Examination; CDR: Clinical Dementia Rating; FAST: Functional Assessment Staging; GDS-1: Global Deterioration Scale; GDS-2: Geriatric Depression Scale; IADL: Instrumental Activities of Daily Living; MMSE: Mini-Mental State Examination; NPI: Neuropsychiatric Inventory; PSMS: Physical Self-Maintenance Scale; SIB: Severe Impairment Battery.


 

Introduction

Alzheimer’s disease (AD) is a common neurodegenerative disorder that mainly affects elderly persons worldwide. The manifestations of AD include deterioration in cognition, memory and activities of daily living. It is usually accompanied by behavioral and psychological symptoms (1).
Currently, China is facing serious issues related to its aging population. Persons aged 60 or older account for 17.3% of the total population (2). The prevalence of all-cause dementia over age 65 is about 6% in China, and AD makes up about 65% of all cases (3, 4). The rough prevalence of AD in China has been reported to range from 7 to 66 per 1000 people (5). In a population-based cross-sectional survey, 10276 residents aged 65 years or older were drawn from Beijing (northern-eastern), Zhengzhou (northern-central), Guiyang (southern-western) and Guangzhou (southern-eastern). This survey found an AD prevalence of 3.21% among these 10276 residents (6). Despite this relatively high prevalence, few studies in China have sought better methods for diagnosing and evaluating AD.
AD is gradually evolving into a crucial social problem and presents a major challenge for health care in China. However, awareness of AD and of dementia in general is inadequate in China, leading to delayed diagnosis and initiation of treatment (7, 8). Therefore, many patients are not evaluated until the moderate to severe stages of the disease (9, 10). Moreover, once these patients present for an evaluation, tools to assess them are limited (11). Hence, better instruments are needed for the accurate assessment of patients with advanced AD.
A variety of neuropsychological and functional measures have been utilized to assess mental status and dementia severity both cross-sectionally and longitudinally. Frequently used instruments include the Mini-Mental State Examination (MMSE) (12), Severe Impairment Battery (SIB) (13), Alzheimer’s Disease Assessment Scale-Cognitive section (ADAS-cog) (14), Global Deterioration Scale (GDS-1) (15), Functional Assessment Staging Tool (FAST) (16) and Clinical Dementia Rating (CDR) (17). However, these scales show some limitations in patients with moderate to severe AD. The MMSE and ADAS-cog are not optimal for evaluating patients with severe AD because both rely heavily on verbal information; therefore, the results may be confounded by language disorders and/or a low level of education. The SIB is a suitable tool for evaluating patients with severe dementia. However, this test takes more than 30 minutes to administer, which often exceeds the attention capacity of most patients with severe AD (18). The Neuropsychiatric Inventory (NPI) is usually used to evaluate neuropsychiatric symptoms, but it depends largely on descriptions from caregivers (19). Overall, a convenient and effective instrument for measuring cognitive function in patients with severe AD is clearly needed.
The Baylor Profound Mental Status Examination (BPMSE), developed by Doody et al., is a simplified scale that has proved to be a reliable and valid tool for evaluating patients with moderate to severe AD (20). In the original study, European Americans accounted for about 82% of the sample. Thus, it is worthwhile to extend the use of the BPMSE to severely demented patients from different cultural backgrounds. To date, there have been three translated versions of the BPMSE: Korean, Danish and Spanish (21-23). A study of the Korean version showed that the BPMSE is a rapid, easy and valid scale for measuring cognitive function in patients with moderate to severe AD, particularly in patients with MMSE scores below 12. Similarly, a study utilizing the Danish version indicated that the BPMSE is a stable and strong instrument, and recommended it as an appropriate measure of dementia severity in patients with more severe impairment. Adaptation of the Spanish version revealed that the BPMSE is a useful tool for assessing cognitive function, even in daily medical practice focusing on patients with severe AD.
In China, there is no applicable scale for assessing patients with severe AD. Therefore, the aim of our study was to develop a Chinese version of the BPMSE (BPMSE-Ch) and to evaluate the psychometric properties of this version in Chinese patients with AD.

 

Methods

Translation

The original version of the BPMSE consists of three parts: the cognition subscale, which includes 25 questions; the behavior subscale, which includes 10 items rating the presence or absence of behavioral problems; and 2 qualitative observations of language and social interaction. The cognition subscale assesses four areas: language, orientation, attention, and motor skills. The total cognition subscale score ranges from 0 to 25, with a maximum of 5 points for orientation, 11 points for language, 4 points for attention and 5 points for motor skills. The behavior subscale score ranges from 0 (no behavioral disturbances) to 10 (all behavioral disturbances). In the present study, we did not examine the 2 qualitative observations of communication and social interaction.
First, the original version of the BPMSE was translated into Mandarin Chinese by two bilingual translators whose mother tongue was Chinese. Then, the two Chinese versions were discussed by our team with gerontologists, a neurologist, a psychologist and an English expert, and the final Chinese version was formulated based on this input. Finally, two other translators trained in English philology back-translated the final Chinese version into English to confirm consistency with the original version.

Subjects

Patients were recruited from the Memory Clinic, Department of Geriatrics, The First Affiliated Hospital of Chongqing Medical University.

Enrollment criteria

(a) All participants were diagnosed as having probable AD according to National Institute of Neurological and Communicative Diseases and Stroke/Alzheimer’s Disease and Related Disorders Association criteria (NINCDS-ADRDA) (24); (b) Patients with MMSE <17 were included; (c) This study was approved by the Ethical Committee of The First Affiliated Hospital of Chongqing Medical University on human research; (d) Informed consent was obtained from all participants or their family members.

Exclusion criteria

(a) Patients were excluded if they had other neurological or psychiatric disorders or clinically significant medical conditions (e.g., acute infections, cancer, organ failure, etc.); (b) Patients had severely impaired communication abilities (e.g., global aphasia, deafness, blindness, muteness, etc.); (c) Patients had a history of head trauma, sedative drug use or substance abuse.

Measurements

The following measures were administered to all enrolled patients: BPMSE-Ch, MMSE, SIB, GDS-1, GDS-2, IADL, PSMS, NPI, and CDR. All tests were given on the same day. To examine inter-rater reliability, two trained physicians in our clinic administered the BPMSE-Ch consecutively and independently to a subset of enrolled patients. To investigate test-retest reliability, some patients were randomly chosen to be given the BPMSE-Ch a second time within 30 days of the first administration. It took 5 minutes on average to administer the BPMSE-Ch.

Statistical analyses

Internal consistency was assessed by computing Cronbach’s coefficient α. Inter-rater reliability was assessed by correlation and paired t-test of the two scores obtained by different professionals on the same day. Test-retest reliability was likewise assessed with correlation and paired t-test analyses, using scores obtained from the same patient within 30 days. To evaluate construct validity, Pearson correlations were calculated between the BPMSE-Ch and the other measures, including the SIB, MMSE, GDS-1, GDS-2, IADL, PSMS, NPI and CDR. In addition, patients were divided into dementia severity groups using the MMSE and CDR, and differences between those groups were analyzed with a one-way analysis of variance (ANOVA) and Scheffé’s test. Statistical analyses were performed with SPSS 20.0 for Windows.
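
As an illustration of the internal-consistency statistic named above, the following is a minimal sketch of the standard Cronbach's α formula in Python. It is not the authors' procedure (the study used SPSS); the function name `cronbach_alpha` and the toy item matrix are hypothetical and serve only to show the calculation.

```python
from statistics import variance

def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of rows (one row per patient, one column per item)."""
    k = len(item_scores[0])  # number of items
    # Sample variance of each item across patients
    item_vars = [variance([row[j] for row in item_scores]) for j in range(k)]
    # Sample variance of the total score across patients
    total_var = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical toy data (NOT study data): 4 patients, 3 pass/fail items
toy_scores = [
    [0, 0, 0],
    [1, 1, 1],
    [0, 0, 0],
    [1, 1, 1],
]
alpha = cronbach_alpha(toy_scores)
print(alpha)
```

In this toy matrix the three items agree perfectly, so α reaches its maximum; real item data would be expected to give lower values.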

 

Results

Demographic characteristics and test performances

A total of 102 patients (35 male, 67 female) were included in our study. The mean age of the patients was 77.76 years, ranging from 64 to 93. The mean duration of education was 7.95 years, ranging from 0 to 16 years. Details are shown in Table 1.

Table 1. Demographic characteristics and Scores on Instruments

Abbreviations: MMSE, Mini-Mental State Examination; BPMSE-Ch-cog, Cognition subscale of Chinese version of the Baylor Profound Mental Status Examination; BPMSE-Ch-behav, Behavior subscale of Chinese version of the Baylor Profound Mental Status Examination; SIB, Severe Impairment Battery; NPI, Neuropsychiatric Inventory; SD, standard deviation.

 

Reliability

In our study, the coefficient α, which reflects the inter-correlations among items on the BPMSE-Ch cognition (BPMSE-Ch-cog) subscale, was 0.87. Furthermore, significant correlations were found among all the BPMSE-Ch-cog components, as seen in Table 2. Inter-rater and test-retest reliability are shown in Table 3.

Table 2. Correlations among BPMSE-Ch-cog subscales

Correlation coefficients by Pearson correlation. * p < 0.001.

 

Table 3. Inter-rater and test-retest reliability

Abbreviations: BPMSE-Ch-cog, Cognition subscale of Chinese version of the Baylor Profound Mental Status Examination; BPMSE-Ch-behav, Behavior subscale of Chinese version of the Baylor Profound Mental Status Examination.
Correlation coefficients by Pearson correlation. n = Number of patients. All p values <0.001.

 

To determine inter-rater reliability, 52 patients were each assessed by two trained doctors simultaneously and independently. The correlation between the two total cognition subscale scores was 0.99 (p < 0.001), and there was no significant difference (paired t(51) = 1.84, p > 0.05) between the two scores (Mean = 0.17, SD = 0.68). The correlation between the two behavior subscale scores was 0.92 (p < 0.001).
To determine test-retest reliability, 42 patients were tested twice by the same doctor within a 30-day interval. The test-retest correlation between the two total cognition scores was 0.99 (p < 0.001). Similarly, there was no significant difference (paired t(41) = 1.18, p > 0.05) between the two scores obtained at the two time points (Mean = 0.14, SD = 0.78). The test-retest correlation between the two behavior scores was 0.94 (p < 0.001).
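
The reported paired t statistics can be roughly cross-checked from the summary values, assuming (the text does not state this explicitly) that the reported Mean and SD describe the paired score differences. The sketch below uses the standard formula t = mean / (SD / √n); the function name `paired_t_from_summary` is hypothetical.

```python
import math

def paired_t_from_summary(mean_diff, sd_diff, n):
    """Paired t statistic from the mean and SD of the paired differences."""
    standard_error = sd_diff / math.sqrt(n)
    return mean_diff / standard_error

# Summary values as reported in the text (assumed to describe the differences)
inter_rater_t = paired_t_from_summary(0.17, 0.68, 52)  # reported t(51) = 1.84
test_retest_t = paired_t_from_summary(0.14, 0.78, 42)  # reported t(41) = 1.18
print(round(inter_rater_t, 2), round(test_retest_t, 2))
```

Under this assumption the recomputed values are about 1.80 and 1.16, close to the reported 1.84 and 1.18; the small gaps are consistent with rounding of the published mean and SD.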

Validity

The construct validity of the BPMSE-Ch is shown in Table 4. Pearson correlations were calculated between the BPMSE-Ch-cog and the MMSE (0.76), SIB (0.78), GDS-1 (-0.26), GDS-2 (0.16), PSMS (-0.26), IADL (-0.36), NPI (-0.41) and CDR (-0.54). The BPMSE-Ch-cog correlated strongly with both the SIB (r = 0.78) and the MMSE (r = 0.76). In addition, the BPMSE-Ch behavior subscale (BPMSE-Ch-behav) correlated with the NPI (r = 0.54, p < 0.001, Table 4).

Table 4. Concurrent validity of BPMSE-Ch

Abbreviations: BPMSE-Ch, Chinese version of the Baylor Profound Mental Status Examination; BPMSE-Ch-cog, Cognition subscale of Chinese version of the Baylor Profound Mental Status Examination; BPMSE-Ch-behav, Behavior subscale of Chinese version of the Baylor Profound Mental Status Examination; MMSE, Mini-Mental State Examination; SIB, Severe Impairment Battery; GDS1, Global Deterioration Scale; GDS2, Geriatric Depression Scale; IADL, Instrumental Activities of Daily Living; PSMS, physical self-maintenance scale; CDR, Clinical Dementia Rating; NPI, Neuropsychiatric Inventory.

 

Ceiling and floor effects

The relationship between the BPMSE-Ch-cog and MMSE is shown on a scatterplot (Supplementary Figure 1A). MMSE scores of 0 to 5 corresponded to a substantial range of BPMSE-Ch-cog scores, from 2 to 24, indicating that the BPMSE-Ch had no floor effect. In addition, patients scoring 12 to 16 on the MMSE had BPMSE-Ch-cog scores ranging from 20 to 25 (Mean: 23.08, SD: 1.08, Table 5). This demonstrated that the BPMSE-Ch showed a ceiling effect among patients at a relatively moderate level of dementia.

Sensitivity

The relationship between BPMSE-Ch-cog and SIB scores is displayed in Supplementary Figure 1B. The relatively high R² = 0.61 indicated that the BPMSE-Ch-cog was strongly associated with the SIB, which supports the BPMSE-Ch as a sensitive tool for assessing patients with severe AD.
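
For a simple linear relationship between two scores, the coefficient of determination equals the squared Pearson correlation, so the R² here can be checked against the correlation with the SIB reported under Validity (r = 0.78). The sketch below illustrates this; the function name `pearson_r` and the toy vectors are hypothetical, not study data.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical toy vectors (NOT study data) to exercise the function
toy_r = pearson_r([1, 2, 3], [2, 4, 6])

# Squaring the reported correlation reproduces the reported R-squared
reported_r = 0.78
r_squared = round(reported_r ** 2, 2)
print(toy_r, r_squared)
```

Squaring 0.78 gives about 0.61, which agrees with the R² stated in the text.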

BPMSE-Ch-cog score stratified by MMSE levels

Table 5 shows that the BPMSE-Ch-cog differentiated the enrolled patients across severity groups defined by MMSE score (F = 56.7, p < 0.001). Patients in MMSE Group 1 (range 12-16) had a BPMSE-Ch score of 23.08 ± 1.08, patients in MMSE Group 2 (range 7-11) had a BPMSE-Ch score of 21.25 ± 3.53, and patients in MMSE Group 3 (range 0-6) had a further reduced BPMSE-Ch score of 12.50 ± 6.69. According to Table 5, the differences in the total BPMSE-Ch-cog score and in its four subcomponent scores between Group 2 and Group 3 were significant (p < 0.001).
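
The F statistic used for this group comparison can be sketched as follows. This is a minimal illustration of a one-way ANOVA, not a reproduction of the study analysis: the function name `one_way_anova_f` and the three toy groups are hypothetical, and the Scheffé post hoc test is not included.

```python
def one_way_anova_f(groups):
    """F statistic of a one-way ANOVA for a list of groups (lists of scores)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    # Between-group sum of squares: group size times squared deviation of group mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Within-group sum of squares: squared deviations from each group's own mean
    ss_within = sum(sum((x - m) ** 2 for x in g) for g, m in zip(groups, group_means))
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

# Hypothetical toy scores (NOT study data) for three severity groups
toy_groups = [[23, 24, 22], [21, 20, 22], [12, 14, 10]]
f_stat = one_way_anova_f(toy_groups)
print(f_stat)
```

A large F indicates that the variation between group means is large relative to the variation within groups, which is the pattern the text reports for the MMSE severity groups.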

Table 5. Three severity groups according to the MMSE

Abbreviations: MMSE, Mini-Mental State Examination; BPMSE-Ch-cog, Cognition subscale of Chinese version of the Baylor Profound Mental Status Examination; SD, standard deviation; n = Number of patients. One-way ANOVA test. NS = Nonsignificant; 1. By Scheffé’s analysis

 

BPMSE-Ch-cog score stratified by CDR levels

The BPMSE-Ch-cog differentiated patients grouped by CDR stage (F = 16.0, p < 0.001) (Supplementary Table 1). Mean BPMSE-Ch-cog total and subcomponent scores declined as the CDR stage increased (Supplementary Table 1). In Group 1 (CDR = 0.5), the total BPMSE-Ch-cog score ranged from 23 to 25 (Mean = 24.33, SD = 1.15); in Group 2 (CDR = 1), from 19 to 25 (Mean = 22.86, SD = 1.42); in Group 3 (CDR = 2), from 2 to 25 (Mean = 19.82, SD = 5.59); and in Group 4 (CDR = 3), from 2 to 24 (Mean = 12.50, SD = 7.29). Thus, the range of BPMSE-Ch-cog scores widened as the CDR stage increased. Moreover, total BPMSE-Ch-cog and subcomponent scores differed significantly between Group 3 and Group 4 (Supplementary Table 1). Together, these findings suggest that the BPMSE-Ch measures cognition differently from the CDR and can differentiate levels of cognition at high CDR stages.

Discussion

The present study shows that the BPMSE-Ch is a reliable, stable and valid instrument for assessing cognition in patients with severe AD. Internal consistency is robust, inter-rater reliability is excellent for both the BPMSE-Ch-cog and BPMSE-Ch-behav subscales (r = 0.99 and 0.92, respectively), and test-retest reliability is also excellent. Furthermore, construct validity is supported by significant correlations with the SIB (r = 0.78) and MMSE (r = 0.76). These findings are consistent with the results of previous adaptations of the BPMSE into Korean, Spanish, and Danish. BPMSE-Ch-cog scores were strongly associated with MMSE and SIB ratings, indicating that the BPMSE-Ch-cog can differentiate well among patients with AD with differing degrees of cognitive impairment, particularly at the more severe end of the dementia spectrum, which is its primary intended use.
In this regard, the BPMSE-Ch-cog does not display floor effects in patients who are severely demented as measured by the MMSE. Also, BPMSE-Ch-cog scores are strongly associated with SIB scores (while displaying a lower floor than the SIB), and its administration time is much shorter (only 5 minutes on average versus 30 minutes for the SIB). This further suggests that the BPMSE-Ch is an efficient tool. Relatively low correlations were found between BPMSE-Ch-cog scores and the PSMS and IADL functional scores, demonstrating that the BPMSE-Ch only partly measures the cognitive abilities needed to function in daily life. A possible reason is that most enrolled patients had already reached maximal impairment in activities of daily living, so that a certain degree of ceiling effect probably existed in the IADL and PSMS. Although the GDS-1 is used to evaluate not only cognition but also the ability to maintain daily life and participation in activities, and is useful in severe AD (25-27), it is a global staging tool. Its forced-choice format would place most enrolled patients into high stages. This might explain why the correlation between the BPMSE-Ch and GDS-1 is low. Behavioral and psychological symptoms of dementia (BPSD) in patients with Alzheimer’s disease correlate strongly with cognitive impairment and with impairment in activities of daily living. The NPI is a common tool for evaluating BPSD, whereas the BPMSE-Ch-behav focuses selectively on disruptive behaviors. In this study, a moderate correlation was found between the BPMSE-Ch-behav and NPI. Because the NPI is obtained by questioning the primary caregiver, and is a complex and time-consuming process, this finding indicates that the BPMSE-Ch is also a relatively practicable instrument for evaluating behavioral and psychological symptoms in patients with severe dementia.
The correlation between the BPMSE-Ch-cog and GDS-2 is not significant (r = 0.16, p > 0.001). There are two possible reasons. First, the BPMSE-Ch-cog does not include questions directed at depressive symptoms and is not intended to evaluate depression. Second, it has been reported that patients with moderate-severe AD have relatively low GDS-2 scores (28), which is similar to our findings. This suggests that patients with moderate-severe AD generally have no obvious depressive symptoms. In our study, however, the highest GDS-2 score seen was 24; therefore, the GDS-2 can still provide a useful complementary assessment of depression. Because the BPMSE measures clinical features distinct from those measured by the GDS-2, the absence of correlation is not surprising.
Regarding its suitability for use with severely impaired patients, the BPMSE-Ch-cog differentiates well between patients with MMSE scores of 0-6 and those with MMSE scores of 7-11, but not as well between patients with scores of 12-16 and those with scores of 7-11. This indicates that the BPMSE-Ch, like its versions in other languages, is most appropriate for patients who are more severely impaired (with an MMSE score of 11 or below). Similarly, analyses of patients in different CDR stages reveal that the total BPMSE-Ch score and subcomponent scores differ significantly between patients in CDR stage 2 and those in CDR stage 3, and patients in both of these more severely impaired CDR stages exhibited a wide range of scores, with substantial variability. These results lend further support to the use of the BPMSE-Ch with severely impaired patients.
In conclusion, the BPMSE-Ch is a convenient, stable, reliable and valid scale for assessing cognition in patients with moderate-severe AD, and is most appropriately used with patients who have MMSE scores of 11 or below. Future work should extend the BPMSE-Ch to other areas of China, including rural areas, to further study its psychometric properties. We believe that it would be beneficial for this instrument to be widely used for evaluating the cognitive functioning of patients with severe AD in China.

 

Acknowledgments: Funding Information: This study was supported by grants from National Key R&D Program of China (2018YFC2001700), General Project of Technological Innovation and Application Development of Chongqing Science & Technology Bureau (cstc2019jscx-msxmX0239), Key project of Social undertakings and people’s livelihood security of Chongqing Science & Technology Commission (cstc2017shms-zdyfX0009) and Postgraduate Research Innovation Project of Chongqing (CYS16122). In particular, we greatly thank Dr. Sergio Salmerón (Department of Geriatrics, Hospital General de Villarrobledo, Albacete, Spain) for his assistance with the translation of the BPMSE.

Conflict of Interest: The authors declare that they have no potential competing interests.

Ethics approval and consent to participate: The study was approved by the Ethics Committee of The First Affiliated Hospital of Chongqing Medical University and has been performed in accordance with the ethical standards laid down in the Declaration of Helsinki and its later amendments.

Authors’ Contributions: Rachelle S. Doody and Yang Lü designed the study. Xue Fu, Weihua Yu and Yang Lü collected the data and wrote the paper. Yang Lü, Paul J. Massman and Rachelle S. Doody revised the manuscript. Xia Wang, Jia Zhang and Tao Luo analyzed data and assisted with writing the article.

References

1. J.C. Morris, K. Blennow, L. Froelich, et al. Harmonized diagnostic criteria for Alzheimer’s disease: recommendations, Journal of internal medicine 2014; 275(3): 204-13.
2. F. Li, S. Chen, C. Wei, J. Jia. Monetary costs of Alzheimer’s disease in China: protocol for a cluster-randomised observational study, BMC neurology 2017; 17(1): 15.
3. J. Jia, A. Zhou, C. Wei, et al. The prevalence of mild cognitive impairment and its etiological subtypes in elderly Chinese, Alzheimer’s & dementia : the journal of the Alzheimer’s Association 2014; 10(4): 439-47.
4. Y. Zhang, Y. Xu, H. Nie, et al. Prevalence of dementia and major dementia subtypes in the Chinese populations: a meta-analysis of dementia prevalence surveys, 1980-2010, Journal of clinical neuroscience : official journal of the Neurosurgical Society of Australasia 2012; 19(10): 1333-7.
5. K.Y. Chan, W. Wang, J.J. Wu, et al. Epidemiology of Alzheimer’s disease and other forms of dementia in China, 1990-2010: a systematic review and analysis, Lancet (London, England) 2013; 381(9882): 2016-23.
6. J. Jia, F. Wang, C. Wei, et al. The prevalence of dementia in urban and rural areas of China, Alzheimer’s & dementia : the journal of the Alzheimer’s Association 2014; 10(1): 1-9.
7. D. Liu, G. Cheng, L. An, et al. Public Knowledge about Dementia in China: A National WeChat-Based Survey, International journal of environmental research and public health 2019; 16(21).
8. X. Li, W. Fang, N. Su, Y. Liu, S. Xiao, Z. Xiao, Survey in Shanghai communities: the public awareness of and attitude towards dementia, Psychogeriatrics : the official journal of the Japanese Psychogeriatric Society 2011; 11(2): 83-9.
9. M. Zhao, X. Lv, M. Tuerxun, et al. Delayed help seeking behavior in dementia care: preliminary findings from the Clinical Pathway for Alzheimer’s Disease in China (CPAD) study, International psychogeriatrics 2016; 28(2): 211-9.
10. D. Peng, Z. Shi, J. Xu, et al. Demographic and clinical characteristics related to cognitive decline in Alzheimer disease in China: A multicenter survey from 2011 to 2014, Medicine 2016; 95(26): e3727.
11. F.A. Schmitt, W. Ashford, C. Ernesto, et al. The severe impairment battery: concurrent validity and the assessment of longitudinal change in Alzheimer’s disease. The Alzheimer’s Disease Cooperative Study, Alzheimer disease and associated disorders 1997; 11 Suppl 2: S51-6.
12. M.F. Folstein, S.E. Folstein, P.R. McHugh, “Mini-mental state”. A practical method for grading the cognitive state of patients for the clinician, Journal of psychiatric research 1975; 12(3): 189-98.
13. J. Saxton, A.A. Swihart, Neuropsychological assessment of the severely impaired elderly patient, Clinics in geriatric medicine 1989; 5(3): 531-43.
14. S.J. Cano, H.B. Posner, M.L. Moline, et al. The ADAS-cog in Alzheimer’s disease clinical trials: psychometric evaluation of the sum and its parts, Journal of neurology, neurosurgery, and psychiatry 2010; 81(12): 1363-8.
15. B. Reisberg, S.H. Ferris, M.J. de Leon, T. Crook. The Global Deterioration Scale for assessment of primary degenerative dementia, The American journal of psychiatry 1982; 139(9): 1136-9.
16. B. Reisberg. Functional assessment staging (FAST), Psychopharmacology bulletin 1988; 24(4): 653-9.
17. C.P. Hughes, L. Berg, W.L. Danziger, L.A. Coben, R.L. Martin, A new clinical scale for the staging of dementia, The British journal of psychiatry : the journal of mental science 1982; 140: 566-72.
18. G.M. Peavy, D.P. Salmon, V.A. Rice, et al. Neuropsychological assessment of severely demented elderly: the severe cognitive impairment profile, Archives of neurology 1996; 53(4): 367-72.
19. K.L. Lanctot, J. Amatniek, S. Ancoli-Israel, et al. Neuropsychiatric signs and symptoms of Alzheimer’s disease: New treatment paradigms, Alzheimer’s & dementia (New York, N. Y.) 2017; 3(3): 440-449.
20. R.S. Doody, S.L. Strehlow, P.J. Massman, E.P. Feher, C. Clark, J.R. Roy, Baylor profound mental status examination: a brief staging measure for profoundly demented Alzheimer disease patients, Alzheimer disease and associated disorders 1999; 13(1): 53-9.
21. A. Korner, A. Brogaard, I. Wissum, U. Petersen, The Danish version of the Baylor Profound Mental State Examination, Nordic journal of psychiatry 2012; 66(3): 198-202.
22. H.R. Na, S.H. Lee, J.S. Lee, R.S. Doody, S.Y. Kim, Korean version of the Baylor Profound Mental Status Examination: a brief staging measure for patients with severe Alzheimer’s disease, Dementia and geriatric cognitive disorders 2009; 27(1): 69-75.
23. S. Salmeron, I. Huedo, M. Lopez-Utiel, et al. Validation of the Spanish version of the Baylor Profound Mental Status Examination, Journal of Alzheimer’s disease 2016; 49(1): 73-8.
24. G. McKhann, D. Drachman, M. Folstein, R. Katzman, D. Price, E.M. Stadlan, Clinical diagnosis of Alzheimer’s disease: report of the NINCDS-ADRDA Work Group under the auspices of Department of Health and Human Services Task Force on Alzheimer’s Disease, Neurology 1984; 34(7): 939-44.
25. R.H. Paul, R.A. Cohen, D.J. Moser, et al. The global deterioration scale: relationships to neuropsychological performance and activities of daily living in patients with vascular dementia, Journal of geriatric psychiatry and neurology 2002; 15(1): 50-4.
26. J.S. Kim, C.W. Won, B.S. Kim, H.R. Choi, Predictability of various serial subtractions on global deterioration scale according to education level, Korean journal of family medicine 2013; 34(5): 327-33.
27. S.H. Choi, B.H. Lee, S. Kim, et al. Interchanging scores between clinical dementia rating scale and global deterioration scale, Alzheimer disease and associated disorders 2003; 17(2): 98-105.
28. A.J. Midden, B.T. Mast, Differential item functioning analysis of items on the Geriatric Depression Scale-15 based on the presence or absence of cognitive impairment, Aging & mental health 2017; 1-7.

THE HARVARD AUTOMATED PHONE TASK: NEW PERFORMANCE-BASED ACTIVITIES OF DAILY LIVING TESTS FOR EARLY ALZHEIMER’S DISEASE

G.A. Marshall1,2,3,4, M. Dekhtyar1,2, J.M. Bruno1,2, K. Jethwani6, R.E. Amariglio1,2,3,4, K.A. Johnson1,2,3,5, R.A. Sperling1,2,3,4, D.M. Rentz1,2,3,4

1. Center for Alzheimer Research and Treatment, Boston, USA; 2. Department of Neurology, Brigham and Women’s Hospital, Harvard Medical School, Boston, USA;
3. Massachusetts Alzheimer’s Disease Research Center, Boston, USA; 4. Departments of Neurology, Boston, USA; 5. Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, USA; 6. Connected Health Innovation, Partners HealthCare, Harvard Medical School, Boston, USA

Corresponding Author: Gad A. Marshall, Center for Alzheimer Research and Treatment, Brigham and Women’s Hospital, 221 Longwood Avenue, BL-104H, Boston, MA 02115, USA, P: 617-732-8085, F: 617-264-5212, E: gamarshall@partners.org

J Prev Alz Dis 2015;2(4):242-253
Published online June 10, 2015, http://dx.doi.org/10.14283/jpad.2015.72


Abstract

Background: Impairment in activities of daily living is a major burden for Alzheimer’s disease dementia patients and caregivers. Multiple subjective scales and a few performance-based instruments have been validated and proven to be reliable in measuring instrumental activities of daily living in Alzheimer’s disease dementia but less so in amnestic mild cognitive impairment and preclinical Alzheimer’s disease.

Objective: To validate the Harvard Automated Phone Task, a new performance-based activities of daily living test for early Alzheimer’s disease, which assesses high level tasks that challenge seniors in daily life.

Design: In a cross-sectional study, the Harvard Automated Phone Task was associated with demographics and cognitive measures through univariate and multivariate analyses; ability to discriminate across diagnostic groups was assessed; test-retest reliability with the same and alternate versions was assessed in a subset of participants; and the relationship with regional cortical thickness was assessed in a subset of participants.

Setting: Academic clinical research center.

Participants: One hundred and eighty two participants were recruited from the community (127 clinically normal elderly and 45 young normal participants) and memory disorders clinics at Brigham and Women’s Hospital and Massachusetts General Hospital (10 participants with mild cognitive impairment).

Measurements: As part of the Harvard Automated Phone Task, participants navigated an interactive voice response system to refill a prescription (APT-Script), select a new primary care physician (APT-PCP), and make a bank account transfer and payment (APT-Bank). The 3 tasks were scored based on time, errors, and repetitions from which composite z-scores were derived, as well as a separate report of correct completion of the task.

Results: We found that the Harvard Automated Phone Task discriminated well between diagnostic groups (APT-Script: p=0.002; APT-PCP: p<0.001; APT-Bank: p=0.02), had an incremental level of difficulty, and had excellent test-retest reliability (Cronbach’s α values of 0.81 to 0.87). Within the clinically normal elderly, there were significant associations in multivariate models between performance on the Harvard Automated Phone Task and executive function (APT-PCP: p<0.001), processing speed (APT-Script: p=0.005), and regional cortical atrophy (APT-PCP: p=0.001; no significant association with APT-Script) independent of hearing acuity, motor speed, age, race, education, and premorbid intelligence.

Conclusions: Our initial experience with the Harvard Automated Phone Task, which consists of ecologically valid, easily-administered measures of daily activities, suggests that these tasks could be useful for screening and tracking the earliest functional alterations in preclinical and early prodromal AD.

Key words: Activities of daily living, Alzheimer’s disease, mild cognitive impairment, performance-based, validation.  


Introduction 

Impairment in activities of daily living (ADL) is a time-intensive, psychological, physical, and financial burden for patients with Alzheimer’s disease (AD) dementia and their caregivers. Traditionally, impairment in basic ADL, which consist of self-care activities, has been associated with moderate to severe dementia, while impairment in instrumental ADL, which consist of activities such as managing the finances, driving, cooking, shopping, and performing household chores, has been associated with mild to moderate dementia. Multiple subjective scales and a few performance-based instruments have been validated and proven to be reliable in measuring instrumental ADL in AD dementia but less so in amnestic mild cognitive impairment (MCI) and preclinical AD (1). There is a debate in the field about whether or not mild impairment in instrumental ADL should be allowed in the diagnosis of MCI because it can blur the distinction with mild dementia (2-4). However, that distinction is often arbitrary and since MCI, which may represent prodromal AD, could progress to AD dementia, they are on a continuum. One step back from MCI may be preclinical AD, in which asymptomatic or minimally symptomatic elderly individuals have biomarker evidence of AD pathology (5). Like other deficits in AD, subtle difficulties in complex ADL may begin at the transition from preclinical AD to MCI. However, to date few ADL tests have been developed to capture these earliest changes.   

Performance-based ADL instruments, in which individuals complete simulated or actual tasks from daily life, are thought to be more objective and ecologically valid than the more widely used subjective ADL questionnaires, in which usually informants, and sometimes the subjects themselves, report on the subjects’ ability to perform various activities (1, 6, 7). One of the first performance-based ADL instruments was the Direct Assessment of Functional Status, which targeted a wide range of basic and instrumental ADL in mild-moderate dementia (8, 9). More recent instruments have focused on MCI (10, 11). The University of California, San Diego Performance-Based Skills Assessment and the Financial Capacity Instrument (FCI) have both been shown to discriminate well between clinically normal (CN) elderly and MCI (11, 12). However, both instruments take 30 minutes or longer to complete and require specialized tools and a trained administrator, thus limiting the extent of their use in research and clinical settings.

Recently, guidelines for the assessment of functional impairment at the preclinical AD stage were suggested (13). In the current study, we describe a newly developed performance-based ADL instrument, the Harvard Automated Phone Task (APT), targeting individuals with preclinical AD and early prodromal AD. The telephone is still by far the most prevalent communication technology among the elderly, who often need to use an interactive voice response system (IVRS) to complete everyday activities (14). As part of the Harvard APT, participants navigate an IVRS to refill a prescription (APT-Script), select a new primary care physician (APT-PCP), and make a bank account transfer and payment (APT-Bank). Our objective was to validate the Harvard APT, a set of new performance-based ADL tests for early AD, which are quick and easy to administer and assess high level tasks that challenge seniors in daily life.

Methods

Participants

One hundred and eighty two participants were recruited from the community (consisting of 127 CN elderly and 45 young normal (YN) participants) and memory disorders clinics at Brigham and Women’s Hospital (BWH) and Massachusetts General Hospital (MGH) (consisting of 10 MCI participants). All participants were in general good health or had stable medical problems and did not have significant psychiatric disorders.

CN participants were ages 60 to 90 years old (inclusive), had a Mini-Mental State Examination (MMSE) (15) score of 26 to 30 (inclusive), and normal memory performance (defined as a Free and Cued Selective Reminding Test (FCSRT) (16) free recall score of >24 and cued recall score of >44). YN participants were ages 18 to 27 years old (inclusive), had an MMSE score of 27 to 30 (inclusive), and normal memory performance (FCSRT free recall score of >24 and cued recall score of >44). MCI participants were ages 61 to 81 (inclusive), had an MMSE score of 25 to 29 (inclusive), and impaired memory performance (FCSRT free recall score of ≤24 and/or cued recall score of ≤44).

The study was approved by the Institutional Review Board (IRB) of Partners Healthcare Inc. Written informed consent was obtained from all participants prior to initiation of any study procedures in accordance with IRB guidelines.

Clinical Assessments

The Harvard Automated Phone Task (APT) was developed at the Center for Alzheimer Research and Treatment at BWH and MGH and the Connected Health Innovation at Partners HealthCare. As part of the Harvard APT, participants perform 3 tasks consisting of navigating an IVRS. The 3 tasks combined can be completed in about 10 minutes using any phone with buttons containing digits and letters. See Appendix for detailed participant instructions, scoring, and schemas of the 3 tasks.

Task 1 (APT-Script) requires participants to call a pharmacy and refill a prescription. Participants are given a mock pill bottle for simvastatin 20 mg orally every day, quantity 30, refills 2, and the pharmacy phone number.

Task 2 (APT-PCP) requires participants to call a health insurance company and select a new primary care physician (Dr. John Smith in Boston, MA). Participants are provided with a card with a member ID number (CDW758421693) and a member services phone number.

Task 3 (APT-Bank) requires participants to make a bank account transfer in order to have enough money to pay their Federal taxes for the year. Participants are provided with the amount of Federal taxes they owe ($4,150), the bank phone number, their checking account number, their savings account number, and a bank statement. Participants are instructed to take notes on a piece of paper in order to complete the task. Once they call in, they are told how much money they have in each account (checking: $2,050; savings: $3,950). They then make a transfer in the appropriate amount (between $2,100 and $3,700; they are told that they need to maintain a minimum of $250 in their savings account in order not to be penalized) from their savings into their checking account in order to have enough money to pay their taxes. They then make the payment.
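
The acceptable transfer range quoted above follows directly from the stated amounts. A minimal sketch of that arithmetic in Python is given here for illustration; the function and parameter names are hypothetical and are not part of the Harvard APT materials.

```python
def valid_transfer_range(taxes_owed, checking, savings, min_savings_balance):
    """Return (smallest, largest) transfer from savings to checking, in dollars.

    The smallest transfer covers the tax shortfall in checking; the largest
    leaves the required minimum balance in savings.
    """
    smallest = max(0, taxes_owed - checking)
    largest = savings - min_savings_balance
    if smallest > largest:
        raise ValueError("no transfer satisfies both constraints")
    return smallest, largest


# Amounts from Task 3 (APT-Bank) as described in the text.
low, high = valid_transfer_range(
    taxes_owed=4150, checking=2050, savings=3950, min_savings_balance=250
)
print(low, high)  # 2100 3700
```

Any transfer between these two bounds lets the participant pay the taxes without dropping the savings account below its minimum balance.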

The 3 tasks are administered in the same sequence (1 through 3) to all participants. Prior to each task, motor speed and hearing acuity are assessed on the phone in order to control for those possible confounds. The 3 tasks are scored based on total time (until disconnected), number of errors, number of repetitions of steps, and correct completion of the task. Composite z-scores of time, errors, and repetitions are generated for each task using the mean and standard deviation of the performance of the CN participants as a reference. Individuals who do not complete a task correctly are assigned a total time equivalent to the longest total time among individuals who have correctly completed the task. This adjustment is made so that individuals who end a task prematurely and incorrectly do not appear to have performed better because of their shorter total time.
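
The scoring steps described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' scoring code: the text does not specify how the three component z-scores are combined, so the sketch assumes an unweighted mean, negated so that higher values indicate better performance. All function names and data values are hypothetical.

```python
from statistics import mean, stdev


def impute_times(times, completed):
    """Replace the time of each incorrectly completed attempt with the
    longest time observed among correctly completed attempts."""
    longest_correct = max(t for t, ok in zip(times, completed) if ok)
    return [t if ok else longest_correct for t, ok in zip(times, completed)]


def z_scores(values, reference):
    """Standardize values against the mean and SD of a reference group."""
    mu, sd = mean(reference), stdev(reference)
    return [(v - mu) / sd for v in values]


def composite_z(times, errors, repetitions, ref_times, ref_errors, ref_reps):
    """ASSUMPTION: composite = negated unweighted mean of the three
    component z-scores, so that higher values mean better performance."""
    zt = z_scores(times, ref_times)
    ze = z_scores(errors, ref_errors)
    zr = z_scores(repetitions, ref_reps)
    return [-(a + b + c) / 3 for a, b, c in zip(zt, ze, zr)]


# Hypothetical data: total times in seconds and whether the task was correct.
times = impute_times([60, 75, 40, 90], [True, True, False, True])
print(times)  # [60, 75, 90, 90]

# Hypothetical CN reference values for time, errors, and repetitions.
scores = composite_z(times, [0, 1, 2, 1], [0, 0, 1, 2],
                     ref_times=[55, 65, 80], ref_errors=[0, 1, 2],
                     ref_reps=[0, 1, 2])
print(scores)
```

The third participant, who ended the task early and incorrectly after 40 seconds, is scored as if the task took 90 seconds, the longest correct time in this hypothetical sample.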

Upon completion of all 3 tasks, participants are asked whether these tasks relate to their daily activities on a 5-point Likert-type scale ranging from very little to very much. Then, participants are asked to rank the difficulty of each of the 3 tasks from very easy to very hard.

We also developed alternate versions of the 3 tasks with equivalent psychometric properties (e.g. same character length of physician’s name for Task 2, APT-PCP) for test-retest reliability and in order to avoid practice effects from year to year.

Other assessments used in the current study included the American National Adult Reading Test intelligence quotient (AMNART IQ) (17), an estimate of premorbid intelligence, serving as a proxy of cognitive reserve; the MMSE (15), a measure of global cognition; the FCSRT (16), a measure of episodic memory; Trailmaking Test A (TMT-A) (18), a measure of processing speed; and Trailmaking Test B (TMT-B) (18), a measure of executive function.

Magnetic Resonance Imaging (MRI) Data

MRI scans were conducted with a Siemens Trio 3T scanner (Siemens Medical Systems, Erlangen, Germany) at the Charlestown Navy Yard campus of MGH. High-resolution T1-weighted structural images were acquired using a 3D Magnetization Prepared Rapid Acquisition Gradient Echo (MP-RAGE) sequence with the following acquisition parameters: repetition time=2300ms; echo time=2.98ms; inversion time=900ms; flip angle=9°; voxel size=1.0×1.0×1.2mm. Cortical thickness of regions of interest (ROI) was measured using FreeSurfer Version 5.1 (http://surfer.nmr.mgh.harvard.edu/) (19). Four ROI chosen based on results of prior analyses of neuroimaging correlates of ADL were assessed (20, 21): 1) inferior temporal cortex; 2) posterior cingulate cortex; 3) medial orbitofrontal cortex; and 4) supramarginal cortex.

Statistical Analyses

Analyses were performed using SPSS version 22.0. Most univariate and multivariate analyses focused on the CN diagnostic group, which was the largest group and the group of main interest since the Harvard APT is primarily meant to target individuals with preclinical AD and those transitioning to prodromal AD (MCI).

The Harvard APT distribution was not normal with a negative (left) skew. Therefore, non-parametric tests were employed for univariate analyses. In CN participants, the Harvard APT was related to participant demographics and characteristics using Spearman’s correlations for continuous variables and Mann-Whitney U test for categorical variables (two-tailed p values were reported).
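
For readers unfamiliar with rank-based correlation, the following Python sketch shows how Spearman’s coefficient can be computed as a Pearson correlation on average ranks. It is illustrative only: the analyses in this study were run in SPSS, and the function names here are hypothetical.

```python
from math import sqrt
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


print(average_ranks([10, 20, 20, 30]))  # [1.0, 2.5, 2.5, 4.0]
```

Because only ranks enter the calculation, the coefficient is unaffected by the skew of the underlying score distribution, which is the reason for preferring it here.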

In CN participants, multiple linear regression models with backward elimination (cut-off p<0.05) were performed to assess the association between Harvard APT and cognitive tests, adjusting for demographics that were significant in the univariate analyses. Tasks 1 and 2 (APT-Script and APT-PCP) were the dependent variables in separate models. Predictors included TMT-A, TMT-B, age, education, and AMNART-IQ. These models were repeated in a subset of participants after adding predictors accounting for hearing acuity and motor speed, as well as self-report of difficulty of tasks. Partial regression coefficient estimates (β) with 95% confidence intervals (CI), significance test results (p values), and percent variance accounted for in the dependent variable by the model as a whole (R2) were reported.

Analysis of variance (ANOVA) was used to compare Harvard APT performance (APT-Script and APT-PCP) across diagnostic groups (YN, CN, and MCI). Comparisons of pairs of diagnoses (e.g., CN vs. MCI) were also performed, for which p values were corrected for multiple comparisons using Tukey post-hoc tests. For APT-Bank, which was completed only in CN and YN participants, a t-test was used to compare performance across diagnostic groups.

In CN participants, reliability of the Harvard APT (APT-Script and APT-PCP) was determined with Cronbach’s α, intraclass correlations. Test-retest was performed with a short interval using the same version, as well as with an alternate version.
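
As an illustration of the reliability statistic named above, Cronbach’s α can be computed from two or more administrations of the same task with the standard formula α = k/(k−1) × (1 − Σ item variances / variance of totals). The Python sketch below treats each administration as one "item"; this is an assumption made for illustration, since the text does not describe the computation in detail, and the function name and data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(administrations):
    """Cronbach's alpha for k administrations, each a list of scores for the
    same participants in the same order."""
    k = len(administrations)
    if k < 2:
        raise ValueError("at least two administrations are required")
    # Per-participant total score across administrations.
    totals = [sum(scores) for scores in zip(*administrations)]
    item_variance = sum(variance(a) for a in administrations)
    return k / (k - 1) * (1 - item_variance / variance(totals))


# Hypothetical test and retest scores for four participants.
test = [1, 2, 3, 4]
retest = [2, 1, 4, 3]
print(cronbach_alpha([test, retest]))
```

Identical scores at test and retest give α = 1, and α falls as the two administrations diverge.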

In CN and MCI participants combined and then in CN participants alone, multiple linear regression models with backward elimination (cut-off p<0.05) were performed to assess the association between Harvard APT and regional cortical thickness (MRI ROI). Tasks 1 and 2 (APT-Script and APT-PCP) were the dependent variables in separate models. Predictors included the 4 MRI ROI, age, education, and AMNART-IQ. β with 95% CI, p values, R2 were reported.

Results

One hundred and eighty two participants (127 CN, 45 YN, and 10 MCI) underwent APT-Script and APT-PCP and 40 of those participants (30 CN and 10 YN) underwent APT-Bank, which was developed later. See Table 1 for participant demographics and characteristics.

Table 1. Demographics and characteristics of all participants

AMNART IQ (American National Adult Reading Test intelligence quotient), APT (Automated Phone Task), CN (clinically normal elderly), MCI (mild cognitive impairment), MMSE (Mini-Mental State Exam), FCSRT (Free and Cued Selective Reminding Test), TMT (Trailmaking Test), YN (young normal). Values represent mean ± standard deviation (except for n, Sex, and Race). * Mann-Whitney U test used instead of Spearman correlation for Sex and Race. † YN minorities: 4.4% African American, 17.8% Asian, and 4.4% Hispanic. ‡ CN minorities: 20.0% African American, 2.5% Asian, and 1.7% Hispanic.

Table 2 provides information about performance on Tasks 1 and 2 (APT-Script and APT-PCP) across diagnostic groups. Correct task completion across groups was high (98% for APT-Script and 91% for APT-PCP). The average time to complete APT-Script was about 1 minute and APT-PCP about 3 ½ minutes. The average time to complete APT-Bank was about 5 minutes (CN: 314.5±73.6 seconds; YN: 254.2±97.7 seconds). APT-Script and APT-PCP were significantly associated across groups (rs=0.30, p<0.001). APT-Bank was significantly associated with APT-PCP (rs=0.41, p=0.008) but not with APT-Script (rs=0.09, p=0.57).

Table 2. Performance on Harvard APT Tasks 1 and 2

APT (Automated Phone Task), CN (clinically normal elderly), MCI (mild cognitive impairment), YN (young normal); Values represent mean ± standard deviation (except for n and Completed); Analysis of variance (ANOVA) revealed performance differences between groups on composite z-scores (Task 1: p=0.002, Task 2: p<0.001) with MCI performing worse than CN and CN worse than YN participants.

Association with demographics and cognitive measures in CN participants

Composite z-scores for each task accounting for time, errors, and repetitions were created. As illustrated in Table 2, in unadjusted analyses, greater age was significantly associated with worse APT-Script performance (p=0.02) and worse APT-PCP performance (p=0.02), lower education and lower AMNART IQ were significantly associated with worse APT-PCP performance (education: p=0.04; IQ: p<0.001), worse processing speed (TMT-A) was significantly associated with worse APT-Script performance (p<0.001) and worse APT-PCP performance (p<0.001), and worse executive function (TMT-B) was significantly associated with worse APT-Script performance (p=0.007) and worse APT-PCP performance (p<0.001). APT-Script was not significantly associated with education or AMNART IQ. APT-Script and APT-PCP were not significantly associated with sex, race, MMSE, or FCSRT. Worse APT-Bank performance was significantly associated with lower education (p=0.003) and worse executive function (p=0.05) but not with any other demographic or cognitive variables.

The multiple linear regression model revealed a significant association between worse APT-Script performance and worse processing speed (β=-0.02, 95% CI for β=-0.03, -0.01, p<0.001). No other predictors were retained in the model (R2=0.11, p<0.001 for overall model). A separate multiple linear regression model revealed a significant association between worse APT-PCP performance and worse executive function (β=-0.01, 95% CI for β=-0.01, -0.005, p<0.001). No other predictors were retained in the model (R2=0.19, p<0.001 for overall model).

Association with cognitive measures independent of hearing acuity and motor speed

In order to correct for hearing acuity and motor speed, prior to each task participants were asked to enter 8 numbers separately, and time and errors were recorded (see Appendix for details). Data from 55 participants (34 CN, 20 YN, and 1 MCI) completing this pre-task revealed an average time of 15.9±2.3 and nearly no errors (0.0±0.1). Across all participants, slower performance on the pre-task was significantly associated with worse APT-Script performance (rs=-0.37, p=0.005) and worse APT-PCP performance (rs=-0.27, p=0.05); however, there was no significant association in CN participants alone. When adding average pre-task time to the regression models in CN participants above with APT-Script and APT-PCP as dependent variables, the associations between APT-Script and processing speed (p=0.009) and APT-PCP and executive function (p<0.001) remained significant.

Association with cognitive measures independent of self-report of difficulty of tasks

Upon completion of the tasks, 46 of the participants (36 CN and 10 YN) were asked whether these tasks relate to their daily activities on a 5-point Likert-type scale ranging from very little to very much. Then, participants were asked to rank the difficulty of each task from very easy to very hard (see Appendix for details). On average, participants agreed (2.1±1.1) that the tasks relate to their daily activities. Across all participants, those who performed better on the tasks indicated that they were easier (APT-Script: rs=-0.28, p=0.06; APT-PCP: rs=-0.37, p=0.01). When adding self-report of difficulty of tasks to the regression models in CN participants above with APT-Script and APT-PCP as dependent variables, the associations between APT-Script and processing speed (p=0.005) and APT-PCP and executive function (p<0.001) remained significant.

Discrimination between diagnostic groups

ANOVA revealed significant performance differences between groups (APT-Script: p=0.002, APT-PCP: p<0.001) with MCI performing worse than CN participants (Tukey post-hoc tests for APT-Script: p=0.05, APT-PCP: p=0.12) and CN performing worse than YN participants (Tukey post-hoc tests for APT-Script: p=0.05, APT-PCP: p=0.004), see Table 2 and Figure 1. Only CN and YN participants underwent APT-Bank, and a t-test revealed significant performance difference between the groups with CN performing worse than YN participants (p=0.02).

Figure 1. Bar graphs with error bars of composite Z-scores for Task 1 (APT-Script) (LEFT) and Task 2 (APT-PCP) (RIGHT) in YN, CN, and MCI participants. MCI performed worse than CN and CN performed worse than YN participants on both tasks. P values are corrected for multiple comparisons using Tukey post-hoc tests. APT (Automated Phone Task), CN (clinically normal elderly), MCI (mild cognitive impairment), YN (young normal)

Test-retest reliability

Data were obtained for APT-Script and APT-PCP in CN participants using the same and alternate versions over short intervals. See Appendix for details of task versions A and B. Thirty-one participants underwent the same version after 9.0±3.1 days, yielding a Cronbach’s α value of 0.81. Eleven participants underwent an alternate version after 8.3±4.2 days, yielding a Cronbach’s α value of 0.87.

Association with regional cortical atrophy

Structural MRI data was available for 19 participants (15 CN and 4 MCI) who underwent APT-Script and APT-PCP. Multiple linear regression models examining 4 ROI revealed a significant association between reduced inferior temporal cortical thickness and worse APT-PCP performance in all participants (β=3.01, 95% CI for β=0.45, 5.57, p=0.02; R2=0.52, p=0.04 for overall model) and in CN participants alone (β=8.07, 95% CI for β=4.27, 11.87, p=0.001; R2=0.68, p=0.01 for overall model), see Figure 2. There were no significant associations with APT-Script performance.

Figure 2. Regression models showing the association between Task 2 (APT-PCP) and inferior temporal cortical thickness in all participants (LEFT) and in CN participants alone (RIGHT), adjusted for age, education, and AMNART IQ. AMNART IQ (American National Adult Reading Test intelligence quotient), APT (Automated Phone Task), CN (clinically normal elderly)

Discussion

This initial experience with the Harvard APT, a new performance-based ADL instrument, which consists of real-life, practical, quick (about 10 minutes altogether), and easy to administer tasks, is very encouraging. We demonstrated differential performance across diagnostic groups (CN, MCI, and YN), associations with processing speed, executive function, and regional cortical thinning within CN elderly independent of hearing acuity and motor speed, as well as incremental complexity for the three tasks and excellent test-retest reliability. The more complex tasks were influenced by level of education as expected but not by race. Our results suggest that the Harvard APT could be useful for screening and tracking the earliest functional alterations in preclinical and early prodromal AD.

Recently, a leading expert in the field published recommendations for an effective ADL test for preclinical AD (13). These guidelines stipulated that the test should: “1) assess cognitively complex functional abilities relevant to independent living; 2) use an interval scaled, direct performance measure that evaluates performance variables in a highly detailed and granular manner; 3) include time limitations for performance items in order to enhance item difficulty; and 4) include task completion time variables in order to capture subtle processing speed changes”. One of the few examples of ADL tests that fulfil these criteria is a new short-form version of the FCI, with which Marson and colleagues have shown subtle financial skill decline in individuals with preclinical AD participating in the Mayo Clinic Study of Aging, thus demonstrating the utility and feasibility of a performance-based ADL instrument in preclinical AD (22). Our data suggest that the newly developed Harvard APT similarly fulfils these criteria and therefore has the potential to be an effective ADL test in preclinical AD.

Among CN elderly, the more complex Harvard APT tasks (APT-PCP and APT-Bank) were associated primarily with executive function, while the simpler task (APT-Script) was associated primarily with processing speed. These findings are in agreement with prior studies in early AD, which have shown that among various cognitive domains, instrumental ADL are associated most closely with executive function (23, 24). There was no association between Harvard APT and global cognition or episodic memory. However, this could be partly due to the narrow range of MMSE and FCSRT scores in the CN sample of the current study. As expected, there was an association between performance on the more complex Harvard APT tasks and level of education and premorbid intelligence. Therefore, those tasks may require an education or IQ adjustment when used in the future. Similarly, there was an association with age and Harvard APT task performance. Such associations are commonly found with cognitive tests as well. That said, after adjusting for those variables, there was still a significant association between Harvard APT and executive function and processing speed in CN elderly. On the other hand, there was no association between Harvard APT performance and race in a sample consisting of about one quarter minorities. Furthermore, adjusting for hearing acuity and motor speed did not affect the association with executive function and processing speed. Finally, on average, participants agreed that the Harvard APT relates to their daily activities, which has been confirmed in a recent study that indicated wide use of phones, and in particular IVRS, among elderly individuals (14). These findings bode well for the generalizability of the Harvard APT and its ability to overcome common confounds in the assessment of elderly individuals.

We were also able to demonstrate that the Harvard APT can discriminate well between diagnostic groups (CN, MCI, and YN); it has a wide range of performance in CN elderly, especially for the more complex tasks, which can help avoid floor or ceiling effects; it is comprised of tasks with incremental complexity; and it has excellent test-retest reliability using the same version or an alternate version. These are important properties for a new test to possess.

A small subset of CN and MCI participants in the current study underwent structural MRI. Among those, there was an association between worse performance on the Harvard APT and greater inferior temporal atrophy, even within CN elderly alone. Inferior temporal atrophy appears early in the course of AD (25) and has been associated with impairment in instrumental ADL, measured by a conventional subjective scale, in the early AD spectrum (20). Instrumental ADL impairment in early AD has also been associated with frontal and parietal atrophy and hypometabolism (21, 26, 27, 28).

The current study had several limitations. First, the study sample was highly educated and intelligent. However, this is typical of clinical trial and imaging study samples. Therefore, participants of such studies are likely to perform on the Harvard APT similarly to the participants of the current study. On the other hand, unlike many of those types of studies, our sample consisted of about one quarter minorities, potentially increasing the generalizability of the Harvard APT. Further validation in a population-based sample will help clarify this important issue as we hope to be able to utilize this test in the clinical setting. Second, the current study had a small sample of impaired participants with MCI used to compare to the diagnostic group of interest, CN elderly. Future studies of the Harvard APT with larger samples of impaired individuals, as well as those with subjective cognitive decline, will help further characterize this test in the early AD spectrum. Third, we did not have the opportunity to compare the Harvard APT to established ADL scales, such as conventional subjective scales, which could contribute to convergent validity. However, we did have access to sensitive memory and executive function tests, the latter of which were associated with performance on the Harvard APT. Fourth, APT-Bank was not associated with APT-Script; however, APT-Bank was associated with APT-PCP and APT-PCP was associated with APT-Script. There is an incremental complexity in these tasks (APT-Script is the least complex, APT-PCP is intermediate, and APT-Bank is the most complex), which may be the reason for this differential in associations. As such, the less complex tasks may prove more useful in symptomatic individuals with MCI or mild dementia, while the more complex tasks may prove more useful in asymptomatic CN elderly or in minimally symptomatic individuals with subjective cognitive decline. Finally, only a small subset of participants in the current study had imaging data, though even with this small group we found significant associations with the Harvard APT in CN elderly.

In conclusion, the Harvard APT, a novel performance-based ADL instrument developed to target CN elderly and individuals with MCI at risk for early AD, is a promising, ecologically valid, quick, and easy to administer assessment. ADL are the practical extension of cognitive function that is of the utmost importance to patients and caregivers. Accordingly, the Food and Drug Administration (FDA) has required co-primary outcome measures of cognition and ADL for clinical trials in mild to moderate AD dementia for many years now. However, recently, the FDA issued new guidance for clinical trials in early AD allowing a single cognitive outcome measure for prevention trials in preclinical AD due to the thought that there are no meaningful changes in ADL at that stage of AD (29). That said, the guidance also stipulated the requirement to eventually demonstrate a clinically relevant benefit from the intervention, which is where a sensitive ADL assessment fits in. As such, the Harvard APT provides a unique opportunity to fill this important gap in early functional assessment both in the research and clinical setting.

Acknowledgments: We would like to thank Kamolika Roy, Meaghan Doherty, Clare Flanagan, and RipRoad for their assistance in the development of the Harvard APT.

Funding: This study was supported by K23 AG033634, R01 AG027435, K24 AG035007, the Harvard Aging Brain Study (P01 AG036694 and R01 AG037497), the Alzheimer’s Association (SGCOG-13-282201), Fidelity Biosciences Corporation, the Massachusetts Alzheimer’s Disease Research Center (P50 AG005134), and the Harvard NeuroDiscovery Center.

Disclosures: Dr. Marshall has served as a consultant for Halloran, GliaCure, and Janssen Research & Development. Ms. Dekhtyar, Mr. Bruno, Dr. Jethwani, Dr. Amariglio, and Dr. Johnson have no disclosures. Dr. Sperling has served as a consultant for Merck, Eisai, Janssen, Boehringer-Ingelheim, Isis, Lundbeck, Roche, and Genentech. Dr. Rentz has served as a consultant for Eli Lilly, Neurotrack, and Lundbeck.

References

1. Marshall GA, Amariglio RE, Sperling RA, Rentz DM. Activities of daily living: Where do they fit in the diagnosis of Alzheimer’s disease. Neurodegener Dis Manag 2012;2:483-491.

2. Albert MS, DeKosky ST, Dickson D, et al. The diagnosis of mild cognitive impairment due to Alzheimer’s disease: recommendations from the National Institute on Aging-Alzheimer’s Association workgroups on diagnostic guidelines for Alzheimer’s disease. Alzheimers Dement 2011;7:270-279.

3. Brown PJ, Devanand DP, Liu X, Caccappolo E. Functional impairment in elderly patients with mild cognitive impairment and mild Alzheimer disease. Arch Gen Psychiatry 2011;68:617-626.

4. Morris JC. Revised Criteria for Mild Cognitive Impairment May Compromise the Diagnosis of Alzheimer Disease Dementia. Arch Neurol 2012.

5. Sperling RA, Aisen PS, Beckett LA, et al. Toward defining the preclinical stages of Alzheimer’s disease: recommendations from the National Institute on Aging-Alzheimer’s Association workgroups on diagnostic guidelines for Alzheimer’s disease. Alzheimers Dement 2011;7:280-292.

6. Gold DA. An examination of instrumental activities of daily living assessment in older adults and mild cognitive impairment. J Clin Exp Neuropsychol 2012;34:11-34.

7. Schmitter-Edgecombe M, Parsey C, Cook DJ. Cognitive correlates of functional performance in older adults: comparison of self-report, direct observation, and performance-based measures. J Int Neuropsychol Soc 2011;17:853-864.

8. Loewenstein DA, Amigo E, Duara R, et al. A new scale for the assessment of functional status in Alzheimer’s disease and related disorders. J Gerontol 1989;44:P114-121.

9. Zanetti O, Frisoni GB, Rozzini L, Bianchetti A, Trabucchi M. Validity of direct assessment of functional status as a tool for measuring Alzheimer’s disease severity. Age Ageing 1998;27:615-622.

10. Cullum CM, Saine K, Chan LD, Martin-Cook K, Gray KF, Weiner MF. Performance-Based instrument to assess functional capacity in dementia: The Texas Functional Living Scale. Neuropsychiatry Neuropsychol Behav Neurol 2001;14:103-108.

11. Triebel KL, Martin R, Griffith HR, et al. Declining financial capacity in mild cognitive impairment: A 1-year longitudinal study. Neurology 2009;73:928-934.

12. Goldberg TE, Koppel J, Keehlisen L, et al. Performance-based measures of everyday function in mild cognitive impairment. Am J Psychiatry 2010;167:845-853.

13. Marson D. Investigating functional impairment in preclinical Alzheimer’s disease. J Prev Alzheimers Dis 2015;2:4-6.

14. Miller D, Gagnon M, Talbot V, Messier C. Predictors of successful communication with interactive voice response systems in older people. J Gerontol B Psychol Sci Soc Sci 2013;68:495-503.

15. Folstein MF, Folstein SE, McHugh PR. “Mini-mental state”. A practical method for grading the cognitive state of patients for the clinician. J Psychiatr Res 1975;12:189-198.

16. Grober E, Sanders AE, Hall C, Lipton RB. Free and cued selective reminding identifies very mild dementia in primary care. Alzheimer Dis Assoc Disord 2010;24:284-290.

17. Nelson HE, O’Connell A. Dementia: the estimation of premorbid intelligence levels using the New Adult Reading Test. Cortex 1978;14:234-244.

18. Reitan RM. Validity of the trail making test as an indicator of organic brain damage. Percept Mot Skills 1958;8:271-276.

19. Fischl B, Salat DH, Busa E, et al. Whole brain segmentation: automated labeling of neuroanatomical structures in the human brain. Neuron 2002;33:341-355.

20. Marshall GA, Lorius N, Locascio JJ, et al. Regional cortical thinning and cerebrospinal biomarkers predict worsening daily functioning across the Alzheimer disease spectrum. J Alzheimers Dis 2014;41:719-728.

21. Roy K, Pepin LC, Philiossaint M, et al. Regional fluorodeoxyglucose metabolism and instrumental activities of daily living across the Alzheimer’s disease spectrum. J Alzheimers Dis 2014;42:291-300.

22. Marson DC, Triebel KL, Gerstenecker A, et al. Detecting declining functional skills in preclinical Alzheimer’s disease: the Financial Capacity Instrument–short form. 10th annual conference of the International Society for CNS Clinical Trials and Methodology (ISCTM), 2014, Boston, MA.

23. Marshall GA, Rentz DM, Frey MT, et al. Executive function and instrumental activities of daily living in mild cognitive impairment and Alzheimer’s disease. Alzheimers Dement 2011;7:300-308.

24. Royall DR, Lauterbach EC, Kaufer D, et al. The cognitive correlates of functional status: a review from the Committee on Research of the American Neuropsychiatric Association. J Neuropsychiatry Clin Neurosci 2007;19:249-265.

25. McDonald CR, McEvoy LK, Gharapetian L, et al. Regional rates of neocortical atrophy from normal aging to early Alzheimer disease. Neurology 2009;73:457-465.

26. Landau SM, Harvey D, Madison CM, et al. Associations between cognitive, functional, and FDG-PET measures of decline in AD and MCI. Neurobiol Aging 2011;32:1207-1218.

27. Nadkarni NK, Levy-Cooperman N, Black SE. Functional correlates of instrumental activities of daily living in mild Alzheimer’s disease. Neurobiol Aging 2012;33:53-60.

28. Vidoni ED, Honea RA, Burns JM. Neural correlates of impaired functional independence in early Alzheimer’s disease. J Alzheimers Dis 2010;19:517-527.

29. Kozauer N, Katz R. Regulatory innovation and drug development for early-stage Alzheimer’s disease. N Engl J Med 2013;368:1169-1171.

Appendix: Harvard APT Subject Instructions, Scoring, and Schemas (Version A)

Initial Dial in:

Note: This portion is not counted toward the subject’s performance.

In order to test for deficits in hearing acuity and motor speed, the subject will first be asked to enter 8 numbers separately, and the system will record whether the correct number was entered and how much time it took to enter each number.

The subject ID will be 5 digits long; the first 2 digits will determine the task (18 = Task 1; 26 = Task 2; 49 = Task 3); the last 3 digits will identify the subject.

Subjects will dial in to the same number 3 separate times in order to complete the 3 tasks.
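
The dial-in check described above can be illustrated with a minimal Python sketch. It assumes each of the 8 entries is logged as the prompted number, the number keyed in, and the time taken in seconds; the names `DialInEntry` and `summarize_dial_in`, the choice of seconds, and the two summary values are illustrative assumptions and not part of the Harvard APT system.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List


@dataclass(frozen=True)
class DialInEntry:
    prompted: str    # number the subject was asked to enter
    entered: str     # number the subject actually keyed in
    seconds: float   # time taken to enter it (unit assumed)


def summarize_dial_in(entries: List[DialInEntry]) -> dict:
    """Count correct entries and average the entry time over the 8 prompts."""
    if len(entries) != 8:
        raise ValueError("the dial-in check uses exactly 8 entries")
    return {
        "n_correct": sum(e.prompted == e.entered for e in entries),
        "mean_seconds": mean(e.seconds for e in entries),
    }
```

Because this portion is not counted toward performance, such a summary would serve only as a screen for hearing or motor difficulty, not as a score.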

Task 1: Refilling a Prescription (APT-Script)

Instructions for subject:

The purpose of this task is to have you use a pharmacy automated phone menu to refill a prescription.

You will be given a pill bottle with a label that includes a pharmacy phone number (9-1-917-525-4971), a prescription number (487216), and a medication name (simvastatin), dose (20 mg orally), frequency (every day), quantity of pills (30), and number of refills (2). You will use this information to complete the task.

To start the task, please use our phone to call the pharmacy phone number that appears on the pill bottle.

Your subject ID is: 18XXX

* Items in parentheses above appear on the pill bottle and not on the written instructions.

Scoring:

The task will be scored based on total time (until disconnected), number of errors, number of repetitions of steps, and correct completion.
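
As an illustration, the four scoring components listed above could be held in a single record per task. This is a minimal Python sketch: the class name `TaskScore`, the field names, and the use of seconds are assumptions, and the appendix does not state how the components are combined, so no composite score is computed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskScore:
    """The four scoring components listed for one Harvard APT task."""
    total_time_s: float        # time until disconnected, in seconds (unit assumed)
    errors: int                # number of errors
    repetitions: int           # number of repetitions of steps
    completed_correctly: bool  # whether the task was completed correctly

    def __post_init__(self):
        # Times and counts cannot be negative.
        if self.total_time_s < 0 or self.errors < 0 or self.repetitions < 0:
            raise ValueError("time and counts must be non-negative")


# Hypothetical values for illustration only, not study data.
example = TaskScore(total_time_s=142.0, errors=1, repetitions=2, completed_correctly=True)
```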

  

Task 2: Selecting a New Primary Care Physician (APT-PCP)

Instructions for subject:

The purpose of this task is to have you use a health insurance company automated phone menu to select a new primary care physician.

You will be given a health insurance member card which includes a member ID number (CDW758421693) and a member services phone number (9-1-917-525-4971). You will use this information to complete the task.

To start the task, please use our phone to call the member services phone number that appears on the card.

Your subject ID is: 26XXX

The name and address of the primary care physician you will select is: Dr. John Smith in Boston, MA.

* Items in parentheses above appear on the health insurance member card and not on the written instructions.

Scoring:

The task will be scored based on total time (until disconnected), number of errors, number of repetitions of steps, and correct completion.

    

Task 3: Bank Account Transfer and Payment (APT-Bank)

Instructions for subject:

The purpose of this task is to have you use a bank automated phone menu to make a bank account transfer in order to have enough money to pay your Federal taxes for the year.

You will be given a bank account summary that contains your checking account number (0095834881), your savings account number (0095834899), and the bank phone number (9-1-917-525-4971). You will use this information to complete the task. You will be provided with your account balances once you call in.

Amounts of money are in dollars only (not in cents).

You should take notes on a piece of paper in order to complete the task.

To start the task, please use our phone to call the bank phone number that appears on the summary.

Your subject ID is: 49XXX

The amount you owe in Federal taxes is $4,150. You will need to make the payment from your checking account.

* Items in parentheses above appear on the bank account summary and not on the written instructions.
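
The arithmetic this task asks of the subject can be sketched in Python as follows. The sketch assumes the transfer goes from savings to checking and computes the smallest whole-dollar amount that lets checking cover the tax payment. The function name `transfer_needed` and the checking balance are hypothetical, since the actual balances are provided only during the call.

```python
def transfer_needed(taxes_owed: int, checking_balance: int) -> int:
    """Smallest whole-dollar transfer into checking that covers the payment."""
    if taxes_owed < 0 or checking_balance < 0:
        raise ValueError("amounts must be non-negative whole dollars")
    return max(0, taxes_owed - checking_balance)


TAXES_OWED = 4150   # from the Version A instructions
checking = 3000     # hypothetical balance, for illustration only
print(transfer_needed(TAXES_OWED, checking))  # 1150
```

If the checking balance already covers the amount owed, the function returns 0 and no transfer is needed.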

Scoring:

The task will be scored based on total time (until disconnected), number of errors, number of repetitions of steps, and correct completion.

  

Harvard APT Subject Instructions, Scoring, and Schemas (Version B)

Initial Dial in:

Note: This portion is not counted toward the subject’s performance.

In order to test for deficits in hearing acuity and motor speed, the subject will first be asked to enter 8 numbers separately, and the system will record whether the correct number was entered and how much time it took to enter each number.

The subject ID will be 5 digits long; the first 2 digits will determine the task (19 = Task 1; 27 = Task 2; 50 = Task 3); the last 3 digits will identify the subject.

Subjects will dial in to the same number 3 separate times in order to complete the 3 tasks.
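
The subject ID scheme described for Versions A and B can be sketched as a small Python parser. The two-digit task codes are taken from the appendix; the function name `parse_subject_id`, the returned tuple layout, and the example ID are illustrative assumptions, not the actual system's implementation.

```python
from typing import Tuple

# Two-digit task prefixes as listed in the appendix for each version.
TASK_CODES = {
    "A": {"18": 1, "26": 2, "49": 3},
    "B": {"19": 1, "27": 2, "50": 3},
}


def parse_subject_id(subject_id: str) -> Tuple[str, int, str]:
    """Return (version, task number, 3-digit subject identifier)."""
    if len(subject_id) != 5 or not (subject_id.isascii() and subject_id.isdigit()):
        raise ValueError("subject ID must be exactly 5 digits")
    prefix, subject = subject_id[:2], subject_id[2:]
    for version, codes in TASK_CODES.items():
        if prefix in codes:
            return version, codes[prefix], subject
    raise ValueError(f"unknown task prefix: {prefix}")


# Hypothetical subject 007 dialing in for Task 3 of Version A.
print(parse_subject_id("49007"))  # ('A', 3, '007')
```

Keeping the subject portion as a string preserves leading zeros in the 3-digit identifier.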

 

Task 1: Refilling a Prescription (APT-Script)

Instructions for subject:

The purpose of this task is to have you use a pharmacy automated phone menu to refill a prescription.

You will be given a pill bottle with a label that includes a pharmacy phone number (9-1-917-525-4971), a prescription number (394172), and a medication name (lisinopril), dose (5 mg orally), frequency (every day), quantity of pills (30), and number of refills (3). You will use this information to complete the task.

To start the task, please use our phone to call the pharmacy phone number that appears on the pill bottle.

Your subject ID is: 19XXX

* Items in parentheses above appear on the pill bottle and not on the written instructions.

Scoring:

The task will be scored based on total time (until disconnected), number of errors, number of repetitions of steps, and correct completion.

  

Task 2: Selecting a New Primary Care Physician (APT-PCP)

Instructions for subject:

The purpose of this task is to have you use a health insurance company automated phone menu to select a new primary care physician.

You will be given a health insurance member card which includes a member ID number (CDW946382715) and a member services phone number (9-1-917-525-4971). You will use this information to complete the task.

To start the task, please use our phone to call the member services phone number that appears on the card.

Your subject ID is: 27XXX

The name and address of the primary care physician you will select is: Dr. Paul Jones in Denver, CO.

* Items in parentheses above appear on the health insurance member card and not on the written instructions.

Scoring:

The task will be scored based on total time (until disconnected), number of errors, number of repetitions of steps, and correct completion.

  

Task 3: Bank Account Transfer and Payment (APT-Bank)

Instructions for subject:

The purpose of this task is to have you use a bank automated phone menu to make a bank account transfer in order to have enough money to pay your Federal taxes for the year.

You will be given a bank account summary that contains your checking account number (0084925772), your savings account number (0084925788), and the bank phone number (9-1-917-525-4971). You will use this information to complete the task. You will be provided with your account balances once you call in.

Amounts of money are in dollars only (not in cents).

You should take notes on a piece of paper in order to complete the task.

To start the task, please use our phone to call the bank phone number that appears on the summary.

Your subject ID is: 50XXX

The amount you owe in Federal taxes is $5,350. You will need to make the payment from your checking account.

* Items in parentheses above appear on the bank account summary and not on the written instructions.

Scoring:

The task will be scored based on total time (until disconnected), number of errors, number of repetitions of steps, and correct completion.

 

 

 Self-rating of Harvard APT

Subject ID: _______________ Date: ______

After completing the 3 tasks, please answer the following question:

These tasks relate to your daily activities?

1. Strongly agree

2. Agree

3. Neither agree nor disagree

4. Disagree

5. Strongly disagree

Also, please circle one of the following options for each task.

Task 1: Pharmacy task (APT-Script)

The task you just completed was:

1. Very easy

2. Easy

3. Neither easy nor hard

4. Hard

5. Very hard

Task 2: Health Insurance Company task (APT-PCP)

The task you just completed was:

1. Very easy

2. Easy

3. Neither easy nor hard

4. Hard

5. Very hard

Task 3: Bank task (APT-Bank)

The task you just completed was:

1. Very easy

2. Easy

3. Neither easy nor hard

4. Hard

5. Very hard