Introduction
In 2010, Aboriginal communities in remote north Western Australia initiated Australia’s first study of the prevalence of Fetal Alcohol Spectrum Disorders (FASD) to better understand the support services required to assist children and their families into the future [
1]. This study, called the Lililwan Project, arose following concerns from Aboriginal leaders about the effect that high-risk drinking was having on the development of children within their communities [
1] and the potential for FASD. FASD refers to a spectrum of lifelong physical, behavioural and neurodevelopmental disorders resulting from brain injury caused by prenatal alcohol exposure (PAE) [
2,
3]. Clinicians suspect that FASD may affect 30% or more of the population in some remote Australian Aboriginal communities where drinking rates are high [
4]. The Lililwan Project will provide the first data for these communities.
Diagnostic process
Diagnosis of FASD is complex, involving assessment for facial dysmorphology, growth deficiency and central nervous system (CNS) impairment or structural abnormalities. CNS impairment may manifest as deficits in memory, cognition, executive function, adaptive behaviour, sensory processing and language, as well as deficits in fine motor (FM) and gross motor (GM) function [
5,
6]. Current diagnostic systems for FASD include the University of Washington 4-Digit Diagnostic Code [
5], the Canadian Guidelines [
6], the Institute of Medicine criteria [
2] and the Centers for Disease Control and Prevention guidelines [
7]. These systems agree on many aspects including the assessment of FM skills but only some include assessment of GM skills [
5‐
7]. Physical activities are central to Australian Aboriginal culture; including GM assessment within FASD diagnostic procedures therefore captures a culturally relevant aspect of CNS function for children growing up in the Fitzroy Valley.
The Canadian Guidelines were applied to determine the prevalence of FASD amongst the children in the Lililwan Project cohort (n = 108). They require the assessment of both GM and FM functioning with standardised assessment tools using predefined cut-offs for impairment at 2 standard deviations (SD) below the population mean (< 3rd percentile) [
6]. Within the diagnostic framework, these skills are assessed during the evaluation of nine domains of CNS impairment. GM and FM functioning fall into the first of these domains under the category of
hard and soft neurologic signs (including sensory motor signs).
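The relationship between the 2 SD cut-off and the 3rd percentile can be illustrated with a minimal sketch. It assumes that standardised scores are normally distributed and that a score at or below the cut-off counts as impaired; the function names are hypothetical and are not taken from the Canadian Guidelines or from any assessment manual.

```python
from statistics import NormalDist


def percentile_rank(z_score: float) -> float:
    """Percentile rank of a z-score, assuming a standard normal distribution."""
    return 100.0 * NormalDist().cdf(z_score)


def meets_impairment_cutoff(z_score: float, cutoff_sd: float = -2.0) -> bool:
    """True if the score lies at or below the cut-off (assumed to be inclusive)."""
    return z_score <= cutoff_sd


# A score 2 SD below the mean sits at roughly the 2.3rd percentile,
# which is why the cut-off is described as "< 3rd percentile".
print(round(percentile_rank(-2.0), 1))
print(meets_impairment_cutoff(-2.4))
```

Under these assumptions, a cut-off expressed in standard deviations and one expressed in percentile ranks identify the same children, so a tool that reports either metric can be checked against the diagnostic threshold.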
Recommendations exist within some international FASD diagnostic criteria [
5,
6,
8] regarding appropriate standardised assessment tools to test motor proficiency in children with PAE, but further guidelines are needed regarding age and cultural suitability. Other elements that need consideration in assessment tool selection are validity, established reliability in children with PAE, and the ability to assess mild to moderate motor impairment. In addition, as FASD is now recognised by the World Health Organization as the leading preventable non-genetic cause of mental retardation [
9], the tool must be able to be administered accurately in the presence of intellectual impairment. Furthermore, to satisfy FASD diagnostic cut-offs, assessment outcomes need to be reported in percentile ranks or standard deviations.
To determine the most appropriate standardised assessment tool for measuring motor skills in the Lililwan Project cohort (i) a literature review was conducted; (ii) national paediatric physiotherapy networks were canvassed through a phone survey by contacting all of the Children’s Hospitals within Australia (n = 6); and (iii) representatives of national and international FASD networks were surveyed during informal discussions at the 4th International Conference on FASD, Vancouver, March 2011.
A comprehensive literature review covering children aged 7 to 9 years revealed five studies in which GM performance was included in the motor assessment of children with a FASD diagnosis or with prenatal exposure to alcohol [
10‐
14]. These studies used seven different standardised GM assessment tools, namely: Griffith Mental Developmental Scale (GMDS) [
10], Pediatric Early Elementary Examination Second Edition (PEEX2) [
11], Pediatric Examination of Educational Readiness Second Edition (PEERAMID 2) [
11], Clinical Observations of Motor and Postural Skills (COMPS) [
12], Movement Assessment Battery for Children (Movement ABC) [
12], Modified Bruininks-Oseretsky Test of Motor Proficiency (BOTMP) [
13] and McCarthy Scales of Children’s Abilities (MSCA) [
14]. On further investigation only the Movement ABC and BOTMP were found to be comprehensive motor assessments. Recommendations from FASD diagnostic guidelines [
5,
6,
8] were also reviewed with the following standardised assessment tools recommended: Movement ABC [
6], BOTMP [
6], Bruininks Oseretsky Test of Motor Proficiency Second Edition (BOT-2) [
8], Alberta Infant Motor Scale (AIMS) [
6], Peabody Developmental Motor Scales Second Edition (PDMS-2) [
6,
8], Miller Function and Participation Scales (M-FUN) [
8] and the Bayley Scales of Infant Development Second Edition (BSID II) [
5]. Further review of these assessment tools found only the BOT-2 and Movement ABC were applicable based on age appropriateness, cultural suitability and comprehensive assessment design.
Respondents to the phone survey of Australian Children’s Hospital Physiotherapy Outpatient Departments (n = 6) recommended the same two motor assessments in their revised versions: the Movement ABC Second Edition (Movement ABC-2) [
15] and the BOT-2 [
16]. Papers describing the clinimetric properties of each of these tools were reviewed [
17‐
20] and their appropriateness for use in a remote Aboriginal community was considered.
In discussions at the 4th International Conference on FASD, Vancouver, March 2011, clinicians from international FASD services unanimously agreed that the BOT-2 was the motor assessment tool of choice because of its comprehensive assessment design and sensitivity in detecting motor impairment [
16].
BOT-2 testing involves game-like motor tasks which capture the child’s interest and are not verbally complex [
21] and are therefore suitable for children from non-English-speaking backgrounds. The authors report that it can identify motor deficits in individuals with “mild to moderate” motor impairment and is validated and reliable for assessing subjects with “mild to moderate” mental retardation [
16]. Importantly, both aspects fit the profile of children with a FASD diagnosis. The earlier version, the BOTMP [
22], is a widely used standardised assessment tool with a long history of use in clinical practice and research. It is often used as the standard for the criterion validation of other motor tests [
23]. Both the Complete Form (CF) and Short Form (SF) of the BOT-2 report score outcomes in percentile ranks, thus satisfying requirements for use in internationally recognised FASD diagnostic processes. Furthermore, the motor activities incorporated within the BOT-2 include GM tasks that assess hopping, jumping, running, ball skills, balance, strength and co-ordination, and FM tasks that assess precision, integration and manual dexterity through drawing, writing and functional tasks such as threading blocks. Through interviews with community members, we established that these motor tasks are consistent with the motor activities of Fitzroy Valley children at school and in recreational time. As yet, the reliability of the BOT-2 CF or BOT-2 SF has not been established either in children exposed to alcohol
in utero or for the motor assessment of Australian Aboriginal children.
The BOT-2 authors report that BOT-2 SF was designed as a screening tool to identify children with motor deficits who may benefit from further comprehensive testing for diagnostic purposes or intervention activities [
16]. Whilst the Lililwan Project FASD prevalence study used the more comprehensive BOT-2 CF, the reliability study used the shorter BOT-2 SF in order to minimise assessment fatigue as the reliability study was conducted in addition to the concurrent FASD prevalence study. Pilot testing had indicated that a reliability study involving the BOT-2 CF may be too exhausting given each child participating in the Lililwan Project underwent approximately 6 hours of interdisciplinary assessments over two days (including the BOT-2 CF assessment) as part of the FASD diagnostic process [
24]. Even though the Lililwan Project ran over a 6-month period, the assessment team had little flexibility in timetabling assessments, and this was compounded by the remoteness of most communities. The Lililwan Project team visited each community for a limited time, during which assessments, data entry, FASD diagnosis (and other diagnoses) and individual management plans needed to be completed. For these reasons, a limited sample (n = 30) of the Lililwan Project cohort (n = 108) was recruited for reliability testing using the shorter BOT-2 SF, which takes approximately 20 minutes to complete compared with 60 minutes for the BOT-2 CF. The 14 test items in the BOT-2 SF are included within the BOT-2 CF, enabling comparison of these 14 key items between the BOT-2 SF and the BOT-2 CF to determine the test-retest reliability. Correlation between the BOT-2 CF and SF is not provided by the BOT-2 authors [
16]. However, a study using the earlier BOTMP version reported a high correlation between the CF and SF total composite scores using Pearson’s product–moment coefficients [
r = 0.85 (95% CI, 0.80 – 0.89)] [
25].
Measurement of change
Of further benefit is the provision of cut-offs that indicate true change in a subject’s performance at a second assessment point, that is, change attributable to intervening factors such as a therapy program rather than to measurement error. The standard error of measurement (SEM) reflects the degree to which a measurement can vary as a result of error in the measurement process [
26]. The minimal detectable change (MDC) shows which changes fall outside the measurement error range, i.e. changes greater than the MDC can be attributed to real change and not to measurement error [
27]. The SEM and MDC are based on test-retest reliability in stable persons. They are both estimates of the extent of measurement error based on the standard deviation (SD) and reliability value, and are readily interpretable as they are given in the same units of measurement as the instrument under examination [
26,
27]. As the BOT-2 SF is a concise motor assessment designed as a screening tool, these estimates are calculated for the BOT-2 SF outcome scores rather than for the 14 individual test items.
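As an illustration of how these two estimates follow from the SD and the reliability value, the sketch below uses the formulas commonly given in the measurement literature: SEM = SD × √(1 − r) and MDC = z × √2 × SEM. It is an assumption that these are the forms applied here; the choice of a 95% confidence level (z = 1.96) and the example values (SD = 10, r = 0.84) are hypothetical and are not results from this study.

```python
import math


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """SEM = SD * sqrt(1 - r), where r is a reliability coefficient in [0, 1]."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


def minimal_detectable_change(sem: float, z: float = 1.96) -> float:
    """MDC = z * sqrt(2) * SEM; sqrt(2) reflects error in both test and retest."""
    return z * math.sqrt(2.0) * sem


# Hypothetical values for illustration only (not data from this study).
sem = standard_error_of_measurement(sd=10.0, reliability=0.84)
mdc = minimal_detectable_change(sem)
print(round(sem, 2), round(mdc, 2))
```

Both results are expressed in the same units as the input SD, so, under these assumptions, a retest score would need to differ from the first score by more than the MDC before the difference is interpreted as real change.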
The aims of this study were to:
1. determine the inter-rater and test-retest reliability of the BOT-2 SF amongst a convenience sample of children (n = 30) selected from the group of children born in 2002 or 2003 participating in the Lililwan Project cohort (n = 108) where over 50% of mothers drank alcohol during pregnancy.
2. estimate the SEM and MDC for the BOT-2 SF score sheet outcomes (standard scores and percentile ranks).