Introduction
Fatigue is one of the most commonly reported symptoms in motor neurone disease (MND) [1, 2]. The etiology of this symptom is not yet fully understood, and its progression and salience vary between individuals. It has been shown to be associated with poor quality of life (QoL) [1], though there is some debate as to its precise relationship with concomitant disease factors, including depression [2].
Fatigue is an essentially subjective phenomenon; clinically, it remains undefined because of the overlap between the lay notion of tiredness and the clinically relevant symptom of fatigue [3]. In addition, fatigue may be confounded with loss of motivation or other symptoms. The symptom of fatigue extends beyond muscular fatigability or weakness; it is distinct from depression and does not necessarily correlate with severity of disease [4]. Recent evidence supports the notion that fatigue in MND is an independent factor not directly associated with depression, dyspnoea or sleepiness [2].
The lack of research relating to fatigue in this population may be due in part to the lack of tools available to measure the experience of fatigue in MND accurately. There are currently no MND-specific scales for measuring fatigue, and it is long established that generic questionnaires may be insensitive to the unique experience of a patient with MND [5]. Similarly, it has been demonstrated that the experience of fatigue may differ among neurological conditions [3]. In light of these considerations, there is a clear need to develop and validate a disease-specific fatigue inventory for patients with MND. Without a valid tool for measuring and comparing levels of fatigue in this population, there is little hope of developing treatment modalities that allow this disabling symptom to be better managed.
The objective of this research is to develop a disease-specific measure of fatigue in patients with motor neurone disease (MND) by generating data that would fit the Rasch measurement model.
Methods
The Neurological Fatigue Index for MND (NFI-MND) was developed in two stages: a confirmatory qualitative phase followed by a stage of formal psychometric assessment. Ethical permission was granted for both phases by the relevant hospital committees in the U.K. (Sefton 05/Q0401/7 and Tayside 07/S1402/64) and by local research governance committees at all participating sites.
Qualitative methodology was used to assess patient perception of fatigue in MND. A sample of 10 patients who had reported experiences of fatigue were interviewed at the time of their clinical visit. Participants all had a diagnosis of MND from a neurologist with expertise in MND. The interviews commenced with an open-ended question asking patients to describe their experience of fatigue. The interviews were then extended into a semi-structured format in which issues relating to fatigue derived from interviews with other samples of patients with neurological illness (including multiple sclerosis (MS) and stroke) were explored with the patients. In accordance with interpretative phenomenological analysis (IPA) guidelines [6], an a priori sample of ten patients was hypothesised to be sufficient to investigate the phenomenon of fatigue in patients with MND.
All patients who completed the qualitative interviews were then presented with the original pool of 52 items related to fatigue, developed initially for use in MS [7]. They were asked to comment on the relevance of the item set for MND and whether or not the items were understandable. The qualitative methodology is described in further detail elsewhere [8]. In addition, the MND qualitative data were compared to previously derived themes in MS for the emergence of new themes.
The psychometric and scaling properties of the proposed 52-item NFI-MND were then assessed among patients recruited from five regional MND care centres: The Walton Centre for Neurology and Neurosurgery in Liverpool, Preston Royal Hospital, Oxford John Radcliffe Hospital, Salford Hope Hospital and Sheffield Royal Hallamshire Hospital. Patients were eligible to enter the study irrespective of age, sex, disease sub-type or disability status. Questionnaires were either handed out during a routine clinic appointment or sent to the patient's home as part of a larger questionnaire pack, alongside a newsletter describing the research activities of their local care centre. A subsample of patients completed the Modified Fatigue Impact Scale [9]. Two to four weeks after completing the first questionnaire, patients were invited to complete a second questionnaire to assess test-retest reliability.
The Rasch measurement model was used to evaluate the scaling properties and construct validity of the 52-item draft questionnaire [10]. The Rasch model supplements the traditional psychometric assessments of reliability and construct validity by also evaluating the fundamental scaling properties of an instrument. The model operationalises the formal axioms of measurement (order, unidimensionality and additivity), allowing interval-level data to be obtained from questionnaires [11]. In the context of fatigue, the Rasch model simply states that the probability of a person affirming an item is a logistic function of the difference between the symptom severity the person experiences and the severity of the symptom expressed by the question. For example, if a person with a very low level of fatigue attempts a question that expresses a high level of fatigue, there is a high probability that they will not affirm the item. A detailed explanation and a more comprehensive review of Rasch methods may be found elsewhere [12].
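As an illustration of the logistic relationship described above, the following is a minimal sketch of the dichotomous form of the Rasch model. It is a simplification: the present study used a polytomous model, and the function name, variable names and logit values here are hypothetical, chosen only for the example.

```python
import math


def rasch_probability(person_fatigue: float, item_severity: float) -> float:
    """Probability of affirming an item under the dichotomous Rasch model.

    Both arguments are locations on the same logit scale (illustrative values,
    not estimates from the study).
    """
    return 1.0 / (1.0 + math.exp(-(person_fatigue - item_severity)))


# A person with low fatigue facing an item that expresses severe fatigue:
# the probability of affirming the item is small.
low_person_severe_item = rasch_probability(person_fatigue=-2.0, item_severity=2.0)

# When person and item locations coincide, the probability is exactly 0.5.
matched = rasch_probability(person_fatigue=0.5, item_severity=0.5)

print(f"low fatigue, severe item: {low_person_severe_item:.3f}")
print(f"matched locations:        {matched:.3f}")
```

Only the difference between the two locations enters the formula, which is what allows persons and items to be placed on a common scale.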
To assess external validity, a visual analogue scale (VAS) of fatigue was included with the questionnaire pack. The question was marked on a 0-100 scale and prompted respondents to "Mark on the line, how severe your fatigue has been over the past 4 weeks". The VAS extremes were marked as 'Lively and alert' at the lower extreme and 'Absolutely no energy to do anything at all' at the upper.
Analysis Procedure
An initial exploratory factor analysis (EFA) based on a polychoric correlation matrix was undertaken, followed by an oblique Promax rotation. The objective at this stage was to avoid bringing any serious multidimensionality to the Rasch analysis. Thus, the EFA gave an indication of the dimensionality of the draft scale prior to more rigorous tests of unidimensionality within the Rasch analysis [13]. Consequently, a parsimonious solution was sought from the EFA, where a root mean square error of approximation (RMSEA) value below .10 was considered suitable [14].
Fit to the Rasch model
Data are required to meet Rasch model expectations, and a number of fit statistics are used for this purpose. Fit is indicated by a non-significant summary chi-square statistic. Person and Item fit is also represented by residual mean values, where the summary fit standard deviation falls below 1.4, and individual person and item residuals fall within the range of ± 2.5.
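The residual-based criteria above can be expressed as a simple screening step. The sketch below assumes that fit residuals have already been produced by Rasch software; the function name, item labels and residual values are hypothetical and are not results from this study.

```python
from statistics import stdev


def summarise_fit(residuals, sd_limit=1.4, residual_limit=2.5):
    """Screen fit residuals against the thresholds quoted in the text.

    residuals: dict mapping an item (or person) label to its fit residual.
    Returns whether the residual standard deviation is below sd_limit and
    which labels have residuals outside +/- residual_limit.
    """
    values = list(residuals.values())
    misfitting = [label for label, r in residuals.items() if abs(r) > residual_limit]
    return {"sd_ok": stdev(values) < sd_limit, "misfitting": misfitting}


# Hypothetical item fit residuals, for illustration only.
example_residuals = {"item_1": 0.3, "item_2": -1.2, "item_3": 2.9, "item_4": -0.4}
fit_summary = summarise_fit(example_residuals)
print(fit_summary)
```

A screen of this kind would complement, not replace, the summary chi-square statistic, which is not computed here.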
Local dependency
An assumption of the Rasch model is that items are locally independent, conditional upon the trait being measured (i.e. fatigue). Violation of this assumption (local dependency) is identified by residual item correlations of +.3 and above. Where local dependency occurs, items are too similar, and this artificially inflates reliability. This can be accommodated by summing the items together into one 'super' item, known as a testlet.
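The local dependency check and the testlet remedy can be sketched as follows. This is an illustrative outline, assuming that item residuals are available as one list per item with one value per person; the function names and data are hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def flag_dependent_pairs(residuals, threshold=0.3):
    """Return item pairs whose residual correlation is at or above threshold."""
    flagged = []
    for a, b in combinations(sorted(residuals), 2):
        r = pearson(residuals[a], residuals[b])
        if r >= threshold:
            flagged.append((a, b, round(r, 3)))
    return flagged


def make_testlet(responses, items):
    """Sum the raw scores of the given items, person by person, into one super item."""
    return [sum(scores) for scores in zip(*(responses[i] for i in items))]


# Hypothetical residuals for three items across five people.
residuals = {
    "item_a": [1.0, -0.5, 0.2, -1.1, 0.4],
    "item_b": [0.9, -0.4, 0.3, -1.0, 0.2],
    "item_c": [-0.3, 0.8, -0.6, 0.1, 0.0],
}
flagged = flag_dependent_pairs(residuals)

# Hypothetical raw item scores for three people, combined into a testlet.
testlet = make_testlet({"item_a": [0, 1, 2], "item_b": [1, 1, 3]}, ["item_a", "item_b"])
print(flagged, testlet)
```

After forming a testlet, the scale would be refitted, since the super item has a wider score range than either of its component items.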
Differential Item Functioning (DIF) [15]
Differential Item Functioning (DIF) occurs when different groups within the sample (e.g. males and females) respond in a different way to a certain question, given the same level of the underlying trait (i.e. fatigue). DIF would occur, for example, if men consistently gave a higher score to an item than women with the same level of fatigue. Analysis of variance (ANOVA, 5% alpha) is used to detect DIF. In the current study DIF was assessed for five factors: Test/Retest; Location (Liverpool/Oxford/Preston/Salford/Sheffield); Mode of Administration (clinic/delivered to home); Age (quartile split between participants) and Gender. DIF analysis examines whether the scale is invariant across such contextual factors, preventing them from becoming a source of confounding in the phenomenon being measured.
Item Category Thresholds
The Rasch model also allows for a detailed analysis of the way in which response categories are understood by respondents. For example, in the case of a Likert style response, some respondents may have difficulty differentiating between categories, such as "Never" or "Very Rarely". In instances where there is too little discrimination between two response categories on an item, collapsing the categories into one response option can often improve scale fit to the Rasch model.
Person Separation Index
This indicates the extent to which items distinguish between distinct levels of functioning (where .7 is considered a minimal value for group use; .85 for individual patient use).
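For orientation, a separation index of this kind is commonly computed as the proportion of variance in the person estimates that is not attributable to measurement error. The sketch below uses that common formulation and the sample variance; both are assumptions rather than a description of the exact computation in the software used here, and the numbers are hypothetical.

```python
from statistics import mean, variance


def person_separation_index(estimates, standard_errors):
    """Share of observed variance in person estimates not due to measurement error.

    estimates: person location estimates (logits).
    standard_errors: the standard error attached to each estimate.
    """
    observed_variance = variance(estimates)  # sample variance (assumption)
    error_variance = mean(se ** 2 for se in standard_errors)
    return (observed_variance - error_variance) / observed_variance


# Hypothetical person estimates and standard errors, for illustration only.
psi = person_separation_index([-2.0, -1.0, 0.0, 1.0, 2.0], [0.5] * 5)
print(f"PSI = {psi:.2f}")
```

Larger standard errors relative to the spread of person estimates lower the index, which is why short scales tend to separate persons less well.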
Unidimensionality
Finally, a series of independent t-tests is employed to assess the final scale for unidimensionality. Two person estimates are derived, one from the items with high positive loadings and one from the items with high negative loadings on the first principal component of the residuals. These estimates are compared for each person using individual t-tests. The proportion of t-tests falling outside the ± 1.96 range indicates whether the scale is unidimensional or not. Generally, a scale is considered unidimensional when fewer than 5% of the t-tests are significant (or when the lower bound of the binomial confidence interval overlaps 5%) [12].
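The t-test procedure can be outlined in code as follows. This is a sketch under stated assumptions: each person's two estimates and their standard errors are taken as given, and the binomial confidence interval uses a normal approximation, which may differ from the interval used by the authors. All names and values are hypothetical.

```python
import math


def unidimensionality_check(pairs, critical=1.96, target=0.05):
    """Proportion of significant person-level t-tests, with a lower CI bound.

    pairs: list of (estimate_pos, se_pos, estimate_neg, se_neg), one per person,
    where the two estimates come from the positively and negatively loading
    item subsets.
    """
    n = len(pairs)
    significant = 0
    for est_a, se_a, est_b, se_b in pairs:
        t = (est_a - est_b) / math.sqrt(se_a ** 2 + se_b ** 2)
        if abs(t) > critical:
            significant += 1
    proportion = significant / n
    # Normal-approximation binomial interval (assumption), truncated at zero.
    half_width = critical * math.sqrt(proportion * (1 - proportion) / n)
    lower_bound = max(0.0, proportion - half_width)
    return proportion, lower_bound, lower_bound <= target


# Hypothetical data: 18 people with near-identical estimates, 2 with large differences.
people = [(0.0, 0.5, 0.1, 0.5)] * 18 + [(2.0, 0.5, 0.0, 0.5)] * 2
proportion, lower_bound, acceptable = unidimensionality_check(people)
print(proportion, lower_bound, acceptable)
```

With small samples the interval is wide, so a proportion above 5% can still be compatible with unidimensionality, as in this example.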
Scale item reduction
Items are removed where necessary one at a time. Once an item is removed from a scale the resultant scale is reassessed for fit, dimensionality, local dependency and DIF. This iterative process is repeated until an acceptable solution is found for the scale.
The unrestricted 'partial credit' Rasch polytomous model was used with conditional pair-wise parameter estimation [16]. Rasch Unidimensional Measurement Model 2020 (RUMM2020) software (Version 4.1, Build 194) was used for the Rasch analyses presented in this study [17].
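To make the polytomous model concrete, the sketch below computes response-category probabilities under the standard partial credit formulation, in which each item has its own set of thresholds. It illustrates the model only; it does not reproduce the pair-wise conditional estimation performed by the software, and the function name and threshold values are hypothetical.

```python
import math


def pcm_category_probabilities(theta, thresholds):
    """Category probabilities for one item under the partial credit model.

    theta: person location (logits).
    thresholds: the item's threshold locations, one per step between
    adjacent response categories.
    Returns a list of len(thresholds) + 1 probabilities, for scores 0..m.
    """
    # Cumulative sums of (theta - threshold); the score-0 term is 0 by convention.
    cumulative = [0.0]
    for tau in thresholds:
        cumulative.append(cumulative[-1] + (theta - tau))
    weights = [math.exp(c) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical four-category item with thresholds at -1, 0 and 1 logits.
probabilities = pcm_category_probabilities(theta=0.0, thresholds=[-1.0, 0.0, 1.0])
print([round(p, 3) for p in probabilities])
```

With a single threshold the expression reduces to the dichotomous Rasch model, so the polytomous case can be read as a direct extension of it.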
Discussion
The purpose of this study was to develop and validate a disease-specific instrument for measuring fatigue in patients with MND. Qualitative analysis confirmed the suitability of a previously identified 52-item neurological fatigue set. Rasch model expectations were met after correctly ordering the item set into salient factors and removing misfitting items.
As expected for this functionally limited population, the themes of the final scale were not heavily focussed around fatigue following strenuous exercise. Generic instruments, such as the Fatigue Severity Scale [19], include items assessing fatigue following levels of exertion that are simply not possible for patients in the later, disabling stages of MND. For example, the Multidimensional Fatigue Inventory [20] measures fatigue over 20 items split into 5 dimensions: General Fatigue, Physical Fatigue, Mental Fatigue, Reduced Motivation and Reduced Activity. Our qualitative findings suggest that patients with MND not only make minimal reference to activity-determined fatigue in the classical sense (i.e. following exercises such as running) but also report fewer experiences of mental fatigue than patients with other neurological disorders, such as multiple sclerosis [21].
There are some limitations to the study. Whilst we endeavoured to obtain a representative sample, most patients were recruited initially either at a routine clinic appointment or where the patient was known to the clinical team to be interested in research. Selecting patients in this manner may have skewed the sample toward patients at early stages of the disease rather than those nearing the end stage, although ALSFRS-R scores suggested a wide spread of disability within our sample. Additionally, the number of ALSFRS-R responses restricts the power of correlations to detect changes below magnitudes of r = 0.2. However, other researchers [2] have found no significant relationship between functional status and fatigue in patients with MND.
Scores for test-retest reliability for the Energy subscale were slightly below expected values. Test-retest data were collected two to four weeks after the completion of the original questionnaire. The rapidly progressive nature of MND could mean that, for some patients, a large increase in this aspect of fatigue may occur within a four-week period. The current study may have been improved by collecting test-retest data over a shorter time period, in order to minimise the effects of the rapid natural progression of the disease upon the results of test-retest reliability analyses.
Differential item functioning analyses in this study were limited by the small sample size in the clinic completion group, which contained only twenty patients. Numbers were small in this group because of the difficulty of administering a suite of questionnaires, including a 52-item fatigue measure, in a short clinic appointment. Many patients expressed a preference to take the pack home to complete. The thirteen items of the MND-NFI are now more suitable for clinic administration, and further work may usefully examine the validity of the MND-NFI for clinician administration as well as for patient self-completion.
An important caveat of disease-specific outcome measures is their inability to provide comparisons between disorders [22] that may serve to foster a more complete understanding of fatigue and its mechanisms. However, the Rasch model is capable of addressing this problem and allowing comparisons to be made across different disease groups, especially if the scales have been derived in such a manner as to share common items [23]. Further progress could be made by using the initial 52 questionnaire items as the basis of other Rasch-validated disease-specific scales for neurological conditions such as stroke and post-polio syndrome, allowing for both disease-specific measurement and inter-disease comparison. To this end, the Neurological Fatigue Index for Multiple Sclerosis (NFI-MS) was derived from the same initial 52-item bank and separately validated specifically for use in multiple sclerosis [7]. The NFI-MS measures fatigue over four domains revealed to be salient to patients with MS: 'Physical', 'Cognitive', 'Relief by diurnal sleep or rest' and 'Abnormal nocturnal sleep and sleepiness', although the latter two scales were acknowledged to be only provisional and may indicate adaptive processes rather than aspects of fatigue itself. Further work is warranted to compare fatigue as experienced by patients with MND, MS and other neurological illnesses.
In the NFI-MND the simple duality of the 'Weakness' and 'Energy' subscales will also assist clinicians in assessing what patients mean when they describe feelings of fatigue. As such the NFI-MND fatigue scale may serve as a valuable tool for assessing the patient experience of fatigue and how this disabling symptom changes over time in clinical settings, clinical trials and in bio-psychosocial research studies. This is facilitated further by the transformation of the ordinal raw scores into interval level measurement.
Importantly, the MND-NFI is a brief measure, containing only 13 items, with only 8 items in the summary scale. Questionnaire length is an important concern for patients with MND, particularly when they are suffering from fatigue [22]. The brevity of the MND-NFI makes it appropriate for routine clinical application in this population, but the scale may also be used in clinical trials, whilst the full NFI-MND may lend itself to bio-psychosocial and biological studies.
Given that all three scales fit the Rasch model, the raw score from each scale is sufficient for identifying the ordinal level of fatigue, energy or weakness a patient exhibits. This ordinal score is convenient for 'everyday' use and will give a good indication of the levels of fatigue displayed by the respondents. Whenever parametric statistics are required, the ordinal-to-interval conversion can be employed, provided that there are no missing data.
Competing interests
The authors declare that they have no competing interests.
Authors' contributions
CJG collected data, conducted analyses and is the primary author of this manuscript. EWT assisted in study design and authoring of the paper and is a co-grant holder. RJM provided expert review and assisted in study design and editing.
JE, JDM, PJS and KT facilitated data collection in the MND care centres they run. AT provided expert statistical advice regarding Rasch analysis. CAY assisted in study design, authoring, collection of data and editing and is the primary grant holder. All authors read and approved the finalised version of this manuscript.