Patients
To recruit a large number of patients with LBP, a chiropractic patient sample was chosen, as LBP is the condition most commonly treated by chiropractors in Sweden [9]. The 244 patients included in the study had non-specific LBP, with or without sciatica, and no other obvious diseases that could account for the LBP [6]. No categorization according to the duration of their present LBP was used. The patients were of working age, usually between 18 and 65 years. They had not been under chiropractic care for the past three months. The external validity of this sample has been found to be acceptable, i.e. the general health and the development of pain over time in the sample were compared with those of relevant populations [6]. Patients were excluded from the study if pregnant, if unable to understand Swedish, if they did not have a mobile phone, or if they did not know how to use the text message function. Only patients who answered more than 80% of their weekly text messages (n = 176) were included in the cluster approach. Data collection took place between May 2008 and June 2009.
Data
A full description of the data collection process is reported elsewhere [6]. In short, the participating patients were informed about the study verbally and in writing. If they agreed to participate, then at the study inclusion visit (i.e. the second clinic visit) they filled in a baseline questionnaire consisting of items found to be valid and signed a consent form. Questions covered details of their pain (pain drawing [10] and pain intensity on an 11-point numeric rating scale (NRS) anchored at "no pain" and "worst imaginable" [11]), self-reported sick leave (number of days the previous year) [12], and health ("How would you rate your health? Excellent (1), very good (2), good (3), fair (4), poor (5)" [13] and the EuroQol 5, EQ5D [14], with a weighted score ranging from 0 (death) to 1 (perfect health)). The chiropractor collected information on patients' gender, age and occupation, as well as area of pain, intensity, duration and frequency of pain the previous year, self-reported sick leave the previous year (Y/N), access to a mobile phone and know-how in terms of receiving and sending text messages. During the 4th visit, the patients answered a question regarding improvement [15] ("How would you rate your improvement (compared to the time of the first treatment)?"). Answers were given on a descriptive 5-point scale, ranging from definitely improved to definitely worse.
The questionnaires were sent to the research centre and the respondents were entered into the text message system. The first text message question was sent to the patient the following Sunday afternoon (i.e. shortly after the second visit to the chiropractor) and every Sunday after that for six months. The question was:
"How many days during the previous week was your low back pain bothersome (i.e. affected your daily activities or routines)? Please answer with a number between 0 and 7" [16]. In a Danish study on chiropractic patients in which text messages were used to collect data, the weekly number of painful days was found to be closely related to the weekly intensity of the LBP [17]. Therefore, in our study, bothersomeness was used as a measure of the effects of pain on daily life, but was also regarded as a proxy for pain intensity, as it is important to limit the number of questions in text message surveys. Respondents who failed to answer for three weeks in a row were called and kindly reminded to answer. If the respondent could not be reached by phone, a letter was sent to the patient with a reminder to continue answering the text messages.
Analysis
The selected analytic approach is person-oriented [20]. This means that the analysis concerns the development of the individual, regardless of baseline variables. We hypothesized that the pain course over time would be similar within groups of individuals, and different from the course of other groups. This is different from the more traditional variable-based analysis, in which the hypothesis concerns the association of a baseline variable with the outcome.
The text message replies provided individual curves for each respondent based on the weekly measurement of number of days with bothersome LBP reported for 26 weeks. Visual inspection of the individual curves and of the aggregated curve for all respondents was the starting point of the exploratory cluster analysis.
In cluster analysis, it is practically impossible to use 26 parameters (the weekly data) on which to cluster. Therefore, the courses were condensed as follows: for each individual, two linear regression lines were calculated, which defined first the early and then the later trends over time, with the number of days as the dependent variable and the week as the independent variable. An intersection between the two regression lines was calculated. The regression analysis was done using a spline (nonlinear regression) technique which simultaneously estimated the lines and their intersection. Thus, for each patient, the weekly measures were condensed into 4 parameters describing the clinical course. These were: 1) the slope of the regression line describing the early course, 2) the intercept of the regression line describing the early course, 3) the difference in slope between the two regression lines (to describe the change from early to late course) and 4) the intersection estimate (to describe when the change in improvement occurs), the so-called "knot". The curve estimates for each patient were checked for their goodness-of-fit through analysis of their residual variance and R-squared. It should be noted that the regression lines were simply used to describe the development over time, and that no regression analysis was performed. As the individual curves were described with regression lines, it was necessary to obtain good approximations of the actual courses. Therefore, in order to secure solid curve estimates, only patients responding more than 80% of the time (i.e. 21 weeks or more) were used in the cluster analysis. As the analysis was based on curve estimates, a few missing values did not affect the overall description of the curve in the individuals left for analysis (i.e. those with more than 80% weekly answers).
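As an illustration of how a 26-week course can be condensed into the four parameters, the following minimal Python sketch fits a two-segment linear model, y = intercept + slope * week + slope_change * max(0, week - knot), to one patient's replies. It is not the software used in the study: the function names, the grid search over candidate knots (used here in place of a simultaneous nonlinear estimation) and the example data are all assumptions made for the illustration.

```python
# Illustrative sketch only; names, grid step and data are assumptions.

def solve_linear(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        if abs(M[col][col]) < 1e-12:
            raise ValueError("singular system")
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def fit_given_knot(weeks, days, knot):
    """Least-squares fit of the two-segment model for one fixed knot."""
    X = [[1.0, float(t), max(0.0, t - knot)] for t in weeks]
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(3)] for i in range(3)]
    Xty = [sum(r[i] * y for r, y in zip(X, days)) for i in range(3)]
    beta = solve_linear(XtX, Xty)
    sse = sum((y - sum(b * x for b, x in zip(beta, r))) ** 2
              for r, y in zip(X, days))
    return beta, sse


def fit_course(weeks, days, step=0.25):
    """Return the four course parameters plus goodness-of-fit measures.

    `weeks` holds only the weeks that were answered, so missing weeks
    are simply absent from the input.
    """
    ts = sorted(set(weeks))
    if len(ts) < 4:
        raise ValueError("too few answered weeks to fit two segments")
    lo, hi = ts[1], ts[-2]
    best = None
    for i in range(int((hi - lo) / step) + 1):
        knot = lo + i * step
        try:
            beta, sse = fit_given_knot(weeks, days, knot)
        except ValueError:
            continue  # this knot does not identify both segments
        if best is None or sse < best[1]:
            best = (beta, sse, knot)
    if best is None:
        raise ValueError("no candidate knot could be fitted")
    (intercept, slope, slope_change), sse, knot = best
    mean_y = sum(days) / len(days)
    sst = sum((y - mean_y) ** 2 for y in days)
    return {
        "early_slope": slope,
        "early_intercept": intercept,
        "slope_change": slope_change,  # late slope minus early slope
        "knot": knot,
        "residual_ss": sse,
        "r_squared": 1 - sse / sst if sst > 0 else float("nan"),
    }


# Hypothetical patient: rapid early improvement, then a stable course.
example_weeks = list(range(1, 27))
example_days = [6 - t + max(0, t - 5) for t in example_weeks]
print(fit_course(example_weeks, example_days))
```

A grid search keeps the sketch dependency-free; a dedicated nonlinear least-squares routine would estimate the knot on a continuous scale instead.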
The four mathematical parameters described above were used in a hierarchical cluster analysis (Ward's method) to detect clusters [21, 22]. The parameters were first standardized to counteract differences in scale. To determine the optimal number of clusters, the dendrogram (a graphical representation specific to cluster analysis) was used together with the Calinski-Harabasz criterion, which compares the variation within the clusters with the variation between the clusters [22]. Further, using the results of the Ward algorithm as a starting point, a K-means cluster analysis was used to optimize cluster allocations and, if necessary, to reallocate the subjects to other clusters. Reallocations were also evaluated with the Calinski-Harabasz criterion.
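The standardization, the Calinski-Harabasz criterion and the K-means reallocation step can be sketched in Python as follows. This is a minimal illustration, not the study's implementation: the Ward step itself is omitted and its result is represented by a hypothetical starting allocation, the function names and example values are invented, and standardization by the sample standard deviation is an assumption.

```python
# Illustrative sketch only; names, data and starting labels are assumptions.
from statistics import mean, stdev


def standardize(rows):
    """Z-score each column (sample standard deviation assumed)."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [stdev(c) for c in cols]
    return [[(x - m) / s for x, m, s in zip(r, mus, sds)] for r in rows]


def centroid(points):
    return [mean(c) for c in zip(*points)]


def sqdist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def calinski_harabasz(rows, labels):
    """Between-cluster variance over within-cluster variance.

    CH = [B / (k - 1)] / [W / (n - k)]; larger values indicate
    more compact and better separated clusters.
    """
    n, clusters = len(rows), sorted(set(labels))
    k = len(clusters)
    overall = centroid(rows)
    between = within = 0.0
    for c in clusters:
        members = [r for r, lab in zip(rows, labels) if lab == c]
        cen = centroid(members)
        between += len(members) * sqdist(cen, overall)
        within += sum(sqdist(r, cen) for r in members)
    return (between / (k - 1)) / (within / (n - k))


def kmeans_refine(rows, labels, max_iter=100):
    """K-means started from an existing allocation (e.g. a Ward solution)."""
    labels = list(labels)
    for _ in range(max_iter):
        cents = {c: centroid([r for r, lab in zip(rows, labels) if lab == c])
                 for c in set(labels)}
        new = [min(cents, key=lambda c: sqdist(r, cents[c])) for r in rows]
        if new == labels:
            break  # no subject was reallocated
        labels = new
    return labels


# Hypothetical rows: early slope, early intercept, slope change, knot.
params = [
    [-1.0, 6.0, 1.0, 5.0],
    [-0.9, 5.5, 0.9, 6.0],
    [-1.1, 6.5, 1.0, 4.5],
    [-0.1, 3.0, 0.0, 12.0],
    [-0.2, 3.5, 0.1, 13.0],
    [0.0, 2.5, 0.1, 11.0],
]
start = [0, 0, 0, 0, 1, 1]  # stand-in for a hierarchical (Ward) allocation
z = standardize(params)
refined = kmeans_refine(z, start)
print(refined, calinski_harabasz(z, start), calinski_harabasz(z, refined))
```

Comparing the criterion before and after reallocation mirrors the evaluation step described above; in practice the same comparison would also be made across solutions with different numbers of clusters.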
The final clusters were then described in relation to initial level of bothersomeness, rate of early improvement and the point of change, making it possible to visualize the course pattern of each cluster.
An attempt was thereafter made to match the clusters with clinical information to ascertain whether the clusters also differed clinically from each other. A number of clinical variables, namely age, gender, pain intensity, the presence of leg pain, duration of LBP the previous year and self-rated health, as well as two outcome variables, improvement at the 4th visit and the total number of days with bothersome pain over 26 weeks, were used to describe the clusters. These variables were tested for differences between the clusters with ANOVAs and χ² tests. The clinical variables, the outcome variables excluded, were used in a discriminant analysis (kth-nearest-neighbour) [23] for a multivariate evaluation of cluster differences. Thus, the mathematically obtained clusters were validated by investigating differences in various clinical variables to answer the question: Are the clusters clinically relevant? Analyses were performed using SPSS 17 [24], STATA 10 [25] and Sleipner [26].
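The idea behind a nearest-neighbour discriminant analysis can be illustrated with a short Python sketch: each patient is assigned to the cluster most common among the k patients closest to them on the clinical variables, and the agreement with the actual cluster membership indicates how well the clinical variables separate the clusters. The leave-one-out evaluation, the choice of k = 3, the function names and the data below are assumptions for the illustration and are not taken from the study.

```python
# Illustrative sketch only; k, the evaluation scheme and data are assumptions.
from collections import Counter


def sqdist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_predict(train_x, train_y, query, k=3):
    """Majority cluster among the k nearest neighbours of `query`."""
    nearest = sorted(zip(train_x, train_y),
                     key=lambda pair: sqdist(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(xs, ys, k=3):
    """Share of patients whose cluster is recovered from the other patients."""
    hits = 0
    for i, (x, y) in enumerate(zip(xs, ys)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        hits += knn_predict(rest_x, rest_y, x, k) == y
    return hits / len(xs)


# Hypothetical standardized clinical variables for eight patients
# (e.g. age and pain intensity) and their cluster membership.
clinical = [[-1.0, -0.9], [-1.1, -0.8], [-0.9, -1.0], [-1.0, -1.1],
            [1.0, 0.9], [1.1, 1.0], [0.9, 1.1], [1.0, 1.0]]
cluster = ["A", "A", "A", "A", "B", "B", "B", "B"]
print(leave_one_out_accuracy(clinical, cluster))
```

Because the method relies on distances, variables measured on different scales would normally be standardized first; categorical variables such as gender would need a suitable coding or distance measure.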