Original Article

Normative Data of the Self-Report Version of the German Strengths and Difficulties Questionnaire in an Epidemiological Setting

Published Online: https://doi.org/10.1024/1422-4917/a000589

Abstract

Objective: This study served to establish German norms for the Strengths and Difficulties Questionnaire self-report (SDQ-S) by using data from a representative epidemiological sample from the German National Health Interview and Examination Survey for Children and Adolescents (KiGGS study). Although the German version of the SDQ has been widely used and normative data for the parent version (SDQ-P) exist, no German norms for the self-report version have been reported, so that practitioners had to rely on the available British norms. In addition, we investigated whether sex- and age-specific norms are necessary. Methods: At the baseline of the KiGGS study, SDQ-S ratings were collected from n = 6,726 children and adolescents between 11 and 17 years (n = 3,440 boys and n = 3,286 girls). We assessed the internal consistency and age/sex effects of the SDQ-S. Confirmatory factor analysis was conducted to assess the factor structure of the SDQ-S. Banding scores were developed to differentiate children and adolescents by level of difficulties and to categorize them as “normal,” “borderline,” and “abnormal.” General as well as age- and sex-specific bandings were created for both the total score and the subscales of the SDQ-S. In addition, the German norms of the SDQ-S were compared with those of the UK, Norway, and Thailand. Results: The five-factor solution of the SDQ-S (including Emotional symptoms, Conduct problems, Hyperactivity/Inattention, Peer problems, and Prosocial behavior) provided a satisfactory fit to the data. Moderate internal consistencies (Cronbach’s α) were observed for the scales Emotional symptoms, Hyperactivity/Inattention, and Total difficulties score, whereas insufficient internal consistency was found for the scales Peer problems and Conduct problems. However, using McDonald’s ω as a more appropriate measure of homogeneity, internal consistencies were found to be satisfactory for all subscales and for Total difficulties.
Normative banding scores were established conservatively to avoid producing too many false positives in the category “abnormal.” In line with previous research, girls showed more emotional problems but fewer Peer problems than boys. German normative bandings of SDQ-S were similar to the original British bandings and those of other countries. Conclusions: This study of the German SDQ-S in a large representative epidemiological sample presents evidence of partly moderate to good psychometric properties. It also supports the usefulness of SDQ-S as an effective and efficient instrument for child and adolescent mental health problems in Germany. German normative banding scores of SDQ-S established in this study were comparable with the original British norms as well as with those of other countries, so that SDQ-S can be recommended as a psychopathological broadband-screening tool.

Introduction

Many children and adolescents suffer from distinct emotional and behavioral problems (Bourdon et al., 2005; Costello et al., 2003; Ravens-Sieberer, Wille, Bettge, & Erhart, 2007). Various epidemiological studies have identified psychopathological abnormalities in about 10–20 % of children and adolescents (Hölling, Schlack, Petermann, Ravens-Sieberer, & Mauz, 2014; Petermann, Döpfner, Lehmkuhl, & Scheithauer, 2000). These internalizing and externalizing disturbances also carry a high risk for physical and mental problems in later childhood and adolescence (Björnsdotter, Enebrink, & Ghaderi, 2013).

Thus, early clinical assessment of psychiatric problems of children and adolescents is highly important. In clinical practice, diagnostic interviews and symptom questionnaires are generally used as diagnostic instruments, including parent and teacher ratings from childhood (and especially from adolescence) and self-reports (Döpfner & Petermann, 2012). The predictive power of a child’s report for a later impairment has been shown in several studies (Hagenberg et al., 2004; Morgan & Cauce, 1999). The Strengths and Difficulties Questionnaire (SDQ) is a well-established, multi-informant, economic, and freely accessible instrument (www.sdqinfo.com) that can be used for clinical as well as research purposes (Achenbach et al., 2008; Becker, Hagenberg, Roessner, Woerner, & Rothenberger, 2004b; Klaasen, Woerner, Rothenberger, & Goodman, 2002). The Strengths and Difficulties Questionnaire self-rating version (SDQ-S) has been translated into more than 80 languages (www.sdqinfo.com) and is frequently used worldwide. An overview of existing articles about the SDQ-S was published by Achenbach et al. (2008).

The Strengths and Difficulties Questionnaire (SDQ), developed by Goodman et al. (1997), encompasses a self-rating version (SDQ-S) and a parent (SDQ-P) as well as a teacher version (SDQ-T); it targets the child’s experiences in different areas of life. The 25 items offered with a three-point rating scale (0 = not true; 1 = somewhat true; 2 = certainly true) assess positive and negative aspects of the child’s experience and behavior as well as the clinical severity of possible psychiatric problems. The five subscales Emotional symptoms, Conduct problems, Hyperactivity/Inattention, Peer problems, and Prosocial behavior are assessed with five items each according to the disorder concepts and criteria of ICD-10 and DSM-IV (Goodman, Lamping, & Ploubidis, 2010). While the SDQ-P refers to children aged 4 to 16 years, the SDQ-S is considered reasonable only from the age of 11 years on (Becker et al., 2004b; Goodman et al., 1998). The SDQ was developed primarily as a screening instrument for population-based samples (Goodman, 2000) but is increasingly being used for assessment purposes in the clinical setting (Becker et al., 2004a; Masi et al., 2013).

Studies on the SDQ-S from the UK, Finland, and Norway showed acceptable to good internal consistency for the SDQ-S Total difficulties score (Cronbach’s α between 0.71 and 0.82; Goodman et al., 1998; Koskelainen, Sourander, & Kaljonen, 2000; Muris, Meesters, & van den Berg, 2003). Some studies found low reliability for the subscales Conduct problems (α = 0.47–0.60) and Peer problems (α = 0.39–0.46) (Capron, Thérond, & Duyme, 2007; Muris, Meesters, Eijkelenboom, & Vincken, 2004; Rogge et al., 2017).

Based on a German community sample, Klasen et al. (2000) found that the correlations between the SDQ-P and SDQ-S were satisfactory (r_total = 0.60, r_subscale = 0.36–0.64). As in other studies, higher correlations were observed for externalizing ratings compared to internalizing ratings (Becker et al., 2004b; Hodges, 1993). The proposed five-factor structure of the SDQ-S has generally been replicated in studies in different cultures (Bøe et al., 2016; Du et al., 2008; Essau et al., 2012; Lohbeck et al., 2015; Sharratt et al., 2018; van de Looij-Jansen et al., 2011; Yao et al., 2009); some studies, however, found a three- or four-factor structure more favorable (Altendorfer-Kling et al., 2007; Dickey et al., 2004; Lohbeck, 2015; Muris et al., 2004). Moreover, bi-factor models of externalizing and internalizing disorders have recently received more attention (Caci et al., 2015; Kóbor et al., 2013; Patalay et al., 2015).

The first standardization of the German SDQ-S was published by Koglin (2007). However, these norms were based on an Austrian population sample, which included only two federal states and two types of school and therefore had low representativeness of the German population. The German standardization study by Lohbeck et al. (2015) found partly insufficient psychometric properties for the SDQ-S (e. g., Conduct problems (α = 0.55) and Peer problems (α = 0.56)). Although they made recommendations concerning cutoff values, their results could be biased because of the limited representativeness and age heterogeneity of the investigated sample. Thus, a study providing German normative banding scores for the SDQ-S based on data of an unselected, representative sample from the general population is still needed. Moreover, the effects of sex on SDQ-S scales were discovered in several studies (Becker et al., 2004b): Boys showed significantly more Conduct problems than girls, whereas girls suffered more often from Emotional problems and scored higher on the Prosocial behavior scale (Koskelainen et al., 2000; Muris et al., 2003). Issues concerning sex should therefore be considered when general norms are established.

The present study served to establish German norms for the SDQ-S using a large and representative population-based sample. To date, the German SDQ-S is used in clinical studies and clinical routine by applying the British norms provided by Goodman et al. (1997). However, a lack of measurement invariance of the SDQ across different countries was found in previous studies (Ortuño-Sierra et al., 2015), making the use of British norms for the German SDQ-S, technically speaking, inadmissible (Rogge et al., 2017). Setting German norms is of high clinical and practical relevance for achieving more precise assessments. To this end, we used a national sample where no selection effects (cf. Lohbeck et al., 2015) were observed and in which the distribution of the participants can be considered representative (Kurth et al., 2008). Since younger children have only a limited ability for introspection and are often unable to judge and report on their emotions or behavior, their direct assessment is considered to be of limited diagnostic value (Achenbach et al., 1987). Thus, it seems reasonable to use only the information from self-reports of children aged about 11 years or older. Furthermore, the hypothesized five-factor structure of the SDQ-S from the original British version (Goodman, 2001) should be verified in light of the above-mentioned contradictory results of former studies.

When the SDQ was developed, Goodman established rules of classifying abnormality into three categories based on statistical thresholds (“normal” 80 %, “borderline” 10 %, and “abnormal” 10 %). In the current study, these thresholds were applied to set normative banding scores based on the examined German population-based sample. Furthermore, the bandings for the SDQ-S determined using our sample were compared to corresponding findings of previous studies from other European countries.

In sum, generalized (i. e., non-age- and non-sex-specific) norms should be developed for clinical routine practice. Furthermore, the comparability with normative data from other European countries should be demonstrated. Beyond that, age- and sex-specific differences for the SDQ-S scales should be investigated and corresponding norm values should be presented to deliver valuable information for future research studies and clinical evaluations.

In this study, we investigated and determined:

  (i) whether the hypothesized five-factor structure can be found for the SDQ-S based on data from a representative German sample;
  (ii) the normative banding scores/cutoff values for the German SDQ-S to allow the allocation of children and adolescents to the proposed groups “normal,” “borderline,” and “abnormal”;
  (iii) the age- and sex-specific effects on the German SDQ-S scales.

Methods

Study Description and Sample

Data for the current analysis stem from the baseline survey of the German KiGGS study (Kurth et al., 2008). The KiGGS study is a cohort-sequential study that collected comprehensive data on the health status of children and adolescents in Germany. It was part of the German health-monitoring system established at the Robert Koch Institute, Berlin, on behalf of the German Federal Ministry of Health (Hölling et al., 2012; Kurth et al., 2009). Currently, data from the baseline study (2003–2006) and the first repeat sample (KiGGS Wave 1) are available. The KiGGS study is representative in terms of age, sex, regional and citizenship structure of the German population. A detailed description of the design and procedure of the KiGGS study can be found in Kurth et al. (2008). In brief, a total of 17,641 children and adolescents between the ages of 0 and 17 and their parents participated in the baseline assessment, which took place between May 2003 and May 2006. The net response rate was 66.6 % (for more details about response rates and reasons for nonresponse, please see Kurth et al., 2008). The participating children and adolescents were given a physical examination; the parents and, from age 11 on, the children and adolescents themselves completed extensive self-administered questionnaires on their physical, social, and mental health. The sampling frame followed the principles of a stratified multistage probability sample (Kish, 1965).

The participants were recruited in two steps: In the first step, 167 study locations (primary sample units, PSUs) were systematically chosen from an inventory of German communities stratified according to the BIK classification (Aschpurwis und Behrens GmbH, 2001), which measures the degree of urbanization and geographic distribution. Using the Cox procedures for community sampling (Cox, 1987), the number of PSUs per stratum was determined with a sampling probability proportional to population size. In the second step, an equal number of study subjects per birth cohort over the entire age range were randomly selected (simple random sample) from the local population registries.

The present analyses focused on children and adolescents aged 11 to 17 years (n = 6,793). Age and sex distribution of the normative sample are presented in the Electronic Supplementary Material (ESM) 1, Table e1 (a–f). Corresponding to the procedure recommended by Goodman (1997), cases with missing values for more than 10 out of 25 items in the SDQ-S total scale or 3 out of 5 items in a subscale were excluded from the following analyses (n = 67). This resulted in a final sample of n = 6,726 for the present study.
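The exclusion rule can be sketched in a few lines of Python. This is only an illustration, not the authors' SPSS procedure: the function name, the `None` marker for a missing answer, and the dictionary layout are hypothetical, and the rule is read here as "more than 10 of 25 items overall, or more than 3 of 5 items in any subscale" (the wording above also allows the reading "3 or more", which is why both limits are parameters).

```python
def has_too_many_missing(responses_by_subscale,
                         max_missing_total=10,
                         max_missing_subscale=3):
    """Return True if a case would be excluded under the assumed rule.

    responses_by_subscale maps a subscale name to its five item answers
    (0, 1, 2), with None marking a missing answer (hypothetical layout).
    """
    missing_per_scale = {
        name: sum(answer is None for answer in answers)
        for name, answers in responses_by_subscale.items()
    }
    # Overall criterion: more than max_missing_total missing items.
    if sum(missing_per_scale.values()) > max_missing_total:
        return True
    # Subscale criterion: more than max_missing_subscale missing in one scale.
    return any(m > max_missing_subscale for m in missing_per_scale.values())


# Hypothetical example cases
complete_case = {
    "Emotional symptoms": [0, 1, 0, 2, 1],
    "Conduct problems": [0, 0, 1, 0, 0],
    "Hyperactivity/Inattention": [1, 1, 2, 0, 1],
    "Peer problems": [0, 0, 0, 1, 0],
    "Prosocial behavior": [2, 2, 1, 2, 2],
}
sparse_case = dict(complete_case, **{"Peer problems": [None, None, None, None, 0]})
```

Setting `max_missing_subscale=2` would implement the stricter reading in which three missing items in a subscale already lead to exclusion.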

SDQ-S

The SDQ-S is a brief behavioral screening questionnaire covering the most important current domains of child and adolescent psychopathology. It contains 25 items assigned to five subscales: Emotional symptoms, Conduct problems, Hyperactivity/Inattention, Peer problems, and Prosocial behavior (Goodman, 1997). The severity level of each item is assessed by the child on a 3-point scale (0 = not true, 1 = somewhat true, and 2 = certainly true). Higher scores indicate more serious problems, except for Prosocial behavior, where higher scores indicate more positive behavior. The Total difficulties score (range 0–40) is obtained by summing the scores of the four problem-specific subscales Emotional symptoms, Conduct problems, Hyperactivity/Inattention, and Peer problems. Previous research showed that the psychometric properties of the SDQ are satisfactory to good, and that its subscales correspond to the major categories and criteria of the current psychiatric classification systems (Achenbach et al., 2008).
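As a minimal sketch of the scoring described above, the following Python function sums the four problem subscales into the Total difficulties score and leaves Prosocial behavior out. The function and variable names are hypothetical; the subscale range of 0–10 follows from five items scored 0–2 each. Item-level scoring (including any reverse-keyed items) is assumed to have been done beforehand and is not shown.

```python
PROBLEM_SCALES = (
    "Emotional symptoms",
    "Conduct problems",
    "Hyperactivity/Inattention",
    "Peer problems",
)


def total_difficulties(subscale_scores):
    """Sum the four problem subscales (each 0-10) into a 0-40 total.

    Prosocial behavior is deliberately ignored, because higher scores
    on that scale indicate more positive behavior.
    """
    for name in PROBLEM_SCALES:
        score = subscale_scores[name]
        if not 0 <= score <= 10:
            raise ValueError(f"{name} score {score} is outside 0-10")
    return sum(subscale_scores[name] for name in PROBLEM_SCALES)


# Hypothetical respondent
example_scores = {
    "Emotional symptoms": 3,
    "Conduct problems": 2,
    "Hyperactivity/Inattention": 4,
    "Peer problems": 1,
    "Prosocial behavior": 8,
}
example_total = total_difficulties(example_scores)  # 3 + 2 + 4 + 1 = 10
```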

SES

The participants’ socioeconomic status (SES) was quantified using a multidimensional index for social strata based on net household income, parental level of education, and occupational status, which allows for differentiation between lower, middle, and high social status (Lange et al., 2014).

Statistical Analyses

First, descriptive statistics of the analyzed sample in terms of age- and sex-specific SDQ-S total and subscale scores (mean values and standard deviations) were calculated. Age and sex effects were calculated by means of a 2 × 2 ANOVA using SDQ-S scores as outcomes and Sex and Age as factors. Internal consistency of the SDQ-S scales was calculated by both Cronbach’s α and McDonald’s omega (ω). Several studies showed that Cronbach’s α increases with a greater number of items (Nunnally & Bernstein, 1994; Streiner, 2003). In addition, Cronbach’s α captures only the interdependence between items that belong to the same scale (Cortina, 1993; Sijtsma, 2009). Consequently, we added an alternative method for the calculation of reliability, McDonald’s ω, which is recommended as the most accurate measure of reliability (Revelle & Zinbarg, 2009). McDonald’s ω was calculated for all five subscales as well as for the Total difficulties score. The following guidelines for interpretation were used (Kline, 2000): α < 0.5 unacceptable; 0.6 > α ≥ 0.5 poor; 0.7 > α ≥ 0.6 questionable; 0.8 > α ≥ 0.7 acceptable; 0.9 > α ≥ 0.8 good; α ≥ 0.9 excellent. Additionally, internal consistency should be “good” for measures to be used in group comparisons and “excellent” for comparing individual scores. To verify the hypothesized five-factor structure of the SDQ-S from the original British version (Goodman, 2001), we carried out a confirmatory factor analysis with extracted factors fixed at five.
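To make the two reliability coefficients concrete, here is a short Python sketch. It is not the SPSS procedure used in the study. Cronbach’s α is computed with the standard formula α = k/(k − 1) · (1 − Σ item variances / variance of the sum score). For McDonald’s ω, the sketch assumes a single-factor model with standardized loadings, so that each unique variance is 1 − λ²; the source does not state which ω variant or estimation method was used, so this is an assumption. All function names are hypothetical, and a coefficient of exactly 0.5 is treated as “poor” here.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """item_scores: one list per item, each holding one score per respondent."""
    k = len(item_scores)
    totals = [sum(answers) for answers in zip(*item_scores)]  # sum score per respondent
    item_variance_sum = sum(variance(item) for item in item_scores)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


def mcdonald_omega(loadings):
    """Omega for one factor, assuming standardized loadings (assumption)."""
    common = sum(loadings) ** 2
    unique = sum(1 - loading ** 2 for loading in loadings)
    return common / (common + unique)


def interpret(coefficient):
    """Map a coefficient to the verbal labels of the guideline quoted above."""
    bands = [(0.9, "excellent"), (0.8, "good"), (0.7, "acceptable"),
             (0.6, "questionable"), (0.5, "poor")]
    for lower_bound, label in bands:
        if coefficient >= lower_bound:
            return label
    return "unacceptable"
```

With equal loadings of 0.5 on four items, for example, `mcdonald_omega([0.5] * 4)` gives 4 / (4 + 3) ≈ 0.57, and `interpret(0.72)` returns "acceptable".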

Subsequently, we obtained the distribution of raw values for the SDQ-S scales and assigned a percentile rank (PR) to every scale value. According to the procedure used in the British version of the SDQ (Goodman et al., 1997), which resorted to empirical values for the determination of abnormal behavior (Fombonne, 1991), threshold values were established for the allocation of raw values to one of three categories: “normal,” “borderline,” and “abnormal” (Klasen et al., 2000). Following the statistical case definition used for the SDQ by Goodman (1997), the threshold values categorized 80 %, 10 %, and 10 % of the population as “normal,” “borderline,” and “abnormal” cases, respectively. Banding scores were calculated giving priority to sensitivity, i. e., true positive rate, which prevented exceeding the 10 % rate of “abnormal” and “borderline”, respectively, and in sum no more than 20 %. A final statement on the appropriateness of this criterion is difficult, seeing that, in this sample, nothing was known concerning the actual mental health state of the children. Finally, the banding scores and percentages of participants in the “abnormal,” “borderline,” and “normal” groups based on SDQ-S scores obtained in our study were compared descriptively to corresponding results of similar standardization studies from different countries, including the often-used British norms for the SDQ-S.
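One way to picture the conservative 80/10/10 banding is the following Python sketch. It is an assumed reading of the procedure, not the authors' implementation: for a problem scale where higher scores are worse, it picks the lowest “abnormal” cutoff whose upper tail does not exceed 10 %, then the lowest “borderline” cutoff such that the borderline band also stays at or below 10 %. For Prosocial behavior the direction would have to be reversed. Function names and the toy data are hypothetical.

```python
from collections import Counter


def conservative_bandings(scores, max_share=0.10):
    """Return (borderline_cutoff, abnormal_cutoff) for a 'higher is worse' scale."""
    n = len(scores)
    counts = Counter(scores)
    values = sorted(counts)

    def share_at_or_above(cutoff):
        return sum(c for v, c in counts.items() if v >= cutoff) / n

    # Lowest cutoff whose upper tail stays within max_share; if none exists,
    # place the cutoff above the observed maximum (nobody is "abnormal").
    abnormal = next((v for v in values if share_at_or_above(v) <= max_share),
                    values[-1] + 1)
    # Extend the borderline band downward while it stays within max_share.
    borderline = abnormal
    for v in sorted((v for v in values if v < abnormal), reverse=True):
        if share_at_or_above(v) - share_at_or_above(abnormal) <= max_share:
            borderline = v
        else:
            break
    return borderline, abnormal


def classify(score, borderline, abnormal):
    """Allocate a raw score to one of the three bands."""
    if score >= abnormal:
        return "abnormal"
    if score >= borderline:
        return "borderline"
    return "normal"


# Toy distribution (made up, not KiGGS data): 100 raw scores between 0 and 5
toy_scores = [0] * 50 + [1] * 20 + [2] * 12 + [3] * 9 + [4] * 5 + [5] * 4
borderline_cut, abnormal_cut = conservative_bandings(toy_scores)
```

In this toy distribution, 9 % of cases score 4 or higher and another 9 % score exactly 3, so the sketch returns a borderline cutoff of 3 and an abnormal cutoff of 4; lowering either cutoff by one point would push the respective band above 10 %.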

Data handling and statistical analyses were carried out using SPSS (Statistical Package for the Social Sciences; release 22).

Results

Descriptive Statistics, Internal Consistency, and Effects of Sex and Age

The internal consistency of the self-reported scales of the SDQ in the normative German sample of children and adolescents is shown in Table 1. For the total sample, an acceptable internal consistency with α = 0.72 was found for the Total difficulties score; the corresponding values for the subscales ranged from α = 0.43 to α = 0.64, indicating a variety of internal consistency levels. The Conduct problems and Peer problems scales showed unacceptable internal consistency. However, with McDonald’s ω as an alternative measure of reliability, values ranged between 0.61 (Conduct problems) and 0.85 (Total difficulties score), and all scales reached a sufficient level of reliability.

Table 1 Internal consistency and effects of sex and age on self-reported scales of the SDQ in the normative German sample of children and adolescents (aged 11 to 17 years)

Table 1 also presents the SDQ-S scale scores for the total as well as the sex-specific sample on a descriptive level. Girls scored higher on Total difficulties than boys (p < .001). In detail, girls reported higher scores on Emotional symptoms (p < .001) and Prosocial behavior (p < .001), while boys reported higher scores on Conduct problems (p < .001). Both girls and boys reported comparable scores on the scales Hyperactivity/Inattention (p = .782) and Peer problems (p = .202).

An examination of age effects showed that older and younger children had similar levels of Total difficulties (p = .527) and Peer problems (p = .107). Older children reported significantly more Emotional symptoms (p < .001) and Conduct problems (p = .035), while younger children reported more Hyperactivity/Inattention (p < .001). Moreover, significant interaction terms between age and sex in the Total difficulties score (p < .001) and in the Emotional symptoms score (p < .001) indicated that older girls were the most troubled group.

Children and adolescents from families with a low SES reported significantly more problems on all four problem scales than children from families with a higher SES. No such effect was found for the Prosocial behavior scale.

Evaluation of the Factorial Structure

The factorial structure of the German SDQ-S was evaluated in the total sample with a confirmatory factor analysis (see ESM 1, Table e2). The number of extracted factors was fixed at five to obtain a direct comparison with the proposed five-factor structure of the SDQ-S. The five extracted factors explained 39.9 % of the total variance. Most items had their main loadings on the extracted factors corresponding to the hypothesized scales. No significant adjacent loading higher than 0.35 was observed for Emotional symptoms, Peer problems, and Prosocial behavior. For Hyperactivity/Inattention, the five intended items loaded on the corresponding hypothesized SDQ-S subscale. However, items 21 (reflective) and 25 (persistent) showed high negative cross-loadings on the scale Prosocial behavior. The subscale Conduct problems could not be replicated as intended since item 7 (obedient) and item 18 (lies, cheats) showed their highest loadings on the scales Prosocial behavior and Peer problems, respectively.

Recommended Bandings of Raw Scores

Tables 2 and 3 present the percentile ranks of raw scores and recommended banding scores for the SDQ-S scales calculated based on the analyzed sample, respectively. Supplementary analyses also provide sex- and age-specific bandings of raw scores of the SDQ scales (see ESM 1, Table e1). We aimed to determine the banding scores in a conservative manner in order to avoid too many false positive cases. This was done by selecting a cutoff value below 10 % for children classified as abnormal. This aim was reached for all SDQ-S scales except for the scale Prosocial behavior (% abnormal + % borderline = 24.8 %). To allow for a comparison of the banding scores resulting from our analyses to corresponding findings of previous studies, we added bandings for the SDQ-S scales reported in these other studies to Table 4.

Table 2 Assignment of raw values of self-reported scales of the SDQ to percentile ranks based on the normative German child and adolescent sample
Table 3 Bandings of raw scores for group allocation according to self-reported scales of the SDQ based on the normative German child and adolescent sample
Table 4 Cut-off points for classification as “abnormal” (and proportion of cases at or above the 90th percentile of the corresponding sample) based on self-reported scales of the SDQ across different studies

Age and Sex Dependency of Raw Values

Consistent with the study of Woerner et al. (2002), we found significant age and sex differences in total and subscale scores among the assessed children (Table 1 and ESM 2, Figure e1–e6). In order to adequately consider developmental age effects and sex differences in the interpretation of the individual raw values, we calculated separate threshold values for combined age groups from adjacent ages and for boys and girls for the SDQ-S Total difficulties score (see ESM 1, Table e1).

Discussion

This study served to establish norms for the German translation of the SDQ-S questionnaire. Moreover, we wanted to test the factorial validity and reliability of the SDQ-S in a representative German general population sample. We analyzed a large general population sample of children and adolescents aged 11 to 17 years (N = 6,726) and developed generalized norms as well as sex- or age-specific norms for both research purposes and clinical practice.

Scale Means, Correlations with Age and Scale Homogeneity

The study addressed different practical methodological aspects. First, on average, girls reported higher scores on the Emotional symptoms scale and on the Total difficulties score. Moreover, compared to boys, girls evaluated themselves as more prosocial and reported fewer Conduct problems. These results are in accordance with clinical observations and, for example, the results of Achenbach et al. (2008).

For girls and boys, the Emotional symptoms and Conduct problem scores decreased significantly with increasing age. On the contrary, their Hyperactivity/Inattention scores increased over time. Interactions between age and sex also pointed out that older girls might be the most burdened group. Participants from families with a low SES showed more “abnormal” values on all scales than those from families with a higher SES, except for the Prosocial behavior scale. These results are in line with the results of Achenbach et al. (2008) and Becker et al. (2015), and suggest that even children and adolescents are able to adopt a strong negative moderation effect of a low SES.

The examination of the SDQ-S reliability by Cronbach’s α revealed a low internal consistency for some SDQ-S scales in this study. However, using McDonald’s ω as a measure of internal consistency, we found satisfactory to good values for both subscales and the Total difficulties score. Similarly, a recent publication on the issue of the measurement invariance between SDQ-S and SDQ-P with data from KiGGS Wave 1 (a telephone survey) showed that the internal consistency of the subscales with McDonald’s ω was considerably better (Rogge et al., 2017). There is thus no reason to advise against the use of individual subscales because of low internal consistency, which would have been the case if Cronbach’s α were taken as a basis. Despite its superior properties, the use of McDonald’s ω is not yet very widespread. In the psychometric literature, McDonald’s ω is considered superior to Cronbach’s α because it is more robust against the violation of the assumption of essential tau-equivalence of the measurement and therefore less likely to over- or underestimate reliability (cf. Dunn et al., 2014).

Evaluation of the Factorial Structure

The factorial structure was examined to validate the German version of the SDQ-S. Our findings replicated the hypothesized five-factor structure of the original SDQ-S, largely as intended. However, the loading pattern of the subscale Conduct problems was unsatisfactory. Given this and the low internal consistency of the Conduct problems scale, a clinical evaluation should not rely on this SDQ-S subscale alone. Further, factor loadings of two items from the Hyperactivity/Inattention scale were not satisfactory. Our findings are in line with the results of Lohbeck et al. (2015), who reported that items 7 (Conduct problems, obedient), 21 (Hyperactivity, reflective) and 25 (Hyperactivity, persistent) were problematic. All three items had their highest negative (lateral) loading on the Prosocial behavior subscale. Moreover, we found that item 18 (Conduct problems, lies/cheats) did not load on the intended subscale but on the subscale Peer problems, while items 7 (Conduct problems, obedient) and 18 (Conduct problems, lies/cheats) did not load sufficiently (factor loadings < 0.35) on the initially intended subscale Conduct problems.

Recommended Bandings of Raw Scores

To our knowledge, no publication reports the British norms in detail and includes a description of the analyzed sample for the SDQ-S. However, three studies on norms of the SDQ-S need to be considered. The first one is from 1998, a pilot study on the validity of the SDQ-S (Goodman, 1998). There, preliminary bandings and cutoff values were mentioned which are published as valid norms on the official SDQ website (see sdqinfo.org). However, Koskelainen et al. (2000) reported different cutoff values. It can only be assumed that these cutoff values (compare sdqinfo.org) were independently calculated on the basis of the mentioned frequencies of the British norms (without further details) using the statistical case definition based on the 80/10/10 rule. Moreover, in another study by Goodman (2001), the underlying cutoff values for the British SDQ-S were only briefly mentioned. Goodman (2001) used the cutoff values of Koskelainen et al. (2000) for comparison. The cutoff values reported in the first study by Goodman (1998) were not used, as the percentages of “abnormal” values were significantly lower than those described in other standardization studies.

Country-specific cutoff values (1) should be calculated based on the results from corresponding epidemiological studies and (2), as Robert Goodman stated, “The main implication is that users probably shouldn’t be too focused on whether the score is just this side or just the other side of an arbitrary boundary. We may need to use fairly arbitrary cutoffs in terms of rules such as that above a score of X we will carry out more detailed screening, but that sort of pragmatic rule should not blind us to the fact that one point above threshold and one point below threshold actually have almost identical implications.”

Previous research found that the bandings calculated in accordance with the epidemiological study results of the BELLA study (Hölling et al., 2008) and the British bandings reported by Robert Goodman (1998) differed for some SDQ-S scales (Table 4). In the present study, the British cutoffs for four SDQ scales (except for Prosocial behavior) were uprated so that the percentage of children classified as “borderline” or “abnormal” remained below 20 %. For the Total difficulties score, slightly different cutoffs could be determined for age-homogeneous subgroups, analogous to the standardization study of Woerner et al. (2002). Similarly, sex is specifically considered when interpreting individual raw values.

In the case of recommended bandings of raw scores obtained with the German SDQ-S, it was found that the determined cutoff values differed from the British norms on four SDQ scales. For the scale Emotional symptoms, a cutoff value of 6 was determined for the “abnormal” range. Thus, in this study, 6.6 % of probands were described as emotionally “abnormal.” The greatest difference in the cutoff values was found for the scale Peer problems, where a value of 5 (compared to 3 in the British bandings) was chosen and led to 7 % of children being described as “abnormal.” However, despite the statistically meaningful comparison, differences in mean values on the subscales of the SDQ were not sufficient to suggest age- or sex-specific banding scores. Even for the Total difficulties score, only slight deviations in the raw value distribution were found in the subgroups. Thus, as with the parent SDQ (Woerner et al., 2002), the use of sex- and age-specific norms does not seem to be necessary.

In sum, it could be shown that the five-factor structure of the SDQ-S was confirmed for the German version of the measure. The chosen banding scores are comparable with those reported by studies performed in other countries and demonstrated the basic consensus of the SDQ-S in the German community sample. However, considering the known lack of measurement invariance of the SDQ-S across countries, comparisons between these norms should be interpreted with caution. A conservative determination of the bandings avoided too many false positive results concerning the categories “borderline” and “abnormal.” Sex effects, i. e., girls having more emotional problems but fewer social problems than boys, confirm existing results obtained using the SDQ-S.

Although the use of a subscale level is not recommended because of the low reliability of some scales (i. e., Peer problems and Conduct problems), the use of the remaining SDQ-S scales can be recommended for psychopathological screening. Specifically, it is important to collect judgments from different informants when parent or teacher reports are not available. The German SDQ-S is a practicable and economic screening instrument suitable for research purposes, for initial assessments as well as for the documentation of therapeutic courses.

Strengths and Limitations

This study has several strengths and limitations. A major strength is the use of a large and nationally representative German sample (KiGGS baseline assessment 2003–2006), which enables the generalization of our results to the overall German child and adolescent population. To our knowledge, no other SDQ-S sample with comparable properties has been published to date. Yet, the validity of the results may be restricted to the German child and adolescent population and may not generalize to other countries and cultures. However, an examination of the original standardization for the SDQ parent version (Woerner et al., 2002) and an analysis based on the BELLA data (Rothenberger et al., 2008) did not indicate distributional differences regarding the categories “normal,” “borderline,” and “abnormal.” Another limitation is that, although sex-specific mean values of the SDQ scales were compared and corresponding bandings were reported, we did not test for measurement invariance between boys and girls. Future studies would benefit from such a test, which would show whether the construct is equivalent across sex. Moreover, data for the current analysis stem from the baseline survey of the German KiGGS study (Kurth et al., 2008), so any interpretation of the results should take into account possible changes in the normative data since that time. A further limitation is that the banding scores had to be established without consideration of any clinical diagnosis because such data were not available.

Electronic Supplementary Material

The electronic supplementary material (ESM) is available with the online version of the article at https://doi.org/10.1024/1422-4917/a000589.

  • Tables e1(a–f): Gender- and age-specific bandings of raw scores of the SDQ-S total and subscale score.
  • Tables e2: Factor loadings of items of the SDQ based on self-reported data of the German normative child and adolescent sample.
  • Total and subscale score according to the self-reported SDQ in the German normative child and adolescent sample.

Conflicts of interest: No conflicts of interest exist.

Literature

  • Achenbach, T., Becker, A., Döpfner, M., Heiervang, E., Roessner, V., Steinhausen, H. C. & Rothenberger, A. (2008). Multicultural assessment of child and adolescent psychopathology with ASEBA and SDQ instruments: research findings, applications, and future directions. Journal of Child Psychology and Psychiatry, 49, 251–275.

  • Achenbach, T. M., McConaughy, S. H. & Howell, C. T. (1987). Child/adolescent behavioral and emotional problems: Implications of cross-informant correlations for situational specificity. Psychological Bulletin, 101, 213–232.

  • Altendorfer-Kling, U., Ardelt-Gattinger, E. & Thun-Hohenstein, L. (2007). The self-assessment sheet of the SDQ using an Austrian field test. Journal of Child and Youth Psychiatry and Psychotherapy, 35, 265–271.

  • Aschpurwis und Behrens GmbH: [BIK regions: metropolitan areas, city regions, middle and low order centers – Description of method of the last update 2001]. Hamburg; 2001.

  • Becker, A., Woerner, W., Hasselhorn, M., Banaschewski, T. & Rothenberger, A. (2004a). Validation of the parent and teacher SDQ in a clinical sample. European Child and Adolescent Psychiatry, 13, II/11–II/16 (Supplement 2).

  • Becker, A., Hagenberg, N., Roessner, V., Woerner, W. & Rothenberger, A. (2004b). Evaluation of the self-reported SDQ in a clinical setting: Do self-ratings tell us more than ratings by adult informants? European Child and Adolescent Psychiatry, 13, II/17–II/24 (Supplement 2).

  • Becker, A., Rothenberger, A., Sohn, A., Ravens-Sieberer, U. & Klasen, F., & BELLA Study Group. (2015). Six years ahead: Course and predictive value of psychopathological screening in children of the community. European Child and Adolescent Psychiatry, 24, 715–725. doi 10.1007/s00787-015-0706-4

  • Björnsdotter, A., Enebrink, P. & Ghaderi, A. (2013). Psychometric properties of online administered parental strengths and difficulties questionnaire (SDQ), and normative data based on combined online and paper-and-pencil administration. Child and Adolescent Psychiatry and Mental Health, 7, 40.

  • Bourdon, K. H., Goodman, R., Rae, D., Simpson, G. & Koretz, D. S. (2005). The Strengths and Difficulties Questionnaire: U. S. normative data and psychometric properties. Journal of the American Academy of Child and Adolescent Psychiatry, 44, 557–564.

  • Bøe, T., Hysing, M., Skogen, J. C. & Breivik, K. (2016). The Strengths and Difficulties Questionnaire (SDQ): Factor structure and gender equivalence in Norwegian adolescents. PLoS ONE, 11(5), e0152202.

  • Caci, H., Morin, A. J. & Tran, A. (2015). Investigation of a bifactor model of the Strengths and Difficulties Questionnaire. European Child & Adolescent Psychiatry, 24, 1291–1301.

  • Capron, C., Thérond, C. & Duyme, M. (2007). Psychometric properties of the French version of the self-report and teacher strengths and difficulties questionnaire (SDQ). European Journal of Psychological Assessment, 23, 79–88.

  • Cortina, J. M. (1993). What is coefficient alpha? An examination of theory and applications. Journal of Applied Psychology, 78, 98–104.

  • Costello, E. J., Mustillo, S., Erkanli, A., Keeler, G. & Angold, A. (2003). Prevalence and development of psychiatric disorders in childhood and adolescence. Archives of General Psychiatry, 60, 837–844.

  • Cox, L. H. (1987). A constructive procedure for unbiased controlled rounding. Journal of the American Statistical Association, 82, 520–524.

  • Dickey, W. & Blumberg, S. (2004). Revisiting the factor structure of the Strengths and Difficulties Questionnaire. Journal of the American Academy of Child and Adolescent Psychiatry, 43, 1159–1167.

  • Döpfner, M. & Petermann, F. (2012). Diagnostik psychischer Störungen im Kindes- und Jugendalter [Diagnosis of mental disorders in children and adolescents] (Vol. 2). Hogrefe Verlag.

  • Du, Y., Kou, J. & Coghill, D. (2008). The validity, reliability and normative scores of the parent, teacher and self-report versions of the Strengths and Difficulties Questionnaire in China. Child and Adolescent Psychiatry and Mental Health, 2, 8.

  • Dunn, T. J., Baguley, T. & Brunsden, V. (2014). From alpha to omega: A practical solution to the pervasive problem of internal consistency estimation. British Journal of Psychology, 105, 399–412.

  • Essau, C. A., Olaya, B., Anastassiou-Hadjicharalambous, X., Pauli, G., Gilvarry, C. & Bray, D., … (2012). Psychometric properties of the Strengths and Difficulties Questionnaire from five European countries. International Journal of Methods in Psychiatric Research, 21, 232–245.

  • Fombonne, E. (1991). The use of questionnaires in child psychiatry research: Measuring their performance and choosing an optimal cutoff. Journal of Child Psychology and Psychiatry, 32, 677–693.

  • Goodman, R. (1997). The Strengths and Difficulties Questionnaire: A research note. Journal of Child Psychology and Psychiatry, 38, 581–586.

  • Goodman, R. (2001). Psychometric properties of the Strengths and Difficulties Questionnaire. Journal of the American Academy of Child & Adolescent Psychiatry, 40, 1337–1345.

  • Goodman, R., Ford, T., Simmons, H., Gatward, R. & Meltzer, H. (2000). Using the Strengths and Difficulties Questionnaire (SDQ) to screen for child psychiatric disorders in a community sample. British Journal of Psychiatry, 177, 534–539.

  • Goodman, A., Lamping, D. L. & Ploubidis, G. B. (2010). When to use broader internalising and externalising subscales instead of the hypothesised five subscales on the Strengths and Difficulties Questionnaire (SDQ). Data from British parents, teachers and children. Journal of Abnormal Child Psychology, 38, 1179–1191.

  • Goodman, R., Meltzer, H. & Bailey, V. (1998). The Strengths and Difficulties Questionnaire: A pilot study on the validity of the self-report version. European Child and Adolescent Psychiatry, 7, 125–130.

  • Hagenberg, N., Becker, A., Roessner, V., Woerner, W. & Rothenberger, A. (2004). Evaluation of the self-reported SDQ in a clinical setting: Do self-reports tell us more than ratings by adult informants? European Journal of Child and Adolescent Psychiatry, 13, 17–23.

  • Hill, C. & Hughes, J. (2007). An examination of the convergent and discriminant validity of the Strengths and Difficulties Questionnaire. School Psychology Quarterly, 22, 380–406.

  • Hodges, K. (1993). Structured interviews for assessing children. Journal of Child Psychology and Psychiatry, 34, 49–68.

  • Hölling, H., Kurth, B.-M., Rothenberger, A., Becker, A. & Schlack, R. (2008). Assessing psychopathological problems of children and adolescents from 3 to 17 years in a nationwide representative sample: Results of the German health interview and examination survey for children and adolescents (KiGGS). European Child & Adolescent Psychiatry, 17(Suppl 1), 34–41.

  • Hölling, H., Schlack, R., Kamtsiuris, P., Butschalowsky, H., Schlaud, M. & Kurth, B. M. (2012). The KiGGS study: Nationwide representative longitudinal and cross-sectional study on the health of children and adolescents within the framework of health monitoring at the Robert Koch Institute. Bundesgesundheitsblatt, Gesundheitsforschung, Gesundheitsschutz, 55, 836–842.

  • Hölling, H., Schlack, R., Petermann, F., Ravens-Sieberer, U. & Mauz, E., & KiGGS Study Group. (2014). Psychological disorders and psychosocial impairment in children and adolescents aged between 3 and 17 years in Germany: Prevalence and temporal trends at 2 collection periods (2003–2006 and 2009–2012). Federal Health Gazette – Health Research Health Protection, 57, 807–819.

  • Kish, L. (1965). Survey sampling. New York: Wiley.

  • Klasen, H., Woerner, W., Wolke, D., Meyer, R., Overmeyer, S., Kaschnitz, W. & … Goodman, R. (2000). Comparing the German versions of the strengths and difficulties questionnaire (SDQ-Deu) and the child behavior checklist. European Child & Adolescent Psychiatry, 9, 271–276.

  • Kline, R. (2000). Reliability of tests: Practical issues. In The handbook of psychological testing (2nd ed., pp. 7–16). London: Routledge.

  • Koglin, U., Barquero, B., Mayer, H., Scheithauer, H. & Petermann, F. (2007). German version of the Strengths and Difficulties Questionnaire (SDQ-Deu): Psychometrische Qualität der Lehrer-/Erzieherversion für Kindergartenkinder [Psychometric quality of the teacher/educator version for kindergarten children]. Diagnostica, 53(4), 175–183.

  • Koskelainen, M., Sourander, A. & Kaljonen, A. (2000). The Strengths and Difficulties Questionnaire among Finnish school-aged children and adolescents. European Child and Adolescent Psychiatry, 9, 277–284.

  • Kurth, B. M., Lange, C., Kamtsiuris, P. & Hölling, H. (2009). Health monitoring at the Robert Koch Institute: Status and perspectives. Bundesgesundheitsblatt, Gesundheitsforschung, Gesundheitsschutz, 52, 557–570.

  • Kurth, B. M., Kamtsiuris, P., et al. (2008). The challenge of comprehensively mapping children’s health in a nationwide health survey: Design of the German KiGGS-Study. BMC Public Health, 8(1), 196.

  • Kóbor, A., Takács, Á. & Urbán, R. (2013). The bifactor model of the Strengths and Difficulties Questionnaire. European Journal of Psychological Assessment, 29, 299–307.

  • Lange, M., Butschalowsky, H. G., Jentsch, F., Kuhnert, R., Schaffrath, A. R., Schlaud, M. & Kamtsiuris, P. (2014). The first KiGGS follow-up (KiGGS Wave 1): Study conduct, sample design, and response. Bundesgesundheitsblatt, Gesundheitsforschung, Gesundheitsschutz, 57, 747–761.

  • Lohbeck, A., Schultheiß, J., Petermann, F. & Petermann, U. (2015). Die deutsche Selbstbeurteilungsversion des Strengths and Difficulties Questionnaire (SDQ-Deu-S): Psychometrische Eigenschaften, Faktorenstruktur und Grenzwerte [The German self-assessment version of the Strengths and Difficulties Questionnaire (SDQ-Deu-S): Psychometric properties, factor structure, and cutoff values]. Diagnostica, 61, 222–235.

  • Masi, G., Muratori, P., Manfredi, A., Lenzi, F., Polidori, L., Ruglioni, L., … Milone, A. (2013). Response to treatments in youth with disruptive behaviour disorders. Comprehensive Psychiatry, 54, 1009–1015.

  • Morgan, C. J. & Cauce, A. M. (1999). Predicting DSM-III-R disorders from the Youth Self-Report: Analysis of data from a field study. Journal of the American Academy of Child & Adolescent Psychiatry, 38, 1237–1245.

  • Muris, P., Meesters, C., Eijkelenboom, A. & Vincken, M. (2004). The self-report version of the Strengths and Difficulties Questionnaire: Its psychometric properties in 8- to 13-year-old non-clinical children. British Journal of Clinical Psychology, 43, 437–448.

  • Muris, P., Meesters, C. & van den Berg, F. (2003). The Strengths and Difficulties Questionnaire (SDQ): Further evidence for its reliability and validity in a community sample of Dutch children and adolescents. European Child and Adolescent Psychiatry, 12, 1–8.

  • Ortuño-Sierra, J., Fonseca-Pedrero, E., Aritio-Solana, R., Velasco, A. M., De Luis, E. C., Schumann, G. & Bokde, A. (2015). New evidence of factor structure and measurement invariance of the SDQ across five European nations. European Child & Adolescent Psychiatry, 24, 1523–1534.

  • Patalay, P., Fonagy, P., Deighton, J., Belsky, J., Vostanis, P. & Wolpert, M. (2015). A general psychopathology factor in early adolescence. The British Journal of Psychiatry, 207, 15–22.

  • Petermann, U., Döpfner, M., Lehmkuhl, G. & Scheithauer, H. (2000). Klassifikation und Epidemiologie psychischer Störungen [Classification and epidemiology of mental disorders]. In F. Petermann (Ed.), Lehrbuch der klinischen Kinderpsychologie und -psychotherapie (pp. 30–56). Hogrefe: Berlin.

  • Ravens-Sieberer, U., Wille, N., Bettge, S. & Erhart, M. (2007). Psychische Gesundheit von Kindern und Jugendlichen in Deutschland: Ergebnisse aus der BELLA-Studie im Kinder- und Jugendgesundheitssurvey (KiGGS) [Mental health of children and adolescents in Germany: Results from the BELLA study in the child and adolescent health survey]. Bundesgesundheitsblatt, Gesundheitsforschung, Gesundheitsschutz, 50, 871–878.

  • Revelle, W. & Zinbarg, R. E. (2009). Coefficients alpha, beta, omega, and the GLB: Comments on Sijtsma. Psychometrika, 74, 145–154.

  • Rogge, J., Speck, K., Hölling, H., Minnaert, A., Koglin, U. & Schlack, R. (2017). Messinvarianz zwischen Eltern- und Jugendversion des Strengths and Difficulties Questionnaire (SDQ)? [Measurement invariance between parent and youth version of the Strengths and Difficulties Questionnaire (SDQ)?]. Diagnostica.

  • Rothenberger, A., Becker, A., Erhart, M., Wille, N. & Ravens-Sieberer, U. (2008). Psychometric properties of the parent strengths and difficulties questionnaire in the general population of German children and adolescents: Results of the BELLA study. European Child & Adolescent Psychiatry, 17(Suppl 1), 99–105.

  • Sharratt, K., Boduszek, D., Gallagher, B. & Jones, A. (2018). Factor structure and factorial invariance of the strengths and difficulties questionnaire among children of prisoners and their parents. Child Indicators Research, 11(2), 649–660.

  • Sijtsma, K. (2009). On the use, the misuse, and very limited usefulness of Cronbach’s alpha. Psychometrika, 74, 107–120.

  • Streiner, D. L. (2003). Starting at the beginning: An introduction to coefficient alpha and internal consistency. Journal of Personality Assessment, 80, 99–103.

  • Stone, L. L., Otten, R., Engels, R. C., Vermulst, A. A. & Janssens, J. M. (2010). Psychometric properties of the parent and teacher versions of the Strengths and Difficulties Questionnaire for 4- to 12-year-olds: A review. Clinical Child and Family Psychology Review, 13, 254–274.

  • van de Looij-Jansen, P. M., Goedhart, A. W., de Wilde, E. J. & Treffers, P. D. (2011). Confirmatory factor analysis and factorial invariance analysis of the adolescent self-report Strengths and Difficulties Questionnaire: How important are method effects and minor factors? British Journal of Clinical Psychology, 50, 127–144.

  • Vostanis, P. (2006). Strengths and Difficulties Questionnaire: Research and clinical applications. Current Opinion in Psychiatry, 19, 367–372.

  • Yao, S., Zhang, C., Zhu, X., Jing, X., McWhinnie, C. M. & Abela, J. R. (2009). Measuring adolescent psychopathology: psychometric properties of the self-report strengths and difficulties questionnaire in a sample of Chinese adolescents. Journal of Adolescent Health, 45, 55–62.

  • Woerner, W., Becker, A. & Rothenberger, A. (2004). Normative data and scale properties of the German parent SDQ. European Child and Adolescent Psychiatry, 13, II3–II10.

  • Woerner, W., Becker, A., Friedrich, C., Klasen, H., Goodman, R. & Rothenberger, A. (2002). Normierung und Evaluation der deutschen Elternversion des Strengths and Difficulties Questionnaire (SDQ): Ergebnisse einer repräsentativen Felderhebung [Standardization and Evaluation of the German Parent Version of the Strengths and Difficulties Questionnaire (SDQ): Results of a Representative Field Survey]. Zeitschrift für Kinder- und Jugendpsychiatrie und Psychotherapie, 30, 105–112.

Biyao Wang, Department of Child and Adolescent Psychiatry and Psychotherapy, University Medical Center Göttingen, von-Siebold-Str. 5, 37075 Göttingen, Germany