Background
The concept of personality organization or, in other terms, personality structure stands for intrapsychic formations that represent a basis of the personality and determine a person’s functioning in dealing with his or her own self and interpersonal relationships. Thus, personality functioning can be regarded as the observable manifestation of the underlying personality organization. The assessment of personality functioning goes back to Freud’s first structural model [
1], which distinguished conscious, pre-conscious, and unconscious aspects of the mind. Based on Anna Freud’s work on the defense mechanisms [
2], Hartmann [
3,
4] described ego functions as the result of healthy development and a basic condition for mental equilibrium and psychosocial functioning. Kernberg [
5,
6] coined the term personality organization and initially distinguished three levels: the neurotic, borderline, and psychotic levels of personality organization. While neurotic patients are characterized by an integrated identity, mature defense mechanisms (e.g., repression, rationalization, intellectualization), and good reality testing, borderline patients show impaired identity integration (“identity diffusion”) and primitive defense mechanisms (splitting, idealization, devaluation, denial, projective identification). (It should be noted that the term borderline personality organization stands for a level of personality functioning and not for the nosological entity borderline personality disorder. However, borderline personality disorder usually occurs on a borderline level of personality organization.) On a psychotic level, in addition, reality testing is suspended. The basic assumption in Kernberg’s model is that the internal world of individuals on a borderline or psychotic level consists of split-off aspects of the self and others, which means that there are no integrated internal images of the self and significant others. This deficit leads to numerous problems in personality functioning in the realm of identity, interpersonal relations, coping with stress and aggression, as well as moral values. Kernberg presented a theoretical model that assigns the different personality disorders to different levels of personality organization [
7], p. 14. In this model personality disorders like obsessive-compulsive or depressive-masochistic PD are on a neurotic level, histrionic, dependent, and narcissistic on a higher borderline level, and borderline, paranoid, schizoid, and antisocial on a lower borderline level of personality organization. Thus, DSM-IV cluster C personality disorders are located on a higher level of personality organization than cluster A and B personality disorders.
On the basis of Bowlby’s attachment theory [
8] Fonagy [
9,
10] developed the model of mentalization that focuses on an individual’s ability to understand emotions, thoughts, and motives of other people and the mutual processes in interpersonal relationships. Mentalization has been operationalized as reflective functioning and can be assessed by means of Fonagy’s Reflective Functioning Scale [
11]. It has been shown that reflective functioning is highly correlated with personality organization as conceptualized by Kernberg [
12]. Alternative and well-established measures for the assessment of personality functioning are, e.g., Wallerstein’s Scales of Psychological Capacities (SPC) [
13] and the Operationalized Psychodynamic Diagnosis (OPD) [
14].
Based on the above-mentioned concepts, an assessment of personality functioning was recently developed [
15] that has been incorporated into the new DSM-5 classification [
16]. The annex (Section 3) of the DSM-5 contains the Levels of Personality Functioning Scale (LPFS) for diagnosing personality disorders. The LPFS consists of two dimensions with two subdomains each: Self (identity and self-direction) and interpersonal (empathy and intimacy). This scale is provided for future research as an assessment tool for severity of personality disorders [
16]. The International Classification of Diseases (ICD-11) will probably contain a similar measure [
17].
Kernberg was the first to describe a clinical interview, the “Structural Interview” [
6,
18], which aimed at the assessment of personality organization in a clinical and qualitative way. To quantify Kernberg’s dimensions of personality organization, Clarkin, Kernberg, and colleagues first created a questionnaire, the Inventory of Personality Organization (IPO) [
19,
20]. Thereafter, an interviewer-based instrument was developed by the same group, the Structured Interview of Personality Organization (STIPO) [
21]. This structured 100-item interview contains seven domains that were derived from Kernberg’s conceptual work: (1) identity, (2) quality of object relations, (3) primitive defenses, (4) coping and rigidity, (5) aggression, (6) moral values, (7) reality testing and perceptual distortions.
The development of the STIPO was described in detail by Stern et al. [
22]. They focused on the three scales identity, primitive defenses, and reality testing as the core domains of Kernberg’s model of personality organization. Interrater reliability was found to be satisfactory (intraclass correlations .96 for identity, .97 for primitive defenses, and .72 for reality testing), as was internal consistency (Cronbach’s alpha .86 for identity, .85 for primitive defenses, and .69 for reality testing). Moreover, the STIPO dimension identity predicted positive and negative affect (assessed by the Schedule of Nonadaptive and Adaptive Personality, SNAP [
23]), whereas the STIPO scale primitive defenses was correlated with aggression (assessed by the SNAP and the IPO) and cluster B personality traits according to DSM-IV [
22]. Stern et al. [
22] cautiously regarded their initial study on the STIPO as preliminary empirical support for Kernberg’s model of personality organization. They recommended it to researchers “interested in the empirical relation between psychoanalytically informed constructs, contemporary trait models of normal and disordered personality and their neurobehavioral underpinnings, and current personality disorder nosology” [
22], p. 43. In addition, Stern and colleagues pointed out the necessity of a replication of their results in other studies with diverse samples.
Hörz [
24] demonstrated the construct validity of the STIPO by generating a prototype of borderline personality organization that correlated significantly with corresponding clinical measures. In a treatment outcome study on borderline personality disorder [
25] the STIPO demonstrated its sensitivity to change: patients with borderline personality disorder treated with Transference-Focused Psychotherapy (TFP) [
7] showed a significantly higher improvement of personality organization than patients of the control group. Moreover, the STIPO was used as a severity measure of psychopathology, especially in personality disorders: It was shown that worse personality functioning as assessed by the STIPO goes along with more axis I and more axis II diagnoses [
26].
Taken together, the STIPO is the only structured interview for the assessment of personality functioning. As such, it allows for the determination of specific psychopathology beyond symptoms. Since psychodynamic treatments in particular aim at changing personality functioning, an instrument like the STIPO is needed to empirically demonstrate changes of this kind, as was done in the study by Doering et al. [
25]. From a conceptual point of view, as pointed out by Stern et al. [
22], the STIPO might be helpful to empirically test Kernberg’s model of personality organization. For these reasons it seemed worthwhile to translate the instrument into German and to replicate and in part extend the findings of the above-mentioned previous studies. In this study, the reliability and validity of the German version of the 100-item STIPO were evaluated in 122 psychiatric patients. SCID-I and -II interviews were used for diagnosing psychiatric disorders and for the determination of discriminant validity. Scales from eight well-validated questionnaires served as external criteria for the assessment of concurrent validity.
Discussion
The Structured Interview of Personality Organization was evaluated with regard to interrater reliability, internal consistency of the seven scales, as well as concurrent and discriminant validity.
Interrater reliability was high: the intraclass correlations (ICC) of .89 to 1.0 for the seven dimensions and .96 for the global rating are in line with those reported by Stern et al. [
22] for the STIPO as well as with other structured interviews like the SCID-II (.90 to .98) [
31]. As expected, the interrater reliability of this structured interview exceeds the values reported for less structured, clinically oriented interviews like the Scales of Psychological Capacities (SPC) [
13] or the Operationalized Psychodynamic Diagnosis (OPD-2) [
14]. For the SPC, ICCs of .54 to .89 (mean ICC = .82) were reported [
50]; for the OPD-2 structure axis, the ICC varied between .61 and .82 for the subdimensions and was .83 for the total score [
33]. This difference can easily be explained by the fact that structured interviews give more detailed and strict advice for the rating of each single item, whereas unstructured interviews grant the freedom to judge in a more clinical fashion, which might lead to a better clinical impression of the patient at the expense of reliability in terms of agreement between different raters.
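As an illustration of how an intraclass correlation of this kind can be computed, the following is a minimal sketch. It assumes the two-way random-effects, absolute-agreement, single-rater form, often written ICC(2,1); the ICC variant actually used in this study is not stated here, and the function name and the ratings are hypothetical.

```python
from statistics import mean


def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: one row per patient, each row holding one score per rater.
    The choice of ICC(2,1) is an assumption made for illustration only.
    """
    n = len(ratings)        # number of patients
    k = len(ratings[0])     # number of raters
    grand = mean(x for row in ratings for x in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(row[j] for row in ratings) for j in range(k)]

    # Sums of squares of the two-way ANOVA without replication
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between patients
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between raters
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


# Hypothetical ratings: 5 patients, each scored by 2 raters on one dimension
example = [[2, 2], [4, 5], [3, 3], [5, 5], [1, 2]]
print(round(icc_2_1(example), 2))
```

Because this form penalizes systematic differences between raters as well as random disagreement, it tends to give lower values than consistency-type ICCs computed on the same ratings.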
The internal consistency of the seven STIPO dimensions ranged between .80 and .93, with the exception of the dimension reality testing (.69); Cronbach’s α for the total score was .97. This confirms the values reported by Stern et al. [
22], who also found Cronbach’s α above .80 for identity and primitive defenses, but a lower value for reality testing (.69). The lower internal consistency of the dimension reality testing can be explained by the fact that this scale contains different constructs like paranoid thinking, dissociation, and depersonalization that do not necessarily correlate highly in all patients. Maffei et al. [
31] found Cronbach’s α between .71 and .94 for the SCID-II personality disorder scales. These results show that the constructs of dimensions of personality functioning are as coherent as the constructs of distinct personality disorders in the DSM-IV; both are on a satisfactory level.
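For readers less familiar with the coefficient, Cronbach’s α for a scale of k items is k/(k−1) multiplied by one minus the ratio of the summed item variances to the variance of the total score. The sketch below implements this standard formula; the function name and the item scores are hypothetical and are not taken from the study data.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha from a persons-by-items score matrix.

    scores: one row per person, each row holding one score per item.
    Uses sample variances (n - 1 in the denominator).
    """
    k = len(scores[0])  # number of items in the scale
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical scores: 4 persons answering a 2-item scale
example = [[1, 2], [2, 1], [3, 4], [4, 3]]
print(cronbach_alpha(example))
```

A scale that mixes loosely related constructs yields item variances that are large relative to the variance of the total score, which lowers α; this is the mechanism proposed above for the reality testing dimension.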
The STIPO dimensions were highly intercorrelated (.48 to .79), which means that the seven dimensions are not independent of each other. This is not surprising, since Kernberg conceptualized the dimensions of personality organization as different manifestations of an underlying core pathology, namely identity diffusion as a result of disturbed development during early life due to genetic disposition and mainly adverse early relationships [
5,
6]. From a theoretical point of view, it could be argued that one dimension would be enough for the determination of personality organization or functioning. This argument supports the development of a short version of the STIPO, which is currently being prepared by the authors of the instrument. From a clinical point of view, one would be reluctant to relinquish the important detailed clinical information from each of the STIPO dimensions. Consequently, it will be advisable to maintain both a short and a long version: a short version for screening purposes and general scientific use, and a long version for treatment planning in the clinical field and specific research questions.
As far as concurrent validity is concerned, the STIPO correlated significantly with all a priori selected corresponding questionnaire scales, but the STIPO dimensions also correlated significantly with almost all of the other questionnaire scales. At first glance, this result suggests that a general factor underlies the different measures, which might be a general severity of psychopathology. However, a closer look reveals a number of relevant details of the correlational patterns. Throughout all STIPO dimensions except moral values and in part object relations and reality testing, the highest correlations occur with the BPI dimensions (except BPI reality testing). This can be attributed to the fact that the BPI is the only questionnaire employed in this study that is explicitly rooted in Kernberg’s theory, while other instruments like the Frankfurt Self-Concept Scales, the State-Trait-Anger-Expression-Inventory, and the Experiences in Close Relationships were not developed to assess personality functioning, but different cognitive or behavioral aspects. Stern et al. [
22] used the Inventory of Personality Organization (IPO) [
19] as the criterion for concurrent validity testing of the STIPO scales “identity”, “primitive defenses”, and “reality testing”. They found significant correlations between .45 and .57, which reflects the closeness of the two instruments: both the IPO questionnaire and the STIPO are based on Kernberg’s concepts.
Another remarkable finding is that almost all STIPO dimensions showed their highest correlations with the ADP-IV “antisocial PD” scale. In Kernberg’s concept, antisocial behavior is clearly regarded as a manifestation of a very low level of personality organization [
5,
6]. Since antisocial behavior is the key feature of antisocial personality disorder, it is not surprising that the “antisocial PD” score correlates highly with the STIPO dimensions. Interestingly, the STIPO dimension “moral values” shows the lowest correlations with all other questionnaire scales but the ADP-IV “antisocial PD” scale. This may indicate that the converse does not hold: a low level of personality organization does not necessarily go along with antisocial behavior, whereas antisocial behavior is linked to a low level of personality organization.
The concept of identity underlying the STIPO is also the basis of the BPI; thus, it is to be expected that the correlations of the STIPO “identity” domain with the BPI scales are high. The fact that the correlations with the BPI scales “primitive defenses” and “fear of closeness” were even slightly higher than the one with the BPI scale “identity diffusion” again shows that the different dimensions of Kernberg’s concept are not independent of each other, but rather rooted in a shared basic pathology. The same is true for “primitive defenses”, which are closely related to identity diffusion in Kernberg’s concept.
The STIPO “object relations” scale shows comparably low correlations with the predicted questionnaire scales. This may have two reasons: Kernberg’s concept of object relations (and the corresponding STIPO scale) does not only contain cognitive and behavioral aspects, but also emphasizes affective components and internal working models of relationships. These are not (or much less) incorporated into the FSKN and ECR scales. The other reason for the lower correlations can probably be found in the fact that an impairment in this domain does not exclusively occur in patients with low personality organization, but also in patients with moderate impairment (neurotic level). Some features of the STIPO object relations domain are regarded as specific for low personality functioning (e.g., incapacity to be alone), while others are not (e.g., not having an intimate relationship for years).
The two STIPO subdomains of aggression, “self-directed” and “other-directed aggression”, correlate more highly with the BPI and a few other scales than with the predicted FKBS and STAXI scales. This result can be explained by the different degrees or forms of aggression that are addressed: the STIPO asks about severe and partly physical aggression against the self and others, while the two questionnaires primarily focus on less severe inner feelings of aggression or less severe verbal expression of anger towards others.
In addition to high correlations with the BPI scales and the ADP-IV “antisocial PD” scale, the STIPO domain “coping/rigidity” showed the predicted high correlation with the SPQ “negative coping” scale, whereas the predicted correlation with the FSKN “problem solving” scale was very low. A closer look at the FSKN scale reveals that it focuses on a person’s self-image, while the STIPO items are directed more towards behavioral aspects, that is, how a person actually copes with specific strains.
The low correlation of the STIPO “reality testing” domain with the corresponding BPI scale was somewhat surprising, since both instruments are rooted in the same theory. However, a closer inspection of the BPI items reveals that the questionnaire solely addresses hallucinatory symptoms and thought disorder, while the STIPO aims at a broader spectrum of problems including paranoid thinking, dissociation, and depersonalization.
To summarize, the results demonstrated the convergent validity of the STIPO by means of a priori hypothesized correlations between the STIPO domains and related questionnaire scales. The hypothesized correlations were not always the highest ones that each STIPO domain showed across the questionnaire scales, which can be attributed to the different concepts underlying the instruments. As mentioned above, the highest correlations were found with the conceptually most closely related measures, the BPI and the ADP-IV “antisocial PD” scale.
The evaluation of discriminant validity is of particular importance here, since DSM-5 [
15,
16] has adopted the concept of personality functioning for the assessment of personality disorders. If the STIPO is to be acceptable as an instrument for the determination of personality functioning in the sense of DSM-5, it must be able to differentiate between patients with and without personality disorder and between different degrees of severity among patients with personality disorders. Our results demonstrate that the STIPO can distinguish well between patients with and without personality disorder (between-group effect size d=1.62) as well as between cluster B and cluster C personality disorders (d=1.26).
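The effect size d reported here expresses a group difference in units of standard deviation. The following minimal sketch assumes the common pooled-standard-deviation form of Cohen’s d; which variant of d was computed in this study is not stated, and the function name and the scores are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d using the pooled sample standard deviation.

    The pooled-SD form is an assumption made for illustration only.
    """
    n1, n2 = len(group_a), len(group_b)
    pooled_sd = sqrt(
        ((n1 - 1) * variance(group_a) + (n2 - 1) * variance(group_b))
        / (n1 + n2 - 2)
    )
    return (mean(group_a) - mean(group_b)) / pooled_sd


# Hypothetical global ratings for two small groups of patients
with_pd = [2, 4, 6]
without_pd = [1, 3, 5]
print(cohens_d(with_pd, without_pd))  # 0.5
```

By the usual convention, values of d around 0.8 are already described as large, so differences of 1.26 and 1.62 standard deviations indicate strong group separation.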
Taken together, the STIPO can be regarded as a reliable and valid instrument for the assessment of personality functioning in clinical and research settings. It might help to validate the DSM-5 Levels of Personality Functioning Scale [
16] and other instruments, particularly questionnaires, that aim at the assessment of these dimensions. In a clinical setting, the duration of a STIPO interview (90 to 180 minutes) might be seen as a disadvantage. Therefore, short versions of the interview are presently being developed. For research purposes, the STIPO appears to have two main advantages compared to other instruments: the structured interview type yields better interrater reliability, and no audio or video recordings are needed for the ratings, which are done by the interviewer during the interview. In contrast, less structured interviews like SPC or OPD-2 require expert ratings of the recordings for the achievement of sufficient reliability of the ratings.
A highly interesting research question for the future will be whether the DSM-5 restriction to two domains of personality functioning (self and interpersonal) is justified. Recently, it was argued that additional dimensions of personality functioning, such as aggression/impulse control, are needed [
51]. On the one hand, the additional dimensions of the STIPO or related instruments provide important clinical information for indication and treatment planning. On the other hand, it remains to be investigated whether the different dimensions change simultaneously during treatment or consecutively. It appears likely that, e.g., the sense of self and others changes before the quality of interpersonal relationships improves. It is to be expected that the DSM-5 focus on personality functioning will trigger research in this area that can benefit from reliable and valid instruments like the STIPO.
Competing interests
The authors declare that they have no competing interests.
Authors’ contributions
SD developed the study design, carried out the statistical analyses, and drafted the manuscript. ML, DM, BB, and MF recruited the patients and were responsible for data acquisition. GS, MB, GH, and SD contributed substantially to data analysis and interpretation of data. All authors contributed substantially to a number of revisions of the manuscript and gave final approval for its submission. All authors read and approved the final manuscript.