We found problems with the convergent/divergent properties of the CVF subscales when applied to a survey of non-supervisory VHA employees. Employees did not appear to distinguish among entrepreneurial, team and rational cultures. Furthermore, the four-item hierarchical subscale had mediocre reliability. These findings could reflect one or more problems with external, internal, and construct validities.
External validity
The CVF as a model, or the CVF instrument, may not generalize to the VHA, or to non-managers (or to the combination of both). The CVF was validated originally among managers of non-governmental organizations, whereas this study applied it to non-supervisors in VHA, a national, integrated health care delivery system and agency of the federal government.
Preliminary to the analyses reported here, we conducted measurement equivalence/invariance (ME/I) analysis [37] to compare response equivalence among employees at different supervisory levels and to determine whether perceptions of organizational culture differ systematically among organizational members belonging to different hierarchy levels or subgroups. We found factor structures essentially identical to those reported here, but item response levels differed systematically: the higher the supervisory level, the higher respondents rated their organization on items from the entrepreneurial, team and rational subscales and the lower they rated it on items from the hierarchical subscale (details of these analyses are available from the authors). As a result, we elected to focus our present analysis on the instrument's performance among non-supervisors. However, our preliminary ME/I analysis suggests that the instrument performs differently among different supervisory levels within VHA, and further ME/I analysis based on supervisory level is warranted.
We also believe ME/I analyses are needed to assess response equivalence among employees of an organization over time. Time invariance studies are important to determine whether observed differences reflect changes in the phenomenon being studied or changes in the relationships between the factors or constructs and their corresponding items [34].
Internal validity
The instrument used in this study may have contributed to poor internal validity, owing either to measurement problems with the original instruments published by Shortell and colleagues [10] and Zammuto and Krakower [20], or to modifications made to the survey used in VHA. We focus here on describing the modifications to the VHA instrument, and why we believe they do not represent significant threats to internal validity. We then briefly touch on issues related to the original instruments.
There were three modifications to the instrument used in VHA relative to Shortell and colleagues' instrument [10]. First, the VHA instrument had 14 items rather than 20. Six items were dropped during pilot testing due to survey length constraints. The four items addressing facility rewards were dropped (one each from the four culture subscales), and one additional item each was dropped from the team and rational subscales (both relating to the "facility character" domain). The eliminated items were selected to minimize the effect on scale reliability. For example, dropping the two items from the team subscale reduced the alpha coefficient from 0.79 to 0.78, and then to 0.76, in the pilot study data. A summary of pilot findings is available upon request from the authors.
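The effect of dropping items on the alpha coefficient can be illustrated with a minimal sketch of Cronbach's alpha and the "alpha if item deleted" statistic. The function names and the small response matrix below are hypothetical and chosen only for illustration; they are not the pilot data, and this is not necessarily the software or procedure used in the study.

```python
def sample_variance(values):
    """Unbiased sample variance of a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def cronbach_alpha(rows):
    """Cronbach's alpha for a respondents-by-items score matrix.

    rows: list of respondents, each a list of k >= 2 item scores.
    Assumes the summed scores are not all identical (nonzero total variance).
    """
    k = len(rows[0])
    item_variances = [sample_variance([r[j] for r in rows]) for j in range(k)]
    total_variance = sample_variance([sum(r) for r in rows])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


def alpha_if_deleted(rows):
    """Alpha recomputed with each item removed in turn."""
    k = len(rows[0])
    return [
        cronbach_alpha([[v for i, v in enumerate(r) if i != j] for r in rows])
        for j in range(k)
    ]


# Hypothetical 5-point responses: five respondents, four items.
# The fourth item is deliberately unrelated to the other three.
responses = [
    [1, 1, 2, 5],
    [2, 2, 2, 1],
    [3, 3, 3, 4],
    [4, 4, 5, 2],
    [5, 5, 4, 3],
]

overall = cronbach_alpha(responses)
without_each = alpha_if_deleted(responses)
```

In this toy matrix, removing the unrelated fourth item raises alpha, whereas removing an item that tracks the others lowers it; this is the comparison one would make when choosing which items to drop with minimal loss of reliability.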
It is conceivable though unlikely that shortening the instrument altered its psychometric properties and accounts for our findings. Four of the dropped items, those addressing facility rewards, were not original to the Zammuto and Krakower survey [20], but were added by Shortell and colleagues [10]. Adding the items back would not alter the high correlations among items in the entrepreneurial, rational and team subscales, and it is unlikely they would change the factor structure. Moreover, one would expect dropped items to be reflected in poor alpha statistics. However, the alpha statistics for the team and rational subscales (the subscales from which two items each were dropped) were already reasonably high, and improving them further would not change our conclusions. The hierarchical subscale was the only one of the four with poor reliability, and it is possible that the addition of an item would have improved the alpha coefficient. On the other hand, a hierarchical subscale based on four items is consistent with the Zammuto and Krakower instrument [20] upon which the Shortell instrument is based [10].
The second modification was to the wording of two items, both from the hierarchical subscale. The changes were made following pilot testing to improve readability and comprehension. The wording of VHA items was otherwise identical to that of the Shortell instrument [10]. Nonetheless, the changes may have altered the scales' psychometric properties. In the VHA survey, item nine reads, "The glue that holds my facility together is formal rules and policies. People feel that following the rules is important." In the Shortell and colleagues instrument, the first statement is the same, but the second reads, "Maintaining a smooth-running institution is important here." Similarly, item 13 reads, "My facility emphasizes permanence and stability. Keeping things the same is important," whereas in the Shortell and colleagues instrument, the latter part of item 13 reads, "Efficient, smooth operations are important." Thus, in the modified VHA items, the elements of coordination and operational efficiency are lost and those of rule adherence and stability are reinforced.
These changes may account for the poor reliability of the hierarchical subscale. This is suggested by the fact that dropping item 13 marginally improved reliability. One can speculate that had a fourth item with better item-rest correlation been included, the hierarchical subscale might have met conventional thresholds of reliability.
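For readers unfamiliar with the statistic, the item-rest correlation is the correlation between one item and the sum of the remaining items in its subscale. The following is a minimal sketch under stated assumptions: the helper names and the response matrix are hypothetical, and the numbers are not VHA data.

```python
def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def item_rest_correlations(rows):
    """Correlate each item with the sum of the other items in the scale."""
    k = len(rows[0])
    results = []
    for j in range(k):
        item = [r[j] for r in rows]
        rest = [sum(r) - r[j] for r in rows]  # scale total excluding item j
        results.append(pearson(item, rest))
    return results


# Hypothetical four-item subscale; the last item does not track the others.
subscale = [
    [1, 1, 2, 5],
    [2, 2, 2, 1],
    [3, 3, 3, 4],
    [4, 4, 5, 2],
    [5, 5, 4, 3],
]

item_rest = item_rest_correlations(subscale)
weakest_item = item_rest.index(min(item_rest))  # zero-based position
```

An item with a low or negative item-rest correlation, like the last item in this toy matrix, is the kind of item whose removal tends to improve a subscale's reliability.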
Nevertheless, these changes fail to account for the poor divergent properties of the rational, entrepreneurial and team subscales. Items for these three subscales were identical in wording to the Shortell version, yet elicited numerous, high cross-scale correlations. In brief, differences in item wording cannot account for the emergence of the humanistic factor.
Although it is beyond the scope of the current paper, it is worth noting that Shortell and colleagues' instrument [10] itself adapted the wording of items relative to the Zammuto and Krakower instrument [20]. In most cases these changes were minimal. For example, the term "organization" replaced "institution" and "school," and instead of being a "mentor, sage or father/mother figure," managers were "warm and caring, and acted as mentors or guides." However, in some items, the change was more significant and some key terms did not carry over. The clearest example is item six of the rational subscale, which, in Shortell and colleagues' instrument, reads, "Managers in my facility are coordinators and coaches. They help employees meet the facility's goals and objectives." In the Zammuto and Krakower survey, the equivalent item reads, "The head of institution D is generally considered to be a producer, a technician, or a hard-driver." Thus, in the revised item, the manager is presented in a more supportive light, while the sense of the manager as a taskmaster is lost. So far as we know, these changes and their effects on instrument reliability and validity have not been the subject of any published work.
The third and final modification to the VHA instrument is the way the scales were scored. Both Shortell and colleagues' instrument [10] and the Zammuto and Krakower instrument [20], as well as most research in health services using the CVF [4,5,8,9,11], used ipsative scales. These require respondents to allocate 100 points among four statements, each reflecting one of the hypothesized culture types. The VHA instrument used 5-point Likert scales, or normative scales. Ipsative scales, by their nature, are correlated: respondents can rate one culture as stronger only by rating one or more of the others as weaker. This imposition of interdependence among subscales often inflates reliability statistics [38]. It also makes such data unsuitable for correlation-based statistical modeling, such as factor analysis and regression modeling [19]. Our use of data based on normative scales is therefore not a threat to internal validity, but it may help explain why we find lower reliability for the hierarchical subscale relative to past studies using ipsative scales. We note that although most studies have used ipsative scales, the validation by Quinn and Spreitzer [19] used two versions of the instrument, one with ipsative scales and one with normative (Likert) scales, and the subsequent validation by Kalliath and colleagues [15] also used normative scales. Thus, it is unlikely that our finding of poor divergent validity is primarily due to the normative scales.
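The interdependence that ipsative scoring imposes can be shown with a small simulation. This is a sketch under stated assumptions, not an analysis of VHA data: the raw scores are drawn independently at random, the conversion to a 100-point allocation by proportional rescaling is a simplification of how respondents actually allocate points, and all names are hypothetical.

```python
import random


def covariance(x, y):
    """Unbiased sample covariance of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)


def correlation(x, y):
    """Pearson correlation computed from sample covariances."""
    return covariance(x, y) / (covariance(x, x) * covariance(y, y)) ** 0.5


def to_ipsative(raw, points=100.0):
    """Rescale one respondent's scores so that they sum to a fixed total."""
    total = sum(raw)
    return [points * v / total for v in raw]


def columns(rows):
    """Transpose a respondents-by-scales matrix into per-scale lists."""
    return [list(col) for col in zip(*rows)]


def mean_offdiagonal_correlation(rows):
    """Average correlation over all distinct pairs of scales."""
    cols = columns(rows)
    k = len(cols)
    pairs = [(i, j) for i in range(k) for j in range(i + 1, k)]
    return sum(correlation(cols[i], cols[j]) for i, j in pairs) / len(pairs)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
# Assumption: four culture scores per respondent, independent by construction.
normative = [[rng.uniform(1, 5) for _ in range(4)] for _ in range(2000)]
ipsative = [to_ipsative(r) for r in normative]

normative_mean_r = mean_offdiagonal_correlation(normative)
ipsative_mean_r = mean_offdiagonal_correlation(ipsative)
```

Because every ipsative row sums to the same total, each scale's covariance with the sum of the other three equals the negative of its own variance. Scores that were unrelated before rescaling therefore become negatively correlated afterward, which is why such data are problematic for correlation-based methods.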
There are also potential threats to internal validity originating with the CVF instrument reported by Zammuto and Krakower [20] and carried over in Shortell and colleagues' instrument [10] that we briefly note. First, terms such as "bureaucratic" and "innovative" likely carry normative connotations for lay readers that may overwhelm the technical nuances they are intended to elicit. For example, organizational theorists often use "bureaucracy" in reference to Weber's three principles of the bureaucracy (e.g., fixed and official jurisdiction for roles within the organization) [39], whereas bureaucracy is a popular byword for pathological adherence to rules and the arbitrary exercise of administrative power. Second, most of the original CVF items consist of two declarative statements, often addressing clearly different aspects of culture, such as smooth-running operations and adherence to rules. Respondents may react to each statement differently but are obliged to respond with a single score. This introduces potential measurement error. Third and finally, items were intentionally organized across four organizational domains or content areas: institutional characteristics, institutional leader, institutional "glue" and institutional emphases (and Shortell and colleagues added a fifth: institutional rewards [10]). A major theoretical assumption of the CVF is that organizational culture pervades and unifies the organization across these different domains. Accordingly, there was one item per culture type for each domain. However, it may be that different cultures exist within each of these domains, or that the cultures operate differently in different domains. By using items across different domains to assess a single culture subscale, the instrument may have introduced measurement error.
Construct validity
There may be poor construct validity for three of the four CVF culture types. Factor analysis indicates that what have been previously described as entrepreneurial, rational and team cultures are accounted for by a single common factor. We find a simplified 12-item, two-factor model fits the data marginally better and more parsimoniously than the classic CVF. More importantly, the convergent/divergent properties of the two-factor solution are superior. This modified two-factor model may provide an alternative to the CVF subscales, and to that end we describe what we believe are the defining characteristics of each factor, which we dub prescriptive culture and humanistic culture.
The prescriptive culture subscale consists of three items, two from the hierarchical subscale (items five and nine) and one from the rational subscale (item ten); the latter cross-loaded on both subscales. In these items, managers are "rule-enforcers;" employees adhere to "formal rules and policies;" and the facility emphasizes "tasks and goal accomplishment." Thus, there is a strong subtext of extrinsic motivation, deriving from the formal policies of the organization and mediated by management. The object of this motivation is to accomplish the employees' tasks in the service of the facility's goals. In the two items from the hierarchical scale that do not load on prescriptive culture, the facility is a "formalized and structured place" where "bureaucratic procedures" govern (item two), and the touchstone is "permanence and stability" and "keeping things the same" (item 13). This suggests that a crucial difference may exist between this emergent factor and the original construct, with the latter including a sense that formal structure is in the service of stability. Nonetheless, three items provide a limited basis for reliably deriving or assessing a construct. Consequently, our outline of the prescriptive culture construct should be viewed as provisional.
Humanistic culture appears to encompass more conceptually diverse qualities, from "warm and caring" to "commitment to innovation and development" to "loyalty and tradition." Nonetheless, the ten items in the subscale share generally positive connotations. They all reflect qualities that one might characterize as human virtues, and which imply that individuals are intrinsically motivated. The organization works to engender loyalty and commitment to innovation, but these values ultimately derive from the individual employees and the survey items suggest impulsion rather than compulsion.
Item ten, originally of the rational subscale, loaded almost equally onto humanistic and prescriptive cultures, with the loading on prescriptive culture falling just short of the threshold of 0.40. Item 10 states, "The glue that holds my facility together is the emphasis on tasks and goal accomplishment. A production orientation is commonly shared." Conceptually, the reference to tasks and goal accomplishment may map more closely to prescriptive culture. The reason for the cross-loading may be lay readers' confusion over the term "production orientation." Had the item referred only to tasks and goal accomplishment, it may have correlated more highly with the prescriptive culture subscale.
The moderately strong, positive correlation between humanistic and prescriptive cultures suggests that VHA employees do not see cultures of intrinsic and extrinsic motivation as mutually exclusive. In fact, this supports a central contention among some proponents of the original CVF model, namely that the same organization may simultaneously exhibit qualities of fundamentally competing value systems, and that the "best" organizational culture may be one of equilibrium [17]. We find a timely example of this in a recent study of top-ranked hospitals for acute cardiac care, which simultaneously exhibited a high degree of flexibility (for example, in applying clinical protocols) and a high degree of rigidity (for example, in selecting and pursuing specific performance targets) [40]. Shortell and colleagues also found culture balance related to the number and depth of changes made by teams in chronic care settings [9], and this finding is consistent with Kalliath and colleagues' observed positive correlation between hierarchical and entrepreneurial cultures [15].
Although we chose new labels, the two factors strongly resemble past management theories including Burns and Stalker's "mechanistic" and "organic" organizations [41], and McGregor's "Theory X" and "Theory Y" of management [42]. Mechanistic organizations are said to be characterized by a clear understanding among employees of their performance obligations and what they can expect from the organization in return, clear policies regarding behavior, and an emphasis on chain of command. Organic organizations are characterized by an ethic of diffuse responsibility and decision making such that each employee is expected to do whatever is necessary to get the job done at the time; they rely on shared values and goals to govern behavior rather than specific and extensive rules and instructions. Theory X holds that employees primarily desire stability and security, and require supervision to be productive. Theory Y holds that employees who share the organization's goals will be intrinsically motivated to do their best and will actively seek responsibilities. Humanistic and prescriptive cultures may be iterations of these constructs. In fact, a wry article in the lay press recently proclaimed that virtually all management theory boils down to some version of a dualistic "humanistic" versus "mechanistic" view of organizations [43].
Directions for future research
Our study raises questions about the validity of a popular instrument based on the CVF when applied to a sample of non-managers. We identify and describe several possible explanations for our findings. We also describe a two-factor scale solution that emerged as an alternative to the conventional four-factor scale. We dub these two factors humanistic and prescriptive. However, our study is not the final word on the CVF, nor is it a sufficient basis to conclude that the two-factor solution is a valid or meaningful alternative. Significant additional research is needed.
The first need is for additional analysis of the differences in perception of organizational culture among managers and non-managers. In a measurement equivalence/invariance analysis preliminary to the results reported here, we found that item response varied among supervisory groups. Further analyses of these differences are needed, as well as analysis of how item response varies within organizations over time.
Second, further research is needed on the psychometrics of particular items. We describe potential issues with item wording and structure, derived both from the original CVF instrument and from changes made to the VHA survey, that may account for some of our findings, notably the poor reliability of the hierarchical subscale. There is also a need to explore how experiences with different parts of an organization may influence respondents. For example, perhaps employees perceive different cultures in different workgroups or departments: a physician might perceive their internal medicine service as relatively supportive and entrepreneurial, but their human resources department as relatively rule-bound and bureaucratic. If so, it is not clear how respondents answer questions that ask about the organization as a whole.
Third and finally, additional research is needed on the emergent two-factor solution both to determine if it is observed in other settings, and whether it is associated with theoretically relevant organizational processes or outcomes, such as performance measures, in order to establish criterion validity.