Delphi as a method to establish consensus for diagnostic criteria

doi:10.1016/S0895-4356(03)00211-7

Journal of Clinical Epidemiology

Volume 56, Issue 12, December 2003, Pages 1150-1156

Abstract

Background and objectives

To achieve consensus among a panel of experts on the best clinical criteria for the diagnosis of carpal tunnel syndrome (CTS).

Method

Experts rated the diagnostic importance of items from the clinical history and physical examination for CTS. The ratings were expressed on a 10-cm visual analog scale. The average and standard deviation of the scores for each item were returned to the panelists. The panel members evaluated the items a second time with knowledge of the group responses from the first round. The scores were standardized to minimize scaling variations and, after the second round, the items were ranked in order of importance assigned by the group. Cronbach's α was used as a measure of homogeneity for the rankings. Increasing homogeneity was considered to be an indication of consensus among the panelists.
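The standardization and homogeneity steps described above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: it assumes that each panelist's ratings are converted to z-scores within that panelist, and that Cronbach's α is computed with panelists in the role of "items" and the rated clinical items in the role of "subjects". The function names and the example ratings are hypothetical.

```python
from statistics import mean, pstdev, variance


def standardize(scores):
    """Convert one panelist's ratings to z-scores (mean 0, SD 1).

    Assumption: scaling differences are removed per panelist.
    """
    m, s = mean(scores), pstdev(scores)
    return [(x - m) / s for x in scores]


def cronbach_alpha(ratings):
    """Cronbach's alpha for ratings[p][i] = score of panelist p on item i.

    alpha = k / (k - 1) * (1 - sum of panelist variances / variance of totals),
    where k is the number of panelists.
    """
    k = len(ratings)
    panelist_vars = [variance(row) for row in ratings]
    totals = [sum(col) for col in zip(*ratings)]  # summed score per item
    return k / (k - 1) * (1 - sum(panelist_vars) / variance(totals))


# Hypothetical 10-cm visual analog scale ratings: 3 panelists, 5 items.
round_one = [
    [8.0, 6.5, 2.0, 9.0, 4.0],
    [7.0, 7.0, 3.0, 8.5, 5.0],
    [9.0, 5.0, 1.0, 9.5, 3.5],
]
standardized = [standardize(row) for row in round_one]
print(round(cronbach_alpha(standardized), 3))
```

Running the same calculation on the ratings from each round gives one α per round; under the definition used in the study, an increase between rounds is read as movement toward consensus.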

Results

Cronbach's α increased from 0.86 after the first round to 0.91 after the second iteration. Panelists who were relative outliers on the first round demonstrated a much higher correlation with the entire group after the second round.

Conclusions

Delphi is an effective method of establishing consensus for certain clinical questions. Cronbach's α was a useful statistic for measuring the extent of consensus among the panel members. Delphi was chosen from the possible methods of group process because of its inherent feasibility. Because the panelists did not need to meet in person, there was no constraint on their geographic location. In addition, the anonymous nature of Delphi was thought to be a key factor in avoiding a result that might be skewed by one or more persuasive panelists. Both of these characteristics were felt to be particularly important to the topic on which consensus was sought, the clinical diagnostic criteria for CTS. The movement in the opinions of some panelists toward those of the group appeared to result from the feedback of information describing the group opinion.

Introduction

Delphi is a well-recognized group process in the social sciences [1], [2], [3], [4], [5], [6], [7], [8]. Although prior studies have used this method to establish appropriateness criteria for treatment [9], [10], [11], [12], [13], [14], [15], [16], it has received less attention as a tool to establish consensus among health care professionals on diagnosis.

Delphi is a completely anonymous process in which the participants never meet. Delphi otherwise resembles the nominal group in structure. Ideas are expressed to the participants in the context of a mailed questionnaire. Responses to the items in the questionnaire are collated and analyzed. Items may be dropped or added in a second round in which the group responses to the first round are reported to the participants. New responses to the items are recorded and repeated iterations of the process carried out until a consensus appears to have been reached. The determination that a consensus has been achieved requires an operational definition that is appropriate to the issue under consideration.

Delphi is particularly attractive for the task of achieving consensus, especially among health care professionals. First, the absence of an obligation to meet in person greatly improves the feasibility of Delphi and lowers the cost significantly. Second, and perhaps more importantly, there are less likely to be constraints on either the size or composition of the group. Participants may be recruited from diverse geographic locations and clinical backgrounds. Third, the reliability of group consensus for the issue being examined improves as the number of panelists is increased [2]. A panel size appropriate to the issue under consideration is easier to achieve because of the inherent feasibility of the Delphi process. Finally, the anonymous nature of the exercise ensures that a single influential participant will not have a disproportionate impact on the outcome of the group as can occur with other group processes.

The Delphi process has been criticized as being subject to bias because the investigator limits the scope of the issue evaluated by the panelists. Thus, as the breadth of the issue under consideration is at least partially controlled by the investigator, any consensus that emerges may be somewhat distorted [5]. The Delphi method has also been criticized for the fact that the panelists never meet. Other group processes depend on the interaction between the participants as a source of novel insight into an issue. Due to the nature of Delphi, no discussion takes place, and any consensus that the group appears to have developed can only derive from information provided to it by the investigator. Where there is discussion among panelists, as in other types of group process, the consensus reached may be significantly different from that expected prior to conducting the group process [4]. Finally, criteria for determining that group consensus has been achieved have not been established.

Carpal tunnel syndrome (CTS) is a diagnosis commonly made in industrialized societies. The prevalence of CTS reportedly varies with the geographic location of the population under study [17], [18], [19]. There also are variations in the reported prevalence of the condition between different industries [20], [21], [22], [23], [24], [25], [26], [27]. Potential explanations for these variations include intrinsic differences among the populations, different exposures, or variations in the diagnostic criteria for identifying CTS. One of the more likely explanations is that the case definitions for this condition differ among reports. These definitions vary in their nature, emphasis, and stringency.

The clinical evaluation remains an important aspect of the diagnostic process for CTS. Electrodiagnostic studies are often considered to be a gold standard diagnostic test for CTS [28], [29], [30]. In other words, electrodiagnostic tests are frequently taken to represent a demonstration of the essential lesion for the clinical condition called CTS. There is no consensus on this issue, and the assumption that these tests represent the essential lesion in CTS may be flawed for the following reasons. First, electrodiagnostic testing does not have perfect sensitivity or specificity, and the tests may be normal despite clinically significant nerve compression [31], [32]. Second, the standard interpretation of electrodiagnostic data assumes a normal distribution of nerve conduction velocities and arbitrarily designates velocities more than two standard deviations from the mean as abnormal. This results in the misclassification of some asymptomatic individuals as being affected by CTS. In addition, the literature reports that the assumption of a normal distribution of nerve conduction velocity may not be reasonable [33]. Third, the cut point for defining an abnormality of nerve conduction velocity varies in the literature [30], [34], [35]. There is no established consensus on the electrical evidence for CTS. The specificity and sensitivity of sensory nerve conduction measurements are affected by the threshold defining CTS. This would not be expected of a gold standard test in the usual sense of the term.
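The consequence of a two-standard-deviation cutoff can be made concrete with a short calculation. The sketch below assumes, purely for illustration, that conduction velocities in unaffected individuals are exactly normally distributed (an assumption the text itself questions); the function name is hypothetical.

```python
from statistics import NormalDist


def fraction_beyond(k_sd, two_sided=True):
    """Fraction of a normal population lying more than k_sd standard
    deviations from the mean (both tails, or the lower tail only)."""
    tail = NormalDist().cdf(-k_sd)  # area in one tail of the standard normal
    return 2 * tail if two_sided else tail


# Share of a healthy, normally distributed population flagged by a 2-SD rule.
print(f"both tails: {fraction_beyond(2):.4f}")
print(f"slow tail only: {fraction_beyond(2, two_sided=False):.4f}")
```

Under this assumption roughly 2% of unaffected individuals fall in the slow tail alone, so some false positives are built into the definition of "abnormal" regardless of symptoms.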

The use of electrodiagnostic tests is not universal among experts who treat CTS [36]. Like any diagnostic test, electrodiagnostic evaluations should be interpreted within a clinical context. Ideally, they should be used in a Bayesian manner to modify the pretest probability of CTS established clinically. Thus, there is a need to standardize the clinical criteria for the diagnosis of CTS.
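Bayesian use of a test result can be illustrated with the standard likelihood-ratio form of Bayes' theorem: posttest odds equal pretest odds multiplied by the likelihood ratio. The sketch below is a generic illustration; the sensitivity, specificity, and pretest probability are hypothetical values, not figures from this study.

```python
def posttest_probability(pretest, sensitivity, specificity, test_positive):
    """Update a clinically established pretest probability with a test result.

    Uses posttest odds = pretest odds * likelihood ratio, where
    LR+ = sensitivity / (1 - specificity) for a positive result and
    LR- = (1 - sensitivity) / specificity for a negative result.
    """
    if test_positive:
        likelihood_ratio = sensitivity / (1 - specificity)
    else:
        likelihood_ratio = (1 - sensitivity) / specificity
    posttest_odds = pretest / (1 - pretest) * likelihood_ratio
    return posttest_odds / (1 + posttest_odds)


# Hypothetical values: pretest probability 0.5, sensitivity 0.8, specificity 0.9.
print(posttest_probability(0.5, 0.8, 0.9, test_positive=True))
print(posttest_probability(0.5, 0.8, 0.9, test_positive=False))
```

The calculation shows why the pretest probability matters: the same test result leads to different posttest probabilities depending on the clinical estimate it starts from, which is only reproducible if the clinical criteria are standardized.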

Clinical experience at our center has indicated that CTS is a condition that is diagnosed and treated by a broad spectrum of specialist and primary care clinicians. These diagnosticians bring varying experiences to the task of diagnosing CTS, experiences that have been obtained within an intellectual framework or paradigm specific to a particular clinical specialty. These unique clinical experiences could form the basis for diagnostic criteria that are not uniform among diagnosticians from different training backgrounds. The absence of uniform diagnostic criteria makes the study of potentially important factors, like industrial exposures, difficult or even impossible. Thus, CTS is a common clinical condition diagnosed using widely varying clinical criteria.

Diagnostic criteria for any condition must be both valid and clinically sensible. Establishing a consensus among clinical experts on what the criteria should comprise does not ensure validity. As clinical experience evolves, the opinions of experts may also change, together with their diagnostic practices. The development of methods for seeking consensus must consider this and be flexible so that the criteria can be re-examined and revised at intervals. Obtaining agreement among clinical opinion makers should be seen as a starting point for establishing criteria that are likely to have significant clinical sensibility and that can be tested to evaluate validity. The key issue is the development of a consensus so that a gold standard diagnostic criterion can be established.

The objective of this study was to determine whether the Delphi method could be used to achieve consensus among influential expert clinicians representing all of the involved clinical disciplines for diagnostic criteria for CTS. An additional goal was to measure agreement within the panel using Cronbach's α as a measure of the internal consistency of the group.

Section snippets

Selection of the panelists

Panelists were recruited from clinical disciplines involved in the diagnosis and treatment of CTS including neurology, neurosurgery, rheumatology, occupational health, plastic surgery, and orthopedic surgery. We attempted to identify experts defined in at least one of two ways. First, some panelists were leaders in their clinical fields as evidenced by their roles as opinion makers within national organizations such as the American Society for Surgery of the Hand and the American Society for

Results

Cronbach's α for the first round of the Delphi process was 0.86. The individual panelist-group correlation ranged between 0.23 and 0.73 (Table 2). This suggested that some panelists were relative outliers. Two of the three panelists with the lowest correlation with the entire group were from the same clinical specialty, rheumatology. The three panelists with the highest correlations were orthopedic hand surgeons.
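One way a panelist-group correlation of this kind could be computed is sketched below. The snippet does not state how the correlation was calculated, so this is an assumption: each panelist's scores are correlated (Pearson) with the mean scores of the remaining panelists. The function names and ratings are hypothetical.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def panelist_group_correlations(ratings):
    """For each panelist, correlate their item scores with the mean item
    scores of all other panelists (assumed definition, see lead-in)."""
    correlations = []
    for p, row in enumerate(ratings):
        others = [r for q, r in enumerate(ratings) if q != p]
        group_mean = [mean(col) for col in zip(*others)]
        correlations.append(pearson(row, group_mean))
    return correlations


# Hypothetical ratings: the third panelist is a relative outlier.
ratings = [
    [8.0, 6.5, 2.0, 9.0, 4.0],
    [7.0, 7.0, 3.0, 8.5, 5.0],
    [2.0, 9.0, 8.0, 3.0, 6.0],
]
for p, r in enumerate(panelist_group_correlations(ratings), start=1):
    print(f"panelist {p}: r = {r:.2f}")
```

A panelist with a low or negative value stands apart from the rest of the panel; comparing these values between rounds shows whether outliers move toward the group after feedback.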

The items were ranked in descending order according to the average score assigned by the

Discussion

The concept of consensus within a group is easily understood, but the best way to measure this phenomenon is unclear. Furthermore, criteria that indicate a consensus has been achieved will vary with the setting in which agreement is sought and the method being utilized.

Consensus within the group in general should be reflected in decreases in the variance of the responses. Consensus among groups has often been quantitated using group means and standard deviations [16]. In our study, the standard

Acknowledgements

Dr. Wright is supported as the R.B. Salter Chair of Surgical Research and as an Investigator of the Canadian Institute for Health Research. This work was supported by Physicians' Services Incorporated Foundation Grant 97-52.

References (40)

J. Pill. The Delphi method: substance, context, a critique and an annotated bibliography. Socio-Econ Plan Sci (1971)
M.E. Matthews et al. Profiles of the future for administrative dietitians via the Delphi technique. J Am Diet Assoc (1975)
M.C.T.F. de Krom et al. Carpal tunnel syndrome: prevalence in the general population. J Clin Epidemiol (1992)
G. Dieck et al. An epidemiologic study of the carpal tunnel syndrome in an adult female population. Prev Med (1985)
M.L. Bleecker. Medical surveillance for carpal tunnel syndrome in workers. J Hand Surg [Am] (1987)
R.M. Braun et al. Electrical studies as a prognostic factor in the surgical treatment of carpal tunnel syndrome. J Hand Surg [Am] (1994)
A. Grundberg. Carpal tunnel decompression in spite of normal electromyography. J Hand Surg (1983)
D.A. Jackson et al. Electrodiagnosis of mild carpal tunnel syndrome. Arch Phys Med Rehabil (1989)
K.H. Duncan et al. Treatment of carpal tunnel syndrome by members of the American Society for Surgery of the Hand: results of a questionnaire. J Hand Surg (1987)
G. Bravo et al. Estimating the reliability of continuous measures with Cronbach's alpha or the intraclass correlation coefficient: toward the integration of two traditions. J Clin Epidemiol (1991)
N.C. Dalkey et al. An experimental application of the Delphi method to the use of experts. Manage Sci (1963)
N.C. Dalkey. The Delphi method: an experimental study of group opinion (1969)
N. Dalkey et al. The Delphi Method III. Use of self-ratings to improve group estimates (1969)
A. Fink et al. Consensus methods: characteristics and guidelines for use. Am J Public Health (1984)
F.J. Romm et al. Developing criteria for quality of assessment: effect of the Delphi technique. Health Serv Res (1979)
A.D. Weinberg et al. The Delphi technique as a method for determining the continuing education needs of physicians regarding coronary artery disease. J Assoc Hosp Med Educ (1977)
J. Zinn et al. The use of the Delphi panel for consensus development on indicators of laboratory performance. Clin Lab Manage Rev (1999)
M.R. Couper. The Delphi technique: characteristics and sequence model. ANS Adv Nurs Sci (1984)
R.P. Gale et al. Delphi-panel analysis of appropriateness of high-dose chemotherapy and blood cell or bone marrow autotransplants in diffuse large-cell lymphoma. Leuk Lymphoma (1998)
B.J. Hillman et al. Improving diagnostic accuracy: a comparison of interactive and Delphi consultations. Invest Radiol (1977)

