Background
Clinical Practice Guidelines (CPG) are among the tools most frequently used by health professionals to improve decision-making at the micro level. As defined by the Institute of Medicine (IOM), they are "systematically developed statements to assist practitioner and patient decisions about appropriate health care for specific clinical circumstances" [1]. Guidelines may offer concise instructions on which diagnostic or screening tests need to be ordered, how to provide medical or surgical services, how long patients should stay in hospital, or other details of clinical practice [2]. The ultimate purpose of developing and using guidelines is to improve the quality of care provided, particularly in areas of clinical uncertainty.
In recent years, a sustained increase in guideline production has been observed all over the world, especially in the United States, Canada, Australia, New Zealand and European countries. Most of these countries have developed national programs for CPG production, dissemination and implementation in order to increase the effectiveness and quality of the health system [3].
Some of these initiatives, which were originally conceived as individual efforts, have been greatly strengthened by international cooperation through organizations such as the Guidelines International Network (GIN) [4] and the Appraisal of Guidelines for Research and Evaluation (AGREE) Research Trust [5]. The need to harmonize and systematize guideline development and assessment was one of the most important reasons for establishing these international organizations [6].
A "good quality guideline" is one that ultimately leads to improved patient outcomes. However, guideline quality is measured indirectly, by assessing the degree to which guideline producers minimized potential biases that could occur in the development process and affect the validity of the recommendations [7]. Wrong recommendations undermine health professionals' confidence in guidelines and, consequently, limit their adoption [2].
In 1999, Shaneyfelt et al. assessed the quality of CPG published in Medline between 1985 and 1997 by using a systematically developed instrument. The majority of the 279 guidelines assessed did not meet the pre-established methodological standards, and the rigour of recommendations was one of the most deficiently reported aspects [8]. Similar results were reported by Cluzeau et al. [9], Grilli et al. [10] and Graham et al. [11] in 1999, 2000 and 2001, respectively. In 2003, the AGREE collaboration (currently the AGREE Research Trust) published the results of the first international project aimed at developing and validating a generic instrument for guideline assessment [7]. This instrument has been translated into different languages, extending its use throughout the world. In recent years, several studies using the AGREE instrument have shown methodological deficiencies in guideline development [12-14].
Almost all research on the quality assessment of CPG has been undertaken in High Income Countries (HIC). Studies on the quality of guidelines produced and diffused in Low and Middle Income Countries (LMIC), and particularly in Latin America, are scarce [15]. In Argentina, although many different institutions are interested in CPG development, there is no information about the number of guidelines produced or, moreover, about the quality of these documents. The purpose of this research is to describe trends in guideline production in Argentina and to assess the quality of these guidelines by using the AGREE instrument.
Methods
A cross-sectional study was undertaken to describe guideline production in Argentina between 1994 and 2004. Documents were considered CPG if: 1) they included explicit recommendations aimed at health professionals' decision-making in relation to screening and primary prevention, diagnosis, treatment, secondary prevention and/or rehabilitation; 2) they contained bibliographic references and, in the case of consensus documents, described the participants or responsible institutions; 3) they were produced and diffused during the study period (January 1994 to December 2004) and could be freely accessed. The exclusion criteria were: 1) guidelines targeted to patients (patients' guidelines) and/or exclusively oriented to health services organization rather than to clinical decision-making; 2) guidelines for which it was not possible to determine whether a systematic process had been applied in their development (i.e. documents that lacked an explanation of the guideline development methodology used; documents diffused as brief reports that contained only a set of recommendations; documents referred to as guidelines but written by a single author without any reference to the methodology applied); 3) guidelines whose year of development was not stated and could not be established; 4) guidelines that were not produced by an Argentine institution (adapted guidelines were included only when the adaptation process was explicitly explained).
Electronic database searching (EDS) was executed by an expert using the strategy described in Table 1. The strategy was initially developed for MEDLINE on the PubMed platform and was afterwards adapted for use in regional databases (LILACS). All retrieved articles were assessed by the principal investigator. An Internet search (IS) was also performed to identify CPG posted on Websites. Institutions were subsequently classified according to the level of the health system to which they belong (macro, meso and micro level). Table 2 shows the definition used to describe each level.
Table 1
Description of the searching strategy employed in Medline
1 | ("guideline" [Publication Type] OR "guidelines" [MeSH Terms] OR "guidelines" [Text Word]) |
2 | ("consensus" [MeSH Terms] OR consensus [Text Word]) |
3 | algoryth* |
4 | #1 OR #2 OR #3 |
5 | argentin* |
6 | #4 AND #5 |
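The way the numbered lines of Table 1 combine into a single search expression can be sketched as follows. This is only an illustration: the function name `build_query` and the dictionary layout are hypothetical, the term text is copied verbatim from the table, and the string is not guaranteed to match what was actually submitted to PubMed.

```python
# Search lines as listed in Table 1 (term text copied verbatim from the table).
# Lines 4 and 6 are combinations of the others, so they are built, not stored.
SEARCH_LINES = {
    1: '("guideline" [Publication Type] OR "guidelines" [MeSH Terms] '
       'OR "guidelines" [Text Word])',
    2: '("consensus" [MeSH Terms] OR consensus [Text Word])',
    3: 'algoryth*',
    5: 'argentin*',
}


def build_query(lines):
    """Combine the search lines as in Table 1: (#1 OR #2 OR #3) AND #5."""
    topic = " OR ".join(lines[n] for n in (1, 2, 3))  # line 4 of Table 1
    return f"({topic}) AND {lines[5]}"                # line 6 of Table 1


if __name__ == "__main__":
    print(build_query(SEARCH_LINES))
```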
Table 2
Internet searching: institutions included at each level of Health System
Macro Level
| Bodies of the national, provincial and municipal State in charge of health policy formulation, execution and control. It includes: a) National State: Ministry of Health, including all its decentralized departments, secretaries' offices and their dependent organizations that have health promotion, prevention and care as one of their specific goals; b) Provincial State: Health ministries or offices of the provincial government; c) Municipal State: Health offices of cities that are provincial capitals or have more than 250,000 inhabitants. |
| Websites of Health Technology Assessment agencies were also included at this level. |
Meso Level
| Intermediate institutions of the public, private and social security sectors that provide or manage health services. It includes: individual providers, organizations of providers and health insurance institutions. |
Micro Level
| It is theoretically constituted by individual health professionals. In practice, scientific or professional associations were selected. Only national organizations were included. |
Guideline quality assessment was performed with the AGREE instrument. This instrument was selected because it is the only one that covers practically all the relevant dimensions of the guideline development process; it has been internationally validated; and it has fewer items and uses a numerical scale that facilitates the analysis and comparison of results [7, 16, 17]. The instrument has been widely used all over the world, mainly as a result of its translation into many other languages, including Spanish. The Spanish version had already been tested in Spain, where it proved to be reliable and feasible to apply [13].
A total of 30 health professionals distributed throughout the country were invited to participate in the assessment phase. To be considered eligible, professionals had to meet at least one of the following criteria: a) previous clinical epidemiology background; b) knowledge of guideline development; c) experience with the AGREE instrument. Professionals who accepted the invitation and fulfilled the eligibility criteria were trained in the use of the AGREE instrument. A 45-day e-learning program was developed in three stages: I. Self-reading of the tool-kit (15 days): all participants were provided with the Spanish version of the AGREE instrument and the Spanish and English versions of the Training Manual. II. Pilot CPG critical appraisal (15 days): one CPG was assessed independently by all professionals. A data collection form designed as an Excel sheet, accompanied by a user guide, was sent to each participant. Assessment results were returned to the research team by e-mail. Results were compared and divergences were discussed with each appraiser through individual feedback. III. Adjustment phase (15 days): during this last stage, each participant could raise unresolved doubts for discussion with the research team. Only those professionals who successfully completed this three-stage training were formally accepted as appraisers (n = 23). No one received any honorarium.
Whenever possible, guidelines were assigned taking into account the expertise and specialty of each appraiser. The median number of guidelines assessed by each appraiser was 8. Following the AGREE collaboration, the domain scores of each CPG were considered individually: scores of individual items in each domain were summed and standardized as a percentage of the maximum possible score for that domain, taking into account the number of appraisers. The relation between quality domain scores and other variables (year of production, level of the health system, guideline publication and category) was assessed through bivariate analysis. As the distribution of the dependent variable was generally asymmetrical, non-parametric tests were used. The Kruskal-Wallis test was used to test the statistical significance of the difference when the grouping variable had more than 2 categories. When there were only 2 categories, the Mann-Whitney test was applied.
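The standardization of a domain score can be sketched as below. This is a minimal illustration, not the study's own spreadsheet: it assumes the 1 to 4 item scale of the original AGREE instrument and the (obtained − minimum) / (maximum − minimum) standardization from the AGREE manual, whereas the text above only states that scores were expressed as a percentage of the maximum possible score. The function name and the example ratings are hypothetical.

```python
def standardized_domain_score(ratings, scale_min=1, scale_max=4):
    """Standardized score (0-100) for one AGREE domain of one guideline.

    ratings: one list per appraiser, each holding that appraiser's item
             scores for the domain (assumed 1-4 scale, an assumption here).
    """
    n_appraisers = len(ratings)
    n_items = len(ratings[0])
    if any(len(r) != n_items for r in ratings):
        raise ValueError("every appraiser must score the same items")

    obtained = sum(sum(r) for r in ratings)
    # Minimum and maximum possible totals depend on the number of appraisers.
    min_possible = scale_min * n_items * n_appraisers
    max_possible = scale_max * n_items * n_appraisers
    return 100.0 * (obtained - min_possible) / (max_possible - min_possible)


if __name__ == "__main__":
    # Hypothetical example: two appraisers scoring a three-item domain.
    example = [[3, 2, 4], [2, 2, 3]]
    print(round(standardized_domain_score(example), 1))
```

Subtracting the minimum possible total makes a guideline rated 1 on every item score 0% rather than 25%, which keeps domains with different numbers of items comparable.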
The internal consistency of each domain was evaluated using Cronbach's alpha. The reliability between appraisers was determined for each question and each domain of the AGREE. Intraclass correlation coefficients (ICC) were calculated within each pair of appraisers and across the pool of appraisers. ICCs and kappa values above 0.75 were considered to represent good, 0.40–0.75 moderate and <0.40 poor reliability.
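As a rough illustration of the internal consistency measure and of the reliability bands stated above, the sketch below computes Cronbach's alpha from a guidelines-by-items score matrix and labels a coefficient as good, moderate or poor. The function names and the data layout are hypothetical, and the ICC itself is not reproduced here because the paper does not state which ICC model was used.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for one domain.

    scores: one row per rated guideline, one column per item of the domain.
    Uses alpha = k/(k-1) * (1 - sum(item variances) / variance(row totals)).
    Undefined (division by zero) if all row totals are identical.
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def reliability_band(coefficient):
    """Label a coefficient with the cut-offs given in the text."""
    if coefficient > 0.75:
        return "good"
    if coefficient >= 0.40:
        return "moderate"
    return "poor"


if __name__ == "__main__":
    # Hypothetical two-item domain rated for three guidelines.
    demo = [[1, 2], [2, 3], [3, 4]]
    alpha = cronbach_alpha(demo)
    print(alpha, reliability_band(alpha))
```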
Feasibility of the instrument was assessed through an ad hoc questionnaire with two dimensions, usefulness and simplicity; both dimensions were rated on a 1–5 scale, with 5 being the highest score.
Discussion
In recent years, the development of guidelines in Argentina has progressively increased; however, quality did not improve. This situation can be summed up in the phrase of Sudlow and Thomson: "Quantity but no quality" [18].
Similar results were reported by some studies performed in HIC between 1999 and 2005 [8-14, 19]. In many cases, these findings may have helped to prompt the establishment of national guideline programs aimed at systematizing the guideline development process. However, even though important improvements in guideline production have been achieved in HIC compared with LMIC, the issue of "pluralism and low quality" still raises serious concerns in HIC. A recent report from the Institute of Medicine (USA) questioned the validity and reliability of many guidelines produced in that country because of the lack of rigour, objectivity and transparency of the development methodology applied [20].
The quality of guidelines produced in LMIC, and particularly in Latin America, is practically unknown. To our knowledge, only one CPG quality assessment precedes the Argentine research, and it was performed in Brazil. In that study, twenty-eight guidelines developed by the Brazilian Medical Association were independently assessed by 2 appraisers using the AGREE instrument [15]. The present study is therefore the second guideline appraisal study in a Latin American country, and it was conducted on a larger scale than the Brazilian one.
Quality of the assessed Argentine guidelines was far from ideal: scores were low and very low in all domains and items of the AGREE instrument. Many factors might have contributed to this situation.
First, low quality could have been the result of the absence of an explicit policy for guideline production and evaluation during the period under assessment. The Argentine health system is highly complex, and integrating the activities of multiple stakeholders is difficult to achieve without clear guidance. Although many institutions at the three levels of the health system participated in this process, a more integrated approach is required in order to balance the interests, preferences and knowledge of the different stakeholders, whose participation in the guideline development process is not only required but needs to be guaranteed.
Second, the low quality scores of Argentine guidelines could be explained by a slower penetration and consolidation of the evidence-based movement in LMIC in comparison with developed countries. As described by Burgers, the development of guidelines in Europe, Australia and North America started in the 1980s and 1990s [3]. In the United States, the Consensus Development Program at the National Institutes of Health developed its first guideline in 1977. Over the last 30 years, all these organizations have accumulated vast experience in guideline development, dissemination and implementation. Currently, the principles of evidence-based medicine dominate almost all of these national guideline programs. The creation of international networks, like the GIN, as well as the conduct of projects like AGREE, have clearly contributed to the improvement and standardization of these processes in the participating countries. In contrast, LMIC, with few exceptions, did not take part in these experiences. Diffusion and dissemination of appropriate methods for evidence-based guideline development are limited in these countries. According to the results of this study, in Argentina, as late as 2004, this process was not systematized and still relied heavily on expert opinion.
Thirdly, the standards proposed by the AGREE instrument could be relatively high for the context of an LMIC, especially considering that, except for the last two years (2003 and 2004), the period over which Argentine guideline production was described preceded the year in which this instrument was diffused (2003). In some LMIC, language barriers and limited access to up-to-date biomedical literature can hinder the use of relevant and important evidence to support guideline recommendations. In this sense, an invaluable resource for Argentine guideline developers is the Cochrane Library Plus, which can be freely accessed and contains the Spanish version of systematic reviews produced by the Cochrane Collaboration [21]. Although there is currently broad agreement on the need for systematic reviews to inform recommendations, this type of evidence was rarely cited in Argentine guidelines [22]. Therefore, networking activities between guideline producers and Argentine Cochrane Centers should also be promoted. As reported by Varonen et al., this kind of cooperation proved to be very positive in many respects [23].
Another factor that could have influenced the quality of Argentine guidelines is the lack of financial and human resources devoted to guideline production. Since the cost of producing evidence-based guidelines is high relative to the health budgets of LMIC, a systematic methodology for adapting international guidelines would be an efficient way of improving not only the quantity of guidelines but also their quality [24]. Internationally developed guidelines can be adapted to the national context, representing a considerable saving of money. However, an explicit and systematic adaptation process should be followed, as guidelines' applicability and transferability can be strongly influenced by different factors, e.g. population needs (prevalence of disease, baseline risk status), setting (availability of resources) and other factors that modify the translation of recommendations into practice [25, 26]. In 2006, the National Academy of Medicine (NAM), in collaboration with the National Ministry of Health, developed and validated an adaptation process in order to increase the quality of guidelines produced in the country. Currently, NAM runs a virtual learning course with the purpose of improving national and local capacities in guideline adaptation [27].
Finally, the findings of this assessment highlight the need to improve the reporting of the editorial independence of guideline producers. Practically no Argentine guideline reported conflicts of interest or funding sources. Lack of transparency was also reported by Papanikolaou et al. in an evaluation of 191 published guidelines: only 7 (3.7%) disclosed potential conflicts of interest [28]. In the case of Argentine guidelines, the omission could have been unintentional or, on the contrary, intentional (financial ties might have existed in some situations and been deliberately hidden by guideline authors). However, regardless of the developers' intent, explicit declaration of conflicts of interest at the beginning of the process is strongly recommended by most international organizations as a way of reducing the probability of biased recommendations and increasing guidelines' credibility [29].
Some methodological issues must be addressed. First of all, the evaluation was restricted to guidelines that were diffused and identified on Websites or in journals. Diffusion is not the same as development, as there might have been guidelines produced and used in health institutions that could not be identified by the search strategies applied in this research. In spite of this limitation, the study focused on guidelines diffused by well-known and reputable institutions in Argentina, which have a high probability of being adopted by healthcare professionals. Secondly, Internet searching was not exhaustive at the meso level, specifically in hospitals: as few of these institutions have Websites, only 3 out of 10 eligible hospitals could be assessed. Thirdly, although the AGREE collaboration strongly suggests 4 appraisers per CPG, this could not be achieved because of a lack of resources. All researchers and appraisers worked ad honorem. However, in spite of the inclusion of only two appraisers per guideline, reliability scores were acceptable. Fourthly, only CPG documents were considered for the assessment. Finally, there are some limitations inherent to the instrument applied. Guideline quality is assessed on the basis of what is reported: quality of reporting is not the same as quality of the development process. As in other quality assessment studies, no content analysis of the recommendations was performed [30].
To our knowledge, this is the first time a study of this kind has been undertaken in Argentina and, the Brazilian research excepted, in Latin America. Its execution was the first step in building a network of professionals interested in improving CPG development, dissemination and implementation in the country. Its findings might be very useful in setting up a national evidence-based guideline development program.
Conclusion
This study was one of the first to systematically employ the AGREE instrument for the critical assessment of guidelines produced in an LMIC. The AGREE instrument can serve as a model to identify improvement opportunities in the guideline development process of these countries. In this sense, this research shows the low quality of the guidelines produced and points out the areas towards which training initiatives should be oriented.
Guideline development and diffusion in Argentina from 1994 to 2004 showed a constant increase, although quality of reporting did not improve; moreover, in some aspects it seemed to decline. Institutional involvement in this process was dispersed, rarely integrated, and not systematized. A national debate among the main stakeholders is urgently needed in order to contribute to the definition of a clear and explicit policy for CPG development, dissemination and implementation in the country.
Competing interests
The authors declare that they have no competing interests.
Authors' contributions
MEE conceived the study, designed the protocol and coordinated the research, performed the Internet search, registered pCPG in the database, selected the CPG from the database, was in charge of appraiser training, appraised guidelines, performed the statistical analysis, interpreted the data and drafted the manuscript. ZO participated in the design of the protocol and the coordination of the research, selected the CPG from the database, interpreted the data and helped to draft the manuscript. MGD performed the electronic database searching, appraised guidelines, interpreted the data and helped to draft the manuscript. ECh, RM and RB appraised guidelines, interpreted the data and helped to draft the manuscript. All the authors read and approved the final manuscript.