Background
Detailed and complete vital registration (VR) systems are important for effectively informing public health planning [1, 2]. Many countries rely on hospital deaths to update VR systems [1]. It is generally believed that hospital physicians have a comprehensive diagnostic understanding of their patients, and that this will be reflected in high quality hospital cause of death (COD) statistics [3, 4].
A recent systematic review identified only 29 studies, published between 1980 and 2013, that reported the accuracy of hospital data on COD [5]. Importantly, most studies reported substantial misdiagnosis of all-cause mortality. The reviewers concluded that the assumption of high levels of accuracy in hospital COD data was unfounded. Other studies have identified poor death certification practices as a major issue [6]. For example, even after an extensive training period, Bangladeshi physicians failed to adhere to international standards when completing the medical certificate, such as by using ill-defined causes of death [7]. Poor adherence to medical certification practices can lead to COD statistics of uncertain value for VR systems and public health interventions.
Because of the implications for national health policy and planning, it is in the national interest for a country to periodically review the accuracy of COD data through a medical record review (MRR). Unfortunately, MRRs are not a routine part of hospital practice and have required external inputs from national departments of health to identify where education and training would best be directed [6].
A Medical Certificate of Cause of Death (MCCOD) is divided into two parts. Part 1 contains the sequence of causes that led to death; Part 2 contains conditions that contributed to the death but were not part of the sequence. The sequence in Part 1 establishes the underlying cause of death (UCOD), which is of principal interest to public health. Trained mortality coders not only assign codes from the International Classification of Diseases, 10th revision (ICD-10) to the conditions in the MCCOD, but also correct the sequence of causes according to the rules of the ICD-10 coding manual [8, 9].
In the Philippines, 35–45% of deaths occur in hospitals, depending on the region [10]. The MCCOD is written by a hospital physician familiar with the events leading to the patient's death. This first-hand knowledge is supported by the contents of the medical record. Hospitals forward the completed MCCOD to the Office of the Civil Registrar, which is part of the Philippine Statistics Authority (PSA), formerly the National Statistics Office. The certificates are then mortality coded and analyzed by the PSA. The results are reported to, and published by, the Philippine Department of Health, which is an end-user of these statistics and reports them to the World Health Organization.
The Philippines has a mature and functioning VR system, but there has been little research about the accuracy of the UCOD assigned by the PSA [10–12]. To our knowledge, no study has examined the concordance between hospital deaths and those coded by the PSA. In this paper, we compared the UCOD reported by the PSA to that assigned by MRR at a regional hospital in the Philippines in order to assess the diagnostic accuracy of the PSA data.
Discussion
After reviewing medical records at the BRH over 3 years, study physicians found it necessary to correct the UCOD at the three-digit level in 41.2% of the MCCOD; the correction was due to a revised sequence of causes in 7.2% of all MCCOD and to the introduction of a new diagnosis in 33.7%. Overall chance-corrected cause agreement between the MRR and PSA in the misclassification matrix (22 causes) was reasonably high at both the individual (0.73) and population (0.58) level. The PSA reports the 10 leading causes of death in the country by sex [18]. The report covers about 70% of all causes and includes categories such as Neoplasms and Other heart conditions. Findings from this study in Bohol indicate that a similar report covering the leading 22 causes and including all hospital deaths would be reasonably accurate.
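The individual-level agreement figure above rests on chance-corrected concordance (CCC), computed per cause from the misclassification matrix. A minimal sketch, assuming the CCC definition standard in the verbal-autopsy validation literature (per-cause sensitivity rescaled so that chance agreement among N causes scores 0 and perfect agreement scores 1); the matrix below is hypothetical, not study data:

```python
def ccc_per_cause(conf_matrix):
    """Chance-corrected concordance (CCC) for each cause.

    conf_matrix[i][j] = number of deaths whose MRR (reference) cause
    is i and whose PSA-assigned cause is j; N = number of categories.
    """
    n = len(conf_matrix)
    ccc = []
    for i, row in enumerate(conf_matrix):
        total = sum(row)
        sensitivity = row[i] / total if total else 0.0
        # Rescale: chance agreement (1/N) -> 0, perfect agreement -> 1
        ccc.append((sensitivity - 1.0 / n) / (1.0 - 1.0 / n))
    return ccc

# Hypothetical 3-cause misclassification matrix (rows = MRR, cols = PSA)
matrix = [[8, 1, 1],
          [2, 6, 2],
          [0, 1, 9]]
print(ccc_per_cause(matrix))  # first cause: (0.8 - 1/3) / (1 - 1/3) = 0.7
```

Averaging the per-cause values gives a single individual-level summary comparable to the 0.73 reported above.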
The individual and population level concordance between the MRR and the PSA was higher than would have been expected given the percentage of MCCOD that required revision of the UCOD. However, the level of concordance is correlated with the number of COD categories under consideration: the higher the number of categories, the lower the concordance. In reporting revisions to the UCOD, the study physicians were dealing with a much greater number of causes than the final 22 aggregated categories. The fact that the concordance between the MRR and PSA was high despite a large proportion of MCCOD requiring insertion of a new cause suggests that many of the causes inserted by the study physicians fell into the same cause categories as the causes indicated by the PSA. Also, there was no significant difference in concordance between high quality and low quality diagnoses, likely because low quality medical records provided so little information that the study physicians had little choice but to arrive at the same UCOD as the hospital physician.
Overall, the quality of the Philippine medical records was comparatively high, with 75% of records classified as GS1, GS2A, or GS2B. That is, these records provided evidence to justify the study physician diagnosis with a reasonable degree of certainty. However, 41.2% of adult deaths required a change in the UCOD on the MCCOD, and 82% of these deaths required the introduction of a new COD to the certificate. It follows that either 1) the study physicians did not accept a key diagnosis and altered it on the basis of the clinical record, or 2) a diagnosis was present in the medical record but had not been entered into the MCCOD. The second explanation implies a fundamental failure to transfer all diagnoses in the clinical record to the MCCOD. (In the remaining 18% of cases, either the UCOD had not been entered into Part 1 of the certificate or the sequencing was incorrect.)
There are several possible explanations for these corrections. First, record maintenance and medical certification of deaths may be assigned to the most junior member of the clinical team because senior clinicians view medical death certification as tedious. Second, many clinical investigations were not attached to the medical record following the death of the patient and had to be tracked back to the laboratory or the radiology department; the hospital physician would not necessarily have seen these results when completing the MCCOD. Third, diagnoses that should have been included in the MCCOD, and hence in the VR system, may have been passed over in favor of diagnoses that were essential for clinical decision-making.
Several studies have examined concordance between the MCCOD and MRR, but this study is best compared with a study in Mexico by Hernandez et al. that used similar GS criteria, COD categories, and robust metrics [5, 19]. Hernandez et al. examined 1284 adult deaths in Mexico and compared the UCOD obtained from medical certificates against gold standard diagnoses developed by the PHMRC and derived from MRR. Median CCC increased from 66.5% to 75.9% when considering the mention of any COD on the death certificate. The median CCC for the UCOD in the Philippines was higher than in Mexico but slightly lower than that for all COD on the medical certificates in Mexico. The results of the Mexico study suggest that hospital physicians failed to correctly indicate the UCOD on the medical certificate in 33.5% of cases, but only 30% of these failures could be explained by physicians entering an incorrect sequence of causes leading to death; the remaining 70% of the disagreements would have required the introduction of a new cause to the certificate. The situation is thus similar to the Philippines, where 41.0% of medical certificates required a change in UCOD: 18% of these changes were due to an incorrect sequence and 82% to the introduction of a new COD.
This study was part of a larger NHMRC study, which brought with it several strengths. First, the study physicians had experience in local settings as well as in using, and training others in, MRR. Second, the number of cause categories (22) was appropriate for calculating concordance metrics: too few categories yield groups too broad for analysis, and too many yield groups too narrow. We recommend selecting 20–30 leading causes and establishing a standard list of causes for reviews in different hospitals. Third, this study calculated robust metrics for assessing concordance that avoid the bias of relying on a single CSMF composition.
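One widely used population-level metric of this kind is CSMF accuracy, which compares two cause-specific mortality fraction (CSMF) compositions. A minimal sketch, assuming the definition standard in the verbal-autopsy validation literature, in which total absolute error is rescaled by the worst error achievable against the reference composition; the fractions below are hypothetical, not study data:

```python
def csmf_accuracy(true_fractions, pred_fractions):
    """Population-level agreement between a reference and a predicted
    CSMF composition: 1 at perfect agreement, 0 at the worst possible
    prediction for this reference composition."""
    abs_error = sum(abs(t - p)
                    for t, p in zip(true_fractions, pred_fractions))
    # 2 * (1 - min CSMF) is the largest total absolute error possible
    return 1.0 - abs_error / (2.0 * (1.0 - min(true_fractions)))

# Hypothetical three-cause compositions (each sums to 1)
reference = [0.5, 0.3, 0.2]
predicted = [0.4, 0.35, 0.25]
print(csmf_accuracy(reference, predicted))  # approx. 0.875
```

Because the denominator depends only on the reference composition, the metric is comparable across studies with different true CSMF distributions, which is the bias-avoidance property referred to above.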
It could be argued that medical reviewers should be blinded to the hospital physicians' MCCOD. In our view this would not be helpful. The aim of the review is to produce the most accurate MCCOD possible, and the hospital physician, with personal experience of the patient, has access to signs and symptoms denied to the reviewer. Our practice was to establish the likely COD, criticize the clinical diagnosis as appropriate, and compare the reviewer diagnoses with the hospital MCCOD. Records are not always well kept, and it is not difficult to miss a key point. Given that they changed the UCOD in over 40% of deaths, the Bohol reviewers were clearly not inhibited in making changes.
Nonetheless, the present study has its limitations. The hospital data were based on a gold standard validation study for verbal autopsies. A UCOD was assigned by study physicians to each death following an MRR and a review of the hospital physician MCCOD. This study physician UCOD was compared with the UCOD assigned by the PSA. Both the study physicians and the PSA corrected the sequence of causes in the hospital physician MCCOD, but the PSA lacked access to the medical record. Differences between the two could have been due either to different revisions of the hospital physician sequences or to the addition of new causes obtained from the medical record. Because only the UCOD was coded by the validation study and the PSA, we were not able to compare the effects of the changes to sequence directly. The failure to include all possible causes on the MCCOD had not been anticipated and required more detailed analysis than we could provide.
On the other hand, the evidence that 82% of the changes made to the hospital physician UCOD were a consequence of the failure to include all possible causes strongly suggests that this was a fundamental problem. Residual, or other, categories had been developed in the gold standard validation studies for defined causes of death not specified elsewhere. The present study included not only high quality but also low quality diagnoses, which resulted in undefined causes being included in residual categories. Low counts of child and neonatal deaths made it impossible to calculate appropriate metrics to measure concordance between the MRR and PSA assignment.
Conclusions
Filipino physicians had difficulty transferring the UCOD from the medical record to the MCCOD, but the UCOD reported by the PSA was generally in agreement with that of the study physicians. This suggests that the routine mortality data from the PSA, at least for Bohol, might be used with some confidence to describe comparative cause of death patterns in the population. It also suggests that the medical records used to complete the MCCOD and inform the VR system in the Philippines are of sufficient quality for assigning a UCOD, and that most MCCOD are sufficient for public health purposes (i.e., to report the leading 10–20 causes of death in the country). Junior physicians are commonly assigned the tasks of maintaining clinical records and of writing the MCCOD. If unsupervised in these tasks, they are open to making errors, firstly in their interpretation of the correct clinical diagnosis and, secondly, in their entry of diagnoses into the MCCOD. Review of MCCOD needs to become part of routine clinical audits by hospital teams, which can be used both to strengthen clinical practice and to improve the quality of medical records and MCCOD. Future studies that conduct MRRs should continue to use rigorous GS criteria and compute robust metrics to avoid bias.