Review Article
Administrative database research has unique characteristics that can risk biased results

https://doi.org/10.1016/j.jclinepi.2011.08.002

Abstract

Objective

The provision of health care frequently creates digitized data—such as physician service claims, medication prescription records, and hospitalization abstracts—that can be used to conduct studies termed “administrative database research.” While most guidelines for assessing the validity of observational studies apply to administrative database research, the unique data source and analytical opportunities for these studies create risks that can make them uninterpretable or bias their results.

Study Design

Nonsystematic review.

Results

The risks of uninterpretable or biased results can be minimized by: providing a robust description of the data tables used, focusing on both why and how they were created; measuring and reporting the accuracy of the diagnostic and procedural codes used; distinguishing between clinical significance and statistical significance; properly accounting for any time-dependent nature of variables; and analyzing clustered data properly to explore its influence on study outcomes.

Conclusion

This article reviewed these five issues as they pertain to administrative database research to help maximize the utility of these studies for both readers and writers.

Introduction

Health care provision is becoming increasingly digitized. In most jurisdictions, patient visits are logged in registration systems. The dates of physician visits, laboratory tests, and radiological investigations are recorded in physician claims databases. The diagnoses, procedures, and simple outcomes of visits to emergency departments or admissions to hospital are documented in hospitalization databases. Each of these systems leaves a trail of digital information that describes (to varying degrees of detail) a patient’s course through a health care system. These data can be used to conduct research studies that can be termed “administrative database research.”

As in other types of observational research, the overarching goal of administrative database research is the description of a particular measure (or variable) with or without its relationship to another measure. Many of the guidelines that are available to assess the internal validity of observational research [1] apply to administrative database research. However, these studies have several unique issues that also need to be addressed by the writer and evaluated by the reader to establish their internal validity [2]. If these are not addressed or considered, potential threats to the validity of administrative database research may persist. In this article, we discuss five issues that likely should be considered whenever administrative database research is written or read.

Section snippets

Description of the data sets used for study

In studies using primary data collection, the methods section describes steps taken to collect the data used to create the study analytical data set. Key issues here include the sampling frame and sampling methods as well as the inclusion and exclusion criteria. This information helps readers understand which people were considered for inclusion in the study and has important implications for determining the internal and, especially, external validity of the study. In administrative database

Reporting diagnostic and procedural code accuracy using meaningful statistics

Administrative data use codes to identify diagnoses or procedures that are often used for research studies. In a systematic review of administrative database research [3], we found that 76% of administrative database studies used diagnostic or procedural codes to define patient cohorts, exposures, or outcomes. HRAs (or, occasionally, physicians) review health records to identify diagnoses and procedures that have been documented therein. They then use standard coding systems to substitute the
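As an illustration of what reporting code accuracy can involve, the sketch below computes four commonly reported measures (sensitivity, specificity, and positive and negative predictive values) for a diagnostic code compared against a reference standard such as chart review. The function name `code_accuracy` and the counts are hypothetical, and the choice of these four measures is an assumption; the snippet above does not show which statistics the article itself recommends.

```python
def code_accuracy(tp, fp, fn, tn):
    """Accuracy measures for a code versus a reference standard.

    tp: code present, condition truly present
    fp: code present, condition truly absent
    fn: code absent,  condition truly present
    tn: code absent,  condition truly absent
    """
    return {
        "sensitivity": tp / (tp + fn),  # share of true cases that carry the code
        "specificity": tn / (tn + fp),  # share of non-cases without the code
        "ppv": tp / (tp + fp),          # share of coded patients who are true cases
        "npv": tn / (tn + fn),          # share of uncoded patients who are non-cases
    }


# Hypothetical validation sample of 1,000 reviewed charts (illustrative only).
result = code_accuracy(tp=80, fp=20, fn=40, tn=860)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

In a cohort defined by the presence of a code, the positive predictive value is the measure that tells readers what proportion of the cohort truly has the condition.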

Statistical significance vs. clinical significance

The issue of statistical significance vs. clinical significance is not unique to administrative database research. However, these studies often have very large sample sizes, thereby highlighting this issue and making it a recurrent theme in such studies.

Table 1 illustrates the influence that study sample size can have on P-values for statistical testing. In this example, two equally sized groups have a very similar baseline prevalence of a binary trait (49.9% vs. 50.1%). Data in the table show
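The effect of sample size on P-values can be sketched with the prevalences from the example (49.9% vs. 50.1%). The code below is a minimal illustration that assumes a pooled two-proportion z-test with equal group sizes; the test actually used for Table 1 is not stated in the snippet, and the function name and the group sizes are chosen here for illustration only.

```python
import math


def two_proportion_p_value(p1, p2, n_per_group):
    """Two-sided P-value from a pooled two-proportion z-test (equal group sizes)."""
    pooled = (p1 + p2) / 2  # with equal group sizes the pooled proportion is the mean
    standard_error = math.sqrt(pooled * (1 - pooled) * 2 / n_per_group)
    z = abs(p1 - p2) / standard_error
    # Two-sided tail area of the standard normal distribution
    return math.erfc(z / math.sqrt(2))


# The absolute difference (0.2 percentage points) never changes; only n does.
for n in (1_000, 100_000, 1_000_000, 10_000_000):
    p_value = two_proportion_p_value(0.499, 0.501, n)
    print(f"n per group = {n:>10,}: P = {p_value:.4g}")
```

The same clinically trivial difference moves from clearly nonsignificant to highly significant as the groups grow, which is why effect size should be judged separately from the P-value.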

Time-dependent bias

Patient-level variables can change value during observation. Such “time-dependent” variables can be termed “baseline immeasurable” if their value cannot be determined at baseline. Biased conclusions can occur when these variables are analyzed as if their values were known at the start of patient observation.

In this situation, a patient’s outcome will influence the value of their time-dependent variable. Consider a binomial (0/1) time-dependent covariate indicating the presence or absence of a
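The mechanism can be illustrated with a small simulation. The sketch below assumes a binary exposure that begins at a random time during follow-up and has no effect on mortality at all. Patients who die early have less opportunity to become exposed, so classifying patients as "ever exposed" from baseline makes the exposure look protective, whereas splitting each patient's follow-up into unexposed and exposed person-time does not. All rates, the follow-up length, and the function name are hypothetical choices made for illustration.

```python
import random


def simulate_rate_ratios(n_patients=20_000, death_rate=0.10,
                         exposure_rate=0.10, follow_up=10.0, seed=1):
    """Return (naive, time-dependent) death rate ratios, exposed vs. unexposed.

    The true rate ratio is 1: exposure does not change the death rate.
    """
    rng = random.Random(seed)
    # Each entry holds [deaths, person-time].
    naive = {"exposed": [0, 0.0], "unexposed": [0, 0.0]}
    split = {"exposed": [0, 0.0], "unexposed": [0, 0.0]}

    for _ in range(n_patients):
        death_time = rng.expovariate(death_rate)
        exposure_time = rng.expovariate(exposure_rate)
        end = min(death_time, follow_up)       # end of observation
        died = int(death_time <= follow_up)
        ever_exposed = exposure_time < end     # exposure must precede death

        # Naive analysis: exposure status treated as if known at baseline.
        group = "exposed" if ever_exposed else "unexposed"
        naive[group][0] += died
        naive[group][1] += end

        # Time-dependent analysis: follow-up split at the start of exposure.
        if ever_exposed:
            split["unexposed"][1] += exposure_time
            split["exposed"][1] += end - exposure_time
            split["exposed"][0] += died
        else:
            split["unexposed"][1] += end
            split["unexposed"][0] += died

    def rate_ratio(table):
        exposed_rate = table["exposed"][0] / table["exposed"][1]
        unexposed_rate = table["unexposed"][0] / table["unexposed"][1]
        return exposed_rate / unexposed_rate

    return rate_ratio(naive), rate_ratio(split)


naive_rr, time_dependent_rr = simulate_rate_ratios()
print(f"naive rate ratio:          {naive_rr:.2f}")
print(f"time-dependent rate ratio: {time_dependent_rr:.2f}")
```

Under these assumptions the naive ratio falls well below 1 even though the exposure is inert, while the person-time split recovers a ratio close to 1.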

Accounting for clustering

Study samples derived from health administrative data are frequently subject to clustering. For example, consider a study that consists of patients hospitalized with an acute myocardial infarction (AMI) who were treated by physicians who practice within hospitals [18]. This study consists of data having a three-level structure of AMI patients nested within physicians nested within hospitals.

Researchers using health administrative data are frequently interested in determining the association
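One simple way to see why clustering matters is the design effect, which approximates how much the variance of an estimate is inflated when observations within a cluster are correlated. The sketch below uses the standard approximation for equal cluster sizes, 1 + (m - 1) × ICC, where m is the cluster size and ICC is the intraclass correlation. It is a back-of-the-envelope illustration rather than the analytical approach of the article, and the patient count, cluster size, and ICC are hypothetical.

```python
def design_effect(cluster_size, icc):
    """Variance inflation from clustering, assuming equal cluster sizes."""
    return 1 + (cluster_size - 1) * icc


def effective_sample_size(n_total, cluster_size, icc):
    """Number of independent observations carrying the same information."""
    return n_total / design_effect(cluster_size, icc)


# Hypothetical example: 10,000 AMI patients, 50 per physician, ICC of 0.05.
n_total, cluster_size, icc = 10_000, 50, 0.05
print(f"design effect:         {design_effect(cluster_size, icc):.2f}")
print(f"effective sample size: {effective_sample_size(n_total, cluster_size, icc):.0f}")
```

Even a small within-cluster correlation can shrink the effective sample size considerably, so analyses that treat clustered patients as independent will tend to report confidence intervals that are too narrow.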

Summary

Administrative database research can offer extensive opportunities for health-related scientific studies. In this article, we discussed five issues that we believe are especially prominent in administrative database research. It is important that writers of administrative database research address and clarify these issues to avoid confusion and misinformation in readers.

References (23)

  • W.J. Rogan et al. Estimating prevalence from the results of a screening test. Am J Epidemiol (1978)