Surveillance for Hepatocellular Carcinoma: Development and Validation of an Algorithm to Classify Tests in Administrative and Laboratory Data

  • Original Article
Digestive Diseases and Sciences

Abstract

Background

The purpose of alpha-fetoprotein (AFP) and abdominal ultrasound (US) cannot be discerned in administrative data.

Aim

We developed an algorithm to identify AFP and US used as surveillance tests for hepatocellular carcinoma (HCC).

Methods

We evaluated 300 AFP and 301 US tests from a VA database. Surveillance predictors in the administrative files (diagnoses, labs) were examined in logistic regression models. We calculated model-based probabilities of HCC surveillance status, and developed classification procedures using single and multiple imputation methods.

Results

The predictors of surveillance intent for AFP were absence of alcoholism, abdominal pain, ascites, diabetes and high AST levels. For US, the predictors of surveillance were prior AFP testing and HIV status and absence of abdominal pain, ascites, or drug dependence. For AFP classification, single imputation compared favorably with multiple imputation, both showing robustness in discrimination and calibration. For US, both approaches were less robust in discrimination and calibration, and calibration was more moderate with multiple imputation than with single imputation.

Conclusions

Predictive algorithms in administrative files can be used to identify AFP performed for HCC surveillance; however, the intent of US is more difficult to identify.

Abbreviations

AFP: Alpha-fetoprotein

CT: Computed tomography

HCC: Hepatocellular carcinoma

VA: Veterans Administration

References

  1. El-Serag HB, Rudolph KL. Hepatocellular carcinoma: epidemiology and molecular carcinogenesis. Gastroenterology. 2007;132:2557–2576.

  2. El-Serag HB, Marrero JA, Rudolph L, Reddy KR. Diagnosis and treatment of hepatocellular carcinoma. Gastroenterology. 2008;134:1752–1763.

  3. Bruix J, Sherman M; Practice Guidelines Committee, American Association for the Study of Liver Diseases. Management of hepatocellular carcinoma. Hepatology. 2005;42:1208–1236.

  4. Bruix J, Sherman M, Llovet JM, et al. Clinical management of hepatocellular carcinoma. Conclusions of the Barcelona-2000 EASL conference. European Association for the Study of the Liver. J Hepatol. 2001;35:421–430.

  5. Trevisani F, De NS, Rapaccini G, et al. Semiannual and annual surveillance of cirrhotic patients for hepatocellular carcinoma: effects on cancer stage and patient survival (Italian experience). Am J Gastroenterol. 2002;97:734–744.

  6. Singal A, Volk ML, Waljee A, et al. Meta-analysis: surveillance with ultrasound for early-stage hepatocellular carcinoma in patients with cirrhosis. Aliment Pharmacol Ther. 2009;30:37–47.

  7. Backus LI, Gavrilov S, Loomis TP, et al. Clinical case registries: simultaneous local and national disease registries for population quality management. J Am Med Inform Assoc. 2009;16:775–783.

  8. Kramer JR, Davila JA, Miller ED, Richardson P, Giordano TP, El-Serag HB. The validity of viral hepatitis and chronic liver disease diagnoses in Veterans Affairs administrative databases. Aliment Pharmacol Ther. 2008;27:274–282.

  9. Greenland S. Modeling and variable selection in epidemiologic analysis. Am J Public Health. 1989;79:340–349.

  10. Rubin DB. Multiple imputation after 18+ years. J Am Stat Assoc. 1996;91:473–489.

  11. Xiao-Hua Z, McClish DK, Obuchowski NA. Methods in Diagnostic Medicine. New York: Wiley-Interscience; 2002.

  12. Hastie T, Tibshirani R, Friedman J. The Elements of Statistical Learning. New York: Springer; 2009.

  13. Meng X, Rubin DB. Performing likelihood ratio tests with multiply-imputed data sets. Biometrika. 1992;79:103–111.

Acknowledgments

Financial support: This work was supported in part by the Houston VA HSR&D Center of Excellence (HFP90-020) and the National Cancer Institute (R01-CA-125487).

Conflicts of interest

None.

Author information

Corresponding author

Correspondence to Hashem B. El-Serag.

Electronic Supplementary Material

Below is the link to the electronic supplementary material.

Supplementary material 1 (PDF 84 kb)

Appendices

Appendix A

See Table 6.

Table 6 Variables for consideration in model development

Appendix B

Direct Multiple Imputation

In the direct multiple imputation approach, the surveillance counts reported are the expected values of those counts, treated as random variables, under the probability distribution induced by the model-predicted surveillance probabilities. Expected values of these and other random variables are estimated by averaging the imputed values over the imputation iterations. In particular, parameter estimation in multiple imputation approaches to generalized linear modeling proceeds in the same way [12].
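The averaging step can be illustrated with a minimal sketch. It assumes each test's surveillance status is imputed as an independent Bernoulli draw from its predicted probability; the probabilities, the number of imputations, and the function name `impute_surveillance_counts` are hypothetical and are not taken from the study.

```python
import random


def impute_surveillance_counts(probs, n_imputations=1000, seed=0):
    """Return the mean imputed surveillance count and the per-iteration counts.

    probs: model-predicted surveillance probabilities, one per test
    (hypothetical values here, not the study's).
    """
    rng = random.Random(seed)
    counts = []
    for _ in range(n_imputations):
        # One imputation: each test is labelled "surveillance" with its own
        # predicted probability (independence is an assumption of this sketch).
        counts.append(sum(1 for p in probs if rng.random() < p))
    mean_count = sum(counts) / len(counts)
    return mean_count, counts


# Hypothetical predicted probabilities for five tests.
probs = [0.9, 0.8, 0.35, 0.1, 0.6]
mean_count, counts = impute_surveillance_counts(probs)

# Under the model, the expected count is the sum of the probabilities;
# the mean over imputations should approach it as n_imputations grows.
expected_count = sum(probs)
print(mean_count, expected_count)
```

In this sketch the mean over imputations approximates the sum of the predicted probabilities, which is the model-based expected count.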

Decision-Theoretic Framework for Dichotomization

The decision-theoretic approach treats the predicted probabilities as a continuous scale on which thresholds are chosen to minimize the total cost assigned to true and false positives and negatives.

Costs are assigned per unit to true and false positives and negatives:

$$ \begin{aligned} {\text{Total cost}} = {} & \alpha *\left( \#{\text{TP}} \right) + \beta *\left( \#{\text{FP}} \right) + \gamma *\left( \#{\text{FN}} \right) + \delta *\left( \#{\text{TN}} \right) \\ = {} & N*\bigl[ \left( \alpha - \gamma \right)*{\text{sens}}*\left( {\text{true screen rate}} \right) + \left( \delta - \beta \right)*{\text{spec}}*\left( 1 - {\text{true screen rate}} \right) \\ & {} + \gamma *\left( {\text{true screen rate}} \right) + \beta *\left( 1 - {\text{true screen rate}} \right) \bigr] \end{aligned} $$
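As a check on the algebra, the following sketch evaluates the total cost both from the raw counts and from sensitivity, specificity and the true screen rate. The confusion-matrix counts and unit costs are hypothetical values chosen for illustration; the function names are likewise assumptions of this sketch.

```python
def total_cost_from_counts(tp, fp, fn, tn, alpha, beta, gamma, delta):
    """Total cost as a weighted sum of the four confusion-matrix counts."""
    return alpha * tp + beta * fp + gamma * fn + delta * tn


def total_cost_from_rates(n, sens, spec, rate, alpha, beta, gamma, delta):
    """Same cost written in terms of sens, spec and the true screen rate."""
    return n * ((alpha - gamma) * sens * rate
                + (delta - beta) * spec * (1 - rate)
                + gamma * rate
                + beta * (1 - rate))


# Hypothetical counts and unit costs (not from the study).
tp, fp, fn, tn = 80, 30, 20, 170
n = tp + fp + fn + tn
sens = tp / (tp + fn)      # sensitivity
spec = tn / (tn + fp)      # specificity
rate = (tp + fn) / n       # true screen rate
costs = dict(alpha=0.0, beta=1.0, gamma=2.0, delta=0.0)

cost_counts = total_cost_from_counts(tp, fp, fn, tn, **costs)
cost_rates = total_cost_from_rates(n, sens, spec, rate, **costs)
print(cost_counts, cost_rates)
```

Both forms give the same value up to floating-point rounding, since #TP = N·sens·rate, #FN = N·(1 − sens)·rate, #TN = N·spec·(1 − rate) and #FP = N·(1 − spec)·(1 − rate).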

This cost function can be applied directly to the vertices of the ROC curve for the predictive model. The vertices at which the cost is minimal yield the desired dichotomization thresholds. Slopes of level curves of this cost function can assist in this process:

The slopes of the level curves of the total cost function on the ROC plane (with x = (1 − spec) and y = sens) are equal to

$$ \frac {(\delta - \beta)*(1 - {\text {true screen rate}})}{(\alpha - \gamma)*(\text {true screen rate})} = \frac {(\delta - \beta)/(\alpha - \gamma)}{\text{odds of true screen}}$$

When, as in most applications, the costs for true positives and true negatives are zero, this is simply

$$ {\frac{{{\frac{\beta }{\gamma }}}}{{{\text{odds}}\,{\text{of}}\,{\text{true}}\,{\text{screen}}}}}. $$
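The intermediate step can be spelled out as follows, writing p for the true screen rate (a shorthand introduced here for brevity), x = 1 − spec and y = sens. The per-test cost is

$$ \frac{\text{Total cost}}{N} = (\alpha - \gamma)*p*y + (\delta - \beta)*(1 - p)*(1 - x) + \gamma *p + \beta *(1 - p). $$

Holding this constant along a level curve gives

$$ (\alpha - \gamma)*p\,dy - (\delta - \beta)*(1 - p)\,dx = 0 \quad\Longrightarrow\quad \frac{dy}{dx} = \frac{(\delta - \beta)*(1 - p)}{(\alpha - \gamma)*p}, $$

and setting α = δ = 0 leaves

$$ \frac{dy}{dx} = \frac{\beta *(1 - p)}{\gamma *p} = \frac{\beta /\gamma }{p/(1 - p)}. $$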

The cost-minimizing ROC vertices are the points at which a line with this slope is tangent to the ROC curve.
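A brute-force version of this search can be sketched as below, assuming zero cost for true positives and true negatives and a validated surveillance label for each test. The data are hypothetical, and the functions `roc_vertices` and `min_cost_threshold` are illustrative helpers, not the authors' code. Each distinct predicted probability is tried as a threshold, with a test classified as surveillance when its probability is at or above the threshold.

```python
def roc_vertices(probs, labels):
    """Return (threshold, sens, spec) for each distinct predicted probability."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    vertices = []
    for t in sorted(set(probs)):
        tp = sum(1 for p, y in zip(probs, labels) if p >= t and y == 1)
        tn = sum(1 for p, y in zip(probs, labels) if p < t and y == 0)
        vertices.append((t, tp / n_pos, tn / n_neg))
    return vertices


def min_cost_threshold(probs, labels, beta=1.0, gamma=1.0):
    """Threshold minimizing beta*#FP + gamma*#FN (alpha = delta = 0 assumed)."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for t, sens, spec in roc_vertices(probs, labels):
        cost = beta * (1 - spec) * n_neg + gamma * (1 - sens) * n_pos
        if best is None or cost < best[1]:
            best = (t, cost)
    return best


# Hypothetical predicted probabilities and validated labels (1 = surveillance).
probs = [0.95, 0.85, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 1, 0, 0, 0]

equal_costs = min_cost_threshold(probs, labels, beta=1.0, gamma=1.0)
costly_fp = min_cost_threshold(probs, labels, beta=5.0, gamma=1.0)
print(equal_costs, costly_fp)
```

Raising the cost of a false positive relative to a false negative steepens the level curves and moves the selected threshold upward, toward the more specific end of the ROC curve.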

In particular, when false negatives and false positives are assigned equal unit costs (β = γ), cost minimization coincides with maximization of agreement (#TP + #TN).
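The equivalence follows because #FP + #FN = N − (#TP + #TN). A small sketch on hypothetical data shows that the threshold with the fewest misclassifications is also the one with the highest agreement; the helper `confusion_counts` is an illustrative name.

```python
def confusion_counts(probs, labels, threshold):
    """Return (tp, fp, fn, tn) when tests with p >= threshold are called positive."""
    tp = fp = fn = tn = 0
    for p, y in zip(probs, labels):
        predicted = p >= threshold
        if predicted and y == 1:
            tp += 1
        elif predicted and y == 0:
            fp += 1
        elif y == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


# Hypothetical predicted probabilities and validated labels (1 = surveillance).
probs = [0.95, 0.85, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 1, 0, 0, 0]

# One row per candidate threshold: (threshold, tp, fp, fn, tn).
rows = [(t,) + confusion_counts(probs, labels, t) for t in sorted(set(probs))]

# Equal unit costs (beta = gamma = 1, alpha = delta = 0): cost = #FP + #FN.
min_cost_t = min(rows, key=lambda r: r[2] + r[3])[0]
# Agreement = #TP + #TN.
max_agree_t = max(rows, key=lambda r: r[1] + r[4])[0]
print(min_cost_t, max_agree_t)
```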

About this article

Cite this article

Richardson, P., Henderson, L., Davila, J.A. et al. Surveillance for Hepatocellular Carcinoma: Development and Validation of an Algorithm to Classify Tests in Administrative and Laboratory Data. Dig Dis Sci 55, 3241–3251 (2010). https://doi.org/10.1007/s10620-010-1387-y

  • DOI: https://doi.org/10.1007/s10620-010-1387-y
