Surveillance for Hepatocellular Carcinoma: Development and Validation of an Algorithm to Classify Tests in Administrative and Laboratory Data

Richardson, Peter; Henderson, Louise; Davila, Jessica A.; Kramer, Jennifer R.; Fitton, Conar P.; Chen, G. John; El-Serag, Hashem B.

doi:10.1007/s10620-010-1387-y

Original Article
Published: 16 September 2010

Volume 55, pages 3241–3251, (2010)

Digestive Diseases and Sciences

Peter Richardson¹,
Louise Henderson³,
Jessica A. Davila¹,
Jennifer R. Kramer¹,
Conar P. Fitton¹,
G. John Chen¹ &
…
Hashem B. El-Serag¹,²,⁴

Abstract

Background

The purpose of alpha-fetoprotein (AFP) and abdominal ultrasound (US) cannot be discerned in administrative data.

Aim

We developed an algorithm to identify AFP and US used as surveillance tests for hepatocellular carcinoma (HCC).

Methods

We evaluated 300 AFP and 301 US tests from a VA database. Surveillance predictors in the administrative files (diagnoses, labs) were examined in logistic regression models. We calculated model-based probabilities of HCC surveillance status, and developed classification procedures using single and multiple imputation methods.

Results

The predictors of surveillance intent for AFP were absence of alcoholism, abdominal pain, ascites, diabetes and high AST levels. For US, the predictors of surveillance were prior AFP testing, HIV status, and absence of abdominal pain, ascites, or drug dependence. For AFP classification, single imputation compared favorably with multiple imputation, both showing robustness in discrimination and calibration. For US, both approaches were less robust in discrimination and calibration, which was more moderate in multiple imputation than in single imputation.

Conclusions

Predictive algorithms in administrative files can be used to identify AFP performed for HCC surveillance; however, the intent of US is more difficult to identify.

Abbreviations

AFP: Alpha-fetoprotein
CT: Computed tomography
HCC: Hepatocellular carcinoma
VA: Veterans Administration

References

1. El-Serag HB, Rudolph KL. Hepatocellular carcinoma: epidemiology and molecular carcinogenesis. Gastroenterology. 2007;132:2557–2576.
2. El-Serag HB, Marrero JA, Rudolph L, Reddy KR. Diagnosis and treatment of hepatocellular carcinoma. Gastroenterology. 2008;134:1752–1763.
3. Bruix J, Sherman M; Practice Guidelines Committee, American Association for the Study of Liver Diseases. Management of hepatocellular carcinoma. Hepatology. 2005;42:1208–1236.
4. Bruix J, Sherman M, Llovet JM, et al. Clinical management of hepatocellular carcinoma. Conclusions of the Barcelona-2000 EASL conference. European Association for the Study of the Liver. J Hepatol. 2001;35:421–430.
5. Trevisani F, De NS, Rapaccini G, et al. Semiannual and annual surveillance of cirrhotic patients for hepatocellular carcinoma: effects on cancer stage and patient survival (Italian experience). Am J Gastroenterol. 2002;97:734–744.
6. Singal A, Volk ML, Waljee A, et al. Meta-analysis: surveillance with ultrasound for early-stage hepatocellular carcinoma in patients with cirrhosis. Aliment Pharmacol Ther. 2009;30:37–47.
7. Backus LI, Gavrilov S, Loomis TP, et al. Clinical case registries: simultaneous local and national disease registries for population quality management. J Am Med Inform Assoc. 2009;16:775–783.
8. Kramer JR, Davila JA, Miller ED, Richardson P, Giordano TP, El-Serag HB. The validity of viral hepatitis and chronic liver disease diagnoses in Veterans Affairs administrative databases. Aliment Pharmacol Ther. 2008;27:274–282.
9. Greenland S. Modeling and variable selection in epidemiologic analysis. Am J Public Health. 1989;79:340–349.
10. Rubin DB. Multiple imputation after 18+ years. J Am Stat Assoc. 1996;91:473–489.
11. Xiao-Hua Z, McClish DK, Obuchowski NA. Methods in Diagnostic Medicine. New York: Wiley-Interscience; 2002.
12. Hastie T, Tibshirani R, Friedman J. The Elements of Statistical Learning. New York: Springer; 2009.
13. Meng X, Rubin DB. Performing likelihood ratio tests with multiply-imputed data sets. Biometrika. 1992;79:103–111.

Acknowledgments

Financial support: This work was supported in part by the Houston VA HSR&D Center of Excellence (HFP90-020) and the National Cancer Institute (R01-CA-125487).

Conflicts of interest

None.

Author information

Authors and Affiliations

Section of Health Services Research, Houston Center for Quality of Care & Utilization Studies, The Houston Veterans Affairs Medical Center and Baylor College of Medicine, Houston, TX, USA
Peter Richardson, Jessica A. Davila, Jennifer R. Kramer, Conar P. Fitton, G. John Chen & Hashem B. El-Serag
Section of Gastroenterology, Houston Center for Quality of Care & Utilization Studies, The Houston Veterans Affairs Medical Center and Baylor College of Medicine, Houston, TX, USA
Hashem B. El-Serag
Department of Radiology, The University of North Carolina, Chapel Hill, NC, USA
Louise Henderson
The Michael E. DeBakey VA Medical Center, 2002 Holcombe Blvd. (152), Houston, TX, 77030, USA
Hashem B. El-Serag

Corresponding author

Correspondence to Hashem B. El-Serag.

Electronic Supplementary Material

Below is the link to the electronic supplementary material.

Supplementary material 1 (PDF 84 kb)

Appendices

Appendix A

See Table 6.

Table 6 Variables for consideration in model development

Appendix B

Direct Multiple Imputation

In the direct multiple imputation approach, the surveillance counts are treated as random variables, and each reported count is the expected value of that variable under the probability distribution induced by the model-predicted surveillance probabilities. Estimates of these expected values, and of the expected values of other random variables, are obtained by averaging the imputed values over the imputation iterations. In particular, parameter estimation in multiple imputation approaches to generalized linear modeling proceeds in the same way [12].
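The averaging step can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `impute_surveillance_count`, the number of iterations, the fixed seed, and the example probabilities are all assumptions introduced here. Each test is imputed as a Bernoulli draw from its model-predicted surveillance probability, and the surveillance count is estimated as the mean of the imputed counts over the iterations.

```python
import random


def impute_surveillance_count(probs, n_iter=1000, seed=0):
    """Estimate the expected surveillance count by direct multiple imputation.

    probs  : model-predicted surveillance probabilities, one per test
    n_iter : number of imputation iterations (assumed value)
    seed   : seed for reproducibility (assumed value)
    """
    rng = random.Random(seed)
    counts = []
    for _ in range(n_iter):
        # Impute each test's status as a Bernoulli draw from its probability.
        count = sum(1 for p in probs if rng.random() < p)
        counts.append(count)
    # The estimate is the mean imputed count over the iterations.
    return sum(counts) / n_iter


# Hypothetical predicted probabilities for five tests.
example_probs = [0.9, 0.8, 0.5, 0.2, 0.1]
estimate = impute_surveillance_count(example_probs)
expected = sum(example_probs)  # analytic expected count under the model
```

With enough iterations, `estimate` should approach `expected`, the sum of the predicted probabilities; the imputation route becomes useful for quantities whose expectation is less convenient to write down directly.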

Decision-Theoretic Framework for Dichotomization

The decision-theoretic approach treats the predicted probabilities as a continuous scale on which thresholds are chosen to minimize the total cost assigned to true and false positives and negatives.

Costs are assigned per unit to true and false positives and negatives:

$$ \begin{aligned} \text{Total cost} &= \alpha * (\#\text{TP}) + \beta * (\#\text{FP}) + \gamma * (\#\text{FN}) + \delta * (\#\text{TN}) \\ &= N * \bigl[ (\alpha - \gamma) * \text{sens} * (\text{true screen rate}) + (\delta - \beta) * \text{spec} * (1 - \text{true screen rate}) \\ &\qquad + \gamma * (\text{true screen rate}) + \beta * (1 - \text{true screen rate}) \bigr] \end{aligned} $$
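As a quick check that the two forms of the total cost agree, the sketch below evaluates both on one confusion matrix. The function names, the counts, and the unit costs are hypothetical values chosen for illustration, not figures from the study.

```python
def total_cost_from_counts(tp, fp, fn, tn, alpha, beta, gamma, delta):
    """First form: unit costs applied directly to the four cell counts."""
    return alpha * tp + beta * fp + gamma * fn + delta * tn


def total_cost_from_rates(n, sens, spec, rate, alpha, beta, gamma, delta):
    """Second form: same cost written with sensitivity, specificity,
    and the true screen rate (proportion of tests that are surveillance)."""
    return n * (
        (alpha - gamma) * sens * rate
        + (delta - beta) * spec * (1 - rate)
        + gamma * rate
        + beta * (1 - rate)
    )


# Hypothetical confusion-matrix counts.
tp, fp, fn, tn = 120, 30, 40, 110
n = tp + fp + fn + tn
rate = (tp + fn) / n    # true screen rate
sens = tp / (tp + fn)   # sensitivity
spec = tn / (tn + fp)   # specificity

# Hypothetical unit costs: correct classifications are free,
# a false negative costs twice as much as a false positive.
costs = dict(alpha=0.0, beta=1.0, gamma=2.0, delta=0.0)

by_counts = total_cost_from_counts(tp, fp, fn, tn, **costs)
by_rates = total_cost_from_rates(n, sens, spec, rate, **costs)
```

Both variables evaluate to the same total (up to floating-point rounding), which is what the second line of the equation asserts.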

This cost function can be applied directly to the vertices of the ROC curve for the predictive model. The vertices at which the cost is minimal yield the desired dichotomization thresholds. Slopes of level curves of this cost function can assist in this process:

The slopes of the level curves of the total cost function on the ROC plane (with x = (1 − spec) and y = sens) are equal to

$$ \frac {(\delta - \beta)*(1 - {\text {true screen rate}})}{(\alpha - \gamma)*(\text {true screen rate})} = \frac {(\delta - \beta)/(\alpha - \gamma)}{\text{odds of true screen}}$$

When, as in most applications, the costs for true positives and true negatives are zero, this is simply

$$ {\frac{{{\frac{\beta }{\gamma }}}}{{{\text{odds}}\,{\text{of}}\,{\text{true}}\,{\text{screen}}}}}. $$
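Both slope expressions can be recovered in one step from the total cost formula. As a short derivation, write p for the true screen rate (a shorthand introduced here for brevity) and substitute sens = y and spec = 1 − x:

$$ \frac{\text{Total cost}}{N} = (\alpha - \gamma) * p * y + (\delta - \beta) * (1 - p) * (1 - x) + \gamma * p + \beta * (1 - p) $$

Along a level curve the cost is constant, so its total differential vanishes:

$$ (\alpha - \gamma) * p \, dy - (\delta - \beta) * (1 - p) \, dx = 0 \quad \Longrightarrow \quad \frac{dy}{dx} = \frac{(\delta - \beta) * (1 - p)}{(\alpha - \gamma) * p} $$

Setting α = δ = 0 gives (−β)(1 − p) / ((−γ) p) = (β/γ) / (p/(1 − p)), which is the ratio of the false-positive cost to the false-negative cost divided by the odds of a true screen.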

The cost-minimizing ROC vertices are the points at which a line with this slope is tangent to the ROC curve.
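A minimal sketch of the threshold search follows. Rather than constructing tangent lines, it simply evaluates the total cost at every ROC vertex (one per distinct predicted probability, plus the vertex where nothing is classified as surveillance) and keeps the cheapest. The function name `best_threshold`, the predicted probabilities, the true labels, and the unit costs are all hypothetical.

```python
def best_threshold(probs, labels, alpha=0.0, beta=1.0, gamma=1.0, delta=0.0):
    """Return (threshold, cost) with minimal total cost over all ROC vertices.

    A test is classified as surveillance when its probability >= threshold.
    alpha, beta, gamma, delta are unit costs for TP, FP, FN, TN respectively.
    """
    # Each distinct probability is a candidate threshold (one ROC vertex);
    # infinity adds the vertex where no test is classified as surveillance.
    candidates = sorted(set(probs)) + [float("inf")]
    best = None
    for t in candidates:
        tp = fp = fn = tn = 0
        for p, y in zip(probs, labels):
            predicted = p >= t
            if predicted and y:
                tp += 1
            elif predicted:
                fp += 1
            elif y:
                fn += 1
            else:
                tn += 1
        cost = alpha * tp + beta * fp + gamma * fn + delta * tn
        # Strict comparison keeps the lowest threshold among ties.
        if best is None or cost < best[1]:
            best = (t, cost)
    return best


# Hypothetical predicted probabilities and true surveillance status (1 = yes).
probs = [0.95, 0.85, 0.70, 0.55, 0.40, 0.30, 0.15, 0.05]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

# Missing a surveillance test is three times as costly as a false alarm.
fn_costly = best_threshold(probs, labels, beta=1.0, gamma=3.0)
# A false alarm is three times as costly as a missed surveillance test.
fp_costly = best_threshold(probs, labels, beta=3.0, gamma=1.0)
```

On this toy data, the selected threshold is lower when false negatives are the costlier error and higher when false positives are, which mirrors how the tangent slope changes with the ratio β/γ.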

In particular, when false negatives and false positives are assigned equal unit costs (β = γ), cost minimization coincides with maximization of agreement (#TP + #TN).
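The reason can be written out in one line, assuming, as in the zero-cost case above, that true positives and true negatives carry no cost (α = δ = 0) and that β = γ > 0:

$$ \text{Total cost} = \beta * (\#\text{FP}) + \beta * (\#\text{FN}) = \beta * \bigl( N - (\#\text{TP} + \#\text{TN}) \bigr) $$

Because N and β are fixed, the total cost is smallest exactly when #TP + #TN is largest.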

About this article

Cite this article

Richardson, P., Henderson, L., Davila, J.A. et al. Surveillance for Hepatocellular Carcinoma: Development and Validation of an Algorithm to Classify Tests in Administrative and Laboratory Data. Dig Dis Sci 55, 3241–3251 (2010). https://doi.org/10.1007/s10620-010-1387-y

Received: 28 June 2010
Accepted: 04 August 2010
Published: 16 September 2010
Issue Date: November 2010
DOI: https://doi.org/10.1007/s10620-010-1387-y
