Published in: PharmacoEconomics 2/2016

01 February 2016 | Practical Application

Realising the Value of Linked Data to Health Economic Analyses of Cancer Care: A Case Study of Cancer 2015

Authors: Paula K. Lorgelly, Brett Doble, Rachel J. Knott, The Cancer 2015 Investigators

Abstract

There is a growing appetite for large complex databases that integrate a range of personal, socio-demographic, health, genetic and financial information on individuals. It has been argued that ‘Big Data’ will provide the necessary catalyst to advance both biomedical research and health economics and outcomes research. However, it is important that we do not succumb to being data rich but information poor. This paper discusses the benefits and challenges of building Big Data, analysing Big Data and making appropriate inferences in order to advance cancer care, using Cancer 2015 (a prospective, longitudinal, genomic cohort study in Victoria, Australia) as a case study. Cancer 2015 has been linked to State and Commonwealth reimbursement databases that have known limitations. This partly reflects the funding arrangements in Australia, a country with both public and private provision, including public funding of private healthcare, and partly the legislative frameworks that govern data linkage. Additionally, linkage is not without time delays and, as such, achieving a contemporaneous database is challenging. Despite these limitations, there is clear value in using linked data and creating Big Data. This paper describes the linked Cancer 2015 dataset, discusses estimation issues given the nature of the data and presents panel regression results that allow us to make possible inferences regarding which patient, disease, genomic and treatment characteristics explain variation in health expenditure.
Footnotes
1
Note that these ‘V’ terms, and those that follow, have been variously defined in lists of the three Vs for Big Data [12], the five Vs [13], the seven Vs [14] and even ten Vs [15].
 
2
MBS data will also capture private hospital services and hospital outpatient services if they are billed via Medicare.
 
3
As the cohort did not specifically target the elderly, and is not expressly interested in issues with end-of-life care, we chose not to establish a linkage with the Commonwealth Department of Veterans’ Affairs (DVA) database. The cost of requesting Commonwealth data is also not inconsequential, so the decision was partly budgetary. As a result, our estimate of total healthcare expenditure is likely to be an underestimate. See Ward et al. [16] for an analysis that includes DVA patients.
 
4
Interested researchers are invited to discuss applications to access the Cancer 2015 data with the Steering Committee.
 
5
As VDL de-identify the data, they can provide data for as far back as the study team requests, although linkage and data quality do diminish.
 
6
Note that there is currently a lag with batch testing within the NGS panel, as it is more cost effective to run it with large samples of data.
 
7
Although this is just the first wave of data from Cancer 2015 to be linked, the breadth of the data in both the cohort and available from the State and Commonwealth governments results in a combined dataset that is greater than 67 MB; this will grow exponentially as enrolment and follow-up continue, and will very much be in the realm of terabytes of Big Data.
 
8
The research team considered it unlikely that an individual diagnosed with cancer would not appear in either of these records; on the other hand, they may not appear in the hospital records if a palliative pathway was established from the outset or the cancer was particularly aggressive.
 
9
The peak just after diagnosis is likely to be due to the large number of tests that inform diagnosis and treatment alternatives and, where it is possible, surgery to remove the tumour.
 
10
The Hausman test rejected the null hypothesis that the random effects and regressors are uncorrelated, favouring the fixed-effects specification.
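For readers less familiar with this test, its standard textbook form is sketched below. The notation is ours rather than the authors': \hat{\beta}_{FE} and \hat{\beta}_{RE} denote the fixed-effects and random-effects estimates of the time-varying coefficients, and k is the number of coefficients compared. The footnote does not report the statistic itself, so no value is given here.

```latex
H = \left(\hat{\beta}_{FE} - \hat{\beta}_{RE}\right)^{\top}
    \left[\widehat{\operatorname{Var}}\!\left(\hat{\beta}_{FE}\right)
        - \widehat{\operatorname{Var}}\!\left(\hat{\beta}_{RE}\right)\right]^{-1}
    \left(\hat{\beta}_{FE} - \hat{\beta}_{RE}\right)
    \;\overset{H_0}{\sim}\; \chi^2_{k}
```

A large value of H (small p-value) indicates that the two sets of estimates differ systematically, which is the situation described in the footnote and the usual reason for preferring fixed effects.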
 
11
These were identified in the PBS records using the high-level ATC (Anatomical Therapeutic Chemical) code. These two ATC codes were included to reflect possible adverse effects of cancer treatment.
 
12
The UK EQ-5D-3L values were used to estimate EQ-5D values. We extrapolated backward from the first recorded EQ-5D-3L and forward from the last recorded EQ-5D-3L to determine QOL outside the observed window, and interpolated linearly between QOL measurement points, such that we have a measure of QOL for each time period.
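The gap-filling described in this footnote can be illustrated with a minimal Python sketch. It is not the authors' code (the analyses were run in Stata), and two things are assumptions: the function name `fill_qol` is hypothetical, and the extrapolation rule is taken to be "carry the nearest observed value", since the footnote does not say which rule was used. The patient data are invented for illustration.

```python
from bisect import bisect_right


def fill_qol(obs_times, obs_values, periods):
    """Return one QOL value per requested period.

    Between observations: linear interpolation.
    Before the first / after the last observation: the nearest observed
    value is carried backward / forward (ASSUMPTION: the source does not
    state which extrapolation rule was applied).
    """
    if not obs_times or len(obs_times) != len(obs_values):
        raise ValueError("need matching, non-empty observation lists")

    # Sort observations by time so neighbours can be located by bisection.
    pairs = sorted(zip(obs_times, obs_values))
    times = [t for t, _ in pairs]
    values = [v for _, v in pairs]

    filled = []
    for p in periods:
        if p <= times[0]:
            filled.append(values[0])       # backward extrapolation
        elif p >= times[-1]:
            filled.append(values[-1])      # forward extrapolation
        else:
            hi = bisect_right(times, p)    # first observation after p
            lo = hi - 1                    # last observation at or before p
            weight = (p - times[lo]) / (times[hi] - times[lo])
            filled.append(values[lo] + weight * (values[hi] - values[lo]))
    return filled


# Hypothetical patient: EQ-5D-3L index recorded at months 2, 6 and 10.
qol = fill_qol([2, 6, 10], [0.80, 0.60, 0.70], range(0, 13))
```

Carrying the nearest value is the most conservative choice; a trend-based extrapolation would need bounds so that values stay within the valid range of the EQ-5D-3L value set.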
 
13
All analyses were undertaken in Stata/MP® 13 (StataCorp LP, College Station, TX, USA). The dataset required manipulation and selection to perform optimally; it is likely that future analyses with more linked data will require alternative software packages.
 
References
1.
Trusheim MR, Berndt ER, Douglas FL. Stratified medicine: strategic and economic implications of combining drugs and clinical biomarkers. Nat Rev Drug Discov. 2007;6(4):287–93.
2.
Sullivan R, Peppercorn J, Sikora K, Zalcberg J, Meropol NJ, Amir E, et al. Delivering affordable cancer care in high-income countries. Lancet Oncol. 2011;12(10):933–80.
4.
Bates DW, Saria S, Ohno-Machado L, Shah A, Escobar G. Big data in health care: using analytics to identify and manage high-risk and high-cost patients. Health Aff. 2014;33(7):1123–31.
5.
Boyd D, Crawford K. Critical questions for big data: provocations for a cultural, technological, and scholarly phenomenon. Inform Commun Soc. 2012;15(5):662–79.
6.
Collins B. Big Data and health economics: strengths, weaknesses, opportunities and threats. Epub: Pharmacoeconomics; 2015.
7.
Hart CL, MacKinnon PL, Watt GC, Upton MN, McConnachie A, Hole DJ, et al. The Midspan studies. Int J Epidemiol. 2005;34(1):28–34.
8.
Geue C, Briggs A, Lewsey J, Lorgelly P. Population ageing and healthcare expenditure projections: new evidence from a time to death approach. Eur J Health Econ. 2014;15(8):885–96.
9.
Parisot JP, Thorne H, Fellowes A, Doig K, Lucas M, McNeil JJ, et al. “Cancer 2015”: a prospective, population-based cancer cohort—phase 1: feasibility of genomics-guided precision medicine in the clinic. J Personalised Med. 2015;5(4):354–69.
10.
Katz SJ. Cancer care delivery research and the National Cancer Institute SEER program challenges and opportunities. JAMA. 2015;313(2):165–73.
11.
Reichman ME, Altekruse S, Li CI, Chen VW, Deapen D, Potts M, et al. Feasibility study for collection of HER2 data by National Cancer Institute (NCI) Surveillance, Epidemiology, and End Results (SEER) Program central cancer registries. Cancer Epidemiol Biomark Prev. 2010;19(1):144–7.
13.
Marr B. Big Data: using SMART big data, analytics and metrics to make better decisions and improve performance. Chichester: John Wiley & Sons; 2015.
16.
Ward RL, Laaksonen MA, Gool K, Pearson SA, Daniels B, Bastick P, et al. Cost of cancer care for patients undergoing chemotherapy: the Elements of Cancer Care study. Asia Pac J Clin Oncol. 2015;11(2):178–86.
17.
Wong S, Fellowes A, Doig K, Ellul J, Bosma T, Irwin D, et al. Assessing the clinical value of targeted massively parallel sequencing in a longitudinal, prospective population-based study of cancer patients. Br J Cancer. 2015;112(8):1411–20.
18.
Sundararajan V, Henderson TM, Ackland M, Marshall R. Linkage of the Victorian Admitted Episodes Dataset. Symposium on health data linkage: its value for Australian health policy development and policy relevant research; 20–21 Mar 2002; Sydney.
21.
Mihaylova B, Briggs A, O’Hagan A, Thompson SG. Review of statistical methods for analysing healthcare resources and costs. Health Econ. 2011;20(8):897–916.
22.
Jones AM (2000). Health econometrics. In: Culyer AJ, Newhouse JP, editors. Handbook of health economics. Part 1. Amsterdam: North Holland; 2001. p. 265–344.
23.
Wong SQ, Li J, Tan AY, Vedururu R, Pang J-MB, Do H, et al. Sequence artefacts in a prospective series of formalin-fixed tumours tested for mutations in hotspot regions by massively parallel sequencing. BMC Med Genomics. 2014;7(1):23.
24.
Ellis RP, Fiebig DG, Johar M, Jones G, Savage E. Explaining health care expenditure variation: large-sample evidence using linked survey and health administrative data. Health Econ. 2013;22(9):1093–110.
25.
Medeiros BC, Satram-Hoang S, Hurst D, Hoang KQ, Momin F, Reyes C. Big data analysis of treatment patterns and outcomes among elderly acute myeloid leukemia patients in the United States. Ann Hematol. 2015;94(7):1127–38.
26.
Lorgelly P, Knott R, Doble B, Harris M (2015). Modelling the cost of cancer: a system of equations approach to understanding inter-relationships. In: Health Economists Study Group, 22–24 June 2015. Lancaster University, UK.
27.
Saleema J, Shenoy PD, Venugopal K, Patnaik L. Cancer prognosis prediction model using data mining techniques. Data Min Knowl Eng. 2014;6(1):21–9.
28.
Al-Bahrani R, Agrawal A, Choudhary A. Colon cancer survival prediction using ensemble data mining on SEER data. 2013 IEEE International Conference on Big Data; 6–9 Oct 2013; Silicon Valley.
29.
Crown WH. Potential application of machine learning in health outcomes research and some statistical cautions. Value Health. 2015;18(2):137–40.
31.
Blakely T, Atkinson J, Kvizhinadze G, Wilson N, Davies A, Clarke P. Patterns of cancer care costs in a country with detailed individual data. Med Care. 2015;53(4):302–9.
Metadata
Title
Realising the Value of Linked Data to Health Economic Analyses of Cancer Care: A Case Study of Cancer 2015
Authors
Paula K. Lorgelly
Brett Doble
Rachel J. Knott
The Cancer 2015 Investigators
Publication date
01 February 2016
Publisher
Springer International Publishing
Published in
PharmacoEconomics / Issue 2/2016
Print ISSN: 1170-7690
Electronic ISSN: 1179-2027
DOI
https://doi.org/10.1007/s40273-015-0343-2
