Published in: Clinical Rheumatology, Issue 1/2021

5 June 2020 | Review Article

The basics of data, big data, and machine learning in clinical practice

Authors: David Soriano-Valdez, Ingris Pelaez-Ballestas, Amaranta Manrique de Lara, Alfonso Gastelum-Strozzi


Abstract

Health informatics and biomedical computing have introduced computational methods for analyzing clinical information and have provided tools that assist clinicians in the diagnosis and treatment of diverse clinical conditions. Given the amount of information that can be obtained in the healthcare setting, new methods to acquire, organize, and analyze data are continually being developed, including new applications of big data and machine learning. In this review, we first present the most basic concepts in data science, including the structural hierarchy of information and how it is managed. One section discusses topics relevant to data acquisition, in particular the availability and use of online resources such as survey software and cloud computing services. Together with digital datasets, these tools make it possible to create more diverse models and facilitate collaboration. We then describe the machine learning concepts and techniques used to process and analyze health data, especially those most widely applied in rheumatology. Overall, the objective of this review is to help readers understand how data science is used in health, with special emphasis on its relevance to rheumatology. It provides clinicians with basic tools for approaching and understanding the trends in health informatics analysis currently used in rheumatology practice. Clinicians who understand the potential uses and limitations of health informatics will be better placed to take part in interdisciplinary conversations and ongoing projects involving data, big data, and machine learning.
Metadata
Title: The basics of data, big data, and machine learning in clinical practice
Authors: David Soriano-Valdez, Ingris Pelaez-Ballestas, Amaranta Manrique de Lara, Alfonso Gastelum-Strozzi
Publication date: 5 June 2020
Publisher: Springer International Publishing
Published in: Clinical Rheumatology, Issue 1/2021
Print ISSN: 0770-3198
Electronic ISSN: 1434-9949
DOI: https://doi.org/10.1007/s10067-020-05196-z
