Published in: Maternal and Child Health Journal 3/2024

26 December 2023

Using Natural Language Processing to Identify Stigmatizing Language in Labor and Birth Clinical Notes

Written by: Veronica Barcelona, Danielle Scharp, Hans Moen, Anahita Davoudi, Betina R. Idnay, Kenrick Cato, Maxim Topaz


Abstract

Introduction

Stigma and bias related to race and other minoritized statuses may underlie disparities in pregnancy and birth outcomes. One emerging method to identify bias is the study of stigmatizing language in the electronic health record. Our objective was to develop natural language processing (NLP) methods that automatically and accurately identify two types of stigmatizing language in labor and birth notes: marginalizing language and its complement, power/privilege language.

Methods

We analyzed notes for all birthing people > 20 weeks’ gestation admitted for labor and birth at two hospitals during 2017. We preprocessed the text, used term frequency-inverse document frequency (TF-IDF) values as inputs, and tested machine learning classification algorithms to identify stigmatizing and power/privilege language in clinical notes. The algorithms assessed were Decision Trees, Random Forest, and Support Vector Machines. We also applied a feature importance evaluation method (InfoGain) to identify the words most strongly associated with these language categories.
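The two text-analysis steps named above, TF-IDF weighting and information gain, can be illustrated with a minimal sketch. This is not the study's pipeline: the toy notes, the labels, the whitespace tokenizer, and the function names `tfidf` and `info_gain` are all assumptions made for illustration, and the abstract does not state which TF-IDF variant was used.

```python
import math
from collections import Counter

# Hypothetical toy notes and labels (1 = flagged category, 0 = not).
# These are invented for illustration and are not study data.
notes = [
    "patient refuses exam and is noncompliant",
    "patient resting comfortably after exam",
    "patient noncompliant with medication",
    "patient tolerating diet and resting",
]
labels = [1, 0, 1, 0]


def tokenize(text):
    """Lowercase and split on whitespace (an assumed, very simple tokenizer)."""
    return text.lower().split()


def tfidf(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append({
            term: (count / len(toks)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


def entropy(ys):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(ys)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(ys).values())


def info_gain(term, docs, ys):
    """Reduction in label entropy from knowing whether `term` occurs in a note."""
    present = [y for d, y in zip(docs, ys) if term in tokenize(d)]
    absent = [y for d, y in zip(docs, ys) if term not in tokenize(d)]
    n = len(ys)
    conditional = (len(present) / n) * entropy(present) \
        + (len(absent) / n) * entropy(absent)
    return entropy(ys) - conditional


vectors = tfidf(notes)
vocabulary = sorted({t for d in notes for t in tokenize(d)})
ranked = sorted(vocabulary, key=lambda t: info_gain(t, notes, labels), reverse=True)
```

In this toy data, a word that appears in every note ("patient") receives a TF-IDF weight of zero and an information gain of zero, while a word that appears only in flagged notes ("noncompliant") ranks first. The resulting vectors would then be passed to a classifier such as a decision tree or support vector machine.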

Results

For marginalizing language, Decision Trees yielded the best classification, with an F-score of 0.73. For power/privilege language, Support Vector Machines performed best, with an F-score of 0.91. These results indicate that the selected machine learning methods can classify both language categories in clinical notes.
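For readers unfamiliar with the metric, an F-score combines precision and recall into one number. The sketch below computes the balanced F1 variant for a single positive class. The abstract does not specify which F-score variant or averaging scheme was reported, so F1 is an assumption here, and the function name `f_score` and the example labels are hypothetical.

```python
def f_score(y_true, y_pred, positive=1):
    """Balanced F-score (F1): harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels: 1 = note contains the language category, 0 = it does not.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]
score = f_score(y_true, y_pred)
```

Here two of three predicted positives are correct and two of three true positives are found, so precision and recall are both 2/3 and the F1 score is also 2/3.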

Conclusion

We identified well-performing machine learning methods to automatically detect stigmatizing language in clinical notes. To our knowledge, this is the first study to use NLP performance metrics to evaluate how well machine learning methods discern stigmatizing language. Future studies should further refine and evaluate NLP methods, including recent deep learning algorithms.
Metadata
Publisher
Springer US
Print ISSN: 1092-7875
Electronic ISSN: 1573-6628
DOI
https://doi.org/10.1007/s10995-023-03857-4
