Abstracts from EHRCON25—openEHR International Conference 2025

  • Open Access
  • 01.04.2026
  • Meeting Abstracts

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplement to BMC Proceedings

Foreword
This supplement to BMC Proceedings brings together 13 peer-reviewed abstracts accepted to the scientific track of EHRCON25, held in Barcelona in October 2025. Spanning both short papers and poster presentations, this collection reflects some of the incredible work currently being undertaken across the openEHR and wider digital health communities.
The abstracts represent a significant investment of time and expertise by researchers working at the intersection of health informatics and technology. Their contributions demonstrate not only technical innovation but also a strong commitment to advancing interoperable, clinically meaningful, and sustainable health information systems. We congratulate all authors whose work has been selected for publication in this supplement.
Equally, this supplement would not have been possible without the commitment and rigour of our dedicated team of peer reviewers. Through careful and constructive review, they have helped the authors to ensure the scientific quality, clarity and relevance of the accepted abstracts. This work, which was undertaken alongside demanding professional and academic commitments, has been essential to the success of the scientific track and is gratefully acknowledged.
Together, these abstracts provide a valuable addition to EHRCON25. Not only do these contributions enrich the conference programme, they also support the ongoing promotion of knowledge exchange, whilst capturing the continuous evolution of research, practice and collaboration within the openEHR ecosystem.
We warmly thank all authors and reviewers for their commitment and contributions, and are pleased to present this supplement as part of the enduring academic and professional output of EHRCON25.

A1 RAG on openEHR: narrative-to-structure semantic binding

Matic Bernik, Robert Tovornik (robert.tovornik@better.care), Borut Fabjan (borut.fabjan@better.care)

Better Ltd, Ljubljana, Slovenia

Correspondence: Matic Bernik (matic.bernik@better.care)
BMC Proceedings 2026, 20(12):A1
Background
openEHR archetypes provide a rich semantic foundation for structured clinical data [1,2], yet most clinical documentation and modelling tasks begin with narrative text. Bridging the semantic gap between free-text clinical expressions and formal archetype structures remains a major challenge for scalable semantic interoperability and clinical knowledge engineering. Existing tools such as the openEHR Clinical Knowledge Manager (CKM) Resource Finder [3] rely primarily on keyword-based search and struggle with synonymy, abbreviations, and contextual language commonly used in clinical narratives.
Materials and methods
We developed a Retrieval-Augmented Generation (RAG) pipeline that maps narrative clinical input to candidate openEHR archetypes and their internal data elements. The approach combines clinical entity extraction with hybrid lexical–semantic retrieval over an attribute-weighted index of archetypes sourced from CKM. Archetypes are decomposed into granular semantic units and indexed using BM25-based keyword search [5] alongside neural vector embeddings. Attribute-level weighting prioritises clinically meaningful metadata, enabling context-aware archetype and data element recommendations while reducing reliance on manual archetype exploration.
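The hybrid lexical–semantic retrieval step can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the three toy index entries, the field weights, the token-repetition trick used for attribute weighting, the character-trigram vectors standing in for neural embeddings, and the min-max score fusion controlled by `alpha`. It is not the authors' pipeline.

```python
import math
import re
from collections import Counter

# Toy index: each entry stands in for one semantic unit of an archetype.
# IDs, field names and weights are illustrative assumptions.
UNITS = {
    "blood_pressure": {
        "name": "blood pressure",
        "description": "systolic and diastolic arterial pressure measurement",
    },
    "body_temperature": {
        "name": "body temperature",
        "description": "measured temperature of the body, fever",
    },
    "pulse": {
        "name": "pulse heart beat",
        "description": "rate and rhythm of the heart beat",
    },
}
FIELD_WEIGHTS = {"name": 3, "description": 1}  # hypothetical attribute weights


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def weighted_tokens(unit):
    # Attribute weighting: tokens from heavier fields are repeated.
    out = []
    for field, weight in FIELD_WEIGHTS.items():
        out.extend(tokenize(unit[field]) * weight)
    return out


DOCS = {uid: weighted_tokens(unit) for uid, unit in UNITS.items()}
AVG_LEN = sum(len(doc) for doc in DOCS.values()) / len(DOCS)


def bm25(query, doc, k1=1.5, b=0.75):
    # Okapi BM25 over the weighted token lists.
    tf = Counter(doc)
    score = 0.0
    for term in set(tokenize(query)):
        df = sum(1 for d in DOCS.values() if term in d)
        if df == 0 or tf[term] == 0:
            continue
        idf = math.log(1 + (len(DOCS) - df + 0.5) / (df + 0.5))
        norm = tf[term] + k1 * (1 - b + b * len(doc) / AVG_LEN)
        score += idf * tf[term] * (k1 + 1) / norm
    return score


def embed(text):
    # Stand-in for a neural embedding model: character-trigram counts.
    padded = f"  {text.lower()}  "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a, b):
    dot = sum(a[key] * b[key] for key in a if key in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def min_max(scores):
    low, high = min(scores.values()), max(scores.values())
    if high == low:
        return {key: 0.0 for key in scores}
    return {key: (value - low) / (high - low) for key, value in scores.items()}


def hybrid_search(query, alpha=0.5):
    # alpha balances lexical (BM25) against semantic (cosine) evidence.
    lexical = min_max({uid: bm25(query, doc) for uid, doc in DOCS.items()})
    query_vec = embed(query)
    semantic = min_max({
        uid: cosine(query_vec, embed(" ".join(unit.values())))
        for uid, unit in UNITS.items()
    })
    fused = {uid: alpha * lexical[uid] + (1 - alpha) * semantic[uid] for uid in UNITS}
    return sorted(fused, key=fused.get, reverse=True)


ranking = hybrid_search("patient has fever, temp 38.5")
```

For the narrative query above, `ranking[0]` is `"body_temperature"`: the keyword "fever" matches lexically, while the abbreviation "temp" only contributes through the trigram similarity, which is the kind of case pure keyword search tends to miss.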
Results
The pipeline was evaluated on 800 narrative clinical queries derived from documentation and production systems, each mapped to a gold-standard set of CKM archetypes. Compared with CKM Resource Finder and zero-shot GPT-4.1 prompting, the proposed approach achieved substantially higher archetype recall (0.73 vs. 0.53 and 0.44, respectively) and improved precision. These results demonstrate that hybrid retrieval [4] combined with fine-grained indexing more reliably identifies semantically relevant archetypes within the top-ranked results, supporting practical clinical modelling workflows and reducing the need for manual archetype exploration in tools such as Archetype Designer [6].
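For readers unfamiliar with the metrics, a minimal sketch of set-based precision and recall against a gold-standard archetype set follows. The abstract does not state the rank cut-off or the averaging scheme, so the top-k cut-off and the macro-average used here are assumptions, and the example ranking and gold set are made up.

```python
def precision_recall_at_k(ranked, gold, k):
    """Precision and recall of the top-k ranked archetypes against a gold set."""
    top_k = ranked[:k]
    hits = len(set(top_k) & set(gold))
    precision = hits / len(top_k) if top_k else 0.0
    recall = hits / len(gold) if gold else 0.0
    return precision, recall


def mean_recall_at_k(results, k):
    """Macro-average recall over (ranked, gold) pairs, one pair per query."""
    recalls = [precision_recall_at_k(ranked, gold, k)[1] for ranked, gold in results]
    return sum(recalls) / len(recalls) if recalls else 0.0


# Illustrative query result; the ranking and gold set are made up.
ranked = [
    "openEHR-EHR-OBSERVATION.body_temperature.v2",
    "openEHR-EHR-OBSERVATION.pulse.v2",
    "openEHR-EHR-OBSERVATION.blood_pressure.v2",
]
gold = {
    "openEHR-EHR-OBSERVATION.body_temperature.v2",
    "openEHR-EHR-OBSERVATION.blood_pressure.v2",
}
precision, recall = precision_recall_at_k(ranked, gold, k=2)
```

With a cut-off of two, one of the two gold archetypes is retrieved, so both precision and recall are 0.5 for this query.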
Conclusions
Hybrid lexical–semantic retrieval with Retrieval-Augmented Generation provides an effective mechanism for binding narrative clinical language to structured openEHR archetypes. By improving recall and precision over existing tools, the proposed pipeline accelerates clinical modelling, template design, and semantic validation while maintaining consistency across openEHR implementations. This approach supports scalable clinical knowledge engineering and reduces the manual effort required to translate narrative intent into formal clinical models.
References
1. Haarbrandt B, et al. Automated transformation of openEHR data instances to OWL. Stud Health Technol Inform. 2016;228:63–67.
2. openEHR Specifications Program. openEHR architecture overview. Available: https://specifications.openehr.org/releases/BASE/latest/architecture_overview.html
3. openEHR International. Clinical Knowledge Manager (CKM). Available: https://ckm.openehr.org/ckm/
4. OpenSearch Project. Hybrid search. OpenSearch documentation. Available: https://docs.opensearch.org/latest/vector-search/ai-search/hybrid-search/
5. Robertson S, Zaragoza H. The probabilistic relevance framework: BM25 and beyond. Found Trends Inf Retr. 2009;3(4):333–389.
6. Better Ltd, openEHR International. Archetype Designer. Available: https://tools.openehr.org/designer/

A2 openEHR for research cohorts data persistence: the BBMRI-ERIC CRC-cohort use case

Giovanni Delussu1, Cecilia Mascia1, Vittorio Meloni1, Mauro Del Rio1, Petr Holub2, Eva García3, Francesca Frexia1

1CRS4—Center for Advanced Studies, Research and Development in Sardinia, Pula, Italy; 2Masaryk University, Brno, Czech Republic; 3BBMRI-ERIC, Graz, Austria

Correspondence: Giovanni Delussu (giovanni.delussu@crs4.it)
BMC Proceedings 2026, 20(12):A2
Background: Modern biomedical research depends on large, high-quality datasets for meaningful insights [1,2], and therefore requires rigorous data curation as a critical prerequisite for successful analyses [3]. From this perspective, the European Research Infrastructure Consortium for Biobanking and BioMolecular Resources (BBMRI-ERIC) [4] plays a pivotal role by providing researchers access to European-wide biobanking resources through a centralized infrastructure that facilitates the findability and reuse of biological samples and associated data [5]. To improve the long-term value of one of the key BBMRI-ERIC assets, the ColoRectal Cancer Cohort (CRC-Cohort) [6], a collection of over 10,000 cases from 26 European biobanks, see Fig. 1, openEHR [7] was selected as the standardization framework within an H2020 EOSC-Life Project Demonstrator [8].
Methods: The original CRC-Cohort data, stored in a PostgreSQL database based on the initial data model (see Appendix A in Ref. [9]), were first reverse-engineered into the XML format adopted for data collection. We then mapped these data concepts to openEHR archetypes from the Clinical Knowledge Manager (CKM) [10]. A specialized openEHR template [11] was developed, referencing the original XML structure to ensure traceability and to handle multiple data occurrences (e.g., surgeries). Following rigorous cleansing, the data were used to populate openEHR compositions, which were then stored in an EHRbase [12] server. The whole process is shown in Fig. 1.
Finally, a generic export tool based on the Minimum Information About BIobank data Sharing (MIABIS) [13] model was created to retrieve data via Archetype Query Language (AQL) and convert them into the standard formats supported within the BBMRI-ERIC infrastructure: Health Level 7 Fast Healthcare Interoperability Resources (HL7 FHIR) [14] and the Observational Medical Outcomes Partnership Common Data Model (OMOP CDM) [15]. Where possible, nodes were linked to the OMOP standardized vocabulary.
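The step in which XML records with repeated elements (such as surgeries) populate compositions can be sketched as follows. This is only an illustration under stated assumptions: the XML layout, the template path `crc_cohort/surgery`, the field names, and the `:index` suffix for repeated elements (in the style of openEHR flat formats) are all hypothetical and are not taken from the CRC-Cohort models.

```python
import xml.etree.ElementTree as ET

# Made-up record in a made-up XML layout; not the CRC-Cohort schema.
SAMPLE_XML = """
<patient id="P001">
  <surgery><type>Right hemicolectomy</type><date>2015-03-02</date></surgery>
  <surgery><type>Liver resection</type><date>2016-09-14</date></surgery>
</patient>
"""

TEMPLATE_PREFIX = "crc_cohort/surgery"  # hypothetical template path
FIELD_PATHS = {"type": "procedure_name", "date": "time"}  # hypothetical fields


def xml_to_flat(xml_text):
    """Map repeated <surgery> elements to indexed flat paths, with a trace."""
    root = ET.fromstring(xml_text)
    flat, trace = {}, {}
    for index, surgery in enumerate(root.findall("surgery")):
        for child in surgery:
            suffix = FIELD_PATHS.get(child.tag)
            if suffix is None:
                continue  # unmapped elements are skipped in this sketch
            path = f"{TEMPLATE_PREFIX}:{index}/{suffix}"
            flat[path] = child.text
            # Record where each value came from, for traceability.
            trace[path] = f"/patient/surgery[{index + 1}]/{child.tag}"
    return flat, trace


flat, trace = xml_to_flat(SAMPLE_XML)
```

Each occurrence receives its own index, so the second surgery ends up under `crc_cohort/surgery:1/...`, and `trace` keeps a pointer back to the source element for every populated path.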
Results: The entire CRC-Cohort dataset was successfully transformed and stored in openEHR format, enabling standardized querying via AQL. The resulting template [11], shown in Fig. 2, includes 24 existing CKM archetypes with minimal adaptation.
The export functionality to FHIR and OMOP CDM was validated, see Fig. 3. This process maintained semantic integrity while enhancing data quality through systematic cleansing. All software components, models, and documentation were released as open source [11,16,17].
Conclusions: This proof of concept demonstrates that openEHR can effectively represent and persist large-scale biobanking datasets in an interoperable, semantically rich format, while advancing adherence to FAIR Principles [18]. Future work may aim to incorporate genetic data into the CRC-Cohort by leveraging openEHR genomic archetypes [19] for personalized medicine initiatives.
Acknowledgments
This work was supported by: the H2020 EOSC-Life (grant number 824087) WP1 Demonstrator “Cloudification of BBMRI-ERIC CRC-Cohort and its Digital Pathology Imaging” (APPID 1228); the BBMRI-ERIC Common Service IT; the Sardinian Regional Authority [projects: XDATA; ToPMa, grant ID: RC CRP 077].
References
1. Eisinger-Mathason TSK, Leshin J, Lahoti V, Fridsma DB, Mucaj V, Kho AN. Data linkage multiplies research insights across diverse healthcare sectors. Commun Med (Lond). 2025 Mar 4;5(1):58. https://doi.org/10.1038/s43856-025-00769-y.
2. Luo J, Wu M, Gopukumar D, Zhao Y. Big Data Application in Biomedical Research and Health Care: A Literature Review. Biomed Inform Insights. 2016 Jan 19;8:1–10. https://doi.org/10.4137/BII.S31559.
3. Bernardi FA, Alves D, Crepaldi N, Yamada DB, Lima VC, Rijo R. Data Quality in Health Research: Integrative Literature Review. J Med Internet Res. 2023 Oct 31;25:e41446. https://doi.org/10.2196/41446.
4. Biobanking and Biomolecular Resources Research Infrastructure, European Research Infrastructure Consortium. www.bbmri-eric.eu.
5. Litton JE. Launch of an Infrastructure for Health Research: BBMRI-ERIC. Biopreserv Biobank. 2018 Jun;16(3):233–241. https://doi.org/10.1089/bio.2018.0027.
7.
8. H2020 ADOPT Project: implementAtion anD OPeration of the gateway for healTh into BBMRI-ERIC. https://cordis.europa.eu/project/id/676550.
10. openEHR Clinical Knowledge Manager. https://ckm.openehr.org/ckm.
11. openEHR models for the CRC-Cohort. https://github.com/crs4/crc_cohort_modelling.
12.
13. Minimum Information About BIobank data Sharing (MIABIS). https://www.bbmri-eric.eu/howtomiabis.
14.
17. BBMRI-ERIC Federated Platform ETL Tool. https://github.com/crs4/bbmri-fpetl.
18. Wilkinson MD, et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci Data. 2016 Mar 15;3:160018. https://doi.org/10.1038/s41597-019-0009-6.
19. Mascia C, Frexia F, Uva P, Zanetti G, Pireddu L, et al. The openEHR Genomics Project. Stud Health Technol Inform. 2020 Jun 16;270:443–447. https://doi.org/10.3233/SHTI200199.

Fig. 1 (Abstract A2)
The whole process from the database input to the openEHR server
Fig. 2 (Abstract A2)
The CRC-Cohort template. The structure (a) reflects the data and is organized in 8 sections. The use case specialization is obtained through: (b) renaming or hiding of unrequired nodes; (c) constraints on occurrences and datatypes, term set definition. Annotations are added (d) to map with the original model
Fig. 3 (Abstract A2)
openEHR data use. The data were extracted and converted to FHIR and OMOP formats, ready for the BBMRI-ERIC Federated Platform

A3 Converging the WHO Digital Adaptation Kit (DAK) for antenatal care and openEHR for a universal antenatal care dataset

Nicola P Ewen Hall1, Khin T Aung2, Keisha S Barwise1, Heather Leslie3

1Ministry of Health and Wellness, Kingston, Jamaica; 2Independent Consultant; 3Atomica Informatics, Melbourne, Australia

Correspondence: Nicola P Ewen Hall (evvenlife@gmail.com)
BMC Proceedings 2026, 20(12):A3
Background
The World Health Organisation (WHO) developed the Digital Adaptation Kit (DAK) for Antenatal Care (ANC) to assist countries to implement WHO recommendations in digital systems [1, 2]. openEHR enables the standardized representation of data elements in a computable format [2].
The openEHR modelling approach can provide a clinically rich semantic data model for ANC, so that clinical data are structured and interpreted consistently across systems and platforms.
Materials and Methods
The core data dictionary of the DAK was utilised for the mapping exercise [1]. The mapping exercise was completed using direct mapping of the data elements utilising the input method, data type, input options and collection frequency specified in the Excel® document. The direct mapping approach was chosen to maintain fidelity to the original DAK structure.
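A direct mapping of this kind can be pictured as a lookup from each data dictionary row to an openEHR data type. The sketch below is only an illustration: the row keys (`label`, `data_type`, `input_options`), the example rows, and the type correspondences are assumptions, not the mapping table used in the study. The `DV_*` names are openEHR reference-model data types.

```python
# Hypothetical lookup from data dictionary types to openEHR data types.
TYPE_MAP = {
    "Boolean": "DV_BOOLEAN",
    "String": "DV_TEXT",
    "Date": "DV_DATE",
    "DateTime": "DV_DATE_TIME",
    "Integer": "DV_COUNT",
    "Quantity": "DV_QUANTITY",
    "Coding": "DV_CODED_TEXT",
}


def map_element(row):
    """Map one data dictionary row; unknown types are flagged for review."""
    rm_type = TYPE_MAP.get(row["data_type"])
    is_coded = rm_type == "DV_CODED_TEXT"
    return {
        "label": row["label"],
        "rm_type": rm_type,
        # Input options only become a value set for coded elements.
        "value_set": list(row.get("input_options", [])) if is_coded else [],
        "needs_review": rm_type is None,
    }


# Made-up rows for illustration only.
rows = [
    {"label": "Gravida", "data_type": "Integer"},
    {"label": "HIV test result", "data_type": "Coding",
     "input_options": ["Positive", "Negative", "Inconclusive"]},
    {"label": "Ultrasound image", "data_type": "Attachment"},
]
mapped = [map_element(row) for row in rows]
```

Rows whose type has no entry in the lookup are not dropped; they are marked `needs_review` so that a modeller can decide how to represent them.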
The translation of the DAK was used to develop templates that can be adapted to support the implementation of country-specific, vendor neutral ANC records.
Results
A total of eight hundred and seventy-nine (879) individual data elements were extracted from the DAK.
Sixty-two (62) archetypes were used to develop the template for the ANC record. Of these, thirty-six (36) were published, twenty-two (22) were drafts, two (2) were initial or pre-draft, and two (2) were under review. Two (2) new archetypes were proposed from the mapping exercise: a Birth Plan Preference archetype and an Audit Administrative archetype.
The mapping to the non-openEHR source was achieved almost entirely through reuse of the existing archetype library (Fig. 1).
Conclusions
Although openEHR archetypes can be used to map the DAK with a high level of accuracy and completeness, the output is limited in its utility as a primary patient record and in providing persistent information for clinical decision making.
Ninety-eight percent (98%) of the data elements listed in the DAK could be mapped to existing archetypes in the CKM. This strongly demonstrates both the maturity and the generalizability of the archetypes.
A second mapping exercise will be completed for semantic optimisation of the ANC templates.
References
1. World Health Organization. WHO digital adaptation kit for antenatal care: operational requirements for implementing WHO recommendations in digital systems. Geneva: World Health Organization; 2021. https://www.who.int/publications/i/item/9789240020306
2. Min L, Tian Q, Lu X, An J, Duan H. An openEHR based approach to improve the semantic interoperability of clinical data registry. BMC Med Inform Decis Mak. 2018;18:15. https://doi.org/10.1186/s12911-018-0596-8
Fig. 1 (Abstract A3)
The workflow for mapping the data elements in the DAK using openEHR standards

A4 Isala on EHR: openEHR for secondary data use in radiotherapy

Sophie de Klerk1, William Mensen1, Heino Bosma1, Lars Kuizenga1, Rogier Janssen2, Djoeri Lipman2, Rix Groenboom1

1Research group Digital Transformation, Hanze University of Applied Sciences, Groningen, The Netherlands; 2Isala Oncologisch Centrum, Isala Hospital, Zwolle, The Netherlands

Correspondence: Sophie de Klerk (s.de.klerk@pl.hanze.nl)
BMC Proceedings 2026, 20(12):A4
Background
The increasing demand for structured high-quality clinical data for AI and decision support systems highlights the shortcomings of current healthcare data infrastructure, particularly its misalignment with FAIR [1] data principles (Findable, Accessible, Interoperable, Reusable). In radiotherapy, clinical information is fragmented across systems, each with its own data model. This heterogeneity limits secondary data use for applications like research, business intelligence, and AI. Although existing standards like DICOM [2] provide extensive technical imaging and treatment delivery persistence capabilities, the clinical context (e.g. diagnosis, toxicity, treatment intention) is not covered. openEHR has emerged as a promising solution, offering semantic consistency through archetype-based modeling and growing interoperability with standards like OMOP and FHIR [3]. This study aimed to evaluate openEHR as a unified data platform for radiotherapy at Isala.
Materials and Methods
An open-source infrastructure was implemented using EHRbase (openEHR backend), Keycloak (authentication), and Snowstorm (terminology server), all deployed with Docker (Fig. 1). A Python-based ETL (extract, transform, load) pipeline was developed to extract data from Mosaiq using SQL queries and transform them into openEHR Flat JSON, using mock patient data for validation. Clinical modeling was informed by the international CKM (Clinical Knowledge Manager), the available literature [4–7], consultations with domain experts, and collaborative workshops at Isala.
Results
The EHRbase/Keycloak/Snowstorm stack provided a functional prototype for FAIR-compliant data persistence, though setup required considerable technical effort due to limited documentation. The ETL process exposed the institution-specific nature of radiotherapy data, which required institute-specific mappings and limited reusability. Mapping codes proved difficult due to the absence of 1:1 mappings, as one ICD-10 code may correspond to multiple SNOMED CT concepts. Existing openEHR archetypes did not fully cover the project requirements, particularly in representing adaptive treatments, nested targets, and dose-volume relationships. A preliminary template based on existing “Procedure” and “Irradiation” archetypes was developed, and a custom archetype was created to address data model gaps.
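The one-to-many mapping problem can be made concrete with a small sketch that separates source codes with a single target from those that need a manual decision. The mapping rows are invented for illustration: the ICD-10 codes are real, but the target identifiers are placeholders rather than actual SNOMED CT concepts, and the function names are hypothetical.

```python
from collections import defaultdict

# Illustrative mapping rows (ICD-10 code, target concept).
# Targets are placeholders, not real SNOMED CT identifiers.
ROWS = [
    ("C61", "SCT-PLACEHOLDER-1"),
    ("C50.9", "SCT-PLACEHOLDER-2"),
    ("C50.9", "SCT-PLACEHOLDER-3"),
]


def group_mappings(rows):
    """Collect all distinct targets per source code, preserving order."""
    targets = defaultdict(list)
    for source, target in rows:
        if target not in targets[source]:
            targets[source].append(target)
    return dict(targets)


def split_by_cardinality(mappings):
    """Separate 1:1 mappings from 1:N mappings that need manual review."""
    unambiguous = {s: t[0] for s, t in mappings.items() if len(t) == 1}
    ambiguous = {s: t for s, t in mappings.items() if len(t) > 1}
    return unambiguous, ambiguous


unambiguous, ambiguous = split_by_cardinality(group_mappings(ROWS))
```

Only the unambiguous codes can be translated automatically in an ETL step; codes in `ambiguous` need additional clinical context or an institution-level decision to select one target.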
Conclusions
openEHR is a viable solution for structuring radiotherapy data for secondary use, but current archetypes fall short in modeling its full complexity. The creation and validation of a common data model with other institutions and international consensus is essential for standardization and broader adoption. Even though few vendors are active in the domain, institution-specific models require the creation of mappings per institute. While the technical foundation is in place, future work must assess usability by clinicians, researchers, and data scientists to fully validate the data platform.
Acknowledgments
We would like to thank Chris Ootes for the technical support and access to infrastructure provided during this project.
References
1. Wilkinson MD, Dumontier M, Aalbersberg IjJ, Appleton G, Axton M, Baak A, et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci Data. 2016 Mar 15;3(1):160018.
2. Mustra M, Delac K, Grgic M. Overview of the DICOM standard. In: 2008 50th International Symposium ELMAR [Internet]. 2008 [cited 2025 Dec 18]. p. 39–44. Available from: https://ieeexplore.ieee.org/abstract/document/4747434
3. Kohler S, Boscá D, Kärcher F, Haarbrandt B, Prinz M, Marschollek M, et al. Eos and OMOCL: Towards a seamless integration of openEHR records into the OMOP Common Data Model. J Biomed Inform. 2023 Aug 1;144:104437.
4. Hayman JA, Dekker A, Feng M, Keole SR, McNutt TR, Machtay M, et al. Minimum Data Elements for Radiation Oncology: An American Society for Radiation Oncology Consensus Paper. Pract Radiat Oncol. 2019 Nov 1;9(6):395–401.
5. Christodouleas JP, Anderson N, Gabriel P, Greene R, Hahn C, Kessler S, et al. A Multidisciplinary Consensus Recommendation on a Synoptic Radiation Treatment Summary: A Commission on Cancer Workgroup Report. Pract Radiat Oncol. 2020 Nov 1;10(6):389–401.
6. ASTRO [Internet]. [cited 2025 Aug 28]. Guidelines: American Society for Radiation Oncology (ASTRO). Available from: https://www.astro.org/provider-resources/guidelines/clinical-practiceguidelines
7. ESTRO [Internet]. [cited 2025 Aug 28]. Guidelines: European Society for Radiotherapy and Oncology. Available from: https://www.estro.org/Science/Guidelines
Fig. 1 (Abstract A4)
Pilot server architecture based on EHRbase (openEHR backend), Keycloak (authentication), and Snowstorm (terminology server)

A5 Development of in-house tooling as support for the openEHR workflow

Martin A Koch, Ana Pascual Segura, Anna de la Torre Suñé, Clara Calleja Vega, David Alonso Torrella, David Hernandez Rodriguez, Hugo Briceño Garcia, Laura Moral Lopez, Tara Bonet Chinillach, Lluis Valle Martín, Jordi Piera Jiménez

Catalan Healthcare Service, CatSalut, Barcelona, Spain

Correspondence: Martin A Koch (martinandreaskoch@catsalut.cat)
BMC Proceedings 2026, 20(12):A5
Background
Developing and maintaining openEHR archetypes and templates is complex and involves multiple development and publication platforms. The interplay between tools like Better Archetype Designer (AD), Clinical Knowledge Manager (CKM), and Clinical Data Repositories (CDR) creates significant challenges in version management and template validation. To streamline these processes, our team has developed a suite of in-house applications tailored to our workflow.
Material and methods
A bottom-up, needs-driven development strategy was applied. Pain points were identified in versioning, dependency management, and validation of openEHR archetypes and templates within the group’s workflow (Fig. 1), adapted from Moner et al. [1].
After defining requirements, targeted Python applications were developed to address each bottleneck. Validation and testing were performed to ensure that the solutions met the requirements. Software is distributed in-house as executables via a private Confluence domain. Reports are published as HTML files. Adoption and satisfaction within the modelling team were measured with an anonymous survey.
Results
The applied methodology enabled identification of the workflow steps and all platforms involved in openEHR archetype and template development. Seven relevant pain points were identified across archetype management, publication, deployment, and validation stages (Fig. 1). Targeted software solutions were developed to address these pain points, ranging from HTML data visualization to executable programs. Examples of the software have been published online, such as the CKM Visualization [2] and the AQL Manager [3]. Adoption of these solutions has been partial: Of 10 team members, 7 responded to the questionnaire. Five reported using at least one in-house software solution, expressing satisfaction and noting improved work efficacy. The two non-users indicated the tools are not currently needed for their workflow steps.
Conclusions
In-house software solutions were successfully developed to address workflow pain points in openEHR archetype and template development. Easy accessibility and readability of reports were important for adoption by the modellers. Partial adoption and positive feedback demonstrate the usefulness of these solutions in the development, management, and validation of archetypes and templates. Publishing these tools as open source should be considered to maximize their impact and benefit the wider openEHR community.
References
1. Moner D, Maldonado JA, Robles M. Archetype modeling methodology. Journal of Biomedical Informatics. 2018; 79: 71–81. https://doi.org/10.1016/j.jbi.2018.02.003
2. Koch MA. CKM Visualization. GitHub. 2025. https://github.com/martinkochdesign/CKM_content_visualization
3. Koch MA. openEHR AQL Manager. GitHub. 2025. https://github.com/martinkochdesign/openEHR_AQL_manager
Fig. 1 (Abstract A5)
Overview of the end-to-end workflow for openEHR template development: from data element collection to template deployment

A6 From archetypes to real world data: enhancing Eos and OMOCL

Severin Kohler1, Diego Boscá2, Michael Marschollek3, Roland Eils1

1Berlin Institute of Health at Charité – Universitätsmedizin Berlin, Berlin, Germany; 2Veratech for Health, Valencia, Spain; 3Peter L. Reichertz Institute for Medical Informatics of TU Braunschweig and Hannover Medical School, Hannover, Germany

Correspondence: Severin Kohler (severin.kohler@bih-charite.de)
BMC Proceedings 2026, 20(12):A6
Background
The reuse of routinely collected electronic health record (EHR) data for research depends on reliable transformation into standardized research data models. The Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) is widely adopted for observational research, but transforming data from openEHR-based systems into OMOP remains technically complex and resource-intensive. Eos and the OMOP Conversion Language (OMOCL) were previously introduced [1] to support archetype-driven transformation from openEHR to OMOP, but limitations remained regarding terminology coverage, visit construction, and mapping completeness.
Materials and methods
We extended OMOCL and the Eos Extract Transform Load (ETL) engine to address these limitations. OMOCL was enhanced with conceptMaps to enable mapping of internal openEHR archetype value sets to OMOP standard vocabulary concepts. Eos was updated to support visit occurrence generation based on Archetype Query Language (AQL), allowing visits to be derived flexibly from different openEHR implementations. In parallel, the international OMOCL mapping library was systematically expanded through a community-driven effort. The resulting mappings were evaluated to assess coverage, terminology alignment, and semantic loss during transformation.
Results
ConceptMap support enabled the transformation of internal archetype codes that are not typically annotated in international terminologies, improving semantic preservation for observations and measurements. ConceptMaps were added to existing mappings where possible. AQL-based visit generation allowed visits to span full healthcare encounters rather than individual compositions, increasing adaptability across openEHR platforms. The mapping library was expanded to cover 196 published archetypes. Evaluation showed that 8.65% of core OMOP concept fields could not be mapped to a valid standard concept and were assigned the concept ID 0, reflecting gaps in OMOP-supported vocabularies. Structural differences between openEHR and OMOP led to partial loss of contextual information and qualifiers, particularly where multiple clinical attributes must be collapsed into single OMOP fields.
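The idea behind the concept maps and the zero fallback can be illustrated with a plain lookup. This is not OMOCL syntax: the dictionary, the archetype-internal codes, and the target concept IDs below are placeholders chosen for illustration. The only convention taken as given is that OMOP uses concept ID 0 when no standard concept matches.

```python
# Hypothetical concept map from archetype-internal codes to OMOP concept IDs.
# The target IDs are placeholders, not real OMOP concepts.
CONCEPT_MAP = {
    "at0004": 1001,
    "at0005": 1002,
}
NO_MATCHING_CONCEPT = 0  # OMOP convention for unmapped values


def to_concept_id(internal_code):
    """Return the mapped concept ID, or 0 when no mapping exists."""
    return CONCEPT_MAP.get(internal_code, NO_MATCHING_CONCEPT)


def unmapped_share(internal_codes):
    """Fraction of values that fall back to concept ID 0."""
    ids = [to_concept_id(code) for code in internal_codes]
    return ids.count(NO_MATCHING_CONCEPT) / len(ids) if ids else 0.0


codes = ["at0004", "at0005", "at0099", "at0004"]
concept_ids = [to_concept_id(code) for code in codes]
share = unmapped_share(codes)
```

In this toy input one of four values has no mapping, so the unmapped share is 0.25. A coverage figure of this kind can be computed over real data in the same way, although the exact denominator used in the evaluation is not specified here.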
Conclusions
The presented enhancements substantially improve the coverage, flexibility, and semantic quality of transformations from openEHR to the OMOP CDM, lowering barriers to secondary use of openEHR data in OMOP-based research. Nevertheless, persistent limitations in terminology alignment and model expressiveness result in unavoidable semantic loss. For research requiring fine-grained clinical context, direct use of openEHR data may remain preferable. Continued alignment between openEHR and OMOP communities is essential to support high-quality reuse of clinical data.
References
1. S. Kohler, D. Boscá, F. Kärcher, B. Haarbrandt, M. Prinz, M. Marschollek, and R. Eils, “Eos and OMOCL: Towards a seamless integration of openEHR records into the OMOP Common Data Model,” Journal of Biomedical Informatics, vol. 144, p. 104437, Aug. 2023.

A7 AI health agents on mobile

Martin Korelič1, Veljko Pejović2,3 (veljko.pejovic@fri.uni-lj.si)

1Better Ltd, Ljubljana, Slovenia; 2Faculty of Computer and Information Science, University of Ljubljana, Ljubljana, Slovenia; 3Department of Computer Systems, Institute “Jožef Stefan”, Ljubljana, Slovenia

Correspondence: Martin Korelič (martin.korelic@better.care)
BMC Proceedings 2026, 20(12):A7
Background
AI health agents using Large Language Models (LLMs) have demonstrated remarkable capabilities in healthcare applications. However, concerns regarding patient confidentiality, data protection, and network dependency have created demand for privacy-preserving alternatives. While edge-based healthcare AI solutions have been deployed to address these concerns, none have integrated openEHR [1] standardized structures, leaving a critical gap given openEHR’s growing adoption as the framework for interoperable health data across healthcare systems. Recent advances in model compression and mobile hardware now enable Small Language Models (SLMs) with billions of parameters to operate on consumer smartphones, presenting a unique opportunity to combine openEHR’s vendor-neutral approach with on-device privacy preservation.
Materials and Methods
We developed the first on-device Retrieval-Augmented Generation (RAG) prototype for openEHR-based personal health data, operating entirely on a smartphone (Fig. 1). The Android application leverages the MobileTransformers framework [2] and maintains an on-device SLM, an embedding model, and an on-device vector database of personal health records, including vital signs, medications, allergies, and laboratory results. User queries trigger a vector similarity search across the indexed records, and the retrieved information serves as context for quantized SLMs to generate grounded responses.
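The retrieve-then-generate flow described above can be sketched in a few lines. This is a desktop illustration under loud assumptions, not the Android prototype: the records are made up, bag-of-words counts stand in for the on-device embedding model, a Python list stands in for the vector database, and the prompt wording is hypothetical. The SLM call itself is left out.

```python
import math
import re
from collections import Counter

# Made-up verbalised records standing in for data derived from openEHR content.
RECORDS = [
    "Medication: metformin 500 mg twice daily",
    "Allergy: penicillin, reaction rash",
    "Vital sign: blood pressure 128/82 mmHg",
    "Laboratory result: HbA1c 6.9 %",
]


def embed(text):
    # Stand-in for the on-device embedding model: bag-of-words counts.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    dot = sum(a[key] * b[key] for key in a if key in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


INDEX = [(record, embed(record)) for record in RECORDS]  # toy vector store


def retrieve(query, top_k=1):
    """Return the top_k records most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(INDEX, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [record for record, _ in ranked[:top_k]]


def build_prompt(query, top_k=1):
    """Assemble a grounded prompt that would be passed to the SLM."""
    context = "\n".join(f"- {record}" for record in retrieve(query, top_k))
    return (
        "Answer using only the health records below.\n"
        f"Records:\n{context}\n"
        f"Question: {query}\nAnswer:"
    )


prompt = build_prompt("What medication do I take?")
```

The resulting prompt contains only the medication record, so the model is asked to answer from retrieved data rather than from its parametric knowledge.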
We evaluated two SLMs: TinyLlama [3] (1.1B parameters) and Phi3-mini-4k [4] (3.5B parameters), both with INT4 quantization. System performance was measured on Google Pixel 6 (CPU). Response quality was assessed using G-Eval [5] methodology with LLM-as-a-judge (Gemini-2.5-pro) [6], evaluating clinical quality and faithfulness against cloud-based LLM responses. We tested both document-level and chunked retrieval strategies across simple and complex query categories.
Results
The larger SLM (Phi3-mini-4k) achieved higher scores on both clinical quality and faithfulness dimensions, demonstrating greater alignment with cloud-based LLM responses. However, SLMs exhibited context prioritization limitations, occasionally omitting critical clinical details such as emergency management information, as illustrated in the side-by-side comparison between SLM and cloud LLM outputs (Fig. 2).
System performance metrics (Table 1) confirm viability on consumer devices, with TinyLlama requiring 0.94 GB memory at 9.04 tokens/second, while Phi3-mini-4k required 2.7 GB at 3.6 tokens/second.
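To put these figures into perspective, a rough response-time estimate can be derived from the reported time to first token and generation speed. The sketch assumes that generation speed stays constant over the response and that model loading, embedding, and database lookup are excluded; the 100-token response length is an arbitrary example.

```python
# Time to first token (s) and generation speed (tokens/s) as reported in Table 1.
METRICS = {
    "TinyLlama": {"ttft_s": 5.28, "gen_tok_per_s": 9.04},
    "Phi3-mini-4k": {"ttft_s": 11.14, "gen_tok_per_s": 3.6},
}


def estimated_response_time(model, n_tokens):
    """Rough estimate: time to first token plus constant-rate generation."""
    metrics = METRICS[model]
    return metrics["ttft_s"] + n_tokens / metrics["gen_tok_per_s"]


for name in METRICS:
    seconds = estimated_response_time(name, n_tokens=100)
    print(f"{name}: about {seconds:.1f} s for a 100-token answer")
```

Under these assumptions a 100-token answer takes roughly 16 s with TinyLlama and roughly 39 s with Phi3-mini-4k, which illustrates the trade-off between response quality and latency on the device.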
Conclusions
Our work demonstrates feasibility for deploying privacy-preserving health AI agents entirely on mobile devices using standardized openEHR data. While SLMs show promising performance, domain-specific fine-tuning could improve comprehensive clinical documentation. Future work includes openEHR server integration for data synchronization and refined verbalization methods. This research establishes foundations for ubiquitous, privacy-focused AI health agents operating entirely on personal smartphones, ensuring patients retain portable health intelligence across providers and locations.
Acknowledgements
This research was partly funded by the Slovenian Research Agency grant no. N20393 “approXimation for adaptable diStributed artificial intelligence” and grant no. J2-3047 “Context-Aware On-Device Approximate Computing”.
References
1. Kalra D, Beale T, Heard S. The openEHR foundation. Stud Health Technol Inform. 2005;115:153–173.
2. Korelič M, Pejović V. MobileTransformers: On-Device LLM PEFT Framework for Fine-Tuning and Inference. 2025. MobileTransformers Framework [https://gitlab.fri.uni-lj.si/lrk/mobiletransformers].
3. Zhang P, et al. TinyLlama: An open-source small language model. arXiv:2401.02385. 2024.
4. Abdin M, et al. Phi-4 technical report. arXiv:2412.08905. 2024.
5. Liu Y, et al. G-Eval: NLG evaluation using GPT-4 with better human alignment. arXiv:2303.16634. 2023.
6. Gemini Team, et al. Gemini: A family of highly capable multimodal models. arXiv:2312.11805. 2023.
Fig. 1 (Abstract A7)
Application prototype of Mobile Health Assistant generating responses based on retrieved openEHR data
Fig. 2 (Abstract A7)
Example and evaluation of quantized local Phi3-mini and Gemini cloud LLM output
Table 1 (Abstract A7)
Average system metrics for evaluated SLMs. TTFT: time to first token; Gen: generation speed; Emb: embedding time; DB: database query time
| SLM | Memory (GB) | Load time (s) | TTFT (s) | Gen (tokens/s) | Emb (s) | DB (s) |
|---|---|---|---|---|---|---|
| TinyLlama | 0.94 | 7.98 | 5.28 | 9.04 | 0.04 | 0.004 |
| Phi3-mini-4k | 2.7 | 29.76 | 11.14 | 3.6 | 0.04 | 0.004 |

A8 openEHR meets genomics: join the challenge!

Cecilia Mascia1, Paolo Uva2, Aurelie Tomczak3,4, Christina Jaeger-Schmidt3, Florian Kaercher3, Simon Schumacher3, Liv Laugen5, Pau Corral Montañés6, Pilar Mur6, Giovanni Delussu1, Francesca Frexia1, Vebjørn Arntzen5, Silje Ljosland Bakke7, Heather Leslie8

1CRS4—Center for Advanced Studies, Research and Development in Sardinia, Pula, Italy; 2Clinical Bioinformatics Unit, IRCCS G. Gaslini, Genoa, Italy; 3HiGHmed Consortium, Germany; 4Institute of Pathology, Heidelberg University Hospital, Heidelberg, Germany; 5Oslo University Hospital, Oslo, Norway; 6Catalonian Cancer Strategy, Department of Health, Biomedical Research Institute of Bellvitge (IDIBELL), L’Hospitalet de Llobregat, Barcelona, Spain; 7Helse Vest IKT, ICT service provider in Western Norway Regional Health Authority, Norway; 8Atomica Informatics, Melbourne, Australia

Correspondence: Cecilia Mascia (cecilia.mascia@crs4.it)
BMC Proceedings 2026, 20(12):A8
Keywords: genomics, semantic interoperability, variant, modeling, VCF
Background
The wealth of information contained in genomic datasets holds great promise for improving medical care; however, their reuse and interoperability remain challenging due to data complexity, incompatible standards, and the continuous evolution of technologies and reference resources. The openEHR Genomics Project [1] was created in 2019 to build machine-readable models of genomic concepts that could preserve semantics, capture data provenance, and ease the integration with heterogeneous data (e.g., clinical, imaging, environmental information).
Method
openEHR was the approach selected to represent genomic data, as it enables a fine-grained data capture, the unambiguous definition of semantics, and the recording of essential provenance information through the inner structure of the archetypes. An international team of experts across bioinformatics, medicine, and health IT conducted the data modeling through periodic virtual meetings and community-based content reviews, drawing the core archetypes out of the Variant Call Format (VCF) specifications [2] and the HGVS Nomenclature [3], two commonly used standards for reporting and describing sequence variations.
The meaning of each node was precisely stated, and, where possible, formalized using domain-specific nomenclatures (e.g., HUGO for gene symbols) and standardized terminologies (e.g., LOINC, SNOMED-CT). Free-to-use tools such as Xmind (for mind mapping) and Archetype Designer (for modeling) were employed, while review sessions and model versioning were managed directly through the openEHR Clinical Knowledge Manager (CKM).
Results
The resulting archetypes capture findings and annotations related to genomic variants identified in a human individual, along with auxiliary models describing the external references used during sequencing. To date, the following 16 archetypes are publicly available in the CKM:
  • Genomic variant result: functional annotations and analysis context.
  • Simple genetic variant: a simplified version with clinically relevant data.
  • Genetic variant presence: an assertion of the presence or absence of a specific variant within a given sample.
  • Variant types: ten archetypes representing specific variant types (e.g., inversion, deletion, copy number variation).
  • Sequencing assay: details of the sequencing analysis performed.
  • Reference sequence and Knowledge base reference: references to external resources (e.g., scientific databases and bioinformatic pipeline repositories) used in the process.
The models are comprehensive yet flexible, supporting both complex research use cases, which require tracking of tools, pipelines, and parameters, and streamlined clinical applications that demand only essential information. Some models and how they could be combined in a report are shown in Fig. 1.
Conclusion
The openEHR Genomics project is an ongoing initiative that welcomes contributions, including new use cases to validate the models, participation from domain experts in development and refinement, and efforts to align with other domain standards. Interested parties are encouraged to adopt the available archetypes and engage with the openEHR Genomics group to participate in the review and development process.
Acknowledgments
This work has been partially supported by the ToPMa project, funded by the Sardinian Regional Authority (G.A. RC CRP 077).
References
1. Mascia C, Frexia F, Uva P, et al. The openEHR Genomics Project. Stud Health Technol Inform. 2020;270:443–447. https://doi.org/10.3233/SHTI200199.
2. Global Alliance for Genomics and Health (GA4GH). Variant Call Format (VCF) [Internet]. Available from: https://www.ga4gh.org/product/geneticvariation-formats-vcf/
3. Hart RK, Fokkema IFAC, DiStefano M, Hastings R, Laros JFJ, Taylor R, Wagner AH, den Dunnen JT. HGVS Nomenclature 2024: improvements to community engagement, usability, and computability. Genome Med. 2024;16(1):149. https://doi.org/10.1186/s13073-024-01421-5.
Fig. 1 (Abstract A8)
How genomic archetypes are combined to form a report of a genomic analysis. The Genomic variant result (light blue box) shows the details of a single variant found in the analyzed DNA, annotations (e.g., identification of genes involved, variant classification), and tools used (e.g., computational pipelines, reference databases). The Sequencing assay (yellow box) describes the analysis performed (e.g., type of sequencing, device, kit) and the list of genes tested in case of panel sequencing

A9 A document-first openEHR persistence layer for operational single-patient and cross-patient queries

Francesc Mateu Amengual1, Greg Cox1, Josep A. Mira i Palacios2, Juan Crossley1, John Underwood1, Giovanni Rodríguez1, Carlos Alonso1, Jorge Sanz de Acedo1, Franck Pachot1

1MongoDB, New York, NY, USA; 2Government of Catalonia – CTTI, Barcelona, Spain

Correspondence: Francesc Mateu Amengual (francesc.mateu@mongodb.com)
BMC Proceedings 2026, 20(12):A9
This work addresses a core operational challenge in openEHR Clinical Data Repositories (CDRs): executing operational single-patient and cross-patient queries at scale with low and predictable latency. openEHR separates a stable Reference Model from domain archetypes and templates; the primary unit, the composition, is a hierarchical clinical document whose nodes are addressable via Archetype Query Language (AQL) paths. Many relational implementations of openEHR persist compositions as a mixture of JSON columns and “shredded” rows [1]. This is effective for single-patient retrieval, but often weaker for large-scale cross-patient workloads. Yet modern applications increasingly need near–real-time cross-patient queries during care delivery, such as finding patients by condition, identifying cohorts that meet safety or protocol criteria, and supporting AI-assisted clinical decision support. Data warehouses are well suited for retrospective analytics, but offloading operational cross-patient queries introduces latency and duplication, breaks a unified AQL endpoint across workloads, can lose clinical context when documents are flattened (e.g., temporal order, units, provenance, containment paths), and fragments security and audit trails.
We present a document-first openEHR persistence layer on MongoDB that stores canonical compositions as single JSON documents and materializes a compact node array per composition. Each node retains its clinical subtree and positional metadata and is annotated with a reversed AQL path to enable index-friendly matching for variable-depth paths. This enables deterministic AQL-to-MQL (MongoDB Query Language) compilation, extending prior AQL-to-MongoDB interpreter work [2]. FROM/CONTAINS/WHERE constraints compile into targeted predicates over node paths and values. The design uses dual indexing: (i) for patient-filtered queries, a compound B-Tree index on (ehr_id, reversed_path) enables shard-local lookups; (ii) for cross-patient queries, a Lucene-style search index over selected node paths and values compiles AQL path constraints into wildcard matches on the reversed path and compiles node conditions into range and equals operators.
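The reversed-path idea can be illustrated with a minimal sketch. A query that constrains only the tail of a path (for example, "any magnitude under this observation, at whatever depth") is a suffix match on the forward path. Stored leaf-first, the same constraint becomes a prefix match, which an ordered index can serve as a range scan. The field names `ehr_id` and `reversed_path` follow the abstract; the separator, the shortened segment names, and the filter shape are assumptions made for illustration and do not reproduce the authors' AQL-to-MQL compiler.

```python
import re

SEP = "/"  # assumed segment separator; the abstract does not specify the encoding


def reverse_path(segments: list[str]) -> str:
    """Render a root-to-node path leaf-first, with a trailing separator."""
    return SEP.join(reversed(segments)) + SEP


def patient_filter(ehr_id: str, suffix: list[str]) -> dict:
    """MongoDB-style filter for nodes of one EHR whose path ends with `suffix`.

    A suffix of the forward path is a prefix of the reversed path, so an
    anchored regex on `reversed_path` expresses the variable-depth match.
    """
    prefix = re.escape(reverse_path(suffix))
    return {"ehr_id": ehr_id, "reversed_path": {"$regex": "^" + prefix}}


# Hypothetical node path, shortened for readability.
node = ["COMPOSITION.encounter", "OBSERVATION.blood_pressure", "at0004", "magnitude"]
stored = reverse_path(node)
query = patient_filter("ehr-1", ["OBSERVATION.blood_pressure", "at0004", "magnitude"])
print(stored)
print(bool(re.match(query["reversed_path"]["$regex"], stored)))
```

The trailing separator keeps the prefix match aligned on segment boundaries, so a segment such as `at0004` cannot accidentally match `at00045`. The sketch covers only the patient-filtered case; the cross-patient search index is not modelled here.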
We evaluated the approach on a sharded MongoDB Atlas cluster using a de-identified export from a real openEHR CDR, scaled to one billion compositions. End-to-end API latency remained low and stable for both single-EHR and cross-patient queries (Table 1). This provides a viable, cost-effective alternative for next-generation openEHR CDRs and a clear path to inline provenance, semantic enrichment, and AI-driven use cases.
References
1. Gamal A, Barakat S, Rezk A. Standardized electronic health record data modeling and persistence: A comparative review. J Biomed Inform. 2021;114:103670.
2. Ramos M, Sánchez-de-Madariaga R, Barros J, et al. An Archetype Query Language interpreter into MongoDB: Managing NoSQL standardized Electronic Health Record extracts systems. J Biomed Inform. 2020;101:103339.
Table 1 (Abstract A9)
End-to-end API latency summary
| Query type | Median (ms) | P90 (ms) | Result size |
|---|---|---|---|
| Single-patient | 4.98 | 18.75 | 1–10 |
| Cross-patient | 13 | 380 | 1–100,000 |

A10 Automated integration of REDCap eCRFs data into openEHR repository: a scalable data pipeline for the Clinnova IBD cohort

Mahsa Moein, Vanessa Pereira, Luigi De Giovanni, Maximilian Funfgeld, Michael Schnell

Data Integration Center, Department of Medical Informatics, Luxembourg Institute of Health, Strassen, Luxembourg

Correspondence: Mahsa Moein (mahsa.moein@lih.lu)
BMC Proceedings 2026, 20(12):A10
Keywords: Semantic Interoperability, REDCap eCRF Integration, openEHR, Apache NiFi ETL Pipeline
Background
Semantic interoperability between research and clinical systems is essential for advancing personalized medicine and ensuring long-term reuse of patient data. However, research studies often rely on heterogeneous sources, such as REDCap, wearable devices, and institutional silos, leading to inconsistent representations of clinical concepts. In the Inflammatory Bowel Disease (IBD) cohort of the Clinnova project, a cross-border European initiative, electronic Case Report Forms (eCRFs) are implemented in REDCap [1], and corresponding openEHR templates are designed to support interoperability [2]. To transfer data from REDCap’s flat data model into the hierarchical structure of openEHR compositions, a scalable Extract-Transform-Load (ETL) pipeline using Apache NiFi has been developed [3].
Methods
Each REDCap event is represented by a dedicated openEHR template aligned with the structure of the corresponding eCRF. Instruments are mapped to appropriate archetypes, and repeating instruments are modeled using recurring archetype nodes (Fig. 1). To compensate for REDCap’s lack of built-in form versioning, a versioning attribute is added to the templates to record the evolution of each form. Additionally, the openEHR FEEDER_AUDIT class is included to store the original REDCap JSON, instance URL, and software version for full traceability.
The ETL architecture begins with a NiFi pipeline that retrieves data for each event and instrument via the REDCap API and stores it in a data lake. The next components are the transformation and loading pipelines, which are also implemented in NiFi. One pipeline maps REDCap data into structured, hierarchical openEHR compositions, while the other loads compositions into the repository via the openEHR REST API (Fig. 2).
The transformation process operates on two types of JSON files: (1) Instrument Template JSON (ITJ), a static file defining a segment of the openEHR template with mappings to variable names in a REDCap instrument; (2) Event Instance JSON (EIJ), dynamically generated from the data lake, contains contextual metadata and patient-specific values for a REDCap event. The pipeline dynamically selects and combines relevant ITJs based on contextual metadata and substitutes patient-specific values from the EIJs to construct complete composition payloads (Fig. 3).
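The combination of ITJs and EIJs can be sketched as a simple template-filling step. The abstract does not publish either file format, so everything concrete below is hypothetical: the `{{variable}}` placeholder syntax, the flat path keys, the `instruments` and `values` fields of the EIJ, and the registry that maps instrument names to ITJs.

```python
# Hypothetical ITJ: a template fragment whose placeholder strings name REDCap variables.
ITJ_VITALS = {
    "vitals/body_weight/weight|magnitude": "{{weight_kg}}",
    "vitals/body_weight/weight|unit": "kg",
}
ITJ_REGISTRY = {"vitals": ITJ_VITALS}  # instrument name -> ITJ (assumed lookup)


def build_composition(eij: dict) -> dict:
    """Combine the ITJs selected by the EIJ metadata and fill in patient values."""
    payload = {}
    for instrument in eij["instruments"]:
        for path, value in ITJ_REGISTRY[instrument].items():
            is_placeholder = (
                isinstance(value, str)
                and value.startswith("{{")
                and value.endswith("}}")
            )
            if is_placeholder:
                # Replace the placeholder with the patient-specific value.
                value = eij["values"][value[2:-2]]
            payload[path] = value
    return payload


# Hypothetical EIJ for one REDCap event of one patient.
eij = {"event": "baseline", "instruments": ["vitals"], "values": {"weight_kg": 72.5}}
print(build_composition(eij))
```

Keeping the ITJs static and the EIJs dynamic means a change to a form touches only its template fragment, while the substitution logic stays the same across events and cohorts.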
The loading pipeline interacts with the openEHR REST API. For each patient, it first verifies the existence of an EHR record and creates one if needed. For each composition payload, it checks whether the entry is new, either because it has not been previously created or because it represents a new version of a previously captured REDCap form. If so, a new composition is created. Otherwise, it compares the payload with the stored composition using MD5 hash values. Updates are only performed when differences are detected, to avoid redundant writes (Fig. 4).
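The create, update, or skip decision described above can be sketched as follows. The abstract states only that MD5 hash values are compared; hashing a canonical JSON rendering (sorted keys, fixed separators) is an assumption added here so that key order does not cause spurious updates, and the function names are hypothetical.

```python
import hashlib
import json
from typing import Optional


def content_hash(composition: dict) -> str:
    """MD5 over a canonical JSON rendering, so key order does not matter."""
    canonical = json.dumps(composition, sort_keys=True, separators=(",", ":"))
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()


def plan_write(payload: dict, stored: Optional[dict], new_form_version: bool) -> str:
    """Return 'create', 'update', or 'skip' for one composition payload."""
    if stored is None or new_form_version:
        return "create"  # never loaded before, or a new version of the form
    if content_hash(payload) != content_hash(stored):
        return "update"  # content differs from the stored composition
    return "skip"  # identical content: avoid a redundant write


print(plan_write({"a": 1, "b": 2}, {"b": 2, "a": 1}, new_form_version=False))
```

MD5 serves here purely as a cheap change detector, not as a security measure, which is why a fast non-cryptographic use of it is acceptable in this role.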
Results
The proposed solution enables dynamic generation of openEHR compositions by combining ITJs with contextual metadata and patient-specific values from EIJs. Content-based versioning using MD5 hashing, along with multiple validation steps, prevents redundant writes and ensures that only one composition is created per patient per REDCap event. Designing templates based on community-standard archetypes from Clinical Knowledge Manager (CKM), and aligning them with REDCap events and instruments, enables reuse of both templates and data pipelines across cohorts. Implementing the workflow in Apache NiFi provides a flexible, modular, scalable, and fully traceable orchestration framework.
Conclusion
This architecture bridges REDCap-based data collection with the openEHR model, enabling standardized and reusable representations of clinical concepts. Implementing the workflows in Apache NiFi provides modular and configurable pipelines that can be easily adapted and adopted by external teams and institutions.
References
1. Harris PA et al. Research electronic data capture (REDCap). J Biomed Inform. 2009;42(2):377–81; 2019;95:103208.
2. Beale T, Heard S. openEHR Specifications. Available at: https://www.openehr.org.
3. Apache Software Foundation. Apache NiFi. Available at: https://nifi.apache.org/.
Fig. 1 (Abstract A10)
Structural mapping of REDCap’s events to openEHR templates
Fig. 2 (Abstract A10)
Overview of developed data pipeline
Fig. 3 (Abstract A10)
Transformation step for composition generation
Fig. 4 (Abstract A10)
Loading step for storing compositions into the openEHR repository

A11 Multi-agent healthcare interoperability: coordinating FHIR and openEHR workflows via Model Context Protocol

Robert Tovornik, Borut Fabjan (borut.fabjan@better.care)

Better Ltd., Ljubljana, Slovenia

Correspondence: Robert Tovornik (robert.tovornik@better.care)
BMC Proceedings 2026, 20(12):A11
Background
Accessing and combining clinical data across openEHR (archetype-based electronic health records) [1] and HL7 Fast Healthcare Interoperability Resources (FHIR) systems [2] remains challenging for clinical users, as meaningful queries require detailed knowledge of APIs, data models, and query languages. Model Context Protocol (MCP) [3] provides a standardized mechanism for language models to interact with external tools, but applying MCP to healthcare systems reveals practical limitations, including ambiguous tool selection, platform-imposed limits on the number of tools per agent, and repetitive tool-calling loops caused by insufficient semantic understanding.
Materials and methods
We designed a hierarchical multi-agent interoperability framework that coordinates openEHR and FHIR systems via MCP. A central orchestrator agent interprets clinical intent and executes declarative YAML-defined workflows with explicit sequencing and error handling, while specialized agents handle openEHR and FHIR operations using curated, domain-specific tool sets. The overall system architecture is shown in Fig. 1. To reduce semantic complexity in openEHR access, openEHR JavaScript Views (predefined clinical data projections) were employed as reusable templates for common clinical data retrieval patterns, minimizing reliance on ad-hoc Archetype Query Language (AQL) generation.
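The orchestration pattern can be sketched in a few lines. To stay self-contained, the workflow below is a Python list standing in for a parsed YAML file; the step names, the `on_error` field, and the stub agents are hypothetical and do not reproduce the authors' workflow schema or any MCP client API.

```python
from typing import Callable

# Stand-in for a parsed YAML workflow: ordered steps, each bound to one agent.
WORKFLOW = [
    {"id": "find_patient", "agent": "fhir", "on_error": "abort"},
    {"id": "fetch_vitals", "agent": "openehr", "on_error": "continue"},
]

Agent = Callable[[str, dict], dict]


def run_workflow(steps: list[dict], agents: dict[str, Agent]) -> dict:
    """Run steps strictly in order; each step executes at most once."""
    context: dict = {"errors": []}
    for step in steps:
        try:
            context[step["id"]] = agents[step["agent"]](step["id"], context)
        except Exception as exc:
            context["errors"].append(f"{step['id']}: {exc}")
            if step["on_error"] == "abort":
                break  # explicit error handling: stop the workflow here
    return context


def fhir_agent(step_id: str, context: dict) -> dict:
    """Stub for a specialized FHIR agent with its own curated tool set."""
    return {"patient_id": "example-1"}


def openehr_agent(step_id: str, context: dict) -> dict:
    """Stub for a specialized openEHR agent; fails to show error handling."""
    raise RuntimeError("view unavailable")


AGENTS = {"fhir": fhir_agent, "openehr": openehr_agent}
print(run_workflow(WORKFLOW, AGENTS))
```

Because the step order and the agent for each step are fixed by the workflow definition rather than chosen by a language model at run time, a step cannot be retried indefinitely and the wrong agent's tools are never in scope.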
Results
Compared with single-agent approaches, the distributed architecture reduced incorrect tool selection and eliminated repetitive tool-calling loops by enforcing explicit workflow steps. Partitioning tools across specialized agents enabled broader functional coverage while operating within platform constraints. Agent role separation and responsibilities are summarized in Fig. 2. The use of openEHR JavaScript Views improved reliability and consistency for recurring clinical queries by standardizing data access patterns while preserving formal clinical semantics [4].
Conclusions
Hierarchical multi-agent orchestration combined with declarative workflows provides a practical and reliable pattern for coordinating resource-based (FHIR) and archetype-based (openEHR) systems through MCP. Explicit architectural design, rather than emergent tool use, is essential for clinically safe and semantically accurate healthcare interoperability. This framework supports natural language access to heterogeneous healthcare data while maintaining transparency, reproducibility, and clinical oversight.
Acknowledgements
The authors acknowledge the openEHR and HL7 communities for ongoing collaboration that enables interoperable healthcare systems.
References
1. openEHR International. Archetype Object Model 2 (AOM2) – openEHR Specifications, Release 2.3.0. Available: https://specifications.openehr.org/releases/AM/Release-2.3.0
2. HL7 International. FHIR – Fast Healthcare Interoperability Resources (R5), official specification. Available: https://hl7.org/fhir/
3. Anthropic. Introducing the Model Context Protocol (MCP). Anthropic News. Nov 25, 2024. Available: https://www.anthropic.com/news/model-context-protocol
4. Sanchez F, et al. An openEHR based approach to improve the semantic interoperability of clinical data registry. BMC Med Inform Decis Mak. 2018;18:22. Available: https://pmc.ncbi.nlm.nih.gov/articles/PMC5872380/
Fig. 1 (Abstract A11)
Hierarchical multi-agent architecture for healthcare interoperability, showing orchestration and coordination between specialized FHIR and openEHR agents via Model Context Protocol
Fig. 2 (Abstract A11)
Agent specialization and responsibility separation between the orchestrator, FHIR agents, and openEHR agents

A12 Implementation of ePROMs in a universal healthcare system: insights from Catalonia’s real-world experience

Maria-Mercè Nogueras1,2,3, Júlia Folguera1,2, Xavier Alzaga1, David López1, Gerard Carot-Sans1,2, Lluís Valle1, Tara Bonet1,2, Xabier Michelena1,2, Alba Jiménez-Rueda1,2,4, Marina Ramiro-Pareta1,2,4, Oscar Solans1, Jordi Piera-Jiménez1

1Catalan Health Service (CatSalut), Barcelona, Spain; 2Digitalization for the Sustainability of the Healthcare System (DS3) research group, Barcelona, Spain; 3Agency of Health Quality and Assessment of Catalonia (AQuAS), Barcelona, Spain; 4Fundació Tic Salut i Social (FTSS), Barcelona, Spain

Correspondence: Maria Mercè Nogueras (merce.nogueras@gencat.cat)
BMC Proceedings 2026, 20(12):A12
Keywords: ePROMs, openEHR, patient-reported outcomes, primary care, digital health implementation
Background
Patient-reported outcome measures (PROMs) are increasingly used to incorporate patients’ perspectives into healthcare delivery. Electronic PROMs (ePROMs) enable remote collection and allow the incorporation of health data directly into electronic health records through standards such as openEHR. However, challenges remain regarding their implementation, particularly in primary care, where the patient population is highly diverse. In Catalonia, ePROMs have been progressively implemented since November 2022 with a particular focus on primary care. ePROM results were encoded into openEHR archetypes for primary use and into relational databases for epidemiological investigation. This study aims to describe ePROMs administration and usage within the Catalan Health System and the key determinants influencing their adoption by both professionals and patients.
Methods
We conducted a retrospective, population-based cohort study using administrative data from the Catalan public healthcare system. The description of the implementation included all ePROMs administered between November 2022 and October 2024. Analyses of staff and patient-related factors focused on the period from November 2023 to October 2024, when system improvements were implemented.
Data were obtained from the PROMs registry and central databases of the Catalan Health Service. Variables included professional and patient sociodemographic characteristics, patient clinical features, care setting, socioeconomic status, and whether the response was submitted via the patient portal La Meva Salut (LMS).
Descriptive statistics and plots summarised trends in administration and usage. Bivariate tests and multivariate models were used to identify predictors of administration volume and patient response. Analyses were performed using R. Ethical approval was obtained, and data were anonymised in accordance with GDPR.
Results
Between November 2022 and October 2024, 789,294 ePROMs (among 42 different validated questionnaires) were administered. Administration varied widely across PROMs: some instruments showed rapid uptake shortly after implementation, while others remained scarcely used over time. The overall completion rate was 78.5%, with 48.5% of ePROMs completed within one day. Completion rates varied notably across thematic areas, from 85.6% in social support to 6.6% in oncology.
Between November 2023 and October 2024, 685,130 ePROMs were administered by 11,094 professionals. Professional age, gender, territorial socioeconomic index (SI) and rurality were associated with ePROMs administration. Nurses, physiotherapists, dietitians, and mental health professionals also had higher administration volumes than physicians.
Regarding patient response, 417,051 individuals received at least one ePROM. Most received only one (66.8%), with a completion rate of 88.1%. High responders (≥ 80% completion) represented 76.7% of patients. Response rates were particularly associated with the number of ePROMs administered per patient, and more weakly associated with gender, age, morbidity, SI and rurality. Only 10.5% of completed ePROMs were answered via the patient portal (LMS), more frequently among middle-aged patients and those with high socioeconomic status.
Conclusions
The implementation of ePROMs in Catalonia demonstrates strong engagement, especially in primary care and among nursing and community health professionals. While overall completion rates are high, usage varies by clinical profile and patient characteristics. Notably, 10.5% of responses were submitted via the patient portal, highlighting limited digital channel adoption and the need to strengthen patient-facing tools. Our findings can inform targeted strategies to improve ePROMs administration and usage. Additionally, the large volume of structured data recorded using openEHR standards offers significant potential for secondary use in research and health system planning.

A13 Using openEHR and FHIR to store and access legacy data from an EHR to be decommissioned

Erik Sundvall1,2,3, Ian McNicoll4,5

1Karolinska University Hospital, Stockholm, Sweden; 2Karolinska Institutet, Stockholm, Sweden; 3Linköping University, Linköping, Sweden; 4freshEHR Clinical Informatics Ltd, Kent, UK; 5University College, London, UK

Correspondence: Erik Sundvall (erik.sundvall@regionstockholm.se)
BMC Proceedings 2026, 20(12):A13
Decommissioning large-scale Electronic Health Record (EHR) systems presents challenges such as preserving data while ensuring continued, clinically useful access to it once the originating system is shut down. We present a practical approach for migrating clinical data:
  • from Region Stockholm’s current EHR system “TakeCare” (by CGM) that is due to be decommissioned
  • to a modern, standards-based EHR platform (by Tieto & Better) using openEHR and FHIR.
An important goal is that the end users of the platform should from the start be able to access the records in a familiar manner, but “under the hood” now via standards like openEHR and FHIR in a new platform so that the old system can be shut down without clinical disruption. We believe that the approach is suitable also for other legacy EHR systems.
A three-month Proof of Concept (PoC), staffed by a handful of full-time equivalents of consultants and hospital staff, succeeded in automating the transfer (via API) of medical record text, medication prescriptions, clinical chemistry lab results, measurements, activities, etc., from TakeCare to an EHR platform based on open standards such as openEHR and FHIR. During the same period, user interfaces were built in the EHR platform that mimicked the structure and functions of TakeCare (including its filtering system), so that healthcare professionals would find them familiar and could use the same types of filtering and workflows as before. Fine-grained, multifaceted filtering is much more useful to clinicians than the word-based search over a set of PDF files that other archiving methods often provide. In Sweden, EHR data must be kept available for at least 10 years, but for both clinical and research use we would like to keep it much longer. Keeping TakeCare in “read-only” mode for 10 years, including maintaining its integrations with national services, is less attractive than conversion from cost, IT-security and clinical/research utility perspectives, especially since Region Stockholm is in any case moving towards an openEHR + FHIR based EHR landscape.
Conversion methods were chosen depending on sources, targets and what was reasonable technically, informatically, and resource-wise, as illustrated in Fig. 1:
https://static-content.springer.com/image/art%3A10.1186%2Fs12919-026-00367-3/MediaObjects/12919_2026_367_Figa_HTML.png
Much of the modelling challenge lay in analysing where international CKM archetypes could be used and where GENERIC_ENTRY ‘integration’ archetype-based templates were required [1]. This decision was based on knowledge of the most common and high-value datapoints (e.g., diagnosis, vital signs), but full roll-out would require more formal governance by clinical stakeholders. Critically, the full context of the original data was captured so that further enrichment/re-mapping to CKM archetypes (conversion strategy #3 above) could be done over time. A particular challenge was a proprietary (TakeCare) template format that was based on an internal terminology but allowed arbitrary nesting of information items.
The PoC focused on the semantic feasibility of the conversion and on feedback regarding the clinical usefulness of the user interface. For this, a handful of realistic but fake patients’ EHRs were used for data entry in TakeCare, followed by ETL via the APIs of TakeCare and an openEHR server. Technical scalability/performance (batch import etc.) and deeper semantic work will be addressed in work starting now.
Presentation files with more details and the open-source code used for conversion and visualisation can be found at https://github.com/regionstockholm/poc_tc2openEHR
References
1. openEHR Foundation. Integration Information Model: RM Release 1.1.0 [Internet]. London: openEHR Foundation; 2020 [cited 2025 October 9]. Available from: https://specifications.openehr.org/releases/RM/Release-1.1.0/integration.html
Fig. 1 (Abstract A13)
Solid lines show PoC data flows, dashed lines are available but not used in PoC. Conversion strategies used for different data: https://static-content.springer.com/image/art%3A10.1186%2Fs12919-026-00367-3/MediaObjects/12919_2026_367_Figb_HTML.png

A14 Text2AQL: from clinical questions to executable openEHR queries

Marko Zeman, Matic Bernik (matic.bernik@better.care), Robert Tovornik (robert.tovornik@better.care), Borut Fabjan (borut.fabjan@better.care)

Better Ltd, Ljubljana, Slovenia

Correspondence: Marko Zeman (marko.zeman@better.care)
BMC Proceedings 2026, 20(12):A14
Background
openEHR [1] has emerged as a widely adopted standard for long-term persistence and interoperability of structured clinical data. Data retrieval from openEHR repositories relies on the Archetype Query Language (AQL) [2], which offers expressive querying capabilities but requires detailed knowledge of its syntax and underlying clinical models. This complexity poses a significant barrier for clinicians and healthcare professionals, who typically formulate information needs as natural language questions. To address this challenge, we introduce Text2AQL, an AI-powered assistant that translates free-text clinical queries into executable AQL statements, thereby democratizing access to openEHR-based data.
Methods
Text2AQL follows a multi-stage generation pipeline (Fig. 1). First, clinical concepts are extracted from the user’s natural language input. These concepts are then mapped to relevant openEHR archetypes and data element paths using Retrieval-Augmented Generation (RAG) [3]. Archetype data is embedded and indexed in a vector database to enable semantic retrieval of the most relevant archetypes, elements and their paths. Finally, a large language model generates the AQL query by incorporating the retrieved context. A dedicated validation step detects and corrects common syntactic errors, such as invalid CONTAINS clauses or hallucinated paths, ensuring compliance with the AQL rules and specification.
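One simple check of the kind a validation step might perform is sketched below: it flags archetype identifiers in a generated query that were not part of the retrieved context, which is one way a hallucinated reference can be caught before execution. This is an illustrative assumption rather than the authors' validator. The regular expression covers only the common `openEHR-EHR-<CLASS>.<concept>.v<N>` identifier form, and the second archetype in the example is deliberately made up.

```python
import re

# Common archetype identifier form, e.g. openEHR-EHR-OBSERVATION.blood_pressure.v2
ARCHETYPE_ID = re.compile(r"openEHR-EHR-[A-Z_]+\.[A-Za-z0-9_-]+\.v\d+")


def unknown_archetypes(aql: str, known: set[str]) -> list[str]:
    """Archetype IDs used in the query that were not in the retrieved context."""
    return sorted(set(ARCHETYPE_ID.findall(aql)) - known)


retrieved = {"openEHR-EHR-OBSERVATION.blood_pressure.v2"}
generated = (
    "SELECT o/data[at0001]/events[at0006]/data[at0003]/items[at0004]/value/magnitude "
    "FROM EHR e CONTAINS COMPOSITION c "
    "CONTAINS (OBSERVATION o[openEHR-EHR-OBSERVATION.blood_pressure.v2] "
    "AND OBSERVATION p[openEHR-EHR-OBSERVATION.made_up_reading.v9])"
)
print(unknown_archetypes(generated, retrieved))
```

A non-empty result can be fed back to the generator as a correction hint or used to reject the query outright.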
Results
We evaluated Text2AQL on a benchmark dataset of 1,274 natural language descriptions paired with their corresponding AQL queries. Generated queries were assessed for syntactic validity and for semantic similarity to the ground truth using an LLM-as-a-judge paradigm [4]. Text2AQL produced syntactically valid AQL queries in 81.9% of cases, outperforming general-purpose language models such as GPT-4o (73.6%) and GPT-5 mini (46.2%), as seen in Fig. 2. Although the other two approaches achieved higher semantic similarity scores on their valid outputs, they generated substantially fewer executable queries overall. As a result, Text2AQL provides the most reliable performance for practical query execution.
Conclusion
By combining large language models with a retrieval-based method for openEHR archetypes and element paths, Text2AQL improves the generation of executable AQL queries from natural language. The proposed approach lowers the technical barrier to querying openEHR repositories, supports clinical decision-making and streamlines research workflows. Future work will focus on improving robustness, expanding curated validation datasets and evaluating multilingual clinical queries.
References
1. Beale T, Heard S. The openEHR Foundation. Stud Health Technol Inform. 2007;129:153–157.
2. openEHR Foundation. Archetype Query Language (AQL) Specification. 2020. Available from: https://specifications.openehr.org/releases/QUERY/latest/AQL.html [Accessed 2025-07-30].
3. Lewis P, Perez E, Piktus A, Petroni F, Karpukhin V, Goyal N, Kulikov I, Fan A, Chaudhary V, El-Kishky A, et al. Retrieval-augmented generation for knowledge-intensive NLP tasks. Adv Neural Inf Process Syst. 2020;33:9459–9474.
4. Gu J, Jiang X, Shi Z, Tan H, Zhai X, Xu C, Li W, Shen Y, Ma S, Liu H, et al. A survey on LLM-as-a-judge. arXiv. 2024;arXiv:2411.15594.
Fig. 1 (Abstract A14)
Text2AQL generation pipeline. Natural language inputs are processed through clinical term extraction, retrieval-augmented archetype binding and large language model’s AQL generation with syntactic validation
Fig. 2 (Abstract A14)
Comparison between approaches regarding syntactic validity of generated AQL queries, tested on 1,274 examples
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

Title: Abstracts from EHRCON25 - openEHR International Conference 2025
Publication date: 01.04.2026
Publisher: BioMed Central
Published in: BMC Proceedings / Special issue 12/2026
Electronic ISSN: 1753-6561
DOI: https://doi.org/10.1186/s12919-026-00367-3