Introduction
Method and analyses
The WA Cancer Staging Project Context
Phase 1: Develop business rules. A Project Advisory Group (PAG), which oversaw the project and the Working Groups, and Breast and Colorectal Cancer Working Groups, which consulted on developing the business rules, were recruited to consult throughout the project. A snowball recruitment strategy was used to ensure a variety of expertise. Stakeholders were notified about the project through team meetings. The Project Manager/Research Fellow (SS), who was experienced in stakeholder engagement and in a neutral position, invited potential stakeholders and advised them who had recommended them. Invitations to the PAG and Working Groups included the option to recommend a suitable stakeholder if the invitee was unavailable. The consumer representatives agreed that consumer input was more valuable in the PAG than in the development of the business rules and the data collection challenges that the Working Groups would oversee. Consumer representatives were still involved in overseeing the Working Groups and were consulted on issues as required. |
The findings from the rapid scoping review on determining population-based cancer stage at diagnosis in population-based cancer registries [21], which outlines various classification systems, were reviewed. Based on the evidence, the PAG advised using the AJCC TNM classification for cancer staging because it is the most widely used, established and adaptable of the available classification systems. When developing the breast and colorectal business rules, the Breast and Colorectal Working Groups compared the Victorian Cancer Registry’s business rules (based on a simplified version of the AJCC TNM, version 7 [11]) against the full version [27] and the updated version 8 of the AJCC TNM [28]. During this consultation, the PAG endorsed a Cancer Staging Tiered Framework for collecting stage data within the WACR to enable stage collection using current data sources and to ensure future proofing. The Staging Tiered Framework includes three tiers: 1) complete AJCC TNM stage, 2) registry-derived stage, and 3) pathology stage. The WA Cancer Staging Project operates on the registry-derived tier based on currently available data sources. |
Phase 2: Develop and test NLP/ML models. The aims were to (i) classify incoming cancer reports by cancer type so that work could be prioritised to achieve timely cancer staging, and (ii) enable automated extraction of staging-relevant information from pathology reports for breast and colorectal cancers and provide stage at diagnosis based on the business rules. The models were developed once the business rules had been created (through consultation with the PAG and Working Groups) to automate and support the extraction of information from relevant data sources and to minimise or eliminate manual intervention. Development and testing included classifying pathology reports into cancer types (e.g., colorectal and breast) and report types (e.g., biopsy, colectomy), in addition to extracting Tumour Node Metastasis (TNM) staging and related information. |
Phase 3: Validate the NLP/ML models. The models were to be validated against (i) data manually coded by the coding team, (ii) manually staged data and (iii) clinically staged data. The models were trained on manually staged data from 2018 and 2019 and validated on data from 2020 within the WACR. Validation on clinically staged data did not occur because the data were received after the evaluation. |
Phase 4: Embed the NLP/ML models. This phase aimed to enable future routine collection, both prospectively and retrospectively. A server was purchased to embed the models with the WACR server. The models were not embedded within the current study because global shortages delayed receipt of the server purchased for routine coding and analysis workflows. The planned deployment of the new WACR analytics service was to retrieve reports from the WACR and Hospital Morbidity Data Collection (HMDC) databases and record the model outputs in the WACR database for monitoring and analysis. |
Phase 5: Demonstrate the value of stage at diagnosis in epidemiological analyses of cancer incidence. Perform epidemiological analysis of breast and colorectal cancer incidence in 2019 and 2020 using the staged data produced by the models according to demographic and other tumour-specific information routinely available within the WACR. |
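To make the Phase 2 extraction task concrete, the following is a minimal rule-based sketch of pulling T, N and M categories out of free-text pathology report wording. It is an illustration only: the project used NLP/ML models whose details are not given here, and the pattern `TNM_PATTERN`, the function `extract_tnm` and the example report strings are all hypothetical. The pattern covers only simple notations such as "pT3 N1b M0" and does not implement the business rules or stage grouping.

```python
import re

# Hypothetical pattern (an assumption, not the project's method):
# optional prefix (yp, p or c), then T and N categories, then an optional M category.
TNM_PATTERN = re.compile(
    r"\b(?P<prefix>yp|p|c)?"
    r"T(?P<t>is|[0-4][a-d]?|[Xx])\s*"
    r"N(?P<n>[0-3][a-c]?|[Xx])\s*"
    r"(?:M(?P<m>[01][a-c]?|[Xx]))?"
)


def extract_tnm(report_text):
    """Return the first TNM string found in the text as a dict, or None."""
    match = TNM_PATTERN.search(report_text)
    if match is None:
        return None
    return {
        "prefix": match.group("prefix") or "",
        "T": match.group("t"),
        "N": match.group("n"),
        "M": match.group("m"),  # None when the report gives no M category
    }


if __name__ == "__main__":
    # Invented example sentences, not real report text.
    print(extract_tnm("Summary: pT3 N1b M0, margins clear."))
    print(extract_tnm("ypT2N0 following neoadjuvant therapy"))
    print(extract_tnm("No staging stated"))
```

A report with no recognisable TNM string returns `None`, which loosely mirrors the cases flagged as unstageable because of data gaps. Differences in wording between pathology providers are one reason a fixed pattern like this would be insufficient in practice.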
Study design
The conceptual framework for the qualitative process evaluation
1. Intervention characteristics: aspects of an intervention that may impact implementation success, including intervention source, evidence strength and quality, relative advantage, adaptability, trialability, complexity, design quality and packaging, and cost. |
2. Outer setting: external influences on the intervention implementation, including patient needs and resources, cosmopolitanism, peer pressure and external policies and incentives. |
3. Inner setting: characteristics of the implementing organisation such as structural characteristics, networks and communications, culture, implementation climate (tension for change, compatibility, relative priority, organisational incentives and rewards, goals and feedback, learning climate), readiness for implementation, leadership engagement, available resources and access to knowledge and information. |
4. Characteristics of individuals: individuals’ knowledge and beliefs about the intervention, self-efficacy, individual stage of change, individual identification with organisation and other personal attributes that may affect implementation. |
5. Process: stages of implementation such as planning, engaging (opinion leaders, formally appointed internal implementation leaders, champions, external change agents), executing, and reflecting and evaluating. |
Participants
Characteristics | Pre-Proforma (29/38 completed; 76% response rate) N (%a) | Post-Proforma (18/29 completed; 62% response rate; 18/38, 47% overall response rate) N (%a) |
---|---|---|
Age (years): | (31–76 age range) | (31–76 age range) |
30–39 | 5 (17.2) | 4 (22.2) |
40–49 | 7 (24.1) | 5 (27.8) |
50–59 | 12 (41.4) | 5 (27.8) |
60 + | 4 (13.8) | 3 (16.7) |
Not answered | 1 (3.5) | 1 (5.6) |
Role self-classification: | | |
Clinicians | 12 (41.4) | 6 (33.3) |
Healthcare staff or consumer | 6 (20.7) | 4 (22.2) |
Registry staff | 3 (10.3) | 2 (11.1) |
Other | 8 (27.6) | 6 (33.3) |
Group membership: | | |
PAG | 9 (31.0) | 7 (38.9) |
Working Group (breast or colorectal) | 14 (48.3) | 6 (33.3) |
Both PAG and Working Group | 6 (20.7) | 5 (27.8) |
Data collection
Data analysis
Framework Analysis stages | Description |
---|---|
1. Familiarisation | Two researchers (SS and LP) independently immersed themselves in the data. This included reading and re-reading the responses to the pre- and post-proformas several times and noting key ideas. |
2. Identifying a thematic framework | CFIR domains and selected constructs were used as the thematic framework. This was an iterative process and involved revisions. Initially, all 39 CFIR constructs (see Table 2) were reviewed. Refinements were made, and it was agreed after preliminary coding to code to the 10 included constructs (see Table 2) because the proformas yielded limited responses for the non-included constructs. Cross-over between constructs was also noted. Open coding was considered but not used because the researchers found that the responses fitted the definitions of the CFIR constructs. This was likely because the questions were developed around the CFIR. |
3. Indexing | This stage involved applying the framework to the data. Each researcher deductively coded responses in NVivo to the included domains and their constructs, or moved them to another relevant included construct if better suited. This was an iterative comparative process, and the two researchers resolved discrepancies through regular discussions. Coding comparisons were made using coding stripes. Cohen’s Kappa [54] was also calculated to review the level of agreement between the two researchers. Coder agreement was ‘fair to good’ for the pre-proforma (Kappa = 0.59). Although not required, Kappa was also checked for the post-proformas, where coder agreement had improved to ‘very good’ (Kappa = 0.86). |
4. Charting | This step was based on an adapted version of Smith et al. [50] guidelines for determining and grading barriers and enablers. This involved arranging the data into positive and negative statements (barriers and enablers) within each construct. Theme labels were used to capture the essence of the statements for ease of comparison between the two researchers. Short paragraphs or sentences were sometimes separated depending on the positive or negative aspects of the responses. Similarities and differences between the researchers' coding continued to be discussed through regular meetings, and uncertainty or differences were resolved through these discussions. Both researchers provided their opinions before coming to a consensus. In the few cases of uncertainty, JS was consulted on determining sentiment. Results were merged into a matrix, allowing the frequency of positive and negative statements
5. Mapping and interpretation | This stage involved SS reviewing patterns in the data and presenting the interpretations. A simplified summary of the constructs was produced based on Smith et al. [50] (see Table 5). Figure 3 provided a visual representation of the change in the negative statements across the constructs from pre- to post-proforma among stakeholders. The colour coding in Table 5 and Fig. 3 is based on the traffic light system created by Smith et al. [50], where red is a barrier, orange is a barrier/enabler, and green is an enabler. The visual presentation assisted the interpretation of the barriers and enablers of the constructs. Stakeholders reviewed preliminary findings. The interpretation continued during the writing of the results and was reviewed by all authors. The theme labels and the structure of the positive and negative statements assisted in this process. |
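The inter-coder agreement statistic reported in the Indexing stage can be illustrated with a short worked example. The sketch below computes Cohen's Kappa for two coders who each assign one label per statement. The function name `cohens_kappa` and the example labels are hypothetical, and the unit of comparison used by NVivo's coding comparison may differ from the per-statement labels assumed here, so this is not a reproduction of the study's calculation.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's Kappa for two equal-length sequences of categorical labels."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("Both coders must label the same non-empty set of items")
    n = len(coder_a)
    # Observed agreement: share of items given the same label by both coders.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Expected agreement by chance, from each coder's label frequencies.
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used one identical label throughout
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Invented labels: the coders agree on 8 of 10 statements.
    coder_a = ["barrier"] * 5 + ["enabler"] * 5
    coder_b = ["barrier"] * 4 + ["enabler"] * 5 + ["barrier"]
    print(round(cohens_kappa(coder_a, coder_b), 2))
```

In this invented example the observed agreement is 0.8 and the chance agreement is 0.5, giving a Kappa of approximately 0.6, which is close to the pre-proforma value reported above.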
Findings
Domain 1: intervention characteristics
Adaptability: overall barrier to implementation
Utilisation of modern technologies such as natural language processing and machine learning to extract key pieces of information and build algorithms to determine the stage at diagnosis, ensuring meaningful diagnostic checks are in place to provide context and meaning where stage has been derived [Registry Staff/Pre-Proforma].
[Database Name] application changes being assessed to integrate the modelling data into [Database Name] to assist coders in working through their mapping more quicker [Other/Post-Proforma].
Complexity: overall barrier to implementation
Initially I assumed that all the data would be readily accessible from the pathology reports, and it would simply be a case of developing a machine learning/NLP model to capture the information. However, it has been shown to be a lot more complicated as there are significant data gaps plus the pathology reports are not in a consistent format [Other/Post-Proforma].
It took a while to understand the pathology reports…We need to [meet] multiple times with the subject matter experts in the working groups/PAG…Once we got the understanding of what to expect of different reports…and different tumour groups…the scope of the…work became much more clear [Other/Post-Proforma].
The pathways are also very different for the two tumour groups...So the rules need to be tumour specific [Other/Post-Proforma].
Consideration on how to extract the data for patients that don't have surgery within the 4-month period. Such as rectal cancer patients who have neoadjuvant CRT [chemoradiotherapy], anal scc [squamous cell carcinoma] patients who usually have definitive CRT as well as patients having minimally invasive surgeries [Healthcare Staff or Consumer/Post-Survey].
A number of cancers will be difficult to stage for clients that do not have active treatment due to old age, comorbidities or personal choice. Reluctant patients who delay diagnosis/staging can be time consuming or may need to be revisited [Healthcare Staff or Consumer/Pre-Proforma].
…the approach is very process heavy requiring change request forms to be created and submitted to a third party [IT department] for review [Other/Post-Proforma].
Domain 2: outer setting
Peer pressure: overall barrier to implementation
WA Cancer Registry is one of the few pop[ulation] registries that does not collect cancer stage at diagnosis - as such it is difficult to compare patient outcomes in WA vs other jurisdictions [Healthcare Staff or Consumer/Pre-Proforma].
The SEER database (U.S.) and the U.K. cancer registry data seem to routinely include staging [Clinician/Pre-Proforma].
European Network of Cancer Registries have been setting standards and systems for years on staging data etc. [Healthcare Staff or Consumer/Pre-Proforma].
WA is [a] fair way behind other jurisdictions when it comes to the collection of staging. However, I'm aware other jurisdictions rely heavily on manual entry [Registry Staff/Post-Proforma].
Domain 3: inner setting
Tension for change: overall enabler to implementation
…the addition of some measure of disease stage at diagnosis would add a great deal to our ability to understand differences in survival and patterns of incidence and mortality [Registry Staff/Pre-Proforma].
Staging data could be paired with date of diagnosis which could provide useful information about how well (or not) we are finding certain cancers early [Other/Pre-Proforma].
Without staging information it is very difficult to accurately evaluate the impact of new treatment or diagnostic pathways as they are likely to have a different impact depending on stage. Also stage is important for incidence and prevalence data in order to estimate future health service use [Other/Post-Proforma].
…screening feasibility [Other/Pre-Proforma]
The WA Cancer Plan 2020-2025 Implementation Plan subsequently identified the key strategic action to ‘Develop a timely data collection for cancer stage at diagnosis.’ The National Cancer Data Strategy for Australia (2008) recognises population-based registries' lack of nationally standardised Stage of Cancer at Diagnosis Data Collections [Other/Pre-Proforma]
…in the context of COVID worldwide there is a concern that diagnoses were down and…this means more people will be diagnosed at later stages which will affect the QALY [Quality-Adjusted Life Year] cost of COVID - we won’t know this for WA because we won't have the data! [Healthcare Staff or Consumer/Pre-Proforma]
Compatibility: overall barrier to implementation
Major issue will be [Database Name] software which is outdated and already cannot cope with the increased amount of histo[pathology] reports coding [Registry Staff/Pre-Proforma].
…research is very agile and software/database changes may need to be done frequently which does not align well with the strict [Database Name]/[IT Department] processes. A solution would be to have a more isolated/standalone schema for the project [Other/Post-Proforma].
Subtle differences in wording between pathology reports of the different providers [Clinician/Post-Survey].
…pathology reporting does not really take into account the use of the reports by the WACR [Other/Post-Survey].
Missing data can be within the scope of the project, such as missing pathological reports, and beyond the scope of the project, no access to imaging and MDT data [Registry Staff/Pre-Survey].
Relative priority: overall enabler to implementation
To better understand the population groups at risk in WA, to assist in risk adjusting cancer related performance and safety and quality indicators, to support planning and prevention programs and a wide range of research studies, stage at dx [diagnosis] is essential [Registry Staff/Post-Survey].
I am relatively committed but relatively busy so will support the process within the limitations of available time [Clinician/Pre-Survey].
…the greatest issue will be making the case to secure funding for the full implementation of staging for all cancer types [Healthcare Staff or Consumer/Pre-Survey].
We will be able to learn which cancers we need to improve early detection…[Cancer Staging] will be evidence to call upon when arguing for Govt [government] spending e.g. Lung cancer screening [Other/Pre-Survey].
Leadership engagement: overall enabler to implementation
Commitment from the [Project] team yes, commitment from [Funder] yes [Clinician/Post-Survey].
The registry team are a dedicated, knowledgeable and committed team [Healthcare Staff or Consumer/Pre-Survey].
There is a great commitment from working group members as well as PAG [Other/Post-Survey]
Long term top-down commitment is required, and I see no sign it is there [Healthcare Staff or Consumer/Pre-Survey].
…we need more support from the higher levels [Registry Staff/Pre-Survey].
Available resources: overall barrier to implementation
…as with all aspects of the cancer plan, I am concerned the department is not committing sufficient resources in a consistent and sustained fashion. There is a constant worry about funds, rollover of personnel, bids for funding and knock backs [Healthcare Staff or Consumer/Pre-Survey].
…the registry needs to be funded to ensure they have the skilled staff to collect, enter and clean data in a timely fashion [Other/Pre-Survey]
Domain 4: characteristics of individuals
Self-efficacy: overall enabler to implementation
Coders are very enthusiastic and keen to incorporate staging and look forward to…the necessary support [Registry Staff/Pre-Survey]
Attending the sessions have given me valuable insight into this aspect of reporting and the challenges faced in standardising the data sets…I am committed to helping in anyway [Department Name] can to surface this within [Database Name] [Other/Pre-Survey]
For longevity I think we've taken the right approach by taking a ML/NLP approach. Manual entry whilst achievable always would have been subject to scrutiny and would have introduced manual interpretation and subjective complaints [Registry Staff/Post-Proforma]
At the beginning and middle of the project, I was highly confident. As the project is nearing conclusion, I am less confident due to issues caused by the procurement and setup of the WACR analytics server as well as dependencies on software development work that would need to be performed by [Department Name] and various process/change requests need to add in the [Database Name] enhancements. [Other/Post-Survey].
Domain 5: process
Executing: overall barrier to implementation
While this [consultation] has been useful, it…added delays rather than focusing on the core staging extraction components [Other/Post-Survey]
I would say about 75% [to plan]…due to data issues there will be a number of cases that cannot be staged. While the reasons for unstageable have been flagged by the algorithm this will require additional data to be available to resolve [Other/Post-Survey].
Summary of findings
Barriers | Enablers |
---|---|
1. Complexity (N = 56, n = 46, 82.1%) | 1. Tension for Change (N = 49, n = 46, 93.9%) |
2. Compatibility (N = 59, n = 48, 81.4%) | 2. Relative Priority (N = 53, n = 44, 83.0%) |
3. Executing (N = 44, n = 26, 59.1%) | 3. Leadership Engagement (N = 50, n = 32, 64.0%) |
4. Available Resources (N = 57, n = 31, 54.4%) | 4. Self-efficacy (N = 51, n = 31, 60.8%) |
5. Adaptability (N = 62, n = 32, 51.6%) | |
6. Peer Pressure (N = 55, n = 28, 50.9%) | |
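The percentages in the summary table can be recomputed from the counts as a quick consistency check. The sketch below assumes that N is the total number of coded statements for a construct and n is the number of statements with the dominant sentiment (negative for barriers, positive for enablers). This reading is an assumption inferred from the table, and the names `SUMMARY` and `dominant_share` are hypothetical.

```python
# Counts copied from the summary table: construct -> (N, n, reported percentage).
# Assumption: N = all coded statements, n = statements with the dominant sentiment.
SUMMARY = {
    "Complexity": (56, 46, 82.1),
    "Compatibility": (59, 48, 81.4),
    "Executing": (44, 26, 59.1),
    "Available Resources": (57, 31, 54.4),
    "Adaptability": (62, 32, 51.6),
    "Peer Pressure": (55, 28, 50.9),
    "Tension for Change": (49, 46, 93.9),
    "Relative Priority": (53, 44, 83.0),
    "Leadership Engagement": (50, 32, 64.0),
    "Self-efficacy": (51, 31, 60.8),
}


def dominant_share(total, dominant):
    """Percentage of statements carrying the dominant sentiment, to one decimal."""
    return round(100 * dominant / total, 1)


if __name__ == "__main__":
    for construct, (total, dominant, reported) in SUMMARY.items():
        computed = dominant_share(total, dominant)
        flag = "ok" if computed == reported else "mismatch"
        print(f"{construct}: {computed}% (reported {reported}%) {flag}")
```

Under this reading, every reported percentage matches n divided by N. Peer Pressure and Adaptability sit only just above 50%, so their classification as barriers rests on a narrow majority of statements.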
Discussion
Strengths, limitations and recommendations
Implications
- Addressing data gaps: Access to imaging, MDT software or standardised reporting on pathology notifications would provide more accurate cancer staging data. The first recommendation is to discuss the project's findings with pathology providers, highlight the inconsistencies, and explore standardising cancer staging reporting in pathology reports.
- Standalone database schema for cancer staging: This would avoid the labour-intensive approvals required from IT departments and prevent overloading of existing work processes.
- Rapid-cycle evaluation: The visual grading system method with levels of implementation concern should be applied at regular intervals, so that rapid-cycle evaluation can continue at different time points, assess the implementation process in a timely manner, and capture adaptation as change occurs to help predict implementation success. This will provide further insights when the data are triangulated against the evidence on the influence of barriers and enablers.
- Additional cancer streams: Expansion of cancer staging using NLP/ML for additional common tumour groups.
- Standardising cancer staging: Contribute to and lead national discussions on standardised collection. The addition of staging data in the WACR output dataset would facilitate use locally, nationally and for research. Promoting the implementation progress of the WA Cancer Staging Project and the Cancer Staging Tiered Framework as an approach for standardisation may assist other registries within Australia and internationally to collect cancer staging data and enable cancer staging comparisons that are currently lacking.