Intraoperative imaging technology to maximise extent of resection for glioma: a network meta‐analysis

Summary of findings 1. iMRI image‐guided surgery compared to standard surgery for high‐grade glioma

Outcomes	Illustrative comparative risks* (95% CI)		Relative effect (95% CI)	No of participants (studies)	Quality of the evidence (GRADE)	Comments
iMRI image‐guided surgery compared to standard surgery for high‐grade glioma
Patient or population: high‐grade glioma Settings: specialist centres Intervention: iMRI image‐guided surgery (based on postoperative MRI) Comparison: standard surgery
	Assumed risk	Corresponding risk
	Control	iMRI image‐guided surgery
Extent of resection: incomplete resection	32^a per 100	4 per 100 (1 to 31)	RR 0.13 (0.02 to 0.96)	49 participants (1 study)	⊕⊝⊝⊝^b,c Very low	Small trial of highly selected participants with potential bias in allocation and performance. 1 other trial reported this outcome but did not contribute towards the analysis.
Adverse events	Inadequately and inconsistently reported in the trial				⊕⊝⊝⊝^d Very low	Adverse events were reported in an inconsistent manner and not according to the manner prespecified in our protocol.
Overall survival	Not estimable				⊕⊝⊝⊝^d Very low	Abstract publication only in 2017. 24 (83%) of 29 patients randomly allocated to the iMRI group and 21 (72.4%) of 29 controls were eligible for analysis of overall survival, reported as "iMRI itself did not affect outcome (560 vs. 624 days, p=0.53)". Unable to identify which 8 patients had metastasis (these were excluded from published trial in 2011).
Progression‐free survival	Not estimable				⊕⊝⊝⊝^d Very low	Progression‐free survival or time to progression was not adequately reported in the trial.
Quality of life	Not estimable				⊕⊝⊝⊝^d Very low	Quality of life was not reported in the trial.
The basis for the assumed risk* (e.g. the median control group risk across studies) is provided in footnotes. The corresponding risk (and its 95% confidence interval) is based on the assumed risk in the comparison group and the relative effect of the intervention (and its 95% CI). CI: confidence interval; iMRI: intraoperative magnetic resonance imaging; MRI: magnetic resonance imaging; RR: risk ratio.
GRADE Working Group grades of evidence High quality: further research is very unlikely to change our confidence in the estimate of effect. Moderate quality: further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate. Low quality: further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate. Very low quality: we are very uncertain about the estimate.
^aExpressed in terms of risk of incomplete resection (bad outcome). ^bSmall trial so quality of the evidence downgraded one level. ^cHighly selected participants with potential bias in allocation and performance as well as in other 'Risk of bias' domains, thus downgraded two levels. ^dOutcome was not reported (or inadequately reported for meaningful conclusions to be drawn), therefore giving lowest quality of evidence judgement.

Summary of findings 2. 5‐ALA image‐guided surgery compared to standard surgery for high‐grade glioma

Outcomes	Illustrative comparative risks* (95% CI)		Relative effect (95% CI)	No of participants (studies)	Quality of the evidence (GRADE)	Comments
5‐ALA image‐guided surgery compared to standard surgery for high‐grade glioma
Patient or population: high‐grade glioma Settings: specialist centres Intervention: 5‐ALA image‐guided surgery (based on postoperative MRI) Comparison: standard surgery
	Assumed risk	Corresponding risk
	Control	5‐ALA image‐guided surgery
Extent of resection: incomplete resection	64^a per 100	35 per 100 (27 to 45)	RR 0.55 (0.42 to 0.71)	270 participants (1 study)	⊕⊕⊝⊝^b Low	Highly selected participants with potential bias in allocation and performance.
Adverse events	Inadequately and inconsistently reported in the trial				⊕⊝⊝⊝^c Very low	Adverse events were reported in an inconsistent manner and not according to the manner prespecified in our protocol.
Overall survival	Not estimable due to reporting of HR and since just a single trial reported on this outcome we did not arbitrarily choose a time to use as a basis to calculate the assumed and corresponding risks as this may be misleading.		HR 0.82 (0.62 to 1.07)	270 participants (1 study)	⊕⊕⊝⊝^b Low	The overall quality of this outcome was low in this trial and was downgraded for highly selected participants with potential bias in allocation and performance.
Progression‐free survival	Inadequately reported or not assessed at all in the included trials				⊕⊝⊝⊝^c Very low	Progression‐free survival or time to progression was not adequately reported in the trial.
Quality of life	Inadequately reported or not assessed at all in the included trials				⊕⊝⊝⊝^c Very low	Quality of life was not reported in the trial.
The basis for the assumed risk* (e.g. the median control group risk across studies) is provided in footnotes. The corresponding risk (and its 95% confidence interval) is based on the assumed risk in the comparison group and the relative effect of the intervention (and its 95% CI). 5‐ALA: 5‐aminolevulinic acid; CI: confidence interval; HR: hazard ratio; MRI: magnetic resonance imaging; RR: risk ratio.
GRADE Working Group grades of evidence High quality: further research is very unlikely to change our confidence in the estimate of effect. Moderate quality: further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate. Low quality: further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate. Very low quality: we are very uncertain about the estimate.
^aExpressed in terms of risk of incomplete resection (bad outcome). ^bHighly selected participants with potential bias in allocation and performance as well as in other 'Risk of bias' domains, thus downgraded by two levels. ^cOutcome was not reported (or inadequately reported for meaningful conclusions to be drawn), therefore giving lowest quality of evidence judgement.
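The corresponding risks in the 'Summary of findings' tables follow from multiplying the assumed (control) risk by the risk ratio and by each of its confidence limits. The sketch below reproduces the extent of resection rows of Summary of findings 1 and 2 in Python. The function name `corresponding_risk` is illustrative only, and rounding to whole participants per 100 is an assumption about how the figures were presented.

```python
def corresponding_risk(assumed_per_100, rr, rr_low, rr_high):
    """Return (point, lower, upper) corresponding risk per 100 participants.

    Each value is the assumed control risk multiplied by the risk ratio
    (or one of its 95% confidence limits), rounded to a whole number.
    """
    return tuple(round(assumed_per_100 * x) for x in (rr, rr_low, rr_high))


# Summary of findings 1: assumed risk 32 per 100, RR 0.13 (0.02 to 0.96)
imri = corresponding_risk(32, 0.13, 0.02, 0.96)
# Summary of findings 2: assumed risk 64 per 100, RR 0.55 (0.42 to 0.71)
ala = corresponding_risk(64, 0.55, 0.42, 0.71)

print(imri)  # (4, 1, 31)
print(ala)   # (35, 27, 45)
```

Both results match the tabulated values of 4 per 100 (1 to 31) and 35 per 100 (27 to 45).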

Summary of findings 3. Neuronavigation image‐guided surgery compared to standard surgery for high‐grade glioma

Outcomes	Illustrative comparative risks* (95% CI)		Relative effect (95% CI)	No of participants (studies)	Quality of the evidence (GRADE)	Comments
Neuronavigation image‐guided surgery compared to standard surgery for high‐grade glioma
Patient or population: high‐grade glioma Settings: specialist centres Intervention: neuronavigation image‐guided surgery (based on postoperative MRI) Comparison: standard surgery
	Assumed risk	Corresponding risk
	Control	Neuronavigation image‐guided surgery
Extent of resection: incomplete resection	Not estimable	Not estimable	Not reported	45 participants (1 study)	⊕⊝⊝⊝^a,b,c Very low	Small study of highly selected participants at very high risk of allocation bias. Complete resection was achieved in 3 participants in the control group and 5 participants in the neuronavigation group. However, there was significant attrition, with not all participants completing imaging, and the denominators for these figures were not stated, precluding formal analysis.
Adverse events	Inadequately and inconsistently reported in the trial				⊕⊝⊝⊝^c Very low	Adverse events were reported in an inconsistent manner and not according to the manner prespecified in our protocol.
Overall survival	Not estimable				⊕⊝⊝⊝^d Very low	Not reported by trial authors so graded as very low‐quality evidence.
Progression‐free survival	Not estimable				⊕⊝⊝⊝^c Very low	Progression‐free survival or time to progression was not reported in the trial.
Quality of life	Inadequately reported or not assessed at all in the included trials				⊕⊝⊝⊝^d Very low	Quality of life was reported in the trial but only 19 participants (8 in the neuronavigation arm and 11 in the standard surgery arm) completed questionnaires postoperatively at 3 months, constituting only 64.5% of all eligible participants, and no statistical analysis was presented.
The basis for the assumed risk* (e.g. the median control group risk across studies) is provided in footnotes. The corresponding risk (and its 95% confidence interval) is based on the assumed risk in the comparison group and the relative effect of the intervention (and its 95% CI). CI: confidence interval.
GRADE Working Group grades of evidence High quality: further research is very unlikely to change our confidence in the estimate of effect. Moderate quality: further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate. Low quality: further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate. Very low quality: we are very uncertain about the estimate.
^aSmall trial so quality of the evidence downgraded by one level. ^b ^cHighly selected participants with potential bias in allocation and performance as well as in other 'Risk of bias' domains, thus downgraded by two levels. ^dOutcome was not reported (or inadequately reported for meaningful conclusions to be drawn), therefore giving lowest quality of evidence judgement.

Background

Description of the condition

Of all primary tumours of the central nervous system, gliomas are the second most common representing 26.0%. The vast majority of gliomas are diffuse astrocytic and oligodendroglial tumours characterised by their diffusely infiltrative behaviour through the brain parenchyma (Louis 2016). This group includes IDH‐mutant (isocitrate dehydrogenase) and wild‐type genetic classifications across histological subtypes of diffuse astrocytoma, anaplastic astrocytoma, oligodendroglioma, anaplastic oligodendroglioma, and glioblastoma. Glioblastoma has a dismal prognosis; median survival in the UK is 6.1 months with a five‐year survival of 3.4% (Brodbelt 2015). There is consensus for low‐grade diffusely infiltrating gliomas and glioblastoma that maximising extent of resection is associated with a more favourable prognosis. Increasingly, intraoperative imaging technologies are being utilised in an effort to maximise the likelihood of gross total resection.

Description of the intervention

There are multiple modalities offering intraoperative imaging technology to assist the neurosurgeon in achieving maximal safe resection of gliomas. Fluorescence‐guided surgery enables tumour tissue to be better visualised to maximise the probability of gross total resection of the enhancing component of gliomas. The most common fluorescent compound, 5‐aminolevulinic acid (5‐ALA), is a precursor of haemoglobin. Administered three hours before induction, it results in an accumulation of fluorescent porphyrins in mitotically active tissue. The fluorescence is visualised with ultraviolet light to identify neoplastic tissue intraoperatively in the surgical field and improve identification of resection margins (Stummer 1998; Stummer 2000). Because 5‐ALA is administered as a medication, there is an associated cost per use. Capital investment required includes a fluorescence filter in any operating theatre microscope. The relative uptake of 5‐ALA is dependent on the tumour characteristics, which limits its utility in accurately differentiating tumour and non‐tumour tissue.

One imaging technology that is currently variably used in the resection of glioma is intraoperative ultrasound (iUS), which relies on the different reflections of ultrasonic wave pulses caused by different tissue types, enabling the delineation of neuroanatomical structures including normal‐appearing cortex and brain tumour tissue. The ease with which images can be acquired using a hand‐held device enables continuous assessment by the surgeon as resection proceeds.

Furthermore, this relatively affordable technique can be combined with a third technology, neuronavigation, in order to assist the neurosurgeon in achieving gross total resection. Neuronavigation leverages optical or electromagnetic technology to allow the registration of preoperative imaging on the patient in theatre. Several points are matched between the preoperative scan and the patient in theatre prior to computational registration of the scan on the patient in theatre. Specialised equipment visible to the neuronavigation software can then be used to plan the incision and craniotomy, and the trajectory of targeted biopsies.

The combination of iUS and navigation can overcome one major limitation of neuronavigation, namely brain shift. This occurs when the cranial cavity is entered resulting in a shift of intracranial contents relative to the preoperative scan due to the change in intracranial pressure and removal of cerebrospinal fluid (CSF) and tumour tissue. Such a shift can be adjusted for using iUS with simultaneous registration of three‐dimensional (3D) iUS imaging on preoperative imaging. The software registers the position of the ultrasound probe and maps the 3D neuroanatomical image to allow adjustment of the preoperative imaging reflective of the brain shift that has occurred during the operation.

Finally, intraoperative magnetic resonance imaging (iMRI) involves the availability of either a nearby or portable magnetic resonance imaging (MRI) scanner for use in the operating theatre. MRI as an imaging technique involves creating a strong magnetic field, applying radiofrequency pulses, and analysing effects of this on the tissue of interest. Magnets of equivalent strength to those in traditional MRI scanners are available, offering clinically useful resolution to enable a real‐time intraoperative snapshot of the extent of tumour resection. Such a technique theoretically affords the possibility of immediate further resection during the same operative session (Black 1997). However, due to the need for a dedicated MRI scanner in the operating room, iMRI is associated with substantial capital costs both in terms of purchasing the scanner and its installation.

Guidance from the National Institute for Health and Care Excellence (NICE) in the UK was published in July 2018. With respect to intraoperative techniques, the use of 5‐ALA was recommended at initial surgery where the patient has radiologically suspected high‐grade glioma (HGG) and gross total resection of the enhancing component is possible, while other techniques (iMRI, iUS, diffusion tensor imaging, awake craniotomy) could be considered in low‐grade glioma (LGG) and HGG to maximise surgical resection while preserving neurological function (NICE 2018).

How the intervention might work

The purpose of all the above interventions is to maximise safe resection of the tumour which, in the case of LGG and HGG, has been associated with improved overall survival (OS) based on low‐quality evidence (Hart 2019; Jiang 2017). The extent of resection is one of the few modifiable factors demonstrably correlated with OS and, therefore, is an important subject of research. A further benefit of improved detection of tumour and non‐tumour tissue is the minimisation of damage to healthy brain tissue during the operation. In combination, these interventions can be used to maximise resection and improve prognosis and quality of life (QoL) for the patients.

Why it is important to do this review

The technologies described are not used in all cases in all centres and, prior to their introduction, were not subject to the same degree of scrutiny as new medical treatments including phase III studies. The capital costs associated with iMRI are substantial; an ability to compare a single or combination of technologies with alternatives is important to evaluate efficacy and cost‐effectiveness. Given the close relationship between achieving a greater extent of resection and the risk of surgical injury to healthy brain tissue, the associated risks of each technology and its effect on measures of QoL were evaluated.

One review published in 2018 identified four randomised controlled trials (RCTs) of low‐certainty evidence without network meta‐analysis (NMA) (Jenkinson 2018). This review performed an updated search for new evidence published since the publication of Jenkinson 2018, with presentation of previous studies with additional attempts at quantitative analysis to facilitate direct and indirect comparisons of intraoperative imaging technologies used in isolation or in combination.

Objectives

Methods

Criteria for considering studies for this review

Types of studies

Randomised controlled trials (RCTs).

Types of participants

Participants included in RCTs with presumed new or recurrent glial tumours of any location or histology as identified based on clinical examination and neuroimaging (computed tomography (CT) or MRI, or both). Participants must also have been eligible for randomisation to any intraoperative imaging modality.

Types of interventions

We compared the following interventions with each other or against the standard of care of conventional microsurgery with white light.

iMRI: defined as a portable or fixed scanner with the acquisition of MRI to evaluate extent of resection while the patient remained under anaesthesia.
5‐ALA: fluorescence‐guided surgery defined as the use of a compound to facilitate the intraoperative delineation of tumour and normal brain tissue to assist the surgeon in performing maximal resection of the tumour.
Neuronavigation: image guidance defined as using preoperative imaging to identify intracranial neuroanatomy using optical or electromagnetic technology. Could be integrated with iMRI or iUS (or both) to update imaging to account for brain shift and tumour tissue removed during the treatment.
iUS: either two‐dimensional (2D) or 3D imaging modality defined as the use of an ultrasound probe for the identification of neuroanatomical structures including residual tumour tissue for evaluation of extent of resection.

The interventions above are genuinely competing alternatives; in theory, an RCT comparing all the imaging techniques would be possible, with participants randomly allocated to any of the interventions in isolation or combination. Moreover, combinations of interventions are also possible, particularly the concomitant use of iUS and image guidance to correct preoperative imaging for brain shift.

Potential effect modifiers include the following.

iMRI: use of portable or fixed scanner, sequences used, use of intravenous contrast agents.
5‐ALA: fluorescence‐guided surgery: exact compound used, time of administration, dose given, microscopic technologies used to detect fluorescence.
Neuronavigation: manufacturer software, optical or electromagnetic technologies.
iUS: 2D or 3D projections.

Furthermore, as this review included both newly diagnosed and recurrent glial tumours in any location, the degree to which studies could have been merged on the network depended on the participants included and the interventions used, with splitting of the network to optimise transitivity when required.

Types of outcome measures

Primary outcomes

Extent of resection: defined as the proportion of tumour tissue removed based on postoperative MRI. Results presented as an absolute volume of resection, percentage resection, and categorical results (gross total resection, subtotal resection, biopsy).
Adverse events (AEs): defined as need for unplanned additional procedures or development of complications including wound haematoma or infection, CSF leak, cerebral oedema, new or worsening focal neurological deficits or seizures, and general medical complications including thromboembolic disease or non‐surgical site infection.

Secondary outcomes

Overall survival (OS): defined as length of time from randomisation to death from any cause.
Progression‐free survival (PFS): defined as length of time from randomisation in RCT to tumour progression based on the RANO (Response Assessment in Neuro‐Oncology) criteria consensus of imaging features of the contrast‐enhancing and T2‐weighted‐fluid‐attenuated inversion recovery (T2/FLAIR) non‐enhancing component, new lesions, clinical deterioration not attributable to another cause, death from any cause, or other clear progression of unmeasurable disease (Wen 2010).
Quality of life (QoL): defined based on validated measures for people with glioma including but not limited to the EORTC QLQ‐C30 and BN20 (European Organisation for Research and Treatment of Cancer Quality of Life assessment specific to brain neoplasms) questionnaires, and FACT‐BrS (Functional Assessment of Cancer Therapy – Brain subscale) (Dirven 2014; Fountain 2016).

Our 'Summary of findings' tables (Summary of findings 1; Summary of findings 2; Summary of findings 3) reported the following outcomes:

Extent of resection
AEs
OS
PFS
QoL

Decisions on the certainty of the evidence for each outcome followed the most recent recommendations and guidelines (Brignardello‐Petersen 2018; Brignardello‐Petersen 2019a; Brignardello‐Petersen 2019b; Puhan 2014).

Search methods for identification of studies

Non‐English language journals were eligible for inclusion.

Electronic searches

We searched the following databases to 19 May 2020:

the Cochrane Central Register of Controlled Trials (CENTRAL; 2020, Issue 5), in the Cochrane Library;
MEDLINE via Ovid (1946 to May week 2 2020);
Embase via Ovid (1980 to 2020 week 20).

We presented our search strategies for CENTRAL (Appendix 1), MEDLINE (Appendix 2), and Embase (Appendix 3). For databases other than MEDLINE, we adapted the search strategies accordingly.

We searched the references of all identified studies for additional eligible studies for the review.

Searching other resources

We undertook a handsearch of the Journal of Neuro‐oncology and Neuro‐oncology from 1990 to 2019 to identify trials that may not have been included in the electronic databases. This search included all conference abstracts published in these journals.

Personal communications

We contacted principal investigators of ongoing relevant trials regarding available data for potential inclusion. For completion of the previous review in 2018, neuro‐oncology experts were contacted to obtain information on current or pending RCTs.

Data collection and analysis

Selection of studies

The selection of studies for the 2018 review is described in Jenkinson 2018.

For the updated search, following automated deduplication of results, two review authors (DGB and MW) independently screened titles and abstracts, and assessed them based on inclusion and exclusion criteria. We utilised the Cochrane author support tool Covidence for title and abstract screening. We undertook full‐text screening of all eligible studies at this stage and further examined them against inclusion and exclusion criteria. We identified disagreements and resolved them through discussion.

Where studies had multiple publications, we collated the reports of the same trial so that each trial, rather than each report, was the unit of interest for the review, and such trials had a single identifier with multiple nested references.

Data extraction and management

Data extraction and management of studies included in the 2018 review are described in Jenkinson 2018.

For any additionally identified studies, two review authors (DMF and MW) independently extracted data into a pre‐prepared database designed based on an initial pilot of three studies. Where sufficient data were not available from the published paper or additional supplementary material, we contacted authors to request relevant data for completion of the database for each study. We identified differences in the extracted data between review authors and resolved them through discussion. The database included the following fields.

Participant characteristics: age, sex, performance status based on Karnofsky performance score (KPS; Table 1) or World Health Organization score (WHO; Table 2), tumour location, contrast enhancement, tumour histology, tumour mutation status, and methylation status.
Trial characteristics: inclusion and exclusion criteria, randomisation methods and stratification, allocation concealment (if applicable), blinding (of who and when), and statistics. Definitions identified included extent of resection, progression, and AEs.
Interventions: iMRI field strength, imaging sequences, use of contrast, and reporting methods. iUS brand and operator experience, neuronavigation imaging sequences and brand, 5‐ALA dose and timing of administration, use with a microscope.
Outcomes: methods to calculate and measure extent of resection, OS, PFS, and QoL.
Risk of bias in each study.
Duration of follow‐up.

Table 1. Karnofsky performance score

Score	Definition
100	Normal, no complaints, no evidence of disease
90	Able to carry on normal activity: minor symptoms of disease
80	Normal activity with effort: some symptoms of disease
70	Cares for self: unable to carry on normal activity or active work
60	Requires occasional assistance but is able to care for needs
50	Requires considerable assistance and frequent medical care
40	Disabled: requires special care and assistance
30	Severely disabled: hospitalisation is indicated, death is not imminent
20	Very sick, hospitalisation is necessary: active treatment is necessary
10	Moribund, fatal processes are progressing rapidly
0	Dead

See: Summary of findings 1 iMRI image‐guided surgery compared to standard surgery for high‐grade glioma; Summary of findings 2 5‐ALA image‐guided surgery compared to standard surgery for high‐grade glioma; Summary of findings 3 Neuronavigation image‐guided surgery compared to standard surgery for high‐grade glioma

Table 2. WHO performance score

Grade	Definition
0	Fully active, able to carry on all predisease performance without restriction
1	Restricted in physically strenuous activity but ambulatory and able to carry out work of a light or sedentary nature, e.g. light house work, office work
2	Ambulatory and capable of all self‐care, but unable to carry out any work activities. Up and about > 50% of waking hours
3	Capable of only limited self‐care, confined to bed or chair > 50% of waking hours
4	Completely disabled. Cannot carry out any self‐care. Totally confined to bed or chair
5	Dead

WHO: World Health Organization.

We produced Characteristics of included studies and Characteristics of excluded studies tables.

We extracted data as follows.

For dichotomous outcomes (e.g. extent of resection), we extracted the number of participants in each treatment arm who experienced the outcome of interest and the number of participants assessed at endpoint, in order to estimate a risk ratio (RR) and 95% confidence intervals (CI).
For continuous outcomes (e.g. QoL measures), we extracted the final value and standard deviation of the outcome of interest and the number of participants assessed at endpoint in each treatment arm at the end of follow‐up, in order to estimate the mean difference between treatment arms and its standard error.
For time to event data (OS and PFS), we extracted the log of the hazard ratio (log(HR)) and its standard error from trial reports. If these were not reported, we attempted to estimate the log (HR) and its standard error using the methods of Parmar 1998.
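The three extraction rules above each lead to a standard effect estimate. As a rough illustration, the Python sketch below computes a risk ratio with a 95% CI on the log scale, a mean difference with its standard error, and a log(HR) with a standard error recovered from a reported 95% CI. The function names are illustrative and do not come from the review. Recovering the standard error from a confidence interval is only one of the approaches available when the log(HR) is not reported directly; it is shown here as an assumption, not as the exact procedure the authors followed.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% interval


def risk_ratio(events_trt, n_trt, events_ctl, n_ctl):
    """Risk ratio and 95% CI on the log scale (needs non-zero event counts)."""
    rr = (events_trt / n_trt) / (events_ctl / n_ctl)
    se_log = math.sqrt(1 / events_trt - 1 / n_trt + 1 / events_ctl - 1 / n_ctl)
    low = math.exp(math.log(rr) - Z_95 * se_log)
    high = math.exp(math.log(rr) + Z_95 * se_log)
    return rr, low, high


def mean_difference(mean_trt, sd_trt, n_trt, mean_ctl, sd_ctl, n_ctl):
    """Mean difference between arms and its standard error."""
    md = mean_trt - mean_ctl
    se = math.sqrt(sd_trt ** 2 / n_trt + sd_ctl ** 2 / n_ctl)
    return md, se


def log_hr_from_ci(hr, ci_low, ci_high):
    """log(HR) and its standard error recovered from a reported 95% CI."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)
    return math.log(hr), se


# Hypothetical counts, for illustration only: 10/100 versus 20/100 events.
print(risk_ratio(10, 100, 20, 100))
# Overall survival HR from Summary of findings 2: 0.82 (0.62 to 1.07).
print(log_hr_from_ci(0.82, 0.62, 1.07))
```

Working on the log scale keeps the confidence interval for a ratio measure asymmetric around the point estimate, which is why the limits are exponentiated only at the end.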

Where possible, all data were extracted relevant to an intention‐to‐treat analysis in which participants are analysed in the groups to which they were assigned.

The time points at which outcomes were collected and reported were noted.

Assessment of risk of bias in included studies

Assessment of risk of bias in studies included in the 2018 review is as described in Jenkinson 2018.

For the NMA, two review authors (DMF and MW) provided independent critical appraisal, with any differences identified and resolved through discussion. We assessed risk of bias in all included RCTs in accordance with the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2019). Types of bias considered included selection, performance, detection, attrition and reporting bias. The additional influence of recorded risk of bias on the transitivity assumption in any network meta‐analyses was assessed.

Measures of treatment effect

We measured the level of incoherence where possible (Chapter 11; Higgins 2019), with incoherence factors calculated and statistically tested. We determined treatment effect measurements based on the type of data collected:

continuous data: extent of resection, QoL;
time‐to‐event data: OS, PFS;
dichotomous data: extent of resection.
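The incoherence check mentioned above compares a direct estimate with an indirect one for the same pair of treatments. The Python sketch below is one simple way to express this, assuming an adjusted indirect comparison through a common comparator and a normal approximation for the test. All function names and input values are hypothetical, and the review does not specify that this exact calculation was used.

```python
import math


def indirect_estimate(d_ac, var_ac, d_bc, var_bc):
    """Indirect A versus B effect via common comparator C (log scale).

    The effects subtract and the variances add.
    """
    return d_ac - d_bc, var_ac + var_bc


def incoherence(direct, var_direct, indirect, var_indirect):
    """Incoherence factor |direct - indirect| with a two-sided z-test."""
    diff = abs(direct - indirect)
    se = math.sqrt(var_direct + var_indirect)
    z = diff / se
    p_value = math.erfc(z / math.sqrt(2))  # equals 2 * (1 - Phi(z))
    return diff, se, z, p_value


# Hypothetical log risk ratios and variances, for illustration only.
ind, var_ind = indirect_estimate(-0.6, 0.04, -0.1, 0.05)
print(incoherence(-0.3, 0.07, ind, var_ind))
```

A large incoherence factor relative to its standard error would suggest that direct and indirect evidence disagree, which bears on whether the transitivity assumption is plausible.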

Unit of analysis issues

We did not anticipate any unit of analysis issues. All network meta‐analyses were planned using Stata (Stata), and this software can deal with issues such as the inclusion of any multi‐arm trials (Chaimani 2017) (i.e. adjust for the correlation between the effect sizes in the NMA).

Dealing with missing data

In the case of missing data required for review outcomes, we contacted study authors as needed. We did not impute missing outcome data.

Assessment of heterogeneity

In the first instance, we decided whether or not included trials were sufficiently clinically and methodologically similar to do a pair‐wise analysis. We then assessed the transitivity assumption based on inclusion criteria and, if deemed sufficiently similar, considered an NMA (Salanti 2014).

If trials appeared similar enough to include, we aimed to further assess transitivity and subsequently heterogeneity between studies by visually inspecting forest plots. Clinical heterogeneity was assessed based on participant characteristics, trial characteristics, and interventions to judge directness. For any pair‐wise analyses in the review, we aimed to report the I² value and interpret it according to guidelines reported in Section 9.5.2 of the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2019).

We planned to calculate the individual Q statistic for heterogeneity as part of obtaining direct treatment effects in the NMA. We then planned to calculate a Q statistic for inconsistency for global assessment across the NMA based on network estimates for direct and indirect comparisons (Efthimiou 2016). We aimed to report the standard deviation (tau) of the between‐study heterogeneity as outlined in Chapter 11 of Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2019).
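For a single pair‐wise comparison, the Q statistic, I² and tau can be computed from the study effect estimates and their variances. The Python sketch below uses the DerSimonian and Laird moment estimator for the between‐study variance, which is an assumption: the review names Stata but does not state which estimator was planned. The function name and the input values are hypothetical.

```python
import math


def heterogeneity(effects, variances):
    """Cochran's Q, I-squared (%) and tau for one pair-wise comparison."""
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    # Fixed-effect (inverse variance) pooled estimate
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    # DerSimonian-Laird moment estimate of the between-study variance
    c = sum_w - sum(w * w for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    return {"Q": q, "df": df, "I2": i2, "tau": math.sqrt(tau2)}


# Two hypothetical studies with effects 0 and 2 and unit variances.
print(heterogeneity([0.0, 2.0], [1.0, 1.0]))
```

With only a handful of trials, as in this review, such statistics are estimated very imprecisely, so they are better read as descriptive than as a formal test.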

Assessment of reporting biases

For the meta‐analysis or NMA, or both, we aimed to construct funnel plots of treatment effect versus precision to investigate the likelihood of publication bias. If these plots suggested that treatment effects may not be sampled from a symmetrical distribution, as assumed by the random‐effects model, we intended to perform additional meta‐analyses using the fixed‐effect model.

Data synthesis

Data synthesis of studies included in the 2018 review is described in Jenkinson 2018.

For the NMA, previously extracted and synthesised data were reviewed and updated if necessary. We planned to generate network plots to demonstrate which direct comparisons the included RCTs had made, with separate network plots for each prespecified outcome as available in the collected data.

For any dichotomous outcomes, we intended to calculate the RR for each study and pool these.
For continuous outcomes, we intended to pool the mean differences between the treatment arms at the end of follow‐up if all trials measured the outcome on the same scale, otherwise we intended to pool standardised mean differences.
For time‐to‐event data, we intended to pool HRs using the generic inverse variance facility of Review Manager 5 (Review Manager 2014).

We intended to carry out all meta‐analyses in a frequentist framework using Stata using random‐effects models with inverse variance weighting (Stata). We intended to make appropriate decisions about any variability in the interventions and justify all comparisons and, if necessary, split the same interventions in the network if sufficiently different (e.g. measured in a different way).
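Random‐effects pooling with inverse variance weighting can be illustrated outside Stata. In the Python sketch below, each study is weighted by the inverse of its within‐study variance plus a between‐study variance `tau2`, which is taken as given. The function name and the inputs are hypothetical, and the analysis planned in Stata may differ in detail.

```python
import math


def random_effects_pool(effects, variances, tau2):
    """Pool study effects with weights 1 / (within-study variance + tau2).

    Returns the pooled effect, its standard error and a 95% CI. For ratio
    measures, pass log RR or log HR and exponentiate the results.
    """
    weights = [1.0 / (v + tau2) for v in variances]
    sum_w = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum_w
    se = math.sqrt(1.0 / sum_w)
    return pooled, se, pooled - 1.96 * se, pooled + 1.96 * se


# Two hypothetical log risk ratios with unit variances and tau2 = 1.
pooled, se, low, high = random_effects_pool([0.0, 2.0], [1.0, 1.0], tau2=1.0)
print(math.exp(pooled), math.exp(low), math.exp(high))
```

Setting `tau2` to zero reduces the calculation to the fixed‐effect model, which makes the two models easy to compare.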

From each NMA, we intended to report the Surface Under the Cumulative Ranking curve Area (SUCRA) and mean rank statistics to accompany the effect estimates to aid in the interpretation of selecting the most effective imaging technique (Rücker 2015; Salanti 2011). We intended to examine trade‐offs between the different outcomes and interpret all findings in light of risk of bias and GRADE profile for each outcome (GRADE Working Group 2004; Meader 2014).
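The ranking statistics can be sketched as follows. SUCRA is the average of the cumulative rank probabilities over the first a − 1 ranks, where a is the number of treatments, and it relates to the mean rank by SUCRA = (a − mean rank)/(a − 1). The rank probabilities below are hypothetical; no ranking was produced in this review.

```python
def sucra(rank_probs):
    """rank_probs[k] is the probability that the treatment has rank k + 1
    (rank 1 = best). Returns SUCRA in the range 0 to 1."""
    n_treatments = len(rank_probs)
    cumulative = 0.0
    total = 0.0
    for p in rank_probs[:-1]:  # cumulative probabilities for ranks 1 .. a-1
        cumulative += p
        total += cumulative
    return total / (n_treatments - 1)

def mean_rank(rank_probs):
    """Expected rank of the treatment (1 = best)."""
    return sum((k + 1) * p for k, p in enumerate(rank_probs))

# Hypothetical rank probabilities for one of four treatments
example_probs = [0.50, 0.30, 0.15, 0.05]
print(f"SUCRA = {sucra(example_probs):.3f}, mean rank = {mean_rank(example_probs):.2f}")
```

A SUCRA of 1 means the treatment is certain to be the best and 0 that it is certain to be the worst, so the statistic should be read alongside the effect estimates and their certainty, not in place of them.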

Subgroup analysis and investigation of heterogeneity

Owing to differences in prognosis, we intended to perform subgroup analyses according to tumour type, including:

HGG;
LGG;
primary versus recurrent disease in HGG and primary disease versus disease progression in LGG.

Sensitivity analysis

We planned to perform a sensitivity analysis to investigate how trial quality affected robustness of findings. We also planned a subsequent sensitivity analysis of trials that included objective blinded early postoperative MRI and histology in their assessment of extent of resection.

Summary of findings and assessment of the certainty of the evidence

We used the GRADE approach to assess confidence in estimates of effect (certainty of evidence) associated with specific comparisons, including estimates from direct, indirect, and final NMA (Brignardello‐Petersen 2018; Puhan 2014; Salanti 2014). Our confidence assessments addressed risk of bias (limitations in study design and execution); inconsistency (heterogeneity of estimates of effects across trials); indirectness (differences in population, interventions, or outcomes to the target of the intended meta‐analysis); and imprecision (e.g. 95% CIs are wide and include or are close to null effect). Limitations in these domains could have resulted in a decrease of the certainty of evidence from high to moderate, low, or very low certainty by –1 (serious concern) or –2 (very serious concern). We based indirect evidence on the most dominant loops (i.e. the shortest path between two treatments) and potentially rated it down for intransitivity (differences in study characteristics that may modify treatment effect in the direct comparisons along the path). We intended to obtain the final NMA confidence rating from the higher of the direct and indirect rating excluding imprecision and intended to rate it down for imprecision and incoherence (difference between direct and indirect estimates). We justified all decisions using footnotes and intended to make comments to aid readers' understanding of the review where necessary.
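The rating arithmetic described above can be sketched as follows. This is a simplified illustration and not a GRADE tool: the function names and the example judgements are hypothetical, and real GRADE assessments involve judgement rather than mechanical subtraction.

```python
LEVELS = ["very low", "low", "moderate", "high"]

def rate_down(start, downgrades):
    """Lower a certainty rating by the summed downgrades
    (1 = serious concern, 2 = very serious concern), stopping at 'very low'."""
    index = LEVELS.index(start) - sum(downgrades.values())
    return LEVELS[max(index, 0)]

def nma_rating(direct, indirect, imprecision=0, incoherence=0):
    """Take the higher of the direct and indirect ratings (both assessed
    without imprecision), then rate down for imprecision and incoherence."""
    best = max(direct, indirect, key=LEVELS.index)
    return rate_down(best, {"imprecision": imprecision, "incoherence": incoherence})

# Hypothetical judgements for a single comparison
direct = rate_down("high", {"risk of bias": 1})
indirect = rate_down("high", {"risk of bias": 1, "intransitivity": 1})
print(direct, indirect, nma_rating(direct, indirect, imprecision=1))
```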

Results

Description of studies

See Characteristics of included studies, Characteristics of excluded studies, and Characteristics of ongoing studies tables.

Results of the search

The updated literature search to May 2020 identified 425 records: CENTRAL, 117 references; MEDLINE, 87 references; Embase, 221 references. Following preliminary deduplication across the databases, the combined total was 383 references.

We utilised the Cochrane author support tool Covidence for title and abstract screening of the 383 references.

Two review authors (DMF and MW) independently examined the remaining 20 potential studies. We excluded those studies that clearly did not meet the inclusion criteria, and obtained full‐text copies of five potentially relevant references. Subsequently, no additional studies were identified for inclusion (Figure 1; Figure 2).

Figure 1

Study flow diagram.

Figure 2

Study flow diagram.

We subsequently re‐reviewed all ongoing studies identified in the 2018 review (Jenkinson 2018) and updated the list with additional newly identified ongoing studies (NCT01811121 (RESECT), NCT02150564, NCT03291977 (FLEGME), NCT03531333). One author with an ongoing trial has published several interim analyses (NCT01479686), but the final results remain unpublished. Consequently, this trial was not included in this version of the review.

Included studies

The four included studies are Kubben 2014, Senft 2011, Stummer 2006, and Willems 2006. For the purposes of consideration for an NMA, they are redescribed in detail below and in the Characteristics of included studies table.

In summary, we identified two trials of iMRI (Kubben 2014; Senft 2011), one trial of fluorescence‐guided surgery (Stummer 2006), and one trial of neuronavigation (Willems 2006). We found no eligible studies of ultrasound‐guided surgery.

Intraoperative magnetic resonance imaging

Kubben 2014 recruited 14 participants from multiple centres in Belgium and the Netherlands between 2010 and 2012. Participants had to have a supratentorial brain tumour suspected to be a glioblastoma and an indication for gross total resection. The trial compared surgery with iMRI versus surgery without iMRI (of which either arm could have included neuronavigation). Outcomes were residual tumour volume, complications, QoL (EORTC QLQ‐C30 questionnaire with BN20 brain tumour module, and European Quality of Life‐5 Dimensions (EQ‐5D) questionnaire), and OS. The final results were initially supposed to be an interim analysis, but ultimately the trial was stopped early thereafter. This unplanned interim analysis was not specified a priori, and as a consequence the sample size would not have taken this into account even if the trial had been fully completed. The size of the trial and circumstances around its early completion are reflected in the 'Risk of bias' assessment and GRADE profile (see below).

Senft 2011 recruited 58 participants from a single German neurosurgical centre between 2007 and 2010. Participants had to have a known or suspected glioma that was contrast enhancing and amenable to complete resection. The trial compared surgery with iMRI versus surgery without iMRI (of which either arm could have included neuronavigation). The primary outcome was extent of resection. Secondary outcomes were volume of residual tumour on postoperative MRI, PFS at six months, duration of surgery, and treatment‐related morbidity.

Fluorescence‐guided surgery

Stummer 2006 recruited 322 participants from multiple centres in Germany between 1999 and 2004. Participants had to have a malignant glioma on imaging. The trial compared surgery with 5‐ALA versus surgery without 5‐ALA (of which either arm could have included neuronavigation). Primary outcomes were complete tumour resection on MRI (< 72 hours' postoperation and > 1.5 T) and PFS. Secondary outcomes were residual tumour volume, OS, type and severity of neurological deficits after surgery, and toxic effects.

Neuronavigation

Willems 2006 recruited 45 participants from a single Dutch centre between 1999 and 2002. Participants had to have a solitary intracerebral space‐occupying lesion with (partial) contrast enhancement eligible for surgery with the intention of gross total resection. The trial compared surgery with neuronavigation versus surgery without neuronavigation. Primary outcomes were extent of resection and survival. Secondary outcomes were procedure duration, usefulness of neuronavigation, extent of resection, QoL (EORTC QLQ‐C30 questionnaire with BN20 brain tumour module), and postoperative course (including neurological status and AEs).

Excluded studies

We excluded 32 studies, as follows (see Characteristics of excluded studies).

Five studies were duplicates between searches.
Nine studies were classified as ongoing (see Characteristics of ongoing studies).
Twelve studies were not RCTs (Czyz 2011; Golub 2020; Koc 2008; Stepp 2007; Wu 2003; Wu 2004; Wu 2007; Zhang 2015; Abraham 2018; Abraham 2019; Wadhwa 2019; Waqas 2018).
Three were only presented as abstracts and we were unable to obtain sufficient information even after attempting correspondence with the original trial authors (Chen 2011; Chen 2012; Seddighi 2016).
Three did not directly compare an intraoperative imaging intervention with either another intraoperative imaging intervention or standard surgery (Eljamel 2008; Rohde 2011; Stummer 2017).

All ongoing studies were reviewed, with further written communication in relation to one ongoing RCT performed on 29 June 2020 (NCT01479686). Due to its current status as an active ongoing trial with planned publication of the final report at the time of writing this review, this study is listed as ongoing and will be considered when results are available.

Risk of bias in included studies

Summary data for risk of bias are presented in Figure 3 and Figure 4. A detailed description is provided below and in the Characteristics of included studies table.

Figure 3

Risk of bias graph: review authors' judgements about each risk of bias item presented as percentages across all included studies.

Figure 4

Risk of bias summary: review authors' judgements about each risk of bias item for each included study.

Allocation

Randomisation methods

Randomisation methods were described and were satisfactory in all four included trials, for a judgement of low risk of bias (Kubben 2014; Senft 2011; Stummer 2006; Willems 2006).

Allocation concealment

We assessed one trial in which allocation concealment was potentially inadequate (i.e. sealed envelopes) and judged at high risk of bias (Senft 2011), one trial at low risk of bias (Stummer 2006), and the remaining two trials at unclear risk of bias (Kubben 2014; Willems 2006).

Blinding

Three trials performed blinded assessment for extent of resection (Kubben 2014; Senft 2011; Stummer 2006), and one trial for histological assessment (Stummer 2006). Regarding OS, blinding would not affect outcome reporting but could affect subsequent treatment. For QoL, PFS, and AEs, blinding would likely affect the outcomes reported. None of the trials blinded participants or clinicians.

Incomplete outcome data

One trial accounted for all participants (Kubben 2014). Two trials accounted for all participants, but did not perform an intention‐to‐treat analysis, as those participants that had alternative pathological diagnoses were excluded (Senft 2011; Stummer 2006). In the remaining trial there was evidence of attrition bias for extent of resection (analysis of 32/42 participants) (Willems 2006).

Selective reporting

One trial reported all outcomes and was, therefore, at low risk of reporting bias (Senft 2011). Selective outcome reporting was apparent in three trials: one trial did not report QoL outcomes (Kubben 2014); one trial did not report full outcome data in the form of figures and appropriate statistics for survival, PFS, and AEs for 5‐ALA (Stummer 2006); and one trial did not present full data for survival, QoL, or AEs (Willems 2006). AE data in all studies were particularly poorly reported in terms of total number of events, number of participants with multiple events, and timing of events.

Other potential sources of bias

One of the issues with iMRI is attribution bias. Because surgeons know they can check for residual disease, they do not operate as aggressively as they might if they could not check for residual disease during the operation. So when a scan is done, residual disease is more likely to be detected, removed, and the success of the removal attributed to the iMRI. This is likely to affect outcomes that report a difference between the first intraoperative and final postoperative MRI scans.

Early cessation of trial

All four trials were stopped early, three of them based on the results of interim analyses. Kubben 2014 was stopped early based on the results of an interim analysis not specified a priori. Given the low number of participants involved, we excluded this trial from quantitative analysis. Senft 2011 was stopped early based on the results of an interim analysis not specified a priori. Significance values were consequently adjusted (a P < 0.04 was subsequently regarded as significant). Stummer 2006 was stopped early based on the results of a scheduled interim analysis with compensated power calculation. Willems 2006 was stopped early, but no reason was given.

Industry sponsorship

Industry sponsorship was apparent in three trials. Kubben 2014 was financially supported by Medtronic Navigation, but the sponsors "were not involved in writing the protocol, had no access to the data, was not involved in writing the manuscript, and had no veto right for submission." Senft 2011 included authors who had received an honorarium from Medtronic (who manufactured the iMRI machine used in the study), although it was emphasised that the study received no funding from Medtronic. Stummer 2006 was sponsored by medac GmbH (who manufacture Gliolan), which was involved in the study design, quality assurance, and quality control but had no role in the interpretation of data, and the corresponding author had final responsibility for the article (although the author was a paid consultant to both medac GmbH and Zeiss, which manufactures the microscopes used for 5‐ALA). One trial did not state if there were conflicts of interest (Willems 2006).

Effects of interventions

Due to considerable heterogeneity across trials and the small number of included trials (four), we did not conduct an NMA or any pair‐wise meta‐analyses. Consequently, we did not conduct any subgroup or sensitivity analyses or investigate publication bias by constructing funnel plots.

Two trials assessed iMRI (Kubben 2014; Senft 2011).
One trial assessed 5‐ALA (Stummer 2006).
One trial assessed neuronavigation (Willems 2006).

Extent of resection

Extent of resection was reported as proportion with incomplete resection in each arm. The RR for the extent of resection in participants with glioma favoured the experimental arms in two of the four trials reporting this outcome, indicating a lower risk of having an incomplete resection with the intervention:

iMRI: Senft 2011 achieved complete tumour resection in 23/24 (96%) participants in the intervention group compared with 17/25 (68%) participants in the control group (RR for incomplete resection 0.13, 95% CI 0.02 to 0.96; very low‐quality evidence). Kubben 2014 reported tumour resection using residual tumour volume and data for complete tumour resections were not available.
5‐ALA: Stummer 2006 performed complete resection in 90/139 (65%) participants in the intervention group versus 47/131 (36%) participants in the control group (RR for incomplete resection 0.55, 95% CI 0.42 to 0.71; low‐quality evidence).
Neuronavigation: Willems 2006 achieved complete resection in three participants in the control group and five participants in the neuronavigation group. However, there was significant attrition, with not all participants having complete imaging, and the denominators for these figures were not stated, precluding meta‐analysis (very low‐quality evidence).
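The RRs for incomplete resection can be reproduced from the counts reported above. The sketch below uses the standard large‐sample formula for the standard error of a log risk ratio with a Wald 95% CI, which is an assumption about how the published intervals were derived; the function name is hypothetical. Incomplete resections are taken as participants without a complete resection.

```python
import math

def risk_ratio(events_int, n_int, events_ctrl, n_ctrl, z=1.959964):
    """Risk ratio (intervention versus control) with a Wald 95% CI
    calculated on the log scale."""
    rr = (events_int / n_int) / (events_ctrl / n_ctrl)
    se_log = math.sqrt(1 / events_int - 1 / n_int + 1 / events_ctrl - 1 / n_ctrl)
    low = math.exp(math.log(rr) - z * se_log)
    high = math.exp(math.log(rr) + z * se_log)
    return rr, low, high

# Senft 2011: complete resection in 23/24 (iMRI) versus 17/25 (control)
senft = risk_ratio(24 - 23, 24, 25 - 17, 25)
# Stummer 2006: complete resection in 90/139 (5-ALA) versus 47/131 (control)
stummer = risk_ratio(139 - 90, 139, 131 - 47, 131)

for name, (rr, low, high) in [("Senft 2011", senft), ("Stummer 2006", stummer)]:
    print(f"{name}: RR {rr:.2f} (95% CI {low:.2f} to {high:.2f})")
```

Rounded to two decimal places, these calculations agree with the estimates reported above for both trials. The wide interval for Senft 2011 reflects the single incomplete resection in the iMRI arm.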

Adverse events

Reporting of AEs was inconsistent between trials and not according to the prespecified manner required in our protocol (Fountain 2020). Specifically, data were not available for participants at risk, participants with multiple events, timing of events, and outcomes of events. Therefore, we adopted a descriptive method using the data available to describe the AEs in each trial.

iMRI: in the trial of Senft 2011, new or aggravated neurological deficits were present in 2/25 (8%) participants in the control group and 3/24 (13%) participants in the iMRI group; intraoperative imaging did not lead to continuation of tumour resection in any of the participants with AEs. Two participants had symptomatic haematomas, which were not attributable to the use of iMRI. In one participant, hemianopia was deliberately accepted due to tumour extension around the temporal horn of the lateral ventricle involving the optic radiation. In the Kubben 2014 trial, one participant in the intervention arm experienced a postoperative haemorrhage.
5‐ALA: AEs were present in 58.7% of the intervention arm versus 57.8% of the control arm in Stummer 2006. Neurological AEs were present in 42.8% of the intervention arm (7.0% grade 3 to 4) and 44.5% of the control arm (5.2% grade 3 to 4). There were significant neurological AEs in 12.4% of the intervention arm versus 11.6% of the control arm. The number of participants with a deterioration in the National Institutes of Health Stroke Scale compared to baseline tended to be higher in the intervention arm at 48 hours (26.2% with 5‐ALA versus 14.5% with control) but not at seven days (20.5% with 5‐ALA versus 10.7% with control), six weeks (17.1% with 5‐ALA versus 11.3% with control), and three months (19.6% with 5‐ALA versus 18.6% with control). No denominators were given for each result, preventing calculation of RRs and CIs.
Neuronavigation: new or worsened neurological deficits were present at three months in 45.5% of participants in the control group and 18.2% of participants in the neuronavigation group in the Willems 2006 trial. During the first three months after surgery, seven participants (31.8%) in the control group and seven (30.4%) in the neuronavigation group experienced a new, non‐neurological AE. In three participants in the neuronavigation group, these events were fatal (pulmonary embolism, cardiac arrest with pulseless electrical activity, and postoperative pulmonary insufficiency). Other AEs included pulmonary or urinary tract infection, surgical removal of an epidural haematoma, surgical cyst drainage, repeated tumour debulking, CSF leakage, postoperative delirium, and insufficiently treated steroid‐induced diabetes.

Overall survival

iMRI: Senft 2011 did not assess OS in the initial publication, although an abstract published in 2017 included 24/29 (83%) of patients allocated to the iMRI group and 21/29 (72.4%) of controls and reported that "iMRI itself did not affect outcome (560 versus 624 days, P = 0.53)". Kubben 2014 did not report OS in the prespecified manner for inclusion.
5‐ALA: in Stummer 2006, there was no difference in OS between the intervention and control arms (HR 0.82, 95% CI 0.62 to 1.07). Median survival was also reported and this was 15.2 months (95% CI 12.9 to 17.5) in the intervention arm versus 13.5 months (95% CI 12.0 to 14.7) in the control arm.
Neuronavigation: Willems 2006 reported the HR to be 1.6; however, no CIs were available or could be calculated. The median survival time was reported to be nine months in the control arm and 5.6 months in the intervention arm.

Progression‐free survival or time to progression

iMRI: In Senft 2011, HRs or their respective CIs were not available and could not be calculated. The median PFS in the intervention arm was 226 days (95% CI 0.0 to 454) versus 154 days (95% CI 60 to 248) in the control arm. Kubben 2014 did not assess these outcomes.
5‐ALA: HRs and their respective CIs were not available and could not be calculated in Stummer 2006. Median PFS was 5.1 months (95% CI 3.4 to 6.0) in the intervention arm versus 3.6 months (95% CI 3.2 to 4.4 months) in the control arm.
Neuronavigation: Willems 2006 did not assess time to progression or PFS.

Quality of life

iMRI: Senft 2011 and Kubben 2014 did not report data for QoL.
5‐ALA: Stummer 2006 did not assess QoL.
Neuronavigation: In Willems 2006, QoL questionnaires at three months' postoperatively were completed by 19 participants (eight in the neuronavigation arm and 11 in the standard surgery arm), constituting 64.5% of all eligible participants. The questionnaire included the EORTC QLQ‐C30 with 30 general questions, with additionally the 20‐question brain tumour module (BN20). Out of 26 outcome measures that were presented, the direction of change differed in seven (all in the BN20 group): four were in favour of the neuronavigation group and three were in favour of standard surgery. No statistical analysis was presented.

We considered this evidence to be of low to very low quality for all reported outcomes (summary of findings Table 1; summary of findings Table 2; summary of findings Table 3).

For all effects of interventions, meta‐analysis and NMA were not performed. Although all units of analysis were appropriate, the included studies were too clinically heterogeneous for pair‐wise comparison. In addition, there were too few studies across interventions, sample sizes were small, and the image guidance tools used in the control arms varied (predominantly utilisation of neuronavigation).

Brief economic commentary

To supplement the main systematic review of effects, we sought to identify cost analyses and economic evaluations that compared the interventions with each other or between different variants of the same intervention. A search of MEDLINE and Embase identified seven such studies (Abraham 2019; Eljamel 2016; Esteves 2015a; Hall 2003; Kowalik 2000; Makary 2011; Schulder 2003; Slof 2015) (one study was reported in two papers: Hall 2003; Kowalik 2000). Of the seven studies, four studies compared iMRI to conventional surgery (Abraham 2019; Kowalik 2000; Makary 2011; Schulder 2003); two compared 5‐ALA with white light surgery (Esteves 2015a; Slof 2015); and one compared conventional, 5‐ALA, fluorescein, ultrasound, and iMRI surgery (Eljamel 2016). Three studies were conducted in the USA (Kowalik 2000; Makary 2011; Schulder 2003), one in Portugal (Esteves 2015a), and one in Spain (Slof 2015), and for two it was unclear (but was probably the USA) (Abraham 2019; Eljamel 2016). Four studies were based on non‐randomised retrospective comparative cohorts (Kowalik 2000; Makary 2011; Schulder 2003; Slof 2015); one was based on a review and pair‐wise meta‐analyses (Eljamel 2016), and one used data from a trial and retrospective cohorts (Esteves 2015a). One study parameterised its microsimulation model with data from prospective and retrospective cohort studies and a randomised trial (Abraham 2019). Cost estimates were derived from existing cost analyses and cost‐effectiveness analyses (including Eljamel 2016). Utility data were derived from a previous health technology assessment and preference elicitation studies; all utility data were derived using the standard gamble method. The studies based on single cohort studies all involved fewer than 100 participants, except for Slof 2015, which included 254 participants who received 5‐ALA and 120 who received white light surgery.
All the studies except two (Abraham 2019; Esteves 2015a), which integrated data using a microsimulation model and a Markov model respectively, were based on comparisons of individual patient level data. The costs included and the time horizon over which they were considered varied markedly. Two studies considered costs over the patient lifetime (Abraham 2019; Esteves 2015a), and one only considered the drug cost (Slof 2015). The other studies considered costs incurred in hospital for the index surgery. Costs were reported in USD in five studies (Abraham 2019; Esteves 2015a; Kowalik 2000; Makary 2011; Schulder 2003), but the price year was stated only in two studies (Abraham 2019; Makary 2011). The other two studies reported costs in EUR, and the price year was 2012 in one study (Esteves 2015a), and not stated in the other (Slof 2015). Two studies were cost analyses only (Kowalik 2000; Schulder 2003). Effects were resection rates (Eljamel 2016), resection‐free years (Makary 2011), quality‐adjusted life years (QALY) (Abraham 2019; Eljamel 2016; Esteves 2015a; Slof 2015), PFS (Esteves 2015a), and life years (Esteves 2015a).

For the comparison of iMRI with conventional surgery, two studies reported a potential cost saving driven by reductions in length of stay (Kowalik 2000; Schulder 2003), and a third study reported lower mean costs that were not statistically significant (Makary 2011). The one cost‐effectiveness analysis (Makary 2011) reported a lower complication rate for iMRI versus cMRI in people presenting for their initial tumour resection as well as a longer interval to repeat resection (20.1 months versus 6.7 months; P = 0.02); further results suggested iMRI was more cost‐effective in terms of cost per resection‐free years. Another study reported that iMRI was the most costly of conventional, 5‐ALA, fluorescein, and ultrasound‐assisted surgery (Eljamel 2016). iMRI was the least cost‐effective, but the results could not be replicated from the data presented in the study. Estimates of cost‐effectiveness (and cost over a longer follow‐up) need to be considered with caution, given the very limited evidence for iMRI: a benefit was shown in terms of extent of resection, but the review of clinical effectiveness found no evidence of an effect on OS. The one cost utility analysis was based on a microsimulation model and simulated cost and outcomes for a hypothetical cohort of 100,000 participants in each arm; the authors did not discuss the rationale for this sample size (Abraham 2019). The authors found that iMRI yielded an incremental benefit of 0.18 QALYs at an incremental cost of USD 13,447, resulting in an incremental cost‐effectiveness ratio of USD 76,442 per QALY. The authors concluded that there was a 99.5% chance of cost‐effectiveness at a willingness‐to‐pay threshold of USD 100,000 per QALY based on probabilistic sensitivity analysis. The authors did not discuss the reasons for selecting a USD 100,000 per QALY cost‐effectiveness threshold.
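The incremental cost‐effectiveness ratio (ICER) is the incremental cost divided by the incremental effect. The sketch below applies this to the rounded figures reported for Abraham 2019; the function name is hypothetical. The rounded inputs give roughly USD 74,700 per QALY, slightly below the published USD 76,442. One plausible explanation, which is our inference and not stated in the study, is that the published QALY gain was rounded to 0.18 from a value closer to 0.176.

```python
def icer(delta_cost, delta_effect):
    """Incremental cost-effectiveness ratio: extra cost per extra unit of effect."""
    if delta_effect == 0:
        raise ValueError("ICER is undefined when the incremental effect is zero")
    return delta_cost / delta_effect

# Rounded figures reported for Abraham 2019
ratio = icer(13447, 0.18)

# QALY gain implied by the published ICER (an inference, not a reported value)
implied_qaly_gain = 13447 / 76442

threshold = 100_000  # willingness-to-pay threshold used by the study authors
print(f"ICER from rounded inputs: USD {ratio:,.0f} per QALY")
print(f"Implied QALY gain: {implied_qaly_gain:.3f}")
print(f"Below threshold: {ratio < threshold}")
```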

For the comparison of 5‐ALA and standard surgery, 5‐ALA was on average more costly in both studies, but resulted in more QALYs over the patient lifetime or over the time to progression of disease (Esteves 2015a; Slof 2015). In both cases, the study authors concluded that the extra costs were worth the extra QALYs and that these conclusions were consistent over all sensitivity analyses conducted. These findings of extra effectiveness in the economic studies need to be considered in the context of the findings of the review of the best available clinical effectiveness data summarised above.

We did not subject the identified cost analyses and economic evaluations to critical appraisal and we did not attempt to draw any firm or general conclusions regarding the relative costs or efficiency of the iMRI strategies compared. The evidence seems to suggest that the additional intraoperative imaging strategies offer additional benefit, but the value of this benefit seems to vary. The evidence needs to be considered in the decision makers' local context and in light of different national acceptability thresholds for cost‐effectiveness.

Discussion

Summary of main results

We identified four RCTs in total: two RCTs for iMRI (Kubben 2014; Senft 2011), one for surgery with 5‐ALA (Stummer 2006), and one assessing neuronavigation using preoperative MRI (Willems 2006). Formal NMA and standard pair‐wise meta‐analyses were not possible due to the different comparisons and variability in the control arm population between trials. Therefore, we were limited to performing a narrative synthesis of the included trials.

Two trials demonstrated a benefit for intraoperative imaging technology in terms of extent of resection (the primary outcome) (iMRI: Senft 2011; 5‐ALA: Stummer 2006). OS data were available for 5‐ALA and iMRI; there was no evidence that 5‐ALA or iMRI improved OS. Two trials provided data for PFS, but these were not available in the format specified (HRs and their variance). Nevertheless, there was a suggestion that 5‐ALA increased PFS compared with standard surgery. One trial reported QoL data, and there was significant attrition and reporting bias. AE reporting varied considerably between trials but in general was poorly performed. With 5‐ALA, it appeared that neurological deterioration was more common after fluorescence‐guided surgery. The study that reported this effect noted that it occurred mainly among people with fixed deficits and early after surgery, but there was subsequently a trend towards recovery (Stummer 2006). Other AEs appeared to be rare and similar in frequency between study arms.

We did not subject the identified cost analyses and economic evaluations to critical appraisal and we did not attempt to draw any firm or general conclusions regarding the relative costs or efficiency of the iMRI strategies compared. The evidence seems to suggest that the additional intraoperative imaging strategies offer additional benefit, but the value of this benefit seems to vary. The evidence needs to be considered in the decision makers' local context and in light of different national acceptability thresholds for cost‐effectiveness.

Overall completeness and applicability of evidence

The overall evidence base is incomplete across comparisons and outcomes reported. We only identified four trials and all were small with the exception of one (Stummer 2006), which included 322 participants. The other three included a total of 117 participants, which did not take into account considerable attrition for most outcomes. Furthermore, all the identified trials included highly selected participants in specialised centres, and the applicability of these findings to a more general population needs to be carefully considered. Participants included in the trials tended to be generally young and of good performance status. In addition, most trials also clearly specified the types of tumours that were to be included, and would not have randomised those patients with eloquent tumours or where a complete resection was not feasible. Potentially those enrolled in one of the iMRI trials (Senft 2011) were likely to have more resectable or less eloquent tumours than those in the 5‐ALA trial (Stummer 2006), given the far higher resection rates in both arms of the iMRI study (96% iMRI and 68% control versus 65% 5‐ALA and 36% control).

The majority of included trials only enrolled participants with probable HGG. We identified no RCTs for ultrasound‐guided surgery, which may reflect the less widespread application of this particular technology. There are theoretical advantages to this technology, such as relative affordability, repeatability, and possibly better sensitivity in low‐grade tumours than the other included intraoperative imaging modalities. Nevertheless, it currently does not have the same evidence base as other intraoperative imaging modalities to recommend its use in routine clinical practice.

Quality of the evidence

The certainty of the evidence across all outcomes ranged from low to very low. Therefore, further evidence will certainly have an impact on the results of this review and additional studies in this area are vital. It is clearly feasible to perform RCTs for new surgical interventions, and it appears now to have become standard practice to perform an RCT for assessing new intraoperative imaging technologies. The openness of major centres to enrolling participants in RCTs to provide clear outcome data is a major step forward in neuro‐oncology. Some aspects of the included trials were at low risk of bias, such as randomisation methods and blinded, objective reporting for extent of resection. However, the overall risk of bias was high, and there were consistent concerns with stopping trials early and the role of industry involvement (summary of findings Table 1; summary of findings Table 2; summary of findings Table 3).

Extent of resection was the primary outcome for all the included trials. This has the advantage of being the outcome most directly influenced by intraoperative imaging. However, there is still no evidence from RCTs that resection (either total or less than total) improves outcomes for HGG over biopsy alone (Hart 2019). Subgroup analyses, particularly for the 5‐ALA trial (Stummer 2006), have shown that those participants that have a complete resection of all contrast‐enhancing tumour survive longer than those with residual tumour (Pichlmeier 2008). Studies of chemotherapy have also found that those without residual tumour survive longer (Stupp 2005). Although this comes from post hoc non‐randomised subgroup analysis rather than direct evidence in favour of complete resection, it is becoming increasingly apparent that a complete tumour resection is desirable, particularly when it can be achieved safely. Precisely how much a complete resection contributes towards the overall outcome is unclear. New methods of imaging (e.g. amino acid positron emission tomography) have found that tumours frequently extend out from the contrast‐enhancing margin on MRI (Miwa 2004). However, validation of this approach has yet to be established, and the need for a cyclotron makes widespread application and testing a challenge in the UK; therefore, MRI for assessing residual tumour remains the current standard of care.

After extent of resection, studies tended to focus on PFS rather than OS. There are certain advantages to this in that possibly fewer participants are required and the results may be available sooner. Additionally, it may provide a more direct assessment of the effect of the primary intervention that is not confounded by subsequent therapy. However, it can be argued that OS should remain the main outcome of interest. First, survival is so short in HGG that the practical benefits of assessing PFS are less relevant. Second, assessment of PFS can be more subjective, and is critically dependent on the timing and interpretation of imaging, which can often be complicated (Wen 2010).

Quality control for surgical neuro‐oncology trials is an important area (Chang 2007). Standardisation of reporting is required to allow clear comparisons between trials in meta‐analyses. Detailed reporting is required for tumour location with regard to eloquent brain; operative technique used; postoperative imaging protocol; assessment of extent of resection; and recording of AEs (including total numbers of events, total number of participants at risk, number of participants with multiple events; severity, timing, and outcome of events, i.e. resolution or persistence of neurological deficits).

Potential biases in the review process

We took multiple steps in the original published and updated review process to minimise bias, including double independent literature sift and data extraction, not pooling results due to heterogeneity, and using strict inclusion criteria. Overall, these steps acted to minimise bias and restrict the review to the best available evidence.

Notably, the majority of trials identified through the search strategy were not RCTs. It could be argued that excluding this volume of data biases our review and that it would be more appropriate to consider a Cochrane Review of non‐randomised studies (NRS). In particular, meta‐analyses and NMAs that include NRS have recently appeared in peer‐reviewed publications (see Agreements and disagreements with other studies or reviews).

However, the issue of selection bias is critical, particularly in surgical trials. Participants enrolled in an NRS are likely to have a better prognosis than a control population, and it is impossible to accurately account for this bias without using randomisation. Therefore, it would be unclear what benefit intraoperative imaging had on the overall outcome. Meta‐analysis of RCTs remains the most reliable way of assessing the benefits of specific intraoperative imaging modalities. However, NRS may also have a role, particularly regarding technology development and reporting of AEs.

This review included two specific groups of technologies, those that used imaging obtained intraoperatively and those that used imaging obtained preoperatively for use in an intraoperative manner. We felt that both methods were suitable for comparison, as the goals are similar: namely, to achieve maximal safe resection via the application of surgical technology. A major concern with preoperative imaging is intraoperative brain shift, whereby anatomical localisation is affected by events that occur during surgery (e.g. anaesthesia, brain retraction, tumour resection, dural opening, and CSF drainage). Imaging obtained intraoperatively can theoretically account for brain shift and allow more accurate navigation than imaging obtained preoperatively. In this review, we found that a single trial did not demonstrate an effect for intraoperative imaging utilising preoperatively acquired data (Willems 2006).

Another technique that is commonly used in neuro‐oncology surgery is awake craniotomy. This is often perceived as a technology to make surgery safer by allowing intraoperative mapping of eloquent brain. It is not typically regarded as a technique to maximise extent of resection and was, therefore, not included in this review.

We did not subject these studies to critical appraisal, and we did not attempt to draw any firm or general conclusions regarding the relative costs or efficiency of the interventions being compared. For the comparison of iMRI surgery with conventional surgery, it is clear that the available economic evidence is, at best, equivocal. For the comparison of 5‐ALA with white light surgery, the available economic evidence indicates that, from an economic perspective, use of 5‐ALA could be a promising strategy but effectiveness data used in the economic studies were not consistent with the findings of the review of effectiveness.

Agreements and disagreements with other studies or reviews

The aforementioned interim analysis of an ongoing trial of iMRI is broadly in agreement with the findings of this review (NCT01479686). It reported outcomes on 202 participants, with follow‐up data for 177 patients. Complete resection was achieved in 86% of the iMRI arm versus 45% in the control arm (P < 0.0001). Patients in the iMRI arm with eloquent HGGs had significantly longer PFS and OS compared to the control group. There were no AEs of iMRI reported.

Golub and colleagues have recently published an NMA of 5‐ALA and iMRI in HGG surgery (Golub 2020). The NMA included 11 randomised studies and NRS, including two RCTs from this review (Senft 2011; Stummer 2006), although notably Senft 2011 was classed as a retrospective study. This NMA also included the data published from the interim results of the RCT not included in this review (NCT01479686). The NMA found that both iMRI and 5‐ALA were superior to neuronavigation in achieving gross total resection, with a smaller number of studies additionally demonstrating superior PFS and OS. No evidence was found that either iMRI or 5‐ALA was superior to the other (Golub 2020).

Furthermore, Coburger and Wirtz performed a systematic review of fluorescence‐guided surgery by 5‐ALA and iMRI in HGG (Coburger 2019). Given the broader inclusion criteria of randomised and NRS without a control group, a total of 22 studies were included in the review including two RCTs from this review (Senft 2011; Stummer 2006). The review concluded that both iMRI and 5‐ALA were superior to neuronavigation in achieving gross total resection of non‐eloquent lesions, while additionally not increasing the rate of permanent neurological deficit or reduction in QoL (Coburger 2019).

Figure 1

Study flow diagram.

Figure 2

Study flow diagram.

Figure 3

Risk of bias graph: review authors' judgements about each risk of bias item presented as percentages across all included studies.

Figure 4

Risk of bias summary: review authors' judgements about each risk of bias item for each included study.

Summary of findings 1. iMRI image‐guided surgery compared to standard surgery for high‐grade glioma

Outcomes	Illustrative comparative risks* (95% CI)		Relative effect (95% CI)	No of participants (studies)	Quality of the evidence (GRADE)	Comments
iMRI image‐guided surgery compared to standard surgery for high‐grade glioma
Patient or population: high‐grade glioma Settings: specialist centres Intervention: iMRI image‐guided surgery (based on postoperative MRI) Comparison: standard surgery
	Assumed risk	Corresponding risk
	Control	iMRI image‐guided surgery
Extent of resection: incomplete resection	32^a per 100	4 per 100 (1 to 31)	RR 0.13 (0.02 to 0.96)	49 participants (1 study)	⊕⊝⊝⊝^b,c Very low	Small trial of highly selected participants with potential bias in allocation and performance. 1 other trial reported this outcome but did not contribute towards the analysis.
Adverse events	Inadequately and inconsistently reported in the trial				⊕⊝⊝⊝^d Very low	Adverse events were reported in an inconsistent manner and not according to the manner prespecified in our protocol.
Overall survival	Not estimable				⊕⊝⊝⊝^d Very low	Abstract publication only in 2017. 24 (83%) of 29 patients randomly allocated to the iMRI group and 21 (72.4%) of 29 controls were eligible for analysis of overall survival, reported as "iMRI itself did not affect outcome (560 vs. 624 days, p=0.53)". Unable to identify which 8 patients had metastasis (these were excluded from published trial in 2011).
Progression‐free survival	Not estimable				⊕⊝⊝⊝^d Very low	Progression‐free survival or time to progression was not adequately reported in the trial.
Quality of life	Not estimable				⊕⊝⊝⊝^d Very low	Quality of life was not reported in the trial.
The basis for the assumed risk* (e.g. the median control group risk across studies) is provided in footnotes. The corresponding risk (and its 95% confidence interval) is based on the assumed risk in the comparison group and the relative effect of the intervention (and its 95% CI). CI: confidence interval; iMRI: intraoperative magnetic resonance imaging; MRI: magnetic resonance imaging; RR: risk ratio.
GRADE Working Group grades of evidence High quality: further research is very unlikely to change our confidence in the estimate of effect. Moderate quality: further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate. Low quality: further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate. Very low quality: we are very uncertain about the estimate.
^aExpressed in terms of risk of incomplete resection (bad outcome). ^bSmall trial so quality of the evidence downgraded one level. ^cHighly selected participants with potential bias in allocation and performance as well as in other 'Risk of bias' domains, thus downgraded two levels. ^dOutcome was not reported (or inadequately reported for meaningful conclusions to be drawn), therefore giving lowest quality of evidence judgement.
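
As a rough illustration of how the corresponding risk in the table follows from the assumed risk and the relative effect, the sketch below multiplies the assumed control risk (per 100) by the RR and by each end of its 95% CI, then rounds to whole participants. The function name corresponding_risk is hypothetical, and rounding to the nearest integer is an assumption about how the table values were presented rather than a documented step of the review.

```python
def corresponding_risk(assumed_per_100, rr, rr_low, rr_high):
    """Return (point, lower, upper) corresponding risk per 100 participants.

    Each value is the assumed control-group risk multiplied by the risk
    ratio (or one end of its 95% CI), rounded to the nearest whole number.
    The rounding rule is an assumption made for this illustration.
    """
    return tuple(round(assumed_per_100 * r) for r in (rr, rr_low, rr_high))


# iMRI versus standard surgery: assumed risk 32 per 100, RR 0.13 (0.02 to 0.96)
imri = corresponding_risk(32, 0.13, 0.02, 0.96)
print("iMRI: %d per 100 (%d to %d)" % imri)
```

Under these assumptions the output reproduces the tabulated 4 per 100 (1 to 31). The width of that interval reflects the wide CI around the RR from a single small trial.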


Summary of findings 2. 5‐ALA image‐guided surgery compared to standard surgery for high‐grade glioma

Outcomes	Illustrative comparative risks* (95% CI)		Relative effect (95% CI)	No of participants (studies)	Quality of the evidence (GRADE)	Comments
5‐ALA image‐guided surgery compared to standard surgery for high‐grade glioma
Patient or population: high‐grade glioma Settings: specialist centres Intervention: 5‐ALA image‐guided surgery (based on postoperative MRI) Comparison: standard surgery
	Assumed risk	Corresponding risk
	Control	5‐ALA image‐guided surgery
Extent of resection: incomplete resection	64^a per 100	35 per 100 (27 to 45)	RR 0.55 (0.42 to 0.71)	270 participants (1 study)	⊕⊕⊝⊝^b Low	Highly selected participants with potential bias in allocation and performance.
Adverse events	Inadequately and inconsistently reported in the trial				⊕⊝⊝⊝^c Very low	Adverse events were reported in an inconsistent manner and not according to the manner prespecified in our protocol.
Overall survival	Not estimable due to reporting of HR and since just a single trial reported on this outcome we did not arbitrarily choose a time to use as a basis to calculate the assumed and corresponding risks as this may be misleading.		HR 0.82 (0.62 to 1.07)	270 participants (1 study)	⊕⊕⊝⊝^b Low	The overall quality of this outcome was low in this trial and was downgraded for highly selected participants with potential bias in allocation and performance.
Progression‐free survival	Inadequately reported or not assessed at all in the included trials				⊕⊝⊝⊝^c Very low	Progression‐free survival or time to progression was not adequately reported in the trial.
Quality of life	Inadequately reported or not assessed at all in the included trials				⊕⊝⊝⊝^c Very low	Quality of life was not reported in the trial.
The basis for the assumed risk* (e.g. the median control group risk across studies) is provided in footnotes. The corresponding risk (and its 95% confidence interval) is based on the assumed risk in the comparison group and the relative effect of the intervention (and its 95% CI). 5‐ALA: 5‐aminolevulinic acid; CI: confidence interval; HR: hazard ratio; MRI: magnetic resonance imaging; RR: risk ratio.
GRADE Working Group grades of evidence High quality: further research is very unlikely to change our confidence in the estimate of effect. Moderate quality: further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate. Low quality: further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate. Very low quality: we are very uncertain about the estimate.
^aExpressed in terms of risk of incomplete resection (bad outcome). ^bHighly selected participants with potential bias in allocation and performance as well as in other 'Risk of bias' domains, thus downgraded by two levels. ^cOutcome was not reported (or inadequately reported for meaningful conclusions to be drawn), therefore giving lowest quality of evidence judgement.


Summary of findings 3. Neuronavigation image‐guided surgery compared to standard surgery for high‐grade glioma

Outcomes	Illustrative comparative risks* (95% CI)		Relative effect (95% CI)	No of participants (studies)	Quality of the evidence (GRADE)	Comments
Neuronavigation image‐guided surgery compared to standard surgery for high‐grade glioma
Patient or population: high‐grade glioma Settings: specialist centres Intervention: neuronavigation image‐guided surgery (based on postoperative MRI) Comparison: standard surgery
	Assumed risk	Corresponding risk
	Control	Neuronavigation image‐guided surgery
Extent of resection: incomplete resection	Not estimable	Not estimable	Not reported	45 participants (1 study)	⊕⊝⊝⊝^a,b,c Very low	Small study of highly selected participants at very high risk of allocation bias. Complete resection was achieved in 3 participants in the control group and 5 participants in the neuronavigation group. However, there was significant attrition, with not all participants completing imaging, and the denominators for these figures were not stated, precluding formal analysis.
Adverse events	Inadequately and inconsistently reported in the trial				⊕⊝⊝⊝^c Very low	Adverse events were reported in an inconsistent manner and not according to the manner prespecified in our protocol.
Overall survival	Not estimable				⊕⊝⊝⊝^d Very low	Not reported by trial authors so graded as very low‐quality evidence.
Progression‐free survival	Not estimable				⊕⊝⊝⊝^c Very low	Progression‐free survival or time to progression was not reported in the trial.
Quality of life	Inadequately reported or not assessed at all in the included trials				⊕⊝⊝⊝^d Very low	Quality of life was reported in the trial but only 19 participants (8 in the neuronavigation arm and 11 in the standard surgery arm) completed questionnaires postoperatively at 3 months, constituting only 64.5% of all eligible participants, and no statistical analysis was presented.
The basis for the assumed risk* (e.g. the median control group risk across studies) is provided in footnotes. The corresponding risk (and its 95% confidence interval) is based on the assumed risk in the comparison group and the relative effect of the intervention (and its 95% CI). CI: confidence interval.
GRADE Working Group grades of evidence High quality: further research is very unlikely to change our confidence in the estimate of effect. Moderate quality: further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate. Low quality: further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate. Very low quality: we are very uncertain about the estimate.
^aSmall trial so quality of the evidence downgraded by one level. ^b ^cHighly selected participants with potential bias in allocation and performance as well as in other 'Risk of bias' domains, thus downgraded by two levels. ^dOutcome was not reported (or inadequately reported for meaningful conclusions to be drawn), therefore giving lowest quality of evidence judgement.


Table 1. Karnofsky performance score

Score	Definition
100	Normal, no complaints, no evidence of disease
90	Able to carry on normal activity: minor symptoms of disease
80	Normal activity with effort: some symptoms of disease
70	Cares for self: unable to carry on normal activity or active work
60	Requires occasional assistance but is able to care for needs
50	Requires considerable assistance and frequent medical care
40	Disabled: requires special care and assistance
30	Severely disabled: hospitalisation is indicated, death is not imminent
20	Very sick, hospitalisation is necessary: active treatment is necessary
10	Moribund, fatal processes are progressing rapidly
0	Dead


Table 2. WHO performance score

Grade	Definition
0	Fully active, able to carry on all predisease performance without restriction
1	Restricted in physically strenuous activity but ambulatory and able to carry out work of a light or sedentary nature, e.g. light house work, office work
2	Ambulatory and capable of all self‐care, but unable to carry out any work activities. Up and about > 50% of waking hours
3	Capable of only limited self‐care, confined to bed or chair > 50% of waking hours
4	Completely disabled. Cannot carry out any self‐care. Totally confined to bed or chair
5	Dead
WHO: World Health Organization.
