Background
In the past decade, intensity modulated radiotherapy (IMRT) and volumetric modulated arc therapy (VMAT) have become standard techniques for external beam radiotherapy (EBRT) treatments of many indications. Inverse optimization is an iterative process in which optimization objectives are used to achieve pre-defined clinical goals. Additionally, help structures are frequently defined to shape the dose distribution and to further individualize and optimize the treatment plan. The complexity of the optimization increases with the number of organs at risk (OARs) and the number of target volumes. Head and neck carcinoma (HNC) is a typical complex case in which a large number of OARs, typically 10–20, surround target volumes irradiated to different dose levels. This makes inverse planning optimization one of the most time-consuming steps of the overall treatment planning process.
Additionally, plan quality may vary between planners and between clinical institutions. Plans produced by an experienced center may outperform those produced in a less experienced center [1], and OAR sparing also depends on the planning target volume (PTV) dose homogeneity requirements [2]. Furthermore, evaluation of plan quality is often based on population-based dose-volume histogram (DVH) parameters, which neglect the nuances of an individual patient's geometry and therefore do not achieve the optimal solution at a patient-individual level [3]. To overcome these issues, optimization modules were developed to automate part or all of the optimization process [4–9]. They all aim at reducing inter-planner variability, reducing the planning time allocated to the optimization process and ultimately improving overall plan quality [10, 11]. Nowadays, automated treatment planning and/or optimization systems (ATPS) are in the process of broad clinical implementation. However, since ATPS have to be customized to fulfill the specific constraints required by different medical centers, an ATPS implemented at one institution will not necessarily work for patients from another institution. The goal of this study was to compare different ATPS for HNC planning in a multicenter setting.
Additionally, we evaluated whether a model for automated planning developed by one institution could be used to plan cases from another institution that used similar but not identical structures and planning goals. This multi-institutional planning comparison of five ATPS solutions is, to the best of our knowledge, the first of its kind.
Discussion
This study presented a multi-institutional planning comparison of five ATPS used in three different institutes, performed on 16 locally advanced head and neck cancer patients coming from two institutes. Although larger differences were observed for individual patients, the dosimetric differences between ATPS were generally small when averaged over all 16 patients, with Auto-Planning achieving the best ranking. Effective working time differed considerably more between ATPS, ranging from 2 to 116 min.
ATPS can be classified into automated optimization and automated planning, the latter including optimization. Automated optimization can in turn be divided into optimization algorithm driven systems, such as AIO, AP and RS, where the objectives and/or priorities are automatically adjusted during the optimization, and knowledge based planning systems based on plan libraries, such as RP1 and RP2. The optimization algorithm driven systems can be easily modified to take into account possible changes in clinical protocols. This is not necessarily the case for the knowledge based planning systems, which rely on plans from a database of prior patients. On the other hand, the use of a database allows a comparison between the predicted and achieved dose-volume histogram [13].
The second class is automated planning, in which not only the optimization process is automated but also the field setup, the gantry and collimator angles, the positioning of the isocenter and the creation of help structures such as bolus, ring structures or non-overlapping structures. This fully automated process was used by AIO, AP and RP2. RP1 and RS could also automate these planning steps, but this was not implemented.
After dose optimization, a single plan was generated by each ATPS except for RS, where the user had to select a plan from a database of Pareto-optimal plans. This manual step reduces inter-planner standardization but allows the user to choose the best dose trade-off between the targets and OARs.
Wu et al. [22] compared AP and RP for oropharyngeal cancer patients and found that the plan quality from both systems was comparable. Differences between the two systems were in the range of 5%, which is in good agreement with the small differences observed in our study. To the best of our knowledge, no other ATPS comparisons are available.
The two sets of HNC patients were chosen to evaluate the flexibility of the different ATPS to take into account new structures, objectives and/or different dose levels. The models for AIO, AP and RS could be easily modified because they are based on a set of user pre-defined DVH parameters, which were automatically adjusted during the optimization. For RS, however, the combination of objectives/constraints and the selection of their formulation had to be adjusted manually for each plan depending on the overlap between the PTV and OAR, which directly affects the Pareto surface computation. This is not the case for RP models, which are based on previously generated site-specific plan libraries. In this case, the model had to be manually modified to take into account the structures not defined in the library. RP1 and RP2 were both used without considering whether a particular structure was an outlier. On later inspection, all OARs of the third case from group A were listed as “outside threshold values” for RP2, as the PTV_70Gy size of 585 cm³ was above the 90th percentile value of the model. This could have a negative effect on the predictions. RP2 parotid gland doses were 7 Gy higher than for AP for this case. Similarly, when applying RP1 to the patients in group B, the glottis was marked as an outlier for every patient, and the swallowing muscles received the highest mean dose with RP1 in these patients. This was also the organ in which the largest differences between RP1 and the other ATPS were observed, clearly showing that the model was not able to predict the correct objectives for this case. This could be overcome by deciding that patients with such warning signs from RapidPlan should not be subject to automated planning. In spite of this, we compared how RP1 and RP2 performed for each organ on the data from their own institution and on the external patient data. We did not notice large dosimetric differences, demonstrating that RapidPlan also works for patients with slightly different structure sets and prescriptions.
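The outlier warning described above, where a PTV_70Gy volume of 585 cm³ lay above the 90th percentile of the model, can be illustrated with a simple percentile check. The following is a minimal sketch and not RapidPlan's actual algorithm: the function name, the library volumes and the symmetric 10th/90th percentile band are hypothetical and chosen only for illustration.

```python
from statistics import quantiles


def outside_model_range(value, training_values, lower_pct=10, upper_pct=90):
    """Return True if value lies outside the [lower_pct, upper_pct] percentile band.

    Assumption: a plain percentile band stands in for the model's outlier test.
    """
    # 99 cut points; cuts[p - 1] is the p-th percentile of the training data.
    cuts = quantiles(training_values, n=100, method="inclusive")
    low, high = cuts[lower_pct - 1], cuts[upper_pct - 1]
    return value < low or value > high


# Hypothetical PTV volumes (cm^3) of the plans in a model library.
library_ptv_volumes = [180, 210, 250, 290, 310, 340, 380, 420, 470, 520]

# 585 cm^3 is the PTV_70Gy volume reported for the third case of group A.
flagged = outside_model_range(585, library_ptv_volumes)
typical = outside_model_range(300, library_ptv_volumes)
```

A check of this kind could be run before automated planning to route flagged patients to manual review, in line with the suggestion that such cases should not be planned automatically.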
The time required to generate VMAT or IMRT plans has been reduced in the past years by improvements in the tools available in planning systems as well as by automation of steps in the optimization process. Nowadays, planning templates, scripts and optimization automation are available in TPS. This saves time on the one hand and standardizes plan quality at a high level on the other. The effective working time for ATPS planned with VMAT was reported to be less than 10 min with iCycle [23] and less than 4 min with AP [6], but the overall time was not recorded. These reported effective working times are of the same order as those in our study. By adding scripting to the automated optimization processes, effective planning time could be reduced to less than 2 min with RP and AIO. Similar scripting tools are also available in AP but were not implemented. Using them might have led to a similar reduction of the effective working time.
RS required substantially more time to generate VMAT plans than the other ATPS, mainly for two reasons. The first reason is the technique employed in this study, where each PTV geometry was approximated by a “more convex” or “less concave” geometry depending on the type of the nearest OAR (serial or parallel architecture). This additional planning step was introduced because earlier publications had shown that RS generated high quality plans in an efficient treatment planning time for convex target geometries [6, 17, 18]. The second reason is that the HNC patients required a high-dimensional Pareto-surface approximation, and the optimization time rises with the number of objective functions used during the optimization process. In our case, 20 objectives on average were used, leading to a Pareto-surface approximation generated from 40 plans for each patient, as recommended by Craft et al. [20]. The optimization time for RP was similar for both institutions. AIO, which runs on the same system as RP, needed a few minutes longer to finish the optimization since the optimization is paused to automatically adjust the objectives. AP performs multiple steps of optimization and dose calculation in which the objectives and help structures are automatically adjusted and created. This iterative process is time-consuming and typically lasts between 1 h and 1.5 h. The optimization time increased to 3 to 4 h with RS for the reasons mentioned above. However, this approach allows the user to select the plan with the best balance between target and OAR doses. The optimization times mentioned above can be influenced by the number of users working in parallel on the server as well as by its performance; they should therefore be taken only as rough estimates.
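The database from which the RS user selects contains Pareto-optimal plans, that is, plans in which no objective can be improved without worsening another. The dominance criterion behind this notion can be sketched as follows. This is an illustrative filter over already computed plans, not the RS algorithm, which approximates the Pareto surface during optimization; the plan names and objective values are hypothetical, and all objectives are assumed to be minimized.

```python
def dominates(a, b):
    """True if plan a is no worse than b in every objective and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(plans):
    """Return the names of the plans that no other plan dominates."""
    front = []
    for name, objectives in plans.items():
        dominated = any(
            dominates(other, objectives)
            for other_name, other in plans.items()
            if other_name != name
        )
        if not dominated:
            front.append(name)
    return front


# Hypothetical objective values per plan:
# (PTV dose deviation in %, parotid mean dose in Gy), both to be minimized.
plans = {
    "plan_a": (2.0, 26.0),
    "plan_b": (3.0, 22.0),
    "plan_c": (3.5, 27.0),  # worse than plan_a in both objectives
    "plan_d": (5.0, 20.0),
}

front = pareto_front(plans)
```

Here `plan_c` is discarded because `plan_a` is better in both objectives, while the remaining plans represent different trade-offs between target coverage and OAR sparing, among which the planner would choose.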
This study focused on HNC treatment; whether similar results would be obtained for other sites remains to be assessed.
Conclusion
The results obtained for the five ATPS evaluated on two different sets of HNC patients show that all ATPS were able to fulfill the hard constraints. For the parallel organs, AP achieved the best results, followed by RS, AIO, RP2 and RP1. Nevertheless, the differences were small. The effective working time was reduced to less than 20 min for each ATPS except RS, and could be reduced to less than 2 min when using scripting, which was the case for AIO and RP2.