Introduction

Age-related macular degeneration (AMD) is a common cause of visual impairment in the United States,1 with its neovascular form a leading cause of irreversible blindness in elderly populations.2 Spectral domain optical coherence tomography (OCT) is commonly used to visualize and monitor choroidal neovascularization (CNV) associated with AMD. This noninvasive, nondestructive method of obtaining detailed anatomical data in vivo2, 3 is used to evaluate, diagnose, and monitor diseases such as diabetic retinopathy4, 5 and diabetic macular edema,6, 7 as well as pigment epithelial layer abnormalities and CNV.8 The ability of commercial OCT algorithms to automatically segment retinal boundaries and generate thickness and volume maps has been very important for its use in clinical practice and in clinical research trials.9, 10

In disorders such as CNV, however, the automatic segmentation boundaries generated by OCT systems are often inaccurate,9, 11, 12 likely owing to the extensive outer retinal disruption caused by the disease process. In such cases, the retinal layer boundaries must be manually corrected to ensure accurate measurements.13 The Cirrus and Topcon OCT machines primarily segment two boundaries as a means of defining retinal thickness: the inner limiting membrane (ILM) and the retinal pigment epithelium (RPE). These machines do not differentiate subretinal fluid (SRF) from neurosensory retina, nor do they separately quantify subretinal hyperreflective material (SRHM) or pigment epithelial detachment (PED). The measurements from volume maps generated using OCT can also be affected by artifacts,14, 15, 16, 17 poor signal,18 operator errors, and decentration owing to poor fixation.19

Even newer third-party automated algorithms for CNV lesions require human input and optimization.11, 12 As manual correction of the scans is laborious and time consuming, it is unsuitable for regular clinical practice and presents a challenge even in the context of a reading center for clinical trials.20 Furthermore, many clinicians do not obtain dense volume scans, but less dense sets with only 25–50 B-scans per cube, particularly when using acquisition protocols that utilize extensive B-scan averaging. We have shown that features of exudation in CNV lesions can be missed when using these reduced densities.21

We have also previously demonstrated that accurate retinal thickness and volume maps can be generated using only a small subset of B-scans (32 B-scans) in a volume cube;13, 20 however, this study included retinal pathologies of various origins and not specifically CNV, wherein significant disruption of the outer retina leads to more frequent and severe segmentation errors.16, 18, 22 Furthermore, the accuracy of volumes of more localized pathologic features, such as PED or SRF, may be more severely compromised by lower sampling densities. Thus, in the present study, we address these issues by evaluating the impact of reduced B-scan frame sampling, specifically in eyes with neovascular AMD, and incorporating CNV lesion parameters such as SRF, SRHM, and PED.

Materials and methods

Data collection

For this retrospective study, we collected OCT data from 39 eyes of 38 patients clinically diagnosed with wet AMD who presented consecutively to the Doheny Eye Institute Retina Clinics. All data were generated by one of two spectral domain OCT instruments available in the clinic: Cirrus 5000 (Carl Zeiss Meditec, Dublin, CA, USA, 24 patients) or Topcon 3DOCT-2000 (Topcon Medical Systems, Inc., Oakland, NJ, USA, 15 patients). Data collection and analyses were approved by the Medical Institutional Review Board of the University of California Los Angeles and the research adhered to the tenets set forth in the Declaration of Helsinki. Clinical characteristics such as age, gender, best-corrected visual acuity, and diagnosis were also obtained from the patient records.

Imaging from both spectral domain OCT machines was performed using a standardized macular cube protocol consisting of 128 equally spaced, horizontally oriented, 6 mm raster B-scans, each composed of 512 A-scans, with scanning performed over a 6 mm square centered on the fovea. This is the most commonly used protocol in the Doheny Imaging Unit and is the most widely accepted acquisition protocol for clinical trials of retinal disease at the Doheny Image Reading Center (DIRC). The raw data from the OCT machines were collected and imported into previously described and validated spectral domain OCT reading center grading software (3D-OCTOR).20, 23 This software allowed the grader to manually segment the relevant boundaries and generate retinal thickness and volume maps using the common Early Treatment of Diabetic Retinopathy Study macular grid.20

Grading procedure

The OCT scans were analyzed and graded by three experienced, certified DIRC graders (SBV, MGN, and RKK). Boundaries drawn in each of the 128 OCT B-scans included the ILM, outer border of the photoreceptors, borders of SRF and SRHM (if present), inner surface of the RPE, and estimated normal position of the RPE layer (in cases of RPE elevation). All boundaries were drawn in accordance with the standard OCT grading protocol of DIRC, which has been demonstrated to yield highly reproducible grading in previous reports.24 After grading, 3D-OCTOR was used to calculate output parameters for various morphologic spaces such as the neurosensory retina, SRHM, SRF, and PED (Figure 1).

Figure 1

(a) Optical coherence tomography B-scan demonstrating subretinal hyperreflective material (SRHM; ‘hyperreflective’ space), subretinal fluid (SRF; ‘hyporeflective’ space), and pigment epithelial detachment (PED). (b) The clinically relevant boundaries (inner limiting membrane (ILM), outer border of photoreceptors, retinal pigment epithelium (RPE), inner and outer borders of SRHM, and the estimated normal location of the RPE layer) are drawn using 3D-OCTOR software. (c) 3D-OCTOR then computes the volumes of the spaces (retina, SRHM, SRF, and PED) defined by these boundaries.

Generating thickness and volume maps

Maps were generated to evaluate the relationship and differences between each B-scan density for foveal central subfield (FCS) thickness, and total volume measurements of the neurosensory retina, SRHM, SRF, and PED. As in previous publications,24 the space extending between the ILM layer and the outer surface of the photoreceptor outer segments was defined as the neurosensory retina; the hyporeflective space (Figure 1) between the outer photoreceptor border and the inner surface of SRHM (if present) or RPE was defined as SRF; the hyperreflective space (Figure 1) between the outer surface of the photoreceptors or SRF (if present) and the inner surface of the RPE was defined as SRHM; the space between the inner surface of the RPE and the estimated original position of the RPE (often recognized by a thin hyperreflective line believed to correspond to the Bruch’s membrane-choriocapillaris interface) was defined as PED (Figure 1). Intergrader reproducibility using the OCTOR software and this grading protocol has been demonstrated previously.24
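The compartment definitions above can be sketched in code. The following is a minimal illustration, not the 3D-OCTOR implementation: the `Column` type, the function names, and the convention that a single `srhm_inner` boundary coincides with the photoreceptor border when no SRF is present (and with the RPE when no SRHM is present) are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Column:
    """Boundary depths at one A-scan, in micrometres, increasing with depth.

    Hypothetical layout: a missing compartment is encoded by two
    coincident boundaries, which yields a thickness of zero.
    """
    ilm: float                  # inner limiting membrane
    photoreceptor_outer: float  # outer border of the photoreceptors
    srhm_inner: float           # inner surface of SRHM (outer limit of SRF)
    rpe_inner: float            # inner surface of the RPE
    rpe_baseline: float         # estimated normal position of the RPE


def compartment_thicknesses(c):
    """Thickness (micrometres) of each space between adjacent boundaries."""
    return {
        "retina": c.photoreceptor_outer - c.ilm,
        "srf": c.srhm_inner - c.photoreceptor_outer,
        "srhm": c.rpe_inner - c.srhm_inner,
        "ped": c.rpe_baseline - c.rpe_inner,
    }


def volume_mm3(thicknesses_um, pixel_area_um2):
    """Sum thickness x area over all map pixels; 1 mm^3 = 1e9 um^3."""
    return sum(thicknesses_um) * pixel_area_um2 / 1e9
```

As a sanity check on units, a uniform 250 μm layer over a 6 mm × 6 mm field corresponds to 0.25 mm × 36 mm² = 9 mm³.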

Retinal thickness maps were generated using all 128 B-scans, and then with sequentially smaller subsets of evenly spaced scans: 64 B-scans (every other scan, 94 μm apart); 32 B-scans (every 4th B-scan, 188 μm apart); 16 B-scans (every 8th B-scan, 376 μm apart); 8 B-scans (every 16th B-scan, 752 μm apart); and 4 B-scans (every 32nd B-scan, 1504 μm apart). Thickness and volume maps were generated not only for the neurosensory retina, but also for the CNV lesion features (SRF, SRHM, and PED) using a simple bilinear interpolation for each sampling density, as previously described.13
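The subsampling scheme can be illustrated with a short sketch. It is a simplification under stated assumptions rather than the published method: the function names are hypothetical, the base spacing of about 47 μm between adjacent B-scans is inferred from the spacings quoted above, and interpolation is performed only linearly between neighboring sampled B-scans (each retained B-scan is taken to be fully sampled along its own axis).

```python
import bisect


def sampled_indices(total=128, step=1, start=0):
    """Indices of the B-scans kept when taking every `step`-th scan."""
    return list(range(start, total, step))


def spacing_um(step, base_um=47.0):
    """Distance between retained B-scans; base_um is an inferred value."""
    return step * base_um


def interpolate_rows(rows, indices, total):
    """Rebuild `total` thickness profiles from the sampled ones.

    rows[k] is the thickness profile measured at B-scan indices[k].
    Positions between two sampled scans are linearly interpolated;
    positions outside the sampled range copy the nearest sampled scan.
    """
    out = []
    for y in range(total):
        if y <= indices[0]:
            out.append(list(rows[0]))
        elif y >= indices[-1]:
            out.append(list(rows[-1]))
        else:
            k = bisect.bisect_right(indices, y) - 1  # indices[k] <= y
            lo, hi = indices[k], indices[k + 1]
            t = (y - lo) / (hi - lo)
            out.append([(1 - t) * a + t * b
                        for a, b in zip(rows[k], rows[k + 1])])
    return out
```

For example, `sampled_indices(128, 8)` keeps 16 B-scans, and `spacing_um(8)` gives the 376 μm spacing quoted for that density.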

Statistical methods

The thickness and volume measurements obtained using all 128 B-scans were considered to be the reference standard or ground truth. The difference (error) between the reference standard and analogous values at each reduced frame-sampling density was then calculated for all retinal and CNV lesion parameters (data from only eyes with CNV features were used for analysis). The means of the absolute difference values were compared rather than the means of the signed differences, in which errors of opposite sign could cancel and thereby mask or minimize apparent differences. Percentage (relative) errors were calculated by dividing the value of the difference between the two measurements by the ground truth/reference (ie, based on all 128 B-scans) measurements and multiplying by 100. Bland–Altman plots were generated to facilitate comparisons between each B-scan sampling density and the ground-truth reference values. Best-corrected visual acuity was converted into logMAR notation for statistical analysis. The relationship between visual acuity and the various calculated parameters was also compared to evaluate for consistency with previously published findings.
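The error measures described above can be written out as a brief sketch. The function names are hypothetical, the percentage error is assumed to use the absolute difference, and the Bland–Altman limits of agreement are assumed to be the conventional bias ± 1.96 SD of the signed differences; none of these details is specified in the text.

```python
from statistics import mean, stdev


def absolute_errors(reference, reduced):
    """Per-eye |reduced - reference|, in the units of the measurement."""
    return [abs(x - r) for r, x in zip(reference, reduced)]


def percentage_errors(reference, reduced):
    """Per-eye absolute error as a percentage of the reference value."""
    return [abs(x - r) / r * 100 for r, x in zip(reference, reduced)]


def bland_altman(reference, reduced):
    """Return (bias, lower limit, upper limit) of the signed differences."""
    diffs = [x - r for r, x in zip(reference, reduced)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Illustrative values only (not study data): errors of opposite sign
# cancel in the signed mean but not in the mean absolute error.
reference = [300.0, 300.0]
reduced = [290.0, 310.0]
signed_mean = mean(x - r for r, x in zip(reference, reduced))
mean_abs = mean(absolute_errors(reference, reduced))
```

With these illustrative values the signed mean is 0 while the mean absolute error is 10, which is the masking effect that motivates comparing absolute differences.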

All data were analyzed using commercially available SPSS 15.0 statistical software (SPSS, Chicago, IL, USA) and MedCalc (MedCalc Software, ver. 11.3.8, Mariakerke, Belgium). A P-value of ≤0.05 was considered statistically significant. One-way ANOVA and Bonferroni correction were used to determine significant differences between and within B-scan densities.

Results

Clinical characteristics

A total of 39 eyes with CNV from 38 patients with AMD were included in this study. Among the 39 eyes, CNV features such as SRF, SRHM, and PED were present in 26, 29, and 34 eyes, respectively. The mean patient age was 82.7±6.27 years and the mean logMAR visual acuity was 0.84±0.72 (Snellen ≈20/140). Twenty-five (66%) of the 38 patients included in our analysis were females and 23 (59%) of the eyes studied were left eyes. The association between logMAR visual acuity and total volumes of each of the CNV parameters was also evaluated. A positive correlation was found between logMAR visual acuity and total volumes of SRHM (r=0.785, P<0.001), SRF (r=0.701, P<0.001), and PED (r=0.963, P<0.001). Similar correlations were found for desired scan densities. A positive correlation was found between logMAR visual acuity and total volumes of SRHM (at 32 B-scans; r=0.789, P<0.001), SRF (at 16 B-scans; r=0.700, P<0.001), and PED (at 16 B-scans; r=0.955, P<0.001).

Neurosensory retina

Table 1 shows the absolute difference and percentage error of neurosensory retinal thickness measurements. Neurosensory retinal FCS thickness and total volume measurements were computed from maps generated after manual grading of retinal boundaries. No statistically significant difference was observed between FCS thickness and volume measurements until the density was reduced to every eighth B-scan (376 μm apart) (P=0.02) or less. The mean±SD for absolute error (relative to ground-truth value from all 128 B-scans) of FCS thickness was 1.21±1.05 μm with 64 B-scans, increasing to 14.28±14.02 μm with 8 B-scans; whereas for total volume, the mean±SD of absolute error increased from 0.01±0.01 mm3 (64 B-scans) to 0.04±0.03 mm3 (8 B-scans). The mean±SD of percentage errors for neurosensory retina FCS thickness and total volume were 0.49±0.43% and 0.13±0.10% with 64 B-scans, increasing to 6.30±6.95% and 0.56±0.5% with 8 B-scans. Comparative graphs with mean and maximum of absolute difference, and percentage error for neurosensory retina FCS thickness and total volume are shown in Figure 2. Supplementary Figure 3 shows Bland–Altman plots for the mean difference in neurosensory retina FCS thicknesses between ground truth and frame-sampling densities of 64, 32, 16, and 8 B-scans.

Table 1 Mean absolute difference and percentage error of neurosensory retinal tissue, foveal central subfield thickness, and total volume in different sampling groups
Figure 2

Effect of reduced B-scan densities on measurements of foveal central subfield (FCS) thickness of neurosensory retina (NRT): (a) mean absolute error (μm), (b) maximum absolute error (μm), (c) mean percentage error, and (d) maximum percentage error. Effect of reduced B-scan densities on total volume measurements of neurosensory retina (NRT), subretinal hyperreflective material (SRHM), subretinal fluid (SRF), and pigment epithelium detachment (PED): (e) mean absolute error (mm3), (f) maximum absolute error (mm3), (g) mean percentage error, and (h) maximum percentage error.

Subretinal fluid

No statistically significant difference (P=1.00) was observed between the total SRF volume with any of the reduced sampling densities of 64, 32, 16, 8, or 4 B-scans and total SRF volume measurements obtained with all 128 B-scans. The mean±SD for absolute error (relative to ground truth) of total SRF volume was 0.002±0.004 mm3 (64 B-scans) and 0.02±0.02 mm3 (8 B-scans). The mean±SD for percentage error was 2.11±6.53% with 64 B-scans, increasing to 25.32±39.57% with 8 B-scans. Table 2 shows the absolute difference and percentage error measurements of SRF volume at the reduced sampling densities. Figure 2 shows the comparative graph of absolute difference and percentage error of SRF volume for various sampling densities.

Table 2 Absolute difference and percentage error of total volumes of subretinal fluid, subretinal hyperreflective material, and pigment epithelium detachment in different sampling groups

Subretinal hyperreflective material

No statistically significant difference (P=0.72) was observed between the total SRHM volumes with sampling densities of 64, 32, 16, 8, or 4 B-scans relative to that obtained with all 128 B-scans. The mean±SD for absolute error (relative to ground truth) for total SRHM volume was 0.01±0.01 mm3 (64 B-scans) and 0.04±0.03 mm3 (8 B-scans). The mean±SD for percentage error was 5.44±11.03% with 64 B-scans, increasing to 30.06±32.45% with 8 B-scans. Table 2 shows the absolute difference and percentage error measurements of SRHM volume at the reduced sampling densities. Figure 2 shows the comparative graph of absolute difference and percentage error of SRHM volume for various sampling densities.

Pigment epithelium detachment

No statistically significant difference (P=0.80) was observed between the total PED volume measurements with sampling densities of 64, 32, 16, 8, or 4 B-scans and that of total PED volume obtained with all 128 B-scans. The mean±SD for absolute error (relative to ground truth) of total PED volume was 0.01±0.01 mm3 (64 B-scans) and 0.04±0.06 mm3 (8 B-scans). The mean±SD for percentage error was 2.99±7.42% with 64 B-scans, increasing to 10.38±12.06% with 8 B-scans. Table 2 shows the absolute difference and percentage error measurements of PED volume at the reduced sampling densities. Figure 2 shows the comparative graph of absolute difference and percentage error of PED volume for various sampling densities.

Alternate starting scan

Choosing an alternate starting scan did not yield any difference (P>0.05) in the results, suggesting that the observations were quite stable. The percentage error for the volume of each feature with the various starting scans is shown in Table 3.
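A minimal sketch of how alternate starting scans can be enumerated for a given sampling step is shown below. The function name is hypothetical, and the text does not state which starting scans were actually tested, so the sketch simply lists every possible offset.

```python
def subsets_for_all_starts(total, step):
    """One evenly spaced subset of B-scan indices per starting offset."""
    return [list(range(start, total, step)) for start in range(step)]


# Every eighth scan of a 128-scan cube: 8 possible starts, 16 scans each.
subsets = subsets_for_all_starts(128, 8)
```

Because the offsets partition the cube, every B-scan appears in exactly one subset, so repeating the analysis across offsets exercises all of the graded scans.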

Table 3 Absolute difference and percentage error for neurosensory retina, subretinal fluid, subretinal hyperreflective material, and pigment epithelium detachment with various starting scans vs ground-truth values

Discussion

In this retrospective cross-sectional study, we observed that a reduction in frame-sample density of a spectral domain OCT volume scan was associated with an increase in the error of FCS thickness and volume measurements in eyes with neovascular AMD. The error or difference was not statistically significant until the scanning density was reduced to every eighth scan; that is, 16 B-scans with an equal spacing of 376 μm seemed to yield measurements similar to the ground truth. At a density of every eighth scan, the percentage difference for total neurosensory retinal volume was 0.24%. However, a sudden and statistically significant rise in the error was observed with lower sampling density, with 0.56% error at a density of 8 B-scans (ie, every 16th B-scan) and 1.33% error at a density of 4 B-scans (ie, every 32nd B-scan). At a scanning density of 16 B-scans, the mean percentage error in FCS neurosensory retinal thickness was ~2.5%, with a maximum error of ~13.6%, and the mean and maximum percentage errors in total neurosensory retinal volume were ~0.2% and 0.8%, respectively.

There was no statistically significant difference for the total volume measurements of retinal subcomponents such as SRF, SRHM, and PED at any scan density. This may be owing to the smaller number of eyes with these features and the larger SD values at different scan densities. Though the differences in mean values were not statistically significant, the absolute differences were potentially clinically significant. Although the choice of repeatability standard/limit is somewhat arbitrary, achieving a mean difference of <10% requires B-scan densities of 16 for SRF, 32 for SRHM, and 16 for PED.

We also observed a positive correlation between total volumes of SRHM, SRF, PED, and logMAR visual acuity. In other words, more SRHM, SRF, or PED was associated with worse vision. We first described this relationship between SRHM and visual function in a cohort of neovascular AMD patients using time-domain Stratus OCT.25 This finding was subsequently replicated in the ABC trial26 and CATT studies.27 Although not the main focus of the present study, it was reassuring to see that this apparent relationship between SRHM and vision was replicated.

The findings from the present study have relevance for clinical trials of diseases associated with CNV that incorporate quantitative OCT analyses. Given the enhanced correlation with visual function, lesion subanalysis would seem to be of value in these trials. The potential emergence of new therapeutics specifically to target or reduce SRHM may further increase the importance of delineating these structures. Manual drawing of retinal boundaries and/or correction of the segmentation errors in every B-scan is needed to ensure the accuracy of measurements in many eyes with CNV, but the amount of effort required for these corrections may often be impractical, particularly with spectral domain OCT data sets having 128 (or more) B-scans.13 Thus, reducing the sampling density for thickness map calculation may make reading center manual correction of spectral domain OCT scans feasible and clinically relevant.28 Although use of reduced sampling densities was previously demonstrated by Sadda et al13 to be of potential value, the previous studies only considered neurosensory retinal thickness and did not focus on CNV lesions. Here, we were able to define the acceptable B-scan densities for quantifying specific subcomponents of CNV lesions. However, it is important to note that an ‘acceptable’ B-scan density level is somewhat arbitrary, and depends on the desired level of precision (our threshold was a mean error of <10%) for a particular study or application.

Our study has some limitations that should be considered when assessing our findings. First, it is a retrospective study and from a tertiary-care academic medical center, and may be subject to ascertainment bias in the types of CNV lesions included. For example, our cohort only included eyes with FCS neurosensory retinal thickness ranging from 132.9 to 733.70 μm; thus, our findings may not extrapolate to eyes with more severe disease, or with CNV lesions much smaller or larger than the ones included in this study. In addition, because of the enormous time required to manually segment multiple boundaries on 128 B-scans per case, the number of subjects included in this study was relatively small. Thus, although no statistically significant differences were observed until the lower densities were reached, the study was not powered to identify smaller but potentially still relevant differences at the higher densities. Moreover, our study did not assess whether measurements would differ if an even higher density (>128 B-scans over 6 mm) was used (as a new ground truth). High scanning density may also be critical for generation of OCT projection maps or en face images, which may be useful for certain ancillary analyses or for intervisit registration. Finally, qualitative morphologic assessment may still require higher density scans, even if only subsets of these scans are used for quantification.

In summary, we observed that 16 equally spaced horizontal B-scans over a 6 mm square may be sufficient to adequately represent and generate a reliable macular thickness map of the neurosensory retina, after the manual grading of retinal boundaries and correction of segmentation errors. Similarly, for CNV-associated features such as SRF or PED, a minimum of 16 B-scans (every 8th B-scan) is required to generate volume maps that are similar (within 10%) to the ground-truth values. A minimum of 32 B-scans (every 4th B-scan) is required to generate SRHM volume maps that are similarly close to the ground-truth values. These findings may aid in the design of optimal and streamlined spectral domain OCT scanning and grading protocols for future clinical trials using OCT in neovascular AMD.