Study population

This was a multicenter retrospective case series study. The data set was developed from information collected from 1839 patients who underwent clinical examinations in the ophthalmology clinics at Incheon Saint Mary’s Hospital (Incheon, Republic of Korea) and Seoul Saint Mary’s Hospital (Seoul, Republic of Korea) between January 2012 and May 2020. For data mining, patients who had any of the following conditions were excluded: AxL < 24.0 mm (n = 321); other retinal or choroidal disorders, such as diabetic retinopathy, retinal vascular diseases, or age-related macular degeneration (n = 45); poor-quality OCT scans (n = 21); a history of vitreoretinal, glaucoma filtering, or tube surgery (n = 54); or missing data (n = 538) (Fig. 1). The study was conducted in accordance with the ethical standards stated in the 1964 Declaration of Helsinki and was approved by The Catholic University of Korea Institutional Review Board (IRB no. OC19RESI0161). Informed consent was obtained from each enrolled subject.

Figure 1

Flowchart of the inclusion and exclusion of the study participants.

The enrolled patients were divided into the following two groups: (1) healthy myopia patients with AxL > 24.0 mm and no pathological changes and (2) pathologic myopia patients with AxL > 24.0 mm and pathological myopic changes, determined in accordance with the guidelines specified by the International Myopia Institute16. The International Myopia Institute defines pathologic myopia as excessive axial elongation associated with myopia that leads to structural changes in the posterior segment of the eye (including posterior staphyloma, myopic maculopathy, and high myopia-associated optic neuropathy) as well as a loss in best-corrected visual acuity16. Eyes with any type of posterior staphyloma and stages 2, 3, or 4 of the Meta-Analysis for Pathologic Myopia classification system, with or without “plus” lesions, were considered to have pathologic myopia in this study9. For reference, the Meta-Analysis for Pathologic Myopia system organizes myopic maculopathy into five stages: 0, no maculopathy; 1, tessellated fundus; 2, diffuse choroidal atrophy; 3, patchy chorioretinal atrophy; and 4, macular atrophy17. Plus lesions included three additional indicators: lacquer cracks, myopic choroidal neovascularization, and Fuchs spot10. Diffuse choroidal atrophy, determined using ophthalmoscopy, is an ill-defined yellowish lesion in the posterior fundus; patchy atrophy is a grayish-white, well-defined atrophy; and lacquer cracks appear as fine, irregular, yellowish lines that often branch and crisscross in the fundus. Posterior staphyloma was defined and classified in accordance with the definition provided by Curtin and the International Myopia Institute: a local bulging of the sclera at the posterior pole of the eye, with a radius of curvature smaller than that of the surrounding eye wall8,9,18,19. The diagnosis and classification of posterior staphyloma using stereoscopic fundus photography were decided by agreement between two of the authors (YCK and KDK).
The designation of healthy myopia or pathologic myopia was determined by two ophthalmologists (YCK and KDK). If the results from these two ophthalmologists were not consistent, a senior ophthalmologist (CKP) was consulted for the final judgment.

Data collection and definition of variables

All patients underwent comprehensive clinical examinations, including measurement of refractive error (RE) in diopters, measurement of best-corrected visual acuity with a Landolt C chart (expressed as the logarithm of the minimum angle of resolution), and slit-lamp biomicroscopy. AxL was measured using ocular biometry (IOL Master; Carl Zeiss Meditec, Jena, Germany). Digital color fundus photographs were taken with a VX-10i fundus camera (Kowa Co., Nagoya, Japan).

All enrolled patients were imaged by OCT (DRIOCT Triton; Topcon Corporation, Tokyo, Japan). The scanning protocol consisted of 256 B-scans centered on the fovea, which provided an image of the posterior segment 12 mm horizontally and 9 mm vertically. In total, 1000 consecutive coronal scan images were reconstructed, each with a separation of 2.6 μm. A good set of scans with a signal quality index of > 75 in the B-scan mode was selected for further analysis. Each key indicator section of the posterior sclera (i.e., (1) the fovea, (2) the deepest point of the eye (DPE), and (3) the optic disc) was designated and documented as follows. The reviewers examined the reconstructed consecutive coronal scans (en face mode) from front (corneal side) to back (optic nerve side). (1) The coronal scan (Fig. 2A), horizontal scan (Fig. 2B), and vertical scan (Fig. 2C) were reviewed simultaneously; the specific coronal section (given in green numbers in Fig. 2A) that simultaneously displayed the foveal double hump in all three displays (Fig. 2A–C) was designated as the foveal position. (2) In a similar manner, the coronal scan (Fig. 3A), horizontal scan (Fig. 3B), and vertical scan (Fig. 3C) were reviewed simultaneously; the specific coronal section (given in green numbers) that simultaneously displayed the coronal view of the hyperreflective Bruch’s membrane (white square) in all three displays (Fig. 3A–C) was designated as the DPE position. (3) The optic disc position was measured by means of the automatic segmentation algorithm of DRIOCT software (Topcon Corp.) using Bruch’s membrane opening (green parallel lines in Fig. 4). The center of the line connecting Bruch’s membrane opening (red square in Fig. 4) at the optic disc center was designated as the optic disc center position. In the extreme cases where the inferior margin of the DPE could not be determined, we assumed that the DPE was within the inferior boundary of the scanning field.
Eyes with segmentation errors involving Bruch’s membrane opening were manually measured by two of the authors; the final position was designated by agreement.

Figure 2

Designation of the fovea using the coronal (A), horizontal (B), and vertical (C) scans. The specific coronal section (given in green numbers in (A)) that simultaneously displayed the foveal double hump in all three displays (A–C) was designated as the foveal position.

Figure 3

Designation of the deepest point of the eye (DPE) using the coronal (A), horizontal (B), and vertical (C) scans. The specific coronal section (given in green numbers in (A)) that simultaneously displayed the coronal view of the hyperreflective Bruch’s membrane (white square) in all three displays (A–C) was designated as the DPE position.

Figure 4

The concept of using the relative anteroposterior elevation of the posterior globe between the fovea (blue square), the DPE (white square), and the optic disc (red square). The tomographic elevation from the fovea to the optic disc center (disc) was designated TEPSfovea→disc (A); from the fovea to the DPE, TEPSfovea→DPE (B); from the disc to the DPE, TEPSdisc→DPE (C); and the perpendicular distance from the disc to the DPE, TEPSdistance (red arrow in (C2)). The tomographic elevation was estimated as the number of coronal sections between key indicators, with adjacent sections separated by 2.6 μm. The direction toward the posterior (optic disc side) was specified as a positive tomographic elevation (A2,B2,C2). The opposite direction, toward the anterior (corneal side), was designated as a negative tomographic elevation (A1,B1,C1).

The relative elevation in the posterior globe was estimated by assignment of a specific anteroposterior position for each of the three key indicators, using four indices collectively referred to as TEPS. Specifically, the four indices were as follows: (1) the tomographic elevation from the fovea to the optic disc center (disc) was designated TEPSfovea→disc (Fig. 4A); (2) from the fovea to the DPE, TEPSfovea→DPE (Fig. 4B); (3) from the disc to the DPE, TEPSdisc→DPE (Fig. 4C); and (4) the perpendicular distance from the disc to the DPE, TEPSdistance (red arrow in Fig. 4C2). The tomographic elevation was estimated as the number of coronal sections between key indicators, with adjacent sections separated by 2.6 μm14. The direction toward the posterior (optic disc side) was specified as a positive tomographic elevation (Fig. 4A2,B2,C2). The opposite direction, toward the anterior (corneal side), was designated as a negative tomographic elevation (Fig. 4A1,B1,C1). The TEPS perpendicular distance was estimated by measuring the linear distance in micrometers between key indicators in the 3D reconstructed configuration, which accounts for the extreme diagonal disc-to-DPE configuration of the posterior sclera, using the intrinsic calipers in DRIOCT software. The SCT was defined as the perpendicular distance from the outer edge of the hyperreflective line of Bruch’s membrane to the choroidoscleral junction at the subfovea. Measurements from vertical and horizontal B-scan images that included the fovea were averaged. All image reviews were carried out by one masked author (YCK); the 1:1 pixel mode was converted to the 1:1 µm mode for greater accuracy15.
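As an illustration of the arithmetic behind the three elevation indices, the following minimal Python sketch converts coronal section numbers into signed elevations in micrometers. It assumes that section numbers increase from the anterior (corneal) side to the posterior (optic disc) side, which the text implies but does not state; the function and key names are hypothetical, and TEPSdistance is omitted because it was measured with the software calipers rather than computed.

```python
SECTION_SPACING_UM = 2.6  # separation between adjacent coronal sections (from the text)


def tomographic_elevation_um(from_section: int, to_section: int) -> float:
    """Signed elevation in micrometers from one key indicator to another.

    Assumption: section numbers increase from anterior to posterior, so a
    target that lies more posteriorly yields a positive elevation.
    """
    return (to_section - from_section) * SECTION_SPACING_UM


def teps_elevations(fovea: int, disc: int, dpe: int) -> dict:
    """Return the three section-count-based TEPS indices (hypothetical names)."""
    return {
        "TEPS_fovea_to_disc": tomographic_elevation_um(fovea, disc),
        "TEPS_fovea_to_DPE": tomographic_elevation_um(fovea, dpe),
        "TEPS_disc_to_DPE": tomographic_elevation_um(disc, dpe),
    }


# Illustrative section numbers only; these are not values from the study.
example = teps_elevations(fovea=400, disc=350, dpe=500)
```

In this made-up example the disc lies 50 sections anterior to the fovea, so TEPS_fovea_to_disc is negative, whereas the DPE lies posterior to both and gives positive values.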

To minimize potential scanning errors induced by head tilt or ocular rotation, the examiner confirmed the patient’s position at the OCT with their chin in the chin rest and forehead against the forehead rest. The patient’s eyes were aligned with the eye level mark on the forehead rest support by raising or lowering the chin rest. For determination of the foveal center, patients were instructed to hold their heads in a vertical position and look directly at the internal fixation target in the OCT camera. The OCT apparatus was also equipped with real-time eye tracking to eliminate eye motion and minimize artifacts by fixation on the fovea for each scan14.

To evaluate measurement repeatability, two separate scan sets were collected from 14 eyes from each group; the topographic locations of the three key indicators were compared. The intraclass correlation coefficient and coefficient of repeatability were calculated. The coefficient of repeatability is defined as the standard deviation of the difference between two sets of scanned image measurements, divided by the average of the two repeated measurements20. Bland–Altman plots were also used to assess agreement between the two repeated measurements (Supplementary Fig. S1)21.
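The repeatability quantities described above can be sketched as follows. This is one reading of the definition given in the text, in which the "average of the two repeated measurements" is taken as the grand mean of all values from both scan sets; the 1.96 multiplier for the Bland–Altman limits of agreement is the conventional choice and is not stated in the source. All names are hypothetical.

```python
import statistics


def repeatability_summary(first, second):
    """Summarize agreement between two repeated sets of measurements.

    first, second: equal-length sequences, one value per eye.
    """
    diffs = [a - b for a, b in zip(first, second)]
    sd_diff = statistics.stdev(diffs)  # sample SD of the paired differences
    grand_mean = statistics.mean(list(first) + list(second))
    bias = statistics.mean(diffs)      # Bland-Altman mean difference
    return {
        # SD of the differences divided by the average, as defined in the text
        "coefficient_of_repeatability": sd_diff / grand_mean,
        "bias": bias,
        "loa_lower": bias - 1.96 * sd_diff,
        "loa_upper": bias + 1.96 * sd_diff,
    }
```

Because the ratio divides by the grand mean, it becomes unstable when the measured quantity is close to zero, which is worth keeping in mind for signed elevation values.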

Machine learning algorithm construction

In this study, a support vector machine (SVM)-based classification algorithm was used to build classification models based on different combinations of the variables described above. Decision Tree, Random Forest, k-nearest neighbors, and Naïve Bayes classifiers were also used to develop prediction models based on the combination of the aforementioned variables.

The data were divided into a training set (70%) and a test set (30%) to obtain a reliable evaluation and to avoid overfitting. In the training set, the class imbalance ratio was 84.7 to 15.3. To resolve the imbalance, the Synthetic Minority Over-sampling TEchnique (SMOTE) was used: for an individual in the minority group, SMOTE finds its k nearest neighbors within that group and creates new synthetic data points between the individual and those neighbors, thereby enlarging the minority group; k was set at 5 in this particular model. After SMOTE, the training data consisted of 970 eyes: 510 healthy eyes (52.6%) and 460 pathologic myopic eyes (47.4%). The model was then retrained with these data to construct the final prediction model. Using the final trained model, independent validations were performed on the original data.
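The oversampling step can be illustrated with a minimal NumPy sketch of the basic SMOTE interpolation rule. This is not the authors' implementation (the study used R, and the package is not named here); the function name, the random seed, and the use of Euclidean distance are assumptions. Each synthetic point is placed at a random position on the segment between a minority sample and one of its k nearest minority neighbors.

```python
import numpy as np


def smote(minority, n_new, k=5, seed=0):
    """Generate n_new synthetic minority samples (basic SMOTE sketch).

    minority: array of shape (n_samples, n_features) with n_samples > k.
    """
    minority = np.asarray(minority, dtype=float)
    rng = np.random.default_rng(seed)
    n = minority.shape[0]

    # Pairwise squared Euclidean distances within the minority class.
    d2 = ((minority[:, None, :] - minority[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)  # a sample is not its own neighbor
    neighbors = np.argsort(d2, axis=1)[:, :k]  # k nearest neighbors per sample

    base = rng.integers(0, n, size=n_new)                     # seed samples
    picked = neighbors[base, rng.integers(0, k, size=n_new)]  # one neighbor each
    gap = rng.random((n_new, 1))                              # position on the segment
    return minority[base] + gap * (minority[picked] - minority[base])
```

As a usage note, if the 510 healthy eyes were left unchanged and represent 84.7% of the pre-SMOTE training set, that set would have held about 602 eyes, 92 of them in the minority group; calling `smote(X_minority, n_new=368)` on such a group would then give 92 + 368 = 460 eyes. This back-calculation is an inference from the reported percentages and is not stated in the source.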

Based on the steps described above, the specific machine learning algorithms were trained as follows. First, individual pairs consisting of two of the four TEPS indices were selected to create SVM classifier models. Second, an SVM model using all four TEPS indices was constructed. Third, AxL and SCT were used to build an SVM model. Fourth, all six indices were used to develop an SVM model. Finally, SVM, Decision Tree, Random Forest, k-nearest neighbors, and Naïve Bayes classifiers were constructed using all six indices.
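The variable combinations used for the SVM models in the first four steps can be enumerated with a short sketch. The variable names are hypothetical labels for the six indices, and the ordering of the resulting list is arbitrary; the model numbering used by the authors is not given in this section and is not reproduced here.

```python
from itertools import combinations

TEPS = ["TEPS_fovea_to_disc", "TEPS_fovea_to_DPE", "TEPS_disc_to_DPE", "TEPS_distance"]
OTHER = ["AxL", "SCT"]


def svm_feature_sets():
    """List the feature subsets described for the SVM models."""
    sets = [list(pair) for pair in combinations(TEPS, 2)]  # step 1: pairs of TEPS indices
    sets.append(list(TEPS))                                # step 2: all four TEPS indices
    sets.append(list(OTHER))                               # step 3: AxL and SCT
    sets.append(TEPS + OTHER)                              # step 4: all six indices
    return sets
```

Four indices taken two at a time give six pairs, so this enumeration yields nine feature subsets in total.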

The machine learning algorithms were implemented using R version 3.6.2 (R Foundation for Statistical Computing). The predictive performance was compared by calculating (1) accuracy, which constitutes the overall correctness of the model (i.e., the number of correct classifications divided by the total number of classifications); (2) sensitivity, which evaluates a model’s ability to predict the true positives of each available category; (3) specificity, which assesses a model’s ability to predict the true negatives of each available category; and (4) the area under the receiver operating characteristic curve (AUROC).
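For a two-class problem, the four performance measures can be computed as in the following minimal sketch. It assumes that labels are coded 1 for the positive class and 0 for the negative class (which class was treated as positive is not stated here), and it computes the AUROC through its rank interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The function names are hypothetical.

```python
def classification_metrics(y_true, y_pred):
    """Return (accuracy, sensitivity, specificity) for binary labels 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return accuracy, sensitivity, specificity


def auroc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

The pairwise AUROC is quadratic in the number of cases, which is acceptable for data sets of a few hundred eyes.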

Support vector machine architecture

The SVM can be combined with the kernel method, which enables the modeling of higher dimensional, nonlinear models22. In a nonlinear problem, a kernel function can be used to add new dimensions to the raw data, thus converting the nonlinear problem into a linear problem in the resulting higher dimensional space. Briefly, a kernel function facilitates more rapid calculations, which would otherwise require computations in high-dimensional space23. With kernel functions, the scalar product between two data points in a higher dimensional space can be calculated without explicit calculation of the mapping from the input space to the higher dimensional space23. For our model, the radial basis function (RBF) kernel, which is the most commonly used kernel and corresponds to an infinite-dimensional feature vector, was applied:

$$\mathrm{K}\left({x}_{i},{x}_{j}\right)=\mathrm{exp}\left(-\gamma {\Vert {x}_{i}-{x}_{j}\Vert }^{2}\right).$$

Here, \(\gamma\) is inversely related to the variance of the Gaussian function (\(\gamma = 1/(2\sigma^{2})\) for a Gaussian with standard deviation \(\sigma\)), and larger values of \(\gamma\) increase the risk of overfitting. To find the optimal parameters, kernel SVM training was performed on model 11. The best parameter combination was gamma = 1 and cost = 10; the second best was gamma = 1 and cost = 1 (Supplementary Table S1). We compared the performance of these two models with that of the model using the default parameters (gamma = 1/data dimension, cost = 1) (Supplementary Table S2). The model with the default parameters (gamma = 1/data dimension, cost = 1) showed the best performance.
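The kernel and its gamma parameter can be made concrete with a small sketch. The function names are hypothetical; `gamma_from_sigma` uses the standard relationship between gamma and the Gaussian standard deviation, and `default_gamma` encodes the "1/data dimension" default quoted in the text, with the number of input variables supplied by the caller.

```python
import numpy as np


def rbf_kernel(x_i, x_j, gamma):
    """K(x_i, x_j) = exp(-gamma * ||x_i - x_j||^2)."""
    diff = np.asarray(x_i, dtype=float) - np.asarray(x_j, dtype=float)
    return float(np.exp(-gamma * np.dot(diff, diff)))


def gamma_from_sigma(sigma):
    """Gamma for a Gaussian of standard deviation sigma: 1 / (2 * sigma^2)."""
    return 1.0 / (2.0 * sigma ** 2)


def default_gamma(n_features):
    """Default gamma described in the text: 1 / data dimension."""
    return 1.0 / n_features
```

For a model built on all six indices, `default_gamma(6)` gives a gamma of 1/6, a much wider kernel than gamma = 1. Identical points always give a kernel value of 1, and the value decays toward 0 as the points move apart, faster for larger gamma.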

Statistical analysis

Continuous variables are presented as the mean ± standard deviation, while categorical variables are presented as frequencies and percentages. Differences between groups were analyzed using Fisher’s exact test for categorical variables and Welch’s t-test (or the Wilcoxon rank-sum test) for continuous variables. Statistical analyses were performed using R version 3.6.2 (R Foundation for Statistical Computing). P-values < 0.05 were considered statistically significant.
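For reference, the Welch statistic used for continuous variables can be sketched as below. This only illustrates the standard formulas for the t statistic and the Welch–Satterthwaite degrees of freedom; the study itself used R, and the function name is hypothetical. Obtaining a P-value additionally requires the cumulative distribution function of the t distribution from a statistics library.

```python
import math
import statistics


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance two-sample t-test."""
    n1, n2 = len(a), len(b)
    v1, v2 = statistics.variance(a), statistics.variance(b)  # sample variances
    se2 = v1 / n1 + v2 / n2                                  # squared standard error
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(se2)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = se2 ** 2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
    return t, df
```

Unlike Student's t-test, this form does not pool the two variances, so the degrees of freedom are generally non-integer and fall below n1 + n2 - 2 when the group variances differ.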