From Moorfields Eye Hospital, London, United Kingdom.
| Abstract |
|---|
METHODS. Thirty-degree sectors of rim area, as defined by an experimental reference plane, were analyzed for change with respect to different statistical limits of variability (80%, 90%, 95%, 98%, 99%, and 99.9%) in the longitudinal image series of 62 eyes from 30 ocular hypertension converters and 32 normal control subjects. A criterion requiring that change is repeatable in two of three consecutive tests (the 2-of-3 criterion) was compared with a single-test strategy not requiring confirmation, and four other plausible criteria. The influence of these various parameters on sensitivity and the false-positive rate was evaluated. The same series were also assessed for change by the known method of computer-generated probability maps.
RESULTS. More sectors were identified as progressing in converter eyes than in control eyes at every limit of variability. With stricter limits of variability and a requirement of confirmation, fewer sectors were identified as changing, especially in control eyes. The 2-of-3 criterion had the most favorably balanced sensitivity and false-positive rates: these were, for the 90% limit of variability, 90.0% and 6.2%, respectively, and for the 95% limit, 83.3% and 3.1%, respectively. Confirmed rim loss in converter eyes was most frequent in the disc poles and corresponded with the field hemisphere of conversion in 80%. Probability maps detected significant and repeatable change in 26 (86.7%) of 30 converter eyes and 14 (43.8%) of 32 control eyes.
CONCLUSIONS. This study was conducted to optimize and validate an approach for identifying progression. The method distinguished eyes with glaucomatous change from unchanging control eyes.
Different methods have been proposed for evaluating sequential topographical data. Clusters of pixels in pairs of topography images may be compared for change and the results expressed as probability maps.4 5 6 7 8 Alternatively, a level in topography can be set by a reference plane to define parameters such as the neuroretinal rim and cup, which can then be analyzed for change.9 10 11 12 As an extra step, criteria have been introduced requiring that change be repeatable in consecutive tests in groups of pixel clusters5 or parameters11 12 before being attributed to progression. The benefit of such criteria, especially repeat testing, ought to be firmly and empirically justified before they are widely implemented, as their use can be resource intensive. Not least, they directly affect the appraisal of progression, and clinicians need to know what test results mean when managing patients. Reported methods have been validated against computer simulation,4 serial visual field analysis by Humphrey Statpac 2 (Carl Zeiss Meditec, Dublin, CA),5 subjectively assessed stereoscopic disc photographs,5 12 and in primates with experimental glaucoma.11
We have recently described13 an analytical approach for identifying glaucomatous change in regions of the neuroretinal rim based on testing 30° rim area sectors, as defined by a new experimental reference plane14 15 that facilitates reproducible measurement. We showed how rim area variability can be estimated and accounted for in each sector of each ONH by statistical limits of variability calculated from the single topography images obtained at every test visit within an image series. To judge progression, putative change in each rim area sector from any number of visits over time is simultaneously assessed and weighed against its own variability, and only change repeatedly exceeding variability in two of three consecutive tests is attributed to progression. In the initial construct, variability was arbitrarily defined by 95% confidence limits. We tested this approach in the longitudinal image series of eyes with ocular hypertension that unambiguously progressed to develop reproducible visual field defects ("ocular hypertension converters"), and in the unchanging eyes of normal control subjects. Results indicated that eyes with progression could be distinguished from those that were unchanging.
In the present study we wanted to optimize this approach by testing and validating various parameters of analysis for detecting change, to provide a sound basis for the approach's possible clinical use. We have studied how different statistical definitions of variability influence the identifying and verifying of change and show how these can be applied to assessing rim loss. We then evaluated the regional correspondence between progressive rim loss and serial perimetry and compared our analytical approach with an alternative technique using probability maps to detect change.4 5
| Methods |
|---|
![]() | (1) |
The outer extent of the rim coincides with the contour line, marking the inner margin of the scleral ring of Elschnig. In this study, the same observer drew the contour line in each subject's baseline mean topography image (JT). Mean topography images are derived on computer from triplets of single topography images (HRT software ver. 2.01; Heidelberg Engineering). Contour lines are exported to other mean and single topography images in each series. Only images with mean pixel SD < 50 µm are used and grainy images with a honeycombed appearance are excluded.
Testing Different Limits of Variability.
The limits of variability define the smallest amount of change we can expect to detect above test variability. Measurement variability in each sector of an image series is estimated and accounted for by way of limits of variability. These limits are modeled from intravisit difference estimates, calculated as the area difference in each rim sector between pairs of same-visit, single topography images of the same eye. The number of difference estimates per visit varies with the number of topographies acquired in that visit. The number of difference estimates in any image series equals the total of all difference estimates from all visits. All are used to calculate the limits of variability for that eye. The limits define the extent to which measurements vary from the baseline. For an image series, limits of variability for each sector (VARLIM) can be calculated by
![]() | (2) |
Here, each difference estimate is the sector rim area difference between pairs of intravisit single topography images, i indexes the ith difference estimate, X is the mean of the difference estimates, n is the number of difference estimates, and Y is the value of the t-statistic for the degrees of freedom of the difference estimates, corresponding to a chosen two-tailed probability. We had previously arbitrarily defined Y by a probability of 0.05.13 In the present study, we evaluated the effect of a range of probability levels for Y of P = 0.2, 0.1, 0.05, 0.02, 0.01, and 0.001, corresponding to limits of variability ranging from 80% to 99.9%, on the identification of change and distinguishing between progressing and unchanging eyes.
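The calculation described above can be sketched in Python. The equation itself is not legible in this copy, so the form used here is an assumption: the limit is taken as the t critical value multiplied by the sample standard deviation of the pooled intravisit differences about their mean, with n − 1 degrees of freedom. The function names, the sign convention for each pairwise difference, and the rim area values are hypothetical.

```python
from itertools import combinations
from statistics import stdev

def intravisit_differences(visits):
    """Pool pairwise rim-area differences between same-visit single images.

    `visits` is a list of visits; each visit is a list of sector rim areas,
    one per single topography image acquired at that visit.
    """
    diffs = []
    for areas in visits:
        diffs.extend(a - b for a, b in combinations(areas, 2))
    return diffs

def limit_of_variability(diffs, t_crit):
    """ASSUMED form of the limit: t critical value times the sample SD.

    `t_crit` is the two-tailed t value for the chosen probability and
    len(diffs) - 1 degrees of freedom, looked up by the caller.
    """
    return t_crit * stdev(diffs)

# Hypothetical sector rim areas (mm^2): three visits, three images each.
visits = [
    [0.120, 0.118, 0.123],
    [0.119, 0.121, 0.117],
    [0.122, 0.120, 0.118],
]
diffs = intravisit_differences(visits)   # nine difference estimates
# Two-tailed P = 0.05 with 8 degrees of freedom gives t = 2.306.
limit_95 = limit_of_variability(diffs, t_crit=2.306)
```

Because the t value grows as the sample shrinks, a limit computed this way widens for short series and narrows as more same-visit images accumulate.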
Collation of Data.
Measurements of rim area in sequential mean topography images are plotted as rim area profiles. Profiles are plots of rim area by angular location round the ONH (0° to 360°, with 0° temporal, 90° superior, 180° nasal, and 270° inferior) from the same image series. This represents rim area at different points in time in a common graph. Lower limits of variability for each sector are plotted relative to the baseline profile, with the region beneath the lower limits termed the zone of change. Rim area in a sector that diminishes, exceeds its limit of variability, and enters the zone of change is taken to represent tentative change. Only confirmed change meeting set criteria is attributed to progression ("progressed sectors"). Eyes having at least one progressed sector were deemed to have changed. In judging progression, rim data from any number of visits over time can be simultaneously assessed and weighed against variability. In the present study, analysis was modified so that the possibility of change could be assessed by several limits of variability simultaneously.
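As an illustration of the zone of change, the following minimal sketch flags tentative change per sector. It assumes that the lower limit for a sector is its baseline rim area minus its limit of variability; the function name and the numbers are hypothetical.

```python
def tentative_change(baseline, follow_up, limits):
    """Return one flag per sector.

    A sector is flagged when its follow-up rim area falls below the
    baseline rim area minus that sector's limit of variability, that is,
    when it enters the zone of change.
    """
    return [f < b - lim for b, f, lim in zip(baseline, follow_up, limits)]

# Hypothetical values for three sectors (mm^2).
baseline = [0.12, 0.10, 0.15]
follow_up = [0.11, 0.10, 0.10]
limits = [0.02, 0.01, 0.03]
flags = tentative_change(baseline, follow_up, limits)
```

Running the same function once per limit of variability (80% through 99.9%) gives the simultaneous graded assessment described above.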
Testing Different Criteria for Verifying Change.
We had established a criterion for verifying change, requiring that any change must be demonstrated in two of three consecutive tests before being attributed to progression.13 This was to ensure that identified change was consistent with true disease-induced change and not variability. In the present study, we compared this 2-of-3 criterion with (1) a strategy that does not require confirmation (the "single strategy") at different limits of variability. Sectors were called progressed sectors for a strategy if they met that strategy's criterion for change. The number of progressed sectors arising by different limits of variability in each strategy was plotted in bar graphs. The 2-of-3 criterion was then compared with other plausible criteria: (2) 2-of-2 consecutive tests, (3) 3-of-3 consecutive tests, (4) two adjacent sectors in a single test, and (5) two adjacent sectors in 2-of-3 consecutive tests. Receiver operating characteristic (ROC) curves were plotted to evaluate how well each criterion distinguished eyes with glaucomatous change from the unchanging eyes of normal control subjects for each limit of variability. Any progressed sectors detected in control eyes were considered false positives.
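The consecutive-test criteria can be expressed as a single windowed check. The sketch below is one possible reading, not the study's software: it treats a k-of-n criterion as met when any run of at most n consecutive tests contains at least k flagged tests, so that two consecutive flagged tests already satisfy 2-of-3. The adjacent-sector criteria are not covered, and all names are hypothetical.

```python
def k_of_n_consecutive(flags, k, n):
    """True if some run of up to n consecutive tests has at least k flags.

    `flags` holds one boolean per follow-up test for a single sector,
    in chronological order.
    """
    return any(sum(flags[i:i + n]) >= k for i in range(len(flags)))

def eye_progressed(sector_flags, k, n):
    """An eye is deemed changed if at least one sector meets the criterion."""
    return any(k_of_n_consecutive(flags, k, n) for flags in sector_flags)

# One sector flagged at tests 2 and 4 of four follow-up tests.
series = [False, True, False, True]
single = k_of_n_consecutive(series, 1, 1)        # single strategy
two_of_two = k_of_n_consecutive(series, 2, 2)    # needs back-to-back flags
two_of_three = k_of_n_consecutive(series, 2, 3)  # allows one miss in between
```

In this example the single strategy and the 2-of-3 criterion are met, whereas 2-of-2 is not, which shows how allowing a third test can recover change that one unconfirmed retest would otherwise discard.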
Excluding Changes in Image Size.
We screened image series to ensure they were free of magnification changes over time. The fit of exported contour lines to the ONH margin was subjectively examined in each follow-up mean topography image and compared with fit in the baseline image. Fit was considered poor if the ONH's transverse dimensions (x-y axis) differed from that of the exported contour line. In such cases, distances between landmarks (vessel bifurcations in the peripapillary region) were compared between the follow-up and baseline image using the software's "interactive measures" function, always after having exported the contour line. Seven repeat measurements were made in each image, and the median value for each image was determined. Percentage of change was the ratio of the median measurements at baseline to those in follow-up images. Measurement change exceeding an arbitrary cutoff of 5% above baseline was tantamount to changed size. Series having these changes were excluded from analysis for progression. The same person performed all checking (JCHT).
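The size check can be sketched as follows. The exact definition of percentage change is not fully specified in the text, so this example assumes it is the absolute difference between the follow-up and baseline medians, expressed as a percentage of the baseline median. The function name and the measurements are hypothetical.

```python
from statistics import median

def size_changed(baseline_measurements, follow_up_measurements, cutoff_pct=5.0):
    """Flag a follow-up image whose landmark distance differs from baseline.

    Each argument is a list of repeat measurements of the same landmark
    distance (seven per image in the text). ASSUMPTION: change is the
    absolute relative difference between the two medians.
    """
    base = median(baseline_measurements)
    follow = median(follow_up_measurements)
    pct_change = abs(follow - base) / base * 100.0
    return pct_change > cutoff_pct

# Hypothetical landmark distances in arbitrary units.
baseline = [100, 101, 99, 100, 102, 98, 100]
stable = [103, 102, 104, 103, 101, 103, 105]
magnified = [107, 106, 108, 107, 109, 105, 107]
```

Here `size_changed(baseline, stable)` is false (3% change) and `size_changed(baseline, magnified)` is true (7% change).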
Correspondence between Morphologic Change and Serial Perimetry.
The spatial relationship between neuroretinal rim loss and visual field change was then examined by associating confirmed rim loss in an ONH hemisphere (superior or inferior) with the presence of confirmed perimetric change representing conversion in the opposite visual field hemisphere.
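A minimal sketch of this hemisphere matching follows, using the angular convention given earlier (0° temporal, 90° superior, 180° nasal, 270° inferior). It assumes that the 30° sectors start at 0°, so that no sector straddles the horizontal meridian; the function names are hypothetical.

```python
OPPOSITE = {"superior": "inferior", "inferior": "superior"}

def onh_hemisphere(sector_start_deg):
    """Map a sector's starting angle to its ONH hemisphere.

    ASSUMPTION: sectors start at multiples of 30 degrees, so starts of
    0 to 150 lie in the superior half and 180 to 330 in the inferior half.
    """
    return "superior" if sector_start_deg % 360 < 180 else "inferior"

def rim_loss_matches_field(progressed_sector_starts, field_hemisphere):
    """True if any progressed sector lies opposite the converted field half."""
    target = OPPOSITE[field_hemisphere]
    return any(onh_hemisphere(s) == target for s in progressed_sector_starts)

# Inferior rim loss (sectors starting at 240 and 270 degrees) is compared
# with conversion in the superior visual field hemisphere.
match = rim_loss_matches_field([240, 270], "superior")
```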
Comparison with Detection of Change by Computerized Probability Maps.
The same image series were also analyzed by probability maps (HRT software ver. 2.01b-MS, 1999 with probability map analysis) as described in detail by Chauhan et al.4 5 elsewhere. The nature of analysis in this version of software is identical with that in the present HR2 Explorer for Windows software (Reuter M, Heidelberg Engineering, personal communication, May 2003). Briefly, the software condenses 10° HRT images with 256 x 256 pixels to arrays of 64 x 64 superpixels, each a grid of 4 x 4 pixels representing topographical height. Each successive follow-up visit is compared statistically with a common baseline visit, with data from each visit derived from its three single topography images, and spatial associations between superpixels statistically accounted for as previously described.4 5 Computerized color-coded probability maps show regions in which a significant (P < 0.05) increase (red) or decrease (green) in topographical height is calculated to have occurred. Additional filtering to remove isolated significant superpixels is also possible (Zinser G, personal communication, June 2003). We analyzed our data with and without this filtering. An eye was judged to have significant change if it had at least one cluster of 20 contiguous red superpixels present in the same location in three consecutive images.5 Converter eyes meeting this criterion were considered true positives, whereas normal control eyes with corresponding findings were considered false positives.
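The cluster criterion applied to the probability maps can be illustrated with a short sketch. It is not the HRT software's algorithm. It assumes that "same location" means the superpixels that are red in all three consecutive maps, and that contiguity means sharing an edge (4-connectivity). All names are hypothetical.

```python
from collections import deque

def largest_cluster(mask):
    """Size of the largest edge-connected group of True cells in a 2D grid."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    best = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            size, queue = 0, deque([(r, c)])
            seen[r][c] = True
            while queue:  # breadth-first flood fill of one cluster
                y, x = queue.popleft()
                size += 1
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            best = max(best, size)
    return best

def repeatable_change(red_maps, min_cluster=20, n_consecutive=3):
    """True if a cluster of red superpixels persists across consecutive maps.

    `red_maps` is a chronological list of boolean grids, one per follow-up
    visit, marking superpixels with a significant decrease in height.
    """
    for i in range(len(red_maps) - n_consecutive + 1):
        window = red_maps[i:i + n_consecutive]
        rows, cols = len(window[0]), len(window[0][0])
        overlap = [[all(m[r][c] for m in window) for c in range(cols)]
                   for r in range(rows)]
        if largest_cluster(overlap) >= min_cluster:
            return True
    return False
```

For the 64 × 64 superpixel arrays described above, each grid would have 64 rows of 64 booleans.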
Criteria for Selecting Subjects
Sixty-two longitudinal image series from 32 normal control eyes and 30 age-matched ocular hypertension converters were analyzed. Each group had been expanded from their original 20 subjects presented in a previous study,13 all of whom were included in the present study. Each subject contributed images from only one eye to analysis: from a randomly selected eye in control subjects and the eye that had converted in converter subjects. Converter subjects and normal control subjects had regularly attended the Ocular Hypertension and Early Glaucoma Research Clinic at Moorfields Eye Hospital and had undergone imaging on at least six separate occasions over a minimum of 3 years. This study adhered to the tenets of the Declaration of Helsinki, having appropriate institutional review board approval and subjects' informed consent.
Normal control eyes were taken to be unchanging. Normal subjects were volunteers comprising spouses or friends of hospital patients, hospital staff, or members of external nonmedical social organizations. They repeatedly had intraocular pressure (IOP) less than 22 mm Hg, serially normal and reliable Humphrey 24-2 visual fields (Carl Zeiss Meditec) with Advanced Glaucoma Intervention Study (AGIS)18 visual field scores of 0, 3; no concurrent ocular disease or previous intraocular surgery; no family history of glaucoma; refractive errors less than ±6 D, and age of more than 40 years. ONH appearance was not taken into account for entry into the study. Image data from all available test visits was used in testing normal series to identify change.
Converters were assumed to have glaucoma progression in the eye that had converted. They initially had a diagnosis of ocular hypertension with IOP consistently 22 mm Hg or more in one or both eyes without IOP-lowering treatment, open angles on gonioscopy, initially normal Humphrey 24-2 visual fields with AGIS scores of 0, determined after a learning period of three consecutive tests, refractive errors less than ±6 D, no concurrent ocular disease or previous intraocular surgery, and age of more than 40 years. In addition, even without field defects, IOP of more than 30 mm Hg was an indication for medical treatment to lower IOP below that level. Converters had, over the course of monitoring, shown development of visual field abnormality, according to AGIS criteria (score > 0), that was reproducible in the same location on three consecutive tests. A glaucoma expert independently confirmed this. Other possible causes of visual field defects were excluded. Patients meeting these criteria were treated medically to lower IOP. ONH appearance was not part of the criteria for inclusion. Converter and normal control groups had equivalent durations of total follow-up (control eyes: 5.7 ± 0.85 years, converter eyes: 6.2 ± 0.52 years) and number of imaging sessions. In the converter group, image series were analyzed up to the time when progression was identified (if it was), so that length of follow-up and number of images in each eye's series varied with time between the baseline and point of confirmed change. To calculate limits of variability, single topography images were used from all visits until the test visit in which confirmed change was identified.
| Results |
|---|
| Discussion |
|---|
Fine-tuning the various analytical parameters of the approach by testing different limits of variability and criteria for confirming change yielded the following. First, stricter limits of variability within each criterion reduced the number of progressed sectors in both converter and normal eyes. Normal eyes were affected proportionately more, however, indicating a relative reduction in false positives. Second, the 2-of-3 criterion had fewer false positives but reasonably preserved sensitivity compared with the single strategy. The 2-of-3 criterion was more sensitive and specific than the other criteria we tested. Only the criteria requiring change in two adjacent sectors in 2-of-3 tests and 3-of-3 consecutive tests had better specificity, but their respective sensitivities of 60% and 77% were not optimal. Third, as with stricter limits of variability, the 2-of-3 criterion also reduced the number of progressed sectors in normal eyes, resulting in false positives being eliminated for confidence limits of 98% and above. Sensitivity for the 98% limit, however, was only 67%. ROC curve analysis of the 2-of-3 criterion indicated that the optimal balance of sensitivity and specificity was achieved at either the 90% or 95% limits of variability. The 90% limit had a sensitivity of 90% and false-positive rate of 6.2%; the 95% limit had a sensitivity of 83.3% and false-positive rate of 3.1%. The 90% limit was more sensitive to change but the 95% limit had marginally fewer false positives. Separation between converters and control eyes for the 95% limit of variability of the 2-of-3 criterion was high: converters had 76 times as many progressed sectors as normal control eyes.
Our approach is designed to deal with variability in three separate steps: experimental reference plane for intervisit variability,14 15 limits of variability for intravisit variability,13 and verification criteria for any remaining random variability.13 Images from any number of visits, not just pairs of visits, can be assessed simultaneously. Its limits of variability take into account all within-visit single topography images from all test visits, using all available data in a series. For example, an image series comprising three standard test visits, each with three single topography images, yields nine point estimates for calculating limits; but if in a fourth visit, say, six images are acquired, 15 point estimates become available from that visit alone, yielding 24 estimates for calculating limits of variability (see the Methods section).13 Hence, larger samples for estimating intravisit variability can quickly be obtained as needed and flexibly, without unduly burdening testing resources. Limits also have an inbuilt mechanism for statistically adjusting for sample size so that they widen with smaller samples to reflect increased uncertainty and vice versa.
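The per-visit counts quoted above follow from counting unordered pairs of same-visit images. A minimal sketch, with hypothetical function names:

```python
from math import comb

def estimates_per_visit(n_images):
    """Number of pairwise difference estimates from one visit's images."""
    return comb(n_images, 2)

def total_estimates(images_per_visit):
    """Pool the pair counts over every visit in an image series."""
    return sum(estimates_per_visit(k) for k in images_per_visit)

three_standard_visits = total_estimates([3, 3, 3])
six_image_visit = estimates_per_visit(6)
```

Three images give 3 pairs per visit, so three standard visits give 9 estimates, and a single six-image visit gives 15. The count grows roughly with the square of the number of images, which is why a few extra acquisitions at one visit enlarge the sample quickly.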
We showed how sequential data can be evaluated by graded limits of variability to give extra information on the nature of progression in each eye. The stricter the limit of variability, the greater the magnitude of change needed to exceed it, and the more likely it is that measured change is not due to variability alone. Thus, the exceeding of stricter limits of variability reflects a greater probability of change than when less strict limits are exceeded (illustrated in Figs. 4 and 5). Conversely, that glaucomatous change could be too small to exceed test variability can also be judged subjectively based on the quantitative analysis, as illustrated in Figure 3. Thus, the approach's method of objective analysis combined with the option of qualitative appraisal provides a composite framework that is potentially useful for guiding clinical decisions.
Having an empiric basis for duplicate testing is important for determining the usefulness of a test, but also because duplicate testing affects clinical resource allocation. For example, a 3-of-3 criterion5 for confirming change requires 50% more testing than a 2-of-2 criterion.11 12 With analysis by our approach, we found that most tentative change could be confirmed by just two consecutive tests. Allowing a third test in the event of a second test's failing to confirm tentative change increased sensitivity but did not compromise specificity, as evident in the relative positions of the ROC curves for the 2-of-2 and 2-of-3 criteria in Figure 2.
In probability maps, verification of change requires clusters of at least 20 significant superpixels in the same location on three consecutive tests.5 Applying this to our data set, we found reasonable sensitivity but many false positives. Filtering for randomly occurring significant superpixels reduced false positives though also sensitivity. It could be that it is hard to adequately estimate and account for both intravisit and intervisit variability just from three images at baseline and at follow-up. Also, topographical height over the ONH fluctuates relative to regions of the peripapillary surface (used for z-axis zero-referencing) in a way that is complex, unpredictable, and not easily accounted for.17 We studied only the red superpixels representing decreased topographical height; how clusters of green superpixels should be interpreted is not clear to us.
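As a worked check of the probability map figures, sensitivity and the false-positive rate follow directly from the eye counts reported in the abstract (26 of 30 converter eyes and 14 of 32 control eyes). The function name is hypothetical.

```python
def rate_pct(n_flagged, n_total):
    """Percentage of eyes in a group that met the criterion for change."""
    return 100.0 * n_flagged / n_total

# Converter eyes flagged are true positives; control eyes flagged are
# false positives.
sensitivity = rate_pct(26, 30)
false_positive_rate = rate_pct(14, 32)
specificity = 100.0 - false_positive_rate
```

These evaluate to about 86.7% and 43.8%, matching the reported values, and imply a specificity of roughly 56%.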
The tradeoff of introducing criteria to improve test specificity is that the severity of minimally detected change is not the same at all points along an ROC curve, nor for different criteria of duplicate testing. This should be noted when interpreting our ROC curve analysis in Figure 2. Stricter limits of variability within an ROC curve can be expected to detect change that is more severe, possibly when underlying disease is also worse, compared with less strict limits. Likewise, each ROC curve in Figure 2 probably represents a different level of change corresponding to a different stage of disease, especially when compared with the single strategy. Our subjects did not prospectively undergo duplicate testing. Instead, their preexisting longitudinal measurements were examined for repeatable change. The degree to which our results had "bias" depends on the underlying rates of progression and how quickly duplicate testing was completed in each series. Other comparable studies having criteria for repeatable change have also used retrospective assessment, and their findings should be interpreted accordingly.5 11 12 Still, our findings on testing by the 2-of-3 criterion are in line with what is predicted by theoretical modeling,22 23 namely, that the 2-of-3 criterion markedly improves specificity but does not appreciably compromise sensitivity.
Our validation data set may not have fully represented the whole spectrum of disease severity. We tested only converters because unequivocally telling that they had progression (converter visual fields simply had to change from normal to abnormal) is much simpler and more unambiguous than judging progression in eyes already having established glaucomatous field defects.22 24 25 Whereas converters might be considered to have early glaucoma, we found that some already had quite extensive cupping. Nevertheless, rim loss could be identified in these eyes, and Figure 5 shows an example. Because variability is accounted for separately in each part of each ONH, and because the experimental reference plane is customized to suit the morphology of each ONH, we do not expect morphologic variations to pose problems to our identifying change.
Most converters (80%) had detectable rim loss that matched the field hemisphere in which conversion occurred. Why the remainder did not have matching rim loss is unclear. The concepts underlying field conversion and our method of detecting rim loss are different: visual sensitivity loss had simply to reach threshold, whereas for significant rim loss to be identified, change had to exceed variability. Their measurements are also scaled differently: The units of rim area are on a linear scale, but the decibel scale of visual sensitivity is logarithmic. Thus, it could be that visual sensitivity thresholds were reached before corresponding neuroretinal rim loss exceeded the statistical limits of variability. Some eyes had statistical rim loss that did not have matching confirmed field loss, and it is possible that morphologic change predated white-on-white field defects in these eyes, as has been reported by several investigators.26 27 28 29 30 31 In eyes with ocular hypertension, Kamal et al.9 10 have reported detecting rim changes by scanning laser tomography before visual field conversion. In glaucomatous eyes, Chauhan et al.5 have reported topographical changes by scanning laser tomography before detecting field progression by Humphrey Statpac-2 analysis. Our finding that change was most frequent in the poles, especially inferiorly, agrees with previous observations of disc photographs in ocular hypertension converters and early glaucoma.27 31 32 33
We have sought to empirically determine and validate optimal parameters for analysis in the analytical approach. We found that the limits of variability of 90% or 95% provided reasonable cutoffs for identifying progression when the 2-of-3 criterion was used for verification, and our results compared favorably with the alternative technique of probability maps. Characteristics of identified rim loss broadly corresponded to serial perimetry and also published descriptions based on serial photography of the disc at a similar stage of disease. In our approach, the assessing of change by a range of graded limits of variability can give extra information on the nature of change and is potentially useful clinically. Having investigated these issues, we have now turned our attention to evaluating this approach in glaucoma suspects and eyes with ocular hypertension, as well as in eyes with diverse presentations of glaucoma of varying severity.
| Footnotes |
|---|
Submitted for publication January 12, 2003; revised June 8 and August 20, 2003; accepted September 16, 2003.
Disclosure: J.C.H. Tan, None; R.A. Hitchings, None
The publication costs of this article were defrayed in part by page charge payment. This article must therefore be marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.
Corresponding author: Roger A. Hitchings, Research and Development, Moorfields Eye Hospital, City Road, London EC1V 2PD, UK; roger.hitchings{at}moorfields.nhs.uk.
| References |
|---|