From the Department of Ophthalmology and Visual Sciences, Dalhousie University, Halifax, Nova Scotia, Canada.
| Abstract |
|---|
METHODS. Fifteen patients with glaucoma who had early to moderately advanced visual field loss with SAP (mean MD, −4.0 dB; range, +0.2 to −16.1) were enrolled in the study. Patients attended three sessions. During each session, one eye was examined twice with FDT2 (24-2 threshold test) and twice with SAP (Swedish Interactive Threshold Algorithm [SITA] Standard 24-2 test), in random order. We compared threshold values between FDT2 and SAP at test locations with similar visual field coordinates. Test-retest variability, established in terms of test-retest intervals and standard deviations (SDs), was investigated as a function of visual field sensitivity (estimated by baseline threshold and mean threshold, respectively). The magnitude of visual field defects apparent in total and pattern deviation probability maps was compared between the two techniques by ordinal scoring.
RESULTS. The global visual field indices mean deviation (MD) and pattern standard deviation (PSD) of FDT2 and SAP correlated highly (r > 0.8; P < 0.001). At test locations with high sensitivity (>25 dB with SAP), threshold estimates from FDT2 and SAP exhibited a close, linear relationship, with a slope of approximately 2.0. However, at test locations with lower sensitivity, the relationship was much weaker and ceased to be linear. In comparison with FDT2, SAP showed a slightly larger proportion of test locations with absolute defects (3.0% vs. 2.2% with SAP and FDT2, respectively; P < 0.001). Whereas SAP showed a significant increase in test-retest variability at test locations with lower sensitivity (P < 0.001), there was no relationship between variability and sensitivity with FDT2 (P = 0.46). In comparison with SAP, FDT2 exhibited narrower test-retest intervals at test locations with lower sensitivity (SAP thresholds <25 dB). A comparison of the total and pattern deviation maps between the two techniques showed that the total deviation analyses of FDT2 may slightly underestimate the visual field loss apparent with SAP. However, the pattern deviation maps of both instruments agreed well with each other.
CONCLUSIONS. The test-retest variability of FDT2 is uniform over the measurement range of the instrument. These properties may provide advantages for the monitoring of patients with glaucoma that should be investigated in longitudinal studies.
Frequency-doubling technology (FDT) perimetry is one of the newer psychophysical techniques for visual field examination. Empiric investigations have shown the first-generation FDT device to perform well at detecting glaucomatous visual field loss [8]. Importantly, the variability characteristics of FDT perimetry appear more uniform across the measurement range than those of SAP. Response variability, measured by the slope of the psychometric function, was shown to be independent of visual field loss [9], and the test-retest intervals of the first-generation device (FDT1) were more uniform over the instrument's dynamic range than those of SAP [10].
A second-generation FDT instrument (FDT2; Humphrey Matrix, Carl-Zeiss Meditec) has recently become available. Its stimuli, smaller than those of the original FDT device, permit a larger number of visual field locations to be examined, providing greater detail of the spatial distribution of visual field loss. In addition, thresholds are estimated by a maximum-likelihood strategy [11] which, by presenting a constant number of four stimuli at each location, ensures uniform test duration independent of the level of field loss.
The main objective of this study was to compare threshold estimates and their test-retest variability between FDT2 and SAP. To address the methodological issues that arise when tests are compared to an imperfect gold standard, we reduced the effects of variability by examining each patient six times with both techniques. Principal curve analysis [12] (which accounts for variability in both the dependent and independent variables) was then performed to determine the relationship between threshold values obtained with the two techniques. The test-retest variability of both techniques was investigated as a function of visual field sensitivity and established in terms of test-retest intervals and standard deviations (SDs). Total and pattern deviation probability maps obtained with both techniques were compared by calculating ordinal defect scores.
| Methods |
|---|
Study Sample and Testing
Fifteen patients with glaucoma (mean age, 66.3 years; range, 56.1 to 80.6) with early to moderate visual field loss (mean MD [SITA Standard], −4.0 dB; range, +0.2 to −16.1) were recruited from the glaucoma clinics of the QEII Health Sciences Centre (Halifax, Nova Scotia, Canada). Criteria for inclusion in the study were a clinical diagnosis of open-angle glaucoma, refractive error within 5 D equivalent sphere or 3 D astigmatism, best-corrected visual acuity of at least 6/12 (+0.3 logMAR), and prior experience with FDT1 perimetry and SAP. Patients were examined over three sessions within a period of 4 weeks. Within each session, the randomly selected study eye was examined twice with FDT2 (24-2 threshold test) and twice with SAP (SITA Standard; 24-2 test). The order of the tests was randomized, and a mandatory break of 6 minutes was given between examinations. All participants wore the appropriate refractive correction for each test. The study adhered to the tenets of the Declaration of Helsinki. The protocol was approved by the Queen Elizabeth II Health Sciences Centre Research Ethics Committee, and all participants gave written informed consent.
Analyses
Comparison of Threshold Estimates.
To establish the relationship between the threshold estimates of FDT2 and SAP, we compared the mean result of the six tests at those locations at which the stimulus centers were within 2° of each other (Fig. 2, closed circles and unshaded squares). In addition to the test locations on either side of the vertical meridian, we excluded the two locations near the blind spot.
A principal curve [12] implementation [19] available in the open-source statistical environment R [20] was used to derive the relationship between the two perimetric techniques. This mathematical method finds a best fit between two variables by minimizing the residuals perpendicular to the fitted curve in both the x and y variables. In contrast to the more familiar least-squares type of regression (which minimizes the residuals in the dependent variable only), the principal curve algorithm is not based on the assumption that the independent variable is measured without error, and it is therefore immaterial which one of the two perimetric techniques is represented on the x-axis. To avoid floor effects that occur when the lower limit of the dynamic range of either technique has been reached, we excluded the data from test locations at which one or more of the threshold estimates were 0 dB with either FDT2 or SAP.
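The symmetry property described above can be illustrated with a much simpler stand-in. The sketch below is not the principal curve algorithm used in the study; it fits a straight line by orthogonal (total least squares) regression, which likewise minimizes perpendicular residuals, and compares it with ordinary least squares. The data are synthetic, and the names `orthogonal_slope`, `ols_slope`, `sap`, and `fdt2` are hypothetical.

```python
import math
import random

def orthogonal_slope(x, y):
    """Slope of the line minimizing summed squared perpendicular distances."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    # Closed-form total least squares solution (requires sxy != 0).
    return (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy)

def ols_slope(x, y):
    """Ordinary least-squares slope: residuals are minimized in y only."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx

# Synthetic, purely illustrative data (not taken from the study):
# both variables are noisy measurements of the same latent quantity.
rng = random.Random(0)
latent = [rng.uniform(25, 33) for _ in range(200)]
sap = [t + rng.gauss(0, 1) for t in latent]
fdt2 = [2 * t - 30 + rng.gauss(0, 1) for t in latent]

orth_xy = orthogonal_slope(sap, fdt2)   # FDT2 as a function of SAP
orth_yx = orthogonal_slope(fdt2, sap)   # SAP as a function of FDT2
ols_xy = ols_slope(sap, fdt2)
ols_yx = ols_slope(fdt2, sap)

print(orth_xy * orth_yx)  # 1 up to rounding: the same line either way round
print(ols_xy * ols_yx)    # below 1: two different lines, depending on the axis
```

With the orthogonal fit, swapping the axes returns the reciprocal slope, so the choice of x-axis does not matter. With least squares, the two fits disagree whenever both variables contain noise.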
Test-Retest Variability.
Test-retest intervals describe the range within which the central 90% of follow-up thresholds are likely to fall, for each level of baseline threshold, if no real change has taken place. We derived such intervals for combinations of two baseline and two follow-up tests with FDT2 and SAP. Since there were no significant learning or fatigue effects within or between the three sessions with either of the two techniques (repeated-measures ANOVA of MD and PSD, P > 0.1), we treated the order of the six examinations as interchangeable, thus using all 90 possible combinations of independent baseline and follow-up pairs of tests. For example, a baseline threshold calculated from tests 1 and 2 was compared to all six possible follow-up thresholds obtained by pair-wise combination of tests 3, 4, 5, and 6. The test-retest intervals were then established by calculating the empiric 5th and 95th percentiles of the distribution of follow-up thresholds, stratified for the baseline value.
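The count of 90 combinations follows from choosing 2 of 6 tests as the baseline (15 ways) and 2 of the remaining 4 as follow-up (6 ways). The sketch below enumerates these pairs and derives empirical 5th and 95th percentiles per baseline stratum. It assumes that each pair of tests is averaged and that baselines are stratified to the nearest whole dB; neither detail is stated in the text, and all function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def baseline_followup_pairs(n_tests=6):
    """All (baseline pair, follow-up pair) combinations that share no test."""
    tests = range(1, n_tests + 1)
    pairs = []
    for baseline in combinations(tests, 2):
        remaining = [t for t in tests if t not in baseline]
        for followup in combinations(remaining, 2):
            pairs.append((baseline, followup))
    return pairs

PAIRS = baseline_followup_pairs()  # 15 baseline pairs x 6 follow-up pairs

def percentile(values, p):
    """Empirical percentile (p in 0..100) with linear interpolation."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * p / 100
    lo = int(rank)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)

def retest_intervals(locations):
    """Map each baseline stratum (dB) to its (5th, 95th) follow-up percentiles.

    `locations` holds one list of six thresholds (tests 1..6) per location.
    Averaging each pair and rounding the baseline are assumptions.
    """
    strata = defaultdict(list)
    for six in locations:
        for (b1, b2), (f1, f2) in PAIRS:
            baseline = (six[b1 - 1] + six[b2 - 1]) / 2
            followup = (six[f1 - 1] + six[f2 - 1]) / 2
            strata[round(baseline)].append(followup)
    return {b: (percentile(v, 5), percentile(v, 95))
            for b, v in sorted(strata.items())}

print(len(PAIRS))  # number of independent baseline/follow-up combinations
```

Calling `retest_intervals` on the per-location thresholds of all patients would pool follow-up values across locations that share a baseline value, which is how the stratification is interpreted here.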
For a quantitative comparison of the relationships between threshold and test-retest variability with FDT2 and SAP, we investigated the SD of the six repeated threshold estimates as a function of their mean value at each test location, using linear regression analyses of log SD versus mean. Because the dB scales of FDT2 and SAP are based on two distinct definitions (provided in Description of Techniques), we first transformed all thresholds into instrument-independent units of log Weber contrast sensitivity to render the data numerically comparable. As the dynamic range of both perimeters is limited, the sensitivity of severely damaged visual field locations may not always be truly measurable. Some thresholds may be estimated at 0 dB, since the maximum stimulus contrast has been reached. Because this floor effect can lead to an artifactual decrease in the test-retest variability, the regression analyses were confined to locations at which all six threshold estimates were >0 dB. Similarly, if all six threshold estimates had the same value, the resultant SDs (0) may unduly bias the estimation of the underlying variability, and data from such locations were also excluded from the regression analyses of both techniques.
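The exclusion rules and the regression of log SD on mean can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: thresholds are rescaled by a per-instrument factor only (0.05 log units per dB for FDT2 and 0.1 for SAP, as given in the Discussion), and the additive offset needed for true log Weber contrast sensitivity is omitted because it is not specified in this text. The example thresholds are made up.

```python
import math
from statistics import mean, stdev

def location_points(db_thresholds, log_units_per_db):
    """Return (mean, log10 SD) per location after applying both exclusions."""
    points = []
    for six in db_thresholds:
        if min(six) <= 0:
            continue  # at least one estimate at the 0 dB floor
        if len(set(six)) == 1:
            continue  # all six estimates identical, so SD would be 0
        scaled = [t * log_units_per_db for t in six]
        points.append((mean(scaled), math.log10(stdev(scaled))))
    return points

def fit_line(points):
    """Ordinary least-squares fit of log SD (y) against mean (x)."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

# Made-up thresholds (dB) for four locations, six tests each.
example = [
    [30, 29, 31, 30, 28, 30],  # kept
    [0, 5, 3, 0, 2, 4],        # excluded: contains 0 dB estimates
    [20, 20, 20, 20, 20, 20],  # excluded: zero SD
    [12, 18, 9, 15, 20, 11],   # kept
]
points = location_points(example, log_units_per_db=0.1)
slope, intercept = fit_line(points)
print(len(points), slope)
```

A negative slope indicates that variability grows as sensitivity falls, the pattern the study reports for SAP; a slope near zero corresponds to uniform variability.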
Comparison of Total and Pattern Deviation Probability Maps.
For the comparison between the probability maps of FDT2 and SAP, we derived an ordinal defect score for each visual field test. Each test location was assigned a value ranging from 0 to 4 according to its probability (P > 5%, P < 5%, P < 2%, P < 1%, P < 0.5%, respectively) in the printout. These scores were then summed across the 52 test locations of the entire visual field (excluding the foveal test point and the two locations at and above the blind spot), and the global defect sums were compared between total and pattern deviation probability maps of FDT2 and SAP.
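The scoring rule above amounts to a lookup and a sum. In the minimal sketch below, the category labels and the function name `defect_score` are hypothetical, and the input is assumed to be the probability category read from the printout for each of the 52 analysed locations.

```python
# Ordinal scores for the five probability categories named in the text.
SCORES = {"P>5%": 0, "P<5%": 1, "P<2%": 2, "P<1%": 3, "P<0.5%": 4}

N_LOCATIONS = 52  # foveal point and the two blind-spot locations excluded

def defect_score(categories):
    """Sum the ordinal scores over the analysed test locations of one test."""
    if len(categories) != N_LOCATIONS:
        raise ValueError(f"expected {N_LOCATIONS} locations, got {len(categories)}")
    return sum(SCORES[c] for c in categories)

# Example: a field with three flagged locations, all others within normal limits.
field = ["P>5%"] * 49 + ["P<5%", "P<1%", "P<0.5%"]
print(defect_score(field))  # 1 + 3 + 4 = 8
```

The score therefore ranges from 0 (no location flagged) to 208 (all 52 locations at P < 0.5%), and would be computed separately for the total deviation and pattern deviation maps of each instrument.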
| Results |
|---|
Comparison of Threshold Estimates
The measurement scales of FDT2 and SAP appeared numerically similar; 90% of threshold estimates in our study were between 3 and 32 dB with FDT2 and between 5 and 32 dB with SAP. At test locations with high sensitivity (mean SAP threshold >25 dB), there was a close and approximately linear relationship between the mean thresholds of FDT2 and SAP (Fig. 3). In this range of sensitivities, the principal curve had a slope of approximately 2.0. At locations with lower sensitivity, however, the spread of the data points increased considerably and the curve became progressively shallower. In comparison to FDT2, SAP estimated a slightly larger proportion of absolute defects (proportion of threshold estimates at 0 dB, 3.0% vs. 2.2% with SAP and FDT2, respectively; P < 0.001).
8 dB) across virtually the entire measurement range. In contrast, the intervals of SAP were narrow (3 dB) with thresholds near 30 dB, but broadened considerably with lower thresholds, ranging over nearly 15 dB at locations with baseline values near 10 dB (Fig. 4).
| Discussion |
|---|
For visual field locations with high sensitivity (>25 dB with SAP), our data showed a close association between the point-wise mean thresholds of FDT2 and SAP. Within this range, the relationship appeared nearly linear, with a gradient of 2, conforming to what would be expected from consideration of the techniques' distinct definitions of the dB scale. With FDT2, a 1-dB change in threshold refers to a 0.05-log unit change in stimulus contrast, whereas with SAP, a 1-dB change refers to a contrast change of 0.1 log units. This means that, at test locations with early damage, visual field changes over time should be numerically twice as large with FDT2 as with SAP. At visual field locations with lower sensitivity (<25 dB with SAP), however, the dispersion of the data increased considerably, and the principal curve fit showed a markedly nonlinear relationship between the mean thresholds obtained with FDT2 and SAP. In several patients, we observed visual field locations with near-absolute losses with one technique but not the other. Of interest, the proportion of absolute defects (threshold estimates of 0 dB) was slightly lower with FDT2 than with SAP (2.2% vs. 3.0%, P < 0.001), indicating that FDT2, despite its smaller contrast range, does not suffer from more substantial floor effects than SAP. Because each observation is the average of six tests, the large dispersion of the data from damaged visual field locations is more likely to reflect genuine differences caused by the psychophysical properties of the techniques' stimuli (see Fig. 1) than random measurement variability. Owing to the large dispersion of the data, a simple conversion factor to translate from one scale to the other would offer limited practical utility outside the narrow range of near-normal threshold values. An important focus of future work will be to investigate systematically the sources of discrepancy, for example, the size and location of the visual field loss.
Specifically, further work is needed to establish how FDT2 characterizes the small yet deep losses that may occur near the fixation point (Fig. 7).
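The expected gradient of 2 discussed above can be written out from the two dB definitions. As a sketch, assume that a given change in sensitivity corresponds to the same change in log stimulus contrast, \Delta \log C, on both instruments; this is an idealization that the data support only at high sensitivities.

```latex
\Delta \log C
  = 0.05\,\Delta\mathrm{dB}_{\mathrm{FDT2}}
  = 0.1\,\Delta\mathrm{dB}_{\mathrm{SAP}}
\quad\Longrightarrow\quad
\frac{\Delta\mathrm{dB}_{\mathrm{FDT2}}}{\Delta\mathrm{dB}_{\mathrm{SAP}}}
  = \frac{0.1}{0.05} = 2
```

Under this assumption, a 3-dB loss with SAP would correspond to a 6-dB loss with FDT2.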
The nearly uniform test-retest intervals of FDT2 contrasted with those of SAP, which expanded markedly at lower threshold values. These results, in conjunction with the regression analyses, show that the test-retest variability of FDT2 is remarkably constant over the entire measurement range of the instrument. These findings are in agreement with previous investigations of frequency-of-seeing curves obtained with stimuli similar to those of FDT2 [21] and SAP [22]. At test locations with low sensitivity (<25 dB with SAP), FDT2 showed narrower test-retest intervals than SAP, but the opposite was true at test locations with high sensitivity (>25 dB with SAP). Because the test-retest intervals are dependent on the measurement scales, the relationship between the threshold estimates of both techniques must be taken into account to interpret these findings. The relatively larger test-retest intervals of FDT2 in areas of high visual field sensitivity, for example, may be offset if changes over time are also larger with this technique, as is suggested by the relationship between the threshold estimates (Fig. 3). Longitudinal studies are needed to establish whether, and how, the variability characteristics of FDT2 translate into tangible benefits to the follow-up of patients with glaucoma. However, the nearly uniform test-retest variability characteristics of FDT2 make this test a promising candidate for such trials. Since our data were established from a small sample of patients experienced with perimetry, it is possible that the test-retest variability of less experienced patients is somewhat higher. This is unlikely, however, to influence the fundamental differences in the relationship between visual field sensitivity and variability with FDT2 and SAP observed in our study.
The comparison between the defect scores indicated that, with the total deviation analyses, FDT2 may classify fewer locations as outside normal limits compared with SAP. One potential explanation for this finding is that the relatively larger influence of media transmittance, pupil size, and other parameters on FDT perimetry [23] causes relatively larger between-subject variability in the general height of the visual field measured with FDT2, increasing the width of the total deviation normal limits and thereby making these analyses less sensitive to early field loss. Alternatively, there may have been systematic differences between the databases from which the normative values for the FDT2 and SITA Standard had been established. We ruled out the latter explanation by recomputing the probability maps based on normative data from 232 healthy control subjects who had been examined with both FDT2 and SITA Standard, which were provided to us by Chris A. Johnson (Discoveries in Sight, Portland, OR). The systematic differences in the total deviation defect scores between FDT2 and SAP persisted in that analysis, indicating that biases in the normative databases are unlikely to be the sole explanation for these findings. However, the overall differences between the total deviation maps of both techniques were small and clinically relevant only in patients with very early visual field loss. In contrast to the total deviation analyses, the defect scores established from the pattern deviation maps agreed closely between FDT2 and SAP.
In conclusion, our study showed that the test-retest variability of FDT2 perimetry is uniform over the entire measurement range of the instrument. In combination with the relatively lower variability of the threshold estimates in areas of visual field damage, these properties may allow an earlier and more accurate detection of visual field progression and therefore prove advantageous for the monitoring of patients with glaucoma. These benefits, however, can only be conclusively demonstrated in prospective longitudinal studies.
| Acknowledgements |
|---|
| Footnotes |
|---|
Supported by Grant 41340 from the E. A. Baker Foundation of the Canadian National Institute for the Blind (PHA), Grant MOP-11357 from the Canadian Institute of Health Research (BCC), and an unrestricted grant from Welch-Allyn (BCC).
Submitted for publication February 1, 2005; revised March 2, 2005; accepted March 10, 2005.
Disclosure: P.H. Artes, None; D.M. Hutchison, None; M.T. Nicolela, None; R.P. LeBlanc, None; B.C. Chauhan, Welch-Allyn (F)
The publication costs of this article were defrayed in part by page charge payment. This article must therefore be marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.
Corresponding author: Paul H. Artes, Department of Ophthalmology and Visual Sciences, 1278 Tower Road, Halifax, Nova Scotia B3H 2Y9, Canada; paul.h.artes{at}gmail.com.