From the Departments of ¹Ophthalmology and ²Neurology, University of Iowa, College of Medicine, Veterans Administration Hospital, Iowa City, Iowa; and the ³Department of Ophthalmology and Visual Sciences, Dalhousie University, Nova Scotia, Canada.
| Abstract |
|---|
METHODS. One eye each of 120 patients with glaucoma was examined on the same day with these four perimetric tests and retested 1 to 8 weeks later. The decibel scales were adjusted to make the tests' scales numerically similar. Retest variability was examined by establishing the distributions of retest threshold estimates for each threshold level observed at the first test. The 5th and 95th percentiles of the retest distribution were used as point-wise limits of retest variability. Regression analyses were performed to quantify the relationship between visual field sensitivity and variability.
RESULTS. With SAP III, the retest variability increased substantially with reducing sensitivity. Corresponding increases with SAP V, Matrix, and Motion perimetry were considerably smaller or absent. With SAP III, sensitivity explained 22% of the retest variability (r²), whereas corresponding data for SAP V, Matrix, and Motion perimetry were 12%, 2%, and 2%, respectively.
CONCLUSIONS. Variability of Matrix and Motion perimetry does not increase as substantially as that of SAP III in damaged areas of the visual field. Increased sampling with the larger stimuli of these techniques is the likely explanation for this finding. These properties may make these stimuli excellent candidates for early detection of visual field progression.
With regard to the subject, there are several reasons that results of testing the sensory visual system and, in particular, of the visual field vary. Instructions can have a substantial effect. We have shown substantial variations in the results of conventional automated perimetry by instructing subjects in three generally accepted ways.4 Also, visual attention can have a marked effect on thresholds.5 Other cognitive factors are subject motivation and the effects of visual testing fatigue.6 Although in individual subjects these factors can contribute greatly to the retest variability, for most subjects, the effects of these factors can be controlled.7 8 9
Perimetric variability is divided into short-term (intratest) and long-term (intertest) variabilities.10 11 12 The short-term variability or "fluctuation" (the variability of repeated measurements of the visual field within one test session) depends primarily on the slope of the frequency (probability) of seeing curve for that test location, and on the frequency of false-positive and -negative responses.
In a study from our laboratory, the frequency of seeing curves of patients with glaucoma was generated by using a custom test program. The patients were tested with size I, III, and V stimuli in areas of normal sensitivity and areas of 10- to 20-dB loss. The same test locations were used in the same session for both sizes. As shown by a steepening of the slope of the frequency of seeing curve, variability substantially decreased in the patients with glaucoma with 10- to 20-dB loss as the stimulus size was increased.13
This high variability of standard automated perimetry (SAP) is very apparent in analysis of glaucoma clinical trials. For example, repeatability was a major issue in the Normal-Tension Glaucoma Study.14 In another clinical trial, the Ocular Hypertension Treatment Study, almost 90% of the 703 retests failed to confirm defects found on earlier examinations.15 The many studies of visual field progression in patients with glaucoma are a testament to the difficulties encountered due to high retest variability.
Since a major problem of perimetry is this marked increase in variability with decreasing sensitivity, we investigated the retest variability of patients with glaucoma and normal subjects with four perimetry tests: standard automated perimetry size III (SAP III), SAP size V (SAP V), Matrix (FDT II), and Motion perimetry, to test the hypothesis that larger stimuli do not show the marked increase in variability seen in damaged areas of the visual field with SAP III.
| Methods |
|---|
Normal participants were included if they had no history of eye disease, refractive error within ±5 D sphere and ±2 D astigmatism, no history of diabetes mellitus or systemic arterial hypertension, and a normal ophthalmic examination including 20/25 or better Snellen acuity. The subjects either had undergone a complete eye examination within 12 months before this study or were examined by an ophthalmologist on the day of testing to ensure normal ocular health. One eye of each participant was randomly chosen as the study eye.
Visual Testing
All subjects underwent automated perimetry using program 24-2 of the Humphrey Field Analyzer (HFA, Carl Zeiss Meditec, Dublin, CA). For SAP, the stimuli of Goldmann size III (0.43° diameter, 4 mm²) were used with the SITA standard 24-2 algorithm. Goldmann size V stimuli (1.72° diameter, 64 mm²) were used along with full-threshold testing; there is no SITA program currently available for size V stimuli. Our pilot data and the work of Artes et al.16 show that the differences between estimates of the SITA and full-threshold strategies are minor compared to their test-retest variability. We chose size III SITA so that we would be comparing the test most commonly used in clinical practice.
We followed the manufacturer's recommendations for using corrective lenses. Care was taken to prevent lens rim artifacts. The subjects had testing in one eye, chosen at random, but the same eye was used for all tests. All visual field examinations met the following reliability criteria: fixation losses < 20% or normal gaze tracking, false-positive rate < 10%, and false-negative rate < 33%. The four tests were administered in a random order with at least a 5-minute rest break between the testing sessions.
Motion Perimetry
Motion perimetry uses random dot cinematograms as visual stimuli. The dots are randomly displayed on a gray background with a luminance of 31.5 apostilbs on a standard VGA video display of 640 × 480 pixels (Fig. 1). The motion targets were circular random dot cinematograms within which 50% of the dots moved centrifugally and 50% moved in random directions. This is a commonly used stimulus for motion perception experiments because the random placement of stimulus dots reduces the effect of positional cues.17 Subjects were asked to respond by touching the monitor with a light pen in the area in which they had perceived the movement.
Testing was done in a darkened room using an IBM-compatible 486 computer with software we have developed.12 The patient's appropriate near correction was again used. Care was taken to prevent lens rim artifact by asking whether the subject could see each corner of the video display while looking at the fixation target.
To facilitate the comparison between the four methods, the threshold values of Motion perimetry were transformed such that their numerical ranges were similar to those of the other tests. This does not imply that we were able to standardize the tests' dynamic ranges. Motion perimetry measures a size threshold (18 steps) with the smallest size being 1 dB. We transformed the data by the following equation: (18 – observed threshold) × 2.
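The rescaling can be illustrated with a minimal Python sketch. It assumes that the subtraction is applied before the doubling, so that the 18 size steps map onto values from 0 to 34; the function name `motion_to_db_like` is hypothetical and is not taken from the authors' software.

```python
def motion_to_db_like(observed_threshold: float) -> float:
    """Rescale a Motion perimetry size threshold (steps 1..18) to a dB-like value.

    Assumption: the transformation is (18 - observed threshold) * 2, so the
    smallest detectable size (step 1) gives the highest rescaled value.
    """
    if not 1 <= observed_threshold <= 18:
        raise ValueError("size threshold must lie between 1 and 18 steps")
    return (18 - observed_threshold) * 2


# Smallest and largest size steps, rescaled.
print(motion_to_db_like(1), motion_to_db_like(18))
```

Under this reading, a smaller size threshold (better motion sensitivity) yields a larger rescaled value, which matches the direction of the SAP decibel scale.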
Matrix
Humphrey Matrix frequency doubling perimetry was performed either before or after conventional perimetry testing with at least a 5- to 10-minute rest period between examinations to reduce the effect of fatigue. Testing was performed in a dim room using the Humphrey Matrix device (Carl Zeiss Meditec, Inc.). The patients were asked to press a response button whenever they saw a small patch of alternating light and dark gray bars at any location within the field of view. Each test lasted approximately 5 to 6 minutes per eye. For this test, patients wore their own prescription glasses and did not use an eye patch to cover the fellow eye.9 Rest breaks were allowed when requested. Details of the testing can be found in publications by Anderson et al.20 21
Statistical Analysis
To examine retest variability with the four methods, we established the distributions of threshold estimates at retest, conditional on their value at the first test. Test locations with similar thresholds at the first test were collected together, and the threshold estimates obtained at the second test (retest) were examined. To summarize the retest distributions, we derived the empiric 5th and 95th percentiles (retest intervals) for all threshold levels observed at the first test. These intervals therefore show, for each threshold level, the range within which subsequent estimates from the same location are likely to fall with 90% probability, given that no real change has taken place. Test-retest intervals were established separately for patients with glaucoma and healthy control subjects. To test the hypothesis that retest variability is independent of sensitivity (no increase of variability with decreasing visual field sensitivity), we performed a Friedman ANOVA. For this, the differences between test and retest sessions at each test location were established and related to the mean of both measurements.
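The retest intervals described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the analysis code used in the study: locations are grouped by their exact first-test value (the text says only that similar thresholds were collected together), percentiles are linearly interpolated, and the names `percentile` and `retest_intervals` as well as the example data are hypothetical.

```python
from collections import defaultdict


def percentile(sorted_values, p):
    """Linearly interpolated percentile (p in 0..100) of a non-empty sorted list."""
    if not sorted_values:
        raise ValueError("need at least one value")
    rank = (len(sorted_values) - 1) * p / 100.0
    low = int(rank)
    high = min(low + 1, len(sorted_values) - 1)
    return sorted_values[low] + (rank - low) * (sorted_values[high] - sorted_values[low])


def retest_intervals(test, retest, lower=5, upper=95):
    """Map each first-test threshold to the (lower, upper) percentiles at retest."""
    if len(test) != len(retest):
        raise ValueError("test and retest must be paired")
    by_level = defaultdict(list)
    for first, second in zip(test, retest):
        by_level[first].append(second)  # pool retest values by first-test level
    return {
        level: (percentile(sorted(vals), lower), percentile(sorted(vals), upper))
        for level, vals in sorted(by_level.items())
    }


# Hypothetical paired thresholds (dB), pooled over locations and patients.
first_test = [20, 20, 20, 30, 30]
second_test = [10, 20, 30, 28, 32]
for level, (lo, hi) in retest_intervals(first_test, second_test).items():
    print(f"first test {level} dB: retest 5th-95th percentile {lo:.1f} to {hi:.1f} dB")
```

With real data, each first-test level would pool many locations; levels with few observations give unstable percentile estimates and may need to be binned together.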
| Results |
|---|
The 5th and 95th percentiles for each level of sensitivity obtained at the first test are shown in Figure 3 for patients with glaucoma as well as healthy control subjects. The 90% range of thresholds seen at test and retest, indicated by heavy lines on the x- and y-axes, respectively, indicates that the stimulus scales were similar though not identical with the four tests. Because Matrix and Motion perimetry both measure fundamentally different psychophysical thresholds, it is problematic to compare the scales with each other. In the patients with glaucoma, the findings from these plots mirrored the large increase in retest variability with SAP III and V already remarked on earlier. In comparing the retest intervals of SAP III and V, the latter showed a smaller increase in variability with decreasing sensitivity. Whereas the lower retest limit of SAP III fell to 0 dB for test locations with initial thresholds below approximately 15 dB, with SAP V this did not occur until the thresholds of the first test were reduced beyond 10 dB. It is interesting to note that the retest limits of all four tests were not symmetric around the 1:1 line. When the values of the first test were high, the retest intervals were shifted toward lower values and vice versa. This can be explained by a regression-to-the-mean effect that differs between patients with glaucoma and healthy control subjects. Low values at the first test are statistically more likely to be outliers that are followed by a higher estimate at retest. Similarly, high values are more likely to be followed by a lower value. Apart from this shift, the retest intervals were of similar width in patients with glaucoma and healthy control subjects.
| Discussion |
|---|
Our study confirms the finding of many others that, with SAP III, variability increases dramatically with a reduction in sensitivity. For example, Heijl et al.22 23 investigated the variability in 51 eyes of 51 perimetrically experienced subjects with glaucoma representing all stages of optic nerve damage. The patients, all clinically stable, were tested four times in a 4-week period. Test locations initially measured with a 6-dB loss had a 90% prediction interval from –1 to –16 dB. With an 8- to 18-dB loss initially, the 95% prediction interval nearly covered the full measurement range of the instrument (0–40 dB). An important finding of Heijl et al., also observed by others,10 24 25 26 27 28 29 is that point-wise intertest variability increases dramatically with decreasing sensitivity of the test location. This finding has a major ramification: Areas with the most visual loss have the highest variability. Therefore, the most clinically important regions are the ones in which determination of change is most difficult.
Like Heijl et al.,22 23 we found this high retest variability for size III stimuli to begin at approximately 25 dB, and by 20 dB the limits extended over almost the entire measurement range of the instrument (Fig. 3). The data suggest that the large retest variability renders a large part of the measurement range with SAP III ineffective. Inspection of Figures 2 and 3 for retest results below 20 dB with size III suggests that testing in the range of 0 to 20 dB may have little practical benefit. We do not mean to imply that 20 dB is an absolute cutoff for the increase in variability. Whether one chooses 22, 20, or 15 dB, the value of single-threshold estimates of SAP III for predicting the values during the next examination is limited, because the retest variability extends over most of the measurement range of the instrument.
SAP size V testing gave results similar to SAP III testing except that the 5th percentile peak sensitivities were approximately 2 dB higher, and the retest distributions did not extend over most of the measurement range until thresholds of the first test were below approximately 15 dB.
Motion perimetry and Matrix (FDT II) testing had considerably different results. With these two tests, which employ larger stimuli (Goldmann size III: 0.43°; Goldmann size V: 1.72°; Matrix: 4°; Motion perimetry: range 0.1–8°), there was no clinically meaningful increase in variability with decreasing sensitivity. Artes et al.30 have shown a similar independence of sensitivity and variability with Matrix perimetry.
Regression analyses showed that with SAP III, sensitivity explained 22% of the test-retest variability (r²), whereas corresponding figures for SAP V, Matrix, and Motion perimetry were 12%, 2%, and 2%, respectively. As suggested, if the analyses are performed excluding threshold values below 20 dB with SAP III and below 15 dB with SAP V, the explained variances are similar.
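How an explained-variance figure of this kind can be obtained is sketched below in Python. The sketch assumes, as one plausible reading of the Methods, that variability at each location is taken as the absolute test-retest difference and is regressed on the mean of the two measurements; the exact regression model of the study is not specified here, and the names `r_squared` and `variability_explained` and the example values are hypothetical.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two paired sequences of length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("both sequences must vary")
    return sxy ** 2 / (sxx * syy)


def variability_explained(test, retest):
    """Share of variance in |test - retest| explained by mean sensitivity."""
    means = [(t + r) / 2 for t, r in zip(test, retest)]
    abs_diffs = [abs(t - r) for t, r in zip(test, retest)]
    return r_squared(means, abs_diffs)


# Hypothetical thresholds (dB): differences grow as sensitivity falls.
print(variability_explained([30, 20, 10], [30, 18, 6]))
```

A value near 0 means that variability is roughly uniform across the sensitivity range, which is the pattern reported for Matrix and Motion perimetry.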
These findings have important implications for perimetry. As outlined earlier, there are many sources of variability, and only a limited number of approaches are available to reduce it. First, since our data suggest that, with SAP III, threshold estimates below 20 dB have little value for predicting the value at retest, one can make the case that for the purposes of detecting change, examination of such test locations could be eliminated. Whether a Bayesian rule is applied or whether two or three consecutive responses below 20 dB are used, an algorithm could be devised that could conclude with an acceptable level of certainty that the sensitivity is below 20 dB. Testing could cease at this point. Second, if a larger dynamic range is desired, a larger stimulus such as SAP V could be used. Last, since sensitivity accounts for about half as much of the variability with SAP V as with SAP III, and since variability with FDT II or Motion perimetry depends only minimally on sensitivity, these larger stimuli may have important advantages as perimetric stimuli.30
Why do stimuli of larger size have lower variability? Three studies (Swanson WH, et al. IOVS 1996;37:ARVO Abstract 4937)31 32 modeled the stimulus-response function of SAP based on the electrophysiological data of Croner et al.33 on macaque receptive fields and the anatomic data of Dacey.34 Garway-Heath et al.35 have estimated Goldmann stimulus size coverage of receptive fields based on a series of results of human ganglion cell histology. Assuming that the response is mediated primarily by the midget ganglion cells, Garway-Heath et al.35 estimated that a Goldmann size I stimulus covers two midget cell receptive fields, a size III stimulus covers approximately 20, and a size V stimulus approximately 400 receptive fields; these coverages vary slightly with eccentricity.
Given these assumptions, a spatially irregular loss of ganglion cell receptive fields produces a poorer signal-to-noise ratio for small stimuli but has a much smaller effect on larger stimuli. In other words, for a stimulus to be seen it must activate a critical number of receptive fields (i.e., a population response is required). On retesting, with the variability introduced by small eye movements,36 if fewer than this critical number of receptive fields are activated, it is less likely that the stimulus will be seen. This model predicts that in areas of visual field damage, a larger stimulus will cover more receptive fields and will increase the margin of error for small changes in stimulus position at each retest. Small stimuli may in effect fall into "holes" in the remaining "Swiss cheese-like" array of receptive fields on retest. Thus, larger stimuli should provide a more constant and improved signal-to-noise ratio and a related decrease in retest variability. Whether this effect is best explained by coverage area or by the length of the stimulus edge (since ganglion cell responses are highest around the edge of the stimulus, where less of the receptive field's inhibitory surround is being stimulated) can be argued.
This model is a qualitative one. As yet, there is no comprehensive and universally accepted quantitative model for light sensitivity perimetry. There are many reasons for this. With light sensitivity stimuli, the size of the stimulus is known but determining the effective stimulus size is problematic because of light scatter. The sensitivity to light stimuli is indirectly related to the number of functioning receptive field units (retinal ganglion cell receptive fields and the related cortical receptive fields). Another difficulty with these models is the irregular pattern of visual field damage in optic nerve disease and the resultant irregular receptive field array. Last, cortical mechanisms are important in any model of perception as shown in other studies.31 32
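The coverage argument can nevertheless be illustrated with a toy calculation in Python. It is not a quantitative model of perimetry: it assumes that each covered receptive field survives independently with the same probability, so that the number of surviving fields under a stimulus is binomial, and it ignores spatial clustering of damage, light scatter, and cortical pooling. The coverage figures are those of Garway-Heath et al. quoted earlier; the names `COVERAGE` and `coverage_cv` and the 50% survival value are hypothetical.

```python
import math

# Approximate midget receptive fields covered per Goldmann stimulus size,
# as quoted in the text from Garway-Heath et al.
COVERAGE = {"I": 2, "III": 20, "V": 400}


def coverage_cv(n_fields: int, survival: float) -> float:
    """Coefficient of variation of the surviving-field count under a stimulus.

    Assumption: fields survive independently with probability `survival`,
    so the count is Binomial(n_fields, survival) and CV = sd / mean.
    """
    if n_fields <= 0 or not 0 < survival <= 1:
        raise ValueError("need n_fields > 0 and 0 < survival <= 1")
    return math.sqrt((1 - survival) / (n_fields * survival))


# Relative fluctuation in stimulated fields when half the fields are lost.
for size, n_fields in COVERAGE.items():
    print(f"size {size}: CV = {coverage_cv(n_fields, 0.5):.3f}")
```

Under these assumptions the relative fluctuation falls with the square root of the number of fields covered, which is in the same direction as the lower retest variability observed with larger stimuli.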
A potential confound is the perimetric learning effect.6 37 38 39 However, most of this effect takes place between the first and second test. Since most of our patients were trained perimetry subjects recruited from our glaucoma clinic, we suspect this effect was modest. In a related study, we have evaluated the learning effect of subjects taking the same four tests as in this study, but subjects took the four tests once a week for 5 weeks. We found the majority of learning between test one and test two and modest learning effects of less than 1 dB for the four tests (Brito CF, et al. IOVS 2007;48:ARVO E-Abstract 1625). Another problem is the possibility of fatigue effects, since the subjects took four perimetry tests in 1 day.39 To reduce this effect, the subjects were given frequent rest breaks and generous time between tests. No time limits were imposed. Since the test order was randomized, the fatigue effects present should be distributed equally among the tests. Also, this randomization should negate possible effects on the results caused by different preadaptation levels before the commencement of each test.
Our entry criteria required that persons with glaucoma have a mean deviation on SAP III of between 0 and –20 dB. Subjects with end-stage glaucoma were not included. This may have affected the SITA size III results, since it would affect the posterior probability distribution of true thresholds when the threshold estimate is near 0 dB. We believe there were enough end-stage hemifields in our analysis to make this issue minor.
In summary, our results show that larger stimuli have more uniform variability in areas of visual field damage. A moderate reduction of variability and improvement of dynamic range can be accomplished by using size V stimuli. Also, the larger stimuli of Matrix and Motion perimetry provide more uniform variability, which may result in earlier detection of visual field change. Studies are ongoing to test this hypothesis.
| Footnotes |
|---|
Submitted for publication January 24, 2008; revised May 30, August 1, and September 12, 2008; accepted December 4, 2008.
Disclosure: M. Wall, None; K.R. Woodward, None; C.K. Doyle, None; P.H. Artes, None
The publication costs of this article were defrayed in part by page charge payment. This article must therefore be marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.
Corresponding author: Michael Wall, University of Iowa, College of Medicine, Department of Neurology, 200 Hawkins Drive 2007 RCP, Iowa City, IA 52242-1053; michael-wall{at}uiowa.edu.
| References |
|---|