From Discoveries in Sight, Legacy Health System, Portland, Oregon.
| Abstract |
|---|
METHODS. Five individuals with glaucoma (ages 52, 63, 69, 77, and 78 years) and five individuals with normal, healthy eyes (ages 25, 34, 43, 45, and 52 years), participated in the study. Each subject was experienced in automated perimetry and performed multiple, monocular baseline SITA-standard (SITA-S) 24-2 visual field tests. In addition, normal subjects completed SITA-S 24-2 field examinations in which known frequencies of FP error were introduced (0%, 5%, 10%, 20%, or 33% frequency). Likewise, the subjects with glaucoma completed visual field examinations with 0%, 20%, and 33% error introduced during the test.
RESULTS. Reported FP errors were significantly lower than the introduced frequency of error. The SITA algorithm more accurately identified FP errors when the MD and PSD diverged from normal. Test duration increased as introduced error frequencies increased. The Statpac single-field analyses indicated that two thirds of the tests with introduced errors produced a "low-patient-reliability" determination.
CONCLUSIONS. HFA II SITA-S underestimates patients' FP errors, particularly among normal patients. High FP error frequencies can have adverse effects on MD and PSD, leading clinicians and researchers to an inaccurate determination of the amount and severity of visual field loss.
Reliability affects the validity of an automated perimetric test, and it is therefore important to monitor patient responses during the test procedure. Three reliability indices are available within SITA: false positives, false negatives, and fixation losses. False-positive errors occur if patients respond when no stimulus is presented. For the purposes of this study, we define a false-positive response as randomly occurring, independent of stimulus presentation, and hence independent of any monitored response window. False-negative errors occur when the patient does not respond to a suprathreshold stimulus in an area where the threshold has already been measured. The interpretation of false-negative errors is not as clear as that of false-positive ones, because they can be produced by a variety of sources. Fixation losses occur when the patient's eye wanders from the fixation target.
The method of measuring these error rates has been to present "catch trials" during the visual field test. Catch trials allow the perimeter to estimate the patient's overall false-positive errors, false-negative errors, and fixation losses during the test.4 The Heijl-Krakau5 method for monitoring the patient's fixation includes a suprathreshold stimulus presented to the patient's blind spot. If the patient is properly fixated on the central target, he or she should not be able to see the stimulus. Improper fixation may result in an inappropriate response to a blind-spot check, but false positives may also cause inappropriate responses to blind-spot checks. According to the manufacturer of the Humphrey Field Analyzer (HFA; Carl Zeiss Meditec, Dublin, CA), visual field test results of patients whose fixation losses exceed 20% or whose false-positive or false-negative errors exceed 33% are not considered reliable.6
Katz and Sommer7 initially reported that results in 30% of normal control subjects and 45% of patients with glaucoma were unreliable by these criteria. They indicated that 41% of unreliable visual fields in patients with glaucoma and 67% of unreliable fields in normal patients were due to high fixation losses. However, if the proper precautions are taken, unreliable test results due to excessive fixation losses can be reduced to 14% by replotting the patients' blind spots.8 In addition, if the cutoff reliability criterion for fixation losses is modified to 33%, only 3% of visual field examinations fall into the "unreliable" category.9
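As an illustration only, the manufacturer's reliability cutoffs quoted above (fixation losses above 20%, or false-positive or false-negative errors above 33%) can be written as a small check. The function and parameter names are hypothetical and are not part of any HFA software; the optional `fixation_limit` argument is included only to show the relaxed 33% fixation-loss cutoff mentioned in the text.

```python
# Cutoffs quoted in the text from the HFA manufacturer's criteria (percent).
FIXATION_LOSS_LIMIT = 20.0
ERROR_LIMIT = 33.0


def is_reliable(fixation_loss_pct, false_pos_pct, false_neg_pct,
                fixation_limit=FIXATION_LOSS_LIMIT):
    """Return True if no reliability index exceeds its cutoff.

    Hypothetical helper: a field is flagged unreliable when fixation
    losses exceed `fixation_limit` or when either false-positive or
    false-negative errors exceed 33%.
    """
    return (fixation_loss_pct <= fixation_limit
            and false_pos_pct <= ERROR_LIMIT
            and false_neg_pct <= ERROR_LIMIT)


# A field with 27% fixation losses fails the standard 20% cutoff
# but passes if the cutoff is relaxed to 33%.
standard = is_reliable(27.0, 5.0, 5.0)
relaxed = is_reliable(27.0, 5.0, 5.0, fixation_limit=33.0)
```

Keeping the fixation-loss cutoff as a parameter makes it easy to compare how many fields are flagged under the two criteria discussed above.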
Catch trials add to the duration of the visual field examination and may be affected by damaged visual field locations.10 Swedish Interactive Threshold Algorithms (SITA), designed for the HFA II, were originally developed to improve patient testing reliability by reducing test duration. A portion of the reduction in test duration was due to eliminating false-positive catch trials and estimating false-positive response errors through the use of "listening windows."1 4 These windows are intervals between stimulus presentations when no patient response is anticipated. It has been reported that the minimum response time for a perimetric stimulus is approximately 180 ms.11 A response window is thus defined as the period beginning at the minimum response time, adjusted according to the patient's individual mean response time. Olsson et al.4 implemented the minimum response interval when designing their new system for evaluating response errors. The test monitors a window of time beginning immediately after the onset of a stimulus, continuing for 180 ms, and directly after a response window, continuing until the onset of the next stimulus.4 An accurate, alert test subject should not respond during these times. Thus, any response during these epochs can be considered a false-positive error. Instead of reporting these errors in ratio form, a percentage of overall error throughout the entire test is reported.
This new method does not require any additional questions during the testing process, reducing the duration of the examination. With SITA, fixation losses are still determined by using the catch trial method by presenting stimuli in the blind spot. The SITA procedures assume that patients respond at a consistent rate of error during the response windows and listening windows.4 Although the listening window method substantially increases sampling time available for estimating response errors (an average of 15 times greater), the accuracy of the algorithm has not been reported. The purpose of this study was to evaluate the SITA-S testing strategy's accuracy in reporting patient false-positive error rates and to examine the effects of known frequencies of excessive false-positive responses on mean deviation (MD), pattern SD (PSD), glaucoma hemifield test (GHT), test duration, and reliability indices.
| Methods |
|---|
Eligibility for normal subjects was based on three criteria (applied to both eyes): (1) normal visual fields, as determined by a within normal limits glaucoma hemifield test (GHT) result and P > 0.05 for PSD and MD; (2) intraocular pressure <21 mm Hg; and (3) normal optic disc appearance (determined by a previous full clinical eye examination). In addition to fulfilling these standard criteria, all the normal subjects had to have no history of systemic diseases or of taking medication known to affect vision.
Inclusion requirements for patients with glaucoma were based on a previous clinical diagnosis of glaucoma, an outside normal limits result on the GHT, and the presence of glaucomatous optic neuropathy in one or both eyes. Participants with glaucoma were selected on the basis of the severity of the disease. Glaucomatous field loss of patients with primary open-angle glaucoma was used to classify each individual's disease as mild (n = 2), moderate (n = 1), or severe (n = 2), according to the criteria of Hodapp et al.12
The Humphrey Field Analyzer II (HFA II) M750 (Carl Zeiss Meditec) was used to conduct 24-2 SITA-S visual field tests in all subjects. SITA strategies are adaptive among perimetry techniques, in that the algorithm constantly uses newly received data to recalculate thresholds throughout the test.
Five nominal false-positive error frequencies were selected for evaluation in this study: 0%, 5%, 10%, 20%, and 33%. The perimetry test operator generated aperiodic, randomly spaced responses at a predetermined mean frequency by pressing the response button while the patients completed the tests as usual. Calculations for these erratic responses were based on the knowledge that the HFA II Full Threshold testing strategy presents a stimulus to the patient at a mean period of approximately once every 2 seconds. We calculated how often an erroneous response must be made to produce a false-positive reading (i.e., for 33% error frequency, an erroneous response should be introduced, on average, every 6 seconds). We then introduced random fluctuation to the length of this period to determine when each of these responses should be introduced. False-response events were calculated for each nominal error frequency and then randomly introduced during the testing. To reduce bias, patients were not informed of the error frequency chosen by the test operator for each test run and were instructed to respond normally, despite any responses generated by the perimetrist. Subjects and the perimetrist shared the same response button throughout the testing, and the response alarm on the HFA II apparatus remained intact. A total of 25 monocular SITA-S visual field examinations were conducted in each normal patient, five at each of the four predetermined error frequencies and five baselines with no introduced false responses. The latter served to ensure the testing reliability of each patient. Nine SITA-S visual field tests were conducted in the glaucoma subjects (three each at 0%, 20%, and 33% false-positive error frequency). All visual fields were obtained using the appropriate near refractive error correction for each patient. Testing sessions were held within a 1-month period and did not exceed 1 hour. A minimum 10-minute break was given between each test during a session. 
In addition, 40 SITA-S tests (10 at each of the 5%, 10%, 20%, and 33% error rates) were conducted without a subject present. These tests were conducted to serve as a reference comparison for the internal reporting accuracy of the software and can be used to simulate an eye with complete vision loss.
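The timing of the operator-generated false responses described above can be illustrated with a short sketch. It assumes, as the text does, one stimulus roughly every 2 seconds, so that a nominal 33% error frequency corresponds to about one false response every 6 seconds. The uniform jitter, the `jitter` fraction, and the function names are assumptions made for illustration; the text does not specify how the random fluctuation was generated.

```python
import random

# Mean stimulus period assumed in the text (about one stimulus every 2 s).
STIMULUS_PERIOD_S = 2.0


def mean_interval_s(nominal_rate):
    """Mean time between false responses for a nominal error rate (0-1]."""
    if not 0 < nominal_rate <= 1:
        raise ValueError("nominal_rate must be in (0, 1]")
    return STIMULUS_PERIOD_S / nominal_rate


def schedule_false_responses(nominal_rate, test_duration_s,
                             jitter=0.5, seed=None):
    """Return aperiodic false-response times (seconds) for one test.

    Each gap is drawn uniformly from mean * (1 - jitter) to
    mean * (1 + jitter). The uniform distribution is an assumption,
    not the method reported in the study.
    """
    rng = random.Random(seed)
    mean = mean_interval_s(nominal_rate)
    t, times = 0.0, []
    while True:
        t += rng.uniform(mean * (1 - jitter), mean * (1 + jitter))
        if t >= test_duration_s:
            break
        times.append(t)
    return times


# Example: a hypothetical 5-minute test at the 33% nominal error rate.
times_33 = schedule_false_responses(0.33, 300.0, seed=1)
```

Because the gaps average the target interval, the number of scheduled responses over a test approaches the test duration divided by that interval, while the individual presses remain irregular.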
Commercial software (Statistical Package for Social Sciences [SPSS] ver. 13.0, SPSS, Chicago, IL; SigmaPlot version 8.0, Systat Software Inc., Point Richmond, CA; and Prism 4, Graphpad Software Inc., San Diego, CA) was used to conduct statistical analyses and construct graphic representations.
| Results |
|---|
Figure 1 presents the reported false-positive error rate as a function of the nominal introduced false positives for the no-patient test conditions. The slope of the linear regression was 0.893 (i.e., the SITA algorithm tended to report 89.3% of the introduced false positives). This relationship was significantly shallower than a slope of 1 (F = 88.41, df = 39, P < 0.001, r² = 0.893).
Fixation Losses
All gaze tracks were monitored to ensure that the subjects did not have fixation losses outside the acceptable range. On the baseline examinations, with no introduced false positives, the group mean fixation loss rate in normal subjects was 2.1% ± 5.19% (SD), whereas in patients with glaucoma the mean was 3.8% ± 4.51% (SD). With a 33% introduced error rate, the mean reported fixation loss rate in normal subjects was 27% ± 19.51% (SD) in comparison to a mean of 44.3% ± 24.0% (SD) for those with glaucoma. Figure 4 depicts the reported percentage of fixation losses as the introduced error rate increased in normal subjects and those with glaucoma.
Pattern Standard Deviation
The PSD increased with introduced false-positive error rate in the normal group, indicating a more irregular field, as expected. In the glaucoma group, introduction of false-positive errors decreased the PSD, because genuine glaucomatous defects were blurred out by the false-positive responses. As with MD, the PSD increased with the false-positive rate when no eye was tested.
Glaucoma Hemifield Test
With the 33% introduced error rate, only one of the normal patients' GHT results was outside normal limits on any of the tests. The GHT reading in one of the normal patients indicated abnormally high sensitivity during a baseline test. Given that our patients with glaucoma were selected in part on the basis of their GHT results, all baseline results from these patients were outside normal limits. However, once the response errors were introduced for these subjects, the GHT changed to "borderline" in one instance.
Test Duration
Mean test duration for normal subjects increased 19.7% (or 54 seconds) from mean baseline tests to 33% introduced error tests. For 5%, 10%, and 20% introduced error rates, the mean test durations increased an average of 1, 13, and 28 seconds, respectively. Similar increases were noted in the glaucoma group as well, with an increase of 31 seconds at 20% introduced error and an increase of 69 seconds (18%) at 33% introduced error.
The relationship between test duration and introduced false positives appeared to be nonlinear. However, linear regression serves as a reasonable approximation for our data. Linear regression analysis revealed highly variable slopes among all patients for test duration, but they demonstrated reasonably linear relationships with these variables, indicating an increased testing time for patients with more severe visual field defects in the presence of elevated false-positive errors.
| Discussion |
|---|
A similar study on catch trials by Vingrys and Demirel10 also suggested a need to reevaluate the acceptable level of reported error rates. They concluded that a more appropriate cutoff range for reliable visual field testing would be less than 20% false-positive errors. Our findings support their proposal.
Results of a comparable investigation by Cascairo et al.14 based on patient-generated catch trial errors are also consistent with our findings. Their study showed that all global indices and probability maps in normal, reliable patients were significantly altered from their baseline readings when reported false-positive errors reached 33%.
One limitation of having operator-introduced errors, as opposed to software-introduced errors, is that the error calculation is not as accurate. As previously stated, aperiodic responses introduced during a fixation catch trial may register as a fixation loss instead of a false-positive error. It has been reported that high fixation losses can result in poor detection of visual field loss as well as exacerbate threshold variability.15 Attribution of some false positives to fixation losses in this investigation may account for some of the unreported errors. In addition, our response error frequencies were calculated assuming a stimulus presentation every 2 seconds on average. If a subject's response time is longer than the average, the interval between stimuli will increase. The worst-case scenario is when the subject has complete vision loss or there is no patient present during the testing. In these situations, the interval between our introduced false positives may be too short to attain the desired false-positive percentage, because the interstimulus interval is automatically increased. Therefore the percentage of introduced false positives would be higher than assumed; the result of this is that the proportion of false-positive errors being reported would actually be worse than in our study.
Of the three reliability indices, false positives are the least variable in test-retest studies and occur less frequently than either fixation losses or false negatives.16 False positives may be the only good indicator of patient alertness and test reliability because high false-negative responses are associated with glaucomatous visual field damage17 and fixation losses can frequently be reduced by replotting the patient's blind spot early during the test procedure. Without an accurate false-positive report, it is difficult to establish which patients' visual fields are truly indicative of their visual function and which should be retested. Clinicians and vision researchers should be wary of the reliability of a SITA-S 24-2 visual field reading if false-positive or false-negative error reports exceed 20%. In such instances, patients should be asked to retest, and the initial data should not be considered for determining a diagnosis or concluding that progression of field loss has occurred.
Although the SITA software consistently underreports the actual false-positive error rate, high false-positive estimates in a normal subject are of greater concern than those in a patient with glaucoma. As indicated by the data from glaucoma subjects 3 and 5, our participants with the lowest baseline MD, the algorithm most accurately reports false-positive responses in patients with large field defects. Therefore, a false-positive response rate that borders on the permissible value for error in an otherwise normal eye is, in fact, much higher than the acceptable rate. Such a test should be considered unreliable. Further research, with a larger sample size, is necessary to firmly establish the accuracy of the SITA-S software in deriving and reporting response error estimates.
| Footnotes |
|---|
Disclosure: M.R. Newkirk, None; S.K. Gardiner, None; S. Demirel, None; C.A. Johnson, None
The publication costs of this article were defrayed in part by page charge payment. This article must therefore be marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.
Corresponding author: Chris A. Johnson, Discoveries in Sight, Devers Eye Institute, Legacy Clinical Research and Technology Center, 1225 N.E. Second Avenue, Portland, OR 97232; cajohnson{at}deverseye.org.