From the Academic Department of Ophthalmology, Manchester Royal Eye Hospital, University of Manchester, Manchester, United Kingdom.
| Abstract |
|---|
METHODS. A sample of 435 largely inexperienced patients underwent suprathreshold visual field examination on a perimeter that was modified to record RTs. Data were analyzed from 60,500 responses to suprathreshold stimuli and from 523 false-positive responses to catch trials.
RESULTS. False-positive responses had much more variable latencies than responses to suprathreshold stimuli. An algorithm defining RT windows on the basis of z-transformed individual latency samples correctly identified more than 70% of false-positive responses to catch trials, whereas fewer than 3% of responses to suprathreshold stimuli were classified as false-positive responses.
CONCLUSIONS. Latency analysis can be used to detect a substantial proportion of false-positive responses in suprathreshold perimetry. Rejection of such responses may increase the reliability of visual field screening by reducing variability and bias in a small but clinically important proportion of patients.
| Introduction |
|---|
Automated suprathreshold perimetry is mainly used for screening.1 2 3 4 Patients undergoing such tests are generally inexperienced and may find it difficult to establish and maintain optimal response criteria and to sustain attention. Erroneous responses to stimulus presentations increase the variability of the test result, degrading the ability to correctly classify the status of the patient's visual field. The patient's response behavior has traditionally been assessed by randomly interleaving a small proportion of catch trials (usually 3%–5% of presentations) with the test stimuli. The number of responses to false-positive catch trials (during which no stimuli are presented) reveals how likely a patient is to respond without having perceived a stimulus. Owing to the small number of catch trials, estimates of response error rates are notoriously imprecise. Using 14 catch trials, for example, a true error rate of 33% is estimated as between 14% and 57% (95% confidence interval [CI]).5 Estimates of patients' reliability based on catch trials are poor predictors of test result variability, and their clinical usefulness has been brought into question by other research groups.6 7
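The imprecision of catch-trial estimates can be illustrated with a short binomial calculation. The sketch below assumes that the quoted interval refers to the spread of observed rates k/14 (2 of 14 is about 14%, 8 of 14 is about 57%) when the true rate is 33%; that reading, and the function names, are assumptions made for illustration and are not taken from the cited source.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k responses in n catch trials at true rate p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def prob_count_between(n, p, k_min, k_max):
    """Probability that the observed count lies in [k_min, k_max] inclusive."""
    return sum(binomial_pmf(k, n, p) for k in range(k_min, k_max + 1))


n_catch = 14       # number of false-positive catch trials (from the text)
true_rate = 0.33   # true error rate (from the text)

# Assumed reading: observed rates of 14% to 57% correspond to 2..8 responses.
coverage = prob_count_between(n_catch, true_rate, 2, 8)
print(f"P(2 <= k <= 8 | n=14, p=0.33) = {coverage:.3f}")
```

Under these assumptions the probability comes out at roughly 0.95, which is consistent with the width of the interval quoted in the text: with so few catch trials, an observed rate anywhere from 14% to 57% is compatible with the same underlying behavior.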
Olsson et al.8 have proposed a more precise measure of false-positive rate based on analysis of response times (RTs), the time between the onset of the stimulus and the patient's response. (They used the term "reaction time," although it is not clear whether patients were urged to respond as quickly as possible to the stimuli. In keeping with established practice, patients were not instructed to respond rapidly in our study. We therefore prefer to use "response time" rather than "reaction time," because the latter has a different meaning in the literature.9) By eliminating the need for false-positive catch trials, the technique of Olsson et al. also contributes to the reduction in test time achieved with the Swedish Interactive Threshold Algorithm (SITA) of the Humphrey Visual Field Analyzer (Humphrey Instruments, San Leandro, CA).10 Although the assumptions underlying the algorithm of Olsson et al. appear plausible, they have not yet been validated, and the paper of Olsson et al.8 did not report the proportion of false-positive responses detected by their analysis.
This article reports on the RT distributions of true- and false-positive responses in a large sample of perimetrically inexperienced patients examined with a suprathreshold strategy. It describes an algorithm that estimates the typical time frame for a patient's responses (RT window) and reports on the proportion of false-positive catch trial responses outside this interval. It proposes that the quality of visual field data can be improved by rejecting responses with latencies outside the RT window and by re-examining the respective locations.
| Methods |
|---|
Patients and Data Collection
Data were collected from 435 patients (mean age, 45 years;
range, 12–81), attending a Manchester city-center optometric practice
for routine eye care. The only selection criterion was clinical need
for visual field screening, based on risk factors for glaucoma or
neurologic disease. Most patients had had no experience with automated
perimetry. The tests were administered by seven optometric assistants,
and patients were instructed using the conventional directions for
automated perimetry.11
No instructions were given
regarding the speed of the response, and neither patients nor
optometric assistants were aware that RTs were being recorded. Our
study followed the tenets of the Declaration of Helsinki, in that the
research was free of any risks and no additional burden was placed on
the patients. The complete sample contained data from 976 visual field
tests, yielding RTs from 60,500 responses to suprathreshold stimuli and
from 523 false-positive responses to catch trials. A total of 403
patients completed the examination of both eyes. The mean duration of
the test was 3.81 minutes per eye (range, 3.23–7.93).
Suprathreshold Visual Field Test
The stimulus matrix consisted of 68 locations distributed over
the central 25° of the patient's visual field. The interstimulus
interval was 1490 msec, and approximately 25% false-positive and
false-negative catch trials were randomly interleaved with the
suprathreshold stimuli. The spatial sequence of presentations was
randomized. No acoustic warning or feedback signals were provided.
After a brief demonstration phase at the onset of the test, the
instrument would estimate the general height (GH) of the patient's
visual field according to a previously described
algorithm.12
In brief, six stimuli were presented as a
1-dB up/1-dB down staircase at each of four "seed" locations
(12.7° from fixation in each visual field quadrant), and the GH was
estimated by averaging the staircase levels of those seed locations at
which sensitivity was within normal limits. Subsequently, the 50%
detection threshold of each test location was predicted from normative
values, adjusted according to the GH estimate. During the
suprathreshold phase of the test, each location was examined with a
stimulus that was presented for 200 msec at an intensity 5 dB brighter
than the predicted local threshold. If that stimulus was not detected,
the presentation was repeated at a later stage of the test. Visual
field locations at which both the initial and the repeat 5-dB
suprathreshold stimuli had been missed were classified as defective and
re-examined with suprathreshold increments of 8 and 12 dB.
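
The test logic described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the instrument's implementation: the GH is expressed here as a mean deviation from normative seed values, `normal_limit_db` is a hypothetical cutoff for "within normal limits," and stimulus levels are assumed to be on an attenuation scale on which lower dB values are brighter. All function names are invented for this example.

```python
def estimate_general_height(seed_levels_db, seed_norms_db, normal_limit_db=5.0):
    """Average deviation from normal over seed locations within normal limits.

    Assumption: a seed counts as normal when its staircase level is no more
    than `normal_limit_db` below the normative value (hypothetical cutoff).
    """
    deviations = [
        level - norm
        for level, norm in zip(seed_levels_db, seed_norms_db)
        if norm - level <= normal_limit_db
    ]
    if not deviations:
        return 0.0  # no usable seed location: fall back to normative values
    return sum(deviations) / len(deviations)


def stimulus_level_db(norm_db, gh_db, increment_db=5.0):
    """Level of a stimulus `increment_db` brighter than the predicted threshold.

    Assumption: attenuation scale, so "brighter" means a lower dB value.
    """
    predicted_threshold = norm_db + gh_db
    return predicted_threshold - increment_db


def next_step(seen_initial, seen_repeat=None):
    """Follow-up for one location after the 5-dB suprathreshold stimulus."""
    if seen_initial:
        return "done"
    if seen_repeat is None:
        return "repeat 5 dB later in the test"
    if seen_repeat:
        return "done"
    return "defective: re-examine at 8 and 12 dB"


# Four seed locations; the last one is depressed and excluded from the average.
gh = estimate_general_height([29, 29, 29, 20], [30, 30, 30, 30])
level = stimulus_level_db(norm_db=28, gh_db=gh)
```

With these made-up seed values the GH is 1 dB below normal, so a location with a normative threshold of 28 dB would be tested at 22 dB.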
Analysis
RT data were analyzed from responses to 5-dB suprathreshold
stimuli, excluding those locations at which the stimulus was missed at
two presentations and that were consequently flagged as defective. To
derive the RT distributions for responses to suprathreshold stimuli and
to false-positive catch trials, all available data were used. For the
summary statistics and comparisons of right and left eye data, only
results of tests of patients who completed the examination of both eyes
were analyzed. If these patients had repeated a visual field test, only
data of the last examination were included.
| Results |
|---|
RT distributions varied greatly among patients. Across the sample of patients, individual median RTs ranged from 316 to 908 msec with a group mean of 451 msec (Fig. 1). Individual interquartile ranges of RTs extended from 41 to 422 msec, with a group mean of 108 msec. Interquartile ranges were related to median RTs (Pearson r = 0.74, P < 0.001). Median RTs decreased slightly with age (9 msec per decade, 95% CI 4–13 msec per decade; P < 0.001), but the correlation was poor (Pearson r = 0.18). Median RTs from the right and left eyes of individual patients were highly related (Pearson r = 0.72, P < 0.001). Tests of the left eye, which were always performed last, yielded responses that were, on average, 12 msec faster (P < 0.001, paired t-test) and somewhat less variable (mean difference of individual interquartile range 9 msec, P < 0.001, paired t-test) than those of the right eyes. The mean false-positive rate of the left eye results (2.8%) was slightly lower than that of the right eye results (3.4%, P = 0.103, paired t-test).
Figure 4 shows the performance of this algorithm at different values of Vcrit. With a criterion value of 0.5, more than 70% of false-positive responses to catch trials, but less than 3% of stimulus responses, occur outside the RT window. In 35% of tests, no stimulus responses were detected outside the RT window defined by this criterion. In fewer than 5% of tests was the proportion of stimulus responses outside the RT window greater than 10%.
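
The two proportions reported here can be computed from any set of RTs once an RT window is available. The following sketch uses invented RT values and a hypothetical window; it only illustrates the bookkeeping and does not reproduce the study data.

```python
def fraction_outside(rts_ms, window_ms):
    """Fraction of response times falling outside the (low, high) RT window."""
    if not rts_ms:
        raise ValueError("at least one response time is required")
    low, high = window_ms
    outside = sum(1 for rt in rts_ms if rt < low or rt > high)
    return outside / len(rts_ms)


# Hypothetical data for one test (msec); the window is assumed, not derived.
window = (300, 600)
stimulus_rts = [350, 400, 450, 500]
catch_trial_rts = [120, 900, 420, 1300]

stimulus_rejected = fraction_outside(stimulus_rts, window)
catch_detected = fraction_outside(catch_trial_rts, window)
```

In this toy example, three of the four catch-trial responses fall outside the window while no stimulus response does. The first quantity plays the role of a detection rate, the second of a false-rejection rate.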
| Discussion |
|---|
Latencies for false-alarm responses in detection tasks have been reported to exhibit characteristics different from latencies of true responses.9 These studies used reaction time paradigms with highly trained observers, low rates of stimulus presentation, and randomized, exponentially distributed interstimulus intervals. Patients examined with suprathreshold perimetry tend to have little experience with demanding psychophysical tests and are not usually urged to respond rapidly. The interstimulus intervals are brief and regular (1000–1600 msec),16 and patients with little or no visual field loss respond to most presentations. This leads to a high level of stimulus expectation that may, in turn, increase the likelihood of anticipatory false-positive errors.17
Olsson et al.8 reported a new method to estimate the false-positive rate in threshold perimetry, based on the frequency of answers during intervals in which no true responses were expected ("listen time"). Estimates were derived by maximum-likelihood estimation using RT, change in RT and stimulus intensity. Olsson et al. demonstrated that their estimates exhibited much lower between-test variability than the conventional catch-trial estimates. They did not, however, demonstrate the validity of their technique or present data to justify the assumptions it is based on. The algorithm to determine the listen time was not described, and there have been no reports on what proportion of false-positive response errors are detected by this technique.
Better estimates of error rates do not, per se, improve the test result. The sole use of patient reliability indices is to classify as unreliable those test results that exceed arbitrarily defined cutoff criteria. Faced with such results, the only options open to the clinician are either to base decisions on unreliable evidence or to repeat the test. If suspect responses can be detected by their latencies, the clinical data can be improved at source by re-examining the respective visual field locations during the test. Our data highlight striking differences in the RT distribution between responses to suprathreshold stimuli and false-positive responses to catch trials. When corrected for between-subject variability in average latency and dispersion, the distribution of stimulus RTs is compact and highly peaked, whereas the RT distribution of false-positive responses is much broader. Owing to the high variability between the RT distributions of different patients, classification of suspect responses on the basis of population-based RT windows would be inefficient, disadvantaging patients with long or short mean RTs. Conventional statistical methods for the detection of outliers (e.g., the Grubbs test18) could be used to detect suspect responses when the sample of responses is large and the false-positive rate is low. These methods fail when the sample size is small and there is a moderate or large number of false-positives with highly variable RTs.
The algorithm described herein is an ad hoc solution to this problem. It classifies responses as suspect false-positives if their removal reduces the sample variance by an amount greater than a predetermined criterion. The optimal criterion value for any particular application can be estimated from empiric data. Small criterion values lead to exclusion of a relatively larger number of responses (Fig. 4). With a criterion value of 0.5, more than 70% of false-positive catch trial responses in our sample had RTs outside the RT window, compared with fewer than 3% of stimulus responses. As an example, Figure 2 shows the RTs and the calculated RT window of a patient with a high false-positive rate.
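
The variance-reduction idea can be sketched as follows. This is one possible reading of the rule stated in the paragraph above and should not be taken as the algorithm used in the study: here the criterion `v_crit` is assumed to be the fractional reduction in sample variance obtained by removing the single most extreme response, removal is repeated until no candidate meets the criterion, and `min_keep` is a hypothetical safeguard. Because the scale of the criterion is an assumption, the value 0.5 used below is not necessarily comparable with the criterion value quoted in the text.

```python
from statistics import variance


def rt_window(rts_ms, v_crit=0.5, min_keep=3):
    """Trim extreme RTs while each removal cuts the variance by more than v_crit.

    Returns ((low, high), suspects): the RT window spanned by the retained
    responses and the sorted list of responses classified as suspect.
    """
    if not rts_ms:
        raise ValueError("at least one response time is required")
    kept = sorted(rts_ms)
    suspects = []
    while len(kept) > min_keep:
        total = variance(kept)
        if total == 0:
            break  # identical RTs: nothing left to trim
        # Only the fastest or the slowest remaining response is a candidate.
        var_without_low = variance(kept[1:])
        var_without_high = variance(kept[:-1])
        if var_without_low <= var_without_high:
            index, reduced = 0, var_without_low
        else:
            index, reduced = len(kept) - 1, var_without_high
        if 1 - reduced / total <= v_crit:
            break  # removal no longer reduces the variance enough
        suspects.append(kept.pop(index))
    return (kept[0], kept[-1]), sorted(suspects)


# Hypothetical RTs (msec) with one very late response.
window, suspects = rt_window([400, 420, 410, 430, 415, 405, 425, 1400])
```

In this example the response at 1400 msec accounts for almost all of the sample variance and is classified as suspect; removing a further response would reduce the variance by only a quarter, so trimming stops and the window spans 400 to 430 msec. Because the window is derived from each patient's own responses, patients with long or short typical RTs are not penalized.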
The minimum visual reaction time has been estimated to be approximately 180 msec.19 Responses occurring less than 180 msec after stimulus onset are therefore false-positives. The lower limit of the RT window, as set at the criterion value of 0.5, was between 267 and 471 msec in 95% of tests (median: 344 msec). Wall et al.20 investigated perimetric reaction times in normal and glaucomatous observers. Reaction time decreased exponentially with increasing suprathreshold increments, similar to the relationship first reported by Piéron in 1914.9 In suprathreshold perimetry, stimuli are presented at a fixed suprathreshold increment (5 dB in this study) and are presumed to be of similar visibility across the visual field (consistent with the lack of evidence for a relationship between RT and stimulus eccentricity in our data). Responses with atypically long latencies are not necessarily false-positives. They may be manifestations of the occasional lapse of attention or may be due to threshold elevation if the visual field location is defective. Because latency analysis is used to reject suspect responses and to selectively re-examine the respective locations, such misclassifications occur at the expense of a minor increase in the number of stimulus presentations.
The derivation of the RT window could be performed toward the end of a suprathreshold visual field examination, when a sufficient number of responses (probably in excess of 50) have been obtained to allow robust estimation of the patient's RT distribution. Locations with suspect responses can subsequently be re-examined without introducing a break in the examination. We do not suggest that RT analysis should replace false-positive catch trials. In suprathreshold perimetry, relatively high rates of false-positive catch trials may be required to keep stimulus expectation at moderate levels.
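
A minimal sketch of this workflow is given below. It assumes that responses are stored as (location, RT) pairs and that some function for deriving the RT window is supplied by the caller; `locations_to_reexamine`, `window_fn`, and the fixed example window are hypothetical names and values introduced for illustration only.

```python
MIN_RESPONSES = 50  # the text suggests "probably in excess of 50" responses


def locations_to_reexamine(responses, window_fn, min_responses=MIN_RESPONSES):
    """Return locations whose response latency lies outside the RT window.

    responses: list of (location_id, rt_ms) pairs collected so far.
    window_fn: callable mapping a list of RTs to a (low, high) window.
    An empty list is returned while too few responses are available.
    """
    if len(responses) < min_responses:
        return []
    low, high = window_fn([rt for _, rt in responses])
    return [loc for loc, rt in responses if not (low <= rt <= high)]


# Hypothetical test: 60 responses, two of them with atypical latencies.
responses = [(loc, 420) for loc in range(58)] + [(58, 150), (59, 1200)]
retest = locations_to_reexamine(responses, lambda rts: (300, 600))
```

Here locations 58 and 59 would be queued for re-examination later in the same test. The check on the number of responses reflects the point made above that the window should only be derived once the patient's RT distribution can be estimated robustly.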
| Footnotes |
|---|
Submitted for publication February 12, 2001; revised July 13, 2001; accepted August 6, 2001.
Commercial relationships policy: C (DBH); N (PHA, DM).
The publication costs of this article were defrayed in part by page charge payment. This article must therefore be marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.
Corresponding author: Paul H. Artes, Department of Ophthalmology, QEII Health Sciences Centre, 1278 Tower Road, Halifax, Nova Scotia, B3H 2Y9 Canada; paul_h_artes{at}yahoo.co.uk.