1From the Departments of Ophthalmology and Visual Sciences, 2Epidemiology, and 3Biostatistics, University of Michigan, Ann Arbor, Michigan; and the 4Department of Ophthalmology, University of Washington, Seattle, Washington.
| Abstract |
|---|
METHODS. VF data were obtained from 243 patients in the Collaborative Initial Glaucoma Treatment Study (CIGTS) who had follow-up visits in 2004. FT and SS VF tests were performed in random order on the same day.
RESULTS. The average duration of the SS test (6.3 minutes) was shorter (P < 0.0001, paired t-test) than the FT test (11.8 minutes). The mean deviation did not differ between SS and FT testing. A small difference was found in the pattern SD (PSD) (P = 0.02). The mean CIGTS score from the FT test (4.5) was significantly lower (P < 0.0001) than the mean CIGTS score from the SS test (6.0). Although the two tests yielded identical Glaucoma Hemifield Test (GHT) results in 179 patients (76%), 16 patients had a normal GHT result on FT testing and an SS test result that was outside normal limits. Six patients had the reverse finding. The most significant factor associated with an increased (positive) difference between the CIGTS VF score generated from SS and FT testing was conducting the FT test first (P < 0.0001).
CONCLUSIONS. Although SS and FT testing yielded very similar mean deviation results, the CIGTS VF score and GHT differed between SS and FT tests. Changing the approach used to measure a study's primary VF outcome should be accompanied by a critical evaluation of the change's impact.
| Methods |
|---|
4.0 dB). Because the SITA-Standard test does not produce an SF value, its reliability was based on the first three parameters. In two patients, pupil dilation was induced between the two VF tests, as indicated by the intertest pupil diameter being 4 mm disparate. These two patients' data are not included in this report, yielding n = 243 with comparable VF test results.
Both VF test results were scored according to the CIGTS VF scoring algorithm,1,11 which assigns weights to points on the VF test's total deviation probability plot according to the extent of departure from normal values, as expressed by point-specific probabilities, which are empirically derived percentiles from the distributions of values at each of the 52 points from age-specific sets of normal subjects collected by the manufacturer.2 The proprietary distributions are built into the VF test software and are not available for inspection. The probability at each of the 52 points is reported as no defect or P ≤ 0.05, ≤ 0.02, ≤ 0.01, or ≤ 0.005, meaning that the measured value at that point was at or below the respective percentile of the age-specific empiric distribution at that position in the field for normal subjects. A point is called defective if its probability is 0.05 or less and it has at least two neighboring points with probabilities of 0.05 or less in the same vertical hemifield (superior or inferior). A weight is assigned depending on the minimum depth of the defect at the given point and the two most defective neighboring points. A minimum defect of 0.05, 0.02, 0.01, or 0.005 is given a weight of 1, 2, 3, or 4, respectively. A point without two neighboring points all depressed to at least P ≤ 0.05 is given a weight of 0. For example, a point at P ≤ 0.01 with only two neighboring points of defect, both at P ≤ 0.05, would receive a weight of 1. The weights for all 52 points in the field are summed, resulting in a value between 0 and 208 (52 x 4). The sum is then scaled to a range of 0 to 20 (by dividing by 10.4), resulting in a score that is a nearly continuous measure of VF loss. Other Humphrey VF test parameters that are common to both testing procedures (test duration, pupil diameter, mean deviation (MD), pattern SD (PSD), and Glaucoma Hemifield Test (GHT) result12) were recorded.
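The scoring rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the CIGTS implementation: the function names, the use of `None` to encode "no defect", and the caller-supplied `neighbors` map (which points count as neighbors within the same hemifield) are all hypothetical, because the text does not spell out the grid geometry.

```python
# Weight assigned to each reported probability level; None means "no defect".
WEIGHT = {None: 0, 0.05: 1, 0.02: 2, 0.01: 3, 0.005: 4}


def point_weight(own_p, neighbor_ps):
    """Weight of one point given its probability and those of its
    same-hemifield neighbors (neighbor definition is an assumption)."""
    own = WEIGHT[own_p]
    # Keep the two most defective neighbors (largest weights first).
    top_two = sorted((WEIGHT[p] for p in neighbor_ps), reverse=True)[:2]
    if own == 0 or len(top_two) < 2 or top_two[1] == 0:
        return 0  # the point is normal, or it lacks two defective neighbors
    # Minimum depth across the point and its two most defective neighbors.
    return min(own, top_two[0], top_two[1])


def cigts_vf_score(probs, neighbors):
    """Sum the point weights (0 to 208 for 52 points) and scale to 0 to 20."""
    total = sum(
        point_weight(p, [probs[n] for n in neighbors[pt]])
        for pt, p in probs.items()
    )
    return total / 10.4


# Worked example from the text: P <= 0.01 with two neighbors at P <= 0.05.
print(point_weight(0.01, [0.05, 0.05, None]))  # 1
```

Passing the neighbor map in as data keeps the sketch independent of any particular test pattern; a real implementation would need the actual 24-2 point layout and the study's own definition of neighboring points.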
Comparisons between test results were made with paired Student's t-tests and scatterplots for continuous variables. For categorical variables, we used the McNemar test for dichotomous variables and the Bowker test for symmetry13 for more than two categories. Factors predictive of differences in test results (SITA minus FT) were evaluated by linear regression. Data analyses were performed on computer (SAS, ver. 9.1; SAS, Cary, NC).14
This research adhered to the tenets of the Declaration of Helsinki. All CIGTS patients gave written informed consent to participate, and the institutional review boards at the CIGTS clinical centers approved the study.
| Results |
|---|
= 0.61), reflecting the 22 (9%) patients for whom a normal versus ONL result was found, and the additional 36 patients (15%) whose intertest GHT results were off by one category.
For the PSD outcome, modeling the differences between FT and SITA testing showed significant associations with two factors: the mean PSD (P = 0.03) and the mean pupil diameter (P = 0.05). Higher PSDs and smaller pupils yielded larger positive PSD differences between SITA and FT VF tests.
For the CIGTS VF score, the factor most predictive of the SITA test showing more loss than the FT test was the order in which the tests were conducted (P < 0.0001). Regardless of which VF test was first, the score was higher (worse) from the SITA than from the FT test. When the SITA VF test was first, the resultant CIGTS VF score was 0.87 units higher than the FT test result; when the FT test was first, the resultant CIGTS VF score from the SITA test was 2.22 units higher than the FT test result. The difference between these two results, 1.35 score units, is equal to the regression coefficient in Table 3. Other significant factors associated with the SITA test showing more loss than the FT test (in CIGTS VF scores) included more VF loss (P = 0.003) and a smaller mean pupil diameter (P = 0.02).
The impact of observed differences in CIGTS VF scores from the two tests can be displayed in the frequency of triggering the intervention failure protocol (Table 4), which is based on detecting a 3-unit or greater increase (worsening) in the CIGTS VF score from the patient's current reference CIGTS VF score (based on previous FT testing). Had the SITA-derived CIGTS VF score been used, it would have triggered initiation of the intervention failure protocol (i.e., additional VF testing for verification) in 30 (12%) patients for whom the FT test would not have, whereas using the FT-derived score would have triggered this protocol in only three patients (1%) for whom the SITA test would not have done so (P < 0.0001, McNemar test).
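The McNemar comparison reported here depends only on the two discordant counts (30 and 3). As a rough check, the following Python sketch computes a two-sided exact (binomial) McNemar P value. The function name is hypothetical, and the exact form is an assumption: the text does not say whether the exact or the chi-square version of the test was used.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar P value from the two discordant-pair counts.

    Under the null hypothesis, each discordant pair is equally likely to
    fall in either direction, so min(b, c) follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, no evidence of asymmetry
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Discordant counts from the text: 30 patients flagged only by the
# SITA-derived score, 3 flagged only by the FT-derived score.
p_value = mcnemar_exact(30, 3)
print(p_value < 0.0001)  # True
```

With these counts the exact P value is on the order of 10^-6, which agrees with the reported P < 0.0001.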
| Discussion |
|---|
The fact that these two testing strategies yielded quite similar MDs and PSDs is noteworthy. Our average intertest differences in MDs and PSDs of 0.12 and 0.18 dB, respectively, indicate close agreement from a clinical perspective, as illustrated in Figures 1 and 2. In most prior comparative studies, MDs from SITA testing were slightly better than those from FT testing. Bengtsson and Heijl18 conducted a study of 330 normal subjects that indicated a 1.6-dB higher (better) MD from SITA-Standard testing than from FT testing. Their evaluation of 44 patients with glaucoma and 21 normal subjects19 found almost identical average MD results from SITA-Standard and FT testing of the patients, but noted that the number of significantly depressed points was higher in SITA testing than in FT testing. Heijl et al.4 tested 31 patients with glaucoma and found that MDs with SITA were on average approximately 1 dB less severe than the 30-2 FT values. They concluded that the SITA test yields results similar to those of the FT test. Sharma et al.6 reported that the MDs and PSDs from SITA and FT 24-2 testing correlated highly (r = 0.92 and 0.93, respectively). In 82 patients with glaucoma, Budenz et al.7 found better MDs derived from SITA-Standard testing than from FT testing, by 0.7 dB. Our lack of difference in MDs from the two testing approaches, although not substantially disparate from that found by others, may have been caused by differences in our patients' distribution of VF loss, their relatively extensive experience with VF testing, or other factors.
There was a substantial difference between CIGTS VF scores produced from the FT test and from the SITA-Standard test. On average, SITA scores were higher (indicating more VF loss) than FT scores by 1.5 CIGTS VF score units. Test order came into play in this finding. Conducting the SITA test first and the FT test second resulted in less disagreement between CIGTS VF scores. Even so, when this test order was used, CIGTS VF scores were still higher when based on the SITA tests, and the amount of difference was still statistically significant. Characteristics that are unique to the computation of the CIGTS VF score may have contributed to the intertest difference. The CIGTS VF score is derived from weighting the probabilities in the total deviation probability plot. A specific point contributes to the score only if at least two neighboring points show some defect. We speculate that higher CIGTS VF scores from SITA testing result from a lower threshold for points in SITA testing to be declared depressed relative to the threshold needed to be passed for FT significance. Support for this possibility comes from the report of Bengtsson and Heijl18 of the establishment of normal threshold limits for the SITA strategies. They found smaller intersubject variability in SITA-Standard testing (31% less) than in FT testing among 330 normal subjects who were tested, which resulted in normal limits for SITA that were "tightened" between 9% and 29%. Thereby, the statistical significance of a depressed point on SITA testing would be achieved more readily than on FT testing. Of course, this speculation can be critically evaluated only with knowledge of the distributions of normal values used by the SITA-Standard and FT testing software, which are proprietary.
The role of test order was important for both CIGTS VF scores and MDs. Conducting the longer FT test first may result in a fatigued patient taking the SITA test. A patient who is fatigued after taking the FT test may well fail to respond to less-intense test stimuli, thereby producing a SITA VF result that indicates more loss and a higher (worse) CIGTS VF score or a more negative MD result. The finding that more visual field loss was associated with greater differences between SITA and FT tests may relate to reported differences between SITA and FT test results at sensitivity losses of approximately 15 dB, wherein SITA results yield higher sensitivity estimates than do FT results.20 As pupil diameter increased, two trends were observed in the VF results. First, regardless of the order of testing, the average CIGTS VF score was smaller (indicating less VF loss) with increasing pupil diameter. Second, differences between SITA and FT outcomes for CIGTS VF scores lessened with increasing pupil diameter, although SITA outcomes were consistently greater than FT outcomes across the range of pupil diameters. These results probably relate to a more reliable VF assessment when the pupil diameter is sufficiently wide to allow for optimal light exposure.
GHT results from the two tests yielded 22 instances wherein one test's GHT result was within normal limits (WNL) and the other test's GHT result was outside normal limits (ONL). An inspection of these VFs by two glaucoma specialists (PRL, RPM) found factors such as better defect characterization on SITA testing, probable artifact, questionable reliability, possible patient fatigue, and early defects at the edge of detection underlying these GHT differences. Such variation in GHT results has been reported previously by Sharma et al.,6 who found variation to be more likely in normal control subjects and patients with suspected or mild glaucoma.
The decision to change VF testing strategies in our ongoing follow-up of patients with glaucoma compelled us to study its impact on the CIGTS VF score. This score is used in the CIGTS to monitor a patient's VF status, and VF change is the primary outcome of the study. A 3-unit increase in the CIGTS VF score, if shown to be consistent on two repeated VF tests, requires advancement to the next treatment step (e.g., from topical medicine to argon laser trabeculoplasty). As Table 4 shows, reliance on the SITA result would have caused VF follow-up testing of 30 patients for whom FT test results would not have indicated such a follow-up. This far exceeds the three patients whom FT flagged for further VF testing and SITA did not. These SITA-versus-FT VF findings do not permit us to view the CIGTS VF score resulting from FT testing as directly comparable to that obtained from SITA testing. Rather, we are establishing a new baseline for each patient on converting to SITA-Standard testing.
Our results show that comparability of SITA and FT testing for one VF outcome (MD) does not imply comparability on another VF outcome (CIGTS VF score). To be safe, then, a critical evaluation of the comparability of old and new test strategies should accompany any change in the approach used to measure a study's primary VF outcome.
| Footnotes |
|---|
Supported by National Eye Institute Grants EY09148 and EY015860 and a grant from Allergan, Inc.
Submitted for publication January 4, 2005; revised March 24, 2005; accepted March 31, 2005.
Disclosure: D.C. Musch, Allergan, Inc. (F); B.W. Gillespie, Allergan, Inc. (F); B.M. Motyka, Allergan, Inc. (F); L.M. Niziol, Allergan, Inc. (F); R.P. Mills, Allergan, Inc. (F); P.R. Lichter, Allergan, Inc. (F)
The publication costs of this article were defrayed in part by page charge payment. This article must therefore be marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.
Corresponding author: David C. Musch, Department of Ophthalmology, University of Michigan, 1000 Wall St., Ann Arbor, MI 48105; dmusch{at}med.umich.edu.