From the Ophthalmology Clinic, Department of Clinical Neurosciences, Geneva University Hospitals, Geneva, Switzerland.
| Abstract |
|---|
METHODS. Five to six volunteers with normal vision were asked to read full pages of text with a 10° x 7° viewing window stabilized in central vision. In a first experiment, reading performance with off-line and real-time square pixelizations was compared at different resolutions. In a second experiment, off-line square pixelization was compared with off-line Gaussian pixelization with various degrees of overlap. In a third experiment, real-time square pixelization was compared with real-time Gaussian pixelization.
RESULTS. Results from the first experiment showed that real-time square pixelization required approximately 30% less information (pixels) than its off-line counterpart. Results from the second experiment, using off-line processing, revealed a restricted range of Gaussian widths for which performances were equivalent or significantly better than that obtained with square pixelization. The third experiment demonstrated, however, that reading performances were similar in both real-time pixelization conditions.
CONCLUSIONS. This study reveals that real-time stimulus pixelization favors reading performance. Performance gains were moderate, however, and did not allow for a significant (e.g., twofold) reduction of the minimum resolution (400–500 pixels) needed to achieve useful reading abilities.
| Introduction |
|---|
Our research group is part of a larger multidisciplinary research effort aiming to develop a subretinal implant. Our CMOS-Retina8 9 10 is built to transform incident light on the retina into electric stimulation currents "in situ." In this context, we have developed special experimental conditions (simulations) to explore the minimum requirements to restore useful artificial vision.
Our simulations use low-resolution (pixelized) images that are projected in a "small" viewing area, stabilized at a fixed location in the visual field. We attempt to mimic the type of visual information provided by a retinal implant, using photodiode technology to transform incident light into an electric signal. With this methodological approach we explored, in a first study,11 the reading of isolated four-letter words. In central vision, accurate recognition was possible with pixelizations down to 286 pixels, distributed over a 10° x 3.5° viewing window. After a period of systematic training, comparable results were achieved with the same viewing window stabilized at 15° eccentricity in the lower visual field. In a second study,12 we explored full-page text reading under similar conditions. Tests were performed with a larger viewing window of 10° x 7° containing 572 pixels that moved across the page of text under control of the subject's eye movements. Performance was close to perfect with central vision. With eccentric vision, subjects achieved reading scores between 86% and 98% after a period of methodical training.
In earlier studies, we used a simplified technique to simulate the limited number of stimulation contacts available in a visual prosthesis. Stimulus images were decomposed into a finite number of pixels with a simple block-averaging algorithm. This resulted in a mosaic of square pixels of various gray levels, the gray level within each pixel being constant (square pixelization). However, electrophysiological research13 14 15 revealed that the patterns of neural activity elicited by electric stimulation of the retina depend on the strength of the stimulation current and that neural activation diminishes progressively with increasing electrode-to-neural target distance. These findings imply that phosphenes elicited by electrical stimulation of the retina should not be of constant luminosity and not of square shape. Furthermore, depending on the strength of the stimulation current, the percepts may develop from a collection of isolated phosphenes toward more continuous patterns with different degrees of overlap across neighboring phosphenes.
One could argue that square pixelization is adequate to simulate the reduced information content of the stimuli transmitted by a retinal implant. In a given condition, the detailed shape of each pixel does not alter the overall information content of the image. However, studies on face recognition have demonstrated that detection is considerably hampered when images are decomposed into uniform square pixels. Harmon and Julesz16 suggested that the oriented high-frequency noise introduced at block borders masks certain image features essential for recognition. Gestalt psychologists17 18 further proposed that square pixelization distorts the image to the point of modifying its intrinsic gestalt properties.19 Bachmann and Kahusk20 also suggest that the "block" constituents or pixels of the processed image compete for attention with the particular features of the image, thus affecting recognition. If one wants to avoid these drawbacks, square pixelization should be replaced by other types of image quantization featuring softer borders and allowing for variable amounts of overlap.
Another shortcoming of our previous studies is that the pixelization algorithm was applied off-line over the entire original image (e.g., seven lines of full-page text). Subjects were allowed to scan this preprocessed image through a viewing window containing a subset of 572 pixels, the gray level of these "frozen" pixels being independent of the point of gaze on the image. This would not be the case in artificial vision systems, since stimulation intensity at each electrode contact would depend on the exact point of gaze relative to the image observed. For retinal implants transforming light falling on the retina into stimulation currents "in situ,"4 7 10 this would happen due to eye movements. Head movements would act similarly in systems using an external head-mounted camera for stimulus generation.1 2 3 5 6 In the case of reading, when focusing on a string of a few characters, its appearance would change on small eye (or camera) movements. Temporal cues seem to play a significant role in visual perception: the human visual system is optimized for detecting structural changes in dynamic images. A dynamic sequence of slightly different pixelized images may contain more information than one frozen pixelized image; therefore, dynamic (real-time) pixelization is likely to enhance information transmission to the visual system. Major object identification features (such as shape or location) are extracted from different spatial patterns (such as local contrast changes or relative position changes) resulting from image motion. 
Improved sensitivity for moving contrast changes, compared to their static equivalents, has previously been demonstrated.21 Moreover, it has already been established that dynamic presentations lead to better performance in tasks like facial recognition.22 23 24 Hence, if one wants more accurate simulations of artificial vision, pixelization should be performed in real-time and the intensity of each pixel should vary dynamically, according to gaze position.
To our knowledge, psychophysical research using simulations of prosthetic vision has not been extensive so far. Reading and mobility were first studied by a group at the University of Utah.25 26 Their head-mounted experimental setup consisted of a video camera sending images to a monochrome monitor that projected to the subject's right eye (maximum viewing angle of 1.7°). Pixelization was achieved by overlaying the monitor with opaque masks containing a variable number of square perforations (pixels). Recently, another group at The Johns Hopkins University presented a series of experiments that used simulations specifically designed to mimic percepts evoked by retinal implants.27 28 29 Different pixelization algorithms were used: a square pixelizing filter similar to the one presented in this article, a constant luminosity circular pixelizing filter, and a nonoverlapping Gaussian filter. Unfortunately, no direct comparison of the different pixelizing algorithms has been reported. Moreover, all these experiments neglected a fundamental aspect of artificial vision with a retinal implant: Viewing areas were not stabilized at fixed (eccentric) retinal positions. In more recent studies, the latter authors acknowledged that the stabilization of the viewing area on the retina can significantly affect performance (Dagnelie G, et al. IOVS 2004;45:ARVO E-Abstract 4223; Kelley AJ, et al. IOVS 2004;45:ARVO E-Abstract 5436), especially in visually demanding tasks such as reading.
To validate our previous studies as well as to improve our simulation methods for future studies, we decided to investigate specifically the influence of the spatial and temporal characteristics of stimulus pixelization on reading performance. In the present study, we report a series of three paired comparisons of the effects of different pixelization methods on full-page reading. We compared reading performance: (1) between off-line square pixelization and real-time square pixelization of the image, (2) between off-line square pixelization and off-line Gaussian pixelization of the image, and (3) between real-time square pixelization and real-time Gaussian pixelization of the image.
| Methods |
|---|
Experimental Setup
The stabilized projection of a 10° x 7° viewing window on the retina was achieved with a high-speed video-based eye and head-tracking system (EyeLink; SensoMotor Instruments GmbH, Berlin, Germany) and a high-refresh-rate monitor (Fig. 1). Please refer to our preceding publications11 12 for a more detailed description of the experimental setup.
Square pixelization was performed with a simple block-averaging algorithm, in which matrices of n x n pixels of the original image are fused into single uniform pixels with luminance values corresponding to the mean gray scale levels of the original n x n matrices (Fig. 2a).
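The block-averaging step described above can be illustrated with a minimal Python sketch. This is not the authors' implementation (which was written in Visual C++); the function name `block_average`, the list-of-rows image representation, and the requirement that the image dimensions be multiples of n are assumptions made for illustration.

```python
def block_average(image, n):
    """Fuse each n x n block of `image` into one value: the block's mean gray level.

    `image` is a list of rows of gray levels (hypothetical representation).
    """
    rows, cols = len(image), len(image[0])
    if rows % n or cols % n:
        raise ValueError("image dimensions must be multiples of n")
    out = []
    for r in range(0, rows, n):
        out_row = []
        for c in range(0, cols, n):
            # Collect the n x n source pixels of this block and average them.
            block = [image[r + i][c + j] for i in range(n) for j in range(n)]
            out_row.append(sum(block) / (n * n))
        out.append(out_row)
    return out


image = [
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [0, 255, 100, 100],
    [255, 0, 100, 100],
]
print(block_average(image, 2))  # [[0.0, 255.0], [127.5, 100.0]]
```

Each output value stands for one uniform square pixel; displaying it at n x n times the size reproduces the mosaic appearance of square pixelization.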
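Gaussian pixelization was based on a two-dimensional Gaussian function of the form

$$G(x, y) \propto \exp\left(-\frac{(x - \mu_x)^2 + (y - \mu_y)^2}{2\sigma^2}\right)$$

where σ denotes the SD of the particular Gaussian function around its horizontal (µx) and vertical (µy) means. In our case, σ determines the amount of overlap of each pixel onto its neighbors (Gaussian width), whereas µx and µy correspond to the center coordinates for each pixel (Fig. 3).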
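As a rough illustration of how a Gaussian width controls overlap between neighboring pixels, the sketch below renders each low-resolution pixel as a 2-D Gaussian spot centered at (µx, µy) and sums the contributions. Several details are assumptions and not taken from the study: the function name `gaussian_render`, expressing sigma in units of the pixel spacing, the absence of normalization or clipping, and the sampling density.

```python
import math


def gaussian_render(pixels, sigma, samples_per_pixel=8):
    """Render low-resolution gray levels as overlapping 2-D Gaussian spots.

    `pixels` is a list of rows of gray levels; `sigma` is the Gaussian SD,
    assumed here to be expressed in units of the pixel spacing.
    """
    rows, cols = len(pixels), len(pixels[0])
    s = samples_per_pixel
    out = [[0.0] * (cols * s) for _ in range(rows * s)]
    for y in range(rows * s):
        for x in range(cols * s):
            # Sample position, in units of the pixel spacing.
            px, py = (x + 0.5) / s, (y + 0.5) / s
            total = 0.0
            for r in range(rows):
                for c in range(cols):
                    mu_x, mu_y = c + 0.5, r + 0.5  # center of pixel (r, c)
                    d2 = (px - mu_x) ** 2 + (py - mu_y) ** 2
                    total += pixels[r][c] * math.exp(-d2 / (2 * sigma ** 2))
            out[y][x] = total
    return out


# Two equally bright neighbors: a narrow width leaves a dark gap between
# them, a wide one makes the spots merge.
narrow = gaussian_render([[100.0, 100.0]], sigma=0.1, samples_per_pixel=4)
wide = gaussian_render([[100.0, 100.0]], sigma=0.6, samples_per_pixel=4)
print(narrow[1][3] < wide[1][3])  # True
```

With very small widths the spots are isolated points; with large widths neighboring spots sum and contrast drops, which is the qualitative trade-off the Gaussian width parameter is meant to capture.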
Real-Time Pixelization.
In this condition, only the small portion of the entire text segment image displayed in the 10° x 7° viewing window (determined by the subject's gaze position on the screen) was pixelized in real time. Gaze position data were used to reposition the viewing window and to display its newly pixelized content on the screen. To achieve adequate image stabilization on the retina, the maximum image-processing time (stimulus pixelization and display) was kept below 10 ms. Meeting this constraint requires considerable processing power when large Gaussian widths are used, because of the significant overlap across neighboring pixels. For real-time pixelization, the processing power of our equipment limited us to Gaussian widths up to 0.14 pixels.
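The difference between off-line and real-time pixelization can be sketched in a few lines of Python. In the off-line case the block grid is fixed to the image and computed once; in the real-time case the window content is cropped at the gaze position first and then pixelized, so pixel values change with small gaze shifts. The helper names (`block_average`, `crop`, `realtime_view`) and the tiny test strip are hypothetical, and eye tracking and display timing are left out entirely.

```python
def block_average(image, n):
    """Mean gray level of each complete n x n block."""
    rows, cols = len(image), len(image[0])
    return [
        [
            sum(image[r + i][c + j] for i in range(n) for j in range(n)) / (n * n)
            for c in range(0, cols - n + 1, n)
        ]
        for r in range(0, rows - n + 1, n)
    ]


def crop(image, top, left, height, width):
    """Cut the viewing window out of the full image."""
    return [row[left:left + width] for row in image[top:top + height]]


def realtime_view(image, gaze_top, gaze_left, height, width, n):
    """Pixelize only the window content; the block grid moves with the gaze."""
    return block_average(crop(image, gaze_top, gaze_left, height, width), n)


# A 2 x 8 strip with a single dark-to-bright edge.
strip = [[0, 0, 0, 255, 255, 255, 255, 255]] * 2

# Off-line: one frozen set of pixels, independent of gaze.
frozen = block_average(strip, 2)
print(frozen)  # [[0.0, 127.5, 255.0, 255.0]]

# Real-time: the same edge yields different pixel values as the gaze shifts.
print(realtime_view(strip, 0, 0, 2, 4, 2))  # [[0.0, 127.5]]
print(realtime_view(strip, 0, 1, 2, 4, 2))  # [[0.0, 255.0]]
```

A gaze shift of one source pixel changes the real-time output, whereas an off-line window can only ever show a subset of the frozen values.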
Testing Procedure
The remaining aspects of the experimental procedure were exactly the same as described in our preceding study on full-page text reading.12 Briefly, tests were performed monocularly (using the dominant eye) and in central vision. For each run, subjects had to read aloud several text segments of an article, randomly chosen out of a pool of 50 (none of the subjects read an article twice). Test sessions frequently included several runs, but they never lasted longer than 30 minutes, to avoid fatiguing the subjects.
The programs and algorithms used for image processing and experiment control were developed in commercial software (Visual C++ 6.0 SP5; Microsoft, Redmond, WA) and the latest Platform SDK libraries available at the time of the experiment. Some functions of the EyeLink Windows API library (v. 1.0; SensoMotor Instruments, GmbH) were also used.
Data Analysis and Statistics
Two variables were measured to assess reading performance: reading scores, expressed in percentage of correctly read words (gender and conjugation mistakes were considered as errors), and reading rates, expressed in the number of correctly read words per minute. Since percentage scales are not adequate for statistical analysis,30 reading scores were transformed to rationalized arcsine units (rau). Nevertheless, for better clarity, an approximate percentage scale is shown on the right axes of the figures and is also used in the text.
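For readers unfamiliar with rationalized arcsine units, the sketch below shows one widely used formulation of the transform (commonly attributed to Studebaker). Whether the study used exactly this variant is an assumption; the function name `rau` and the example counts are illustrative only.

```python
import math


def rau(correct, total):
    """Rationalized arcsine units for `correct` words out of `total`.

    Assumed formulation: theta = asin(sqrt(x / (n + 1))) + asin(sqrt((x + 1) / (n + 1))),
    rau = (146 / pi) * theta - 23.
    """
    theta = math.asin(math.sqrt(correct / (total + 1))) + math.asin(
        math.sqrt((correct + 1) / (total + 1))
    )
    return (146.0 / math.pi) * theta - 23.0


# Mid-range scores map almost one-to-one onto percentages, while the
# extremes are stretched beyond 0 and 100.
for correct in (0, 50, 95, 100):
    print(correct, round(rau(correct, 100), 1))
```

The transform keeps values close to the percentage scale in the middle of the range, which is why an approximate percentage axis can be shown alongside rau values.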
Results were calculated as the mean of the cumulative performance of each subject ± SEM. Statistically significant differences in reading performance were determined by standard (paired) t-tests with a significance level of 0.05.
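The summary statistics and the paired comparison described above can be sketched as follows. The data are invented for illustration and are not results from the study; the function names are hypothetical, and the P value lookup is replaced by a comparison with the tabulated two-sided 5% critical value of the t distribution for 5 degrees of freedom (2.571).

```python
import math


def mean_sem(values):
    """Mean and standard error of the mean (sample SD / sqrt(n))."""
    n = len(values)
    m = sum(values) / n
    var = sum((v - m) ** 2 for v in values) / (n - 1)
    return m, math.sqrt(var / n)


def paired_t(a, b):
    """Paired t statistic and degrees of freedom for matched samples a, b."""
    diffs = [x - y for x, y in zip(a, b)]
    m, sem = mean_sem(diffs)
    return m / sem, len(diffs) - 1


# Hypothetical per-subject scores (rau) in two conditions, six subjects.
cond_a = [92.0, 95.5, 90.1, 97.3, 93.8, 96.0]
cond_b = [85.2, 91.0, 84.7, 93.1, 88.9, 90.4]

t, df = paired_t(cond_a, cond_b)
T_CRIT_DF5 = 2.571  # two-sided critical value, alpha = 0.05, df = 5
print(df, abs(t) > T_CRIT_DF5)
```

Because each subject serves as his or her own control, the test is run on the per-subject differences, which removes between-subject variability from the comparison.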
| Results |
|---|
Figure 4 compares mean reading performances versus number of pixels in the viewing window for off-line and real-time pixelizations. Individual performances in each experimental condition were established on the basis of 12 text segments and data were fitted with psychometric functions. Down to a target resolution of 572 pixels, average reading scores were close to perfect (above 95% correct) and statistically equivalent for both conditions. At 280 pixels, subjects achieved reading scores of 94.3% with real-time pixelization, but of only 76.4% with off-line pixelization. This difference was statistically significant (P = 0.0017), and persisted at the lowest resolution (166 pixels; 56.1% versus 29.3%; P = 0.013). It is interesting to estimate the critical target resolution for subjects to reach useful reading performances. In our previous study on full-page reading,12 we found that adequate (good to excellent) text comprehension correlated closely with high reading scores. This criterion was fulfilled at median scores of 96.8%. In the present case, the fits to the data indicate that this score is reached at 498 pixels in the case of off-line pixelization and at 322 pixels for real-time pixelization (Fig. 4a).
Taken together, these results indicate that equivalent reading performances could be reached at a significantly lower resolution with real-time pixelization.
Off-Line Gaussian Pixelization Versus Off-Line Square Pixelization
Six normal subjects (26, 29, 29, 33, 34, and 41 years of age) participated in the second experiment. Pixelizations with six different Gaussian widths (σ of 0.036, 0.071, 0.143, 0.286, 0.571, and 1.143 pixels) were tested and compared with square pixelization. The effect of varying the Gaussian width σ for image pixelization is illustrated in Figure 5. In all conditions, the 10° x 7° viewing window contained 572 pixels (resolution shown to provide enough information for useful full-page text reading12). Each subject had to read an article of approximately 250 words (i.e., 10 consecutive text segments) per condition. Three subjects started the experiment with Gaussian pixelization at the smallest σ value, progressed toward the larger Gaussian widths, and finished with square pixelization. The remaining three subjects conducted the experiment in the reverse order.
[...] (σ) are shown in Figure 6 and compared with results obtained with square pixelization. Four Gaussian widths (σ = 0.071, 0.143, 0.286, and 0.571 pixels) resulted in reading scores above 94% correctly read words. These scores were very close to those obtained with square pixelization (Fig. 6a). Mean reading scores with σ = 0.143 and 0.286 pixels were found to be significantly better than those obtained with square pixelization (P = 0.04 and 0.009, respectively). Reading scores declined markedly below 80% for the two extreme Gaussian widths tested (σ = 0.036 and 1.143 pixels).
[...] σ = 0.286 pixels. This value is significantly higher (P < 0.001) than the reading rate of 57 words/min achieved with square pixelization. Reading rates with σ = 0.143 and 0.571 pixels were not significantly different from those obtained with square pixelization. For σ = 0.036, 0.071, and 1.143 pixels, reading rates declined markedly (below 40 words/min). Taken together, these data reveal that Gaussian pixelization can lead to slightly but significantly better reading performance than its square counterpart. This suggests that some degree of image smoothing resulting from overlap between neighboring pixels can be beneficial for reading. This benefit is, however, only observed for a restricted range of overlap.
Real-Time Gaussian Pixelization Versus Real-Time Square Pixelization
Results of the second experiment demonstrated that off-line Gaussian pixelization could lead to significantly better reading performance than off-line square pixelization. A third experiment was thus dedicated to extending this comparison to real-time mode.
For this evaluation we would have preferred to use the "optimal" Gaussian width (σ = 0.286 pixels) determined in the second experiment. However, the total processing time needed to simulate this condition turned out to be too long to ensure adequate image stabilization on the retina. Using the second best condition (σ = 0.143 pixels) allowed us to keep processing time below 10 ms. The same six normal volunteers who had participated in the second experiment were requested to read 10 text segments in each of two conditions: (1) real-time Gaussian pixelization at σ = 0.143 pixels and (2) real-time square pixelization. In both conditions, the 10° x 7° viewing window contained 572 pixels. Three subjects started with real-time square pixelization and then switched to real-time Gaussian pixelization. The remaining three subjects performed the experiment in the reverse order.
The results of this experiment are summarized in Table 1. No significant difference in performance was recorded between the two types of pixelization. However, reading scores and reading rates tended to be slightly higher with square pixelization. Comparing those real-time scores with their off-line counterparts gathered in the second experiment reveals that both real-time conditions yielded better performance. This performance gain was significant for square pixelization (reading scores: P = 0.003; reading rates: P = 0.008), but not for Gaussian pixelization (reading scores: P = 0.12; reading rates: P = 0.25).
| Discussion |
|---|
[...] (σ = 0.286 pixels). Additional research, especially at lower resolutions, would be necessary to investigate other factors. It should also be stressed that extreme Gaussian widths noticeably impaired performance. When very small Gaussian widths were used, pixels appeared as isolated small points of light, making it almost impossible to extract a cohesive picture. With large Gaussian widths, overlap was too pronounced, leading to very-low-contrast stimuli.
Results of experiment 3 might appear surprising in light of the findings of experiment 2: When using real-time processing, the benefits of Gaussian pixelization vanished. In fact, this outcome is not astonishing. Real-time processing had already eliminated the major handicap of square pixelization. The distracting high-frequency noise introduced at pixel borders is low-pass filtered by pixel movement. We believe that the use of the optimal Gaussian width σ = 0.286 pixels (instead of 0.143) would not change this result fundamentally.
Implications of the Results for Simulations of Artificial Vision
The exact characteristics of the electrophysiological response of the retina to patterned electrical stimulation remain undetermined to this date. However, the use of 2-D Gaussian functions for stimulus pixelization is certainly a more physiologically pertinent approach than the use of square pixels (pixel borders are smoother and it allows for overlapping between neighboring pixels). As soon as the results of electrophysiological experiments on retinal tissue become available, the parameters of such 2-D Gaussian (or more adequate) functions should be adapted. Our experiments also revealed that Gaussian width is an important factor for readability, suggesting that stimulating current strength and electrode spacing might have to be further "tuned" (within safe and comfortable limits) to achieve the most efficient image transmission possible.
Real-time processing also allows for more realistic simulations of the visual information provided by retinal prostheses. Our results demonstrated that it yields significantly better performance than its off-line counterpart. However, this benefit was relatively moderate, not allowing for a significant reduction (e.g., a factor of two) of the number of stimulation points. Most probably, this advantage will be even less important in visual prostheses with external head-mounted cameras, since head movements are larger and less frequent than eye movements. Recurring head movements could also result in an abnormal vestibulo-ocular reflex.
The first visual prosthesis prototypes have been recently implanted in humans with encouraging results.5 6 7 Yet, several important challenges still need to be overcome before these devices can provide benefits similar to those of cochlear implants in cases of deafness. The basic notion of patterned vision resulting from the continuous stimulation of several electrodes has not been fully confirmed. An appropriate method of selective stimulation eliciting the adequate psychophysical response has not been developed yet. Another major problem is to achieve efficient electrical stimulation within safe charge density limits.32 To reduce the total electrical charge injected on the retina, the use of relatively large stimulation electrodes (fundamentally limiting interelectrode spacing) as well as alternate solutions (such as inverted polarity, interleaved stimulation, and/or increasing the total area of the retinal array within feasible limits) may be mandatory. A substantial research effort is therefore still needed to solve these and other open issues before realizing the level of electrode integration suggested by our studies.
In conclusion, these results demonstrate that the spatial and temporal characteristics of image pixelization play a role in artificial vision simulations. Equivalent performance could be reached with a resolution reduction of approximately 30%, if stimulation parameters were adequate. This effect is not strong enough, however, to change fundamentally the minimum requirements determined in our previous studies on the basis of simplified processing:11 12 Four to five hundred contacts covering a 2 x 3-mm² retinal area are necessary to transmit sufficient visual information for full-page text reading. Reading is particularly important because it is strongly associated with vision-related estimates of quality of life and represents one of the main goals of low vision patients seeking rehabilitation.33 34 35 It is thus important to be aware of such minimal conditions when developing visual prostheses, even if less sophisticated devices might already bring some clinical benefits to patients.
| Footnotes |
|---|
Submitted for publication October 4, 2004; revised February 14, May 24, and June 2, 2005; accepted August 1, 2005.
Disclosure: A. Pérez Fornos, None; J. Sommerhalder, None; B. Rappaz, None; A.B. Safran, None; M. Pelizzone, None
The publication costs of this article were defrayed in part by page charge payment. This article must therefore be marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.
Corresponding author: Jörg Sommerhalder, Ophthalmology Clinic, Geneva University Hospitals, 24 rue Micheli-du-Crest, 1211 Geneva 14, Switzerland; jorg.r.sommerhalder{at}hcuge.ch.