1From Bio-Medical Physics and Bio-Engineering, Aberdeen University and Grampian University Hospitals, Aberdeen, Scotland, United Kingdom; and the 2Grampian Diabetes Retinal Screening Programme, Woolmanhill Hospital, Aberdeen, Scotland, United Kingdom.
| Abstract |
|---|
METHODS. Algorithmic methods have been developed for assessing the quality of 45° single field retinal images for use in diabetic retinopathy screening. For this purpose, image quality was defined by two aspects: image clarity and field definition. An image with adequate clarity was defined as one that shows sufficient detail for automated retinopathy grading. The visibility of the macular vessels was used as an indicator of image clarity, since these vessels are known to be narrow and become less visible with any image degradation. An image with adequate field definition was defined as one that shows the desired field of view for retinopathy grading, including the full 45° field of view, the optic disc, and at least two optic disc diameters of visible retina around the fovea. From 489 patients attending a diabetic retinopathy screening program, 1039 retinal images were obtained. The images were graded by a clinician for image clarity and field definition, with a comprehensive image-quality grading scheme.
RESULTS. The sensitivity and specificity were, respectively, 100% and 90.9% for inadequate clarity detection, 95.3% and 96.4% for inadequate field definition detection, and 99.1% and 89.4% for inadequate overall quality detection.
CONCLUSIONS. The automated system performs with sufficient accuracy to form part of an automated diabetic retinopathy grading system.
However, detection of retinopathy lesions alone is not sufficient to allow automation of image grading within a diabetic retinopathy screening program. It is also necessary to determine that the image is of sufficient clarity and that it is of the correct portion of the retina. Blurred images can disguise lesions, so that a diseased eye could be mistakenly graded as normal. Blurring can be due to poor optics, cataract, macular edema, poor camera focus, or saccadic eye motion during acquisition. This aspect of image quality will be referred to as "image clarity." The image must also show the correct portion of the retina. This aspect of image quality will be referred to as "field definition." Inadequate field definition may be caused by poor patient cooperation, a latent squint, or insufficient pupil size.
An automated system must pass images to manual graders either because they show signs of retinopathy or because they are not of adequate quality. Manual graders must then decide whether retinopathy grading is possible or whether the patient should be referred for slit lamp examination.
The prevalence of images with insufficient quality in a diabetic retinopathy screening program was shown by a study of 5575 patients.4 Of these, 11.9% had images that were unsuitable for retinopathy grading in at least one eye, according to the staged mydriasis protocol recommended by the Health Technology Board for Scotland. In another study, investigators found that 20.8% of 1542 patients who underwent single-field nonmydriatic photography and 5.6% of 1549 patients who underwent dual-field mydriatic photography had an ungradable image in at least one eye.5
The most comprehensive reported study on automated retinal image clarity analysis used detected vessel area as a means of assessing the gradability of an image for diabetic retinopathy.6 Sensitivity and specificity for detection of ungradable images, using 1746 images taken from a diabetic retinopathy screening program, were 84.3% and 95.0%, respectively. The verdict of a clinical grader was used as the gold standard. However, the researchers did not look at field definition. In our experience, images can fail because of inadequate field definition and/or inadequate clarity. Other published work on automated retinal image clarity assessment is limited to studies that use a low number of images or that describe only semiautomated methods.7,8,9 No reported studies have included automated detection of retinal image field definition.
To allow full automation of image grading in a diabetic retinal screening program, we have developed automated methods for both aspects of retinal image quality: clarity and field definition. The methods were developed by using a training set of 395 images and were tested on a separate set of 1039 images. The results of the automated quality assessment were compared with a clinician's quality grading.
| Methods |
|---|
All anonymized images for the study were from patients attending the Grampian Diabetes Retinal Screening Programme. All people with diabetes from the age of 10 were invited to take part in the program. Model CR5-45NM and CR6-45NM fundus cameras (Canon, Tokyo, Japan) with internal flash were used, each attached to a color digital camera (model D30; Canon). The image-capture protocol adhered to the recommendations of the Health Technology Board for Scotland.10 Initially, all patients had photographs taken without the aid of mydriasis. If image quality appeared inadequate at the time of photography, pupils were dilated with tropicamide 1% and the patient was rephotographed. A 45° single field disc/macula photograph was taken of each eye, with the fundus occupying a circle inside a region of 1600 by 1440 pixels. Although the red plane of these retinal images is the brightest, the green plane shows the features of interest with the highest contrast, and this so-called "red free" image is used clinically for microaneurysm detection. The blue plane is of low intensity and contains mainly noise, and so it was not used in this study.
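As a small illustration of the plane selection described above, the following sketch extracts the green plane from an RGB array. It assumes NumPy, an H x W x 3 channel layout, and that the 1600 by 1440 region is 1600 pixels wide and 1440 pixels high; the function name `green_plane` and the synthetic pixel values are hypothetical and are not taken from the study.

```python
import numpy as np

def green_plane(rgb: np.ndarray) -> np.ndarray:
    """Return the green channel of an H x W x 3 RGB image as a 2-D array."""
    if rgb.ndim != 3 or rgb.shape[2] != 3:
        raise ValueError("expected an H x W x 3 RGB array")
    return rgb[:, :, 1]

# Synthetic stand-in for a fundus photograph (rows x columns x channels).
image = np.zeros((1440, 1600, 3), dtype=np.uint8)
image[:, :, 0] = 180  # red plane: brightest
image[:, :, 1] = 90   # green plane: kept for analysis
image[:, :, 2] = 15   # blue plane: low intensity, discarded

green = green_plane(image)
```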
The test set for the study consisted of 1039 images from 977 eyes of 489 patients. The images were a consecutive sequence from which no images were rejected. The camera operator occasionally took multiple photographs in an attempt to achieve an acceptable image. In 60 eyes, two images were present and in 1 eye, three images were present. For one patient, only one image was present. The mean age of patients in the test set was 63 years, with a minimum age of 11 years and a maximum age of 89 years. Mydriasis was required in 254 (23.8%) test-set images.
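The counts in this paragraph can be cross-checked with a few lines of arithmetic. The sketch below assumes that "only one image was present" for one patient means that only one eye of that patient was photographed; the variable names are illustrative.

```python
patients = 489
patients_with_one_eye = 1                    # assumed reading of the text
eyes = 2 * patients - patients_with_one_eye  # one image per eye as baseline

eyes_with_two_images = 60    # each contributes 1 extra image
eyes_with_three_images = 1   # contributes 2 extra images
images = eyes + eyes_with_two_images * 1 + eyes_with_three_images * 2

print(eyes, images)
```

Under that reading, the totals agree with the 977 eyes and 1039 images reported.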
Manual Quality Grading System
We have devised a comprehensive retinal image quality grading scheme for assessing the gradability of images for diabetic retinopathy. Table 1 lists the four image clarity classifications, with typical images shown in Figure 1. An automated image grading system must be able to detect microaneurysms, as these are the earliest sign of diabetic retinopathy. The visibility of the macular vessels is therefore a good indicator of image clarity, because these have a width and color similar to those of microaneurysms.
A clinician (SP) applied the grading scheme to all images in the study and recorded the results along with the reasons for images having inadequate field definition. These included (1) small-pupil artifact present, (2) optic disc incomplete, (3) temporal arcades incomplete, and (4) a macula less than two disc diameters from the edge of the image.
Automated Quality Grading System
This section describes the automatic image-processing methods for retinal image quality assessment that were developed with a training set of 395 images. We have previously described the image processing methods that we use for detection of the vessels, optic disc, and fovea and for determination of the length of the visible portions of the temporal arcades.11 These methods are summarized here and illustrated in Figure 3.
In the following description, disc diameter (DD) is used as a dimensional unit. A value of DD = 246 pixels was chosen, based on manual estimation of the mean optic disc diameter in good-quality images. The search space for the optic disc was restricted to a region with height 2.4 DD and width 2.0 DD surrounding the rightmost or leftmost point of the semiellipse fitted to the temporal arcades, as in Figure 3. Within this region, the circular outline of the optic disc was detected by using a Hough transform, this time with circular templates having diameters from 0.7 to 1.25 DD.
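The circular Hough search can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the edge points are taken as given, the radius step of 3 pixels and the 90 voting angles are arbitrary choices, and the function name `hough_circle` is hypothetical. Only DD = 246 pixels, the region size, and the diameter range come from the text.

```python
import numpy as np

DD = 246  # disc diameter in pixels, as stated in the text

def hough_circle(edge_points, shape, radii, n_angles=90):
    """Vote for circle centres; return (votes, row, col, radius) of the best circle."""
    angles = np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False)
    rows, cols = edge_points[:, 0], edge_points[:, 1]
    best = (-1, 0, 0, 0)
    for r in radii:
        acc = np.zeros(shape, dtype=np.int32)
        # Every edge point votes for candidate centres at distance r from it.
        cr = np.rint(rows[:, None] - r * np.sin(angles)[None, :]).astype(int).ravel()
        cc = np.rint(cols[:, None] - r * np.cos(angles)[None, :]).astype(int).ravel()
        ok = (cr >= 0) & (cr < shape[0]) & (cc >= 0) & (cc < shape[1])
        np.add.at(acc, (cr[ok], cc[ok]), 1)
        idx = int(np.argmax(acc))
        votes = int(acc.flat[idx])
        if votes > best[0]:
            row, col = np.unravel_index(idx, shape)
            best = (votes, int(row), int(col), int(r))
    return best

# Diameters of 0.7 to 1.25 DD correspond to these radii in pixels.
r_min = int(np.ceil(0.7 * DD / 2))
r_max = int(np.floor(1.25 * DD / 2))
radii = range(r_min, r_max + 1, 3)  # step of 3 px is an arbitrary choice

# Synthetic search region (2.4 DD high, 2.0 DD wide) containing the outline
# of one disc with radius DD / 2 centred at row 295, column 246.
shape = (int(2.4 * DD), int(2.0 * DD))
phi = np.deg2rad(np.arange(360))
edge = np.stack([295 + 123 * np.sin(phi), 246 + 123 * np.cos(phi)], axis=1)
edge = np.rint(edge).astype(int)

votes, row, col, radius = hough_circle(edge, shape, radii)
```

In practice the edge points would come from an edge detector applied to the green plane, and the accumulator would usually be smoothed before the peak is taken.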
The fovea was detected by finding the maximum correlation coefficient between the image and a foveal model. The search was restricted to a circular region with diameter 1.6 DD centered on a point that is 2.4 DD from the optic disc and on a line between the detected optic disc and the center of the semiellipse fitted to the temporal arcades.
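The search described above can be sketched in two steps: locate the centre of the search region from the optic disc and the ellipse centre, then scan that region for the position with the highest correlation coefficient. The foveal model is not specified in this section, so the dark Gaussian blob used below is purely an assumption, as are the function names and the small demonstration scale (DD = 20 pixels, chosen only to keep the example fast).

```python
import numpy as np

def search_centre(od, ellipse_centre, dd, distance_dd=2.4):
    """Point distance_dd disc diameters from the optic disc towards the ellipse centre."""
    od = np.asarray(od, dtype=float)
    direction = np.asarray(ellipse_centre, dtype=float) - od
    direction /= np.linalg.norm(direction)
    return od + distance_dd * dd * direction

def corrcoef_at(image, template, row, col):
    """Correlation coefficient between the template and the patch centred at (row, col)."""
    h, w = template.shape
    top, left = row - h // 2, col - w // 2
    if top < 0 or left < 0:
        return -1.0
    patch = image[top:top + h, left:left + w]
    if patch.shape != template.shape:
        return -1.0
    a = patch - patch.mean()
    b = template - template.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else -1.0

def find_fovea(image, template, centre, dd, search_diameter_dd=1.6):
    """Scan a circular region for the best match; return (score, (row, col))."""
    radius = 0.5 * search_diameter_dd * dd
    r0, c0 = centre
    best = (-2.0, None)
    for row in range(int(r0 - radius), int(r0 + radius) + 1):
        for col in range(int(c0 - radius), int(c0 + radius) + 1):
            if (row - r0) ** 2 + (col - c0) ** 2 > radius ** 2:
                continue  # outside the circular search region
            score = corrcoef_at(image, template, row, col)
            if score > best[0]:
                best = (score, (row, col))
    return best

# Hypothetical foveal model: a dark Gaussian blob (sigma = 3 px).
yy, xx = np.mgrid[-7:8, -7:8]
template = -np.exp(-(yy ** 2 + xx ** 2) / 18.0)

# Synthetic image with a dark spot at row 63, column 97.
rr, cc = np.mgrid[:120, :160]
image = 100.0 - 40.0 * np.exp(-((rr - 63) ** 2 + (cc - 97) ** 2) / 18.0)

dd = 20
centre = search_centre(od=(60, 52), ellipse_centre=(60, 110), dd=dd)
score, fovea = find_fovea(image, template, centre, dd)
```

The returned score plays the role of the contrast measure mentioned later in the text: a high value suggests that the fovea has been located reliably.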
Image Clarity Assessment
Image clarity must be sufficient for microaneurysms, the earliest sign of diabetic retinopathy, to be visible. In the macular region, vessels will be present with a diameter on the same order as that of microaneurysms and with a similar color. Therefore, we assessed image clarity by detecting macular vessels and calculating their total length. Macular vessels were detected by processing the image to detect dark linear structures. Those that satisfy certain length and straightness criteria were assumed to be macular vessels.
Vessel detection was applied to a region centered on the automatically detected fovea. If the fovea had a high contrast (determined by the correlation coefficient between the detected fovea and a fovea model),11 then it was likely that the fovea had been accurately located. In this case, the method illustrated in Figure 4A was used to assess image clarity. Vessel detection was applied within a square region having width and height of 3.5 DD. Within this region, the length of detected vessels was measured over a circle of diameter 3.5 DD (Fig. 4A, circle). The diameter, 3.5 DD, was determined to be optimal for correlation with the clinician's image clarity grading of the training-set images.
This problem was avoided by applying vessel detection to a larger region with width and height 4.5 DD (Fig. 4B, square boundary) centered on the detected fovea. Within this region, the location of the circle of diameter 2 DD (Fig. 4B, dashed circle) was searched for, such that it contains the minimum length of detected vessels. This circle was likely to be centered on the foveal avascular zone. Clarity was then assessed by calculating vessel length over a circle of diameter 3.5 DD with the same center point (the solid circle in Fig. 4B).
An image was deemed to have adequate image clarity if the vessel length was greater than a threshold derived from the training set (see the Results section).
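The two-circle procedure for the low-contrast case can be sketched as below. Several points are assumptions rather than details from the paper: vessel length is approximated by counting pixels in a binary map of detected vessels, candidate centres are limited to those for which the 3.5 DD circle stays inside the 4.5 DD square, and the threshold value is a placeholder, since the trained threshold is not given in this section. The ring-shaped mask is synthetic and serves only to give the search a known answer.

```python
import numpy as np

def disc_sum(mask, row, col, diameter):
    """Count vessel pixels of mask inside a circle centred at (row, col)."""
    r = diameter / 2.0
    rr, cc = np.ogrid[:mask.shape[0], :mask.shape[1]]
    inside = (rr - row) ** 2 + (cc - col) ** 2 <= r * r
    return int(mask[inside].sum())

def clarity_length(mask, fovea, dd):
    """Return (vessel length in the 3.5 DD circle, centre of that circle)."""
    slack = int(0.5 * dd)  # (4.5 DD - 3.5 DD) / 2, an assumed search limit
    best = None
    for row in range(fovea[0] - slack, fovea[0] + slack + 1):
        for col in range(fovea[1] - slack, fovea[1] + slack + 1):
            inner = disc_sum(mask, row, col, 2.0 * dd)
            if best is None or inner < best[0]:
                best = (inner, row, col)  # least vessel in the 2 DD circle
    _, row, col = best
    return disc_sum(mask, row, col, 3.5 * dd), (row, col)

# Synthetic vessel map: a thin ring around an avascular zone at (32, 28).
dd = 10
rr, cc = np.ogrid[:60, :60]
dist = np.sqrt((rr - 32) ** 2 + (cc - 28) ** 2)
mask = (dist >= 10.5) & (dist <= 11.5)

length, centre = clarity_length(mask, fovea=(30, 30), dd=dd)

THRESHOLD = 40  # hypothetical value, in pixels of vessel length
adequate = length > THRESHOLD
```

Although the detected fovea is placed a few pixels away from the true avascular zone, the search for the emptiest 2 DD circle recentres the measurement before the vessel length is taken.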
Field Definition
An image was defined as having adequate field definition if all the constraints listed in Table 3, based on the metrics illustrated in Figure 5, were satisfied and if no small-pupil artifact was present. Constraints on the distance between the optic disc and the edge of the image, $D_{\mathrm{OD,EDGE}}$, and between the fovea and the edge of the image, $D_{\mathrm{FOVEA,EDGE}}$, were derived from clinical requirements, namely, that a complete optic disc should be present and that two disc diameters of retina should be visible around the fovea. The upper and lower bounds on the angle of the line between the fovea and the optic disc, $\theta_{\mathrm{OD,FOVEA}}$, were the minimum and maximum values of this angle in the training-set images with adequate field definition. Analysis of the receiver operator characteristic (ROC) curve for detection of inadequate arcade length was used to determine a suitable threshold for the superior and inferior arcade lengths, $D_{\mathrm{ARCADE(S)}}$ and $D_{\mathrm{ARCADE(I)}}$, respectively. The chosen threshold had high specificity, while maintaining good sensitivity, for detection of inadequate arcade length in the training-set images. If the image contained no optic disc or no fovea, the system would make spurious detections for their locations; however, these are unlikely to satisfy the constraints on $\theta_{\mathrm{OD,FOVEA}}$, $D_{\mathrm{ARCADE(S)}}$, and $D_{\mathrm{ARCADE(I)}}$.
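The field definition test is a conjunction of simple constraints, which the following sketch makes explicit. The numerical limits in Table 3 are not reproduced in this section, so every value in `FieldLimits` is a placeholder: only the two disc diameters around the fovea comes from the text, and the rest (including the half disc diameter for the optic disc, the angle bounds, and the arcade length) are hypothetical. All class and field names are illustrative.

```python
from dataclasses import dataclass
from typing import List

@dataclass(frozen=True)
class FieldMetrics:
    d_od_edge: float        # optic disc centre to image edge, in DD
    d_fovea_edge: float     # fovea to image edge, in DD
    theta_od_fovea: float   # angle of the fovea to optic disc line, degrees
    d_arcade_s: float       # visible superior arcade length, in DD
    d_arcade_i: float       # visible inferior arcade length, in DD
    small_pupil_artifact: bool

@dataclass(frozen=True)
class FieldLimits:
    min_d_od_edge: float = 0.5     # hypothetical: room for a complete disc
    min_d_fovea_edge: float = 2.0  # two disc diameters around the fovea
    theta_min: float = -20.0       # hypothetical training-set bound
    theta_max: float = 20.0        # hypothetical training-set bound
    min_arcade: float = 2.0        # hypothetical ROC-derived threshold

def field_definition_failures(m: FieldMetrics, lim: FieldLimits) -> List[str]:
    """Return the reasons for failure; an empty list means adequate field definition."""
    failures = []
    if m.small_pupil_artifact:
        failures.append("small-pupil artifact")
    if m.d_od_edge < lim.min_d_od_edge:
        failures.append("optic disc too close to edge")
    if m.d_fovea_edge < lim.min_d_fovea_edge:
        failures.append("fovea too close to edge")
    if not (lim.theta_min <= m.theta_od_fovea <= lim.theta_max):
        failures.append("fovea to optic disc angle out of range")
    if m.d_arcade_s < lim.min_arcade:
        failures.append("superior arcade too short")
    if m.d_arcade_i < lim.min_arcade:
        failures.append("inferior arcade too short")
    return failures

limits = FieldLimits()
good = FieldMetrics(0.9, 2.3, 4.0, 2.6, 2.4, False)
poor = FieldMetrics(0.9, 1.4, 4.0, 2.6, 1.1, False)

good_failures = field_definition_failures(good, limits)
poor_failures = field_definition_failures(poor, limits)
```

Returning the list of failed constraints, rather than a single flag, mirrors the way the clinician recorded the reasons for inadequate field definition.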
| Results |
|---|
| Discussion |
|---|
In the one study to which the results can be usefully compared,6 the authors measured the area of detected vessels over the entire image, as an indicator of image clarity. If the measured area was above a given threshold, then the image was classified as gradable. Image blurring causes some vessels to become undetectable, and so it is expected that vessel area would be reduced in ungradable images. However, larger vessels can remain visible, even when an image has lost significant clarity. Therefore, an image loses only a fraction of its detectable vessels when subject to low-level but significant blurring. There is also anatomic variation in vessel area between patients. The ability to discriminate gradable and ungradable images, therefore, depends on how variations between patients compare with variations due to clarity. Hence, there would be an increase in the discrimination ability of the technique if only narrow vessels were used in the assessment. Our method does this by first detecting the fovea and then by assessing image clarity using the length of vessels within the macular region. This region rarely contains any large vessels, and the absence of such vessels may explain why the results presented by Usher et al.,6 84.3% sensitivity and 95.0% specificity, are below our ROC curve (Fig. 6), which shows 96.7% sensitivity at 95.0% specificity.
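To make the ROC comparison concrete, the sketch below sweeps a threshold on vessel length and reports sensitivity and specificity for detecting inadequate images at each setting. An image is flagged as inadequate when its vessel length is at or below the threshold, which matches the rule stated in the Methods. The eight data points are invented for illustration and are not study data; the function name `roc_points` is hypothetical.

```python
import numpy as np

def roc_points(vessel_length, inadequate):
    """Return (threshold, sensitivity, specificity) for each candidate threshold."""
    vessel_length = np.asarray(vessel_length, dtype=float)
    inadequate = np.asarray(inadequate, dtype=bool)
    points = []
    for t in np.unique(vessel_length):
        flagged = vessel_length <= t  # flagged as inadequate clarity
        tp = np.sum(flagged & inadequate)
        fn = np.sum(~flagged & inadequate)
        tn = np.sum(~flagged & ~inadequate)
        fp = np.sum(flagged & ~inadequate)
        points.append((float(t), float(tp / (tp + fn)), float(tn / (tn + fp))))
    return points

# Invented example: vessel lengths and the clinician's verdict per image.
lengths = [5, 12, 20, 35, 40, 60, 80, 95]
inadequate = [True, True, True, False, True, False, False, False]

points = roc_points(lengths, inadequate)
by_threshold = {t: (sens, spec) for t, sens, spec in points}
```

A single published operating point, such as 84.3% sensitivity at 95.0% specificity, can then be compared with the sensitivity that this curve reaches at the same specificity.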
There are no published studies in which retinal image field definition is automatically assessed. Our experience (Table 4) shows that approximately 50% of images with inadequate overall quality fail due to inadequate field definition, although their clarity is adequate. It is therefore important that field definition be considered when assessing the gradability of retinal images.
A more general approach to image clarity assessment may be considered using the contrast of vessels relative to their background. A threshold would have to be chosen below which the contrast of a vessel relative to its background would be considered inadequate. However, the choice of this threshold is problematic because the nature of even adequate-quality retinal images is extremely variable. For example, in some images, where the retina between the lesions and vessels is very uniform in its intensity, low-contrast vessels and lesions are clearly visible. We therefore chose to use vessel visibility as an indicator of image clarity. Our automated macular vessel detection method follows this principle by detecting vessels based on shape requirements, namely length and straightness, rather than their contrast with the background.
Of the images incorrectly classified, there were false-positive and -negative cases of inadequate quality assessment. If an image is incorrectly graded as having adequate quality, then there is a risk of cases of retinopathy being missed. However, if an image contains the macula and is of adequate clarity, then it is likely that automated retinopathy grading would be satisfactory. It is, therefore, useful to consider individual cases in which errors have been made.
No images were incorrectly classified as having adequate clarity. There were, however, four images in the test set and three images in the training set that were incorrectly assessed as having adequate field definition. In all these cases, the images had been graded by the clinician as having a complete macula: the most important area for assessment of diabetic retinopathy. In six cases, the arcades were incomplete, but the software did not detect this because the process of ellipse matching to the main vessels (Fig. 3) erroneously included branches off the main arcades in the arcade length calculation. This increased the estimated arcade length to a value above the threshold for adequate field definition. In two cases the optic disc was incomplete, because it was present at the edge of the image. In both cases only a very small portion was missing. This is a difficult condition to detect automatically but has very little diagnostic consequence. Of the seven images incorrectly assessed as having adequate field definition, five were correctly assessed as having inadequate clarity, resulting in the correct assessment of overall quality. The remaining two images (one each from the training set and test set) were incorrectly assessed as having adequate overall quality. Nonetheless, they both had adequate clarity and a complete macula.
If an image is incorrectly classified as having inadequate quality, then the situation is safe for patient diagnosis, but it reduces the economic effectiveness of automation of the screening program by increasing the workload for manual grading. We have achieved a high sensitivity while allowing less than 11% of images of adequate quality to be passed for manual grading by the quality assessment system. The false detections of inadequate clarity were heavily biased toward images of lower clarity and fell, as follows, among the three image clarity gradings that have been grouped as adequate (see Table 1): excellent, 0% (0/125); good, 3.1% (23/732); and fair, 52.8% (66/125). Because many images with fair image clarity are difficult to classify, there being only a few macular vessels visible, it is reasonable that a large number of these images would be passed to manual grading by the automated system.
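The per-grade figures above can be checked with a short calculation, which also shows that they are consistent with the 90.9% specificity for inadequate clarity detection reported in the abstract. The counts are taken from the text; the variable names are illustrative.

```python
# (false detections of inadequate clarity, images with that grade)
false_detections = {"excellent": (0, 125), "good": (23, 732), "fair": (66, 125)}

rates = {grade: round(100.0 * fp / n, 1) for grade, (fp, n) in false_detections.items()}

total_fp = sum(fp for fp, _ in false_detections.values())
total_adequate = sum(n for _, n in false_detections.values())
specificity = round(100.0 * (1.0 - total_fp / total_adequate), 1)

print(rates, total_fp, total_adequate, specificity)
```

Of the 982 images graded as having adequate clarity, 89 were flagged as inadequate, which leaves 90.9% correctly passed.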
The methods could be generalized for other photographic scales and angular fields of view and for certain other retinal image capture protocols. To apply identical methods to images obtained with a different photographic scale, images could be resized so that the number of pixels per degree is converted to the same value as for the images in our study. The ability of the image-capture system to resolve fine detail may affect the length of macular vessels that are visible, and so the threshold applied to macular vessel length may be dependent on camera resolution. Images with a narrower or wider angular field of view may have different field requirements. If this were the case, then straightforward modifications to the constraints listed in Table 3 would probably be sufficient. Problems could occur, however, if images originate from sources that contain a variety of fields of view. Linking of images to their source would be necessary so that resizing or other minor adaptations could be applied according to the photographic setup used for each image.
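The resizing step suggested above reduces to a single scale factor, the ratio of the reference pixels per degree to that of the source image. The sketch below is one possible way to apply it, using nearest-neighbour sampling in NumPy for brevity; the function names and the pixels-per-degree values are hypothetical, and a production system would more likely use an interpolating resize from an imaging library.

```python
import numpy as np

def rescale_factor(source_ppd: float, reference_ppd: float) -> float:
    """Scale factor that converts the source pixels per degree to the reference value."""
    return reference_ppd / source_ppd

def resize_nearest(image: np.ndarray, factor: float) -> np.ndarray:
    """Resize a 2-D or 3-D image array by nearest-neighbour sampling."""
    h, w = image.shape[:2]
    new_h, new_w = max(1, round(h * factor)), max(1, round(w * factor))
    rows = np.minimum((np.arange(new_h) / factor).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / factor).astype(int), w - 1)
    return image[rows][:, cols]

# Hypothetical example: a source camera with twice the reference resolution.
factor = rescale_factor(source_ppd=64.0, reference_ppd=32.0)
image = np.arange(80).reshape(8, 10)
resized = resize_nearest(image, factor)
```

After resizing, distances expressed in DD keep the same meaning in pixels, although the vessel length threshold may still need retuning for the reasons given above.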
Apart from automation of screening systems, the software could also be a useful aid at the time of photography. As more technicians use digital instruments to image the eye, it may be possible to incorporate an immediate quality check into the instrument software so that the technicians can be made aware of any quality problem. This would present an opportunity to solve the problem by repeating the photograph.
Image quality assessment is an important part of diabetic retinal image grading. It has been shown that it is possible to use automated microaneurysm detection to reduce the workload for manual image graders.3 However, incorrect results would be obtained if an attempt were made to apply that method to poor-quality images. It is therefore important that the images presented for automated microaneurysm detection be ones with quality that is adequate for microaneurysms to be seen and that show the correct portion of retina. The present study has shown that a fully automated method can make this decision. Hence, the techniques we have described are able to form an important and effective component of an automated image grading system in a diabetic retinopathy screening program.
| Footnotes |
|---|
Submitted for publication August 30, 2005; revised October 28, 2005; accepted January 18, 2006.
Disclosure: A.D. Fleming, None; S. Philip, None; K.A. Goatman, None; J.A. Olson, None; P.F. Sharp, None
The publication costs of this article were defrayed in part by page charge payment. This article must therefore be marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.
Corresponding author: Alan Fleming, Bio-Medical Physics and Bio-Engineering, Aberdeen University and Grampian University Hospitals, Foresterhill, Aberdeen AB25 2ZD, Scotland, UK; a.fleming{at}biomed.abdn.ac.uk.