1From the Department of Ophthalmology, Herlev Hospital, University of Copenhagen, Copenhagen, Denmark; 2Retinalyze A/S, Hørsholm, Denmark; the 3Department of Ophthalmology, Odense University Hospital, Odense, Denmark; the 4Department of Ophthalmology, Malmö University Hospital, Malmö, Sweden; the 5Department of Ophthalmology, Sahlgrenska University Hospital, Göteborg, Sweden; and the 6Department of Diabetes Research, University of Wales College of Medicine, Cardiff, Wales, United Kingdom.
| Abstract |
|---|
METHODS. Four hundred fundus photographs (35-mm color transparencies) were obtained in 200 eyes of 100 patients with diabetes who were randomly selected from the Welsh Community Diabetic Retinopathy Study. A gold standard reference was defined by classifying each patient as having or not having diabetic retinopathy based on overall visual grading of the digitized transparencies. A single-lesion visual grading was made independently, comprising meticulous outlining of all single lesions in all photographs and used to develop the automated red lesion detection system. A comparison of visual and automated single-lesion detection in replicating the overall visual grading was then performed.
RESULTS. Automated red lesion detection demonstrated a specificity of 71.4% and a resulting sensitivity of 96.7% in detecting diabetic retinopathy when applied at a tentative threshold setting for use in diabetic retinopathy screening. The accuracy of 79% could be raised to 85% by adjustment of a single user-supplied parameter determining the balance between the screening priorities, for which a considerable range of options was demonstrated by the receiver-operating characteristic (area under the curve 90.3%). The agreement of automated lesion detection with overall visual grading (0.659) was comparable to the mean agreement of six ophthalmologists (0.648).
CONCLUSIONS. Detection of diabetic retinopathy by automated detection of single fundus lesions can be achieved with a performance comparable to that of experienced ophthalmologists. The results warrant further investigation of automated fundus image analysis as a tool for diabetic retinopathy screening.
Digital analysis of fundus photographic images may enable detailed lesion mapping and counting, and dynamic analysis of time series of fundus photographs. The simplest automatable task of practical utility is probably that of distinguishing between subjects without retinopathy and those with any level of retinopathy, of which only the latter may need secondary-level evaluation (in this case, a visual evaluation by an ophthalmologist or a grader). Absence of retinopathy is diagnosed in a large proportion of patients with diabetes in populations undergoing photographic screening.2 Diagnosing the absence of retinopathy or presence of a single or very few lesions is often more time consuming than diagnosing mild levels of retinopathy, because it may involve detailed consideration of borderline elements of doubtful identity.
The objective of the present study was to develop a method for automated detection of hemorrhages and microaneurysms in digitized fundus photographic images of patients with diabetic retinopathy and to compare the automated detection with visual identification of diabetic retinopathy.
| Materials and Methods |
|---|
The study includes a baseline examination and a follow-up examination after 18 months. Only photographs from the follow-up examination were included in a substudy of single-lesion detection, whereas both sets of photographs, a total of 800, were included in a substudy of the reproducibility of diabetic retinopathy grading between color transparency viewing and computer monitor display of digitized color versions of the same images. Both the color film and the digitized photographs were classified in random order. No upper limit was set for the time used to grade either set of photographs.
Image quality was graded for each eye according to the following classifications: excellent, good, fair, or poor (EURODIAB criteria3).
The present study is an entirely retrospective analysis of a prospectively planned investigation. The protocol underwent appropriate institutional review and complied with the tenets of the Declaration of Helsinki.2 The present study did not involve patients or biological samples, and as such did not require renewed institutional review under the national laws of the countries of residence of the study participants.
Photography
Fundus photographs were obtained after dilatation of the pupil with one or more drops of phenylephrine hydrochloride 2.5% and/or tropicamide 1%. A fundus camera (CR4 45 NM; Canon Europa, NV, Amstelveen, The Netherlands) set at a 45° angular field of view and 35-mm color transparency film (Ektachrome Elite 100; Eastman Kodak Corp., Rochester, NY) were used.
Four photographs were recorded at each visit in each eye of each of the 100 patients: two with the nasal border of the photographic field transecting the optic disc (macula-temporal field) and two with the temporal border of the photographic field set at 1 disc diameter from the optic disc on its temporal side (disc-nasal field). The two fields in combination span a viewing angle of 45° (vertically) by 80° (horizontally).3 The best photograph of each field was used for grading and the other was retained in the project files.
Digitization
The slide-mounted 35-mm color film transparencies were digitized with a slide film digitizer (Coolscan LS-2000; Nikon Corp., Tokyo, Japan) at 1350 dpi and 12 bits per pixel per color channel (red, green, and blue [RGB]). The images were stretched to 8 bits (values from 0 to 255) per color channel and stored in the uncompressed RGB tagged information file format (TIFF). The full image size was 1448 x 1296 pixels, and the diameter of the circular fundus region defined by the internal mask of the camera was approximately 1170 pixels. The images were displayed on a 21-in. color monitor (Professional Series PT813; ViewSonic Corp., Walnut, CA), unless otherwise specified.
Single-Lesion Classification
For visual inspection, the color and contrast of the digitized fundus images were enhanced according to an automated procedure involving full color stretching separately for each color channel. The lesions were outlined with a manually controlled screen marker. The software annotation tool included circles, ellipses, and freehand drawing to enclose the lesion as narrowly as possible. The lesion's class and outline were displayed in color code as an overlaid image that could be toggled on and off.
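The enhancement step can be pictured as a per-channel linear stretch. The following is a minimal sketch, assuming that "full color stretching" means rescaling each channel so that its darkest value maps to 0 and its brightest to 255. The source does not give the exact procedure, and the function names `stretch_channel` and `stretch_rgb` as well as the sample pixels are hypothetical.

```python
def stretch_channel(values, out_max=255):
    """Linearly rescale one color channel to span 0..out_max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A flat channel carries no contrast; leave it unchanged.
        return list(values)
    scale = out_max / (hi - lo)
    return [round((v - lo) * scale) for v in values]


def stretch_rgb(pixels):
    """Stretch the R, G, and B channels of a list of (r, g, b) pixels separately."""
    channels = [stretch_channel(ch) for ch in zip(*pixels)]
    return list(zip(*channels))


# Hypothetical low-contrast pixels (not taken from the study images).
sample = [(120, 60, 30), (140, 80, 35), (160, 100, 40)]
stretched = stretch_rgb(sample)
```

Because each channel is handled on its own, a narrow-range channel such as blue in the sample is expanded as strongly as red, which is what makes faint lesions easier to see but also shifts the apparent color balance.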
Red fundus lesions were defined as any fundus lesion visibly recognizable as being a hemorrhage, a microaneurysm, or either of the two. No other types of lesions were accepted. Hemorrhages were defined as comprising round or irregularly shaped, sharply or diffusely outlined, deep red areas of the color of venous blood or darker. Microaneurysms were defined as small round lesions with a well-defined edge and occasionally a brighter rim. The color and saturation of microaneurysms vary according to the oxygenation of the blood, from bright red (nearly pink) to deep red. The definitions of microaneurysms and hemorrhages are overlapping, and therefore the classification did not distinguish between the two.
Overall Visual Grading
Using standard grading conditions, two experienced graders of the Bro Taf Screening Service (Bro Taf Area Health Authority Diabetic Retinopathy Screening Service, Cardiff, Wales, UK), independently and in a randomized order of presentation, graded the color fundus photographs and image quality on transparency film of each eye according to the Bro Taf protocol (a modification of the EURODIAB standard).2 3 In a subsequent session, the graders reached consensus on the grades of all eyes.
One month later, the graders independently and in a randomized order of presentation graded the digitized images of each eye according to the Bro Taf protocol. The images were presented on two 19-in. color computer monitors (Multiscan G400; Sony, Tokyo, Japan). In a subsequent session, the graders reached consensus on the grades and image quality of all eyes, and the consensus grades were used to classify patients into the gold-standard binary reference classification of no diabetic retinopathy and any diabetic retinopathy, by using the Bro Taf levels of no diabetic retinopathy and questionable diabetic retinopathy for the first classification and levels of manifest diabetic retinopathy for the second classification. The overall visual gradings were reserved for the validation of automated lesion detection and were not used for development of the lesion-detection algorithm.
Visual Single-Lesion Annotation
Each of six ophthalmologists, all of whom subspecialize in retinal disease, identified and outlined as narrowly as possible all red lesions in each of the 400 digitized fundus photographs, independently and in random order of presentation.
First Round of Automated Lesion Detection
The digitized fundus images were analyzed with commercial fundus image-analysis software (RetinaLyze System; Retinalyze A/S, Hørsholm, Denmark). The lesion-detection algorithm of the system applies advanced modeling of the gray-level image function of digital images, primarily the green color channel. Seed points of feature candidates are identified by using local image properties. The vessel tree and optic nerve head are automatically identified and extracted before growing lesion candidates. A visibility parameter describing each candidate lesion is then calculated. This parameter describes the densitometric steepness of the edge of the candidate lesion and its contrast relative to the surrounding fundus, with allowance for other dark areas being permissible in its vicinity, provided that such dark areas are noncontiguous with the vascular branchings of the trunk vessels of the retina. The level of visibility required for acceptance of a candidate lesion is determined by a single user-supplied parameter, the visibility threshold, which controls the sensitivity of the automated lesion detection. The analysis was performed on a personal computer with a standard operating system (Windows NT; Microsoft Corp., Seattle, WA).
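The role of the visibility threshold can be illustrated with a small sketch. It assumes that each candidate lesion has already been reduced to a single visibility score and that a candidate is accepted when its score reaches the threshold. The source describes the parameter only qualitatively, so the `Candidate` type, the scores, and the comparison rule are hypothetical and do not reproduce the commercial algorithm. The values 1.2 and 1.5 are the two settings reported in the Results section.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A hypothetical candidate lesion with a precomputed visibility score."""
    x: int
    y: int
    visibility: float


def accept_lesions(candidates, visibility_threshold):
    """Keep candidates whose visibility is at least the threshold (assumed rule)."""
    return [c for c in candidates if c.visibility >= visibility_threshold]


# Illustrative scores only; they are not measurements from the study.
candidates = [
    Candidate(310, 422, 0.9),
    Candidate(655, 201, 1.3),
    Candidate(702, 640, 1.8),
]

at_default = accept_lesions(candidates, 1.2)   # default setting in the Results
at_stricter = accept_lesions(candidates, 1.5)  # stricter setting in the Results
```

Under this assumed rule, raising the threshold can only remove detections, which is consistent with the reported trade-off of lower sensitivity for higher specificity.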
Adjudicated Single-Lesion Annotation
Three months after the specialist lesion outline, an adjudicating single-lesion classification was made of each of the 400 digitized fundus images. Three of the six ophthalmologists discussed and reached agreement or decided by majority vote whether to accept or reject a given hemorrhage or microaneurysm they deemed to be at least 50% likely to be a diabetic lesion. To ascertain that the adjudicating panel considered all potential red lesions, the panel members were shown all red lesion outlines set by the six specialists as well as by the automatic lesion-detection system. All outlines were shown as a simultaneous overlay in a masked appearance, using the same marker for automatically detected and specialist-identified lesions. At the discretion of any panel member the manually controlled zoom was linearly increased to approximately 30 x 25 pixels, which allowed the panel to investigate detailed areas of the images and to accept, reject, or redraw existing outlines or to outline unmarked lesions that they detected.
At a training session before the adjudicating session, a protocol amendment was accepted requiring all parts of the fundus images to be investigated simultaneously by the panel members at a specified image zoom factor to approximately 400 x 300 pixels. In addition, the adjudicators were allowed to examine the original slide-mounted transparencies using a 15-D magnifying spectacle lens and a slide projector.
The adjudicated set of single red lesion annotations was used to make the final adjustment of the automated lesion-detection algorithm.
Statistical Analysis
The agreement between overall visual gradings based on inspection of color transparency film and inspection of computer screened versions of the same photographs was calculated using the Cohen weighted κ.4
Patients were classified as having automatically detected diabetic retinopathy if a single red lesion of any type was identified by the automated lesion-detection algorithm in any of the four images of the patient's two eyes. The clinical relevance of the detected lesions was characterized by the specificity, sensitivity, and accuracy (the fraction of the entire study population that was correctly classified) against the gold-standard reference classifying each patient as having or not having diabetic retinopathy. The receiver operating characteristic (ROC) curve of the automated lesion detection was used to characterize the relationship between sensitivity and specificity, with area under the curve (AUC) serving as a general measure of performance.5
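The patient-level rule and the three performance measures can be sketched as follows. The function names and the per-image lesion counts are hypothetical. The confusion-matrix counts are inferred from figures given elsewhere in the article (70 patients without retinopathy, of whom 50 were correctly identified, and a sensitivity of 96.7% among the remaining 30), so they are an assumption rather than a table taken from the source.

```python
def patient_has_retinopathy(lesion_counts):
    """Flag a patient when any image shows at least one detected red lesion."""
    return any(count > 0 for count in lesion_counts)


def screening_metrics(tp, fn, fp, tn):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    return sensitivity, specificity, accuracy


# Hypothetical lesion counts for the four images of one patient.
flagged = patient_has_retinopathy([0, 0, 1, 0])
clear = patient_has_retinopathy([0, 0, 0, 0])

# Counts inferred from the reported percentages at the default threshold
# (an assumption: 30 patients with retinopathy, 70 without).
sens, spec, acc = screening_metrics(tp=29, fn=1, fp=20, tn=50)
```

Because a single detected lesion in any image is enough to flag the patient, one false-positive lesion in four otherwise clean images produces a false-positive patient, which helps explain why specificity is the weaker of the two measures.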
The Cohen κ of agreement between automated classification and the gold standard reference was compared with the mean of agreements between the six ophthalmologists and the gold standard reference. The coefficient of variation was used to characterize the variability between individual ophthalmologists.
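For a binary classification the Cohen kappa reduces to a short calculation on a 2 x 2 table. The sketch below uses the unweighted form and counts inferred from the percentages reported at a visibility threshold of 1.5 (specificity 85.7% of 70 patients, sensitivity 83.3% of 30 patients). Both the inferred counts and the function name `cohen_kappa` are assumptions. With these counts the formula gives approximately 0.659, which coincides with the agreement value quoted in the abstract, although the source does not state which threshold that value refers to.

```python
def cohen_kappa(tp, fn, fp, tn):
    """Unweighted Cohen kappa for a 2 x 2 agreement table."""
    n = tp + fn + fp + tn
    observed = (tp + tn) / n
    # Chance agreement from the marginal totals of the two classifications.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    return (observed - expected) / (1 - expected)


# Counts inferred from the percentages reported at a visibility threshold of 1.5
# (an assumption, not a table taken from the source).
kappa = cohen_kappa(tp=25, fn=5, fp=10, tn=60)
```

Kappa corrects the raw agreement (here 85%) for the agreement expected by chance from the marginal totals (here 56%), which is why it is lower than the accuracy.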
Statistical analysis was performed on computer (SAS System, ver. 8.01; SAS Institute Inc., Cary, NC; and StatXact 4; Cytel Software Corp., Cambridge, MA).
| Results |
|---|
The ROC curve (Fig. 1) of the automatic red lesion detection demonstrates the range of options in setting the balance between sensitivity and specificity. The AUC of the ROC curve was 90.3%. With the key parameter of the automated-detection algorithm, the visibility threshold (see the Methods section), set to its default value of 1.2, the specificity of detecting patients without retinopathy was 71.4% and the sensitivity of detecting patients with retinopathy was 96.7%, resulting in an overall accuracy (the fraction of the entire study population that was correctly classified) of 79.0% (Table 3; Fig. 2).
Twenty patients without diabetic retinopathy or with questionable diabetic retinopathy were incorrectly classified by automated red lesion detection as having retinopathy. The false-positive lesions arose in eyes with well-defined bright areas of visible retinal nerve fibers or with bright posterior hyaloid reflexes. On the background of these features, small, interspersed, well-circumscribed areas of normal yellow-red fundus pigmentation appeared as isolated high-contrast elements (intensity minima) that mimicked small hemorrhages and microaneurysms.
When we increased the visibility threshold to 1.5, the specificity rose to 85.7% and the sensitivity fell to 83.3%, resulting in an overall accuracy of 85.0%. This demonstrates the possibility of adjusting the algorithm for specific objectives by variation of the visibility threshold.
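The two reported operating points give a rough feel for the ROC curve. The sketch below joins them to the trivial corners (0, 0) and (1, 1) with straight segments and integrates by the trapezoid rule. This is an illustration only: the full curve in the article is based on many more threshold settings, and the function name `trapezoid_auc` is hypothetical. The coarse estimate comes out at roughly 0.89, slightly below the reported 90.3%, which is plausible when a curve is approximated by only a few straight segments.

```python
def trapezoid_auc(points):
    """Area under a piecewise-linear ROC curve given (fpr, tpr) points."""
    pts = sorted(points)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Operating points from the reported specificity and sensitivity,
# expressed as (1 - specificity, sensitivity), plus the trivial corners.
roc_points = [
    (0.0, 0.0),
    (1 - 0.857, 0.833),  # visibility threshold 1.5
    (1 - 0.714, 0.967),  # visibility threshold 1.2 (default)
    (1.0, 1.0),
]
coarse_auc = trapezoid_auc(roc_points)
```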
The mean specificity of the six individual specialists was 82.9% (range, 74.3%–88.6%). This was higher than the specificity of 71.4% achieved by the automated lesion detection (Table 4). The mean sensitivity of the individual specialists was 91.1% (range, 83.3%–96.7%). This was lower than the sensitivity of 96.7% achieved by the automated lesion detection at the default threshold value. The specificity of 75.5% of the adjudicating panel was higher than that of the automated lesion detection and the sensitivity of 90.0% was lower than that of the automated lesion detection. At a visibility threshold of 1.2 the sensitivity of the automated lesion detection matched the sensitivity of the best-performing specialist, at a cost of 2.9 percentage points of specificity.
| Discussion |
|---|
We have shown that a red-lesion-detection algorithm operating on digitized fundus images can achieve considerable effectiveness in sorting fundus photographs that show no disease from photographs with nonproliferative diabetic retinopathy. Thus, 50 of 70 patients without retinopathy were correctly identified.
The present study was intended to determine whether a sound ROC curve function can be achieved in automated detection of diabetic retinopathy. It was not intended to measure the sensitivity and specificity that is ultimately obtainable by automated fundus image analysis, because it involved a partial circular argument, in that the algorithm was tested on the same fundus photographs that were used to develop it. Consequently, a study of an independent set of photographs has been undertaken.6
We specifically chose not to use fluorescein angiography to classify red lesions into microaneurysms and hemorrhages because angiography does not permit distinction between small spot hemorrhages and clotted microaneurysms. Indeed, there is currently no practicable method whereby an ulterior ruling can be made on candidate lesions, and color fundus photographic grading is the established method used in clinical trials for morphologic assessment of diabetic retinopathy.
Ultimately, there is no means of objectively determining an ulterior truth, not even by fluorescein angiography. Although it may reveal microaneurysms that are not visible on fundus images, a candidate lesion that does not show up on the angiogram cannot be classified, because it may be a clotted microaneurysm, a hemorrhage, or another type of dark area that does not warrant the diagnosis of any of these lesions. Consequently, it would be difficult or impossible to determine whether the adjudicating panel or the overall graders were most correct in determining the absence or presence of diabetic retinopathy.
The frequency of diabetic retinopathy in the general population is low, and consequently the identification of one definite lesion in a fundus photograph may induce the observer to decrease his or her threshold for acceptance of candidate elements in the same photograph as being true diabetic lesions. Such a tendency to perceive an increase in the likelihood that morphologically similar elements are indeed specific pathologic lesions and not just random fluctuations in normal fundus morphology could not be confirmed by statistical analysis in the present study, and such reasoning was not implemented in the algorithm.
The present study demonstrates the possibility of achieving algorithmic identification of fundi with diabetic retinopathy with a near-perfect 96.7% sensitivity while maintaining a specificity of 71.4%. The suggested practical usefulness of the automated system is to reduce the workload of visual graders by approximately 50%, depending on the distribution of retinopathy in the screening population. In addition, the high workload and monotonous procedures involved in photographic screening for diabetic retinopathy appear to warrant practical assessment of computer-assisted diabetic retinopathy screening, at a time when digital fundus photography is becoming gradually more common.
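The dependence of the workload reduction on the distribution of retinopathy can be made explicit. The sketch below assumes that every patient classified as free of retinopathy by the automated step is removed from the visual grading queue, so that the reduction equals the fraction of true negatives plus false negatives. The function name `workload_reduction` and the 10% prevalence are hypothetical; the 30% prevalence follows from the 70 of 100 study patients without retinopathy.

```python
def workload_reduction(prevalence, sensitivity, specificity):
    """Fraction of patients classified as free of retinopathy by the automated step."""
    true_negatives = (1 - prevalence) * specificity
    false_negatives = prevalence * (1 - sensitivity)
    return true_negatives + false_negatives


# Study values: 30 of 100 patients had retinopathy; default-threshold performance.
study = workload_reduction(prevalence=0.30, sensitivity=0.967, specificity=0.714)

# Hypothetical screening population with a lower prevalence (illustration only).
lower_prevalence = workload_reduction(prevalence=0.10, sensitivity=0.967, specificity=0.714)
```

With the study values the result is about 51%, in line with the approximately 50% quoted above, and it grows as the prevalence of retinopathy falls because more patients are eligible to be cleared.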
In the present study automated lesion detection was applied in a conservative strategy optimized to achieve a significant potential for workload reduction in a screening setting without overlooking even very mild grades of retinopathy. Further studies should investigate whether the option of tuning the lesion detection to values other than its default setting can be safely used.
The effect of potentially confounding pathologic fundus changes has not been thoroughly investigated in the present study because of the low prevalence of such changes in the study population. Preliminary investigations indicate that the algorithm for red lesion detection is virtually insensitive to bright lesions such as drusen, epiretinal fibrosis, and hypopigmentation of the retinal pigment epithelium, whereas it is sensitive to microaneurysms and hemorrhages, regardless of whether these are caused by diabetic retinopathy or other types of retinal disease that produce such elements of microangiopathy.
In both the UK Prospective Diabetes Study (UKPDS) and the Early-Treatment Diabetic Retinopathy Study (ETDRS) the presence of microaneurysms only, in the absence of other lesions, is associated with an increased risk of progression of diabetic retinopathy.7 8 Patients in the UKPDS with microaneurysms only had a 3.4% risk of progression to moderate nonproliferative diabetic retinopathy (level 43) or higher levels of retinopathy or photocoagulation treatment within 6 years, whereas the risk in patients with no retinopathy at baseline was 2.7%.9 These relatively small numbers indicate that the risk of overlooking a single or a few microaneurysms is inconsequential in screening settings in which the follow-up interval never exceeds 2 years. The risk obviously increases if disease in the same patient is missed more than once in successive examinations. Future studies should address whether the missing of lesions is a random function or a systematically repeating event in a subgroup of eyes or patients, so that markers of poor detectability can be identified and taken into account, if possible.
The present study demonstrates the feasibility of using automated digital detection of red lesions in diabetic retinopathy. Further studies are in progress to determine the reproducibility of the procedures involved. The results of the present study firmly document the potential for incorporating automated image analysis in the diabetic retinopathy screening services. The exact role of such instruments remains to be determined in practice.
| Footnotes |
|---|
Commercial relationships policy: E (JG, MG); C (ML, NL, HL-A, DRO); P (MG); N (EA, AKS, HK).
The publication costs of this article were defrayed in part by page charge payment. This article must therefore be marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.
Corresponding author: Michael Larsen, Department of Ophthalmology, Herlev Hospital, DK-2730 Herlev, Denmark; mla{at}dadlnet.dk.