Application of Data Mining to Predict and Diagnose Diabetic Retinopathy

Haniyeh, Maryam
Date
2024-06
Type
Thesis
Degree
Description
A Master of Science thesis in Biomedical Engineering by Maryam Haniyeh entitled, “Application of Data Mining to Predict and Diagnose Diabetic Retinopathy”, submitted in June 2024. Thesis advisor is Dr. Michel Pasquier and thesis co-advisor is Dr. Assim Sagahyroon. Soft copy is available (Thesis, Completion Certificate, Approval Signatures, and AUS Archives Consent Form).
Abstract
Diabetes Mellitus (DM), a chronic metabolic disorder, is characterized by high blood sugar levels that can lead to complications such as Diabetic Retinopathy (DR), a condition that damages the retina and can cause vision loss. Early detection and management of DR are critical and can be facilitated by a comprehensive understanding of the disease and its risk factors, achievable through advanced data mining techniques. This study sets out to construct data mining models that can identify these risk factors and associate them with the likelihood of developing DR. The dataset for this research was sourced from Saqr Hospital in Ras Al Khaimah and includes 400 patient records, with 194 patients diagnosed with DR. To assess the impact of various factors on DR, the study analyzed 29 attributes, including diabetes duration, Body Mass Index, blood glucose levels, cardiovascular disease, and hypertension. The initial analysis employed supervised classification algorithms such as k-Nearest Neighbor, Support Vector Machine, Naïve Bayes, Random Forest, XGBoost, and J48 Decision Tree to predict the incidence of DR. To obtain a more reliable estimate of each model's accuracy, 10-fold cross-validation was used, so that every model was trained and tested on different subsets of the data. Feature selection was used to determine the specific attributes that correlate with the presence of DR. In addition, unsupervised learning techniques were employed to discover association rules and evaluate the probability of relationships within the dataset. The results indicate that feature selection significantly improved the performance of the classifiers, with the Random Forest algorithm achieving the highest accuracy of 91% and a specificity of 90.4%. The unsupervised learning methods also highlighted strong associations between hypertension, diabetic macular edema, and DR. These findings can help in understanding the interconnected nature of these complications and emphasize the importance of comprehensive management approaches for patients with diabetic retinopathy.
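For readers unfamiliar with the evaluation terms used in the abstract, the following is a minimal Python sketch of how a 10-fold split over 400 records and the accuracy and specificity metrics are typically computed. It is an illustration only, not the thesis code: the function names `k_fold_indices` and `accuracy_and_specificity` are hypothetical, the split is unshuffled and unstratified for simplicity, and labels are assumed to be coded 1 for DR and 0 for no DR.

```python
from typing import List, Sequence, Tuple


def k_fold_indices(n_samples: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Return k (train, test) index pairs covering range(n_samples).

    Hypothetical helper: folds are contiguous and unshuffled.
    """
    # Spread any remainder so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, test))
        start += size
    return folds


def accuracy_and_specificity(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Compute accuracy and specificity, assuming 1 = DR and 0 = no DR."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    # Specificity is the share of true negatives among all actual negatives.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return accuracy, specificity


# Each of the 10 folds holds out 40 of the 400 records for testing.
folds = k_fold_indices(400, k=10)
```

In each fold a classifier would be trained on the 360 training indices and scored on the 40 held-out ones, and the per-fold metrics would then be averaged.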