Predicting Stroke Risk Using Machine Learning: A Data-Driven Approach to Early Detection and Prevention

dc.contributor.author Sutcu, Muhammed
dc.contributor.author Jouda, Dana
dc.contributor.author Yildiz, Baris
dc.contributor.author Katrib, Juliano
dc.contributor.author Almustafa, Khaled Mohamad
dc.contributor.other Industrial Engineering
dc.contributor.other 06. School Of Engineering
dc.contributor.other 01. Atılım University
dc.date.accessioned 2025-12-05T16:39:23Z
dc.date.available 2025-12-05T16:39:23Z
dc.date.issued 2025
dc.description.abstract Stroke is a major global health concern and a leading cause of disability and mortality, emphasizing the need for early risk prediction and intervention. This study leverages statistical analysis, machine learning (ML) classification, clustering, and survival modeling to identify key stroke predictors using a dataset of 5110 records. Descriptive statistics reveal that age, glucose levels, BMI, hypertension, and heart disease are the most influential risk factors. Stroke prevalence is notably higher among hypertensive (13.25%) and heart disease patients (17.03%), as well as among former (7.91%) and current smokers (5.32%). Clustering analysis using PCA and t-SNE highlights high-risk groups with elevated glucose levels and advanced age. Among ML models, XGBoost offers the best trade-off between precision and recall, while naïve Bayes achieves the highest recall (0.404), detecting more stroke cases despite higher false positives. Feature importance analysis ranks glucose, BMI, and age as dominant predictors, with XGBoost emphasizing cardiovascular conditions. Survival analysis confirms increasing stroke risk beyond age 60, with the Kaplan-Meier and Cox models showing a 31.9% risk increase linked to hypertension. These findings underscore the importance of early screening, lifestyle intervention, and targeted care. Future research should explore data-balancing methods like SMOTE and develop real-time tools to support clinical decision-making. en_US
dc.description.sponsorship Gulf University for Science and Technology [ISG- Case 137] en_US
dc.description.sponsorship This project has been partially supported by the Gulf University for Science and Technology (GUST) and the GUST Engineering and Applied Innovation Research Center (GEAR) under project code ISG- Case 137. en_US
dc.identifier.doi 10.1155/srat/2892726
dc.identifier.issn 2090-8105
dc.identifier.issn 2042-0056
dc.identifier.scopus 2-s2.0-105021996751
dc.identifier.uri https://doi.org/10.1155/srat/2892726
dc.identifier.uri https://hdl.handle.net/20.500.14411/10961
dc.language.iso en en_US
dc.publisher Wiley en_US
dc.relation.ispartof Stroke Research and Treatment en_US
dc.rights info:eu-repo/semantics/openAccess en_US
dc.subject Clustering en_US
dc.subject Early Detection en_US
dc.subject Feature Importance en_US
dc.subject Naïve Bayes en_US
dc.subject Predicting Stroke Risk Using Machine Learning en_US
dc.subject Stroke Prevention en_US
dc.subject Survival Analysis en_US
dc.subject XGBoost en_US
dc.title Predicting Stroke Risk Using Machine Learning: A Data-Driven Approach to Early Detection and Prevention
dc.type Article en_US
dspace.entity.type Publication
gdc.author.institutional Yıldız, Barış
gdc.author.scopusid 57203285174
gdc.author.scopusid 60196315700
gdc.author.scopusid 58715216700
gdc.author.scopusid 59548294800
gdc.author.scopusid 23567493800
gdc.author.wosid Almustafa, Khaled/Lru-8856-2024
gdc.description.department Atılım University en_US
gdc.description.departmenttemp [Sutcu, Muhammed; Jouda, Dana; Katrib, Juliano; Almustafa, Khaled Mohamad] Gulf Univ Sci & Technol GUST, GUST Engn & Appl Innovat Res Ctr GEAR, Dept Elect & Comp Engn, Hawally, Kuwait; [Yildiz, Baris] Atilim Univ, Ind Engn Dept, Ankara, Turkiye en_US
gdc.description.issue 1 en_US
gdc.description.publicationcategory Article - International Peer-Reviewed Journal - Institutional Faculty Member en_US
gdc.description.scopusquality Q3
gdc.description.volume 2025 en_US
gdc.description.woscitationindex Emerging Sources Citation Index
gdc.description.wosquality N/A
gdc.identifier.pmid 41287761
gdc.identifier.wos WOS:001614984300001
relation.isAuthorOfPublication affa0402-9f12-4483-8860-309080a4dbd8
relation.isAuthorOfPublication.latestForDiscovery affa0402-9f12-4483-8860-309080a4dbd8
relation.isOrgUnitOfPublication 12c9377e-b7fe-4600-8326-f3613a05653d
relation.isOrgUnitOfPublication 4abda634-67fd-417f-bee6-59c29fc99997
relation.isOrgUnitOfPublication 50be38c5-40c4-4d5f-b8e6-463e9514c6dd
relation.isOrgUnitOfPublication.latestForDiscovery 12c9377e-b7fe-4600-8326-f3613a05653d

Files

Original bundle

Name: Predicting Stroke Risk Using Machine Learning.pdf
Size: 1.79 MB
Format: Adobe Portable Document Format
