COVID-19 prediction with machine learning algorithms using blood test data
Date
2023
Abstract
COVID-19 has directly harmed the lives of a great many people. Detecting it with RT-PCR tests, X-ray imaging, or computed tomography (CT) has drawbacks: long turnaround times, false-negative rates of 15-20%, expensive equipment, and the need for qualified staff. Because the disease is highly transmissible, a faster, more precise, and more affordable way of identifying COVID-19 patients, such as a blood test, is needed to reduce the risk of infection, especially in deprived areas. This study aims to present an accurate and precise approach that uses machine learning on blood test results to identify COVID-19-positive cases quickly and affordably, and so assist clinicians. The approach may also help in diagnosing other diseases in the future. The blood test dataset comes from the Hospital Israelita Albert Einstein in São Paulo, Brazil, and contains 5644 samples. The prediction method is based on eight different machine learning (ML) models, evaluated under three strategies. First, the models are trained without any feature selection algorithm. Second, the same models are tested with Grey Wolf Optimization (GWO) feature selection. Third, Pearson correlation is used for feature selection instead of GWO. Model performance is evaluated using accuracy, sensitivity, specificity, and AUC. The support vector machine (SVM) gives the best results, with an accuracy of 98.82%, a sensitivity of 97.83%, and a specificity of 100%. The performance metrics of the models using GWO feature selection improve significantly, which is the main contribution of this study.
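As a rough illustration of two ideas named in the abstract, the sketch below shows a Pearson-correlation feature filter and the accuracy, sensitivity, and specificity metrics computed from a confusion matrix. It is not the thesis code: the function names, the 0.5 threshold, the feature names, and the toy data are all assumptions made for this example, and the abstract does not say how the correlation filter was actually configured.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / sqrt(var_x * var_y)


def select_features(columns, labels, threshold=0.5):
    """Keep features whose |r| with the label reaches the threshold.

    The threshold value is hypothetical; it is not taken from the thesis.
    """
    return [
        name
        for name, values in columns.items()
        if abs(pearson(values, labels)) >= threshold
    ]


def confusion_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Sensitivity: share of true positives that were detected.
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        # Specificity: share of true negatives that were correctly rejected.
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


# Toy data, invented for illustration only.
labels = [1, 1, 1, 0, 0, 0]
columns = {
    "feature_a": [2.0, 2.5, 3.0, 6.0, 6.5, 7.0],  # tracks the label closely
    "feature_b": [1.0, 2.0, 3.0, 1.0, 2.0, 3.0],  # unrelated to the label
}
selected = select_features(columns, labels)
metrics = confusion_metrics(labels, [1, 1, 0, 0, 0, 0])
print(selected)
print(metrics)
```

A filter of this kind scores each feature independently of the others, whereas a wrapper method such as GWO searches over feature subsets, which is one plausible reason the two strategies could behave differently.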
Keywords
Computer Engineering and Computer Science and Control
Start Page
0
End Page
127