Recurrent Neural Networks for Spam E-Mail Classification on an Agglutinative Language

dc.authorscopusid 56247318100
dc.authorscopusid 55806648900
dc.authorscopusid 55293387500
dc.authorscopusid 15081108900
dc.contributor.author Işik,S.
dc.contributor.author Kurt,Z.
dc.contributor.author Anagun,Y.
dc.contributor.author Ozkan,K.
dc.contributor.other Computer Engineering
dc.date.accessioned 2024-07-05T15:45:56Z
dc.date.available 2024-07-05T15:45:56Z
dc.date.issued 2020
dc.department Atılım University en_US
dc.department-temp Işik S., Computer Eng, Eskisehir Osmangazi University, Eskisehir, Turkey; Kurt Z., Computer Eng, Atılım University, Ankara, Turkey; Anagun Y., Computer Eng, Eskisehir Osmangazi University, Eskisehir, Turkey; Ozkan K., Computer Eng, Eskisehir Osmangazi University, Eskisehir, Turkey en_US
dc.description.abstract In this study, we provide an alternative solution to the problem of classifying spam and legitimate e-mails. Different deep learning architectures are applied to features chosen by two feature selection methods, Mutual Information (MI) and Weighted Mutual Information (WMI). First, MI and WMI are applied to reduce the number of selected terms. Second, the feature vectors are constructed using the bag-of-words (BoW) model. Finally, the performance of the system is analyzed using Artificial Neural Network (ANN), Long Short-Term Memory (LSTM) and Bidirectional Long Short-Term Memory (BILSTM) models. In the experimental simulations, the detection results obtained with WMI and MI are competitive with each other in terms of accuracy for an agglutinative language, namely Turkish. The experimental scores show that LSTM and BILSTM reach 100% accuracy on spam and legitimate e-mails when combined with either MI or WMI. However, for particular cross-validation runs, WMI features perform better than MI features in e-mail classification. Given the high detection scores, WMI and MI combined with deep learning architectures appear robust for spam e-mail detection. © 2020, Ismail Saritas. All rights reserved. en_US
dc.identifier.citationcount 7
dc.identifier.doi 10.18201/ijisae.2020466316
dc.identifier.endpage 227 en_US
dc.identifier.issn 2147-6799
dc.identifier.issue 4 en_US
dc.identifier.scopus 2-s2.0-85100155779
dc.identifier.startpage 221 en_US
dc.identifier.uri https://doi.org/10.18201/ijisae.2020466316
dc.identifier.uri https://hdl.handle.net/20.500.14411/3986
dc.identifier.volume 8 en_US
dc.institutionauthor Kurt, Zühal
dc.language.iso en en_US
dc.publisher Ismail Saritas en_US
dc.relation.ispartof International Journal of Intelligent Systems and Applications in Engineering en_US
dc.relation.publicationcategory Article - International Refereed Journal - Institutional Faculty Member en_US
dc.rights info:eu-repo/semantics/openAccess en_US
dc.scopus.citedbyCount 10
dc.subject LSTM en_US
dc.subject Mutual information en_US
dc.subject Odds ratio en_US
dc.subject RNN en_US
dc.subject Spam E-mail en_US
dc.title Recurrent Neural Networks for Spam E-Mail Classification on an Agglutinative Language en_US
dc.type Article en_US
dspace.entity.type Publication
relation.isAuthorOfPublication c1644357-fb5e-46b5-be18-1dd9b8e84e2e
relation.isAuthorOfPublication.latestForDiscovery c1644357-fb5e-46b5-be18-1dd9b8e84e2e
relation.isOrgUnitOfPublication e0809e2c-77a7-4f04-9cb0-4bccec9395fa
relation.isOrgUnitOfPublication.latestForDiscovery e0809e2c-77a7-4f04-9cb0-4bccec9395fa
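
The first two steps named in the abstract (MI-based term selection, then bag-of-words vectors) can be illustrated with a minimal sketch. It uses the standard textbook definition of mutual information between term presence and class label, which may differ from the exact MI and WMI formulations in the article. The function names and the toy ASCII tokens are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def mutual_information(docs, labels):
    """Return {term: MI(term presence; class)} in bits.

    docs is a list of token lists, labels the matching class labels.
    Assumption: standard MI over document-level term presence/absence.
    """
    n = len(docs)
    class_counts = Counter(labels)
    term_counts = Counter()   # term -> number of docs containing it
    term_class = Counter()    # (term, label) -> number of docs of that class containing it
    for tokens, label in zip(docs, labels):
        for term in set(tokens):
            term_counts[term] += 1
            term_class[(term, label)] += 1

    scores = {}
    for term, n_t in term_counts.items():
        mi = 0.0
        for label, n_c in class_counts.items():
            n_tc = term_class[(term, label)]
            # Sum over the two events: term present, term absent.
            for joint, marginal in ((n_tc, n_t), (n_c - n_tc, n - n_t)):
                if joint > 0:
                    mi += (joint / n) * math.log2(joint * n / (marginal * n_c))
        scores[term] = mi
    return scores


def select_top_terms(scores, k):
    """Keep the k highest-scoring terms; ties are broken alphabetically."""
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:k]


def bag_of_words(tokens, vocabulary):
    """Count how often each selected term occurs in one document."""
    counts = Counter(tokens)
    return [counts[term] for term in vocabulary]


# Toy, made-up corpus (not data from the article).
docs = [
    ["bedava", "kazan"],
    ["bedava", "tikla"],
    ["toplanti", "yarin"],
    ["toplanti", "rapor"],
]
labels = ["spam", "spam", "legit", "legit"]

scores = mutual_information(docs, labels)
vocabulary = select_top_terms(scores, k=2)
vector = bag_of_words(["bedava", "bedava", "rapor"], vocabulary)
```

The resulting fixed-length count vectors are the kind of input that a classifier such as an ANN or LSTM would then consume. The classifier stage is omitted here because the record gives no architectural details.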
