Improving Word Embedding Quality With Innovative Automated Approaches To Hyperparameters

dc.contributor.author Yildiz, Beytullah
dc.contributor.author Yıldız, Beytullah
dc.contributor.author Tezgider, Murat
dc.contributor.other Software Engineering
dc.date.accessioned 2024-07-05T15:18:36Z
dc.date.available 2024-07-05T15:18:36Z
dc.date.issued 2021
dc.description YILDIZ, Beytullah/0000-0001-7664-5145 en_US
dc.description.abstract Deep learning practices have had a great impact in many areas. Big data and significant hardware developments are the main reasons behind the success of deep learning. Recent advances in deep learning have led to significant improvements in text analysis and classification. Progress in the quality of word representation is an important factor among these improvements. In this study, we aimed to improve the word2vec word representation, also called embedding, by automatically optimizing hyperparameters. Minimum word count, vector size, window size, number of negative samples, and number of iterations were tuned to improve the word embedding. We introduce two approaches for setting hyperparameters that are faster than grid search and random search. Word embeddings were created using documents totaling approximately 300 million words. We measured the quality of the word embeddings using a deep learning classification model on documents from 10 different classes. We observed that optimizing the hyperparameter values alone increased classification success by 9%. In addition, we demonstrate the benefits of our approaches by comparing the semantic and syntactic relations captured by word embeddings trained with default and optimized hyperparameters. en_US
dc.identifier.doi 10.1002/cpe.6091
dc.identifier.issn 1532-0626
dc.identifier.issn 1532-0634
dc.identifier.scopus 2-s2.0-85100035562
dc.identifier.uri https://doi.org/10.1002/cpe.6091
dc.identifier.uri https://hdl.handle.net/20.500.14411/1873
dc.language.iso en en_US
dc.publisher Wiley en_US
dc.rights info:eu-repo/semantics/closedAccess en_US
dc.subject deep learning en_US
dc.subject machine learning en_US
dc.subject text analysis en_US
dc.subject text classification en_US
dc.subject word embedding en_US
dc.subject word2vec en_US
dc.title Improving Word Embedding Quality With Innovative Automated Approaches To Hyperparameters en_US
dc.type Article en_US
dspace.entity.type Publication
gdc.author.id YILDIZ, Beytullah/0000-0001-7664-5145
gdc.author.institutional Yıldız, Beytullah
gdc.author.scopusid 14632851900
gdc.author.scopusid 57207471698
gdc.coar.access metadata only access
gdc.coar.type text::journal::journal article
gdc.description.department Atılım University en_US
gdc.description.departmenttemp [Yildiz, Beytullah] Atilim Univ, Dept Software Engn, Sch Engn, Ankara, Turkey; [Tezgider, Murat] Firat Univ, Dept Comp Engn, Fac Engn, Elazig, Turkey en_US
gdc.description.issue 18 en_US
gdc.description.publicationcategory Article - International Refereed Journal - Institutional Faculty Member en_US
gdc.description.scopusquality Q2
gdc.description.volume 33 en_US
gdc.description.wosquality Q3
gdc.identifier.wos WOS:000609293400001
gdc.scopus.citedcount 13
gdc.wos.citedcount 8
relation.isAuthorOfPublication 8eb144cb-95ff-4557-a99c-cd0ffa90749d
relation.isAuthorOfPublication.latestForDiscovery 8eb144cb-95ff-4557-a99c-cd0ffa90749d
relation.isOrgUnitOfPublication d86bbe4b-0f69-4303-a6de-c7ec0c515da5
relation.isOrgUnitOfPublication.latestForDiscovery d86bbe4b-0f69-4303-a6de-c7ec0c515da5