Improving word embedding quality with innovative automated approaches to hyperparameters

dc.authorid: YILDIZ, Beytullah/0000-0001-7664-5145
dc.authorscopusid: 14632851900
dc.authorscopusid: 57207471698
dc.contributor.author: Yildiz, Beytullah
dc.contributor.author: Yıldız, Beytullah
dc.contributor.author: Tezgider, Murat
dc.contributor.other: Software Engineering
dc.date.accessioned: 2024-07-05T15:18:36Z
dc.date.available: 2024-07-05T15:18:36Z
dc.date.issued: 2021
dc.department: Atılım University [en_US]
dc.department-temp: [Yildiz, Beytullah] Atilim Univ, Dept Software Engn, Sch Engn, Ankara, Turkey; [Tezgider, Murat] Firat Univ, Dept Comp Engn, Fac Engn, Elazig, Turkey [en_US]
dc.description: YILDIZ, Beytullah/0000-0001-7664-5145 [en_US]
dc.description.abstract: Deep learning practices have a great impact in many areas. Big data and significant hardware developments are the main reasons behind deep learning success. Recent advances in deep learning have led to significant improvements in text analysis and classification. Progress in the quality of word representation is an important factor among these improvements. In this study, we aimed to develop word2vec word representation, also called embedding, by automatically optimizing hyperparameters. Minimum word count, vector size, window size, negative sample, and iteration number were used to improve word embedding. We introduce two approaches for setting hyperparameters that are faster than grid search and random search. Word embeddings were created using documents of approximately 300 million words. We measured the quality of word embedding using a deep learning classification model on documents of 10 different classes. It was observed that the optimization of the values of hyperparameters alone increased classification success by 9%. In addition, we demonstrate the benefits of our approaches by comparing the semantic and syntactic relations between word embedding using default and optimized hyperparameters. [en_US]
dc.identifier.citation: 6
dc.identifier.doi: 10.1002/cpe.6091
dc.identifier.issn: 1532-0626
dc.identifier.issn: 1532-0634
dc.identifier.issue: 18 [en_US]
dc.identifier.scopus: 2-s2.0-85100035562
dc.identifier.scopusquality: Q2
dc.identifier.uri: https://doi.org/10.1002/cpe.6091
dc.identifier.uri: https://hdl.handle.net/20.500.14411/1873
dc.identifier.volume: 33 [en_US]
dc.identifier.wos: WOS:000609293400001
dc.identifier.wosquality: Q3
dc.institutionauthor: Yıldız, Beytullah
dc.language.iso: en [en_US]
dc.publisher: Wiley [en_US]
dc.relation.publicationcategory: Article - International Refereed Journal - Institutional Faculty Member [en_US]
dc.rights: info:eu-repo/semantics/closedAccess [en_US]
dc.subject: deep learning [en_US]
dc.subject: machine learning [en_US]
dc.subject: text analysis [en_US]
dc.subject: text classification [en_US]
dc.subject: word embedding [en_US]
dc.subject: word2vec [en_US]
dc.title: Improving word embedding quality with innovative automated approaches to hyperparameters [en_US]
dc.type: Article [en_US]
dspace.entity.type: Publication
relation.isAuthorOfPublication: 8eb144cb-95ff-4557-a99c-cd0ffa90749d
relation.isAuthorOfPublication.latestForDiscovery: 8eb144cb-95ff-4557-a99c-cd0ffa90749d
relation.isOrgUnitOfPublication: d86bbe4b-0f69-4303-a6de-c7ec0c515da5
relation.isOrgUnitOfPublication.latestForDiscovery: d86bbe4b-0f69-4303-a6de-c7ec0c515da5