Improving Word Embedding Quality With Innovative Automated Approaches To Hyperparameters
| dc.contributor.author | Yildiz, Beytullah | |
| dc.contributor.author | Yıldız, Beytullah | |
| dc.contributor.author | Tezgider, Murat | |
| dc.contributor.other | Software Engineering | |
| dc.contributor.other | 06. School Of Engineering | |
| dc.contributor.other | 01. Atılım University | |
| dc.date.accessioned | 2024-07-05T15:18:36Z | |
| dc.date.available | 2024-07-05T15:18:36Z | |
| dc.date.issued | 2021 | |
| dc.description | YILDIZ, Beytullah/0000-0001-7664-5145 | en_US |
| dc.description.abstract | Deep learning has had a major impact in many areas, driven largely by big data and significant advances in hardware. Recent advances in deep learning have led to significant improvements in text analysis and classification, and progress in the quality of word representations is an important factor in these improvements. In this study, we aimed to improve word2vec word representations, also called embeddings, by automatically optimizing hyperparameters. The minimum word count, vector size, window size, number of negative samples, and number of iterations were tuned to improve the word embeddings. We introduce two approaches for setting hyperparameters that are faster than grid search and random search. Word embeddings were created from documents totaling approximately 300 million words. We measured embedding quality using a deep learning classification model on documents from 10 different classes. Optimizing the hyperparameter values alone increased classification success by 9%. In addition, we demonstrate the benefits of our approaches by comparing the semantic and syntactic relations captured by word embeddings trained with default and optimized hyperparameters. | en_US |
| dc.identifier.doi | 10.1002/cpe.6091 | |
| dc.identifier.issn | 1532-0626 | |
| dc.identifier.issn | 1532-0634 | |
| dc.identifier.scopus | 2-s2.0-85100035562 | |
| dc.identifier.uri | https://doi.org/10.1002/cpe.6091 | |
| dc.identifier.uri | https://hdl.handle.net/20.500.14411/1873 | |
| dc.language.iso | en | en_US |
| dc.publisher | Wiley | en_US |
| dc.relation.ispartof | Concurrency and Computation: Practice and Experience | |
| dc.rights | info:eu-repo/semantics/closedAccess | en_US |
| dc.subject | deep learning | en_US |
| dc.subject | machine learning | en_US |
| dc.subject | text analysis | en_US |
| dc.subject | text classification | en_US |
| dc.subject | word embedding | en_US |
| dc.subject | word2vec | en_US |
| dc.title | Improving Word Embedding Quality With Innovative Automated Approaches To Hyperparameters | en_US |
| dc.type | Article | en_US |
| dspace.entity.type | Publication | |
| gdc.author.id | YILDIZ, Beytullah/0000-0001-7664-5145 | |
| gdc.author.institutional | Yıldız, Beytullah | |
| gdc.author.scopusid | 14632851900 | |
| gdc.author.scopusid | 57207471698 | |
| gdc.bip.impulseclass | C4 | |
| gdc.bip.influenceclass | C4 | |
| gdc.bip.popularityclass | C4 | |
| gdc.coar.access | metadata only access | |
| gdc.coar.type | text::journal::journal article | |
| gdc.description.department | Atılım University | en_US |
| gdc.description.departmenttemp | [Yildiz, Beytullah] Atilim Univ, Dept Software Engn, Sch Engn, Ankara, Turkey; [Tezgider, Murat] Firat Univ, Dept Comp Engn, Fac Engn, Elazig, Turkey | en_US |
| gdc.description.issue | 18 | en_US |
| gdc.description.publicationcategory | Article - International Refereed Journal - Institutional Faculty Member | en_US |
| gdc.description.scopusquality | Q2 | |
| gdc.description.volume | 33 | en_US |
| gdc.description.wosquality | Q3 | |
| gdc.identifier.openalex | W3124469651 | |
| gdc.identifier.wos | WOS:000609293400001 | |
| gdc.oaire.diamondjournal | false | |
| gdc.oaire.impulse | 18.0 | |
| gdc.oaire.influence | 4.083888E-9 | |
| gdc.oaire.isgreen | false | |
| gdc.oaire.popularity | 1.7017332E-8 | |
| gdc.oaire.publicfunded | false | |
| gdc.oaire.sciencefields | 0202 electrical engineering, electronic engineering, information engineering | |
| gdc.oaire.sciencefields | 02 engineering and technology | |
| gdc.oaire.sciencefields | 01 natural sciences | |
| gdc.oaire.sciencefields | 0105 earth and related environmental sciences | |
| gdc.openalex.fwci | 1.998 | |
| gdc.openalex.normalizedpercentile | 0.84 | |
| gdc.opencitations.count | 16 | |
| gdc.plumx.crossrefcites | 7 | |
| gdc.plumx.facebookshareslikecount | 2 | |
| gdc.plumx.mendeley | 17 | |
| gdc.plumx.scopuscites | 13 | |
| gdc.scopus.citedcount | 13 | |
| gdc.wos.citedcount | 9 | |
| relation.isAuthorOfPublication | 8eb144cb-95ff-4557-a99c-cd0ffa90749d | |
| relation.isAuthorOfPublication.latestForDiscovery | 8eb144cb-95ff-4557-a99c-cd0ffa90749d | |
| relation.isOrgUnitOfPublication | d86bbe4b-0f69-4303-a6de-c7ec0c515da5 | |
| relation.isOrgUnitOfPublication | 4abda634-67fd-417f-bee6-59c29fc99997 | |
| relation.isOrgUnitOfPublication | 50be38c5-40c4-4d5f-b8e6-463e9514c6dd | |
| relation.isOrgUnitOfPublication.latestForDiscovery | d86bbe4b-0f69-4303-a6de-c7ec0c515da5 |