Extractive Text Summarization for Turkish: Implementation of Tf-Idf and Pagerank Algorithms

dc.authorscopusid 57895357200
dc.authorscopusid 24315330000
dc.contributor.author Akülker,E.
dc.contributor.author Turhan,Ç.
dc.contributor.other Software Engineering
dc.date.accessioned 2024-07-05T15:50:21Z
dc.date.available 2024-07-05T15:50:21Z
dc.date.issued 2023
dc.department Atılım University en_US
dc.department-temp Akülker E., Havelsan, Ankara, Turkey; Turhan Ç., Department of Software Engineering, Atılım University, Ankara, Turkey en_US
dc.description.abstract Due to the massive amount of information available on the web, reaching the desired content has become increasingly difficult. Automatic text summarization helps to solve this problem by reducing document size while preserving core information. In this study, two extractive single-document automatic text summarization systems for Turkish are presented: one implements the statistical TF-IDF algorithm, and the other combines TF-IDF with the graph-based PageRank algorithm. The study aims to reveal the usability and effectiveness of these algorithms for Turkish documents. The results of the TF-IDF implementation and the hybrid approach are compared using the co-selection measures precision, recall, and F-score. In the evaluation phase, the system-generated summaries are categorized and tested based on their word counts and predetermined thresholds, and compared against human-generated summaries. The results indicate that the hybrid system performs better than the TF-IDF system even at lower thresholds, and that both systems tend to achieve higher average F-scores for summaries generated at higher thresholds. © 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG. en_US
dc.identifier.citationcount 0
dc.identifier.doi 10.1007/978-3-031-16075-2_51
dc.identifier.endpage 704 en_US
dc.identifier.isbn 978-303116074-5
dc.identifier.issn 2367-3370
dc.identifier.scopus 2-s2.0-85138243155
dc.identifier.scopusquality Q4
dc.identifier.startpage 688 en_US
dc.identifier.uri https://doi.org/10.1007/978-3-031-16075-2_51
dc.identifier.uri https://hdl.handle.net/20.500.14411/4136
dc.identifier.volume 544 LNNS en_US
dc.institutionauthor Turhan, Çiğdem
dc.language.iso en en_US
dc.publisher Springer Science and Business Media Deutschland GmbH en_US
dc.relation.ispartof Lecture Notes in Networks and Systems -- Intelligent Systems Conference, IntelliSys 2022 -- 1 September 2022 through 2 September 2022 -- Virtual, Online -- 282539 en_US
dc.relation.publicationcategory Conference Item - International - Institutional Faculty Member en_US
dc.rights info:eu-repo/semantics/closedAccess en_US
dc.scopus.citedbyCount 1
dc.subject PageRank en_US
dc.subject Text summarization en_US
dc.subject TF-IDF en_US
dc.subject Turkish en_US
dc.title Extractive Text Summarization for Turkish: Implementation of Tf-Idf and Pagerank Algorithms en_US
dc.type Conference Object en_US
dspace.entity.type Publication
relation.isAuthorOfPublication df768b22-7cc0-4650-882f-5af552c7a5f2
relation.isAuthorOfPublication.latestForDiscovery df768b22-7cc0-4650-882f-5af552c7a5f2
relation.isOrgUnitOfPublication d86bbe4b-0f69-4303-a6de-c7ec0c515da5
relation.isOrgUnitOfPublication.latestForDiscovery d86bbe4b-0f69-4303-a6de-c7ec0c515da5
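
The abstract names the co-selection measures precision, recall, and F-score as the basis for comparing system-generated and human-generated summaries. The following is a minimal sketch of how such measures are commonly computed, assuming each summary is represented as a set of selected sentence indices and that the F-score is the balanced F1 (the record does not state which variant was used). The function name `co_selection_scores` and the sample indices are hypothetical and are not taken from the paper.

```python
def co_selection_scores(system_sentences, reference_sentences):
    """Return (precision, recall, f_score) for two extractive summaries.

    Both arguments are iterables of sentence identifiers (here: indices
    into the source document). This representation is an assumption.
    """
    system = set(system_sentences)
    reference = set(reference_sentences)

    # Sentences chosen by both the system and the human summarizer.
    overlap = len(system & reference)

    # Guard against empty summaries to avoid division by zero.
    precision = overlap / len(system) if system else 0.0
    recall = overlap / len(reference) if reference else 0.0

    # Balanced F-score (harmonic mean of precision and recall); assumed variant.
    if precision + recall == 0:
        f_score = 0.0
    else:
        f_score = 2 * precision * recall / (precision + recall)

    return precision, recall, f_score


if __name__ == "__main__":
    # Hypothetical example: the system picked sentences 0, 2, 5;
    # the human summary contains sentences 0, 1, 2, 3.
    p, r, f = co_selection_scores([0, 2, 5], [0, 1, 2, 3])
    print(f"precision={p:.3f} recall={r:.3f} f_score={f:.3f}")
```

In this hypothetical example two of the three system sentences also appear in the four-sentence human summary, so precision is 2/3 and recall is 1/2.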
