Syllable-Based Text Compression: a Language Case Study

Date

2016

Publisher

Springer Heidelberg

Organizational Unit
Computer Engineering
(1998)

Abstract

Text compression has been studied widely, and many algorithms have been proposed. Compressing text by exploiting the syllabic structure of words in syllable-based languages has emerged as a further approach. Because the existing "universal" syllabification algorithms are ineffective, an algorithm for extracting syllables from words should be designed around the structure of the specific language. This work reviews several syllable-based compression methods proposed by different authors, including the methodologies they use to achieve text compression. Finally, an algorithm for syllable extraction from words in the Yoruba language is presented and compared with four universal algorithms; it records the best result (100% accuracy) among the five. The significance of this result is that a dictionary of common syllables does not need to be created to achieve syllable-based text compression for the Yoruba language.
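To make the idea of language-specific syllable extraction concrete, the following is a minimal Python sketch of a rule-based syllabifier. It is not the algorithm presented in the paper, which this record does not reproduce. It assumes the commonly described Yoruba syllable shapes (a vowel, a consonant plus vowel, or a syllabic nasal), treats "gb" as a single consonant, and treats a plain "n" that is not followed by a vowel as nasalization of the preceding vowel. The function names `split_letters` and `syllabify` are hypothetical, and the input is assumed to be a single orthographic word.

```python
import unicodedata

# Base vowels; under-dotted and tone-marked forms decompose to these in NFD.
VOWELS = set("aeiou")


def split_letters(word):
    """Group each base character with its combining marks (tones, under-dots)."""
    letters = []
    for ch in unicodedata.normalize("NFD", word.lower()):
        if unicodedata.combining(ch) and letters:
            letters[-1] += ch
        else:
            letters.append(ch)
    return letters


def is_vowel(letter):
    return letter[0] in VOWELS


def syllabify(word):
    """Split a word into V, CV, and syllabic-nasal units (illustrative rules only)."""
    letters = split_letters(word)
    syllables = []
    i, n = 0, len(letters)
    while i < n:
        start = i
        if not is_vowel(letters[i]):
            # Consonant onset; the digraph "gb" counts as one consonant.
            if letters[i] == "g" and i + 1 < n and letters[i + 1] == "b":
                i += 2
            else:
                i += 1
            if i >= n or not is_vowel(letters[i]):
                # No vowel follows: assume a syllabic nasal (or a stray consonant).
                syllables.append("".join(letters[start:i]))
                continue
        i += 1  # consume the vowel nucleus
        # Assumption: a plain "n" not followed by a vowel marks a nasal vowel.
        if i < n and letters[i] == "n" and (i + 1 == n or not is_vowel(letters[i + 1])):
            i += 1
        syllables.append("".join(letters[start:i]))
    return [unicodedata.normalize("NFC", s) for s in syllables]


print(syllabify("gbangba"))  # ['gban', 'gba']
print(syllabify("okan"))     # ['o', 'kan']
```

Because rules of this kind derive the syllables directly from the spelling, a compressor built on them would not need a precomputed dictionary of common syllables, which is the property the abstract highlights for Yoruba.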

Description

Misra, Sanjay/0000-0002-3556-9331; Adubi, Stephen/0000-0003-1950-3727

Keywords

Syllables, Syllable-based compression, Text compression, Syllabification

WoS Q

Q2

Scopus Q

Q1

Volume

41

Issue

8

Start Page

3089

End Page

3097