Plagiarism Detection in Software Using Efficient String Matching
Date
2012
Abstract
String matching is the problem of finding one or more occurrences of a pattern string within another string or a body of text. It plays a vital role in plagiarism detection in software code, where similar programs must be identified within a large population of programs. String matching has also been used as a tool in software metrics, which measure the quality of the software development process. Many algorithms have been developed in recent years to solve the string matching problem. Among them, the Berry-Ravindran algorithm was found to be fairly efficient. The TVSBS and SSABS algorithms refine it further. However, none of these algorithms gives the best possible shift in the search phase. In this paper, we propose an algorithm that gives the best possible shift in the search phase and is faster than the previously known algorithms. In the worst case, it behaves like Berry-Ravindran. We further extend the algorithm to parameterized string matching, which makes it able to detect plagiarism in software code. © 2012 Springer-Verlag.
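For orientation, the Berry-Ravindran baseline mentioned in the abstract can be sketched as follows. This is a minimal illustration based on the commonly published description of that algorithm (the shift is chosen from the two text characters immediately to the right of the current window); it is not the improved algorithm proposed in the paper, whose details the abstract does not give. The function names `build_br_shift` and `br_search` are hypothetical, and a dictionary stands in for the usual two-dimensional shift table.

```python
def build_br_shift(pattern):
    """Return a function shift(a, b) giving the Berry-Ravindran shift.

    a and b are the two text characters just past the current window
    (None when they fall beyond the end of the text).
    """
    m = len(pattern)
    # For each adjacent pair pattern[i], pattern[i+1], aligning it with
    # (a, b) requires a shift of m - i. Later (rightmost) pairs overwrite
    # earlier ones, which keeps the smallest safe shift.
    pair_shift = {}
    for i in range(m - 1):
        pair_shift[(pattern[i], pattern[i + 1])] = m - i

    first, last = pattern[0], pattern[-1]

    def shift(a, b):
        if a == last:
            return 1                  # a lines up with the last pattern char
        if (a, b) in pair_shift:
            return pair_shift[(a, b)]  # (a, b) occurs inside the pattern
        if b == first:
            return m + 1              # b lines up with the first pattern char
        return m + 2                  # neither character can be used

    return shift


def br_search(text, pattern):
    """Return the start indices of all occurrences of pattern in text."""
    m, n = len(pattern), len(text)
    if m == 0:
        raise ValueError("pattern must be non-empty")
    shift = build_br_shift(pattern)
    matches = []
    j = 0
    while j <= n - m:
        if text[j:j + m] == pattern:
            matches.append(j)
        # Characters just past the window; None acts as an end-of-text sentinel.
        a = text[j + m] if j + m < n else None
        b = text[j + m + 1] if j + m + 1 < n else None
        j += shift(a, b)
    return matches


if __name__ == "__main__":
    print(br_search("abracadabra", "abra"))
```

The window comparison here uses a plain slice comparison for brevity; the shift computation is the part that characterizes the method. Because the shift never exceeds m + 2 and never skips an alignment in which the two look-ahead characters could match, overlapping occurrences are still reported.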
Description
Universidade Federal da Bahia (UFBA); Universidade Federal do Reconcavo da Bahia (UFRB); Universidade Estadual de Feira de Santana (UEFS); University of Perugia; University of Basilicata (UB)
Keywords
bad character shift, parameterized matching and RGF, plagiarism detection, string matching
Citation
1
Source
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) -- 12th International Conference on Computational Science and Its Applications, ICCSA 2012 -- 18 June 2012 through 21 June 2012 -- Salvador de Bahia -- 90945
Volume
7336 LNCS
Issue
PART 4
Start Page
147
End Page
156