An Empirical Analysis of the Effectiveness of Software Metrics and Fault Prediction Model for Identifying Faulty Classes

Kumar, Lov; Misra, Sanjay; Rath, Santanu Ku.

Date

2017

Authors

Kumar, Lov

Misra, Sanjay

Rath, Santanu Ku.

Publisher

Elsevier

Green Open Access

No

Publicly Funded

No

Impulse

Top 10%

Influence

Top 10%

Popularity

Top 10%

Abstract

Software fault prediction models are used to predict faulty modules at a very early stage of the software development life cycle. Predicting fault proneness using source code metrics is an area that has attracted the attention of several researchers. The performance of a fault-proneness model depends on the source code metrics used as its input. In this work, we propose a framework to validate source code metrics and identify a suitable subset, with the aim of removing irrelevant features and improving the performance of the fault prediction model. Initially, we applied a t-test analysis and a univariate logistic regression analysis to each source code metric to evaluate its potential for predicting fault proneness. Next, we performed a correlation analysis and multivariate linear regression with stepwise forward selection to find the right set of source code metrics for fault prediction. The resulting set of source code metrics is used as the input to a fault prediction model built using a neural network with five different training algorithms and three different ensemble methods. The effectiveness of the developed fault prediction models is evaluated using a proposed cost evaluation framework. We performed experiments on fifty-six open source Java projects. The experimental results reveal that the model developed with the set of source code metrics selected by the suggested validation framework as input achieves better results than those obtained with all other metrics. The experimental results also demonstrate that the fault prediction model is best suited to projects whose percentage of faulty classes is below a threshold value that depends on fault identification efficiency (low: 48.89%, median: 39.26%, and high: 27.86%).
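The first two screening steps named in the abstract (a per-metric t-test, then a correlation analysis) can be illustrated with a minimal sketch. This is not the authors' implementation. The Welch variant of the t-test, the cut-offs `t_cut` and `r_cut`, the function names, and the toy values for the metrics `wmc`, `cbo`, and `dit` are all assumptions made for illustration. The logistic regression and stepwise selection steps are omitted.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t-statistic for two independent samples (assumed variant)."""
    va, vb = variance(a), variance(b)
    return (mean(a) - mean(b)) / math.sqrt(va / len(a) + vb / len(b))


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sx = math.sqrt(sum((xi - mx) ** 2 for xi in x))
    sy = math.sqrt(sum((yi - my) ** 2 for yi in y))
    return cov / (sx * sy)


def screen_metrics(data, faulty, t_cut=2.0, r_cut=0.8):
    """Keep metrics whose means differ between faulty and clean classes,
    then drop any metric highly correlated with a stronger kept metric.
    Both cut-offs are illustrative assumptions, not values from the paper."""
    scores = {}
    for name, values in data.items():
        bad = [v for v, f in zip(values, faulty) if f]
        good = [v for v, f in zip(values, faulty) if not f]
        t = welch_t(bad, good)
        if abs(t) > t_cut:
            scores[name] = abs(t)
    kept = []
    # Visit metrics from strongest to weakest separation.
    for name in sorted(scores, key=scores.get, reverse=True):
        if all(abs(pearson(data[name], data[k])) < r_cut for k in kept):
            kept.append(name)
    return kept


# Hypothetical toy data: eight classes, the first four marked faulty.
faulty = [1, 1, 1, 1, 0, 0, 0, 0]
metrics = {
    "wmc": [30, 28, 35, 32, 10, 12, 9, 11],
    "cbo": [15, 14, 18, 16, 5, 6, 4, 6],
    "dit": [2, 3, 1, 2, 2, 1, 3, 2],
}
selected = screen_metrics(metrics, faulty)
print(selected)
```

In this toy data `dit` has identical means in both groups and fails the t-test screen, while `cbo` passes it but is removed because it is nearly collinear with the stronger metric `wmc`.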

Description

Kumar, Lov/0000-0002-0123-7822; Misra, Sanjay/0000-0002-3556-9331; Rath, Santanu/0000-0001-5641-8199

ORCID

Kumar, Lov

Misra, Sanjay

Rath, Santanu

Keywords

Feature selection techniques, Artificial neural network, Ensemble method, Source code metrics, Cost analysis framework

Fields of Science

0202 electrical engineering, electronic engineering, information engineering, 02 engineering and technology

WoS Q

Q2

OpenCitations Citation Count

54

Source

Computer Standards & Interfaces

Volume

53

Start Page

1

End Page

32

URI

https://doi.org/10.1016/j.csi.2017.02.003
https://hdl.handle.net/20.500.14411/2855

Collections

WoS
Scopus

PlumX Metrics

Citations

CrossRef : 54

Scopus : 62

Captures

Mendeley Readers : 67

SCOPUS™ Citations

62

checked on Feb 15, 2026

Web of Science™ Citations

47

checked on Feb 15, 2026

Page Views

2

checked on Feb 15, 2026

OpenAlex FWCI

12.87579188