Predicting Software Functional Size Using Natural Language Processing: an Exploratory Case Study

Unlu, Huseyin; Tenekeci, Samet; Ciftci, Can; Oral, Ibrahim Baran; Atalay, Tunahan; Hacaloglu, Tuna; Demirors, Onur

dc.contributor.author	Unlu, Huseyin
dc.contributor.author	Tenekeci, Samet
dc.contributor.author	Ciftci, Can
dc.contributor.author	Oral, Ibrahim Baran
dc.contributor.author	Atalay, Tunahan
dc.contributor.author	Hacaloglu, Tuna
dc.contributor.author	Demirors, Onur
dc.date.accessioned	2025-04-07T18:52:47Z
dc.date.available	2025-04-07T18:52:47Z
dc.date.issued	2024
dc.description.abstract	Software Size Measurement (SSM) plays an essential role in software project management as it enables the acquisition of software size, which is the primary input for development effort and schedule estimation. However, many small and medium-sized companies cannot perform objective SSM and Software Effort Estimation (SEE) due to the lack of resources and an expert workforce. This results in inadequate estimates and projects exceeding the planned time and budget. Therefore, organizations need to perform objective SSM and SEE using minimal resources without an expert workforce. In this research, we conducted an exploratory case study to predict the functional size of software project requirements using state-of-the-art large language models (LLMs). For this aim, we fine-tuned BERT and BERT_SE with a set of user stories and their respective functional size in COSMIC Function Points (CFP). We gathered the user stories included in different project requirement documents. In total size prediction, we achieved 72.8% accuracy with BERT and 74.4% accuracy with BERT_SE. In data movement-based size prediction, we achieved 87.5% average accuracy with BERT and 88.1% average accuracy with BERT_SE. Although we use relatively small datasets in model training, these results are promising and hold significant value as they demonstrate the practical utility of language models in SSM.	en_US
dc.identifier.doi	10.1109/SEAA64295.2024.00036
dc.identifier.isbn	9798350380279
dc.identifier.isbn	9798350380262
dc.identifier.issn	2640-592X
dc.identifier.scopus	2-s2.0-85212703074
dc.identifier.uri	https://doi.org/10.1109/SEAA64295.2024.00036
dc.identifier.uri	https://hdl.handle.net/20.500.14411/10507
dc.language.iso	en	en_US
dc.publisher	IEEE	en_US
dc.relation.ispartof	50th Euromicro Conference on Software Engineering and Advanced Applications -- AUG 28-30, 2024 -- Paris, FRANCE	en_US
dc.relation.ispartofseries	Euromicro Conference on Software Engineering and Advanced Applications
dc.rights	info:eu-repo/semantics/closedAccess	en_US
dc.subject	Software Size Measurement	en_US
dc.subject	Natural Language Processing	en_US
dc.subject	COSMIC	en_US
dc.subject	BERT	en_US
dc.subject	Functional Size	en_US
dc.subject	Software Engineering	en_US
dc.subject	NLP	en_US
dc.title	Predicting Software Functional Size Using Natural Language Processing: an Exploratory Case Study	en_US
dc.type	Conference Object	en_US
dspace.entity.type	Publication
gdc.author.scopusid	57521977500
gdc.author.scopusid	57340107000
gdc.author.scopusid	59643575200
gdc.author.scopusid	59643575300
gdc.author.scopusid	59644134000
gdc.author.scopusid	56422190200
gdc.author.scopusid	59640759700
gdc.bip.impulseclass	C5
gdc.bip.influenceclass	C5
gdc.bip.popularityclass	C5
gdc.coar.access	metadata only access
gdc.coar.type	text::conference output
gdc.collaboration.industrial	true
gdc.description.department	Atılım University	en_US
gdc.description.departmenttemp	[Unlu, Huseyin; Tenekeci, Samet; Ciftci, Can; Oral, Ibrahim Baran; Atalay, Tunahan; Demirors, Onur] Izmir Inst Technol, Dept Comp Engn, Izmir, Turkiye; [Tenekeci, Samet; Demirors, Onur] Bilgi Grubu, Izmir, Turkiye; [Hacaloglu, Tuna] Atilim Univ, Dept Informat Syst Engn, Ankara, Turkiye; [Hacaloglu, Tuna] Ecole Technol Super, Dept Software & IT Engn, Montreal, PQ, Canada; [Musaoglu, Burcu] Siskon Software & Automat, Izmir, Turkiye	en_US
gdc.description.endpage	193	en_US
gdc.description.publicationcategory	Conference Item - International - Institutional Faculty Member	en_US
gdc.description.scopusquality	Q4
gdc.description.startpage	188	en_US
gdc.description.woscitationindex	Conference Proceedings Citation Index - Science
gdc.description.wosquality	N/A
gdc.identifier.openalex	W4405846425
gdc.identifier.wos	WOS:001413352200026
gdc.index.type	WoS
gdc.index.type	Scopus
gdc.oaire.diamondjournal	false
gdc.oaire.impulse	0.0
gdc.oaire.influence	2.5349236E-9
gdc.oaire.isgreen	false
gdc.oaire.popularity	2.4744335E-9
gdc.oaire.publicfunded	false
gdc.openalex.collaboration	International
gdc.openalex.fwci	3.05530053
gdc.openalex.normalizedpercentile	0.9
gdc.openalex.toppercent	TOP 10%
gdc.opencitations.count	0
gdc.plumx.mendeley	19
gdc.plumx.scopuscites	5
gdc.scopus.citedcount	5
gdc.virtual.author	Hacaloğlu, Tuna
gdc.wos.citedcount	3
relation.isAuthorOfPublication	d3ed58a9-ec7a-4537-bd73-68342f5537fe
relation.isAuthorOfPublication.latestForDiscovery	d3ed58a9-ec7a-4537-bd73-68342f5537fe
relation.isOrgUnitOfPublication	cf0fb36c-0500-438e-b4cc-ad1d4ef25579
relation.isOrgUnitOfPublication	4abda634-67fd-417f-bee6-59c29fc99997
relation.isOrgUnitOfPublication	50be38c5-40c4-4d5f-b8e6-463e9514c6dd
relation.isOrgUnitOfPublication.latestForDiscovery	cf0fb36c-0500-438e-b4cc-ad1d4ef25579
