Predicting Software Functional Size Using Natural Language Processing: An Exploratory Case Study

dc.contributor.author Unlu, Huseyin
dc.contributor.author Tenekeci, Samet
dc.contributor.author Ciftci, Can
dc.contributor.author Oral, Ibrahim Baran
dc.contributor.author Atalay, Tunahan
dc.contributor.author Hacaloglu, Tuna
dc.contributor.author Demirors, Onur
dc.date.accessioned 2025-04-07T18:52:47Z
dc.date.available 2025-04-07T18:52:47Z
dc.date.issued 2024
dc.description.abstract Software Size Measurement (SSM) plays an essential role in software project management because it yields the software size, which is the primary input for development effort and schedule estimation. However, many small and medium-sized companies cannot perform objective SSM and Software Effort Estimation (SEE) because they lack resources and an expert workforce. This results in inadequate estimates and in projects that exceed the planned time and budget. Organizations therefore need a way to perform objective SSM and SEE with minimal resources and without an expert workforce. In this research, we conducted an exploratory case study to predict the functional size of software project requirements using state-of-the-art large language models (LLMs). To this end, we fine-tuned BERT and BERT_SE on a set of user stories and their respective functional sizes in COSMIC Function Points (CFP). We gathered the user stories from different project requirement documents. In total size prediction, we achieved 72.8% accuracy with BERT and 74.4% accuracy with BERT_SE. In data movement-based size prediction, we achieved 87.5% average accuracy with BERT and 88.1% average accuracy with BERT_SE. Although we used relatively small datasets in model training, these results are promising because they demonstrate the practical utility of language models in SSM. en_US
dc.identifier.doi 10.1109/SEAA64295.2024.00036
dc.identifier.isbn 9798350380279
dc.identifier.isbn 9798350380262
dc.identifier.issn 2640-592X
dc.identifier.scopus 2-s2.0-85212703074
dc.identifier.uri https://doi.org/10.1109/SEAA64295.2024.00036
dc.identifier.uri https://hdl.handle.net/20.500.14411/10507
dc.language.iso en en_US
dc.publisher IEEE en_US
dc.relation.ispartof 50th Euromicro Conference on Software Engineering and Advanced Applications -- AUG 28-30, 2024 -- Paris, FRANCE en_US
dc.relation.ispartofseries Euromicro Conference on Software Engineering and Advanced Applications
dc.rights info:eu-repo/semantics/closedAccess en_US
dc.subject Software Size Measurement en_US
dc.subject Natural Language Processing en_US
dc.subject COSMIC en_US
dc.subject BERT en_US
dc.subject Functional Size en_US
dc.subject Software Engineering en_US
dc.subject NLP en_US
dc.title Predicting Software Functional Size Using Natural Language Processing: An Exploratory Case Study en_US
dc.type Conference Object en_US
dspace.entity.type Publication
gdc.author.scopusid 57521977500
gdc.author.scopusid 57340107000
gdc.author.scopusid 59643575200
gdc.author.scopusid 59643575300
gdc.author.scopusid 59644134000
gdc.author.scopusid 56422190200
gdc.author.scopusid 59640759700
gdc.bip.impulseclass C5
gdc.bip.influenceclass C5
gdc.bip.popularityclass C5
gdc.coar.access metadata only access
gdc.coar.type text::conference output
gdc.collaboration.industrial true
gdc.description.department Atılım University en_US
gdc.description.departmenttemp [Unlu, Huseyin; Tenekeci, Samet; Ciftci, Can; Oral, Ibrahim Baran; Atalay, Tunahan; Demirors, Onur] Izmir Inst Technol, Dept Comp Engn, Izmir, Turkiye; [Tenekeci, Samet; Demirors, Onur] Bilgi Grubu, Izmir, Turkiye; [Hacaloglu, Tuna] Atilim Univ, Dept Informat Syst Engn, Ankara, Turkiye; [Hacaloglu, Tuna] Ecole Technol Super, Dept Software & IT Engn, Montreal, PQ, Canada; [Musaoglu, Burcu] Siskon Software & Automat, Izmir, Turkiye en_US
gdc.description.endpage 193 en_US
gdc.description.publicationcategory Conference Item - International - Institutional Faculty Member en_US
gdc.description.scopusquality Q4
gdc.description.startpage 188 en_US
gdc.description.woscitationindex Conference Proceedings Citation Index - Science
gdc.description.wosquality N/A
gdc.identifier.openalex W4405846425
gdc.identifier.wos WOS:001413352200026
gdc.index.type WoS
gdc.index.type Scopus
gdc.oaire.diamondjournal false
gdc.oaire.impulse 0.0
gdc.oaire.influence 2.5349236E-9
gdc.oaire.isgreen false
gdc.oaire.popularity 2.4744335E-9
gdc.oaire.publicfunded false
gdc.openalex.collaboration International
gdc.openalex.fwci 3.05530053
gdc.openalex.normalizedpercentile 0.9
gdc.openalex.toppercent TOP 10%
gdc.opencitations.count 0
gdc.plumx.mendeley 19
gdc.plumx.scopuscites 5
gdc.scopus.citedcount 5
gdc.virtual.author Hacaloğlu, Tuna
gdc.wos.citedcount 3
relation.isAuthorOfPublication d3ed58a9-ec7a-4537-bd73-68342f5537fe
relation.isAuthorOfPublication.latestForDiscovery d3ed58a9-ec7a-4537-bd73-68342f5537fe
relation.isOrgUnitOfPublication cf0fb36c-0500-438e-b4cc-ad1d4ef25579
relation.isOrgUnitOfPublication 4abda634-67fd-417f-bee6-59c29fc99997
relation.isOrgUnitOfPublication 50be38c5-40c4-4d5f-b8e6-463e9514c6dd
relation.isOrgUnitOfPublication.latestForDiscovery cf0fb36c-0500-438e-b4cc-ad1d4ef25579
