Big Data Software Engineering: Analysis of Knowledge Domains and Skill Sets Using LDA-Based Topic Modeling

dc.contributor.author Gurcan, Fatih
dc.contributor.author Cagiltay, Nergiz Ercil
dc.date.accessioned 2024-07-05T15:28:34Z
dc.date.available 2024-07-05T15:28:34Z
dc.date.issued 2019
dc.description GURCAN, Fatih/0000-0001-9915-6686; Cagiltay, Nergiz/0000-0003-0875-9276 en_US
dc.description.abstract Software engineering is a data-driven discipline and an integral part of data science. The introduction of big data systems has led to a great transformation in the architecture, methodologies, knowledge domains, and skills related to software engineering. Accordingly, education programs are now required to adapt themselves to up-to-date developments by first identifying the competencies concerning big data software engineering to meet the industrial needs and follow the latest trends. This paper aims to reveal the knowledge domains and skill sets required for big data software engineering and develop a taxonomy by mapping these competencies. A semi-automatic methodology is proposed for the semantic analysis of the textual contents of online job advertisements related to big data software engineering. This methodology uses latent Dirichlet allocation (LDA), a probabilistic topic-modeling technique, to discover the hidden semantic structures in a given textual corpus. The output of this paper is a systematic competency map comprising the essential knowledge domains, skills, and tools for big data software engineering. The findings of this paper are expected to help evaluate and improve IT professionals' vocational knowledge and skills, identify professional roles and competencies in personnel recruitment processes of companies, and meet the skill requirements of the industry through software engineering education programs. Additionally, the proposed model can be extended to blogs, social networks, forums, and other online communities to allow automatic identification of emerging trends and generate contextual tags. en_US
dc.identifier.doi 10.1109/ACCESS.2019.2924075
dc.identifier.issn 2169-3536
dc.identifier.scopus 2-s2.0-85068646074
dc.identifier.uri https://doi.org/10.1109/ACCESS.2019.2924075
dc.identifier.uri https://hdl.handle.net/20.500.14411/2819
dc.language.iso en en_US
dc.publisher IEEE-Inst Electrical Electronics Engineers Inc en_US
dc.relation.ispartof IEEE Access
dc.rights info:eu-repo/semantics/openAccess en_US
dc.subject Big data software engineering en_US
dc.subject competency map en_US
dc.subject knowledge domains and skill sets en_US
dc.subject topic modeling en_US
dc.subject latent Dirichlet allocation en_US
dc.title Big Data Software Engineering: Analysis of Knowledge Domains and Skill Sets Using LDA-Based Topic Modeling en_US
dc.type Article en_US
dspace.entity.type Publication
gdc.author.id GURCAN, Fatih/0000-0001-9915-6686
gdc.author.id Cagiltay, Nergiz/0000-0003-0875-9276
gdc.author.scopusid 57194776706
gdc.author.scopusid 16237826800
gdc.author.wosid GURCAN, Fatih/AAJ-7503-2021
gdc.author.wosid Cagiltay, Nergiz/O-3082-2019
gdc.bip.impulseclass C3
gdc.bip.influenceclass C4
gdc.bip.popularityclass C3
gdc.coar.access open access
gdc.coar.type text::journal::journal article
gdc.collaboration.industrial false
gdc.description.department Atılım University en_US
gdc.description.departmenttemp [Gurcan, Fatih] Karadeniz Tech Univ, Fac Engn, Dept Comp Engn, TR-61080 Trabzon, Turkey; [Cagiltay, Nergiz Ercil] Atilim Univ, Fac Engn, Dept Software Engn, TR-06830 Ankara, Turkey en_US
gdc.description.endpage 82552 en_US
gdc.description.publicationcategory Article - International Refereed Journal - Institutional Faculty Member en_US
gdc.description.scopusquality Q1
gdc.description.startpage 82541 en_US
gdc.description.volume 7 en_US
gdc.description.wosquality Q2
gdc.identifier.openalex W2949361621
gdc.identifier.wos WOS:000475354200001
gdc.index.type WoS
gdc.index.type Scopus
gdc.oaire.accesstype GOLD
gdc.oaire.diamondjournal false
gdc.oaire.impulse 43.0
gdc.oaire.influence 7.1972215E-9
gdc.oaire.isgreen false
gdc.oaire.keywords competency map
gdc.oaire.keywords knowledge domains and skill sets
gdc.oaire.keywords topic modeling
gdc.oaire.keywords latent Dirichlet allocation
gdc.oaire.keywords Electrical engineering. Electronics. Nuclear engineering
gdc.oaire.keywords Big data software engineering
gdc.oaire.keywords TK1-9971
gdc.oaire.popularity 5.3378326E-8
gdc.oaire.publicfunded false
gdc.oaire.sciencefields 0202 electrical engineering, electronic engineering, information engineering
gdc.oaire.sciencefields 02 engineering and technology
gdc.openalex.collaboration National
gdc.openalex.fwci 17.2202
gdc.openalex.normalizedpercentile 0.99
gdc.openalex.toppercent TOP 1%
gdc.opencitations.count 103
gdc.plumx.crossrefcites 22
gdc.plumx.mendeley 243
gdc.plumx.scopuscites 132
gdc.scopus.citedcount 132
gdc.virtual.author Çağıltay, Nergiz
gdc.wos.citedcount 90
relation.isAuthorOfPublication c99221fa-e0c9-4b73-9f64-54919fcd3c58
relation.isAuthorOfPublication.latestForDiscovery c99221fa-e0c9-4b73-9f64-54919fcd3c58
relation.isOrgUnitOfPublication d86bbe4b-0f69-4303-a6de-c7ec0c515da5
relation.isOrgUnitOfPublication 4abda634-67fd-417f-bee6-59c29fc99997
relation.isOrgUnitOfPublication 50be38c5-40c4-4d5f-b8e6-463e9514c6dd
relation.isOrgUnitOfPublication.latestForDiscovery d86bbe4b-0f69-4303-a6de-c7ec0c515da5
