Big Data Software Engineering: Analysis of Knowledge Domains and Skill Sets Using LDA-Based Topic Modeling

Date

2019

Publisher

IEEE-Inst Electrical Electronics Engineers Inc

Open Access Color

GOLD

Green Open Access

No

Publicly Funded

No

Impulse

Top 1%

Influence

Top 10%

Popularity

Top 1%

Abstract

Software engineering is a data-driven discipline and an integral part of data science. The introduction of big data systems has transformed the architecture, methodologies, knowledge domains, and skills associated with software engineering. Accordingly, education programs must adapt to these developments by first identifying the competencies that big data software engineering requires, so that they can meet industrial needs and follow the latest trends. This paper aims to reveal the knowledge domains and skill sets required for big data software engineering and to develop a taxonomy by mapping these competencies. A semi-automatic methodology is proposed for the semantic analysis of the textual content of online job advertisements related to big data software engineering. This methodology uses latent Dirichlet allocation (LDA), a probabilistic topic-modeling technique, to discover the hidden semantic structures in a given textual corpus. The output of this paper is a systematic competency map comprising the essential knowledge domains, skills, and tools for big data software engineering. The findings are expected to help evaluate and improve IT professionals' vocational knowledge and skills, identify professional roles and competencies in companies' personnel recruitment processes, and meet the industry's skill requirements through software engineering education programs. Additionally, the proposed model can be extended to blogs, social networks, forums, and other online communities to allow automatic identification of emerging trends and generation of contextual tags.
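
To make the LDA step in the abstract concrete, the following is a minimal sketch of topic discovery on a handful of job-advertisement-like texts. It is not the authors' implementation: the abstract does not say which software, inference method, or settings were used. Collapsed Gibbs sampling is chosen here only because it fits in a few lines of standard-library Python. The toy corpus `ADS`, the function names `fit_lda` and `top_words`, and the values of `num_topics`, `alpha`, `beta`, and `iterations` are all assumptions made for illustration.

```python
import random

# Hypothetical toy corpus standing in for job advertisements (not data from the paper).
ADS = [
    "hadoop spark scala cluster pipeline",
    "spark kafka streaming pipeline scala",
    "python statistics machine learning model",
    "machine learning model python regression",
    "hadoop hive cluster storage hdfs",
    "statistics regression model analysis python",
]


def fit_lda(docs, num_topics=2, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Fit LDA with collapsed Gibbs sampling; returns (theta, phi, vocab)."""
    rng = random.Random(seed)
    tokens = [doc.split() for doc in docs]
    vocab = sorted({w for doc in tokens for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    V, K = len(vocab), num_topics

    doc_topic = [[0] * K for _ in tokens]     # topic counts per document
    topic_word = [[0] * V for _ in range(K)]  # word counts per topic
    topic_total = [0] * K                     # total tokens per topic
    assignments = []

    # Random initial topic assignment for every token.
    for d, doc in enumerate(tokens):
        z_doc = []
        for w in doc:
            k = rng.randrange(K)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(tokens):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Conditional p(z = t | all other assignments), up to a constant.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta) / (topic_total[t] + V * beta)
                    for t in range(K)
                ]
                k = rng.choices(range(K), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    # Smoothed estimates of the document-topic and topic-word distributions.
    theta = [
        [(doc_topic[d][t] + alpha) / (len(tokens[d]) + K * alpha) for t in range(K)]
        for d in range(len(tokens))
    ]
    phi = [
        [(topic_word[t][v] + beta) / (topic_total[t] + V * beta) for v in range(V)]
        for t in range(K)
    ]
    return theta, phi, vocab


def top_words(phi, vocab, n=4):
    """Return the n most probable words of each topic."""
    return [
        [vocab[v] for v in sorted(range(len(vocab)), key=lambda v: -row[v])[:n]]
        for row in phi
    ]


theta, phi, vocab = fit_lda(ADS)
for k, words in enumerate(top_words(phi, vocab)):
    print(f"topic {k}: {', '.join(words)}")
```

Each row of `theta` is one advertisement's mixture over topics, and each row of `phi` is one topic's distribution over words. In a semi-automatic workflow such as the one the abstract describes, a human analyst would presumably inspect the top words of each topic and assign a label to it. On a real corpus, a maintained library implementation with proper text preprocessing would be a more practical choice than this sketch.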

Description

Gurcan, Fatih/0000-0001-9915-6686; Cagiltay, Nergiz/0000-0003-0875-9276

Keywords

Big data software engineering, competency map, knowledge domains and skill sets, topic modeling, latent Dirichlet allocation, Electrical engineering. Electronics. Nuclear engineering, TK1-9971

Fields of Science

0202 electrical engineering, electronic engineering, information engineering, 02 engineering and technology

WoS Q

Q2

Scopus Q

Q1
OpenCitations Citation Count
103

Source

IEEE Access

Volume

7

Start Page

82541

End Page

82552

PlumX Metrics
Citations

CrossRef: 22

Scopus: 127

Captures

Mendeley Readers: 243

SCOPUS™ Citations

132

checked on Feb 14, 2026

Web of Science™ Citations

90

checked on Feb 14, 2026

Page Views

1

checked on Feb 14, 2026

OpenAlex FWCI
18.56004918
