Prompting for Security: A Cross-Model Evaluation of Code Generation in LLMs

Date
2025
Publisher
Institute of Electrical and Electronics Engineers Inc.
Green Open Access
No
Publicly Funded
No
Abstract
The security of AI-generated code has become a growing concern as Large Language Models (LLMs) like GPT-4, Gemini, DeepSeek, and LLaMA are increasingly integrated into software development pipelines. While prior research has primarily focused on GPT-family models, the security performance of newer open models under structured prompting remains underexplored. This study evaluates the ability of modern LLMs to generate secure code using six established prompting strategies across 150 Python tasks from the LLMSecEval benchmark. Generated code was assessed using two static analysis tools (Bandit and CodeQL) to detect Common Weakness Enumeration (CWE) vulnerabilities. Findings showed that Recursive Criticism and Improvement (RCI) prompting significantly improves security outcomes across all models. Notably, LLaMA produced over 15,800 lines of vulnerability-free code under RCI. Gemini and DeepSeek also showed notable improvements under guided prompting. From a tool-specific perspective, Bandit and CodeQL produced divergent results, with CodeQL exposing deeper or more complex vulnerabilities. These results highlight the necessity of prompt-aware security evaluations and multi-tool static analysis to ensure reliable, secure code generation from LLMs. This study offers practical insights into secure code generation for developers and researchers. © 2025 IEEE.
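The following is a minimal sketch (not the authors' harness) of the kind of pipeline the abstract describes: code is generated under Recursive Criticism and Improvement (RCI) prompting and then screened with Bandit, one of the two static analyzers used in the study. The call_llm stub, the prompt wording, the number of RCI rounds, and the file handling are illustrative assumptions; CodeQL, which requires building a database per project, is omitted here.

import json
import subprocess
import tempfile


def call_llm(prompt: str) -> str:
    """Hypothetical model call; wire this to the chosen LLM's client (GPT-4, Gemini, DeepSeek, LLaMA)."""
    raise NotImplementedError


def bandit_issues(code: str) -> list[dict]:
    """Scan a snippet of generated Python code with Bandit and return its findings."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as tmp:
        tmp.write(code)
        path = tmp.name
    # -f json writes a machine-readable report to stdout; -q suppresses progress output
    proc = subprocess.run(["bandit", "-f", "json", "-q", path],
                          capture_output=True, text=True)
    report = json.loads(proc.stdout or "{}")
    return report.get("results", [])


def rci_generate(task: str, rounds: int = 2) -> str:
    """Generate code for a task, then apply criticise-and-improve (RCI) rounds."""
    code = call_llm(f"Write secure Python code for the following task:\n{task}")
    for _ in range(rounds):
        critique = call_llm(
            "List any security weaknesses (e.g. CWE issues) in this code:\n" + code
        )
        code = call_llm(
            "Rewrite the code so it addresses this critique:\n"
            f"{critique}\n\nCode:\n{code}"
        )
    return code


# Example use (requires call_llm to be wired to a real model client):
#   code = rci_generate("Read a filename from the user and print its contents.")
#   for issue in bandit_issues(code):
#       print(issue["test_id"], issue["issue_severity"], issue["issue_text"])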
Keywords
Large Language Models, Prompt Engineering, Secure Code Generation, Software Security, Static Code Analysis
WoS Q
N/A
Scopus Q
N/A
Source
10th International Conference on Computer Science and Engineering (UBMK 2025), Istanbul, 2025-09-17 through 2025-09-21
Issue
2025
Start Page
271
End Page
276

