Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS), 2024, Volume 36, Issue 6, Pages 103–114 (Mi tisp941)

Could an LLM like chatGPT perform a functional size measurement using the COSMIC method?

F. Valdés-Souto (a), D. Torres-Robledo (b)

(a) National Autonomous University of Mexico, Science Faculty, CDMX, Mexico
(b) National Autonomous University of Mexico, Research Institute in Applied Mathematics and Systems, CDMX, Mexico

Abstract: Software development is an intricate and time-consuming process, and resource estimation is one of its most important tasks. Because functional size is currently the only acceptable metric, it is widely used to build estimation models. However, functional size measurement itself takes time. In recent years, the use of artificial intelligence (AI) to automate software development tasks has gained popularity, and software functional sizing and estimation is one area where AI may be applied. In this study, we investigate whether ChatGPT 4o, a large language model (LLM), can apply the concepts and guidelines of the COSMIC method to perform measurements. We found that ChatGPT could not reliably produce accurate results. Its primary shortcomings are its inability to accurately extract data movements, data groups, and functional users from the text. As a result, ChatGPT's measurements fall short of two essential requirements of measurement: accuracy and reproducibility.

Keywords: COSMIC, CFP, functional size measurement, LLM, chatGPT, software engineering, AI, automatization

Language: English

DOI: 10.15514/ISPRAS-2024-36(6)-6


