Doklady Rossijskoj Akademii Nauk. Mathematika, Informatika, Processy Upravlenia

Dokl. RAN. Math. Inf. Proc. Upr., 2025 Volume 527, Pages 471–484 (Mi danma702)

SPECIAL ISSUE: ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING TECHNOLOGIES

Adapting the AI scientist for enterprise: solving real business problems with autonomous TEXT-to-SQL research

V. Fedorov^a, D. Lavitskaya^a, D. M. Ibragimov^b, D. A. Safronov^c, A. Balles^c, A. Yu. Gribanova^c, M. S. Radionov^c

a Novosibirsk State University
b Moscow Aviation Institute (National Research University)
c Sberbank, Moscow

Abstract: One of the grand challenges in artificial intelligence is automating complex, multi-step reasoning tasks that require deep understanding of both natural language and structured data. While large language models have shown promise in code generation and natural language understanding, their ability to autonomously conduct end-to-end research in specialized domains remains limited. We ask a key question: can AI-driven research reliably solve real enterprise problems? This paper presents a fully automated framework for AI-driven research applied to the text-to-SQL problem, enabling frontier language models to independently generate novel ideas, design experiments, implement solutions, and communicate findings. We adapt The AI Scientist [10] – a comprehensive AI agent for autonomous scientific discovery – to the domain of semantic parsing, where it formulates hypotheses about improving text-to-SQL accuracy, writes executable code, runs experiments on benchmark datasets, visualizes performance gains, and produces complete scientific papers summarizing its results. Each full research cycle costs less than $5, making it a scalable and cost-effective approach for rapid innovation. The framework demonstrates robust performance across both proprietary and public benchmarks. This work moves toward self-improving AI systems in natural language processing, where AI agents not only solve tasks but also advance the state of the art by conducting independent research. Our code and generated papers are open-sourced at https://gitverse.ru/tr1ggers/AIScientist-Text2SQL.git.

Keywords: semantic parsing, large language models, automated scientific discovery, AI agent, TEXT-to-SQL, NL2SQL, database question answering, autonomous research, machine learning automation, code generation, self-improving AI, program synthesis.

UDC: 004.9

Received: 21.08.2025
Accepted: 29.09.2025

DOI: 10.7868/S2686954325070409




