Abstract:
This paper presents a meta-analysis of four experimental studies from the Norm! project, which systematically evaluated the effectiveness of large language models in the legal domain. The studies include a comparative analysis of junior and flagship models, optimization of system prompts, and testing of multi-agent architectures on tasks in Russian family and civil law. A key finding is a nonlinear relationship between architectural complexity and output quality: moving from simple to complex systems yields a modest quality gain (15–40%) at a disproportionate increase in resource costs (by a factor of 10–15). The flagship models GPT-4.1 and Gemini 2.5 Pro achieve the highest quality scores (9.04 and 8.52 points, respectively), while junior LLMs, with efficiency coefficients of up to 130.3, remain the more cost-effective option. Tasks requiring integrative analysis of multiple legal norms are a problem area common to all architectures. The results yield evidence-based recommendations for a range of deployment scenarios, from mass-market consulting services to specialized legal applications, and outline prospects for the development of hybrid architectures in legal practice.
Keywords: large language models, legal artificial intelligence, meta-analysis, multi-agent systems, system prompts, cost-effectiveness, legal consulting, RAG systems, family law, artificial intelligence system architecture.