
Theor. Appl. Mech., 2025, Volume 52, Issue 1, Pages 67–73 (Mi tam152)

This publication is cited in 1 article

On non-approximability of zero loss global $\mathcal{L}^2$ minimizers by gradient descent in deep learning

Thomas Chen, Patricia Muñoz Ewald

Department of Mathematics, University of Texas at Austin, Austin TX, USA

Abstract: We analyze geometric aspects of the gradient descent algorithm in Deep Learning (DL) and discuss in detail the circumstance that, in underparametrized DL networks, zero loss minimization generically cannot be attained. As a consequence, we conclude that the distribution of training inputs must necessarily be non-generic in order to produce zero loss minimizers, both for the method constructed in [2, 3] and for gradient descent [1], which assume clustering of the training data.
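
To make the underparametrization statement concrete, the following minimal sketch (our own illustration, not from the paper; the toy model, the function name train, and the planted matrix W_star are all assumptions) runs plain gradient descent on an $\mathcal{L}^2$ cost for a linear model with fewer parameters than training samples. For generic targets the loss plateaus at a strictly positive value, whereas for one simple kind of non-generic targets (here, targets exactly realizable by the model; the paper's notion of non-genericity involves clustering of the training data) the loss is driven to zero.

```python
# Minimal sketch (toy setup, not from the paper): gradient descent on an
# L2 cost for an underparametrized linear model y = W x.
import numpy as np

rng = np.random.default_rng(0)

def train(X, Y, steps=5000, lr=1e-2):
    """Plain gradient descent on the cost mean((W X - Y)^2); returns final loss."""
    n_out, n_in, N = Y.shape[0], X.shape[0], X.shape[1]
    W = 0.1 * rng.standard_normal((n_out, n_in))
    for _ in range(steps):
        R = W @ X - Y                    # residuals on the training set
        W -= lr * (2.0 / N) * R @ X.T    # gradient step for the L2 cost
    return np.mean((W @ X - Y) ** 2)

n_in, n_out, N = 3, 1, 10                # 3 parameters, 10 samples: underparametrized
X = rng.standard_normal((n_in, N))

# Generic targets: almost surely not of the form W x, so zero loss is unattainable.
Y_generic = rng.standard_normal((n_out, N))

# Non-generic (here: exactly realizable) targets: generated by a planted W_star,
# so a zero loss minimizer exists and gradient descent approaches it.
W_star = rng.standard_normal((n_out, n_in))
Y_realizable = W_star @ X

print("generic targets, final loss:   ", train(X, Y_generic))     # strictly positive
print("realizable targets, final loss:", train(X, Y_realizable))  # ~ 0
```

In this sketch the generic run converges to the (positive) least-squares minimum rather than to zero loss, which is the elementary analogue of the non-attainability discussed in the abstract; the realizable run illustrates how a non-generic target distribution restores zero loss minimizers.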

Keywords: deep learning, underparametrization, generic training data, zero loss.

MSC: 57R70, 62M45

Received: 21.01.2025
Accepted for publication: 05.05.2025

Publication language: English

DOI: 10.2298/TAM250121008C


