Abstract:
Large language models are widely used in natural language processing. However, despite their effectiveness, deploying large language models is difficult because of their high computational and memory costs.
One way to address this problem is neural network quantization, that is, converting the weights and activations of the network to a lower bit-width representation. A special case of quantization is binarization, which compresses the network parameters to a bit-width of $1$.
In this paper, the structure of binary neural networks is examined, an overview of current methods for binarizing language models is provided, and the obtained results are described.
Keywords: natural language processing, binary neural networks, binarization, quantization, large language models