Uchenye Zapiski Kazanskogo Universiteta. Seriya Fiziko-Matematicheskie Nauki, 2013 Volume 155, Book 4, Pages 109–117 (Mi uzku1246)

This article is cited in 2 papers

Mid-level features for audio chord recognition using a deep neural network

N. Glazyrin

Department of Algebra and Discrete Mathematics, Ural Federal University named after B. N. Yeltsin, Ekaterinburg, Russia

Abstract: Deep neural networks composed of several pre-trained layers have been successfully applied to various audio processing tasks. This paper proposes and examines several deep network configurations, including deep recurrent networks, that can be pre-trained with stacked denoising autoencoders, and applies them to feature extraction for the audio chord recognition task. The features that such a network produces from an audio spectrogram can be used in place of conventional chroma features to recognize the chords in an audio recording. The chord recognition quality achieved with the proposed features is compared with that achieved with conventional chroma features, which do not rely on any machine learning technique.
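
For readers unfamiliar with the pre-training scheme named in the abstract, the following is a minimal sketch of greedy layer-wise pre-training with stacked denoising autoencoders. It is not the configuration used in the paper: the class and function names, the tied weights, the sigmoid units, the squared-error loss, the masking noise level, the layer sizes, and the random stand-in for spectrogram frames are all assumptions made for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class DenoisingAutoencoder:
    """One tied-weight denoising autoencoder layer (hypothetical, illustrative)."""

    def __init__(self, n_in, n_hidden, rng):
        self.W = rng.normal(0.0, 0.1, size=(n_in, n_hidden))
        self.b_h = np.zeros(n_hidden)  # hidden (encoder) bias
        self.b_v = np.zeros(n_in)      # visible (decoder) bias
        self.rng = rng

    def encode(self, x):
        return sigmoid(x @ self.W + self.b_h)

    def decode(self, h):
        return sigmoid(h @ self.W.T + self.b_v)

    def loss(self, x):
        # Mean squared reconstruction error on clean input.
        return float(np.mean((self.decode(self.encode(x)) - x) ** 2))

    def train_step(self, x, noise=0.3, lr=0.5):
        # Corrupt the input by zeroing a random fraction of entries,
        # then learn to reconstruct the clean input from the corrupted one.
        mask = self.rng.random(x.shape) > noise
        x_tilde = x * mask
        h = self.encode(x_tilde)
        r = self.decode(h)
        n = x.shape[0]
        # Backpropagate 0.5 * squared error through both sigmoid layers.
        d_r = (r - x) * r * (1.0 - r)
        d_h = (d_r @ self.W) * h * (1.0 - h)
        # Tied weights: encoder and decoder gradients add up.
        grad_W = (x_tilde.T @ d_h + d_r.T @ h) / n
        self.W -= lr * grad_W
        self.b_h -= lr * d_h.mean(axis=0)
        self.b_v -= lr * d_r.mean(axis=0)


def pretrain_stack(frames, layer_sizes, epochs=200, seed=0):
    """Train each layer on the codes produced by the layer below it."""
    rng = np.random.default_rng(seed)
    layers, x = [], frames
    for n_hidden in layer_sizes:
        dae = DenoisingAutoencoder(x.shape[1], n_hidden, rng)
        for _ in range(epochs):
            dae.train_step(x)
        layers.append(dae)
        x = dae.encode(x)
    return layers


def extract_features(layers, frames):
    """Pass frames through all pre-trained encoders to get mid-level features."""
    x = frames
    for dae in layers:
        x = dae.encode(x)
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Random values stand in for normalized spectrogram frames (assumption).
    frames = rng.random((64, 24)) * 0.2
    layers = pretrain_stack(frames, layer_sizes=[16, 12])
    print(extract_features(layers, frames).shape)
```

In this sketch the output of the last encoder plays the role that a 12-dimensional chroma vector would play in a conventional pipeline. A subsequent supervised fine-tuning stage and the recurrent variants mentioned in the abstract are left out.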

Keywords: audio chord recognition, autoencoder, recurrent network, deep learning.

UDC: 004.93

Received: 02.08.2013

Language: English
