Gradient-based Active Learning Query Strategy for End-to-end Speech Recognition

Author	Yang Yuan, Soo-Whan Chung, Hong-Goo Kang
Publication	International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Month	April
Year	2019
Link	[Paper]

ABSTRACT

In this paper, we propose an effective active learning query strategy for an automatic speech recognition system, with the aim of reducing training cost. Training a deep neural network with supervised learning generally requires a massive amount of labeled data to achieve good performance, but labeling data is tedious and costly manual work. Active learning addresses this problem by selecting and annotating only the informative instances, which can yield better results with less transcribed data. In this approach, it is vitally important to select informative samples accurately. Preliminary experiments show that, under ideal conditions, the true gradient length is the best-performing measure of sample informativeness. Based on this result, we propose using both uncertainty and the expected gradient length criterion to approximate the true gradient length with a neural network. The experimental results show that the proposed method outperforms each conventional criterion used individually when applied to a phoneme-based speech recognition system, with both faster convergence and the greatest loss reduction in clean and noisy conditions.
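
As a rough illustration of the two individual criteria named in the abstract, the sketch below scores unlabeled samples by predictive entropy (one common uncertainty measure) and by expected gradient length. It assumes a toy linear softmax classifier with cross-entropy loss, which is not the paper's phoneme-based recognizer. All function names are hypothetical. The neural network the paper uses to combine the two criteria into an estimate of the true gradient length is not reproduced here.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    # Predictive entropy, used here as the uncertainty criterion (an assumption).
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def gradient_length(probs, x, label):
    # For a linear softmax classifier with cross-entropy loss,
    # dL/dW = (p - onehot(label)) x^T, whose Frobenius norm factorizes
    # into ||p - onehot(label)|| * ||x||.
    err = [p - (1.0 if k == label else 0.0) for k, p in enumerate(probs)]
    err_norm = math.sqrt(sum(e * e for e in err))
    x_norm = math.sqrt(sum(v * v for v in x))
    return err_norm * x_norm

def expected_gradient_length(probs, x):
    # The true label is unknown for unlabeled data, so the gradient norm
    # is averaged over all labels, weighted by the model's own posterior.
    return sum(p * gradient_length(probs, x, k) for k, p in enumerate(probs))

def score_pool(weights, pool):
    # Returns one (entropy, expected gradient length) pair per unlabeled sample.
    scores = []
    for x in pool:
        logits = [sum(w * v for w, v in zip(row, x)) for row in weights]
        probs = softmax(logits)
        scores.append((entropy(probs), expected_gradient_length(probs, x)))
    return scores

# Toy model (2 classes, 2 features) and a pool of two unlabeled samples.
toy_weights = [[1.0, 0.0], [0.0, 1.0]]
toy_pool = [[1.0, 1.0], [5.0, 0.0]]
toy_scores = score_pool(toy_weights, toy_pool)
```

In this toy pool, the first sample lies on the decision boundary and scores higher on both criteria than the second, which the model classifies confidently. A query strategy would therefore send the first sample for annotation first.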
