An Empirical Study on Speech Restoration Guided by Self-supervised Speech Representation

Author	Jaeuk Byun, Youna Ji, Soo-Whan Chung, Soyeon Choe, Min-Seok Choe
Publication	International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Year	2023
Link	[Paper] [arXiv]

ABSTRACT

Enhancing speech quality is an indispensable yet difficult task as it is often complicated by a range of degradation factors. In addition to additive noise, reverberation, clipping, and speech attenuation can all adversely affect speech quality. Speech restoration aims to recover speech components from these distortions. This paper focuses on exploring the impact of self-supervised learning (SSL) on the speech restoration task. Specifically, we employ speech representation in various speech restoration networks and evaluate their performance under different distortion scenarios. Our experiments demonstrate that the contextual information provided by the SSL model can enhance speech restoration performance in various distortion scenarios, while also increasing robustness against various lengths of speech attenuation and mismatched test conditions.
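The four degradation types named in the abstract (additive noise, reverberation, clipping, and speech attenuation) can be simulated on a toy signal to make the restoration problem concrete. The sketch below is illustrative only and is not the data pipeline used in the paper: the function names, the 16 kHz sample rate, the sine-wave stand-in for speech, and the synthetic impulse response are all assumptions introduced here.

```python
import math
import random

SAMPLE_RATE = 16000  # Hz; assumed for this illustration


def clip(signal, threshold):
    """Hard-clip every sample to the range [-threshold, threshold]."""
    return [max(-threshold, min(threshold, s)) for s in signal]


def attenuate(signal, start, length, gain=0.0):
    """Scale `length` samples beginning at index `start` by `gain`."""
    end = min(len(signal), start + length)
    return [s * gain if start <= i < end else s for i, s in enumerate(signal)]


def add_noise(signal, snr_db, rng):
    """Add white Gaussian noise scaled to reach the requested SNR in dB."""
    signal_power = sum(s * s for s in signal) / len(signal)
    noise = [rng.gauss(0.0, 1.0) for _ in signal]
    noise_power = sum(n * n for n in noise) / len(noise)
    target_noise_power = signal_power / (10 ** (snr_db / 10))
    scale = math.sqrt(target_noise_power / noise_power)
    return [s + scale * n for s, n in zip(signal, noise)]


def reverberate(signal, impulse_response):
    """Convolve with an impulse response, truncated to the input length."""
    out = [0.0] * len(signal)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            if i + j < len(out):
                out[i + j] += s * h
    return out


# Toy "clean speech": 0.1 s of a 440 Hz tone (a stand-in, not real speech).
clean = [0.8 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE) for n in range(1600)]

rng = random.Random(0)
# Synthetic exponentially decaying impulse response (not a measured room response).
rir = [1.0] + [math.exp(-n / 16.0) * rng.uniform(-0.3, 0.3) for n in range(1, 64)]

clipped = clip(clean, 0.5)                          # amplitude clipping
dropped = attenuate(clean, start=400, length=200)   # 12.5 ms muted segment
noisy = add_noise(clean, snr_db=10.0, rng=rng)      # additive noise at 10 dB SNR
reverberant = reverberate(clean, rir)               # reverberation
```

A restoration model would take any of `clipped`, `dropped`, `noisy`, or `reverberant` as input and be trained to recover `clean`. Varying the `length` argument of `attenuate` corresponds to the "various lengths of speech attenuation" that the abstract mentions.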

