Date: Aug. 30, 2021

At INTERSPEECH 2021, I present a paper on active speaker detection (ASD) in an audio-visual framework. The paper is titled “Look Who’s Talking: Active Speaker Detection in the Wild.”

This paper proposes an active speaker detection method that measures the similarity between audio embeddings and visual embeddings. We also present a new ASD dataset, Active Speakers in the Wild (ASW), which contains videos and co-occurring speech segments with dense speech activity labels.
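To give a rough feel for the similarity idea, here is a minimal sketch in Python. It is not the implementation from the paper: it assumes the audio and visual embeddings are already extracted and time-aligned as arrays of shape (frames, dim), and it uses cosine similarity with a fixed threshold. The function names `frame_similarity` and `detect_active` and the threshold value of 0.5 are hypothetical choices made only for illustration.

```python
import numpy as np


def frame_similarity(audio_emb, visual_emb, eps=1e-8):
    """Cosine similarity between time-aligned audio and visual embeddings.

    Both inputs are assumed to have shape (frames, dim). Returns one
    similarity score per frame, in the range [-1, 1].
    """
    audio_emb = np.asarray(audio_emb, dtype=float)
    visual_emb = np.asarray(visual_emb, dtype=float)
    # L2-normalize each frame's embedding; eps guards against zero vectors.
    a = audio_emb / (np.linalg.norm(audio_emb, axis=1, keepdims=True) + eps)
    v = visual_emb / (np.linalg.norm(visual_emb, axis=1, keepdims=True) + eps)
    # The row-wise dot product of unit vectors is the cosine similarity.
    return np.sum(a * v, axis=1)


def detect_active(audio_emb, visual_emb, threshold=0.5):
    """Mark a frame as 'speaking' when similarity exceeds the threshold.

    The threshold is an assumed value for this sketch, not one from the paper.
    """
    return frame_similarity(audio_emb, visual_emb) > threshold
```

With toy inputs, a frame whose audio and visual embeddings point in the same direction is marked active, and a frame whose embeddings are orthogonal is not. In practice, the embeddings would come from trained audio and visual encoders, and the threshold would be tuned on labeled data.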

If you are interested in this paper, please visit the link below.

[Paper]