Negative Emotion Recognition in Spoken Dialogs

Abstract

Automatic emotion recognition in human speech has recently attracted increasing attention. This paper presents an approach for recognizing negative emotions in spoken dialogs at the utterance level. The approach has two parts. First, in addition to traditional acoustic features, linguistic features based on distributed representations are extracted from text transcribed by an automatic speech recognition (ASR) system. Second, we propose a novel deep learning model, multi-feature stacked denoising autoencoders (MSDA), which fuses the high-level representations of the acoustic and linguistic features, together with context, to classify emotions. Experimental results demonstrate that the proposed method yields an absolute improvement of 5.2% over the traditional method.
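
As a rough illustration of the fusion idea described in the abstract, the sketch below encodes acoustic and linguistic feature vectors with separate denoising-autoencoder layers, concatenates the two hidden representations with a context vector, and passes the result through a joint layer. It is not the MSDA model itself: the layer sizes, tied weights, masking noise, sigmoid units, the form of the context vector, and all names (`DenoisingLayer`, `corrupt`, `fuse`) are assumptions made for this example, and training is omitted apart from computing the denoising reconstruction loss.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def corrupt(x, drop_prob, rng):
    """Masking noise: zero each input dimension with probability drop_prob."""
    return [0.0 if rng.random() < drop_prob else v for v in x]


class DenoisingLayer:
    """One autoencoder layer with tied weights (an illustrative choice)."""

    def __init__(self, n_in, n_hidden, rng):
        # W has shape n_hidden x n_in; the decoder reuses its transpose.
        self.W = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
                  for _ in range(n_hidden)]
        self.b = [0.0] * n_hidden  # encoder bias
        self.c = [0.0] * n_in      # decoder bias

    def encode(self, x):
        return [sigmoid(sum(w * v for w, v in zip(row, x)) + b)
                for row, b in zip(self.W, self.b)]

    def decode(self, h):
        return [sigmoid(sum(self.W[j][i] * h[j] for j in range(len(h))) + c)
                for i, c in enumerate(self.c)]

    def denoising_loss(self, x, drop_prob, rng):
        """Squared error between the clean input and its reconstruction
        from a corrupted copy (the denoising objective, without an update)."""
        recon = self.decode(self.encode(corrupt(x, drop_prob, rng)))
        return sum((r - v) ** 2 for r, v in zip(recon, x))


def fuse(acoustic, linguistic, context, enc_a, enc_l, enc_joint):
    """Concatenate per-modality hidden codes with a context vector,
    then encode the concatenation with a joint layer."""
    joint_input = enc_a.encode(acoustic) + enc_l.encode(linguistic) + context
    return enc_joint.encode(joint_input)


# Toy dimensions, chosen only for the example.
rng = random.Random(0)
enc_a = DenoisingLayer(6, 4, rng)          # acoustic features: 6 -> 4
enc_l = DenoisingLayer(5, 3, rng)          # linguistic features: 5 -> 3
enc_joint = DenoisingLayer(4 + 3 + 2, 4, rng)  # fused input: 9 -> 4

acoustic = [0.2, 0.5, 0.1, 0.9, 0.3, 0.7]
linguistic = [0.4, 0.0, 0.6, 0.8, 0.1]
context = [0.5, 0.5]  # hypothetical summary of neighbouring utterances

fused = fuse(acoustic, linguistic, context, enc_a, enc_l, enc_joint)
loss = enc_a.denoising_loss(acoustic, drop_prob=0.3, rng=rng)
```

In a full system, the fused vector would feed a classifier that labels the utterance as negative or non-negative, and each layer would be trained to minimise its denoising loss before fine-tuning.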

Publication
In the Thirteenth China National Conference on Computational Linguistics (CCL 2015)
Date