Bispectral features and mean shift clustering for stress and emotion recognition from natural speech
Date
2017
Publisher
Pergamon-Elsevier Science Ltd
Access Rights
info:eu-repo/semantics/closedAccess
Abstract
A new set of features and feature enhancement techniques is proposed for recognizing emotion and stress from speech signals. The speech waveforms and the glottal waveforms (derived from the recorded emotional/stressed speech waveforms) were processed using third-order statistics, namely the bispectrum, and 28 bispectrum-based features (14 from the speech waveforms and 14 from the glottal waveforms) were extracted. Mean shift clustering was then used to enhance the discrimination ability of the extracted Bispectral Features (BSFs). Four classifiers were used to distinguish the different emotional and stressed states. The performance of the proposed method was tested on three databases. Across the different experiments, recognition rates ranged between 93.44% and 100% for the Berlin emotional speech database (BES), between 73.81% and 97.23% for the Surrey audio-visual expressed emotion database (SAVEE), and between 93.8% and 100% for the simulated domain of the speech under simulated and actual stress database (SUSAS) (recognition of multi-style speech under stress: neutral, loud, Lombard and anger), and reached 100% for the SUSAS actual domain (recognition of three levels of stress: high, medium and low). These results indicate that the proposed bispectrum-based features and mean shift clustering are promising for recognizing emotion and stress from speech signals. © 2017 Elsevier Ltd. All rights reserved.
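The two techniques named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the estimator, the 28 features, or the clustering settings, so the function names (`dft`, `bispectrum`, `mean_shift`), the frame length, the flat kernel, and the bandwidth are all assumptions chosen for illustration. The bispectrum is estimated with the standard direct method, B(f1, f2) = E[X(f1) X(f2) X*(f1 + f2)], averaged over frames.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform (O(n^2)); adequate for short frames."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def bispectrum(signal, frame_len=16):
    """Direct bispectrum estimate, averaged over non-overlapping frames.

    Returns a (frame_len/2) x (frame_len/2) grid of complex values
    B[f1][f2] = mean over frames of X(f1) * X(f2) * conj(X(f1 + f2)).
    frame_len is an illustrative assumption, not a value from the paper.
    """
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, frame_len)]
    half = frame_len // 2
    acc = [[0j] * half for _ in range(half)]
    for frame in frames:
        mean = sum(frame) / frame_len
        spec = dft([v - mean for v in frame])  # remove DC before transforming
        for f1 in range(half):
            for f2 in range(half):
                acc[f1][f2] += spec[f1] * spec[f2] * spec[f1 + f2].conjugate()
    return [[v / len(frames) for v in row] for row in acc]


def mean_shift(points, bandwidth, iters=50, tol=1e-6):
    """Flat-kernel mean shift: each point moves to the mean of the data points
    within `bandwidth` of it, repeatedly, until no point moves more than `tol`.
    Returns the final position (mode) reached from each input point."""
    modes = [list(p) for p in points]
    for _ in range(iters):
        moved = 0.0
        for i, m in enumerate(modes):
            near = [p for p in points if math.dist(p, m) <= bandwidth]
            new = [sum(c) / len(near) for c in zip(*near)]
            moved = max(moved, math.dist(new, m))
            modes[i] = new
        if moved < tol:
            break
    return modes
```

In a pipeline of the kind the abstract describes, scalar features would be computed from the bispectrum grid of each utterance, and a step such as `mean_shift` would be applied to the resulting feature vectors before classification; how the paper actually combines these steps is not stated in the abstract.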
Keywords
Speech Signals, Glottal Signals, Emotions, Feature Extraction and Emotion Recognition
Source
Computers & Electrical Engineering
WoS Q Value
Q2
Scopus Q Value
Q1
Volume
62