Hybrid BBO_PSO and higher order spectral features for emotion and stress recognition from natural speech

The aim of the present study is to select a set of higher order spectral features for an emotion/stress recognition system. Fifty higher order spectral features, 28 bispectral and 22 bicoherence based, were extracted from the speech signal and its glottal waveform. These features were combined with the Interspeech 2010 features to further improve the recognition rates. Feature subset selection (FSS) was carried out with the objective of maximizing the subject-independent emotion recognition rate with a minimum number of features. The FSS consists of two stages. In Stage 1, multi-cluster feature selection was adopted to reduce the feature space and identify a relevant feature subset from the Interspeech 2010 features. In Stage 2, biogeography-based optimization (BBO), particle swarm optimization (PSO), and the proposed hybrid BBO_PSO optimization were performed to further reduce the dimension of the feature space and identify the most relevant feature subset, which has a higher ability to discriminate between different emotional states. The proposed method was tested on three databases: the Berlin emotional speech database (BES), the Surrey audio-visual expressed emotion database (SAVEE), and the simulated domain of the Speech under simulated and actual stress (SUSAS) database. The proposed feature set was evaluated in subject independent (SI), subject dependent (SD), gender dependent male (GD-male), gender dependent female (GD-female), text independent pairwise speech (TIDPS), and text independent multi-style speech (TIDMSS) experiments using SVM and ELM classifiers. The proposed method attained accuracies of 93.25% (SI), 100% (SD), 93.75% (GD-male), and 97.58% (GD-female) for BES; 62.38% (SI) and 76.19% (SD) for SAVEE; and 90.09% (TIDMSS), 97.04% (TIDPS, Angry vs. Neutral), 98.89% (TIDPS, Lombard vs. Neutral), and 99.07% (TIDPS, Loud vs. Neutral) for SUSAS. © 2017 Elsevier B.V. All rights reserved.
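
The abstract does not say how the bispectrum and bicoherence were estimated, so the following is only an illustrative sketch of one common approach: a direct, segment-averaged estimate with a Cauchy-Schwarz style normalization that keeps the bicoherence in [0, 1]. The function names, the naive DFT, the segment length, and the synthetic test signal are all assumptions made for illustration and are not taken from the paper.

```python
import cmath
import math
import random


def dft(x):
    """Naive discrete Fourier transform (adequate for short segments)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def bispectrum_and_bicoherence(segments):
    """Direct estimate averaged over equal-length segments.

    bispectrum[f1][f2]  = mean over segments of X(f1) X(f2) conj(X(f1 + f2))
    bicoherence[f1][f2] = |sum X(f1) X(f2) conj(X(f1 + f2))|
                          / sqrt(sum |X(f1) X(f2)|^2 * sum |X(f1 + f2)|^2)
    """
    n = len(segments[0])
    half = n // 2
    spectra = [dft(seg) for seg in segments]
    k = len(spectra)
    bispec = [[0j] * half for _ in range(half)]
    bicoh = [[0.0] * half for _ in range(half)]
    for f1 in range(half):
        for f2 in range(half):
            num, p12, p3 = 0j, 0.0, 0.0
            for X in spectra:
                a = X[f1] * X[f2]
                b = X[f1 + f2]  # f1 + f2 < n because f1, f2 < n // 2
                num += a * b.conjugate()
                p12 += abs(a) ** 2
                p3 += abs(b) ** 2
            bispec[f1][f2] = num / k
            denom = math.sqrt(p12 * p3)
            # Guard against division by (near) zero in empty bins.
            bicoh[f1][f2] = abs(num) / denom if denom > 1e-12 else 0.0
    return bispec, bicoh


def synthetic_segments(coupled, n_segments=32, n=32, seed=0):
    """Hypothetical test signal: tones at bins 3, 5 and 8 (= 3 + 5).

    If `coupled`, the phase of the third tone is the sum of the first two
    (quadratic phase coupling); otherwise it is drawn independently.
    """
    rng = random.Random(seed)
    segments = []
    for _ in range(n_segments):
        p1 = rng.uniform(0, 2 * math.pi)
        p2 = rng.uniform(0, 2 * math.pi)
        p3 = p1 + p2 if coupled else rng.uniform(0, 2 * math.pi)
        segments.append([
            math.cos(2 * math.pi * 3 * t / n + p1)
            + math.cos(2 * math.pi * 5 * t / n + p2)
            + math.cos(2 * math.pi * 8 * t / n + p3)
            for t in range(n)
        ])
    return segments
```

In this sketch, the bicoherence at the bin pair (3, 5) is close to 1 for the phase-coupled signal and much lower for the uncoupled one, which is the property that makes such features sensitive to nonlinear interactions in a signal. Summary statistics computed over these two matrices would be one plausible way to obtain scalar features, but the 28 bispectral and 22 bicoherence features used in the study are not listed in this record.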

Keywords

Speech Signals, Feature Extraction, Feature Selection, Emotion Recognition

Source

Applied Soft Computing

WoS Q Value

Q1

Scopus Q Value

Q1

Volume

56

Link

https://doi.org/10.1016/j.asoc.2017.03.013
https://hdl.handle.net/20.500.12491/9212

Collection

WoS Indexed Publications Collection
Department of Electrical and Electronics Engineering Collection
Scopus Indexed Publications Collection
