Hybrid BBO_PSO and higher order spectral features for emotion and stress recognition from natural speech

dc.authorid 0000-0002-8929-3473 en_US
dc.authorid 0000-0003-1840-9958 en_US
dc.authorid 0000-0001-7466-0368 en_US
dc.contributor.author Yogesh, C. K.
dc.contributor.author Hariharan, Muthusamy
dc.contributor.author Ngadiran, Ruzelita
dc.contributor.author Adom, A. H.
dc.contributor.author Yaacob, Sazali
dc.contributor.author Polat, Kemal
dc.date.accessioned 2021-06-23T19:45:48Z
dc.date.available 2021-06-23T19:45:48Z
dc.date.issued 2017
dc.department BAİBÜ, Faculty of Engineering, Department of Electrical and Electronics Engineering en_US
dc.description.abstract The aim of the present study is to select a set of higher order spectral features for an emotion/stress recognition system. Fifty higher order spectral features, 28 bispectral and 22 bicoherence based, were extracted from the speech signal and its glottal waveform. These features were combined with the Interspeech 2010 features to further improve the recognition rates. Feature subset selection (FSS) was carried out with the objective of maximizing the subject-independent emotion recognition rate with a minimum number of features. The FSS has two stages. In Stage 1, multi-cluster feature selection was adopted to reduce the feature space and identify a relevant feature subset from the Interspeech 2010 features. In Stage 2, biogeography based optimization (BBO), particle swarm optimization (PSO) and the proposed hybrid BBO_PSO optimization were performed to further reduce the dimension of the feature space and identify the most relevant feature subset, the one with the highest ability to discriminate between emotional states. The proposed method was tested on three databases: the Berlin emotional speech database (BES), the Surrey audio-visual expressed emotion database (SAVEE) and the simulated domain of Speech under simulated and actual stress (SUSAS). The proposed feature set was evaluated with subject independent (SI), subject dependent (SD), gender dependent male (GD-male), gender dependent female (GD-female), text independent pairwise speech (TIDPS), and text independent multi-style speech (TIDMSS) experiments using SVM and ELM classifiers. The proposed method attained accuracies of 93.25% (SI), 100% (SD), 93.75% (GD-male), and 97.58% (GD-female) for BES; 62.38% (SI) and 76.19% (SD) for SAVEE; and 90.09% (TIDMSS), 97.04% (TIDPS - Angry vs. Neutral), 98.89% (TIDPS - Lombard vs. Neutral), and 99.07% (TIDPS - Loud vs. Neutral) for SUSAS. © 2017 Elsevier B.V. All rights reserved. en_US
dc.identifier.doi 10.1016/j.asoc.2017.03.013
dc.identifier.endpage 232 en_US
dc.identifier.issn 1568-4946
dc.identifier.issn 1872-9681
dc.identifier.scopus 2-s2.0-85016239937 en_US
dc.identifier.scopusquality Q1 en_US
dc.identifier.startpage 217 en_US
dc.identifier.uri https://doi.org/10.1016/j.asoc.2017.03.013
dc.identifier.uri https://hdl.handle.net/20.500.12491/9212
dc.identifier.volume 56 en_US
dc.identifier.wos WOS:000402364000017 en_US
dc.identifier.wosquality Q1 en_US
dc.indekslendigikaynak Web of Science en_US
dc.indekslendigikaynak Scopus en_US
dc.institutionauthor Polat, Kemal
dc.language.iso en en_US
dc.publisher Elsevier en_US
dc.relation.ispartof Applied Soft Computing en_US
dc.relation.publicationcategory Article - International Refereed Journal - Institutional Faculty Member en_US
dc.rights info:eu-repo/semantics/closedAccess en_US
dc.subject Speech Signals en_US
dc.subject Feature Extraction en_US
dc.subject Feature Selection and Emotion Recognition en_US
dc.title Hybrid BBO_PSO and higher order spectral features for emotion and stress recognition from natural speech en_US
dc.type Article en_US
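
The abstract describes Stage 2 as a search over feature subsets using BBO, PSO and a hybrid of the two. As a rough illustration of that kind of search, here is a minimal sketch of a generic binary PSO over 0/1 feature masks. It is not the authors' BBO_PSO hybrid, and the record gives no implementation details. The names `binary_pso`, `toy_fitness` and `RELEVANT`, the swarm parameters, and the toy scoring rule are all assumptions made for this example. In a real setting the fitness would presumably be a classifier's recognition rate on the selected features.

```python
import math
import random


def binary_pso(fitness, n_features, n_particles=20, n_iters=50, seed=0):
    """Search for a 0/1 feature mask that maximizes fitness(mask)."""
    rng = random.Random(seed)
    # Random initial masks and zero velocities.
    pos = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(n_particles)]
    vel = [[0.0] * n_features for _ in range(n_particles)]
    # Each particle remembers its own best mask; the swarm shares one global best.
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    w, c1, c2 = 0.7, 1.5, 1.5  # inertia and acceleration weights (assumed values)
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_features):
                v = (w * vel[i][d]
                     + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                     + c2 * rng.random() * (gbest[d] - pos[i][d]))
                v = max(-6.0, min(6.0, v))  # clamp so the sigmoid never saturates
                vel[i][d] = v
                # The sigmoid maps velocity to the probability of selecting feature d.
                prob = 1.0 / (1.0 + math.exp(-v))
                pos[i][d] = 1 if rng.random() < prob else 0
            f = fitness(pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit


# Hypothetical stand-in for a classifier score: reward three "informative"
# features and charge a small penalty per selected feature.
RELEVANT = {0, 3, 7}


def toy_fitness(mask):
    hits = sum(1 for i in RELEVANT if mask[i])
    return hits - 0.1 * sum(mask)


best_mask, best_score = binary_pso(toy_fitness, n_features=10)
print(best_mask, best_score)
```

The per-feature penalty in `toy_fitness` stands in for the abstract's stated goal of a high recognition rate with a minimum number of features. How the paper actually balances the two is not stated in this record.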

Files

Original bundle
Name: c-k-yogesh.pdf
Size: 5.31 MB
Format: Adobe Portable Document Format
Description: Full Text