Bispectral features and mean shift clustering for stress and emotion recognition from natural speech

Date

2017

Publisher

Pergamon-Elsevier Science Ltd

Access Rights

info:eu-repo/semantics/closedAccess

Abstract

A new set of features and feature enhancement techniques is proposed to recognize emotion and stress from speech signals. The speech waveforms and the glottal waveforms (derived from the recorded emotional/stressed speech waveforms) were processed using a third-order statistic called the bispectrum, and 28 bispectral features were extracted (14 from the speech waveforms and 14 from the glottal waveforms). Mean shift clustering was used to enhance the discrimination ability of the extracted bispectral features (BSFs). Four classifiers were used to distinguish different emotional and stressed states. The performance of the proposed method was tested on three databases. Across the experiments, recognition rates ranged from 93.44% to 100% for the Berlin emotional speech database (BES), from 73.81% to 97.23% for the Surrey audio-visual expressed emotion database (SAVEE), and from 93.8% to 100% for the simulated domain of the speech under simulated and actual stress database (SUSAS) (recognition of multi-style speech under stress: neutral, loud, Lombard and anger), and reached 100% for the SUSAS actual domain (recognition of three levels of stress: high, medium and low). The results indicate that the proposed bispectral features and mean shift clustering are promising for recognizing emotion and stress from speech signals. © 2017 Elsevier Ltd. All rights reserved.
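
As a rough illustration of the third-order statistic mentioned in the abstract, the sketch below computes a direct, frame-averaged bispectrum estimate, B(k1, k2) = mean over frames of X(k1) X(k2) X*(k1 + k2). It is not the authors' implementation: the function names, the non-overlapping framing, the per-frame mean removal and the naive DFT are all assumptions made for the example, and the abstract does not specify the 28 features derived from the bispectrum, so none are computed here.

```python
import cmath
import math


def dft(frame):
    """Naive discrete Fourier transform (O(n^2)); adequate for short frames."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def bispectrum(signal, frame_len):
    """Direct bispectrum estimate averaged over non-overlapping frames.

    Returns a frame_len x frame_len matrix with
    B[k1][k2] = mean over frames of X[k1] * X[k2] * conj(X[k1 + k2]),
    where the frequency index k1 + k2 wraps around modulo frame_len.
    Framing and mean removal are illustrative assumptions.
    """
    frames = [
        signal[i:i + frame_len]
        for i in range(0, len(signal) - frame_len + 1, frame_len)
    ]
    if not frames:
        raise ValueError("signal is shorter than one frame")

    acc = [[0j] * frame_len for _ in range(frame_len)]
    for frame in frames:
        mean = sum(frame) / frame_len
        spec = dft([v - mean for v in frame])  # remove DC before transforming
        for k1 in range(frame_len):
            for k2 in range(frame_len):
                k3 = (k1 + k2) % frame_len
                acc[k1][k2] += spec[k1] * spec[k2] * spec[k3].conjugate()

    return [[v / len(frames) for v in row] for row in acc]


if __name__ == "__main__":
    # Toy signal: components at bins 2 and 3 plus a phase-coupled one at bin 5.
    n = 16
    toy = [
        math.cos(2 * math.pi * 2 * t / n)
        + math.cos(2 * math.pi * 3 * t / n)
        + math.cos(2 * math.pi * 5 * t / n)
        for t in range(4 * n)
    ]
    b = bispectrum(toy, n)
    print(abs(b[2][3]), abs(b[1][1]))
```

In the toy signal the component at bin 5 is phase-coupled to those at bins 2 and 3, so the magnitude at (2, 3) is large while entries such as (1, 1) are close to zero. This sensitivity to phase coupling is what distinguishes the bispectrum from the ordinary power spectrum.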

Keywords

Speech Signals, Glottal Signals, Emotions, Feature Extraction and Emotion Recognition

Source

Computers & Electrical Engineering

WoS Q Value

Q2

Scopus Q Value

Q1

Volume

62

Citation