Emotion recognition from phoneme-duration information - INRIA - Institut National de Recherche en Informatique et en Automatique
Preprint, Working Paper. Year: 2020

Emotion recognition from phoneme-duration information

Abstract

The duration of each phoneme is extracted for several emotions. Information on phonemes and their durations is used to train a Variational AutoEncoder (VAE) to create a latent space z that represents emotion information. Three loss functions are used for this purpose: reconstruction loss, Kullback-Leibler (KL) divergence, and multi-class N-pair loss. Test samples are classified using the nearest-neighbor criterion between their latent representation and the clusters associated with each emotion, as estimated from the training data. Two metrics are used to evaluate the models: emotion recognition accuracy and the consistency of the clusters in the latent space.
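
The classification step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each emotion cluster is summarized by the mean of its training latent vectors and that closeness is measured with Euclidean distance, neither of which the abstract specifies. The function names, the emotion labels and the toy 2-D latent vectors are hypothetical and chosen only for illustration.

```python
import numpy as np

def emotion_centroids(z_train, labels):
    """Mean latent vector per emotion (assumed cluster representative)."""
    labels = np.asarray(labels)
    classes = sorted(set(labels.tolist()))
    centroids = np.stack([z_train[labels == c].mean(axis=0) for c in classes])
    return classes, centroids

def classify_nearest(z_test, classes, centroids):
    """Assign each test vector to the emotion whose centroid is closest."""
    # Pairwise Euclidean distances, shape (n_test, n_classes)
    dists = np.linalg.norm(z_test[:, None, :] - centroids[None, :, :], axis=-1)
    return [classes[i] for i in dists.argmin(axis=1)]

# Toy latent vectors, invented for illustration (not data from the paper)
z_train = np.array([[0.0, 0.1], [0.2, -0.1], [3.0, 3.1], [2.8, 2.9]])
y_train = ["neutral", "neutral", "joy", "joy"]
classes, centroids = emotion_centroids(z_train, y_train)

z_test = np.array([[0.1, 0.0], [3.2, 3.0]])
y_test = ["neutral", "joy"]
predictions = classify_nearest(z_test, classes, centroids)

# Emotion recognition accuracy: fraction of correctly classified test samples
accuracy = np.mean([p == t for p, t in zip(predictions, y_test)])
```

In practice the latent vectors would come from the trained VAE encoder rather than being written by hand, and another distance or cluster summary could be substituted without changing the overall structure.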
Main file
ISSP_2020_submitted.pdf (191.67 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-02983229, version 1 (29-10-2020)

Identifiers

  • HAL Id: hal-02983229, version 1

Cite

Ajinkya Kulkarni, Ioannis K Douros, Vincent Colotte, Denis Jouvet. Emotion recognition from phoneme-duration information. 2020. ⟨hal-02983229⟩
