Audiovisual data fusion for successive speakers tracking - AGPIG
Conference paper, Year: 2014

Audiovisual data fusion for successive speakers tracking

Abstract

In this paper, a human speaker tracking method based on audio and video data is presented. It is applied to conversation tracking with a robot. Audiovisual data fusion is performed in a two-step process. Detection is first performed independently on each modality: face detection based on skin color on video data, and sound source localization based on the time delay of arrival on audio data. The results of these detection processes are then fused using an adaptation of a Bayesian filter to detect the speaker. The robot is able to detect the face of the talking person and to detect a new speaker in a conversation.
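The sound source localization step mentioned above relies on the time delay of arrival (TDOA) between microphones. As a minimal sketch of that idea (not the paper's actual implementation), the delay between two microphone signals can be estimated by cross-correlation and converted to an azimuth under a far-field assumption; the function name and parameters below are illustrative:

```python
import numpy as np

def estimate_doa(sig_left, sig_right, fs, mic_distance, speed_of_sound=343.0):
    """Estimate the azimuth (in degrees) of a sound source from the time
    delay of arrival between two microphones, via cross-correlation.
    Positive angles mean the source is closer to the left microphone."""
    corr = np.correlate(sig_left, sig_right, mode="full")
    # Delay of the right signal relative to the left, in samples
    delay = (len(sig_right) - 1) - np.argmax(corr)
    tdoa = delay / fs
    # Far-field model: sin(theta) = tdoa * c / d, clipped to a valid range
    sin_theta = np.clip(tdoa * speed_of_sound / mic_distance, -1.0, 1.0)
    return np.degrees(np.arcsin(sin_theta))

# Synthetic check: a noise burst reaching the right microphone 3 samples late
fs = 16000
rng = np.random.default_rng(0)
burst = rng.standard_normal(320)
left = np.concatenate([burst, np.zeros(3)])
right = np.concatenate([np.zeros(3), burst])
angle = estimate_doa(left, right, fs, mic_distance=0.2)
```

With a 0.2 m microphone baseline and a 3-sample delay at 16 kHz, the estimated azimuth is a positive angle of roughly 19 degrees, consistent with a source on the left side.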
Main file: LABOUREY_VISAPP_2014.pdf (888.89 Ko)
Origin: Files produced by the author(s)

Dates and versions

hal-00935636, version 1 (31-01-2014)

Identifiers

  • HAL Id: hal-00935636, version 1

Cite

Quentin Labourey, Olivier Aycard, Denis Pellerin, Michèle Rombaut. Audiovisual data fusion for successive speakers tracking. VISIGRAPP 2014 - 9th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, Jan 2014, Lisbon, Portugal. ⟨hal-00935636⟩
514 views
261 downloads
