Conference paper, Year: 2017

Topic segmentation in ASR transcripts using bidirectional rnns for change detection

Imran Sheikh
  • Role: Author
  • PersonId: 1018394

Abstract

Topic segmentation methods are mostly based on the idea of lexical cohesion: lexical distributions are analysed across the document, and segment boundaries are marked in areas of low cohesion. We propose a novel approach for topic segmentation in speech recognition transcripts that measures lexical cohesion using bidirectional Recurrent Neural Networks (RNNs). The bidirectional RNNs capture the context of the preceding and the following words, and the two contexts are compared to perform topic change detection. In contrast to existing work based on sequence and discriminative models for topic segmentation, our approach uses neither a segmented corpus nor (pseudo) topic labels for training. Our model is trained using news articles obtained from the internet. Evaluation on ASR transcripts of French TV broadcast news programs demonstrates the effectiveness of our proposed approach.
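The change-detection idea in the abstract (compare a representation of the preceding words with a representation of the following words, and mark a boundary where they agree least) can be illustrated with a minimal sketch. This is not the authors' model: the bidirectional RNN context vectors are replaced here by plain bag-of-words counts over a fixed window, and the function names, the `window` parameter, and the toy word sequence are all assumptions made for illustration.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def cohesion_scores(words, window=4):
    """Similarity of past vs. following context at each candidate position.

    Position i denotes the gap between words[i - 1] and words[i].
    Bag-of-words counts stand in for the learned RNN context vectors.
    """
    scores = {}
    for i in range(window, len(words) - window + 1):
        past = Counter(words[i - window:i])
        following = Counter(words[i:i + window])
        scores[i] = cosine(past, following)
    return scores


def detect_boundary(words, window=4):
    """Return the position with the lowest cohesion (single-boundary case)."""
    scores = cohesion_scores(words, window)
    return min(scores, key=scores.get)


# Hypothetical toy transcript: eight "politics" words, then eight "sports" words.
words = (
    "election vote minister vote election minister vote election "
    "goal match team goal match team goal match"
).split()
boundary = detect_boundary(words, window=4)
print(boundary)  # 8
```

In this toy sequence the two windows share no words only at position 8, so that gap gets the lowest score. A learned model would replace the `Counter` windows with forward and backward RNN states, which is where training on unsegmented news text would come in.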
Main file
draft_20Sep2017.pdf (800.99 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01599682, version 1 (02-10-2017)

Identifiers

  • HAL Id: hal-01599682, version 1

Cite

Imran Sheikh, Dominique Fohr, Irina Illina. Topic segmentation in ASR transcripts using bidirectional rnns for change detection. ASRU 2017 - IEEE Automatic Speech Recognition and Understanding Workshop, Dec 2017, Okinawa, Japan. ⟨hal-01599682⟩
