Journal articles

Un mot pour un autre ? Analyse et comparaison de huit plateformes de transcription automatique [One word for another? Analysis and comparison of eight automatic transcription platforms]

Abstract : This article compares the features and results of eight automatic transcription platforms (Go Transcribe, Happy Scribe, Headliner, Sonix, Video Indexer, Vocalmatic, Vocapia and YouTube) on French-language audio samples. We propose an original methodology, developed through interdisciplinary work, for comparing the transcriptions. It combines three complementary approaches: (1) a quantitative approach that compares the textual output using a common metric, the Word Error Rate (WER); (2) a fine-grained approach that classifies and explains the errors generated by the platforms; and (3) an estimate of the transcription time that can be saved for each file on each platform. We show that no platform outperformed the others on all samples, but two nevertheless stood out: Vocapia and Sonix, each with its own strengths. Regardless of the type of file or platform, listening to the recording and correcting the text remains a necessary step. However, such tools can save up to 75% of the time required for manual transcription. At the same time, using these online tools can raise major data confidentiality and security problems. Finally, we reflect on the interdisciplinary setting that made this project possible.
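For readers unfamiliar with the metric, the Word Error Rate is conventionally defined as the number of word substitutions, deletions and insertions needed to turn the automatic transcript into the reference transcript, divided by the number of words in the reference. The following is a minimal sketch of that standard definition, not the authors' implementation: the function name `word_error_rate`, the lowercasing and the whitespace tokenisation are assumptions made for illustration, and this record does not describe how the article normalised its texts.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Standard WER: word-level edit distance divided by reference length.

    Assumption (not from the article): texts are lowercased and split on
    whitespace; punctuation is left untouched.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions align i reference words with nothing
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)
```

With the reference "un mot pour un autre" and the hypothesis "un mot pour autre", one of five reference words is dropped, so the function returns 0.2. Because insertions count as errors, the value can exceed 1 when the hypothesis is much longer than the reference.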
Document type :
Journal articles
https://hal-univ-pau.archives-ouvertes.fr/hal-03730474
Contributor : Gaëlle DELETRAZ
Submitted on : Wednesday, July 20, 2022 - 5:43:15 PM
Last modification on : Thursday, September 22, 2022 - 5:07:05 AM

File

 Restricted access
To satisfy the distribution rights of the publisher, the document is embargoed until : 2023-07-20


Citation

Elise Tancoigne, Jean Philippe Corbellini, Gaëlle Deletraz, Laure Gayraud, Sandrine Ollinger, et al. Un mot pour un autre ? Analyse et comparaison de huit plateformes de transcription automatique. Bulletin de Méthodologie Sociologique / Bulletin of Sociological Methodology, SAGE Publications, 2022, 155 (1), pp. 45-81. ⟨10.1177/07591063221088322⟩. ⟨hal-03730474v1⟩
