Title: LEAP Submission for the Third DIHARD Diarization Challenge

Abstract: This paper describes the LEAP submission for the DIHARD-III challenge. The proposed system consists of a speech bandwidth classifier and diarization systems fine-tuned separately for narrowband and wideband speech. We use an end-to-end speaker diarization system for the narrowband conversational telephone speech recordings. For the wideband multi-speaker recordings, we use a neural embedding based clustering approach similar to the baseline system: embeddings (x-vectors) are extracted from a time-delay neural network and then clustered with the graph-based path integral clustering (PIC) approach. The LEAP system showed 24% and 18% relative improvements over the baseline system provided by the organizers for Track-1 and Track-2, respectively. The paper covers the challenge submission, the post-evaluation analysis, and the improvements observed on the DIHARD-III dataset.
Comments: Accepted at INTERSPEECH 2021
Subjects: Audio and Speech Processing (eess.AS)
Cite as: arXiv:2104.02359 [eess.AS]
  (or arXiv:2104.02359v2 [eess.AS] for this version)

Submission history

From: Prachi Singh
[v1] Tue, 6 Apr 2021 08:36:08 GMT (182kb,D)
[v2] Mon, 14 Jun 2021 07:47:18 GMT (432kb,D)