Title: The IBM Speaker Recognition System: Recent Advances and Error Analysis

Abstract: We present recent advances in, along with an error analysis of, the IBM speaker recognition system for conversational speech. Key advancements in our system include: a nearest-neighbor discriminant analysis (NDA) approach (as opposed to LDA) for intersession variability compensation in the i-vector space; the application of speaker- and channel-adapted features derived from an automatic speech recognition (ASR) system for speaker recognition; and the use of a DNN acoustic model with a very large number of output units (~10k senones) to compute the frame-level soft alignments required in the i-vector estimation process. We evaluate these techniques on the NIST 2010 SRE extended core conditions (C1-C9), as well as on the 10sec-10sec condition. To our knowledge, the results achieved by our system represent the best performance published to date on these conditions. For example, on the extended tel-tel condition (C5), the system achieves an equal error rate (EER) of 0.59%. To better understand the remaining errors on C5, we examine the recordings associated with the low-scoring target trials and identify various issues with the problematic recordings and trials. Interestingly, correcting the pathological recordings improves the scores not only for the target trials but also for the nontarget trials.
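
The EER figure quoted in the abstract is the operating point at which the miss rate on target trials equals the false-alarm rate on nontarget trials. As a rough illustration only, and not the scoring tool used by the authors or by NIST, the sketch below approximates the EER by sweeping a threshold over the observed scores. The function name `equal_error_rate` and the toy scores are hypothetical.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    Illustrative sketch: real evaluations typically interpolate the
    detection error trade-off curve instead of picking the nearest point.
    """
    if not target_scores or not nontarget_scores:
        raise ValueError("need at least one target and one nontarget score")

    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, eer = None, None
    for t in thresholds:
        # Miss rate: fraction of target trials scoring below the threshold.
        miss = sum(s < t for s in target_scores) / len(target_scores)
        # False-alarm rate: fraction of nontarget trials at or above it.
        fa = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(miss - fa)
        # Keep the threshold where the two error rates are closest.
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (miss + fa) / 2
    return eer


# Toy scores (hypothetical, not from the paper).
targets = [2.0, 1.5, 0.2, 3.0]
nontargets = [-1.0, 0.5, -0.3, -2.0]
print(f"EER = {100 * equal_error_rate(targets, nontargets):.2f}%")
```

With the toy scores, one low-scoring target trial and one high-scoring nontarget trial each account for a quarter of their class, so the two error rates meet at 25%.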
Comments: submitted to INTERSPEECH 2016. arXiv admin note: substantial text overlap with arXiv:1602.07291
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Machine Learning (stat.ML)
Cite as: arXiv:1605.01635 [cs.CL]
  (or arXiv:1605.01635v1 [cs.CL] for this version)

Submission history

From: Omid Sadjadi
[v1] Thu, 5 May 2016 15:57:21 GMT (169kb,D)