Audio and Speech Processing

Authors and titles for eess.AS in Nov 2019

Showing entries 1-25 of 155.
[1]  arXiv:1911.00137 [pdf, other]
Title: Modeling of Rakugo Speech and Its Limitations: Toward Speech Synthesis That Entertains Audiences
Comments: Resubmitted to IEEE Access
Subjects: Audio and Speech Processing (eess.AS)
[2]  arXiv:1911.00432 [pdf, other]
Title: Deep neural networks for emotion recognition combining audio and transcripts
Subjects: Audio and Speech Processing (eess.AS)
[3]  arXiv:1911.00527 [pdf, other]
Title: Memory Requirement Reduction of Deep Neural Networks Using Low-bit Quantization of Parameters
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Performance (cs.PF); Sound (cs.SD)
[4]  arXiv:1911.00566 [pdf, other]
Title: Predicting word error rate for reverberant speech
Comments: Presented at IEEE 45th International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2020)
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[5]  arXiv:1911.00940 [pdf, other]
Title: Robust speaker recognition using unsupervised adversarial invariance
Comments: Submitted to ICASSP 2020
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD); Signal Processing (eess.SP)
[6]  arXiv:1911.00982 [pdf, other]
Title: Onssen: an open-source speech separation and enhancement library
Comments: Submitted to ICASSP 2020
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[7]  arXiv:1911.01182 [pdf, other]
Title: Voice Biometrics Security: Extrapolating False Alarm Rate via Hierarchical Bayesian Modeling of Speaker Verification Scores
Comments: Accepted to be published in Computer Speech and Language
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[8]  arXiv:1911.01255 [pdf, other]
Title: pyannote.audio: neural building blocks for speaker diarization
Comments: Submitted to ICASSP 2020
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[9]  arXiv:1911.01266 [pdf, other]
Title: Supervised online diarization with sample mean loss for multi-domain data
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[10]  arXiv:1911.01533 [pdf, other]
Title: Speaker-invariant Affective Representation Learning via Adversarial Training
Comments: Accepted by ICASSP 2020; 5 pages
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[11]  arXiv:1911.01601 [pdf, other]
Title: ASVspoof 2019: A large-scale public database of synthesized, converted and replayed speech
Comments: Accepted, Computer Speech and Language. This manuscript version is made available under the CC-BY-NC-ND 4.0. For the published version on Elsevier website, please visit this https URL
Subjects: Audio and Speech Processing (eess.AS); Cryptography and Security (cs.CR); Sound (cs.SD); Signal Processing (eess.SP)
[12]  arXiv:1911.01635 [pdf, other]
Title: Emotional speech synthesis with rich and granularized control
Comments: Submitted to ICASSP 2020
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[13]  arXiv:1911.01799 [pdf, ps, other]
Title: CN-CELEB: a challenging Chinese speaker recognition dataset
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[14]  arXiv:1911.01802 [pdf, other]
Title: Fast acoustic scattering using convolutional neural networks
Comments: Accepted by ICASSP 2020
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Image and Video Processing (eess.IV); Signal Processing (eess.SP)
[15]  arXiv:1911.01803 [pdf, other]
Title: Temporal Feedback Convolutional Recurrent Neural Networks for Speech Command Recognition
Authors: Taejun Kim, Juhan Nam
Comments: This paper is accepted to APSIPA ASC 2022
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Signal Processing (eess.SP)
[16]  arXiv:1911.01806 [pdf, other]
Title: Mixture factorized auto-encoder for unsupervised hierarchical deep factorization of speech signal
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Multimedia (cs.MM); Sound (cs.SD)
[17]  arXiv:1911.01840 [pdf, other]
Title: Who is Real Bob? Adversarial Attacks on Speaker Recognition Systems
Comments: IEEE Symposium on Security and Privacy 2021
Subjects: Audio and Speech Processing (eess.AS); Cryptography and Security (cs.CR); Machine Learning (cs.LG); Multimedia (cs.MM); Sound (cs.SD)
[18]  arXiv:1911.01902 [pdf, ps, other]
Title: Speech Enhancement via Deep Spectrum Image Translation Network
Comments: Accepted at ICBME 2019
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[19]  arXiv:1911.02086 [pdf, other]
Title: Small-Footprint Keyword Spotting on Raw Audio Data with Sinc-Convolutions
Comments: Accepted at ICASSP 2020
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[20]  arXiv:1911.02091 [pdf, other]
Title: Closing the Training/Inference Gap for Deep Attractor Networks
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[21]  arXiv:1911.02115 [pdf, ps, other]
Title: Spatial Attention for Far-field Speech Recognition with Deep Beamforming Neural Networks
Comments: To be presented at ICASSP 2020
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[22]  arXiv:1911.02216 [pdf, ps, other]
Title: Addressing Ambiguity of Emotion Labels Through Meta-Learning
Comments: Submitted to ICASSP 2020
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[23]  arXiv:1911.02242 [pdf, other]
Title: A comparison of end-to-end models for long-form speech recognition
Comments: ASRU camera-ready version
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[24]  arXiv:1911.02388 [pdf, other]
Title: The Speed Submission to DIHARD II: Contributions & Lessons Learned
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[25]  arXiv:1911.02746 [pdf, other]
Title: Mask-dependent Phase Estimation for Monaural Speaker Separation
Comments: Accepted by ICASSP 2020
Subjects: Audio and Speech Processing (eess.AS)