Audio and Speech Processing

Authors and titles for eess.AS in Apr 2020, skipping first 50

[ total of 132 entries: 1-25 | 26-50 | 51-75 | 76-100 | 101-125 | 126-132 ]
[51]  arXiv:2004.06579 [pdf, other]
Title: The Hearpiece database of individual transfer functions of an openly available in-the-ear earpiece for hearing device research
Comments: 14 pages, 13 figures
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[52]  arXiv:2004.06756 [pdf, other]
Title: Speaker Diarization with Lexical Information
Journal-ref: Interspeech 2019, 391-395
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[53]  arXiv:2004.06833 [pdf, ps, other]
Title: Alzheimer's Dementia Recognition through Spontaneous Speech: The ADReSS Challenge
Comments: To appear in the Proceedings of INTERSPEECH 2020, Oct 2020, Shanghai, China
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Machine Learning (stat.ML)
[54]  arXiv:2004.07370 [pdf, other]
Title: F0-consistent many-to-many non-parallel voice conversion via conditional autoencoder
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[55]  arXiv:2004.07832 [pdf, other]
Title: Knowledge-and-Data-Driven Amplitude Spectrum Prediction for Hierarchical Neural Vocoders
Comments: Submitted to Interspeech 2020
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[56]  arXiv:2004.07948 [pdf, other]
Title: Sound of Guns: Digital Forensics of Gun Audio Samples meets Artificial Intelligence
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[57]  arXiv:2004.07992 [pdf, other]
Title: Speech Paralinguistic Approach for Detecting Dementia Using Gated Convolutional Neural Network
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Quantitative Methods (q-bio.QM)
[58]  arXiv:2004.08248 [pdf, ps, other]
Title: Acoustical classification of different speech acts using nonlinear methods
Comments: 6 pages, 2 figures; Proceedings of WESPAC 2018, New Delhi, India, November 11-15, 2018
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD); Chaotic Dynamics (nlin.CD); Neurons and Cognition (q-bio.NC)
[59]  arXiv:2004.08250 [pdf, other]
Title: How to Teach DNNs to Pay Attention to the Visual Modality in Speech Recognition
Comments: in IEEE/ACM Transactions on Audio, Speech, and Language Processing (to appear)
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG)
[60]  arXiv:2004.08287 [pdf, other]
Title: Deep Neural Network for Respiratory Sound Classification in Wearable Devices Enabled by Patient Specific Model Tuning
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[61]  arXiv:2004.08326 [pdf, other]
Title: SpEx: Multi-Scale Time Domain Speaker Extraction Network
Comments: ACCEPTED in IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP)
Journal-ref: IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2020
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[62]  arXiv:2004.08531 [pdf, other]
Title: MatchboxNet: 1D Time-Channel Separable Convolutional Neural Network Architecture for Speech Commands Recognition
Subjects: Audio and Speech Processing (eess.AS)
[63]  arXiv:2004.08849 [pdf, other]
Title: The Attacker's Perspective on Automatic Speaker Verification: An Overview
Comments: 5 pages, 1 figure, Submitted to Interspeech 2020
Subjects: Audio and Speech Processing (eess.AS); Cryptography and Security (cs.CR)
[64]  arXiv:2004.09347 [pdf, other]
Title: End-to-End Whisper to Natural Speech Conversion using Modified Transformer Network
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[65]  arXiv:2004.09571 [pdf, other]
Title: Language-agnostic Multilingual Modeling
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD); Machine Learning (stat.ML)
[66]  arXiv:2004.09584 [pdf, other]
Title: ViSQOL v3: An Open Source Production Ready Objective Speech and Audio Metric
Comments: 2020 Twelfth International Conference on Quality of Multimedia Experience (QoMEX)
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD); Signal Processing (eess.SP)
[67]  arXiv:2004.09607 [pdf, other]
Title: Data Processing for Optimizing Naturalness of Vietnamese Text-to-speech System
Comments: 8 pages, 2 figures, submitted to Oriental COCOSDA
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[68]  arXiv:2004.10120 [pdf, other]
Title: Vector Quantized Contrastive Predictive Coding for Template-based Music Generation
Comments: 15 pages, 13 figures
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[69]  arXiv:2004.10246 [pdf, ps, other]
Title: Music Generation with Temporal Structure Augmentation
Authors: Shakeel Raja
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[70]  arXiv:2004.10391 [pdf, other]
Title: Towards Linking the Lakh and IMSLP Datasets
Authors: TJ Tsai
Comments: 5 pages, 4 figures, 1 table. Accepted paper at the International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2020
Subjects: Audio and Speech Processing (eess.AS); Multimedia (cs.MM); Sound (cs.SD); Image and Video Processing (eess.IV)
[71]  arXiv:2004.10799 [pdf, other]
Title: Towards a Competitive End-to-End Speech Recognition for CHiME-6 Dinner Party Transcription
Comments: Accepted by Interspeech 2020
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[72]  arXiv:2004.10823 [pdf, other]
Title: Utterance-level Sequential Modeling For Deep Gaussian Process Based Speech Synthesis Using Simple Recurrent Unit
Comments: 5 pages. Accepted by ICASSP2020
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[73]  arXiv:2004.11012 [pdf, other]
Title: ByteSing: A Chinese Singing Voice Synthesis System Using Duration Allocated Encoder-Decoder Acoustic Models and WaveRNN Vocoders
Comments: Accepted by ISCSLP2021
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[74]  arXiv:2004.11162 [pdf, other]
Title: Flexible framework for audio reconstruction
Journal-ref: 23rd International Conference on Digital Audio Effects (eDAFx2020)
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[75]  arXiv:2004.11284 [pdf, other]
Title: Unsupervised Speech Decomposition via Triple Information Bottleneck
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
