Sound

Authors and titles for recent submissions

[ total of 42 entries: 1-25 | 26-42 ]

Fri, 29 May 2020

[1]  arXiv:2005.14181 (cross-list from eess.AS) [pdf, other]
Title: Bayesian Restoration of Audio Degraded by Low-Frequency Pulses Modeled via Gaussian Process
Comments: 10 pages, 4 figures, 2 tables. Submitted to IEEE Journal of Selected Topics in Signal Processing - Special Issue "Reconstruction of audio from incomplete or highly degraded observations"
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD); Signal Processing (eess.SP); Applications (stat.AP); Machine Learning (stat.ML)
[2]  arXiv:2005.13981 (cross-list from eess.AS) [pdf]
Title: The INTERSPEECH 2020 Deep Noise Suppression Challenge: Datasets, Subjective Testing Framework, and Challenge Results
Comments: Interspeech 2020. arXiv admin note: substantial text overlap with arXiv:2001.08662
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[3]  arXiv:2005.13895 (cross-list from eess.AS) [pdf, other]
Title: When Can Self-Attention Be Replaced by Feed Forward Layers?
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[4]  arXiv:2005.13835 (cross-list from eess.AS) [pdf, other]
Title: Speech-to-Singing Conversion based on Boundary Equilibrium GAN
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[5]  arXiv:2005.13827 (cross-list from cs.CL) [pdf, other]
Title: Subword RNNLM Approximations for Out-Of-Vocabulary Keyword Search
Comments: INTERSPEECH 2019
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[6]  arXiv:2005.13770 (cross-list from eess.AS) [pdf, other]
Title: DeepSonar: Towards Effective and Robust Detection of AI-Synthesized Fake Voices
Subjects: Audio and Speech Processing (eess.AS); Cryptography and Security (cs.CR); Multimedia (cs.MM); Sound (cs.SD)
[7]  arXiv:2005.13769 (cross-list from eess.AS) [pdf, other]
Title: Unsupervised Audio Source Separation using Generative Priors
Comments: 5 pages, 2 figures
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD); Machine Learning (stat.ML)
[8]  arXiv:2005.13681 (cross-list from cs.CL) [pdf, other]
Title: Phone Features Improve Speech Translation
Comments: Accepted to ACL2020
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[9]  arXiv:2005.13616 (cross-list from eess.AS) [pdf, other]
Title: Modality Dropout for Improved Performance-driven Talking Faces
Comments: Pre-print
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)

Thu, 28 May 2020

[10]  arXiv:2005.13426 (cross-list from eess.SP) [pdf, other]
Title: Weighted Data Spaces for Correlation Based Array Imaging in Experimental Aeroacoustics
Comments: Preprint submitted to "Journal of Sound and Vibration"
Subjects: Signal Processing (eess.SP); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[11]  arXiv:2005.13402 (cross-list from cs.CV) [pdf, other]
Title: AVGZSLNet: Audio-Visual Generalized Zero-Shot Learning by Reconstructing Label Features from Multi-Modal Embeddings
Comments: Submitted to INTERSPEECH 2020
Subjects: Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[12]  arXiv:2005.13326 (cross-list from eess.AS) [pdf, other]
Title: CAT: A CTC-CRF based ASR Toolkit Bridging the Hybrid and the End-to-end Approaches towards Data Efficiency and Low Latency
Comments: 5 pages. arXiv admin note: text overlap with arXiv:1911.08747
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[13]  arXiv:2005.13300 (cross-list from cs.LG) [pdf, ps, other]
Title: Fast and Effective Robustness Certification for Recurrent Neural Networks
Subjects: Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[14]  arXiv:2005.13291 (cross-list from eess.AS) [pdf, other]
Title: Earballs: Neural Transmodal Translation
Comments: 9 pages, 3 figures
Subjects: Audio and Speech Processing (eess.AS); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[15]  arXiv:2005.13211 (cross-list from eess.AS) [pdf, other]
Title: Insertion-Based Modeling for End-to-End Automatic Speech Recognition
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[16]  arXiv:2005.13163 (cross-list from eess.AS) [pdf, other]
Title: Semi-supervised source localization with deep generative modeling
Comments: Submitted to IEEE International Workshop on Machine Learning for Signal Processing
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Signal Processing (eess.SP)
[17]  arXiv:2005.13146 (cross-list from eess.AS) [pdf, other]
Title: ACGAN-based Data Augmentation Integrated with Long-term Scalogram for Acoustic Scene Classification
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[18]  arXiv:2005.12977 (cross-list from cs.IR) [pdf, other]
Title: Learning to rank music tracks using triplet loss
Subjects: Information Retrieval (cs.IR); Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[19]  arXiv:2005.12963 (cross-list from eess.AS) [pdf, ps, other]
Title: Adversarial Contrastive Predictive Coding for Unsupervised Learning of Disentangled Representations
Comments: Submitted to Interspeech 2020
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[20]  arXiv:2005.12962 (cross-list from eess.AS) [pdf, other]
Title: A comparison of Vietnamese Statistical Parametric Speech Synthesis Systems
Comments: 9 pages, submitted to KSE 2020
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)

Wed, 27 May 2020

[21]  arXiv:2005.12779 [pdf, ps, other]
Title: Sound Context Classification Basing on Join Learning Model and Multi-Spectrogram Features
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[22]  arXiv:2005.12412 [pdf, other]
Title: InfantNet: A Deep Neural Network for Analyzing Infant Vocalizations
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[23]  arXiv:2005.12683 (cross-list from eess.AS) [pdf, other]
Title: Exploring Optimal DNN Architecture for End-to-End Beamformers Based on Time-frequency References
Comments: Submitted to Interspeech2020. arXiv admin note: substantial text overlap with arXiv:1910.14262
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[24]  arXiv:2005.12531 (cross-list from eess.AS) [pdf, other]
Title: Noise Robust TTS for Low Resource Speakers using Pre-trained Model and Speech Enhancement
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[25]  arXiv:2005.12368 (cross-list from cs.CL) [pdf, other]
Title: FT Speech: Danish Parliament Speech Corpus
Comments: Submitted to Interspeech 2020
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
