Audio and Speech Processing

Authors and titles for recent submissions

[ total of 18 entries: 1-18 ]

Fri, 24 Jan 2020

[1]  arXiv:2001.08444 [pdf, other]
Title: On the human evaluation of audio adversarial examples
Comments: Preprint. 17 pages, 7 figures, 4 tables
Subjects: Audio and Speech Processing (eess.AS); Human-Computer Interaction (cs.HC); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[2]  arXiv:2001.08378 [pdf, ps, other]
Title: Improving speaker discrimination of target speech extraction with time-domain SpeakerBeam
Comments: 5 pages, 3 figures. Submitted to ICASSP 2020
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[3]  arXiv:2001.08290 [pdf, other]
Title: Transformer-based Online CTC/attention End-to-End Speech Recognition Architecture
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Sound (cs.SD); Machine Learning (stat.ML)
[4]  arXiv:2001.08702 (cross-list from cs.CV) [pdf, other]
Title: Lipreading using Temporal Convolutional Networks
Subjects: Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[5]  arXiv:2001.08662 (cross-list from cs.SD) [pdf]
Title: The INTERSPEECH 2020 Deep Noise Suppression Challenge: Datasets, Subjective Speech Quality and Testing Framework
Comments: Details about Deep Noise Suppression Challenge
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)

Thu, 23 Jan 2020

[6]  arXiv:2001.07849 [pdf, other]
Title: Unsupervised Representation Disentanglement using Cross Domain Features and Adversarial Learning in Variational Autoencoder based Voice Conversion
Comments: Accepted to IEEE Transactions on Emerging Topics in Computational Intelligence
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[7]  arXiv:2001.08163 (cross-list from cs.NI) [pdf, other]
Title: A Power-Efficient Audio Acquisition System for Smart City Applications
Subjects: Networking and Internet Architecture (cs.NI); Sound (cs.SD); Social and Information Networks (cs.SI); Audio and Speech Processing (eess.AS); Signal Processing (eess.SP)
[8]  arXiv:2001.07874 (cross-list from cs.SD) [pdf]
Title: Non-Negative Matrix Factorization-Convolutional Neural Network (NMF-CNN) For Sound Event Detection
Comments: 5 pages, 1 figure, 2 tables
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)

Wed, 22 Jan 2020

[9]  arXiv:2001.07263 [pdf, other]
Title: Single headed attention based sequence-to-sequence model for state-of-the-art results on Switchboard-300
Comments: 5 pages, 2 figures
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL)
[10]  arXiv:2001.07067 [pdf, other]
Title: Interpretable Filter Learning Using Soft Self-attention For Raw Waveform Speech Recognition
Subjects: Audio and Speech Processing (eess.AS)
[11]  arXiv:2001.07034 [pdf, other]
Title: Pairwise Discriminative Neural PLDA for Speaker Verification
Comments: Submitted to IEEE International Conference on Acoustics, Speech, and Signal Processing 2020 (ICASSP 2020). Link to GitHub Repository: this https URL
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Signal Processing (eess.SP)
[12]  arXiv:2001.07044 (cross-list from cs.SD) [pdf, ps, other]
Title: JVS-MuSiC: Japanese multispeaker singing-voice corpus
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[13]  arXiv:2001.06785 (cross-list from cs.CL) [pdf, other]
Title: From Speech-to-Speech Translation to Automatic Dubbing
Comments: 5 pages, 4 figures
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)

Mon, 20 Jan 2020

[14]  arXiv:2001.06397 (cross-list from cs.SD) [pdf, other]
Title: Supervised Speaker Embedding De-Mixing in Two-Speaker Environment
Comments: Submitted to Odyssey 2020
Subjects: Sound (cs.SD); Computation and Language (cs.CL); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[15]  arXiv:2001.06086 (cross-list from cs.AI) [pdf, other]
Title: A Critical Look at the Applicability of Markov Logic Networks for Music Signal Analysis
Comments: Accepted for presentation at the Ninth International Workshop on Statistical Relational AI (StarAI 2020) at the 34th AAAI Conference on Artificial Intelligence (AAAI) in New York, on February 7th 2020
Subjects: Artificial Intelligence (cs.AI); Information Retrieval (cs.IR); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)

Fri, 17 Jan 2020

[16]  arXiv:2001.05908 (cross-list from cs.CL) [pdf]
Title: Speech Emotion Recognition Based on Multi-feature and Multi-lingual Fusion
Authors: Chunyi Wang
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[17]  arXiv:2001.05685 (cross-list from cs.SD) [pdf, other]
Title: SqueezeWave: Extremely Lightweight Vocoders for On-device Speech Synthesis
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[18]  arXiv:2001.05532 (cross-list from cs.LG) [pdf, other]
Title: Improving GANs for Speech Enhancement
Subjects: Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)