Sound

Authors and titles for recent submissions

[ total of 61 entries: 1-25 | 26-50 | 51-61 ]
[ showing 25 entries per page: fewer | more | all ]

Wed, 16 Jun 2021

[1]  arXiv:2106.08207 [pdf]
Title: Graph-based Label Propagation for Semi-Supervised Speaker Identification
Comments: To appear in Interspeech 2021
Subjects: Sound (cs.SD); Computation and Language (cs.CL); Machine Learning (cs.LG)
[2]  arXiv:2106.08004 [pdf, other]
Title: Adaptive Margin Circle Loss for Speaker Verification
Authors: Runqiu Xiao
Comments: Accepted by Interspeech 2021
Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[3]  arXiv:2106.07886 [pdf, other]
Title: MLP Singer: Towards Rapid Parallel Korean Singing Voice Synthesis
Subjects: Sound (cs.SD)
[4]  arXiv:2106.07874 [pdf]
Title: Towards the Objective Speech Assessment of Smoking Status based on Voice Features: A Review of the Literature
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[5]  arXiv:2106.07843 [pdf, other]
Title: Teacher-Student MixIT for Unsupervised and Semi-supervised Speech Separation
Comments: Accepted to Interspeech 2021
Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[6]  arXiv:2106.07787 [pdf, other]
Title: Tracing Back Music Emotion Predictions to Sound Sources and Intuitive Perceptual Qualities
Comments: Sound and Music Computing Conference 2021
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[7]  arXiv:2106.07732 [pdf, other]
Title: Learning Audio-Visual Dereverberation
Subjects: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[8]  arXiv:2106.08126 (cross-list from eess.AS) [pdf, other]
Title: Dialectal Speech Recognition and Translation of Swiss German Speech to Standard German Text: Microsoft's Submission to SwissText 2021
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[9]  arXiv:2106.07994 (cross-list from eess.AS) [pdf, other]
Title: Multi-channel Opus compression for far-field automatic speech recognition with a fixed bitrate budget
Comments: Accepted at Interspeech 2021
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[10]  arXiv:2106.07972 (cross-list from eess.AS) [pdf]
Title: SRIB Submission to Interspeech 2021 DiCOVA Challenge
Comments: 5 pages, 5 figures
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[11]  arXiv:2106.07939 (cross-list from eess.SP) [pdf, other]
Title: Attention-based distributed speech enhancement for unconstrained microphone arrays with varying number of nodes
Authors: Nicolas Furnon (MULTISPEECH), Romain Serizel (MULTISPEECH), Slim Essid (ADASP), Irina Illina (MULTISPEECH)
Journal-ref: European Signal Processing Conference (EUSIPCO), IEEE, Aug 2021, Dublin, Ireland
Subjects: Signal Processing (eess.SP); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[12]  arXiv:2106.07889 (cross-list from eess.AS) [pdf, other]
Title: UnivNet: A Neural Vocoder with Multi-Resolution Spectrogram Discriminators for High-Fidelity Waveform Generation
Comments: Accepted to INTERSPEECH 2021
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[13]  arXiv:2106.07868 (cross-list from cs.LG) [pdf, other]
Title: Voting for the right answer: Adversarial defense for speaker verification
Comments: Accepted by Interspeech 2021. Code is available at this https URL
Subjects: Machine Learning (cs.LG); Cryptography and Security (cs.CR); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[14]  arXiv:2106.07803 (cross-list from cs.LG) [pdf, other]
Title: SynthASR: Unlocking Synthetic Data for Speech Recognition
Comments: Accepted to Interspeech 2021
Subjects: Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[15]  arXiv:2106.07716 (cross-list from cs.CL) [pdf, ps, other]
Title: Overcoming Domain Mismatch in Low Resource Sequence-to-Sequence ASR Models using Hybrid Generated Pseudotranscripts
Comments: 5 pages
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[16]  arXiv:2106.07699 (cross-list from cs.CL) [pdf, ps, other]
Title: Using heterogeneity in semi-supervised transcription hypotheses to improve code-switched speech recognition
Comments: 5 pages
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)

Tue, 15 Jun 2021 (showing first 9 of 15 entries)

[17]  arXiv:2106.07577 [pdf, other]
Title: F-T-LSTM based Complex Network for Joint Acoustic Echo Cancellation and Speech Enhancement
Comments: Accepted by Interspeech 2021
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[18]  arXiv:2106.07448 [pdf]
Title: A Novel mapping for visual to auditory sensory substitution
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[19]  arXiv:2106.07431 [pdf, other]
Title: CRASH: Raw Audio Score-based Generative Modeling for Controllable High-resolution Drum Sound Synthesis
Comments: 12 pages, 11 figures
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[20]  arXiv:2106.07428 [pdf, ps, other]
Title: Audio Attacks and Defenses against AED Systems -- A Practical Study
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI)
[21]  arXiv:2106.07268 [pdf, other]
Title: FastICARL: Fast Incremental Classifier and Representation Learning with Efficient Budget Allocation in Audio Sensing Applications
Comments: Accepted for publication at INTERSPEECH 2021
Subjects: Sound (cs.SD); Machine Learning (cs.LG)
[22]  arXiv:2106.07157 [pdf, other]
Title: Multiple scattering ambisonics: three-dimensional sound field estimation using interacting spheres
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[23]  arXiv:2106.06969 [pdf, other]
Title: SoundDet: Polyphonic Sound Event Detection and Localization from Raw Waveform
Subjects: Sound (cs.SD); Machine Learning (cs.LG)
[24]  arXiv:2106.06909 [pdf, other]
Title: GigaSpeech: An Evolving, Multi-domain ASR Corpus with 10,000 Hours of Transcribed Audio
Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[25]  arXiv:2106.06863 [pdf]
Title: Continuous Wavelet Vocoder-based Decomposition of Parametric Speech Waveform Synthesis
Comments: 5 pages, 4 figures, accepted to the conference of Interspeech 2021
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)