Sound

Authors and titles for cs.SD in Oct 2021, skipping first 50

[ total of 324 entries: 1-25 | 26-50 | 51-75 | 76-100 | 101-125 | 126-150 | ... | 301-324 ]
[51]  arXiv:2110.05087 [pdf, ps, other]
Title: A Multi-Resolution Front-End for End-to-End Speech Anti-Spoofing
Comments: submitted to ICASSP 2022
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[52]  arXiv:2110.05580 [pdf, other]
Title: vocadito: A dataset of solo vocals with $f_0$, note, and lyric annotations
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[53]  arXiv:2110.05587 [pdf, other]
Title: Evaluation of Latent Space Disentanglement in the Presence of Interdependent Attributes
Comments: Submitted to the Late-Breaking Demo Session of the 22nd International Society for Music Information Retrieval Conference
Subjects: Sound (cs.SD); Information Retrieval (cs.IR); Information Theory (cs.IT); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[54]  arXiv:2110.05713 [pdf, other]
Title: Foster Strengths and Circumvent Weaknesses: a Speech Enhancement Framework with Two-branch Collaborative Learning
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[55]  arXiv:2110.05765 [pdf, other]
Title: Music Sentiment Transfer
Comments: NSF REU: Computational Methods for Understanding Music, Media, and Minds, University of Rochester
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[56]  arXiv:2110.05777 [pdf, other]
Title: Large-scale Self-Supervised Speech Representation Learning for Automatic Speaker Verification
Comments: Accepted by ICASSP 2022
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[57]  arXiv:2110.05798 [pdf, other]
Title: Adapting TTS models For New Speakers using Transfer Learning
Comments: Submitted to Interspeech 2022
Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[58]  arXiv:2110.05866 [pdf, ps, other]
Title: MetricGAN-U: Unsupervised speech enhancement/ dereverberation based only on noisy/ reverberated speech
Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[59]  arXiv:2110.05966 [pdf, other]
Title: Multi-channel Narrow-band Deep Speech Separation with Full-band Permutation Invariant Training
Comments: accepted by ICASSP 2022
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[60]  arXiv:2110.05975 [pdf, other]
Title: Multi-Channel Far-Field Speaker Verification with Large-Scale Ad-hoc Microphone Arrays
Comments: 5 pages, 3 figures
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[61]  arXiv:2110.06100 [pdf, other]
Title: Improving the Performance of Automated Audio Captioning via Integrating the Acoustic and Semantic Information
Comments: 5 pages, 1 figure, accepted by DCASE 2021 workshop
Subjects: Sound (cs.SD); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[62]  arXiv:2110.06123 [pdf, other]
Title: COVID-19 Diagnosis from Cough Acoustics using ConvNets and Data Augmentation
Comments: DiCOVA, top 1st, This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[63]  arXiv:2110.06280 [pdf, other]
Title: S3PRL-VC: Open-source Voice Conversion Framework with Self-supervised Speech Representations
Comments: Submitted to ICASSP 2022. Code available at: this https URL
Subjects: Sound (cs.SD); Computation and Language (cs.CL); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[64]  arXiv:2110.06323 [pdf, other]
Title: An Annihilating Filter-Based DOA Estimation for Uniform Linear Array
Authors: Son Phan, Lam Pham
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[65]  arXiv:2110.06371 [pdf, other]
Title: Algorithmic Composition by Autonomous Systems with Multiple Time-Scales
Authors: Risto Holopainen
Comments: 28 pages, 3 figures. Submitted to Divergence Press
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS); Adaptation and Self-Organizing Systems (nlin.AO)
[66]  arXiv:2110.06467 [pdf, other]
Title: Dual-branch Attention-In-Attention Transformer for single-channel speech enhancement
Comments: Accepted by ICASSP 2022
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS)
[67]  arXiv:2110.06494 [pdf, other]
Title: Music Source Separation with Deep Equilibrium Models
Comments: 5 pages, 4 figures, accepted for publication in IEEE ICASSP 2022
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[68]  arXiv:2110.06501 [pdf, other]
Title: Spatial Data Augmentation with Simulated Room Impulse Responses for Sound Event Localization and Detection
Comments: 5 pages, 2 figures, accepted for publication in IEEE ICASSP 2022
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[69]  arXiv:2110.06525 [pdf, other]
Title: Automatic DJ Transitions with Differentiable Audio Effects and Generative Adversarial Networks
Comments: To be published at ICASSP 2022
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[70]  arXiv:2110.06534 [pdf, other]
Title: Simple Attention Module based Speaker Verification with Iterative noisy label detection
Comments: submitted to ICASSP2022
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[71]  arXiv:2110.06543 [pdf, ps, other]
Title: EIHW-MTG DiCOVA 2021 Challenge System Report
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[72]  arXiv:2110.06565 [pdf, other]
Title: Duality Temporal-channel-frequency Attention Enhanced Speaker Representation Learning
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[73]  arXiv:2110.06634 [pdf, other]
Title: End-to-end translation of human neural activity to speech with a dual-dual generative adversarial network
Comments: 12 pages, 13 figures
Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS); Neurons and Cognition (q-bio.NC)
[74]  arXiv:2110.06707 [pdf, other]
Title: Singer separation for karaoke content generation
Comments: Submitted to ICASSP 2022
Subjects: Sound (cs.SD); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[75]  arXiv:2110.06999 [pdf, other]
Title: Study of positional encoding approaches for Audio Spectrogram Transformers
Comments: Submitted to ICASSP 2022. 5 pages, 3 figures
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)