Sound

Authors and titles for cs.SD in Oct 2021, skipping first 50

[ total of 324 entries, showing entries 51-75 ]

[51] arXiv:2110.05087 [pdf, ps, other]: Title: A Multi-Resolution Front-End for End-to-End Speech Anti-Spoofing

Authors: Wei Liu, Meng Sun, Xiongwei Zhang, Hugo Van hamme, Thomas Fang Zheng

Comments: submitted to ICASSP 2022

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[52] arXiv:2110.05580 [pdf, other]: Title: vocadito: A dataset of solo vocals with $f_0$, note, and lyric annotations

Authors: Rachel M. Bittner, Katherine Pasalo, Juan José Bosch, Gabriel Meseguer-Brocal, David Rubinstein

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[53] arXiv:2110.05587 [pdf, other]: Title: Evaluation of Latent Space Disentanglement in the Presence of Interdependent Attributes

Authors: Karn N. Watcharasupat, Alexander Lerch

Comments: Submitted to the Late-Breaking Demo Session of the 22nd International Society for Music Information Retrieval Conference

Subjects: Sound (cs.SD); Information Retrieval (cs.IR); Information Theory (cs.IT); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[54] arXiv:2110.05713 [pdf, other]: Title: Foster Strengths and Circumvent Weaknesses: a Speech Enhancement Framework with Two-branch Collaborative Learning

Authors: Wenxin Tai, Jiajia Li, Yixiang Wang, Tian Lan, Qiao Liu

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[55] arXiv:2110.05765 [pdf, other]: Title: Music Sentiment Transfer

Authors: Miles Sigel, Michael Zhou, Jiebo Luo

Comments: NSF REU: Computational Methods for Understanding Music, Media, and Minds, University of Rochester

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[56] arXiv:2110.05777 [pdf, other]: Title: Large-scale Self-Supervised Speech Representation Learning for Automatic Speaker Verification

Authors: Zhengyang Chen, Sanyuan Chen, Yu Wu, Yao Qian, Chengyi Wang, Shujie Liu, Yanmin Qian, Michael Zeng

Comments: Accepted by ICASSP 2022

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[57] arXiv:2110.05798 [pdf, other]: Title: Adapting TTS models For New Speakers using Transfer Learning

Authors: Paarth Neekhara, Jason Li, Boris Ginsburg

Comments: Submitted to Interspeech 2022

Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[58] arXiv:2110.05866 [pdf, ps, other]: Title: MetricGAN-U: Unsupervised speech enhancement/ dereverberation based only on noisy/ reverberated speech

Authors: Szu-Wei Fu, Cheng Yu, Kuo-Hsuan Hung, Mirco Ravanelli, Yu Tsao

Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[59] arXiv:2110.05966 [pdf, other]: Title: Multi-channel Narrow-band Deep Speech Separation with Full-band Permutation Invariant Training

Authors: Changsheng Quan, Xiaofei Li

Comments: accepted by ICASSP 2022

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[60] arXiv:2110.05975 [pdf, other]: Title: Multi-Channel Far-Field Speaker Verification with Large-Scale Ad-hoc Microphone Arrays

Authors: Chengdong Liang, Yijiang Chen, Jiadi Yao, Xiao-Lei Zhang

Comments: 5 pages, 3 figures

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[61] arXiv:2110.06100 [pdf, other]: Title: Improving the Performance of Automated Audio Captioning via Integrating the Acoustic and Semantic Information

Authors: Zhongjie Ye, Helin Wang, Dongchao Yang, Yuexian Zou

Comments: 5 pages, 1 figure, accepted by DCASE 2021 workshop

Subjects: Sound (cs.SD); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[62] arXiv:2110.06123 [pdf, other]: Title: COVID-19 Diagnosis from Cough Acoustics using ConvNets and Data Augmentation

Authors: Saranga Kingkor Mahanta, Darsh Kaushik, Shubham Jain, Hoang Van Truong, Koushik Guha

Comments: DiCOVA, top 1st, This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[63] arXiv:2110.06280 [pdf, other]: Title: S3PRL-VC: Open-source Voice Conversion Framework with Self-supervised Speech Representations

Authors: Wen-Chin Huang, Shu-Wen Yang, Tomoki Hayashi, Hung-Yi Lee, Shinji Watanabe, Tomoki Toda

Comments: Submitted to ICASSP 2022. Code available at: this https URL

Subjects: Sound (cs.SD); Computation and Language (cs.CL); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[64] arXiv:2110.06323 [pdf, other]: Title: An Annihilating Filter-Based DOA Estimation for Uniform Linear Array

Authors: Son Phan, Lam Pham

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[65] arXiv:2110.06371 [pdf, other]: Title: Algorithmic Composition by Autonomous Systems with Multiple Time-Scales

Authors: Risto Holopainen

Comments: 28 pages, 3 figures. Submitted to Divergence Press

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS); Adaptation and Self-Organizing Systems (nlin.AO)
[66] arXiv:2110.06467 [pdf, other]: Title: Dual-branch Attention-In-Attention Transformer for single-channel speech enhancement

Authors: Guochen Yu, Andong Li, Chengshi Zheng, Yinuo Guo, Yutian Wang, Hui Wang

Comments: Accepted by ICASSP 2022

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS)
[67] arXiv:2110.06494 [pdf, other]: Title: Music Source Separation with Deep Equilibrium Models

Authors: Yuichiro Koyama, Naoki Murata, Stefan Uhlich, Giorgio Fabbro, Shusuke Takahashi, Yuki Mitsufuji

Comments: 5 pages, 4 figures, accepted for publication in IEEE ICASSP 2022

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[68] arXiv:2110.06501 [pdf, other]: Title: Spatial Data Augmentation with Simulated Room Impulse Responses for Sound Event Localization and Detection

Authors: Yuichiro Koyama, Kazuhide Shigemi, Masafumi Takahashi, Kazuki Shimada, Naoya Takahashi, Emiru Tsunoo, Shusuke Takahashi, Yuki Mitsufuji

Comments: 5 pages, 2 figures, accepted for publication in IEEE ICASSP 2022

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[69] arXiv:2110.06525 [pdf, other]: Title: Automatic DJ Transitions with Differentiable Audio Effects and Generative Adversarial Networks

Authors: Bo-Yu Chen, Wei-Han Hsu, Wei-Hsiang Liao, Marco A. Martínez Ramírez, Yuki Mitsufuji, Yi-Hsuan Yang

Comments: To be published at ICASSP 2022

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[70] arXiv:2110.06534 [pdf, other]: Title: Simple Attention Module based Speaker Verification with Iterative noisy label detection

Authors: Xiaoyi Qin, Na Li, Chao Weng, Dan Su, Ming Li

Comments: submitted to ICASSP2022

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[71] arXiv:2110.06543 [pdf, ps, other]: Title: EIHW-MTG DiCOVA 2021 Challenge System Report

Authors: Adria Mallol-Ragolta, Helena Cuesta, Emilia Gómez, Björn W. Schuller

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[72] arXiv:2110.06565 [pdf, other]: Title: Duality Temporal-channel-frequency Attention Enhanced Speaker Representation Learning

Authors: Li Zhang, Qing Wang, Lei Xie

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[73] arXiv:2110.06634 [pdf, other]: Title: End-to-end translation of human neural activity to speech with a dual-dual generative adversarial network

Authors: Yina Guo, Xiaofei Zhang, Zhenying Gong, Anhong Wang, Wenwu Wang

Comments: 12 pages, 13 figures

Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS); Neurons and Cognition (q-bio.NC)
[74] arXiv:2110.06707 [pdf, other]: Title: Singer separation for karaoke content generation

Authors: Hsuan-Yu Chen, Xuanjun Chen, Jyh-Shing Roger Jang

Comments: Submitted to ICASSP 2022

Subjects: Sound (cs.SD); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[75] arXiv:2110.06999 [pdf, other]: Title: Study of positional encoding approaches for Audio Spectrogram Transformers

Authors: Leonardo Pepino, Pablo Riera, Luciana Ferrer

Comments: Submitted to ICASSP 2022. 5 pages, 3 figures

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)


