Sound

Authors and titles for cs.SD in Jun 2022

[ total of 221 entries; entries 1-25 shown ]

[1] arXiv:2206.00208 [pdf, other]: Title: AdaVITS: Tiny VITS for Low Computing Resource Speaker Adaptation

Authors: Kun Song, Heyang Xue, Xinsheng Wang, Jian Cong, Yongmao Zhang, Lei Xie, Bing Yang, Xiong Zhang, Dan Su

Comments: Accepted by ISCSLP 2022

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[2] arXiv:2206.00393 [pdf, other]: Title: Towards Generalisable Audio Representations for Audio-Visual Navigation

Authors: Shunqi Mao, Chaoyi Zhang, Heng Wang, Weidong Cai

Comments: CVPR 2022 Embodied AI Workshop

Subjects: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Robotics (cs.RO); Audio and Speech Processing (eess.AS)
[3] arXiv:2206.00454 [pdf, other]: Title: Towards Context-Aware Neural Performance-Score Synchronisation

Authors: Ruchit Agrawal

Comments: PhD Thesis, Queen Mary University of London (190 pages)

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[4] arXiv:2206.00635 [pdf, other]: Title: Speech Artifact Removal from EEG Recordings of Spoken Word Production with Tensor Decomposition

Authors: Holy Lovenia, Hiroki Tanaka, Sakriani Sakti, Ayu Purwarianti, Satoshi Nakamura

Journal-ref: 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS)
[5] arXiv:2206.00901 [pdf, ps, other]: Title: Musical Instrument Recognition by XGBoost Combining Feature Fusion

Authors: Yijie Liu, Yanfang Yin, Qigang Zhu, Wenzhuo Cui

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[6] arXiv:2206.01071 [pdf, other]: Title: Partitura: A Python Package for Symbolic Music Processing

Authors: Carlos Cancino-Chacón, Silvan David Peter, Emmanouil Karystinaios, Francesco Foscarin, Maarten Grachten, Gerhard Widmer

Journal-ref: Proceedings of the Music Encoding Conference (MEC), 2022, Halifax, Canada

Subjects: Sound (cs.SD); Digital Libraries (cs.DL); Audio and Speech Processing (eess.AS)
[7] arXiv:2206.01104 [pdf, other]: Title: The match file format: Encoding Alignments between Scores and Performances

Authors: Francesco Foscarin, Emmanouil Karystinaios, Silvan David Peter, Carlos Cancino-Chacón, Maarten Grachten, Gerhard Widmer

Journal-ref: Proceedings of the Music Encoding Conference (MEC), 2022, Halifax, Canada

Subjects: Sound (cs.SD); Digital Libraries (cs.DL); Audio and Speech Processing (eess.AS)
[8] arXiv:2206.01305 [pdf, other]: Title: The Musical Arrow of Time -- The Role of Temporal Asymmetry in Music and Its Organicist Implications

Authors: Qi Xu

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[9] arXiv:2206.01542 [src]: Title: Detecting the Severity of Major Depressive Disorder from Speech: A Novel HARD-Training Methodology

Authors: Edward L. Campbell, Judith Dineley, Pauline Conde, Faith Matcham, Femke Lamers, Sara Siddi, Laura Docio-Fernandez, Carmen Garcia-Mateo, Nicholas Cummins, the RADAR-CNS Consortium

Comments: Error in Training Code

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Quantitative Methods (q-bio.QM)
[10] arXiv:2206.02211 [pdf, other]: Title: Variable-rate hierarchical CPC leads to acoustic unit discovery in speech

Authors: Santiago Cuervo, Adrian Łańcucki, Ricard Marxer, Paweł Rychlikowski, Jan Chorowski

Comments: Accepted to 36th Conference on Neural Information Processing Systems (NeurIPS 2022)

Journal-ref: Advances in Neural Information Processing Systems, 2022

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Audio and Speech Processing (eess.AS)
[11] arXiv:2206.02246 [pdf, other]: Title: Zero-Shot Voice Conditioning for Denoising Diffusion TTS Models

Authors: Alon Levkovitch, Eliya Nachmani, Lior Wolf

Comments: Accepted to Interspeech 2022

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Signal Processing (eess.SP)
[12] arXiv:2206.02284 [pdf, other]: Title: Tagged-MRI Sequence to Audio Synthesis via Self Residual Attention Guided Heterogeneous Translator

Authors: Xiaofeng Liu, Fangxu Xing, Jerry L. Prince, Jiachen Zhuo, Maureen Stone, Georges El Fakhri, Jonghye Woo

Comments: MICCAI 2022 (early accept, Oral Presentation ~3%)

Subjects: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[13] arXiv:2206.02671 [pdf, ps, other]: Title: Canonical Cortical Graph Neural Networks and its Application for Speech Enhancement in Audio-Visual Hearing Aids

Authors: Leandro A. Passos, João Paulo Papa, Amir Hussain, Ahsan Adeel

Subjects: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV); Audio and Speech Processing (eess.AS)
[14] arXiv:2206.03065 [pdf, other]: Title: Universal Speech Enhancement with Score-based Diffusion

Authors: Joan Serrà, Santiago Pascual, Jordi Pons, R. Oguz Araz, Davide Scaini

Comments: 24 pages, 6 figures; includes appendix; examples in this https URL

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[15] arXiv:2206.03351 [pdf, other]: Title: AS2T: Arbitrary Source-To-Target Adversarial Attack on Speaker Recognition Systems

Authors: Guangke Chen, Zhe Zhao, Fu Song, Sen Chen, Lingling Fan, Yang Liu

Subjects: Sound (cs.SD); Cryptography and Security (cs.CR); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[16] arXiv:2206.03393 [pdf, other]: Title: Towards Understanding and Mitigating Audio Adversarial Examples for Speaker Recognition

Authors: Guangke Chen, Zhe Zhao, Fu Song, Sen Chen, Lingling Fan, Feng Wang, Jiashui Wang

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[17] arXiv:2206.04006 [pdf, other]: Title: Few-Shot Audio-Visual Learning of Environment Acoustics

Authors: Sagnik Majumder, Changan Chen, Ziad Al-Halah, Kristen Grauman

Comments: Accepted to NeurIPS 2022

Subjects: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[18] arXiv:2206.04658 [pdf, other]: Title: BigVGAN: A Universal Neural Vocoder with Large-Scale Training

Authors: Sang-gil Lee, Wei Ping, Boris Ginsburg, Bryan Catanzaro, Sungroh Yoon

Comments: To appear at ICLR 2023. Listen to audio samples from BigVGAN at: this https URL

Subjects: Sound (cs.SD); Computation and Language (cs.CL); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[19] arXiv:2206.04769 [pdf, other]: Title: CLAP: Learning Audio Concepts From Natural Language Supervision

Authors: Benjamin Elizalde, Soham Deshmukh, Mahmoud Al Ismail, Huaming Wang

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[20] arXiv:2206.04780 [pdf, other]: Title: Speak Like a Dog: Human to Non-human creature Voice Conversion

Authors: Kohei Suzuki, Shoki Sakamoto, Tadahiro Taniguchi, Hirokazu Kameoka

Comments: 5 pages, 4 figures

Journal-ref: 2022 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) (pp. 1388-1393)

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[21] arXiv:2206.04805 [pdf, other]: Title: Motif Mining and Unsupervised Representation Learning for BirdCLEF 2022

Authors: Anthony Miyaguchi, Jiangyue Yu, Bryan Cheungvivatpant, Dakota Dudley, Aniketh Swain

Comments: Submitted to CEUR-WS under LifeCLEF for the BirdCLEF 2022 challenge as a working note

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[22] arXiv:2206.04962 [pdf, other]: Title: Feature Learning and Ensemble Pre-Tasks Based Self-Supervised Speech Denoising and Dereverberation

Authors: Yi Li, ShuangLin Li, Yang Sun, Syed Mohsen Naqvi

Comments: arXiv admin note: text overlap with arXiv:2112.11142

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[23] arXiv:2206.04984 [pdf, other]: Title: Zero-Shot Audio Classification using Image Embeddings

Authors: Duygu Dogan, Huang Xie, Toni Heittola, Tuomas Virtanen

Comments: Accepted to the European Signal Processing Conference (EUSIPCO) 2022

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[24] arXiv:2206.05018 [pdf, ps, other]: Title: Going Beyond the Cookie Theft Picture Test: Detecting Cognitive Impairments using Acoustic Features

Authors: Franziska Braun, Andreas Erzigkeit, Hartmut Lehfeld, Thomas Hillemacher, Korbinian Riedhammer, Sebastian P. Bayerl

Comments: Accepted at the 25th International Conference on Text, Speech and Dialogue (TSD 2022)

Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[25] arXiv:2206.05286 [src]: Title: AHD ConvNet for Speech Emotion Classification

Authors: Asfand Ali, Danial Nasir, Mohammad Hassan Jawad

Comments: Wrong authors quoted

Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)


