Sound

Authors and titles for cs.SD in Oct 2021, skipping first 25

[ total of 324 entries: 1-25 | 26-50 | 51-75 | 76-100 | 101-125 | ... | 301-324 ]

[26] arXiv:2110.03414 [pdf, other]: Title: SERAB: A multi-lingual benchmark for speech emotion recognition

Authors: Neil Scheidwasser-Clow, Mikolaj Kegler, Pierre Beckmann, Milos Cernak

Comments: Submitted to ICASSP 2022

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[27] arXiv:2110.03536 [pdf, other]: Title: Prototype Learning for Interpretable Respiratory Sound Analysis

Authors: Zhao Ren, Thanh Tam Nguyen, Wolfgang Nejdl

Comments: Technical report of the paper accepted by IEEE ICASSP 2022

Subjects: Sound (cs.SD)
[28] arXiv:2110.03744 [pdf, other]: Title: Voice Reenactment with F0 and timing constraints and adversarial learning of conversions

Authors: Frederik Bous, Laurent Benaroya, Nicolas Obin, Axel Roebel

Comments: arXiv admin note: text overlap with arXiv:2107.12346

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[29] arXiv:2110.03771 [pdf, other]: Title: Wake-Cough: cough spotting and cougher identification for personalised long-term cough monitoring

Authors: Madhurananda Pahar, Marisa Klopper, Byron Reeve, Rob Warren, Grant Theron, Andreas Diacon, Thomas Niesler

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[30] arXiv:2110.04057 [pdf, other]: Title: FAST-RIR: Fast neural diffuse room impulse response generator

Authors: Anton Ratnarajah, Shi-Xiong Zhang, Meng Yu, Zhenyu Tang, Dinesh Manocha, Dong Yu

Comments: Accepted to ICASSP 2022. More results and source code is available at this https URL

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[31] arXiv:2110.04091 [pdf, other]: Title: Affective Burst Detection from Speech using Kernel-fusion Dilated Convolutional Neural Networks

Authors: Berkay Kopru, Engin Erzin

Subjects: Sound (cs.SD); Human-Computer Interaction (cs.HC); Audio and Speech Processing (eess.AS)
[32] arXiv:2110.04284 [pdf, other]: Title: Auto-DSP: Learning to Optimize Acoustic Echo Cancellers

Authors: Jonah Casebeer, Nicholas J. Bryan, Paris Smaragdis

Comments: Accepted to the 2021 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA). Source code and audio examples: this https URL

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[33] arXiv:2110.04438 [pdf, other]: Title: Towards Lightweight Applications: Asymmetric Enroll-Verify Structure for Speaker Verification

Authors: Qingjian Lin, Lin Yang, Xuyang Wang, Xiaoyi Qin, Junjie Wang, Ming Li

Comments: Accepted by ICASSP 2022

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[34] arXiv:2110.04451 [pdf, other]: Title: Using multiple reference audios and style embedding constraints for speech synthesis

Authors: Cheng Gong, Longbiao Wang, Zhenhua Ling, Ju Zhang, Jianwu Dang

Comments: 5 pages,3 figures submitted to ICASSP2022

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS)
[35] arXiv:2110.04474 [pdf, other]: Title: A Mutual learning framework for Few-shot Sound Event Detection

Authors: Dongchao Yang, Helin Wang, Yuexian Zou, Zhongjie Ye, Wenwu Wang

Comments: Accepted by ICASSP2022. arXiv admin note: text overlap with arXiv:2106.12252 by other authors

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[36] arXiv:2110.04486 [pdf, other]: Title: PAMA-TTS: Progression-Aware Monotonic Attention for Stable Seq2Seq TTS With Accurate Phoneme Duration Control

Authors: Yunchao He, Jian Luan, Yujun Wang

Comments: Accepted by ICASSP 2022. 5 pages, 4 figures, 3 tables. Audio samples are available at: this https URL

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[37] arXiv:2110.04621 [pdf, other]: Title: Universal Paralinguistic Speech Representations Using Self-Supervised Conformers

Authors: Joel Shor, Aren Jansen, Wei Han, Daniel Park, Yu Zhang

Journal-ref: ICASSP 2022-2022 IEEE

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[38] arXiv:2110.04656 [pdf, other]: Title: Streaming on-device detection of device directed speech from voice and touch-based invocation

Authors: Ognjen Rudovic, Akanksha Bindal, Vineet Garg, Pramod Simha, Pranay Dighe, Sachin Kajarekar

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[39] arXiv:2110.04678 [pdf, other]: Title: An Overview of Techniques for Biomarker Discovery in Voice Signal

Authors: Rita Singh, Ankit Shah, Hira Dhamyal

Comments: Last two authors contributed equally to the paper

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS)
[40] arXiv:2110.04684 [pdf, other]: Title: Can Audio Captions Be Evaluated with Image Caption Metrics?

Authors: Zelin Zhou, Zhiling Zhang, Xuenan Xu, Zeyu Xie, Mengyue Wu, Kenny Q. Zhu

Comments: ICASSP 2022

Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[41] arXiv:2110.04754 [pdf, other]: Title: Towards High-fidelity Singing Voice Conversion with Acoustic Reference and Contrastive Predictive Coding

Authors: Chao Wang, Zhonghao Li, Benlai Tang, Xiang Yin, Yuan Wan, Yibiao Yu, Zejun Ma

Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[42] arXiv:2110.04765 [pdf, other]: Title: Multi-task Learning with Metadata for Music Mood Classification

Authors: Rajnish Kumar, Manjeet Dahiya

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[43] arXiv:2110.04946 [pdf, other]: Title: LaughNet: synthesizing laughter utterances from waveform silhouettes and a single laughter example

Authors: Hieu-Thi Luong, Junichi Yamagishi

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[44] arXiv:2110.04972 [pdf, ps, other]: Title: Kernel Learning For Sound Field Estimation With L1 and L2 Regularizations

Authors: Ryosuke Horiuchi, Shoichi Koyama, Juliano G. C. Ribeiro, Natsuki Ueno, Hiroshi Saruwatari

Comments: Accepted to IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) 2021

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[45] arXiv:2110.05020 [pdf, other]: Title: MELONS: generating melody with long-term structure using transformers and structure graph

Authors: Yi Zou, Pei Zou, Yi Zhao, Kaixiang Zhang, Ran Zhang, Xiaorui Wang

Subjects: Sound (cs.SD); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[46] arXiv:2110.05033 [pdf, other]: Title: Pitch Preservation In Singing Voice Synthesis

Authors: Shujun Liu, Hai Zhu, Kun Wang, Huajun Wang

Comments: 5 pages, 3 figures

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS)
[47] arXiv:2110.05042 [pdf, other]: Title: Multi-query multi-head attention pooling and Inter-topK penalty for speaker verification

Authors: Miao Zhao, Yufeng Ma, Yiwei Ding, Yu Zheng, Min Liu, Minqiang Xu

Comments: submitted to ICASSP 2022

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[48] arXiv:2110.05054 [pdf, other]: Title: Source Mixing and Separation Robust Audio Steganography

Authors: Naoya Takahashi, Mayank Kumar Singh, Yuki Mitsufuji

Comments: Accepted to ICASSP 2022

Subjects: Sound (cs.SD); Cryptography and Security (cs.CR); Audio and Speech Processing (eess.AS)
[49] arXiv:2110.05059 [pdf, other]: Title: Amicable examples for informed source separation

Authors: Naoya Takahashi, Yuki Mitsufuji

Comments: Accepted to ICASSP 2022

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[50] arXiv:2110.05069 [pdf, other]: Title: Efficient Training of Audio Transformers with Patchout

Authors: Khaled Koutini, Jan Schlüter, Hamid Eghbal-zadeh, Gerhard Widmer

Comments: Submitted to Interspeech 2022. Source code: this https URL

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
