Sound

Authors and titles for recent submissions, skipping first 32

[ total of 41 entries; showing 33-41 ]

Tue, 30 Apr 2024 (continued, showing last 9 of 12 entries)

[33] arXiv:2404.18094 [pdf, other]: Title: USAT: A Universal Speaker-Adaptive Text-to-Speech Approach

Authors: Wenbin Wang, Yang Song, Sanjay Jha

Comments: 15 pages, 13 figures. Copyright has been transferred to IEEE

Journal-ref: IEEE/ACM Transactions on Audio, Speech and Language Processing, 2024

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)

[34] arXiv:2404.18081 [pdf, other]: Title: ComposerX: Multi-Agent Symbolic Music Composition with LLMs

Authors: Qixin Deng, Qikai Yang, Ruibin Yuan, Yipeng Huang, Yi Wang, Xubo Liu, Zeyue Tian, Jiahao Pan, Ge Zhang, Hanfeng Lin, Yizhi Li, Yinghao Ma, Jie Fu, Chenghua Lin, Emmanouil Benetos, Wenwu Wang, Guangyu Xia, Wei Xue, Yike Guo

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)

[35] arXiv:2404.18002 [pdf, other]: Title: Towards Privacy-Preserving Audio Classification Systems

Authors: Bhawana Chhaglani, Jeremy Gummeson, Prashant Shenoy

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)

[36] arXiv:2404.17983 [pdf, other]: Title: TI-ASU: Toward Robust Automatic Speech Understanding through Text-to-speech Imputation Against Missing Speech Modality

Authors: Tiantian Feng, Xuan Shi, Rahul Gupta, Shrikanth S. Narayanan

Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)

[37] arXiv:2404.17821 [pdf, ps, other]: Title: An automatic mixing speech enhancement system for multi-track audio

Authors: Xiaojing Liu, Angeliki Mourgela, Hongwei Ai, Joshua D. Reiss

Comments: 5 pages

Subjects: Sound (cs.SD); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)

[38] arXiv:2404.17806 [pdf, other]: Title: T-CLAP: Temporal-Enhanced Contrastive Language-Audio Pretraining

Authors: Yi Yuan, Zhuo Chen, Xubo Liu, Haohe Liu, Xuenan Xu, Dongya Jia, Yuanzhe Chen, Mark D. Plumbley, Wenwu Wang

Comments: Preprint submitted to IEEE MLSP 2024

Subjects: Sound (cs.SD); Computation and Language (cs.CL); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)

[39] arXiv:2404.17721 [pdf, ps, other]: Title: An RFP dataset for Real, Fake, and Partially fake audio detection

Authors: Abdulazeez AlAli, George Theodorakopoulos

Subjects: Sound (cs.SD); Cryptography and Security (cs.CR); Audio and Speech Processing (eess.AS)

[40] arXiv:2404.17608 [pdf, ps, other]: Title: Synthesizing Audio from Silent Video using Sequence to Sequence Modeling

Authors: Hugo Garrido-Lestache Belinchon, Helina Mulugeta, Adam Haile

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)

[41] arXiv:2404.17968 (cross-list from cs.CL) [pdf, other]: Title: Usefulness of Emotional Prosody in Neural Machine Translation

Authors: Charles Brazier, Jean-Luc Rouas

Comments: 5 pages, In Proceedings of the 11th International Conference on Speech Prosody (SP), Leiden, The Netherlands, 2024

Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
