Sound

Authors and titles for recent submissions, skipping first 32

[ total of 41 entries: 1-10 | 3-12 | 13-22 | 23-32 | 33-41 ]

Tue, 30 Apr 2024 (continued, showing last 9 of 12 entries)

[33]  arXiv:2404.18094 [pdf, other]
Title: USAT: A Universal Speaker-Adaptive Text-to-Speech Approach
Comments: 15 pages, 13 figures. Copyright has been transferred to IEEE
Journal-ref: IEEE/ACM Transactions on Audio, Speech and Language Processing, 2024
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[34]  arXiv:2404.18081 [pdf, other]
Title: ComposerX: Multi-Agent Symbolic Music Composition with LLMs
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[35]  arXiv:2404.18002 [pdf, other]
Title: Towards Privacy-Preserving Audio Classification Systems
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[36]  arXiv:2404.17983 [pdf, other]
Title: TI-ASU: Toward Robust Automatic Speech Understanding through Text-to-speech Imputation Against Missing Speech Modality
Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[37]  arXiv:2404.17821 [pdf, ps, other]
Title: An automatic mixing speech enhancement system for multi-track audio
Comments: 5 pages
Subjects: Sound (cs.SD); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[38]  arXiv:2404.17806 [pdf, other]
Title: T-CLAP: Temporal-Enhanced Contrastive Language-Audio Pretraining
Comments: Preprint submitted to IEEE MLSP 2024
Subjects: Sound (cs.SD); Computation and Language (cs.CL); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[39]  arXiv:2404.17721 [pdf, ps, other]
Title: An RFP dataset for Real, Fake, and Partially fake audio detection
Subjects: Sound (cs.SD); Cryptography and Security (cs.CR); Audio and Speech Processing (eess.AS)
[40]  arXiv:2404.17608 [pdf, ps, other]
Title: Synthesizing Audio from Silent Video using Sequence to Sequence Modeling
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[41]  arXiv:2404.17968 (cross-list from cs.CL) [pdf, other]
Title: Usefulness of Emotional Prosody in Neural Machine Translation
Comments: 5 pages, In Proceedings of the 11th International Conference on Speech Prosody (SP), Leiden, The Netherlands, 2024
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)