Sound

Authors and titles for recent submissions, skipping first 13

[ total of 46 entries: 1-10 | 4-13 | 14-23 | 24-33 | 34-43 | 44-46 ]

Wed, 24 Apr 2024 (continued, showing last 2 of 15 entries)

[14]  arXiv:2404.14700 (cross-list from eess.AS) [pdf, other]
Title: FlashSpeech: Efficient Zero-Shot Speech Synthesis
Comments: Efficient zero-shot speech synthesis
Subjects: Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[15]  arXiv:2404.14564 (cross-list from eess.AS) [pdf, other]
Title: Exploring the Potential of Data-Driven Spatial Audio Enhancement Using a Single-Channel Model
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)

Tue, 23 Apr 2024 (showing first 8 of 14 entries)

[16]  arXiv:2404.14063 [pdf, other]
Title: LVNS-RAVE: Diversified audio generation with RAVE and Latent Vector Novelty Search
Comments: Accepted to GECCO 24 Companion
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Audio and Speech Processing (eess.AS)
[17]  arXiv:2404.13914 [pdf, other]
Title: Audio Anti-Spoofing Detection: A Survey
Comments: submitted to ACM Computing Surveys
Subjects: Sound (cs.SD); Cryptography and Security (cs.CR); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[18]  arXiv:2404.13892 [pdf, other]
Title: Retrieval-Augmented Audio Deepfake Detection
Comments: Accepted by the 2024 International Conference on Multimedia Retrieval (ICMR 2024)
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS)
[19]  arXiv:2404.13789 [pdf, other]
Title: Anchor-aware Deep Metric Learning for Audio-visual Retrieval
Comments: 9 pages, 5 figures. Accepted by ACM ICMR 2024
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Information Retrieval (cs.IR); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[20]  arXiv:2404.13569 [pdf, other]
Title: Musical Word Embedding for Music Tagging and Retrieval
Comments: Submitted to IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP)
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[21]  arXiv:2404.13568 [pdf, ps, other]
Title: Sparse Direction of Arrival Estimation Method Based on Vector Signal Reconstruction with a Single Vector Sensor
Authors: Jiabin Guo
Comments: 20 pages
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[22]  arXiv:2404.13551 [pdf, other]
Title: AudioRepInceptionNeXt: A lightweight single-stream architecture for efficient audio recognition
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[23]  arXiv:2404.13509 [pdf, ps, other]
Title: MFHCA: Enhancing Speech Emotion Recognition Via Multi-Spatial Fusion and Hierarchical Cooperative Attention
Comments: Main paper (5 pages). Accepted for publication by ICME 2024
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS)
