Audio and Speech Processing

Authors and titles for recent submissions, skipping first 10

[ total of 48 entries: 1-10 | 11-20 | 21-30 | 31-40 | 41-48 ]

Wed, 24 Apr 2024 (continued, showing last 3 of 13 entries)

[11] arXiv:2404.14946 (cross-list from cs.SD)

Title: StoryTTS: A Highly Expressive Text-to-Speech Dataset with Rich Textual Expressiveness Annotations

Authors: Sen Liu, Yiwei Guo, Xie Chen, Kai Yu

Comments: Accepted by ICASSP 2024

Journal-ref: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2024, pp. 11521-11525

Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)

[12] arXiv:2404.14736 (cross-list from cs.HC)

Title: Qualitative Approaches to Voice UX

Authors: Katie Seaborn, Jacqueline Urakami, Peter Pennefather, Norihisa P. Miyake

Journal-ref: ACM Computing Surveys (2024)

Subjects: Human-Computer Interaction (cs.HC); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computers and Society (cs.CY); Sound (cs.SD); Audio and Speech Processing (eess.AS)

[13] arXiv:2404.14716 (cross-list from cs.CL)

Title: Bayesian Example Selection Improves In-Context Learning for Speech, Text, and Visual Modalities

Authors: Siyin Wang, Chao-Han Huck Yang, Ji Wu, Chao Zhang

Comments: 16 pages, 6 figures

Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD); Audio and Speech Processing (eess.AS)

Tue, 23 Apr 2024 (showing first 7 of 17 entries)

[14] arXiv:2404.14063 (cross-list from cs.SD)

Title: LVNS-RAVE: Diversified audio generation with RAVE and Latent Vector Novelty Search

Authors: Jinyue Guo, Anna-Maria Christodoulou, Balint Laczko, Kyrre Glette

Comments: Accepted to GECCO 24 Companion

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Audio and Speech Processing (eess.AS)

[15] arXiv:2404.13914 (cross-list from cs.SD)

Title: Audio Anti-Spoofing Detection: A Survey

Authors: Menglu Li, Yasaman Ahmadiadli, Xiao-Ping Zhang

Comments: submitted to ACM Computing Surveys

Subjects: Sound (cs.SD); Cryptography and Security (cs.CR); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)

[16] arXiv:2404.13892 (cross-list from cs.SD)

Title: Retrieval-Augmented Audio Deepfake Detection

Authors: Zuheng Kang, Yayun He, Botao Zhao, Xiaoyang Qu, Junqing Peng, Jing Xiao, Jianzong Wang

Comments: Accepted by the 2024 International Conference on Multimedia Retrieval (ICMR 2024)

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS)

[17] arXiv:2404.13821 (cross-list from cs.HC)

Title: Robotic Blended Sonification: Consequential Robot Sound as Creative Material for Human-Robot Interaction

Authors: Stine S. Johansen, Yanto Browning, Anthony Brumpton, Jared Donovan, Markus Rittenbruch

Comments: Paper accepted at ISEA 24, The 29th International Symposium on Electronic Art, Brisbane, Australia, 21-29 June 2024

Subjects: Human-Computer Interaction (cs.HC); Robotics (cs.RO); Sound (cs.SD); Audio and Speech Processing (eess.AS)

[18] arXiv:2404.13789 (cross-list from cs.SD)

Title: Anchor-aware Deep Metric Learning for Audio-visual Retrieval

Authors: Donghuo Zeng, Yanan Wang, Kazushi Ikeda, Yi Yu

Comments: 9 pages, 5 figures. Accepted by ACM ICMR 2024

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Information Retrieval (cs.IR); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)

[19] arXiv:2404.13569 (cross-list from cs.SD)

Title: Musical Word Embedding for Music Tagging and Retrieval

Authors: SeungHeon Doh, Jongpil Lee, Dasaem Jeong, Juhan Nam

Comments: Submitted to IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP)

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)

[20] arXiv:2404.13568 (cross-list from cs.SD)

Title: Sparse Direction of Arrival Estimation Method Based on Vector Signal Reconstruction with a Single Vector Sensor

Authors: Jiabin Guo

Comments: 20 pages

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
