Sound

Authors and titles for recent submissions, skipping first 36

Fri, 19 Apr 2024 (continued, showing last 4 of 8 entries)

[37] arXiv:2404.12299 (cross-list from cs.CL) [pdf, other]: Title: Simultaneous Interpretation Corpus Construction by Large Language Models in Distant Language Pair

Authors: Yusuke Sakai, Mana Makinae, Hidetaka Kamigaito, Taro Watanabe

Comments: 23 pages, 9 figures

Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[38] arXiv:2404.12251 (cross-list from cs.LG) [pdf, other]: Title: Dynamic Modality and View Selection for Multimodal Emotion Recognition with Missing Modalities

Authors: Luciana Trinkaus Menon, Luiz Carlos Ribeiro Neduziak, Jean Paul Barddal, Alessandro Lameiras Koerich, Alceu de Souza Britto Jr

Comments: 15 pages

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[39] arXiv:2404.11938 (cross-list from cs.MM) [pdf, other]: Title: HyDiscGAN: A Hybrid Distributed cGAN for Audio-Visual Privacy Preservation in Multimodal Sentiment Analysis

Authors: Zhuojia Wu, Qi Zhang, Duoqian Miao, Kun Yi, Wei Fan, Liang Hu

Comments: 13 pages, IJCAI-2024

Subjects: Multimedia (cs.MM); Distributed, Parallel, and Cluster Computing (cs.DC); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[40] arXiv:2404.11619 (cross-list from eess.AS) [pdf, ps, other]: Title: Advancing Speech Translation: A Corpus of Mandarin-English Conversational Telephone Speech

Authors: Shannon Wotherspoon, William Hartmann, Matthew Snover

Comments: 2 pages

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)

Thu, 18 Apr 2024

[41] arXiv:2404.11275 [pdf, other]: Title: Jointly Recognizing Speech and Singing Voices Based on Multi-Task Audio Source Separation

Authors: Ye Bai, Chenxing Li, Hao Li, Yuanyuan Zhao, Xiaorui Wang

Comments: Accepted by ICME 2024

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[42] arXiv:2404.11116 [pdf, other]: Title: Music Enhancement with Deep Filters: A Technical Report for The ICASSP 2024 Cadenza Challenge

Authors: Keren Shao, Ke Chen, Shlomo Dubnov

Comments: 2 pages, 2 figures, 1 table, Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2024

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[43] arXiv:2404.10842 [pdf, ps, other]: Title: Unsupervised Speaker Diarization in Distributed IoT Networks Using Federated Learning

Authors: Amit Kumar Bhuyan, Hrishikesh Dutta, Subir Biswas

Comments: 11 pages, 7 figures, 1 table

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[44] arXiv:2404.11399 (cross-list from eess.AS) [pdf, other]: Title: In situ sound absorption estimation with the discrete complex image source method

Authors: Eric Brandao, William Fonseca, Paulo Mareze, Carlos Resende, Gabriel Azzuz, Joao Pontalti, Efren Fernandez-Grande

Comments: 37 pages, 12 figures, original manuscript to be submitted to the Journal of Sound and Vibration

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD); Classical Physics (physics.class-ph)
[45] arXiv:2404.10989 (cross-list from cs.CV) [pdf, other]: Title: FairSSD: Understanding Bias in Synthetic Speech Detectors

Authors: Amit Kumar Singh Yadav, Kratika Bhagtani, Davide Salvi, Paolo Bestagini, Edward J. Delp

Comments: Accepted at CVPR 2024 (WMF)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[46] arXiv:2404.10922 (cross-list from cs.CL) [pdf, other]: Title: Teaching a Multilingual Large Language Model to Understand Multilingual Speech via Multi-Instructional Training

Authors: Pavel Denisov, Ngoc Thang Vu

Comments: NAACL Findings 2024

Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
