Audio and Speech Processing

Authors and titles for recent submissions, skipping first 18

[ total of 35 entries: 1-10 | 9-18 | 19-28 | 29-35 ]

Tue, 16 Apr 2024 (continued, showing last 7 of 10 entries)

[19]  arXiv:2404.09956 (cross-list from cs.SD) [pdf, other]
Title: Tango 2: Aligning Diffusion-based Text-to-Audio Generations through Direct Preference Optimization
Comments: this https URL
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[20]  arXiv:2404.09466 (cross-list from cs.SD) [pdf, other]
Title: Scoring Intervals using Non-Hierarchical Transformer For Automatic Piano Transcription
Comments: Fixed Typos
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[21]  arXiv:2404.09342 (cross-list from cs.CV) [pdf, other]
Title: Face-voice Association in Multilingual Environments (FAME) Challenge 2024 Evaluation Plan
Comments: ACM Multimedia Conference - Grand Challenge
Subjects: Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[22]  arXiv:2404.09192 (cross-list from cs.SD) [pdf, other]
Title: Prior-agnostic Multi-scale Contrastive Text-Audio Pre-training for Parallelized TTS Frontend Modeling
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS)
[23]  arXiv:2404.09177 (cross-list from cs.SD) [pdf, other]
Title: An Experimental Comparison Of Multi-view Self-supervised Methods For Music Tagging
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[24]  arXiv:2404.08857 (cross-list from cs.SD) [pdf, other]
Title: Voice Attribute Editing with Text Prompt
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS)
[25]  arXiv:2404.08813 (cross-list from cs.HC) [pdf, other]
Title: Interactive Sonification for Health and Energy using ChucK and Unity
Comments: In the Proceedings of the Conference on Sonification of Health and Environmental Data (SoniHED 2022). this http URL
Journal-ref: Conference on Sonification of Health and Environmental Data (SoniHED 2022)
Subjects: Human-Computer Interaction (cs.HC); Sound (cs.SD); Audio and Speech Processing (eess.AS)

Mon, 15 Apr 2024

[26]  arXiv:2404.08064 [pdf, ps, other]
Title: The Impact of Speech Anonymization on Pathology and Its Limits
Subjects: Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR); Machine Learning (cs.LG)
[27]  arXiv:2404.08264 (cross-list from cs.MM) [pdf, other]
Title: Guided Masked Self-Distillation Modeling for Distributed Multimedia Sensor Event Analysis
Comments: 13 pages, 7 figures, under review
Subjects: Multimedia (cs.MM); Computer Vision and Pattern Recognition (cs.CV); Audio and Speech Processing (eess.AS)
[28]  arXiv:2404.08022 (cross-list from cs.SD) [pdf, other]
Title: A lightweight dual-stage framework for personalized speech enhancement based on DeepFilterNet2
Authors: Thomas Serre (S2A, IDS), Mathieu Fontaine (S2A, IDS), Éric Benhaim, Geoffroy Dutour, Slim Essid (S2A, IDS)
Comments: Accepted at HSCMA24, Satellite workshop of ICASSP24
Journal-ref: ICASSP, Apr 2024, Seoul (Korea), South Korea
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)