Sound

Authors and titles for recent submissions, skipping first 11

[ total of 108 entries: 1-25 | 12-36 | 37-61 | 62-86 | 87-108 ]

Tue, 6 Jun 2023 (continued, showing last 10 of 21 entries)

[12]  arXiv:2306.02719 (cross-list from cs.CL) [pdf, ps, other]
Title: Multiple output samples for each input in a single-output Gaussian process
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[13]  arXiv:2306.02680 (cross-list from cs.CL) [pdf, other]
Title: BeAts: Bengali Speech Acts Recognition using Multimodal Attention Fusion
Comments: Accepted at INTERSPEECH 2023
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[14]  arXiv:2306.02579 (cross-list from cs.CL) [pdf, other]
Title: Cross-Lingual Transfer Learning for Phrase Break Prediction with Multilingual Language Model
Comments: Accepted by INTERSPEECH 2023
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[15]  arXiv:2306.02534 (cross-list from cs.CL) [pdf, other]
Title: Incorporating L2 Phonemes Using Articulatory Features for Robust Speech Recognition
Comments: Accepted at INTERSPEECH 2023
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[16]  arXiv:2306.02317 (cross-list from cs.CL) [pdf, other]
Title: SpellMapper: A non-autoregressive neural spellchecker for ASR customization with candidate retrieval based on n-gram mappings
Comments: Accepted by INTERSPEECH 2023
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[17]  arXiv:2306.02273 (cross-list from cs.CL) [pdf, ps, other]
Title: End-to-End Joint Target and Non-Target Speakers ASR
Comments: Accepted at Interspeech 2023
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[18]  arXiv:2306.02153 (cross-list from cs.CL) [pdf, ps, other]
Title: Acoustic Word Embeddings for Untranscribed Target Languages with Continued Pretraining and Learned Pooling
Comments: Accepted to Interspeech 2023
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[19]  arXiv:2306.02105 (cross-list from cs.CL) [pdf, other]
Title: Adapting Pretrained ASR Models to Low-resource Clinical Speech using Epistemic Uncertainty-based Data Selection
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[20]  arXiv:2306.01942 (cross-list from cs.CL) [pdf, other]
Title: Can Contextual Biasing Remain Effective with Whisper and GPT-2?
Comments: To appear in Interspeech 2023
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[21]  arXiv:2306.01864 (cross-list from cs.LG) [pdf, other]
Title: Discovering COVID-19 Coughing and Breathing Patterns from Unlabeled Data Using Contrastive Learning with Varying Pre-Training Domains
Comments: Accepted by Proceedings of INTERSPEECH 2023
Journal-ref: Proceedings of INTERSPEECH 2023
Subjects: Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)

Mon, 5 Jun 2023 (showing first 15 of 22 entries)

[22]  arXiv:2306.01635 [pdf, other]
Title: Q&A: Query-Based Representation Learning for Multi-Track Symbolic Music re-Arrangement
Comments: Accepted by IJCAI 2023 Special Track for AI, the Arts and Creativity
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[23]  arXiv:2306.01533 [pdf, other]
Title: Enhance Temporal Relations in Audio Captioning with Sound Event Detection
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[24]  arXiv:2306.01491 [pdf, other]
Title: Learning Local to Global Feature Aggregation for Speech Emotion Recognition
Comments: This paper has been accepted on INTERSPEECH 2023
Subjects: Sound (cs.SD)
[25]  arXiv:2306.01442 [pdf, other]
Title: Towards Robust FastSpeech 2 by Modelling Residual Multimodality
Comments: Accepted at INTERSPEECH 2023
Subjects: Sound (cs.SD); Computation and Language (cs.CL); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[26]  arXiv:2306.01428 [pdf, other]
Title: Improved DeepFake Detection Using Whisper Features
Comments: Accepted to INTERSPEECH 2023
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[27]  arXiv:2306.01304 [pdf, other]
Title: JEPOO: Highly Accurate Joint Estimation of Pitch, Onset and Offset for Music Information Retrieval
Comments: This paper has been accepted by IJCAI 2023; 11 pages, 6 figures
Subjects: Sound (cs.SD); Information Retrieval (cs.IR); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[28]  arXiv:2306.01084 [pdf, other]
Title: Exploration on HuBERT with Multiple Resolutions
Comments: Accepted to Interspeech 2023
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[29]  arXiv:2306.01522 (cross-list from eess.AS) [pdf, ps, other]
Title: Auditory Representation Effective for Estimating Vocal Tract Information
Comments: This manuscript was submitted to APSIPA ASC 2023 on 2 Jun 2023
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[30]  arXiv:2306.01433 (cross-list from eess.AS) [pdf, other]
Title: Zero-Shot Blind Audio Bandwidth Extension
Comments: Submitted to IEEE/ACM Transactions on Audio, Speech and Language Processing
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[31]  arXiv:2306.01411 (cross-list from eess.AS) [pdf, other]
Title: HD-DEMUCS: General Speech Restoration with Heterogeneous Decoders
Comments: Accepted by INTERSPEECH 2023
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[32]  arXiv:2306.01385 (cross-list from eess.AS) [pdf, ps, other]
Title: Task-Agnostic Structured Pruning of Speech Representation Models
Comments: Accepted by INTERSPEECH 2023
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[33]  arXiv:2306.01332 (cross-list from eess.AS) [pdf, other]
Title: Differentiable Grey-box Modelling of Phaser Effects using Frame-based Spectral Processing
Comments: Accepted for publication in Proc. DAFx23, Copenhagen, Denmark, September 2023
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[34]  arXiv:2306.01327 (cross-list from cs.CL) [pdf, other]
Title: Speech Translation with Foundation Models and Optimal Transport: UPC at IWSLT23
Comments: IWSLT 2023
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[35]  arXiv:2306.01303 (cross-list from cs.CL) [pdf, ps, other]
Title: DistilXLSR: A Light Weight Cross-Lingual Speech Representation Model
Comments: Accepted by INTERSPEECH 2023
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[36]  arXiv:2306.01208 (cross-list from eess.AS) [pdf, other]
Title: Adapting an Unadaptable ASR System
Comments: submitted to INTERSPEECH
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)