Sound

Authors and titles for cs.SD in Feb 2023, skipping first 75

[ total of 179 entries: 1-25 | 26-50 | 51-75 | 76-100 | 101-125 | 126-150 | 151-175 | 176-179 ]
[ showing 25 entries per page: fewer | more | all ]
[76]  arXiv:2302.02088 (cross-list from cs.CV) [pdf, other]
Title: AV-NeRF: Learning Neural Fields for Real-World Audio-Visual Scene Synthesis
Comments: NeurIPS 2023
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[77]  arXiv:2302.02419 (cross-list from cs.CL) [pdf, other]
Title: Deep learning of segment-level feature representation for speech emotion recognition in conversations
Comments: 6 pages, 4 figures
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[78]  arXiv:2302.03124 (cross-list from cs.LG) [pdf, other]
Title: Autodecompose: A generative self-supervised model for semantic decomposition
Subjects: Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[79]  arXiv:2302.03498 (cross-list from cs.CL) [pdf, other]
Title: MAC: A unified framework boosting low resource automatic speech recognition
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[80]  arXiv:2302.03533 (cross-list from cs.CV) [pdf, other]
Title: Revisiting Pre-training in Audio-Visual Learning
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[81]  arXiv:2302.04331 (cross-list from cs.LG) [pdf, other]
Title: Short-Term Memory Convolutions
Comments: ICLR 2023
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[82]  arXiv:2302.04959 (cross-list from cs.LG) [pdf, other]
Title: Hypernetworks build Implicit Neural Representations of Sounds
Comments: ECML 2023
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[83]  arXiv:2302.05040 (cross-list from cs.CL) [pdf, other]
Title: PATCorrect: Non-autoregressive Phoneme-augmented Transformer for ASR Error Correction
Comments: Accepted camera-ready version for INTERSPEECH 2023
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[84]  arXiv:2302.06008 (cross-list from cs.CL) [pdf, ps, other]
Title: ASR Bundestag: A Large-Scale political debate dataset in German
Comments: 13 pages, 2 tables, 4 figures
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[85]  arXiv:2302.07560 (cross-list from cs.LG) [pdf, ps, other]
Title: Unsupervised classification to improve the quality of a bird song recording dataset
Authors: Félix Michaud (ISYEB), Jérôme Sueur (ISYEB), Maxime Le Cesne (ISYEB), Sylvain Haupert (ISYEB)
Journal-ref: Ecological Informatics, 2023, pp.101952
Subjects: Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[86]  arXiv:2302.08088 (cross-list from cs.CL) [pdf, other]
Title: TAPLoss: A Temporal Acoustic Parameter Loss for Speech Enhancement
Comments: Accepted at ICASSP 2023
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[87]  arXiv:2302.08102 (cross-list from cs.CL) [pdf, other]
Title: Prompt Tuning of Deep Neural Networks for Speaker-adaptive Visual Speech Recognition
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD); Audio and Speech Processing (eess.AS); Image and Video Processing (eess.IV)
[88]  arXiv:2302.08607 (cross-list from cs.NE) [pdf, other]
Title: Adaptive Axonal Delays in feedforward spiking neural networks for accurate spoken word recognition
Comments: Accepted by ICASSP 2023
Subjects: Neural and Evolutionary Computing (cs.NE); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[89]  arXiv:2302.08794 (cross-list from cs.HC) [pdf, ps, other]
Title: Build a training interface to install the bat's echolocation skills in humans
Comments: 4 pages, 3 figures
Subjects: Human-Computer Interaction (cs.HC); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[90]  arXiv:2302.08950 (cross-list from cs.CL) [pdf, ps, other]
Title: Handling the Alignment for Wake Word Detection: A Comparison Between Alignment-Based, Alignment-Free and Hybrid Approaches
Comments: Accepted to Interspeech 2023
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[91]  arXiv:2302.09328 (cross-list from cs.MM) [pdf, other]
Title: SSVMR: Saliency-based Self-training for Video-Music Retrieval
Comments: Accepted by ICASSP 2023
Subjects: Multimedia (cs.MM); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[92]  arXiv:2302.09723 (cross-list from cs.CL) [pdf, other]
Title: Emphasizing Unseen Words: New Vocabulary Acquisition for End-to-End Speech Recognition
Comments: Neural Networks, Volume 161, April 2023, Pages 494-504
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[93]  arXiv:2302.09856 (cross-list from cs.CL) [pdf, ps, other]
Title: Knowledge-aware Bayesian Co-attention for Multimodal Emotion Recognition
Comments: Accepted to IEEE ICASSP 2023
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[94]  arXiv:2302.10871 (cross-list from cs.CL) [pdf, other]
Title: Efficient CTC Regularization via Coarse Labels for End-to-End Speech Translation
Comments: EACL 2023
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[95]  arXiv:2302.10915 (cross-list from cs.LG) [pdf, other]
Title: Conformers are All You Need for Visual Speech Recognition
Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[96]  arXiv:2302.11224 (cross-list from cs.CL) [pdf, other]
Title: MADI: Inter-domain Matching and Intra-domain Discrimination for Cross-domain Speech Recognition
Comments: Accepted to ICASSP 2023
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[97]  arXiv:2302.12049 (cross-list from cs.CL) [pdf, other]
Title: Evaluating Automatic Speech Recognition in an Incremental Setting
Comments: 5 pages
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[98]  arXiv:2302.12057 (cross-list from cs.CL) [pdf, other]
Title: ProsAudit, a prosodic benchmark for self-supervised speech models
Comments: Accepted at Interspeech 2023. 4 pages + references, 1 figure
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[99]  arXiv:2302.12829 (cross-list from cs.CL) [pdf, other]
Title: Improving Massively Multilingual ASR With Auxiliary CTC Objectives
Comments: 5 pages, 1 figure, accepted at ICASSP 2023; fixed typo and URL in abstract
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[100]  arXiv:2302.12921 (cross-list from cs.CL) [pdf, other]
Title: Pre-Finetuning for Few-Shot Emotional Speech Recognition
Comments: 5 pages, 4 figures. Code available at this https URL
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)