Audio and Speech Processing

Authors and titles for eess.AS in Feb 2023

[ total of 182 entries; showing 1-25 ]
[1]  arXiv:2302.01736 [pdf, other]
Title: Relating EEG to continuous speech using deep neural networks: a review
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[2]  arXiv:2302.01746 [pdf, ps, other]
Title: Machine Learning Extreme Acoustic Non-reciprocity in a Linear Waveguide with Multiple Nonlinear Asymmetric Gates
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[3]  arXiv:2302.02447 [pdf, other]
Title: Cross-modal fusion techniques for utterance-level emotion recognition from text and speech
Comments: 6 pages, 2 figures
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[4]  arXiv:2302.02742 [pdf, other]
Title: Residual Information in Deep Speaker Embedding Architectures
Authors: Adriana Stan
Journal-ref: Mathematics 2022, 10(21), 3927
Subjects: Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Sound (cs.SD)
[5]  arXiv:2302.02809 [pdf, other]
Title: Listen2Scene: Interactive material-aware binaural sound propagation for reconstructed 3D scenes
Comments: Accepted to IEEE VR 2024
Subjects: Audio and Speech Processing (eess.AS); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM); Sound (cs.SD)
[6]  arXiv:2302.04161 [pdf, other]
Title: Masking Kernel for Learning Energy-Efficient Representations for Speaker Recognition and Mobile Health
Journal-ref: Proc. INTERSPEECH 2023, 2843-2847
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[7]  arXiv:2302.04215 [pdf, other]
Title: A Vector Quantized Approach for Text to Speech Synthesis on Real-World Spontaneous Speech
Comments: Accepted to AAAI 2023
Subjects: Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Sound (cs.SD); Signal Processing (eess.SP)
[8]  arXiv:2302.04932 [pdf, other]
Title: A Composite T60 Regression and Classification Approach for Speech Dereverberation
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[9]  arXiv:2302.05110 [pdf, ps, other]
Title: Cross-Corpora Spoken Language Identification with Domain Diversification and Generalization
Comments: Accepted for publication in Elsevier Computer Speech & Language
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[10]  arXiv:2302.05265 [pdf, other]
Title: Spoken language change detection inspired by speaker change detection
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[11]  arXiv:2302.05582 [pdf, other]
Title: ASDF: A Differential Testing Framework for Automatic Speech Recognition Systems
Comments: Accepted by ICST 2023 Tool Demo Track
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD); Software Engineering (cs.SE)
[12]  arXiv:2302.05756 [pdf, other]
Title: Improved Decoding of Attentional Selection in Multi-Talker Environments with Self-Supervised Learned Speech Representation
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD); Signal Processing (eess.SP)
[13]  arXiv:2302.06227 [pdf, other]
Title: Fast and small footprint Hybrid HMM-HiFiGAN based system for speech synthesis in Indian languages
Comments: 5 pages, 5 figures
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[14]  arXiv:2302.06419 [pdf, other]
Title: AV-data2vec: Self-supervised Learning of Audio-Visual Speech Representations with Contextualized Target Representations
Comments: 2023 ASRU
Subjects: Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[15]  arXiv:2302.06774 [pdf, other]
Title: Speaker-Independent Acoustic-to-Articulatory Speech Inversion
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[16]  arXiv:2302.07077 [pdf, other]
Title: Multi-Source Contrastive Learning from Musical Audio
Comments: 8 pages, 4 figures, 3 tables. Camera-ready submission at SMC23
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[17]  arXiv:2302.07315 [pdf, other]
Title: A dataset for Audio-Visual Sound Event Detection in Movies
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[18]  arXiv:2302.07521 [pdf, other]
Title: Confidence Score Based Speaker Adaptation of Conformer Speech Recognition Systems
Comments: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[19]  arXiv:2302.07584 [pdf, other]
Title: Fast and Blind Speech Copy-Move Detection and Localization in Noise
Subjects: Audio and Speech Processing (eess.AS); Information Theory (cs.IT); Sound (cs.SD); Signal Processing (eess.SP)
[20]  arXiv:2302.07928 [pdf, other]
Title: Multi-Channel Target Speaker Extraction with Refinement: The WavLab Submission to the Second Clarity Enhancement Challenge
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD); Signal Processing (eess.SP)
[21]  arXiv:2302.08202 [pdf, ps, other]
Title: DeepSpace: Dynamic Spatial and Source Cue Based Source Separation for Dialog Enhancement
Comments: 5 pages, 4 figures. To be published in ICASSP 2023
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[22]  arXiv:2302.08342 [pdf, other]
Title: Speech Enhancement with Multi-granularity Vector Quantization
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[23]  arXiv:2302.08549 [pdf, other]
Title: Speaker Change Detection for Transformer Transducer ASR
Comments: 5 pages, 1 figure, accepted by ICASSP 2023
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[24]  arXiv:2302.08579 [pdf, other]
Title: Adaptable End-to-End ASR Models using Replaceable Internal LMs and Residual Softmax
Comments: Accepted by ICASSP 2023
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[25]  arXiv:2302.08583 [pdf, other]
Title: JEIT: Joint End-to-End Model and Internal Language Model Training for Speech Recognition
Comments: 5 pages, 3 figures, in ICASSP 2023
Journal-ref: 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Rhodes Island, Greece
Subjects: Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)