Audio and Speech Processing

Authors and titles for eess.AS in Feb 2023

[ total of 182 entries; showing entries 1-25 ]

[1] arXiv:2302.01736 [pdf, other]: Title: Relating EEG to continuous speech using deep neural networks: a review

Authors: Corentin Puffay, Bernd Accou, Lies Bollens, Mohammad Jalilpour Monesi, Jonas Vanthornhout, Hugo Van hamme, Tom Francart

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[2] arXiv:2302.01746 [pdf, ps, other]: Title: Machine Learning Extreme Acoustic Non-reciprocity in a Linear Waveguide with Multiple Nonlinear Asymmetric Gates

Authors: Anargyros Michaloliakos, Chongan Wang, Alexander F. Vakakis

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[3] arXiv:2302.02447 [pdf, other]: Title: Cross-modal fusion techniques for utterance-level emotion recognition from text and speech

Authors: Jiachen Luo, Huy Phan, Joshua Reiss

Comments: 6 pages, 2 figures

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[4] arXiv:2302.02742 [pdf, other]: Title: Residual Information in Deep Speaker Embedding Architectures

Authors: Adriana Stan

Journal-ref: Mathematics 2022, 10(21), 3927

Subjects: Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Sound (cs.SD)
[5] arXiv:2302.02809 [pdf, other]: Title: Listen2Scene: Interactive material-aware binaural sound propagation for reconstructed 3D scenes

Authors: Anton Ratnarajah, Dinesh Manocha

Comments: Accepted to IEEE VR 2024

Subjects: Audio and Speech Processing (eess.AS); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM); Sound (cs.SD)
[6] arXiv:2302.04161 [pdf, other]: Title: Masking Kernel for Learning Energy-Efficient Representations for Speaker Recognition and Mobile Health

Authors: Apiwat Ditthapron, Emmanuel O. Agu, Adam C. Lammert

Journal-ref: Proc. INTERSPEECH 2023, 2843-2847

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[7] arXiv:2302.04215 [pdf, other]: Title: A Vector Quantized Approach for Text to Speech Synthesis on Real-World Spontaneous Speech

Authors: Li-Wei Chen, Shinji Watanabe, Alexander Rudnicky

Comments: Accepted to AAAI 2023

Subjects: Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Sound (cs.SD); Signal Processing (eess.SP)
[8] arXiv:2302.04932 [pdf, other]: Title: A Composite T60 Regression and Classification Approach for Speech Dereverberation

Authors: Yuying Li, Yuchen Liu, Donald S. Williamson

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[9] arXiv:2302.05110 [pdf, ps, other]: Title: Cross-Corpora Spoken Language Identification with Domain Diversification and Generalization

Authors: Spandan Dey, Md Sahidullah, Goutam Saha

Comments: Accepted for publication in Elsevier Computer Speech & Language

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[10] arXiv:2302.05265 [pdf, other]: Title: Spoken language change detection inspired by speaker change detection

Authors: Jagabandhu Mishra, S. R. Mahadeva Prasanna

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[11] arXiv:2302.05582 [pdf, other]: Title: ASDF: A Differential Testing Framework for Automatic Speech Recognition Systems

Authors: Daniel Hao Xian Yuen, Andrew Yong Chen Pang, Zhou Yang, Chun Yong Chong, Mei Kuan Lim, David Lo

Comments: Accepted by ICST 2023 Tool Demo Track

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD); Software Engineering (cs.SE)
[12] arXiv:2302.05756 [pdf, other]: Title: Improved Decoding of Attentional Selection in Multi-Talker Environments with Self-Supervised Learned Speech Representation

Authors: Cong Han, Vishal Choudhari, Yinghao Aaron Li, Nima Mesgarani

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD); Signal Processing (eess.SP)
[13] arXiv:2302.06227 [pdf, other]: Title: Fast and small footprint Hybrid HMM-HiFiGAN based system for speech synthesis in Indian languages

Authors: Sudhanshu Srivastava, Ishika Gupta, Anusha Prakash, Jom Kuriakose, Hema A. Murthy

Comments: 5 pages, 5 figures

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[14] arXiv:2302.06419 [pdf, other]: Title: AV-data2vec: Self-supervised Learning of Audio-Visual Speech Representations with Contextualized Target Representations

Authors: Jiachen Lian, Alexei Baevski, Wei-Ning Hsu, Michael Auli

Comments: 2023 ASRU

Subjects: Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[15] arXiv:2302.06774 [pdf, other]: Title: Speaker-Independent Acoustic-to-Articulatory Speech Inversion

Authors: Peter Wu, Li-Wei Chen, Cheol Jun Cho, Shinji Watanabe, Louis Goldstein, Alan W Black, Gopala K. Anumanchipalli

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[16] arXiv:2302.07077 [pdf, other]: Title: Multi-Source Contrastive Learning from Musical Audio

Authors: Christos Garoufis, Athanasia Zlatintsi, Petros Maragos

Comments: 8 pages, 4 figures, 3 tables. Camera-ready submission at SMC23

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[17] arXiv:2302.07315 [pdf, other]: Title: A dataset for Audio-Visual Sound Event Detection in Movies

Authors: Rajat Hebbar, Digbalay Bose, Krishna Somandepalli, Veena Vijai, Shrikanth Narayanan

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[18] arXiv:2302.07521 [pdf, other]: Title: Confidence Score Based Speaker Adaptation of Conformer Speech Recognition Systems

Authors: Jiajun Deng, Xurong Xie, Tianzi Wang, Mingyu Cui, Boyang Xue, Zengrui Jin, Guinan Li, Shujie Hu, Xunying Liu

Comments: IEEE/ACM Transactions on Audio, Speech, and Language Processing

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[19] arXiv:2302.07584 [pdf, other]: Title: Fast and Blind Speech Copy-Move Detection and Localization in Noise

Authors: Dong Yang, Mingle Liu, Muyong Cao

Subjects: Audio and Speech Processing (eess.AS); Information Theory (cs.IT); Sound (cs.SD); Signal Processing (eess.SP)
[20] arXiv:2302.07928 [pdf, other]: Title: Multi-Channel Target Speaker Extraction with Refinement: The WavLab Submission to the Second Clarity Enhancement Challenge

Authors: Samuele Cornell, Zhong-Qiu Wang, Yoshiki Masuyama, Shinji Watanabe, Manuel Pariente, Nobutaka Ono

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD); Signal Processing (eess.SP)
[21] arXiv:2302.08202 [pdf, ps, other]: Title: DeepSpace: Dynamic Spatial and Source Cue Based Source Separation for Dialog Enhancement

Authors: Aaron Master, Lie Lu, Jonas Samuelsson, Heidi-Maria Lehtonen, Scott Norcross, Nathan Swedlow, Audrey Howard

Comments: 5 pages, 4 figures. To be published in ICASSP 2023

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[22] arXiv:2302.08342 [pdf, other]: Title: Speech Enhancement with Multi-granularity Vector Quantization

Authors: Xiao-Ying Zhao, Qiu-Shi Zhu, Jie Zhang

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[23] arXiv:2302.08549 [pdf, other]: Title: Speaker Change Detection for Transformer Transducer ASR

Authors: Jian Wu, Zhuo Chen, Min Hu, Xiong Xiao, Jinyu Li

Comments: 5 pages, 1 figure, accepted by ICASSP 2023

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[24] arXiv:2302.08579 [pdf, other]: Title: Adaptable End-to-End ASR Models using Replaceable Internal LMs and Residual Softmax

Authors: Keqi Deng, Philip C. Woodland

Comments: Accepted by ICASSP 2023

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[25] arXiv:2302.08583 [pdf, other]: Title: JEIT: Joint End-to-End Model and Internal Language Model Training for Speech Recognition

Authors: Zhong Meng, Weiran Wang, Rohit Prabhavalkar, Tara N. Sainath, Tongzhou Chen, Ehsan Variani, Yu Zhang, Bo Li, Andrew Rosenberg, Bhuvana Ramabhadran

Comments: 5 pages, 3 figures, in ICASSP 2023

Journal-ref: 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Rhodes Island, Greece

Subjects: Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
