Audio and Speech Processing

Authors and titles for eess.AS in Sep 2022

[ total of 183 entries; showing entries 1-25 ]
[1]  arXiv:2209.00423 [pdf, other]
Title: Spoofing-Aware Attention based ASV Back-end with Multiple Enrollment Utterances and a Sampling Strategy for the SASV Challenge 2022
Comments: Accepted by Interspeech 2022
Subjects: Audio and Speech Processing (eess.AS)
[2]  arXiv:2209.00485 [pdf, other]
Title: Joint Speaker Encoder and Neural Back-end Model for Fully End-to-End Automatic Speaker Verification with Multiple Enrollment Utterances
Comments: Submitted to TASLP
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[3]  arXiv:2209.00506 [pdf, other]
Title: On the potential of jointly-optimised solutions to spoofing attack detection and automatic speaker verification
Comments: Accepted to IberSPEECH 2022 Conference
Subjects: Audio and Speech Processing (eess.AS)
[4]  arXiv:2209.00619 [pdf, other]
Title: diaLogic: Non-Invasive Speaker-Focused Data Acquisition for Team Behavior Modeling
Subjects: Audio and Speech Processing (eess.AS)
[5]  arXiv:2209.00733 [pdf]
Title: A Wavelet Transform Based Scheme to Extract Speech Pitch and Formant Frequencies
Journal-ref: 2019 7th Iranian Joint Congress on Fuzzy and Intelligent Systems (CFIS)
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[6]  arXiv:2209.00805 [pdf, other]
Title: Multi-scale temporal-frequency attention for music source separation
Subjects: Audio and Speech Processing (eess.AS)
[7]  arXiv:2209.00934 [pdf, other]
Title: TB or not TB? Acoustic cough analysis for tuberculosis classification
Comments: Accepted for publication at Interspeech 2022
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[8]  arXiv:2209.00937 [pdf, other]
Title: Inverse-free Online Independent Vector Analysis with Flexible Iterative Source Steering
Comments: 5 pages, 2 figures. Submitted to APSIPA 2022
Subjects: Audio and Speech Processing (eess.AS)
[9]  arXiv:2209.01702 [pdf, other]
Title: Time-domain speech super-resolution with GAN based modeling for telephony speaker verification
Comments: Submitted to IEEE/ACM Transactions on Audio, Speech, and Language Processing
Subjects: Audio and Speech Processing (eess.AS)
[10]  arXiv:2209.01762 [pdf, other]
Title: Movement Detection of Tongue and Related Body Parts Using IR-UWB Radar
Comments: Submitted to the 13th International Conference on ICT Convergence (ICTC)
Subjects: Audio and Speech Processing (eess.AS)
[11]  arXiv:2209.01802 [pdf, other]
Title: Sound Event Localization and Detection for Real Spatial Sound Scenes: Event-Independent Network and Data Augmentation Chains
Comments: Submitted to DCASE 2022 Workshop. Code is available at this https URL
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[12]  arXiv:2209.01978 [pdf, other]
Title: Investigation into Target Speaking Rate Adaptation for Voice Conversion
Comments: Accepted to INTERSPEECH 2022
Subjects: Audio and Speech Processing (eess.AS)
[13]  arXiv:2209.04175 [pdf, other]
Title: Streaming Target-Speaker ASR with Neural Transducer
Comments: Accepted to Interspeech 2022
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[14]  arXiv:2209.04473 [pdf, other]
Title: Reconstructing the Dynamic Directivity of Unconstrained Speech
Comments: 19 pages, 8 figures, 3 tables. Internally reviewed manuscript approved for public release by Facebook Inc. Research uses a proprietary dataset not available for public release
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD); Signal Processing (eess.SP)
[15]  arXiv:2209.04974 [pdf, other]
Title: VarArray Meets t-SOT: Advancing the State of the Art of Streaming Distant Conversational Speech Recognition
Comments: 6 pages, 2 figures, 3 tables. v2: Appendix A has been added
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[16]  arXiv:2209.05110 [pdf, other]
Title: Continuous head-related transfer function representation based on hyperspherical harmonics
Authors: Adam Szwajcowski
Comments: Submitted to Archives of Acoustics 4.06.2022
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[17]  arXiv:2209.05161 [pdf, other]
Title: How Much Does Prosody Help Turn-taking? Investigations using Voice Activity Projection Models
Comments: SIGDIAL 2022 Best Paper Award Winner
Subjects: Audio and Speech Processing (eess.AS)
[18]  arXiv:2209.05273 [pdf, other]
Title: The 2022 Far-field Speaker Verification Challenge: Exploring domain mismatch and semi-supervised learning under the far-field scenario
Subjects: Audio and Speech Processing (eess.AS)
[19]  arXiv:2209.05281 [pdf, other]
Title: Modeling Dependent Structure for Utterances in ASR Evaluation
Authors: Zhe Liu, Fuchun Peng
Subjects: Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[20]  arXiv:2209.05735 [pdf, other]
Title: Learning ASR pathways: A sparse multilingual ASR model
Comments: submitted to ICASSP 2023
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL)
[21]  arXiv:2209.06058 [pdf, other]
Title: Streaming End-to-End Multilingual Speech Recognition with Joint Language Identification
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL)
[22]  arXiv:2209.06265 [pdf, other]
Title: Automated detection of pronunciation errors in non-native English speech employing deep learning
Authors: Daniel Korzekwa
Comments: PhD Thesis, in English + extended summary in Polish
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD); Other Quantitative Biology (q-bio.OT)
[23]  arXiv:2209.06337 [pdf, other]
Title: Deep Speech Synthesis from Articulatory Representations
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD); Quantitative Methods (q-bio.QM)
[24]  arXiv:2209.06410 [pdf, other]
Title: A Universally-Deployable ASR Frontend for Joint Acoustic Echo Cancellation, Speech Enhancement, and Voice Separation
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[25]  arXiv:2209.06581 [pdf, ps, other]
Title: Applying wav2vec2 for Speech Recognition on Bengali Common Voices Dataset
Comments: 5 pages
Subjects: Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)