Audio and Speech Processing

Authors and titles for eess.AS in Oct 2023

Showing entries 1-25 of 311.

[1] arXiv:2310.00587 [pdf, other]: Title: Mechatronic Generation of Datasets for Acoustics Research

Authors: Austin Lu, Ethaniel Moore, Arya Nallanthighall, Kanad Sarkar, Manan Mittal, Ryan M. Corey, Paris Smaragdis, Andrew Singer

Comments: 5 pages, 5 figures, IWAENC 2022

Subjects: Audio and Speech Processing (eess.AS); Systems and Control (eess.SY)
[2] arXiv:2310.00602 [pdf, ps, other]: Title: Wavelet Scattering Transform for Improving Generalization in Low-Resourced Spoken Language Identification

Authors: Spandan Dey, Premjeet Singh, Goutam Saha

Comments: Accepted and presented in INTERSPEECH 2023

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL)
[3] arXiv:2310.01122 [pdf, other]: Title: A Fused Deep Denoising Sound Coding Strategy for Bilateral Cochlear Implants

Authors: Tom Gajecki, Waldo Nogueira

Subjects: Audio and Speech Processing (eess.AS)
[4] arXiv:2310.01128 [pdf, other]: Title: Disentangling Voice and Content with Self-Supervision for Speaker Recognition

Authors: Tianchi Liu, Kong Aik Lee, Qiongqiong Wang, Haizhou Li

Comments: Accepted to NeurIPS 2023 (main track)

Subjects: Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI)
[5] arXiv:2310.01353 [pdf, other]: Title: Scaling Up Music Information Retrieval Training with Semi-Supervised Learning

Authors: Yun-Ning Hung, Ju-Chiang Wang, Minz Won, Duc Le

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[6] arXiv:2310.01688 [pdf, other]: Title: One model to rule them all ? Towards End-to-End Joint Speaker Diarization and Speech Recognition

Authors: Samuele Cornell, Jee-weon Jung, Shinji Watanabe, Stefano Squartini

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[7] arXiv:2310.01839 [pdf, ps, other]: Title: Preserving Phonemic Distinctions for Ordinal Regression: A Novel Loss Function for Automatic Pronunciation Assessment

Authors: Bi-Cheng Yan, Hsin-Wei Wang, Yi-Cheng Wang, Jiun-Ting Li, Chi-Han Lin, Berlin Chen

Comments: Accepted by ASRU 2023

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[8] arXiv:2310.01867 [pdf, other]: Title: Audio-visual child-adult speaker classification in dyadic interactions

Authors: Anfeng Xu, Kevin Huang, Tiantian Feng, Helen Tager-Flusberg, Shrikanth Narayanan

Comments: In review for ICASSP 2024, 5 pages

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[9] arXiv:2310.02281 [pdf, other]: Title: End-to-End Continuous Speech Emotion Recognition in Real-life Customer Service Call Center Conversations

Authors: Yajing Feng (CNRS-LISN), Laurence Devillers (CNRS-LISN, SU)

Journal-ref: 2023 11th International Conference on Affective Computing and Intelligent Interaction Workshops and Demos (ACIIW), Sep 2023, Boston (MA), United States

Subjects: Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[10] arXiv:2310.02640 [pdf, other]: Title: The VoiceMOS Challenge 2023: Zero-shot Subjective Speech Quality Prediction for Multiple Domains

Authors: Erica Cooper, Wen-Chin Huang, Yu Tsao, Hsin-Min Wang, Tomoki Toda, Junichi Yamagishi

Comments: Accepted to ASRU 2023

Subjects: Audio and Speech Processing (eess.AS)
[11] arXiv:2310.02699 [pdf, other]: Title: Continual Contrastive Spoken Language Understanding

Authors: Umberto Cappellazzo, Enrico Fini, Muqiao Yang, Daniele Falavigna, Alessio Brutti, Bhiksha Raj

Comments: Accepted to ACL Findings 2024

Subjects: Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI)
[12] arXiv:2310.02732 [pdf, ps, other]: Title: Discriminative Training of VBx Diarization

Authors: Dominik Klement, Mireia Diez, Federico Landini, Lukáš Burget, Anna Silnova, Marc Delcroix, Naohiro Tawara

Comments: Submitted to ICASSP 2024

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[13] arXiv:2310.02802 [pdf, other]: Title: VITS-Based Singing Voice Conversion Leveraging Whisper and multi-scale F0 Modeling

Authors: Ziqian Ning, Yuepeng Jiang, Zhichao Wang, Bin Zhang, Lei Xie

Subjects: Audio and Speech Processing (eess.AS)
[14] arXiv:2310.02971 [pdf, other]: Title: Prompting and Adapter Tuning for Self-supervised Encoder-Decoder Speech Model

Authors: Kai-Wei Chang, Ming-Hsin Chen, Yun-Ping Lin, Jing Neng Hsu, Paul Kuo-Ming Huang, Chien-yu Huang, Shang-Wen Li, Hung-yi Lee

Comments: Accepted to IEEE ASRU 2023

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Signal Processing (eess.SP)
[15] arXiv:2310.03018 [pdf, other]: Title: Zero Resource Code-switched Speech Benchmark Using Speech Utterance Pairs For Multiple Spoken Languages

Authors: Kuan-Po Huang, Chih-Kai Yang, Yu-Kuan Fu, Ewan Dunbar, Hung-yi Lee

Comments: Accepted by ICASSP 2024 (v2)

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[16] arXiv:2310.03444 [pdf, other]: Title: VaSAB: The variable size adaptive information bottleneck for disentanglement on speech and singing voice

Authors: Frederik Bous, Axel Roebel

Comments: Submitted to ICASSP 2024

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[17] arXiv:2310.03455 [pdf, other]: Title: Performance and energy balance: a comprehensive study of state-of-the-art sound event detection systems

Authors: Francesca Ronchini, Romain Serizel

Comments: Accepted to ICASSP 2024

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[18] arXiv:2310.03480 [pdf, other]: Title: The ICASSP SP Cadenza Challenge: Music Demixing/Remixing for Hearing Aids

Authors: Gerardo Roa Dabike, Michael A. Akeroyd, Scott Bannister, Jon Barker, Trevor J. Cox, Bruno Fazenda, Jennifer Firth, Simone Graetzer, Alinka Greasley, Rebecca R. Vos, William M. Whitmer

Comments: 2-page paper for ICASSP 2024 SP Grand Challenge

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Signal Processing (eess.SP)
[19] arXiv:2310.03538 [pdf, other]: Title: Latent Filling: Latent Space Data Augmentation for Zero-shot Speech Synthesis

Authors: Jae-Sung Bae, Joun Yeop Lee, Ji-Hyun Lee, Seongkyu Mun, Taehwa Kang, Hoon-Young Cho, Chanwoo Kim

Comments: Accepted to ICASSP 2024

Subjects: Audio and Speech Processing (eess.AS)
[20] arXiv:2310.03688 [pdf, other]: Title: Speaker localization using direct path dominance test based on sound field directivity

Authors: Boaz Rafaely, Koby Alhaiany

Journal-ref: Signal Processing, vol. 143, pp. 42 - 47, 2018

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[21] arXiv:2310.03889 [pdf, ps, other]: Title: Audio Event-Relational Graph Representation Learning for Acoustic Scene Classification

Authors: Yuanbo Hou, Siyang Song, Chuang Yu, Wenwu Wang, Dick Botteldooren

Comments: IEEE Signal Processing Letters, doi: 10.1109/LSP.2023.3319233

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[22] arXiv:2310.03901 [pdf, other]: Title: Challenges and Insights: Exploring 3D Spatial Features and Complex Networks on the MISP Dataset

Authors: Yiwen Shao

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[23] arXiv:2310.04035 [pdf, other]: Title: A privacy-preserving method using secret key for convolutional neural network-based speech classification

Authors: Shoko Niwa, Sayaka Shiota, Hitoshi Kiya

Comments: To appear in the 31st European Signal Processing Conference (EUSIPCO 2023)

Subjects: Audio and Speech Processing (eess.AS)
[24] arXiv:2310.04169 [pdf, ps, other]: Title: Spatial sampling and beamforming for spherical microphone arrays

Authors: Boaz Rafaely

Journal-ref: 2008 Hands-Free Speech Communication and Microphone Arrays, Trento, Italy, 2008, pp. 5-8

Subjects: Audio and Speech Processing (eess.AS)
[25] arXiv:2310.04191 [pdf, ps, other]: Title: Zones of quiet in a broadband diffuse sound field

Authors: Boaz Rafaely

Journal-ref: J. Acoust. Soc. Am., vol. 110, no. 1, pp. 296-302, July 2001

Subjects: Audio and Speech Processing (eess.AS)
