Audio and Speech Processing

Authors and titles for eess.AS in Nov 2020

Total of 227 entries; showing entries 1-25.

[1] arXiv:2011.00030 [pdf, other]: Title: A Curated Dataset of Urban Scenes for Audio-Visual Scene Analysis

Authors: Shanshan Wang, Annamaria Mesaros, Toni Heittola, Tuomas Virtanen

Comments: accepted by ICASSP 2021

Subjects: Audio and Speech Processing (eess.AS)
[2] arXiv:2011.00091 [pdf, other]: Title: Directional ASR: A New Paradigm for E2E Multi-Speaker Speech Recognition with Source Localization

Authors: Aswin Shanmugam Subramanian, Chao Weng, Shinji Watanabe, Meng Yu, Yong Xu, Shi-Xiong Zhang, Dong Yu

Comments: submitted to ICASSP 2021

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[3] arXiv:2011.00175 [pdf, other]: Title: Multimodal Urban Sound Tagging with Spatiotemporal Context

Authors: Jisheng Bai, Jianfeng Chen, Mou Wang

Subjects: Audio and Speech Processing (eess.AS)
[4] arXiv:2011.00316 [pdf, other]: Title: AGAIN-VC: A One-shot Voice Conversion using Activation Guidance and Adaptive Instance Normalization

Authors: Yen-Hao Chen, Da-Yi Wu, Tsung-Han Wu, Hung-yi Lee

Comments: Submitted to ICASSP 2021

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[5] arXiv:2011.00502 [pdf, other]: Title: Focusing Phenomena in Linear Discrete Inverse Problems in Acoustics

Authors: Eric C. Hamdan, Filippo Maria Fazi

Comments: 33 pages, 23 figures, submitted for review to the Journal of Sound and Vibration; fixed typos and minor revision in sections 6.1.4-6.1.5 and 6.2

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[6] arXiv:2011.00699 [pdf, other]: Title: Transformer-based Arabic Dialect Identification

Authors: Wanqiu Lin, Maulik Madhavi, Rohan Kumar Das, Haizhou Li

Comments: Accepted for publication in International Conference on Asian Language Processing (IALP) 2020

Subjects: Audio and Speech Processing (eess.AS)
[7] arXiv:2011.00721 [pdf, other]: Title: Robust Raw Waveform Speech Recognition Using Relevance Weighted Representations

Authors: Purvi Agrawal, Sriram Ganapathy

Comments: arXiv admin note: text overlap with arXiv:2001.07067

Journal-ref: Proc. Interspeech 2020, 1649-1653 (2020)

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[8] arXiv:2011.00935 [pdf, other]: Title: FeatherTTS: Robust and Efficient attention based Neural TTS

Authors: Qiao Tian, Zewang Zhang, Chao Liu, Heng Lu, Linghui Chen, Bin Wei, Pujiang He, Shan Liu

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[9] arXiv:2011.01108 [pdf, ps, other]: Title: End-to-end anti-spoofing with RawNet2

Authors: Hemlata Tak, Jose Patino, Massimiliano Todisco, Andreas Nautsch, Nicholas Evans, Anthony Larcher

Comments: Accepted to ICASSP 2021

Subjects: Audio and Speech Processing (eess.AS)
[10] arXiv:2011.01130 [pdf, other]: Title: Speaker anonymisation using the McAdams coefficient

Authors: Jose Patino, Natalia Tomashenko, Massimiliano Todisco, Andreas Nautsch, Nicholas Evans

Comments: Accepted at INTERSPEECH 2021

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL)
[11] arXiv:2011.01174 [pdf, other]: Title: Learning to Maximize Speech Quality Directly Using MOS Prediction for Neural Text-to-Speech

Authors: Yeunju Choi, Youngmoon Jung, Youngjoo Suh, Hoirin Kim

Comments: 9 pages, 5 figures, 4 tables

Journal-ref: IEEE Access, vol. 10, pp. 52621 - 52629, 2022

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[12] arXiv:2011.01175 [pdf, other]: Title: CAMP: a Two-Stage Approach to Modelling Prosody in Context

Authors: Zack Hodari, Alexis Moinet, Sri Karlapati, Jaime Lorenzo-Trueba, Thomas Merritt, Arnaud Joly, Ammar Abbas, Penny Karanasou, Thomas Drugman

Comments: 5 pages. Published in the 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2021)

Subjects: Audio and Speech Processing (eess.AS)
[13] arXiv:2011.01210 [pdf, other]: Title: Focus on the present: a regularization method for the ASR source-target attention layer

Authors: Nanxin Chen, Piotr Żelasko, Jesús Villalba, Najim Dehak

Comments: submitted to ICASSP2021

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG)
[14] arXiv:2011.01557 [pdf, other]: Title: StyleMelGAN: An Efficient High-Fidelity Adversarial Vocoder with Temporal Adaptive Normalization

Authors: Ahmed Mustafa, Nicola Pia, Guillaume Fuchs

Comments: Accepted to ICASSP2021

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Signal Processing (eess.SP)
[15] arXiv:2011.01570 [pdf, other]: Title: Dynamic latency speech recognition with asynchronous revision

Authors: Mingkun Huang, Meng Cai, Jun Zhang, Yang Zhang, Yongbin You, Yi He, Zejun Ma

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[16] arXiv:2011.01576 [pdf, other]: Title: Improving RNN transducer with normalized jointer network

Authors: Mingkun Huang, Jun Zhang, Meng Cai, Yang Zhang, Jiali Yao, Yongbin You, Yi He, Zejun Ma

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[17] arXiv:2011.01678 [pdf, other]: Title: Learning Explicit Prosody Models and Deep Speaker Embeddings for Atypical Voice Conversion

Authors: Disong Wang, Songxiang Liu, Lifa Sun, Xixin Wu, Xunying Liu, Helen Meng

Comments: Accepted to Interspeech 2021

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[18] arXiv:2011.01686 [pdf, ps, other]: Title: Improved End-to-End Dysarthric Speech Recognition via Meta-learning Based Model Re-initialization

Authors: Disong Wang, Jianwei Yu, Xixin Wu, Lifa Sun, Xunying Liu, Helen Meng

Comments: To appear in ISCSLP2021

Subjects: Audio and Speech Processing (eess.AS)
[19] arXiv:2011.01691 [pdf, other]: Title: A Study of Incorporating Articulatory Movement Information in Speech Enhancement

Authors: Yu-Wen Chen, Kuo-Hsuan Hung, Shang-Yi Chuang, Jonathan Sherman, Xugang Lu, Yu Tsao

Subjects: Audio and Speech Processing (eess.AS)
[20] arXiv:2011.01965 [pdf, ps, other]: Title: Short-time deep-learning based source separation for speech enhancement in reverberant environments with beamforming

Authors: Alejandro Díaz, Diego Pincheira, Rodrigo Mahu, Nestor Becerra Yoma

Subjects: Audio and Speech Processing (eess.AS)
[21] arXiv:2011.01986 [pdf, other]: Title: Unsupervised Pattern Discovery from Thematic Speech Archives Based on Multilingual Bottleneck Features

Authors: Man-Ling Sung, Siyuan Feng, Tan Lee

Comments: 8 pages, accepted and presented in APSIPA-APC 2018. This work was done when Man-Ling Sung and Siyuan Feng were postgraduate students in the Chinese University of Hong Kong

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[22] arXiv:2011.01991 [pdf, other]: Title: Internal Language Model Estimation for Domain-Adaptive End-to-End Speech Recognition

Authors: Zhong Meng, Sarangarajan Parthasarathy, Eric Sun, Yashesh Gaur, Naoyuki Kanda, Liang Lu, Xie Chen, Rui Zhao, Jinyu Li, Yifan Gong

Comments: 8 pages, 2 figures, SLT 2021

Journal-ref: 2021 IEEE Spoken Language Technology Workshop (SLT)

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[23] arXiv:2011.01997 [pdf, other]: Title: DOVER-Lap: A Method for Combining Overlap-aware Diarization Outputs

Authors: Desh Raj, Leibny Paola Garcia-Perera, Zili Huang, Shinji Watanabe, Daniel Povey, Andreas Stolcke, Sanjeev Khudanpur

Comments: Accepted to IEEE SLT 2021

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[24] arXiv:2011.02008 [pdf, other]: Title: Complex ratio masking for singing voice separation

Authors: Yixuan Zhang, Yuzhou Liu, DeLiang Wang

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[25] arXiv:2011.02014 [pdf, other]: Title: Integration of speech separation, diarization, and recognition for multi-speaker meetings: System description, comparison, and analysis

Authors: Desh Raj, Pavel Denisov, Zhuo Chen, Hakan Erdogan, Zili Huang, Maokui He, Shinji Watanabe, Jun Du, Takuya Yoshioka, Yi Luo, Naoyuki Kanda, Jinyu Li, Scott Wisdom, John R. Hershey

Comments: Accepted to IEEE SLT 2021

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
