Computation and Language

Authors and titles for cs.CL in Aug 2020, skipping first 350

[ total of 395 entries: 1-50 | ... | 201-250 | 251-300 | 301-350 | 351-395 ]

[351] arXiv:2008.00731 (cross-list from eess.AS) [pdf, ps, other]: Title: Unsupervised Discovery of Recurring Speech Patterns Using Probabilistic Adaptive Metrics

Authors: Okko Räsänen, María Andrea Cruz Blandón

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL)
[352] arXiv:2008.00768 (cross-list from eess.AS) [pdf, other]: Title: One Model, Many Languages: Meta-learning for Multilingual Text-to-Speech

Authors: Tomáš Nekvinda, Ondřej Dušek

Comments: Accepted to INTERSPEECH 2020; for the source files, see this https URL

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG)
[353] arXiv:2008.01300 (cross-list from eess.AS) [pdf, other]: Title: Weakly Supervised Construction of ASR Systems with Massive Video Data

Authors: Mengli Cheng, Chengyu Wang, Xu Hu, Jun Huang, Xiaobo Wang

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[354] arXiv:2008.01504 (cross-list from eess.AS) [pdf, other]: Title: "This is Houston. Say again, please". The Behavox system for the Apollo-11 Fearless Steps Challenge (phase II)

Authors: Arseniy Gorin, Daniil Kulko, Steven Grima, Alex Glasman

Comments: Accepted to Interspeech 2020

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[355] arXiv:2008.01832 (cross-list from eess.AS) [pdf, other]: Title: Future Vector Enhanced LSTM Language Model for LVCSR

Authors: Qi Liu, Yanmin Qian, Kai Yu

Comments: Accepted by ASRU-2017

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[356] arXiv:2008.02516 (cross-list from eess.AS) [pdf, other]: Title: FastLR: Non-Autoregressive Lipreading Model with Integrate-and-Fire

Authors: Jinglin Liu, Yi Ren, Zhou Zhao, Chen Zhang, Baoxing Huai, Nicholas Jing Yuan

Comments: Accepted by ACM MM 2020

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Sound (cs.SD)
[357] arXiv:2008.02603 (cross-list from eess.AS) [pdf, other]: Title: Data balancing for boosting performance of low-frequency classes in Spoken Language Understanding

Authors: Judith Gaspers, Quynh Do, Fabian Triefenbach

Comments: accepted at InterSpeech 2020

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[358] arXiv:2008.02885 (cross-list from physics.soc-ph) [pdf, other]: Title: A general solution to the preferential selection model

Authors: Jake Ryland Williams, Diana Solano-Oropeza, Jacob R. Hunsberger

Subjects: Physics and Society (physics.soc-ph); Computation and Language (cs.CL); Computers and Society (cs.CY)
[359] arXiv:2008.03029 (cross-list from eess.AS) [pdf, other]: Title: Peking Opera Synthesis via Duration Informed Attention Network

Authors: Yusong Wu, Shengchen Li, Chengzhu Yu, Heng Lu, Chao Weng, Liqiang Zhang, Dong Yu

Comments: Accepted by INTERSPEECH 2020

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[360] arXiv:2008.03088 (cross-list from eess.AS) [pdf, other]: Title: Pretraining Techniques for Sequence-to-Sequence Voice Conversion

Authors: Wen-Chin Huang, Tomoki Hayashi, Yi-Chiao Wu, Hirokazu Kameoka, Tomoki Toda

Comments: Preprint. Under review

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[361] arXiv:2008.03183 (cross-list from eess.AS) [pdf, ps, other]: Title: Applying Speech Tempo-Derived Features, BoAW and Fisher Vectors to Detect Elderly Emotion and Speech in Surgical Masks

Authors: Gábor Gosztolya, László Tóth

Comments: rejected from Interspeech, ComParE Challenge (Mask & Elderly Emotion Sub-Challenges)

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[362] arXiv:2008.03359 (cross-list from eess.AS) [pdf, other]: Title: A New Approach to Accent Recognition and Conversion for Mandarin Chinese

Authors: Lin Ai, Shih-Ying Jeng, Homayoon Beigi

Comments: 11 pages, 7 figures, and 10 tables

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[363] arXiv:2008.03403 (cross-list from eess.AS) [pdf, other]: Title: Word Error Rate Estimation Without ASR Output: e-WER2

Authors: Ahmed Ali, Steve Renals

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[364] arXiv:2008.03425 (cross-list from eess.AS) [pdf, other]: Title: Deep F-measure Maximization for End-to-End Speech Understanding

Authors: Leda Sarı, Mark Hasegawa-Johnson

Comments: Interspeech 2020 submission (Accepted)

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[365] arXiv:2008.03687 (cross-list from eess.AS) [pdf, other]: Title: LRSpeech: Extremely Low-Resource Speech Synthesis and Recognition

Authors: Jin Xu, Xu Tan, Yi Ren, Tao Qin, Jian Li, Sheng Zhao, Tie-Yan Liu

Journal-ref: KDD 2020

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG)
[366] arXiv:2008.03802 (cross-list from eess.AS) [pdf, other]: Title: SpeedySpeech: Efficient Neural Speech Synthesis

Authors: Jan Vainer, Ondřej Dušek

Comments: 5 pages, 3 figures, Interspeech 2020

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[367] arXiv:2008.03992 (cross-list from eess.AS) [pdf, other]: Title: VAW-GAN for Singing Voice Conversion with Non-parallel Training Data

Authors: Junchen Lu, Kun Zhou, Berrak Sisman, Haizhou Li

Comments: Accepted to APSIPA ASC 2020

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[368] arXiv:2008.04481 (cross-list from eess.AS) [pdf, other]: Title: Transformer with Bidirectional Decoder for Speech Recognition

Authors: Xi Chen, Songyang Zhang, Dandan Song, Peng Ouyang, Shouyi Yin

Comments: Accepted by InterSpeech 2020

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[369] arXiv:2008.04527 (cross-list from eess.AS) [pdf, other]: Title: Neural PLDA Modeling for End-to-End Speaker Verification

Authors: Shreyas Ramoji, Prashant Krishnan, Sriram Ganapathy

Comments: Accepted in Interspeech 2020. GitHub Implementation Repos: this https URL and this https URL

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[370] arXiv:2008.04546 (cross-list from eess.AS) [pdf, other]: Title: Investigation of End-To-End Speaker-Attributed ASR for Continuous Multi-Talker Recordings

Authors: Naoyuki Kanda, Xuankai Chang, Yashesh Gaur, Xiaofei Wang, Zhong Meng, Zhuo Chen, Takuya Yoshioka

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[371] arXiv:2008.04562 (cross-list from eess.AS) [pdf, other]: Title: Spectrum and Prosody Conversion for Cross-lingual Voice Conversion with CycleGAN

Authors: Zongyang Du, Kun Zhou, Berrak Sisman, Haizhou Li

Comments: Accepted to APSIPA ASC 2020

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[372] arXiv:2008.05011 (cross-list from eess.AS) [pdf, other]: Title: Compact Speaker Embedding: lrx-vector

Authors: Munir Georges, Jonathan Huang, Tobias Bocklet

Comments: Accepted to INTERSPEECH 2020

Journal-ref: Proc. Interspeech 2020

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[373] arXiv:2008.05086 (cross-list from eess.AS) [pdf, other]: Title: Transfer Learning Approaches for Streaming End-to-End Speech Recognition System

Authors: Vikas Joshi, Rui Zhao, Rupesh R. Mehta, Kshitiz Kumar, Jinyu Li

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[374] arXiv:2008.05284 (cross-list from eess.AS) [pdf, other]: Title: Modeling Prosodic Phrasing with Multi-Task Learning in Tacotron-based TTS

Authors: Rui Liu, Berrak Sisman, Feilong Bao, Guanglai Gao, Haizhou Li

Comments: To appear in IEEE Signal Processing Letters (SPL)

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[375] arXiv:2008.05514 (cross-list from eess.AS) [pdf, other]: Title: Online Automatic Speech Recognition with Listen, Attend and Spell Model

Authors: Roger Hsiao, Dogan Can, Tim Ng, Ruchir Travadi, Arnab Ghoshal

Comments: 5 pages, 4 figures, this version is submitted to IEEE Signal Processing Letters

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[376] arXiv:2008.05656 (cross-list from eess.AS) [pdf, other]: Title: Prosody Learning Mechanism for Speech Synthesis System Without Text Length Limit

Authors: Zhen Zeng, Jianzong Wang, Ning Cheng, Jing Xiao

Comments: will be presented in INTERSPEECH 2020

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[377] arXiv:2008.05671 (cross-list from eess.AS) [pdf, other]: Title: Large-scale Transfer Learning for Low-resource Spoken Language Understanding

Authors: Xueli Jia, Jianzong Wang, Zhiyong Zhang, Ning Cheng, Jing Xiao

Comments: will be presented in INTERSPEECH 2020

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[378] arXiv:2008.05750 (cross-list from eess.AS) [pdf, other]: Title: Conv-Transformer Transducer: Low Latency, Low Frame Rate, Streamable End-to-End Speech Recognition

Authors: Wenyong Huang, Wenchao Hu, Yu Ting Yeung, Xiao Chen

Comments: Accepted by INTERSPEECH 2020

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[379] arXiv:2008.05773 (cross-list from eess.AS) [pdf, other]: Title: Continuous Speech Separation with Conformer

Authors: Sanyuan Chen, Yu Wu, Zhuo Chen, Jian Wu, Jinyu Li, Takuya Yoshioka, Chengyi Wang, Shujie Liu, Ming Zhou

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL)
[380] arXiv:2008.06146 (cross-list from eess.AS) [pdf, ps, other]: Title: End-to-End Trainable Self-Attentive Shallow Network for Text-Independent Speaker Verification

Authors: Hyeonmook Park, Jungbae Park, Sang Wan Lee

Comments: 5 pages, 3 figures, 3 tables

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[381] arXiv:2008.06208 (cross-list from eess.AS) [pdf, ps, other]: Title: Adaptable Multi-Domain Language Model for Transformer ASR

Authors: Taewoo Lee, Min-Joong Lee, Tae Gyoon Kang, Seokyeoung Jung, Minseok Kwon, Yeona Hong, Jungin Lee, Kyoung-Gu Woo, Ho-Gyeong Kim, Jiseung Jeong, Jihyun Lee, Hosik Lee, Young Sang Choi

Comments: This paper is accepted for presentation at IEEE International Conference on Acoustics, Speech and Signal Processing (IEEE ICASSP), 2021

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[382] arXiv:2008.06580 (cross-list from eess.AS) [pdf, other]: Title: Adaptation Algorithms for Neural Network-Based Speech Recognition: An Overview

Authors: Peter Bell, Joachim Fainberg, Ondrej Klejch, Jinyu Li, Steve Renals, Pawel Swietojanski

Comments: Total of 31 pages, 27 figures. Associated repository: this https URL

Journal-ref: IEEE Open Journal of Signal Processing, vol. 2, pp. 33-66, 2021

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[383] arXiv:2008.06682 (cross-list from eess.AS) [pdf, other]: Title: Jointly Fine-Tuning "BERT-like" Self Supervised Models to Improve Multimodal Speech Emotion Recognition

Authors: Shamane Siriwardhana, Andrew Reis, Rivindu Weerasekera, Suranga Nanayakkara

Comments: Accepted to INTERSPEECH 2020

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[384] arXiv:2008.06867 (cross-list from eess.AS) [pdf, other]: Title: Audio Dequantization for High Fidelity Audio Generation in Flow-based Neural Vocoder

Authors: Hyun-Wook Yoon, Sang-Hoon Lee, Hyeong-Rae Noh, Seong-Whan Lee

Comments: Accepted in INTERSPEECH2020

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[385] arXiv:2008.06996 (cross-list from q-bio.NC) [pdf, other]: Title: Large Associative Memory Problem in Neurobiology and Machine Learning

Authors: Dmitry Krotov, John Hopfield

Comments: Accepted for publication at ICLR 2021

Subjects: Neurons and Cognition (q-bio.NC); Disordered Systems and Neural Networks (cond-mat.dis-nn); Computation and Language (cs.CL); Machine Learning (cs.LG); Machine Learning (stat.ML)
[386] arXiv:2008.07118 (cross-list from eess.AS) [pdf, other]: Title: PIANOTREE VAE: Structured Representation Learning for Polyphonic Music

Authors: Ziyu Wang, Yiyi Zhang, Yixiao Zhang, Junyan Jiang, Ruihan Yang, Junbo Zhao (Jake), Gus Xia

Journal-ref: In Proceedings of 21st International Conference on Music Information Retrieval (ISMIR), Montreal, Canada (virtual conference), 2020

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[387] arXiv:2008.07520 (cross-list from eess.AS) [pdf, other]: Title: Do face masks introduce bias in speech technologies? The case of automated scoring of speaking proficiency

Authors: Anastassia Loukina, Keelan Evanini, Matthew Mulholland, Ian Blood, Klaus Zechner

Journal-ref: Proceedings of Interspeech 2020, 1942-1946

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Human-Computer Interaction (cs.HC); Sound (cs.SD)
[388] arXiv:2008.08113 (cross-list from eess.AS) [pdf, other]: Title: Complementary Language Model and Parallel Bi-LRNN for False Trigger Mitigation

Authors: Rishika Agarwal, Xiaochuan Niu, Pranay Dighe, Srikanth Vishnubhotla, Sameer Badaskar, Devang Naik

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[389] arXiv:2008.08901 (cross-list from eess.AS) [pdf, other]: Title: Speaker-Utterance Dual Attention for Speaker and Utterance Verification

Authors: Tianchi Liu, Rohan Kumar Das, Maulik Madhavi, Shengmei Shen, Haizhou Li

Comments: Accepted by Interspeech 2020

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD); Signal Processing (eess.SP)
[390] arXiv:2008.09207 (cross-list from eess.AS) [pdf, other]: Title: Dyadic Speech-based Affect Recognition using DAMI-P2C Parent-child Multimodal Interaction Dataset

Authors: Huili Chen, Yue Zhang, Felix Weninger, Rosalind Picard, Cynthia Breazeal, Hae Won Park

Comments: Accepted by the 2020 International Conference on Multimodal Interaction (ICMI'20)

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[391] arXiv:2008.09483 (cross-list from eess.AS) [pdf, other]: Title: Laughter Synthesis: Combining Seq2seq modeling with Transfer Learning

Authors: Noé Tits, Kevin El Haddad, Thierry Dutoit

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[392] arXiv:2008.09659 (cross-list from eess.AS) [pdf, other]: Title: Efficient neural speech synthesis for low-resource languages through multilingual modeling

Authors: Marcel de Korte, Jaebok Kim, Esther Klabbers

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[393] arXiv:2008.11045 (cross-list from eess.AS) [pdf, other]: Title: ICE-Talk: an Interface for a Controllable Expressive Talking Machine

Authors: Noé Tits, Kevin El Haddad, Thierry Dutoit

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[394] arXiv:2008.12914 (cross-list from eess.AS) [pdf, other]: Title: Data augmentation using prosody and false starts to recognize non-native children's speech

Authors: Hemant Kathania, Mittul Singh, Tamás Grósz, Mikko Kurimo

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL)
[395] arXiv:2008.13093 (cross-list from eess.AS) [pdf, other]: Title: Parallel Rescoring with Transformer for Streaming On-Device Speech Recognition

Authors: Wei Li, James Qin, Chung-Cheng Chiu, Ruoming Pang, Yanzhang He

Comments: Proceedings of Interspeech, 2020

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL)
