Computation and Language

Authors and titles for cs.CL in Aug 2020, skipping first 350

[ total of 395 entries: 1-50 | ... | 201-250 | 251-300 | 301-350 | 351-395 ]
[351]  arXiv:2008.00731 (cross-list from eess.AS) [pdf, ps, other]
Title: Unsupervised Discovery of Recurring Speech Patterns Using Probabilistic Adaptive Metrics
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL)
[352]  arXiv:2008.00768 (cross-list from eess.AS) [pdf, other]
Title: One Model, Many Languages: Meta-learning for Multilingual Text-to-Speech
Comments: Accepted to INTERSPEECH 2020; for the source files, see this https URL
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG)
[353]  arXiv:2008.01300 (cross-list from eess.AS) [pdf, other]
Title: Weakly Supervised Construction of ASR Systems with Massive Video Data
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[354]  arXiv:2008.01504 (cross-list from eess.AS) [pdf, other]
Title: "This is Houston. Say again, please". The Behavox system for the Apollo-11 Fearless Steps Challenge (phase II)
Comments: Accepted to Interspeech 2020
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[355]  arXiv:2008.01832 (cross-list from eess.AS) [pdf, other]
Title: Future Vector Enhanced LSTM Language Model for LVCSR
Comments: Accepted by ASRU-2017
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[356]  arXiv:2008.02516 (cross-list from eess.AS) [pdf, other]
Title: FastLR: Non-Autoregressive Lipreading Model with Integrate-and-Fire
Comments: Accepted by ACM MM 2020
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Sound (cs.SD)
[357]  arXiv:2008.02603 (cross-list from eess.AS) [pdf, other]
Title: Data balancing for boosting performance of low-frequency classes in Spoken Language Understanding
Comments: accepted at InterSpeech 2020
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[358]  arXiv:2008.02885 (cross-list from physics.soc-ph) [pdf, other]
Title: A general solution to the preferential selection model
Subjects: Physics and Society (physics.soc-ph); Computation and Language (cs.CL); Computers and Society (cs.CY)
[359]  arXiv:2008.03029 (cross-list from eess.AS) [pdf, other]
Title: Peking Opera Synthesis via Duration Informed Attention Network
Comments: Accepted by INTERSPEECH 2020
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[360]  arXiv:2008.03088 (cross-list from eess.AS) [pdf, other]
Title: Pretraining Techniques for Sequence-to-Sequence Voice Conversion
Comments: Preprint. Under review
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[361]  arXiv:2008.03183 (cross-list from eess.AS) [pdf, ps, other]
Title: Applying Speech Tempo-Derived Features, BoAW and Fisher Vectors to Detect Elderly Emotion and Speech in Surgical Masks
Comments: rejected from Interspeech, ComParE Challenge (Mask & Elderly Emotion Sub-Challenges)
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[362]  arXiv:2008.03359 (cross-list from eess.AS) [pdf, other]
Title: A New Approach to Accent Recognition and Conversion for Mandarin Chinese
Comments: 11 pages, 7 figures, and 10 tables
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[363]  arXiv:2008.03403 (cross-list from eess.AS) [pdf, other]
Title: Word Error Rate Estimation Without ASR Output: e-WER2
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[364]  arXiv:2008.03425 (cross-list from eess.AS) [pdf, other]
Title: Deep F-measure Maximization for End-to-End Speech Understanding
Comments: Interspeech 2020 submission (Accepted)
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[365]  arXiv:2008.03687 (cross-list from eess.AS) [pdf, other]
Title: LRSpeech: Extremely Low-Resource Speech Synthesis and Recognition
Journal-ref: KDD 2020
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG)
[366]  arXiv:2008.03802 (cross-list from eess.AS) [pdf, other]
Title: SpeedySpeech: Efficient Neural Speech Synthesis
Comments: 5 pages, 3 figures, Interspeech 2020
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[367]  arXiv:2008.03992 (cross-list from eess.AS) [pdf, other]
Title: VAW-GAN for Singing Voice Conversion with Non-parallel Training Data
Comments: Accepted to APSIPA ASC 2020
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[368]  arXiv:2008.04481 (cross-list from eess.AS) [pdf, other]
Title: Transformer with Bidirectional Decoder for Speech Recognition
Comments: Accepted by InterSpeech 2020
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[369]  arXiv:2008.04527 (cross-list from eess.AS) [pdf, other]
Title: Neural PLDA Modeling for End-to-End Speaker Verification
Comments: Accepted in Interspeech 2020. GitHub Implementation Repos: this https URL and this https URL
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[370]  arXiv:2008.04546 (cross-list from eess.AS) [pdf, other]
Title: Investigation of End-To-End Speaker-Attributed ASR for Continuous Multi-Talker Recordings
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[371]  arXiv:2008.04562 (cross-list from eess.AS) [pdf, other]
Title: Spectrum and Prosody Conversion for Cross-lingual Voice Conversion with CycleGAN
Comments: Accepted to APSIPA ASC 2020
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[372]  arXiv:2008.05011 (cross-list from eess.AS) [pdf, other]
Title: Compact Speaker Embedding: lrx-vector
Comments: Accepted to INTERSPEECH 2020
Journal-ref: Proc. Interspeech 2020
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[373]  arXiv:2008.05086 (cross-list from eess.AS) [pdf, other]
Title: Transfer Learning Approaches for Streaming End-to-End Speech Recognition System
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[374]  arXiv:2008.05284 (cross-list from eess.AS) [pdf, other]
Title: Modeling Prosodic Phrasing with Multi-Task Learning in Tacotron-based TTS
Comments: To appear in IEEE Signal Processing Letters (SPL)
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[375]  arXiv:2008.05514 (cross-list from eess.AS) [pdf, other]
Title: Online Automatic Speech Recognition with Listen, Attend and Spell Model
Comments: 5 pages, 4 figures, this version is submitted to IEEE Signal Processing Letters
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[376]  arXiv:2008.05656 (cross-list from eess.AS) [pdf, other]
Title: Prosody Learning Mechanism for Speech Synthesis System Without Text Length Limit
Comments: will be presented in INTERSPEECH 2020
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[377]  arXiv:2008.05671 (cross-list from eess.AS) [pdf, other]
Title: Large-scale Transfer Learning for Low-resource Spoken Language Understanding
Comments: will be presented in INTERSPEECH 2020
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[378]  arXiv:2008.05750 (cross-list from eess.AS) [pdf, other]
Title: Conv-Transformer Transducer: Low Latency, Low Frame Rate, Streamable End-to-End Speech Recognition
Comments: Accepted by INTERSPEECH 2020
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[379]  arXiv:2008.05773 (cross-list from eess.AS) [pdf, other]
Title: Continuous Speech Separation with Conformer
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL)
[380]  arXiv:2008.06146 (cross-list from eess.AS) [pdf, ps, other]
Title: End-to-End Trainable Self-Attentive Shallow Network for Text-Independent Speaker Verification
Comments: 5 pages, 3 figures, 3 tables
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[381]  arXiv:2008.06208 (cross-list from eess.AS) [pdf, ps, other]
Title: Adaptable Multi-Domain Language Model for Transformer ASR
Comments: This paper is accepted for presentation at IEEE International Conference on Acoustics, Speech and Signal Processing (IEEE ICASSP), 2021
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[382]  arXiv:2008.06580 (cross-list from eess.AS) [pdf, other]
Title: Adaptation Algorithms for Neural Network-Based Speech Recognition: An Overview
Comments: Total of 31 pages, 27 figures. Associated repository: this https URL
Journal-ref: IEEE Open Journal of Signal Processing, vol. 2, pp. 33-66, 2021
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[383]  arXiv:2008.06682 (cross-list from eess.AS) [pdf, other]
Title: Jointly Fine-Tuning "BERT-like" Self Supervised Models to Improve Multimodal Speech Emotion Recognition
Comments: Accepted to INTERSPEECH 2020
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[384]  arXiv:2008.06867 (cross-list from eess.AS) [pdf, other]
Title: Audio Dequantization for High Fidelity Audio Generation in Flow-based Neural Vocoder
Comments: Accepted in INTERSPEECH2020
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[385]  arXiv:2008.06996 (cross-list from q-bio.NC) [pdf, other]
Title: Large Associative Memory Problem in Neurobiology and Machine Learning
Comments: Accepted for publication at ICLR 2021
Subjects: Neurons and Cognition (q-bio.NC); Disordered Systems and Neural Networks (cond-mat.dis-nn); Computation and Language (cs.CL); Machine Learning (cs.LG); Machine Learning (stat.ML)
[386]  arXiv:2008.07118 (cross-list from eess.AS) [pdf, other]
Title: PIANOTREE VAE: Structured Representation Learning for Polyphonic Music
Journal-ref: In Proceedings of 21st International Conference on Music Information Retrieval (ISMIR), Montreal, Canada (virtual conference), 2020
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[387]  arXiv:2008.07520 (cross-list from eess.AS) [pdf, other]
Title: Do face masks introduce bias in speech technologies? The case of automated scoring of speaking proficiency
Journal-ref: Proceedings of Interspeech 2020, 1942-1946
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Human-Computer Interaction (cs.HC); Sound (cs.SD)
[388]  arXiv:2008.08113 (cross-list from eess.AS) [pdf, other]
Title: Complementary Language Model and Parallel Bi-LRNN for False Trigger Mitigation
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[389]  arXiv:2008.08901 (cross-list from eess.AS) [pdf, other]
Title: Speaker-Utterance Dual Attention for Speaker and Utterance Verification
Comments: Accepted by Interspeech 2020
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD); Signal Processing (eess.SP)
[390]  arXiv:2008.09207 (cross-list from eess.AS) [pdf, other]
Title: Dyadic Speech-based Affect Recognition using DAMI-P2C Parent-child Multimodal Interaction Dataset
Comments: Accepted by the 2020 International Conference on Multimodal Interaction (ICMI'20)
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[391]  arXiv:2008.09483 (cross-list from eess.AS) [pdf, other]
Title: Laughter Synthesis: Combining Seq2seq modeling with Transfer Learning
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[392]  arXiv:2008.09659 (cross-list from eess.AS) [pdf, other]
Title: Efficient neural speech synthesis for low-resource languages through multilingual modeling
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[393]  arXiv:2008.11045 (cross-list from eess.AS) [pdf, other]
Title: ICE-Talk: an Interface for a Controllable Expressive Talking Machine
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[394]  arXiv:2008.12914 (cross-list from eess.AS) [pdf, other]
Title: Data augmentation using prosody and false starts to recognize non-native children's speech
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL)
[395]  arXiv:2008.13093 (cross-list from eess.AS) [pdf, other]
Title: Parallel Rescoring with Transformer for Streaming On-Device Speech Recognition
Comments: Proceedings of Interspeech, 2020
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL)
