Audio and Speech Processing

Authors and titles for eess.AS in Dec 2019

[ total of 110 entries; showing entries 1-25 ]

[1] arXiv:1912.00938

Title: Speaker detection in the wild: Lessons learned from JSALT 2019

Authors: Paola Garcia, Jesus Villalba, Herve Bredin, Jun Du, Diego Castan, Alejandrina Cristia, Latane Bullock, Ling Guo, Koji Okabe, Phani Sankar Nidadavolu, Saurabh Kataria, Sizhu Chen, Leo Galmant, Marvin Lavechin, Lei Sun, Marie-Philippe Gill, Bar Ben-Yair, Sajjad Abdoli, Xin Wang, Wassim Bouaziz, Hadrien Titeux, Emmanuel Dupoux, Kong Aik Lee, Najim Dehak

Comments: Submitted to ICASSP 2020

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[2] arXiv:1912.01167

Title: High-quality Speech Synthesis Using Super-resolution Mel-Spectrogram

Authors: Leyuan Sheng, Dong-Yan Huang, Evgeniy N. Pavlovskiy

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[3] arXiv:1912.01679

Title: Deep Contextualized Acoustic Representations For Semi-Supervised Speech Recognition

Authors: Shaoshi Ling, Yuzong Liu, Julian Salazar, Katrin Kirchhoff

Comments: Accepted to ICASSP 2020 (oral)

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[4] arXiv:1912.01777

Title: Integrating Knowledge into End-to-End Speech Recognition from External Text-Only Data

Authors: Ye Bai, Jiangyan Yi, Jianhua Tao, Zhengqi Wen, Zhengkun Tian, Shuai Zhang

Comments: Submitted to TASLP

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[5] arXiv:1912.02591

Title: Investigating U-Nets with various Intermediate Blocks for Spectrogram-based Singing Voice Separation

Authors: Woosung Choi, Minseok Kim, Jaehwa Chung, Daewon Lee, Soonyoung Jung

Comments: 8 pages, 4 tables, 6 figures; accepted to ISMIR 2020

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Multimedia (cs.MM); Sound (cs.SD); Machine Learning (stat.ML)
[6] arXiv:1912.02606

Title: Predominant Musical Instrument Classification based on Spectral Features

Authors: Karthikeya Racharla, Vineet Kumar, Chaudhari Bhushan Jayant, Ankit Khairkar, Paturu Harish

Comments: Appeared in Proceedings of SPIN 2020

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[7] arXiv:1912.02608

Title: SEEF-ALDR: A Speaker Embedding Enhancement Framework via Adversarial Learning based Disentangled Representation

Authors: Jianwei Tai, Xiaoqi Jia, Qingjia Huang, Weijuan Zhang, Haichao Du, Shengzhi Zhang

Comments: 12 pages, 4 figures, Accepted by ACSAC 2020

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[8] arXiv:1912.02610

Title: Bimodal Speech Emotion Recognition Using Pre-Trained Language Models

Authors: Verena Heusser, Niklas Freymuth, Stefan Constantin, Alex Waibel

Comments: Life-Long Learning for Spoken Language Systems, ASRU 2019

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[9] arXiv:1912.02613

Title: Singing Voice Conversion with Disentangled Representations of Singer and Vocal Technique Using Variational Autoencoders

Authors: Yin-Jyun Luo, Chin-Chen Hsu, Kat Agres, Dorien Herremans

Comments: Accepted to ICASSP 2020

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[10] arXiv:1912.02615

Title: Audiovisual Transformer Architectures for Large-Scale Classification and Synchronization of Weakly Labeled Audio Events

Authors: Wim Boes, Hugo Van hamme

Journal-ref: Proceedings of the 27th ACM International Conference on Multimedia (MM '19). ACM, New York, NY, USA, 1961-1969

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[11] arXiv:1912.02671

Title: Audio-Visual Target Speaker Enhancement on Multi-Talker Environment using Event-Driven Cameras

Authors: Ander Arriandiaga, Giovanni Morrone, Luca Pasa, Leonardo Badino, Chiara Bartolozzi

Comments: Accepted at ISCAS 2021

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
[12] arXiv:1912.02958

Title: Synchronous Transformers for End-to-End Speech Recognition

Authors: Zhengkun Tian, Jiangyan Yi, Ye Bai, Jianhua Tao, Shuai Zhang, Zhengqi Wen

Comments: Accepted by ICASSP 2020

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG)
[13] arXiv:1912.03363

Title: Audio-attention discriminative language model for ASR rescoring

Authors: Ankur Gandhe, Ariya Rastrow

Comments: 4 pages, 1 figure, Accepted at ICASSP 2020

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[14] arXiv:1912.03627

Title: A Multi Purpose and Large Scale Speech Corpus in Persian and English for Speaker and Speech Recognition: the DeepMine Database

Authors: Hossein Zeinali, Lukáš Burget, Jan "Honza" Černocký

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[15] arXiv:1912.04067

Title: Visualizing Deep Neural Networks for Speech Recognition with Learned Topographic Filter Maps

Authors: Andreas Krug, Sebastian Stober

Comments: Accepted for 2019 ACL Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[16] arXiv:1912.04370

Title: Cross-Language Aphasia Detection using Optimal Transport Domain Adaptation

Authors: Aparna Balagopalan, Jekaterina Novikova, Matthew B. A. McDermott, Bret Nestor, Tristan Naumann, Marzyeh Ghassemi

Comments: Accepted to ML4H at NeurIPS 2019

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[17] arXiv:1912.04381

Title: A Dataset for measuring reading levels in India at scale

Authors: Dolly Agarwal, Jayant Gupchup, Nishant Baghel

Comments: 5 pages, 3 figures, 3 tables; accepted to ICASSP 2020

Subjects: Audio and Speech Processing (eess.AS); Computers and Society (cs.CY); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[18] arXiv:1912.04700

Title: Development and Evaluation of Video Recordings for the OLSA Matrix Sentence Test

Authors: Gerard Llorach, Frederike Kirschner, Giso Grimm, Melanie A. Zokoll, Kirsten C. Wagener, Volker Hohmann

Comments: 10 pages, 9 figures

Subjects: Audio and Speech Processing (eess.AS); Image and Video Processing (eess.IV)
[19] arXiv:1912.04844

Title: Quantifying the Chaos Level of Infants' Environment via Unsupervised Learning

Authors: Priyanka Khante, Mai Lee Chang, Domingo Martinez, Kaya de Barbaro, Edison Thomaz

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG)
[20] arXiv:1912.04979

Title: Advances in Online Audio-Visual Meeting Transcription

Authors: Takuya Yoshioka, Igor Abramovski, Cem Aksoylar, Zhuo Chen, Moshe David, Dimitrios Dimitriadis, Yifan Gong, Ilya Gurvich, Xuedong Huang, Yan Huang, Aviv Hurvitz, Li Jiang, Sharon Koubi, Eyal Krupka, Ido Leichter, Changliang Liu, Partha Parthasarathy, Alon Vinnikov, Lingfeng Wu, Xiong Xiao, Wayne Xiong, Huaming Wang, Zhenghao Wang, Jun Zhang, Yong Zhao, Tianyan Zhou

Comments: To appear in Proc. IEEE ASRU Workshop 2019

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD); Image and Video Processing (eess.IV)
[21] arXiv:1912.05038

Title: Cooperative Audio Source Separation and Enhancement Using Distributed Microphone Arrays and Wearable Devices

Authors: Ryan M. Corey, Matthew D. Skarha, Andrew C. Singer

Comments: To appear at CAMSAP 2019

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[22] arXiv:1912.05043

Title: Motion-Tolerant Beamforming with Deformable Microphone Arrays

Authors: Ryan M. Corey, Andrew C. Singer

Comments: Presented at WASPAA 2019

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[23] arXiv:1912.05472

Title: Audiogmenter: a MATLAB Toolbox for Audio Data Augmentation

Authors: Gianluca Maguolo, Michelangelo Paci, Loris Nanni, Ludovico Bonan

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[24] arXiv:1912.05533

Title: SpecAugment on Large Scale Datasets

Authors: Daniel S. Park, Yu Zhang, Chung-Cheng Chiu, Youzheng Chen, Bo Li, William Chan, Quoc V. Le, Yonghui Wu

Comments: 5 pages, 3 tables; submitted to ICASSP 2020

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[25] arXiv:1912.05869

Title: On Neural Phone Recognition of Mixed-Source ECoG Signals

Authors: Ahmed Hussen Abdelaziz, Shuo-Yiin Chang, Nelson Morgan, Erik Edwards, Dorothea Kolossa, Dan Ellis, David A. Moses, Edward F. Chang

Comments: 5 pages, showing algorithms, results and references from our collaboration during a 2017 postdoc stay of the first author

Subjects: Audio and Speech Processing (eess.AS); Neural and Evolutionary Computing (cs.NE); Sound (cs.SD); Neurons and Cognition (q-bio.NC)
