Audio and Speech Processing

Authors and titles for eess.AS in Mar 2018

[ total of 63 entries: 1-25 | 26-50 | 51-63 ]

[1] arXiv:1803.00396 [pdf, ps, other]: Title: Speech Enhancement in Adverse Environments Based on Non-stationary Noise-driven Spectral Subtraction and SNR-dependent Phase Compensation

Authors: Md Tauhidul Islam, Asaduzzaman, Celia Shahnaz, Wei-Ping Zhu, M. Omair Ahmad

Comments: 15 pages, 10 figures, 8 tables. arXiv admin note: substantial text overlap with arXiv:1802.02665; text overlap with arXiv:1802.05125

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[2] arXiv:1803.00860 [pdf, other]: Title: Can we steal your vocal identity from the Internet?: Initial investigation of cloning Obama's voice using GAN, WaveNet and low-quality found data

Authors: Jaime Lorenzo-Trueba, Fuming Fang, Xin Wang, Isao Echizen, Junichi Yamagishi, Tomi Kinnunen

Comments: conference manuscript submitted to Speaker Odyssey 2018

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD); Machine Learning (stat.ML)
[3] arXiv:1803.00886 [pdf, other]: Title: Deep factorization for speech signal

Authors: Lantian Li, Dong Wang, Yixiang Chen, Ying Shi, Zhiyuan Tang, Thomas Fang Zheng

Comments: Accepted by ICASSP 2018. arXiv admin note: substantial text overlap with arXiv:1706.01777

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[4] arXiv:1803.01122 [pdf, other]: Title: An Ensemble Framework of Voice-Based Emotion Recognition System for Films and TV Programs

Authors: Fei Tao, Gang Liu, Qingen Zhao

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[5] arXiv:1803.01841 [pdf, ps, other]: Title: Enhancement of Noisy Speech exploiting a Gaussian Modeling based Threshold and a PDF Dependent Thresholding Function

Authors: Md Tauhidul Islam, Celia Shahnaz

Comments: 22 pages, 18 figures, 8 tables; submitted to EURASIP Journal on Audio, Speech, and Music Processing. arXiv admin note: substantial text overlap with arXiv:1802.05962; text overlap with arXiv:1802.03472

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[6] arXiv:1803.02353 [pdf, other]: Title: Multi-level Attention Model for Weakly Supervised Audio Classification

Authors: Changsong Yu, Karim Said Barsim, Qiuqiang Kong, Bin Yang

Comments: 5 pages, 3 figures, Submitted to Eusipco 2018

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[7] arXiv:1803.02445 [pdf, other]: Title: Linear networks based speaker adaptation for speech synthesis

Authors: Zhiying Huang, Heng Lu, Ming Lei, Zhijie Yan

Comments: 5 pages, 6 figures, accepted by ICASSP 2018

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[8] arXiv:1803.02870 [pdf, ps, other]: Title: Speech Enhancement Based on Non-stationary Noise-driven Geometric Spectral Subtraction and Phase Spectrum Compensation

Authors: Md Tauhidul Islam, Udoy Saha, K.T. Shahid, Ahmed Bin Hussain, Celia Shahnaz

Comments: 13 pages, 10 figures, 8 tables. arXiv admin note: substantial text overlap with arXiv:1803.00396; text overlap with arXiv:1802.02665, arXiv:1802.05125, arXiv:1803.01841

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[9] arXiv:1803.04030 [pdf, ps, other]: Title: Modeling Singing F0 With Neural Network Driven Transition-Sustain Models

Authors: Kanru Hua

Comments: 5 pages, 5 figures

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[10] arXiv:1803.05307 [pdf, other]: Title: Deep CNN based feature extractor for text-prompted speaker recognition

Authors: Sergey Novoselov, Oleg Kudashev, Vadim Schemelinin, Ivan Kremnev, Galina Lavrentyeva

Comments: Submitted to ICASSP 2018

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[11] arXiv:1803.05427 [pdf, other]: Title: Speaker Verification using Convolutional Neural Networks

Authors: Hossein Salehghaffari

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[12] arXiv:1803.06718 [pdf, other]: Title: Directional emphasis in ambisonics

Authors: W. Bastiaan Kleijn

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[13] arXiv:1803.08243 [pdf, other]: Title: Speech Dereverberation Using Fully Convolutional Networks

Authors: Ori Ernst, Shlomo E. Chazan, Sharon Gannot, Jacob Goldberger

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[14] arXiv:1803.09013 [pdf, ps, other]: Title: Exploring the robustness of features and enhancement on speech recognition systems in highly-reverberant real environments

Authors: José Novoa, Juan Pablo Escudero, Jorge Wuth, Victor Poblete, Simon King, Richard Stern, Néstor Becerra Yoma

Comments: 5 pages

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[15] arXiv:1803.09016 [pdf, ps, other]: Title: An improved DNN-based spectral feature mapping that removes noise and reverberation for robust automatic speech recognition

Authors: Juan Pablo Escudero, José Novoa, Rodrigo Mahu, Jorge Wuth, Fernando Huenupán, Richard Stern, Néstor Becerra Yoma

Comments: 5 pages

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[16] arXiv:1803.09946 [pdf, other]: Title: Complex-Valued Restricted Boltzmann Machine for Direct Speech Parameterization from Complex Spectra

Authors: Toru Nakashika, Shinji Takaki, Junichi Yamagishi

Comments: Under the IEEE T-ASLP Review

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[17] arXiv:1803.09960 [src]: Title: Automatic Minimisation of Masking in Multitrack Audio using Subgroups

Authors: David Ronan, Zheng Ma, Paul Mc Namara, Hatice Gunes, Joshua D. Reiss

Comments: Need to resolve ownership of intellectual property

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[18] arXiv:1803.10013 [pdf, other]: Title: Student-Teacher Learning for BLSTM Mask-based Speech Enhancement

Authors: Aswin Shanmugam Subramanian, Szu-Jui Chen, Shinji Watanabe

Comments: Submitted for Interspeech 2018

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[19] arXiv:1803.10136 [pdf, other]: Title: Comprehending Real Numbers: Development of Bengali Real Number Speech Corpus

Authors: Md Mahadi Hasan Nahid, Md. Ashraful Islam, Bishwajit Purkaystha, Md Saiful Islam

Comments: 9 pages

Subjects: Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Sound (cs.SD)
[20] arXiv:1803.10225 [pdf, other]: Title: Light Gated Recurrent Units for Speech Recognition

Authors: Mirco Ravanelli, Philemon Brakel, Maurizio Omologo, Yoshua Bengio

Comments: Copyright 2018 IEEE

Journal-ref: IEEE Transactions on Emerging Topics in Computational Intelligence, vol. 2, no. 2, pp. 92-102, April 2018

Subjects: Audio and Speech Processing (eess.AS); Neural and Evolutionary Computing (cs.NE); Sound (cs.SD); Signal Processing (eess.SP)
[21] arXiv:1803.10963 [pdf, ps, other]: Title: Attentive Statistics Pooling for Deep Speaker Embedding

Authors: Koji Okabe, Takafumi Koshinaka, Koichi Shinoda

Comments: Proc. Interspeech 2018, pp. 2252-2256. arXiv admin note: text overlap with arXiv:1809.09311

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[22] arXiv:1803.11344 [pdf, other]: Title: Detecting Alzheimer's Disease Using Gated Convolutional Neural Network from Audio Data

Authors: Tifani Warnita, Nakamasa Inoue, Koichi Shinoda

Comments: 5 pages, 3 figures, submitted to INTERSPEECH 2018

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[23] arXiv:1803.11154 (cross-list from eess.IV) [pdf, other]: Title: An empirical approach to the relationship between emotion and music production quality

Authors: David Ronan, Joshua D. Reiss, Hatice Gunes

Comments: 12 Pages

Subjects: Image and Video Processing (eess.IV); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[24] arXiv:1803.00187 (cross-list from cs.SD) [pdf, other]: Title: Mode Domain Spatial Active Noise Control Using Sparse Signal Representation

Authors: Yu Maeno, Yuki Mitsufuji, Thushara D. Abhayapala

Comments: to appear at ICASSP 2018

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS); Signal Processing (eess.SP)
[25] arXiv:1803.00721 (cross-list from cs.CL) [pdf, ps, other]: Title: Age Group Classification with Speech and Metadata Multimodality Fusion

Authors: Denys Katerenchuk

Journal-ref: Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers, 2017

Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
