Audio and Speech Processing

Authors and titles for eess.AS in Oct 2017

Total of 59 entries; showing entries 1-25.

[1] arXiv:1710.00113 [pdf, ps, other]: Title: UTD-CRSS Submission for MGB-3 Arabic Dialect Identification: Front-end and Back-end Advancements on Broadcast Speech

Authors: Ahmet E. Bulut, Qian Zhang, Chunlei Zhang, Fahimeh Bahmaninezhad, John H. L. Hansen

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[2] arXiv:1710.00116 [pdf, ps, other]: Title: PLDA-Based Diarization of Telephone Conversations

Authors: Ahmet E. Bulut, Hakan Demir, Yusuf Ziya Isik, Hakan Erdogan

Journal-ref: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings 1 (2015) 4809-4813

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[3] arXiv:1710.01904 [pdf, ps, other]: Title: Head shadow enhancement with low-frequency beamforming improves sound localization and speech perception for simulated bimodal listeners

Authors: Benjamin Dieudonné, Tom Francart

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[4] arXiv:1710.02369 [pdf, other]: Title: End-to-end DNN Based Speaker Recognition Inspired by i-vector and PLDA

Authors: Johan Rohdin, Anna Silnova, Mireia Diez, Oldrich Plchot, Pavel Matejka, Lukas Burget

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[5] arXiv:1710.02560 [pdf, other]: Title: The DIRHA-English corpus and related tasks for distant-speech recognition in domestic environments

Authors: Mirco Ravanelli, Maurizio Omologo

Comments: ASRU 2015

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[6] arXiv:1710.03538 [pdf, other]: Title: Contaminated speech training methods for robust DNN-HMM distant speech recognition

Authors: Mirco Ravanelli, Maurizio Omologo

Journal-ref: INTERSPEECH 2015

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[7] arXiv:1710.03975 [pdf, other]: Title: PROSE: Perceptual Risk Optimization for Speech Enhancement

Authors: Jishnu Sadasivan, Chandra Sekhar Seelamantula, Nagarjuna Reddy Muraka

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[8] arXiv:1710.04288 [pdf, other]: Title: Audio Concept Classification with Hierarchical Deep Neural Networks

Authors: Mirco Ravanelli, Benjamin Elizalde, Karl Ni, Gerald Friedland

Journal-ref: EUSIPCO 2014

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[9] arXiv:1710.08633 [pdf, other]: Title: On the Conditioning of the Spherical Harmonic Matrix for Spatial Audio Applications

Authors: C Sandeep Reddy, Rajesh M Hegde

Comments: 12 pages; This paper is a preprint of a paper submitted to IET Signal Processing Journal. If accepted, the copy of record will be available at the IET Digital Library

Subjects: Audio and Speech Processing (eess.AS)
[10] arXiv:1710.09985 [pdf, other]: Title: Acoustic Landmarks Contain More Information About the Phone String than Other Frames for Automatic Speech Recognition with Deep Neural Network Acoustic Model

Authors: Di He, Boon Pang Lim, Xuesong Yang, Mark Hasegawa-Johnson, Deming Chen

Comments: The article has been submitted to Journal of the Acoustical Society of America

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[11] arXiv:1710.10432 [pdf, other]: Title: Jointly Tracking and Separating Speech Sources Using Multiple Features and the generalized labeled multi-Bernoulli Framework

Authors: Shoufeng Lin

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[12] arXiv:1710.10467 [pdf, other]: Title: Generalized End-to-End Loss for Speaker Verification

Authors: Li Wan, Quan Wang, Alan Papir, Ignacio Lopez Moreno

Comments: Published at ICASSP 2018

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Machine Learning (stat.ML)
[13] arXiv:1710.10468 [pdf, other]: Title: Speaker Diarization with LSTM

Authors: Quan Wang, Carlton Downey, Li Wan, Philip Andrew Mansfield, Ignacio Lopez Moreno

Comments: Published at ICASSP 2018

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[14] arXiv:1710.10470 [pdf, other]: Title: Attention-Based Models for Text-Dependent Speaker Verification

Authors: F A Rezaur Rahman Chowdhury, Quan Wang, Ignacio Lopez Moreno, Li Wan

Comments: Submitted to ICASSP 2018

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[15] arXiv:1710.11317 [pdf, other]: Title: Nebula: F0 Estimation and Voicing Detection by Modeling the Statistical Properties of Feature Extractors

Authors: Kanru Hua

Comments: To be presented at Interspeech 2018

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[16] arXiv:1710.00082 (cross-list from cs.SD) [pdf, ps, other]: Title: Real-Time Wind Noise Detection and Suppression with Neural-Based Signal Reconstruction for Mult-Channel, Low-Power Devices

Authors: Anthony D. Rhodes

Comments: 5 pages, 8 figures

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[17] arXiv:1710.00343 (cross-list from cs.SD) [pdf, other]: Title: Large-scale weakly supervised audio classification using gated convolutional neural network

Authors: Yong Xu, Qiuqiang Kong, Wenwu Wang, Mark D. Plumbley

Comments: submitted to ICASSP2018, summary on the 1st place system in DCASE2017 task4 challenge

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[18] arXiv:1710.00683 (cross-list from cs.CL) [pdf, ps, other]: Title: The Dependence of Frequency Distributions on Multiple Meanings of Words, Codes and Signs

Authors: Xiaoyong Yan, Petter Minnhagen

Comments: 10 pages, 12 figures

Journal-ref: Physica A 490, 554-564 (2018)

Subjects: Computation and Language (cs.CL); Audio and Speech Processing (eess.AS); Physics and Society (physics.soc-ph)
[19] arXiv:1710.01446 (cross-list from cs.SD) [pdf, other]: Title: Improving Compression Based Dissimilarity Measure for Music Score Analysis

Authors: Ayaka Takamoto, Mayu Umemura, Mitsuo Yoshida, Kyoji Umemura

Comments: The 2016 International Conference On Advanced Informatics: Concepts, Theory And Application (ICAICTA2016)

Subjects: Sound (cs.SD); Other Computer Science (cs.OH); Audio and Speech Processing (eess.AS)
[20] arXiv:1710.01589 (cross-list from cs.SD) [pdf, other]: Title: Independent Low-Rank Matrix Analysis Based on Parametric Majorization-Equalization Algorithm

Authors: Yoshiki Mitsui, Daichi Kitamura, Norihiro Takamune, Hiroshi Saruwatari, Yu Takahashi, Kazunobu Kondo

Comments: Preprint Manuscript of 2017 IEEE International Workshop on Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP 2017)

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[21] arXiv:1710.02280 (cross-list from cs.SD) [pdf, other]: Title: Generating Nontrivial Melodies for Music as a Service

Authors: Yifei Teng, An Zhao, Camille Goudeseune

Comments: ISMIR 2017 Conference

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS)
[22] arXiv:1710.02997 (cross-list from cs.SD) [pdf, other]: Title: A report on sound event detection with different binaural features

Authors: Sharath Adavanne, Tuomas Virtanen

Comments: Technical report for the top performing method in Task 3: Real life sound event detection challenge, at Detection and classification of acoustic scene and events (DCASE) 2017

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[23] arXiv:1710.02998 (cross-list from cs.SD) [pdf, other]: Title: Sound event detection using weakly labeled dataset with stacked convolutional and recurrent neural network

Authors: Sharath Adavanne, Tuomas Virtanen

Comments: Accepted in Detection and Classification of Acoustic Scenes and Events (DCASE 2017)

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[24] arXiv:1710.04196 (cross-list from cs.SD) [pdf, other]: Title: Pyroomacoustics: A Python package for audio room simulations and array processing algorithms

Authors: Robin Scheibler, Eric Bezzam, Ivan Dokmanić

Comments: 5 pages, 5 figures, describes a software package

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[25] arXiv:1710.06648 (cross-list from cs.SD) [pdf, other]: Title: Representation Learning of Music Using Artist Labels

Authors: Jiyoung Park, Jongpil Lee, Jangyeon Park, Jung-Woo Ha, Juhan Nam

Comments: 19th International Society for Music Information Retrieval Conference (ISMIR), 2018

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
