Audio and Speech Processing

Authors and titles for eess.AS in Apr 2019, skipping first 75

[ total of 167 entries: 1-25 | 26-50 | 51-75 | 76-100 | 101-125 | 126-150 | 151-167 ]

[76] arXiv:1904.03522 (cross-list from cs.SD) [pdf, other]: Title: Taco-VC: A Single Speaker Tacotron based Voice Conversion with Limited Data

Authors: Roee Levy Leshem, Raja Giryes

Comments: Accepted to EUSIPCO 2020

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[77] arXiv:1904.03543 (cross-list from cs.SD) [pdf, ps, other]: Title: Spatio-Temporal Attention Pooling for Audio Scene Classification

Authors: Huy Phan, Oliver Y. Chén, Lam Pham, Philipp Koch, Maarten De Vos, Ian McLoughlin, Alfred Mertins

Comments: To appear at the 20th Annual Conference of the International Speech Communication Association (INTERSPEECH 2019)

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[78] arXiv:1904.03576 (cross-list from cs.CL) [pdf, other]: Title: Spoken Language Intent Detection using Confusion2Vec

Authors: Prashanth Gurunath Shivakumar, Mu Yang, Panayiotis Georgiou

Journal-ref: Proceedings of Interspeech 2019

Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[79] arXiv:1904.03617 (cross-list from cs.SD) [pdf, other]: Title: VAE-based regularization for deep speaker embedding

Authors: Yang Zhang, Lantian Li, Dong Wang

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[80] arXiv:1904.03787 (cross-list from cs.SD) [pdf, other]: Title: Bayesian Non-Parametric Multi-Source Modelling Based Determined Blind Source Separation

Authors: Chaitanya Narisetty, Tatsuya Komatsu, Reishi Kondo

Comments: 5 pages, 2 figures. Accepted at ICASSP 2019

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[81] arXiv:1904.03814 (cross-list from cs.SD) [pdf, other]: Title: Temporal Convolution for Real-time Keyword Spotting on Mobile Devices

Authors: Seungwoo Choi, Seokjun Seo, Beomjun Shin, Hyeongmin Byun, Martin Kersner, Beomsu Kim, Dongyoung Kim, Sungjoo Ha

Comments: In INTERSPEECH 2019

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Audio and Speech Processing (eess.AS)
[82] arXiv:1904.03829 (cross-list from cs.CR) [pdf, other]: Title: Adversarial Audio: A New Information Hiding Method and Backdoor for DNN-based Speech Recognition Models

Authors: Yehao Kong, Jiliang Zhang

Comments: Submitted to RAID2019

Subjects: Cryptography and Security (cs.CR); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[83] arXiv:1904.03833 (cross-list from cs.SD) [pdf, other]: Title: Direct Modelling of Speech Emotion from Raw Speech

Authors: Siddique Latif, Rajib Rana, Sara Khalifa, Raja Jurdak, Julien Epps

Comments: INTERSPEECH 2019

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[84] arXiv:1904.03834 (cross-list from stat.ML) [pdf, other]: Title: A Statistical Investigation of Long Memory in Language and Music

Authors: Alexander Greaves-Tunnell, Zaid Harchaoui

Comments: 29 pages; expanded supplement, added details in background and methods per reviewer feedback, included additional references

Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[85] arXiv:1904.03841 (cross-list from cs.SD) [pdf, other]: Title: Duration robust weakly supervised sound event detection

Authors: Heinrich Dinkel, Kai Yu

Comments: Accepted by ICASSP2020

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[86] arXiv:1904.03876 (cross-list from cs.LG) [pdf, other]: Title: Bayesian Subspace Hidden Markov Model for Acoustic Unit Discovery

Authors: Lucas Ondel, Hari Krishna Vydana, Lukáš Burget, Jan Černocký

Comments: Accepted to Interspeech 2019 * corrected typos * Recalculated the segmentation using +-2 frames tolerance to comply with other publications

Subjects: Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[87] arXiv:1904.04100 (cross-list from cs.CL) [pdf, other]: Title: Completely Unsupervised Speech Recognition By A Generative Adversarial Network Harmonized With Iteratively Refined Hidden Markov Models

Authors: Kuan-Yu Chen, Che-Ping Tsai, Da-Rong Liu, Hung-Yi Lee, Lin-shan Lee

Comments: Accepted by Interspeech 2019

Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[88] arXiv:1904.04161 (cross-list from cs.LG) [pdf, other]: Title: Audio Source Separation via Multi-Scale Learning with Dilated Dense U-Nets

Authors: Vivek Sivaraman Narayanaswamy, Sameeksha Katoch, Jayaraman J. Thiagarajan, Huan Song, Andreas Spanias

Subjects: Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[89] arXiv:1904.04221 (cross-list from cs.LG) [pdf, other]: Title: Unsupervised Feature Learning for Environmental Sound Classification Using Weighted Cycle-Consistent Generative Adversarial Network

Authors: Mohammad Esmaeilpour, Patrick Cardinal, Alessandro Lameiras Koerich

Comments: Paper Accepted for Publication in Elsevier Applied Soft Computing

Subjects: Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[90] arXiv:1904.04294 (cross-list from cs.CL) [pdf, other]: Title: Exploring Methods for the Automatic Detection of Errors in Manual Transcription

Authors: Xiaofei Wang, Jinyi Yang, Ruizhi Li, Samik Sadhu, Hynek Hermansky

Comments: Submitted to Interspeech 2019

Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[91] arXiv:1904.04358 (cross-list from cs.LG) [pdf, other]: Title: Deep Learning the EEG Manifold for Phonological Categorization from Active Thoughts

Authors: Pramit Saha, Muhammad Abdul-Mageed, Sidney Fels

Comments: Accepted for publication in IEEE ICASSP 2019

Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[92] arXiv:1904.04631 (cross-list from cs.SD) [pdf, other]: Title: CycleGAN-VC2: Improved CycleGAN-based Non-parallel Voice Conversion

Authors: Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka, Nobukatsu Hojo

Comments: Accepted to ICASSP 2019. Project page: this http URL

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[93] arXiv:1904.04956 (cross-list from cs.SD) [pdf, other]: Title: Distributed Deep Learning Strategies For Automatic Speech Recognition

Authors: Wei Zhang, Xiaodong Cui, Ulrich Finkler, Brian Kingsbury, George Saon, David Kung, Michael Picheny

Comments: Published in ICASSP'19

Subjects: Sound (cs.SD); Computation and Language (cs.CL); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[94] arXiv:1904.05009 (cross-list from cs.SD) [pdf, other]: Title: An Interactive Musical Prediction System with Mixture Density Recurrent Neural Networks

Authors: Charles P Martin, Jim Torresen

Comments: Accepted for presentation at the International Conference on New Interfaces for Musical Expression (NIME), June 2019

Subjects: Sound (cs.SD); Human-Computer Interaction (cs.HC); Neural and Evolutionary Computing (cs.NE); Audio and Speech Processing (eess.AS)
[95] arXiv:1904.05073 (cross-list from cs.SD) [pdf, other]: Title: Neuralogram: A Deep Neural Network Based Representation for Audio Signals

Authors: Prateek Verma, Chris Chafe, Jonathan Berger

Comments: Submitted to DAFx 2019, the 22nd International Conference on Digital Audio Effects, Birmingham, United Kingdom

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[96] arXiv:1904.05078 (cross-list from cs.CL) [pdf, other]: Title: From Semi-supervised to Almost-unsupervised Speech Recognition with Very-low Resource by Jointly Learning Phonetic Structures from Audio and Text Embeddings

Authors: Yi-Chen Chen, Sung-Feng Huang, Hung-yi Lee, Lin-shan Lee

Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[97] arXiv:1904.05086 (cross-list from cs.SD) [pdf, other]: Title: A Framework for Multi-f0 Modeling in SATB Choir Recordings

Authors: Helena Cuesta, Emilia Gómez, Pritish Chandna

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[98] arXiv:1904.05204 (cross-list from cs.SD) [pdf, other]: Title: Acoustic Scene Classification by Implicitly Identifying Distinct Sound Events

Authors: Hongwei Song, Jiqing Han, Shiwen Deng, Zhihao Du

Comments: code URL typo, code is available at this https URL

Journal-ref: Proc. Interspeech 2019, 3860-3864

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[99] arXiv:1904.05243 (cross-list from cs.SD) [pdf, ps, other]: Title: A Compact and Discriminative Feature Based on Auditory Summary Statistics for Acoustic Scene Classification

Authors: Hongwei Song, Jiqing Han, Shiwen Deng

Comments: Accepted as a conference paper of Interspeech 2018

Journal-ref: in Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, vol. 2018-September, 2018, pp. 3294-3298

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS); Signal Processing (eess.SP)
[100] arXiv:1904.05249 (cross-list from cs.SD) [pdf, other]: Title: Expectation-Maximization for Speech Source Separation Using Convolutive Transfer Function

Authors: Xiaofei Li, Laurent Girin, Radu Horaud

Journal-ref: CAAI Transactions on Intelligent Technologies, 2019

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)

