Audio and Speech Processing

Authors and titles for eess.AS in Apr 2019, skipping first 75

[ total of 167 entries: 1-25 | 26-50 | 51-75 | 76-100 | 101-125 | 126-150 | 151-167 ]
[76]  arXiv:1904.03522 (cross-list from cs.SD) [pdf, other]
Title: Taco-VC: A Single Speaker Tacotron based Voice Conversion with Limited Data
Comments: Accepted to EUSIPCO 2020
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[77]  arXiv:1904.03543 (cross-list from cs.SD) [pdf, ps, other]
Title: Spatio-Temporal Attention Pooling for Audio Scene Classification
Comments: To appear at the 20th Annual Conference of the International Speech Communication Association (INTERSPEECH 2019)
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[78]  arXiv:1904.03576 (cross-list from cs.CL) [pdf, other]
Title: Spoken Language Intent Detection using Confusion2Vec
Journal-ref: Proceedings of Interspeech 2019
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[79]  arXiv:1904.03617 (cross-list from cs.SD) [pdf, other]
Title: VAE-based regularization for deep speaker embedding
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[80]  arXiv:1904.03787 (cross-list from cs.SD) [pdf, other]
Title: Bayesian Non-Parametric Multi-Source Modelling Based Determined Blind Source Separation
Comments: 5 pages, 2 figures. Accepted at ICASSP 2019
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[81]  arXiv:1904.03814 (cross-list from cs.SD) [pdf, other]
Title: Temporal Convolution for Real-time Keyword Spotting on Mobile Devices
Comments: In INTERSPEECH 2019
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Audio and Speech Processing (eess.AS)
[82]  arXiv:1904.03829 (cross-list from cs.CR) [pdf, other]
Title: Adversarial Audio: A New Information Hiding Method and Backdoor for DNN-based Speech Recognition Models
Comments: Submitted to RAID2019
Subjects: Cryptography and Security (cs.CR); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[83]  arXiv:1904.03833 (cross-list from cs.SD) [pdf, other]
Title: Direct Modelling of Speech Emotion from Raw Speech
Comments: INTERSPEECH 2019
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[84]  arXiv:1904.03834 (cross-list from stat.ML) [pdf, other]
Title: A Statistical Investigation of Long Memory in Language and Music
Comments: 29 pages; expanded supplement, added details in background and methods per reviewer feedback, included additional references
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[85]  arXiv:1904.03841 (cross-list from cs.SD) [pdf, other]
Title: Duration robust weakly supervised sound event detection
Comments: Accepted by ICASSP2020
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[86]  arXiv:1904.03876 (cross-list from cs.LG) [pdf, other]
Title: Bayesian Subspace Hidden Markov Model for Acoustic Unit Discovery
Comments: Accepted to Interspeech 2019 * corrected typos * Recalculated the segmentation using +-2 frames tolerance to comply with other publications
Subjects: Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[87]  arXiv:1904.04100 (cross-list from cs.CL) [pdf, other]
Title: Completely Unsupervised Speech Recognition By A Generative Adversarial Network Harmonized With Iteratively Refined Hidden Markov Models
Comments: Accepted by Interspeech 2019
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[88]  arXiv:1904.04161 (cross-list from cs.LG) [pdf, other]
Title: Audio Source Separation via Multi-Scale Learning with Dilated Dense U-Nets
Subjects: Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[89]  arXiv:1904.04221 (cross-list from cs.LG) [pdf, other]
Title: Unsupervised Feature Learning for Environmental Sound Classification Using Weighted Cycle-Consistent Generative Adversarial Network
Comments: Paper Accepted for Publication in Elsevier Applied Soft Computing
Subjects: Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[90]  arXiv:1904.04294 (cross-list from cs.CL) [pdf, other]
Title: Exploring Methods for the Automatic Detection of Errors in Manual Transcription
Comments: Submitted to Interspeech 2019
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[91]  arXiv:1904.04358 (cross-list from cs.LG) [pdf, other]
Title: Deep Learning the EEG Manifold for Phonological Categorization from Active Thoughts
Comments: Accepted for publication in IEEE ICASSP 2019
Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[92]  arXiv:1904.04631 (cross-list from cs.SD) [pdf, other]
Title: CycleGAN-VC2: Improved CycleGAN-based Non-parallel Voice Conversion
Comments: Accepted to ICASSP 2019. Project page: this http URL
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[93]  arXiv:1904.04956 (cross-list from cs.SD) [pdf, other]
Title: Distributed Deep Learning Strategies For Automatic Speech Recognition
Comments: Published in ICASSP'19
Subjects: Sound (cs.SD); Computation and Language (cs.CL); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[94]  arXiv:1904.05009 (cross-list from cs.SD) [pdf, other]
Title: An Interactive Musical Prediction System with Mixture Density Recurrent Neural Networks
Comments: Accepted for presentation at the International Conference on New Interfaces for Musical Expression (NIME), June 2019
Subjects: Sound (cs.SD); Human-Computer Interaction (cs.HC); Neural and Evolutionary Computing (cs.NE); Audio and Speech Processing (eess.AS)
[95]  arXiv:1904.05073 (cross-list from cs.SD) [pdf, other]
Title: Neuralogram: A Deep Neural Network Based Representation for Audio Signals
Comments: Submitted to DAFx 2019, the 22nd International Conference on Digital Audio Effects, Birmingham, United Kingdom
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[96]  arXiv:1904.05078 (cross-list from cs.CL) [pdf, other]
Title: From Semi-supervised to Almost-unsupervised Speech Recognition with Very-low Resource by Jointly Learning Phonetic Structures from Audio and Text Embeddings
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[97]  arXiv:1904.05086 (cross-list from cs.SD) [pdf, other]
Title: A Framework for Multi-f0 Modeling in SATB Choir Recordings
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[98]  arXiv:1904.05204 (cross-list from cs.SD) [pdf, other]
Title: Acoustic Scene Classification by Implicitly Identifying Distinct Sound Events
Comments: code URL typo, code is available at this https URL
Journal-ref: Proc. Interspeech 2019, 3860-3864
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[99]  arXiv:1904.05243 (cross-list from cs.SD) [pdf, ps, other]
Title: A Compact and Discriminative Feature Based on Auditory Summary Statistics for Acoustic Scene Classification
Comments: Accepted as a conference paper of Interspeech 2018
Journal-ref: in Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, vol. 2018-September, 2018, pp. 3294-3298
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS); Signal Processing (eess.SP)
[100]  arXiv:1904.05249 (cross-list from cs.SD) [pdf, other]
Title: Expectation-Maximization for Speech Source Separation Using Convolutive Transfer Function
Journal-ref: CAAI Transactions on Intelligent Technologies, 2019
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)