Audio and Speech Processing

Authors and titles for eess.AS in Nov 2018, skipping first 125

[ total of 166 entries: 1-50 | 26-75 | 76-125 | 126-166 ]
[126]  arXiv:1811.06805 (cross-list from cs.LG) [pdf, other]
Title: Using recurrences in time and frequency within U-net architecture for speech enhancement
Subjects: Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[127]  arXiv:1811.06858 (cross-list from cs.HC) [pdf]
Title: John, the semi-conductor : a tool for comprovisation
Authors: Vincent Goudard (STMS)
Journal-ref: Sandeep Bhagwati; Jean Bresson. International Conference on Technologies for Music Notation and Representation (TENOR'18), May 2018, Montréal, Canada. 2018, Proceedings of the 4th International Conference on Technologies for Music Notation and Representation. http://tenor-conference.org/
Subjects: Human-Computer Interaction (cs.HC); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[128]  arXiv:1811.07018 (cross-list from cs.CR) [pdf, ps, other]
Title: Protecting Voice Controlled Systems Using Sound Source Identification Based on Acoustic Cues
Comments: Proceedings of the 27th International Conference on Computer Communications and Networks (ICCCN), Hangzhou, China, July-August 2018. arXiv admin note: text overlap with arXiv:1803.09156
Subjects: Cryptography and Security (cs.CR); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[129]  arXiv:1811.07021 (cross-list from cs.CL) [pdf, other]
Title: Investigating the Effects of Word Substitution Errors on Sentence Embeddings
Comments: 4 Pages, 2 figures. Copyright IEEE 2019. Accepted and to appear in the Proceedings of the 44th International Conference on Acoustics, Speech, and Signal Processing 2019 (IEEE-ICASSP-2019), May 12-17 in Brighton, U.K. Personal use of this material is permitted. However, permission to reprint/republish this material must be obtained from the IEEE
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[130]  arXiv:1811.07030 (cross-list from cs.SD) [pdf, other]
Title: Exploring Tradeoffs in Models for Low-latency Speech Enhancement
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[131]  arXiv:1811.07072 (cross-list from cs.SD) [pdf]
Title: Polyphonic audio tagging with sequentially labelled data using CRNN with learnable gated linear units
Comments: DCASE2018 Workshop. arXiv admin note: text overlap with arXiv:1808.01935
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[132]  arXiv:1811.07082 (cross-list from cs.SD) [pdf, other]
Title: The Intrinsic Memorability of Everyday Sounds
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[133]  arXiv:1811.07240 (cross-list from cs.LG) [pdf, other]
Title: Representation Mixing for TTS Synthesis
Comments: 5 pages, 3 figures
Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[134]  arXiv:1811.07426 (cross-list from cs.SD) [pdf, other]
Title: Harmonic Recomposition using Conditional Autoregressive Modeling
Comments: 3 pages, 2 figures. In Proceedings of The Joint Workshop on Machine Learning for Music, ICML 2018
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[135]  arXiv:1811.07435 (cross-list from cs.SD) [pdf, other]
Title: Limitations of Source-Filter Coupling In Phonation
Comments: 2 pages, 2 figures
Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[136]  arXiv:1811.07684 (cross-list from cs.LG) [pdf, other]
Title: Efficient keyword spotting using dilated convolutions and gating
Comments: Accepted for publication to ICASSP 2019
Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[137]  arXiv:1811.08029 (cross-list from cs.SD) [pdf, other]
Title: Sound-Stream II: Towards Real-Time Gesture Controlled Articulatory Sound Synthesis
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[138]  arXiv:1811.08045 (cross-list from cs.SD) [pdf, other]
Title: Coupled Recurrent Models for Polyphonic Music Composition
Comments: 13 pages; long version of the paper appearing in ISMIR 2019
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[139]  arXiv:1811.08111 (cross-list from cs.SD) [pdf, other]
Title: Improving Sequence-to-Sequence Acoustic Modeling by Adding Text-Supervision
Comments: 5 pages, 4 figures, 2 tables. Submitted to IEEE ICASSP 2019
Journal-ref: IEEE International Conference on Acoustic, Speech and Signal Processing (2019) 6785-6789
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[140]  arXiv:1811.08374 (cross-list from cs.LG) [pdf, other]
Title: A Gray Box Interpretable Visual Debugging Approach for Deep Sequence Learning Model
Subjects: Machine Learning (cs.LG); Human-Computer Interaction (cs.HC); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[141]  arXiv:1811.08380 (cross-list from cs.SD) [pdf, other]
Title: The Effect of Explicit Structure Encoding of Deep Neural Networks for Symbolic Music Generation
Comments: 8 pages, 13 figures
Journal-ref: 2019 International Workshop on Multilayer Music Representation and Processing (MMRP)
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[142]  arXiv:1811.08521 (cross-list from cs.SD) [pdf, other]
Title: Differentiable Consistency Constraints for Improved Deep Speech Enhancement
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[143]  arXiv:1811.08592 (cross-list from cs.CV) [pdf, other]
Title: Measuring Depression Symptom Severity from Spoken Language and 3D Facial Expressions
Comments: Machine Learning for Health (ML4H) Workshop at NeurIPS 2018 arXiv:1811.07216
Subjects: Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[144]  arXiv:1811.09010 (cross-list from cs.SD) [pdf]
Title: Deep Learning Based Phase Reconstruction for Speaker Separation: A Trigonometric Perspective
Comments: 5 pages, in submission to ICASSP-2019
Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[145]  arXiv:1811.09355 (cross-list from cs.SD) [pdf, other]
Title: Training Multi-Task Adversarial Network for Extracting Noise-Robust Speaker Embedding
Comments: accepted by ICASSP2019
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[146]  arXiv:1811.09364 (cross-list from cs.CL) [pdf, other]
Title: Learning pronunciation from a foreign language in speech synthesis networks
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[147]  arXiv:1811.09607 (cross-list from cs.SD) [pdf, other]
Title: Towards Emotion Recognition: A Persistent Entropy Application
Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[148]  arXiv:1811.09620 (cross-list from cs.SD) [pdf, other]
Title: TimbreTron: A WaveNet(CycleGAN(CQT(Audio))) Pipeline for Musical Timbre Transfer
Comments: 17 pages, published as a conference paper at ICLR 2019
Journal-ref: ICLR 2019
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[149]  arXiv:1811.09956 (cross-list from cs.SD) [pdf, other]
Title: Glottal Closure Instants Detection From Pathological Acoustic Speech Signal Using Deep Learning
Comments: Machine Learning for Health (ML4H) Workshop at NeurIPS 2018 arXiv:1811.07216
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[150]  arXiv:1811.09967 (cross-list from cs.SD) [pdf, other]
Title: Learning Sound Events From Webly Labeled Data
Comments: Accepted IJCAI 2019
Subjects: Sound (cs.SD); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[151]  arXiv:1811.10169 (cross-list from cs.CL) [pdf, ps, other]
Title: Improving Gated Recurrent Unit Based Acoustic Modeling with Batch Normalization and Enlarged Context
Comments: ISCSLP 2018
Subjects: Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[152]  arXiv:1811.10376 (cross-list from cs.LG) [pdf, other]
Title: Robustness against the channel effect in pathological voice detection
Comments: Machine Learning for Health (ML4H) Workshop at NeurIPS 2018 arXiv:1811.07216
Subjects: Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[153]  arXiv:1811.10561 (cross-list from cs.CL) [pdf, other]
Title: CLEAR: A Dataset for Compositional Language and Elementary Acoustic Reasoning
Comments: NeurIPS 2018 Visually Grounded Interaction and Language (ViGIL) Workshop
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[154]  arXiv:1811.10708 (cross-list from cs.SD) [pdf, other]
Title: Combining High-Level Features of Raw Audio Waves and Mel-Spectrograms for Audio Tagging
Comments: Detection and Classification of Acoustic Scenes and Events 2018 (DCASE 2018), 19-20 November 2018, Surrey, UK
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[155]  arXiv:1811.10736 (cross-list from cs.LG) [pdf, other]
Title: DONUT: CTC-based Query-by-Example Keyword Spotting
Comments: Accepted to NeurIPS 2018 Workshop on Interpretability and Robustness for Audio, Speech, and Language
Subjects: Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[156]  arXiv:1811.10813 (cross-list from cs.CV) [pdf, other]
Title: Noise-tolerant Audio-visual Online Person Verification using an Attention-based Neural Network Fusion
Subjects: Computer Vision and Pattern Recognition (cs.CV); Audio and Speech Processing (eess.AS)
[157]  arXiv:1811.10988 (cross-list from cs.IR) [pdf, other]
Title: Facilitating the Manual Annotation of Sounds When Using Large Taxonomies
Comments: 5 pages, 5 figures, IEEE FRUCT International Workshop on Semantic Audio and the Internet of Things
Journal-ref: Proceedings of the 23rd Conference of Open Innovations Association FRUCT, Bologna, Italy. 2018. ISSN 2305-7254, ISBN 978-952-68653-6-2, FRUCT Oy, e-ISSN 2343-0737 (license CC BY-ND)
Subjects: Information Retrieval (cs.IR); Human-Computer Interaction (cs.HC); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[158]  arXiv:1811.11307 (cross-list from cs.SD) [pdf, other]
Title: Improved Speech Enhancement with the Wave-U-Net
Comments: 5 pages (including 1 for References), 1 figure, 2 tables
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Audio and Speech Processing (eess.AS); Signal Processing (eess.SP)
[159]  arXiv:1811.11663 (cross-list from cs.SD) [pdf, other]
Title: Multiple source direction of arrival estimation using subspace pseudointensity vectors
Comments: In Proceedings of the LOCATA Challenge Workshop - a satellite event of IWAENC 2018 (arXiv:1811.08482)
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[160]  arXiv:1811.12208 (cross-list from cs.SD) [pdf, other]
Title: UFANS: U-shaped Fully-Parallel Acoustic Neural Structure For Statistical Parametric Speech Synthesis With 20X Faster
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[161]  arXiv:1811.12214 (cross-list from cs.SD) [pdf, other]
Title: Play as You Like: Timbre-enhanced Multi-modal Music Style Transfer
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[162]  arXiv:1811.12254 (cross-list from cs.LG) [pdf, other]
Title: The Effect of Heterogeneous Data for Alzheimer's Disease Detection from Speech
Comments: Machine Learning for Health (ML4H) Workshop at NeurIPS 2018 arXiv:1811.07216
Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[163]  arXiv:1811.12408 (cross-list from cs.SD) [pdf, other]
Title: From Context to Concept: Exploring Semantic Relationships in Music with Word2Vec
Comments: Accepted for publication in Neural Computing and Applications, Springer. In Press
Journal-ref: Neural Computing and Applications, Springer. 2019
Subjects: Sound (cs.SD); Information Retrieval (cs.IR); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[164]  arXiv:1811.12802 (cross-list from cs.IR) [pdf, other]
Title: Naive Dictionary On Musical Corpora: From Knowledge Representation To Pattern Recognition
Comments: 25 pages
Subjects: Information Retrieval (cs.IR); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[165]  arXiv:1811.12739 (cross-list from cs.LG) [pdf, other]
Title: Neural separation of observed and unobserved distributions
Comments: ICML'19
Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[166]  arXiv:1811.03700 (cross-list from cs.LG) [pdf, ps, other]
Title: A Comparison of Lattice-free Discriminative Training Criteria for Purely Sequence-Trained Neural Network Acoustic Models
Authors: Chao Weng, Dong Yu
Comments: under review ICASSP2019
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)