Sound

Authors and titles for cs.SD in Nov 2018

[1]  arXiv:1811.00002 [pdf, other]
Title: WaveGlow: A Flow-based Generative Network for Speech Synthesis
Comments: 5 pages, 1 figure, 1 table, 13 equations
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[2]  arXiv:1811.00003 [src]
Title: Deep Net Features for Complex Emotion Recognition
Comments: Conflict of interest
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[3]  arXiv:1811.00078 [pdf, other]
Title: On Single-Channel Speech Enhancement and On Non-Linear Modulation-Domain Kalman Filtering
Comments: 13 pages
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[4]  arXiv:1811.00223 [pdf, other]
Title: Neural Music Synthesis for Flexible Timbre Control
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[5]  arXiv:1811.00301 [pdf]
Title: Weakly supervised CRNN system for sound event detection with large-scale unlabeled in-domain data
Comments: Submitted to ICASSP 2019
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[6]  arXiv:1811.00348 [pdf, ps, other]
Title: Sequence-to-sequence Models for Small-Footprint Keyword Spotting
Comments: Submitted to ICASSP 2019
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[7]  arXiv:1811.00350 [pdf, ps, other]
Title: End-to-end Models with auditory attention in Multi-channel Keyword Spotting
Comments: Submitted to ICASSP 2019
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[8]  arXiv:1811.00454 [pdf, ps, other]
Title: Referenceless Performance Evaluation of Audio Source Separation using Deep Neural Networks
Journal-ref: This paper will be presented at EUSIPCO 2019
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[9]  arXiv:1811.00936 [pdf, other]
Title: Acoustic Features Fusion using Attentive Multi-channel Deep Architecture
Comments: Accepted in CHiME'18 (Interspeech Workshop)
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[10]  arXiv:1811.01095 [pdf, ps, other]
Title: Beyond Equal-Length Snippets: How Long is Sufficient to Recognize an Audio Scene?
Comments: Accepted to 2019 AES Conference on Audio Forensics
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[11]  arXiv:1811.01143 [pdf, other]
Title: Multitask learning for frame-level instrument recognition
Comments: This is a pre-print version of an ICASSP 2019 paper
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[12]  arXiv:1811.01233 [pdf, other]
Title: Deep Ad-hoc Beamforming
Authors: Xiao-Lei Zhang
Comments: Accepted by Computer Speech and Language
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[13]  arXiv:1811.01251 [pdf, other]
Title: Multi-View Networks For Multi-Channel Audio Classification
Comments: 5 pages, 7 figures, Accepted to ICASSP 2019
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[14]  arXiv:1811.01609 [pdf, ps, other]
Title: ConvS2S-VC: Fully convolutional sequence-to-sequence voice conversion
Comments: Published in IEEE/ACM Trans. ASLP this https URL
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[15]  arXiv:1811.01850 [pdf, other]
Title: End-to-End Sound Source Separation Conditioned On Instrument Labels
Comments: 5 pages, 2 figures, 2 tables, ICASSP 2019
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[16]  arXiv:1811.02066 [pdf, ps, other]
Title: How to Improve Your Speaker Embeddings Extractor in Generic Toolkits
Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[17]  arXiv:1811.02130 [pdf, other]
Title: Bootstrapping single-channel source separation via unsupervised spatial clustering on stereo mixtures
Comments: 5 pages, 2 figures
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[18]  arXiv:1811.02155 [pdf, other]
Title: FloWaveNet : A Generative Flow for Raw Audio
Comments: 9 pages, ICML'2019
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[19]  arXiv:1811.02275 [pdf, other]
Title: NIPS4Bplus: a richly annotated birdsong audio dataset
Comments: 5 pages, 5 figures, submitted to ICASSP 2019
Subjects: Sound (cs.SD); Digital Libraries (cs.DL); Audio and Speech Processing (eess.AS)
[20]  arXiv:1811.02406 [pdf, other]
Title: User Specific Adaptation in Automatic Transcription of Vocalised Percussion
Journal-ref: Proc. of RecPad-2017, Amadora, Portugal, pp. 19-20, October, 2017
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[21]  arXiv:1811.02411 [pdf, other]
Title: An audio-only method for advertisement detection in broadcast television content
Journal-ref: Proc. of RecPad-2017, Amadora, Portugal, pp. 21-22, October, 2017
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[22]  arXiv:1811.02508 [pdf, other]
Title: SDR - half-baked or well done?
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[23]  arXiv:1811.02694 [pdf]
Title: Reconstructing Speech Stimuli From Human Auditory Cortex Activity Using a WaveNet Approach
Comments: 6 pages, 3 figures. Conference of 2018 IEEE Signal Processing in Medicine and Biology Symposium (SPMB 2018)
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Neurons and Cognition (q-bio.NC); Machine Learning (stat.ML)
[24]  arXiv:1811.03076 [pdf, other]
Title: Class-conditional embeddings for music source separation
Comments: 5 pages
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[25]  arXiv:1811.03271 [pdf, other]
Title: Learning Disentangled Representations for Timber and Pitch in Music Audio
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[26]  arXiv:1811.04133 [pdf, other]
Title: Integrating Recurrence Dynamics for Speech Emotion Recognition
Journal-ref: Proc. Interspeech 2018, pp. 927-931
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[27]  arXiv:1811.04139 [pdf, other]
Title: Audio Spectrogram Factorization for Classification of Telephony Signals below the Auditory Threshold
Comments: 7 pages, 4 figures. Marchex Technical Report on VoIP SPAM classification
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[28]  arXiv:1811.04357 [pdf, other]
Title: PerformanceNet: Score-to-Audio Music Generation with Multi-Band Convolutional Residual Network
Comments: 8 pages, 6 figures, AAAI 2019 camera-ready version
Subjects: Sound (cs.SD); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[29]  arXiv:1811.04419 [pdf, other]
Title: Multi-Temporal Resolution Convolutional Neural Networks for Acoustic Scene Classification
Comments: In Proceedings of the Detection and Classification of Acoustic Scenes and Events 2017 Workshop (DCASE2017), November 2017
Subjects: Sound (cs.SD); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[30]  arXiv:1811.04448 [pdf, ps, other]
Title: A Multi-modal Deep Neural Network approach to Bird-song identification
Comments: LifeCLEF 2017 working notes, Dublin, Ireland
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[31]  arXiv:1811.04568 [pdf, ps, other]
Title: Vectorization of hypotheses and speech for faster beam search in encoder decoder-based speech recognition
Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[32]  arXiv:1811.05550 [pdf, other]
Title: Neural Wavetable: a playable wavetable synthesizer using neural networks
Comments: 2 pages, Accepted by Conference on Neural Information Processing Systems (NIPS), Workshop on Machine Learning for Creativity and Design
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[33]  arXiv:1811.06016 [pdf, other]
Title: To bee or not to bee: Investigating machine learning approaches for beehive sound recognition
Comments: Presented at Detection and Classification of Acoustic Scenes and Events (DCASE) workshop 2018
Journal-ref: Proceedings of the Detection and Classification of Acoustic Scenes and Events 2018 Workshop (DCASE2018)
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[34]  arXiv:1811.06330 [pdf, other]
Title: Audio-based identification of beehive states
Comments: Accepted for ICASSP 2019
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[35]  arXiv:1811.06633 [pdf, other]
Title: Generating Albums with SampleRNN to Imitate Metal, Rock, and Punk Bands
Comments: 3 pages
Journal-ref: Proceedings of the 6th International Workshop on Musical Metacreation (MUME 2018)
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[36]  arXiv:1811.06639 [pdf, ps, other]
Title: Generating Black Metal and Math Rock: Beyond Bach, Beethoven, and Beatles
Comments: 3 pages
Journal-ref: NIPS Workshop on Machine Learning for Creativity and Design (2017)
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[37]  arXiv:1811.06669 [pdf, other]
Title: AclNet: efficient end-to-end audio classification CNN
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Machine Learning (stat.ML)
[38]  arXiv:1811.06713 [pdf, other]
Title: Semi-supervised multichannel speech enhancement with variational autoencoders and non-negative matrix factorization
Comments: 5 pages, 2 figures, audio examples and code available online at this https URL
Journal-ref: IEEE International Conference on Acoustics Speech and Signal Processing (ICASSP), Brighton, UK, May 2019, pp. 101-105
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[39]  arXiv:1811.06756 [pdf, other]
Title: Direction of Arrival Estimation of Wide-band Signals with Planar Microphone Arrays
Comments: 10 pages
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[40]  arXiv:1811.07030 [pdf, other]
Title: Exploring Tradeoffs in Models for Low-latency Speech Enhancement
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[41]  arXiv:1811.07072 [pdf]
Title: Polyphonic audio tagging with sequentially labelled data using CRNN with learnable gated linear units
Comments: DCASE2018 Workshop. arXiv admin note: text overlap with arXiv:1808.01935
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[42]  arXiv:1811.07082 [pdf, other]
Title: The Intrinsic Memorability of Everyday Sounds
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[43]  arXiv:1811.07426 [pdf, other]
Title: Harmonic Recomposition using Conditional Autoregressive Modeling
Comments: 3 pages, 2 figures. In Proceedings of The Joint Workshop on Machine Learning for Music, ICML 2018
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[44]  arXiv:1811.07435 [pdf, other]
Title: Limitations of Source-Filter Coupling In Phonation
Comments: 2 pages, 2 figures
Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[45]  arXiv:1811.08029 [pdf, other]
Title: Sound-Stream II: Towards Real-Time Gesture Controlled Articulatory Sound Synthesis
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[46]  arXiv:1811.08045 [pdf, other]
Title: Coupled Recurrent Models for Polyphonic Music Composition
Comments: 13 pages; long version of the paper appearing in ISMIR 2019
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[47]  arXiv:1811.08111 [pdf, other]
Title: Improving Sequence-to-Sequence Acoustic Modeling by Adding Text-Supervision
Comments: 5 pages, 4 figures, 2 tables. Submitted to IEEE ICASSP 2019
Journal-ref: IEEE International Conference on Acoustic, Speech and Signal Processing (2019) 6785-6789
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[48]  arXiv:1811.08380 [pdf, other]
Title: The Effect of Explicit Structure Encoding of Deep Neural Networks for Symbolic Music Generation
Comments: 8 pages, 13 figures
Journal-ref: 2019 International Workshop on Multilayer Music Representation and Processing (MMRP)
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[49]  arXiv:1811.08521 [pdf, other]
Title: Differentiable Consistency Constraints for Improved Deep Speech Enhancement
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[50]  arXiv:1811.09010 [pdf]
Title: Deep Learning Based Phase Reconstruction for Speaker Separation: A Trigonometric Perspective
Comments: 5 pages, in submission to ICASSP-2019
Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[51]  arXiv:1811.09355 [pdf, other]
Title: Training Multi-Task Adversarial Network for Extracting Noise-Robust Speaker Embedding
Comments: accepted by ICASSP2019
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[52]  arXiv:1811.09381 [pdf, other]
Title: Improved Frequency Modulation Features for Multichannel Distant Speech Recognition
Subjects: Sound (cs.SD); Human-Computer Interaction (cs.HC)
[53]  arXiv:1811.09607 [pdf, other]
Title: Towards Emotion Recognition: A Persistent Entropy Application
Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[54]  arXiv:1811.09620 [pdf, other]
Title: TimbreTron: A WaveNet(CycleGAN(CQT(Audio))) Pipeline for Musical Timbre Transfer
Comments: 17 pages, published as a conference paper at ICLR 2019
Journal-ref: ICLR 2019
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[55]  arXiv:1811.09956 [pdf, other]
Title: Glottal Closure Instants Detection From Pathological Acoustic Speech Signal Using Deep Learning
Comments: Machine Learning for Health (ML4H) Workshop at NeurIPS 2018 arXiv:1811.07216
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[56]  arXiv:1811.09967 [pdf, other]
Title: Learning Sound Events From Webly Labeled Data
Comments: Accepted IJCAI 2019
Subjects: Sound (cs.SD); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[57]  arXiv:1811.10708 [pdf, other]
Title: Combining High-Level Features of Raw Audio Waves and Mel-Spectrograms for Audio Tagging
Comments: Detection and Classification of Acoustic Scenes and Events 2018 (DCASE 2018), 19-20 November 2018, Surrey, UK
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[58]  arXiv:1811.11307 [pdf, other]
Title: Improved Speech Enhancement with the Wave-U-Net
Comments: 5 pages (including 1 for References), 1 figure, 2 tables
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Audio and Speech Processing (eess.AS); Signal Processing (eess.SP)
[59]  arXiv:1811.11663 [pdf, other]
Title: Multiple source direction of arrival estimation using subspace pseudointensity vectors
Comments: In Proceedings of the LOCATA Challenge Workshop - a satellite event of IWAENC 2018 (arXiv:1811.08482)
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[60]  arXiv:1811.12208 [pdf, other]
Title: UFANS: U-shaped Fully-Parallel Acoustic Neural Structure For Statistical Parametric Speech Synthesis With 20X Faster
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[61]  arXiv:1811.12214 [pdf, other]
Title: Play as You Like: Timbre-enhanced Multi-modal Music Style Transfer
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[62]  arXiv:1811.12408 [pdf, other]
Title: From Context to Concept: Exploring Semantic Relationships in Music with Word2Vec
Comments: Accepted for publication in Neural Computing and Applications, Springer. In Press
Journal-ref: Neural Computing and Applications, Springer. 2019
Subjects: Sound (cs.SD); Information Retrieval (cs.IR); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[63]  arXiv:1811.00162 (cross-list from cs.AI) [pdf, other]
Title: Modeling Melodic Feature Dependency with Modularized Variational Auto-Encoder
Comments: The first three authors contributed equally
Subjects: Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Multimedia (cs.MM); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[64]  arXiv:1811.00403 (cross-list from cs.CL) [pdf, other]
Title: Truly unsupervised acoustic word embeddings using weak top-down constraints in encoder-decoder models
Authors: Herman Kamper
Comments: 5 pages, 3 figures, 2 tables; accepted to ICASSP 2019
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[65]  arXiv:1811.00707 (cross-list from cs.CL) [pdf, other]
Title: Training Neural Speech Recognition Systems with Synthetic Speech Augmentation
Comments: Pre-print. Work in progress, 5 pages, 1 figure
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[66]  arXiv:1811.01092 (cross-list from cs.LG) [pdf, ps, other]
Title: Unifying Isolated and Overlapping Audio Event Detection with Multi-Label Multi-Task Convolutional Recurrent Neural Networks
Comments: Accepted for the 44th International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2019)
Subjects: Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[67]  arXiv:1811.01307 (cross-list from cs.CL) [pdf, ps, other]
Title: Towards Unsupervised Speech-to-Text Translation
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[68]  arXiv:1811.01376 (cross-list from cs.LG) [pdf, ps, other]
Title: Investigating context features hidden in End-to-End TTS
Comments: Accepted to ICASSP 2019
Subjects: Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[69]  arXiv:1811.01531 (cross-list from cs.LG) [pdf, other]
Title: Unsupervised Deep Clustering for Source Separation: Direct Learning from Mixtures using Spatial Information
Comments: Submitted to ICASSP 2019 (v1: November 5th 2018)
Journal-ref: ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Subjects: Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[70]  arXiv:1811.01690 (cross-list from cs.CL) [pdf, ps, other]
Title: Cycle-consistency training for end-to-end speech recognition
Comments: Submitted to ICASSP'19
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[71]  arXiv:1811.02050 (cross-list from cs.CL) [pdf, other]
Title: Leveraging Weakly Supervised Data to Improve End-to-End Speech-to-Text Translation
Comments: ICASSP 2019
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[72]  arXiv:1811.02062 (cross-list from cs.CL) [pdf, other]
Title: End-to-End Monaural Multi-speaker ASR System without Pretraining
Comments: submitted to ICASSP2019
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[73]  arXiv:1811.02095 (cross-list from cs.LG) [pdf, other]
Title: Kernel Machines Beat Deep Neural Networks on Mask-based Single-channel Speech Enhancement
Subjects: Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[74]  arXiv:1811.02122 (cross-list from cs.CL) [pdf, other]
Title: Robust and fine-grained prosody control of end-to-end speech synthesis
Comments: ICASSP 2019, best viewed in color
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[75]  arXiv:1811.02182 (cross-list from cs.CL) [pdf, ps, other]
Title: Unpaired Speech Enhancement by Acoustic and Adversarial Supervision for Speech Recognition
Comments: will be published in IEEE Signal Processing Letter
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[76]  arXiv:1811.02480 (cross-list from cs.CL) [pdf, other]
Title: Face Landmark-based Speaker-Independent Audio-Visual Speech Enhancement in Multi-Talker Environments
Comments: Proceedings of 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[77]  arXiv:1811.02784 (cross-list from cs.LG) [pdf, other]
Title: Median Binary-Connect Method and a Binary Convolutional Neural Nework for Word Recognition
Subjects: Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[78]  arXiv:1811.04903 (cross-list from cs.CL) [pdf, other]
Title: Stream attention-based multi-array end-to-end speech recognition
Comments: Submitted to ICASSP 2019
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[79]  arXiv:1811.05097 (cross-list from cs.CL) [pdf, other]
Title: Exploring RNN-Transducer for Chinese Speech Recognition
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[80]  arXiv:1811.05247 (cross-list from cs.CL) [pdf, other]
Title: An Online Attention-based Model for Speech Recognition
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[81]  arXiv:1811.05250 (cross-list from cs.CL) [pdf, ps, other]
Title: Modality Attention for End-to-End Audio-visual Speech Recognition
Comments: accepted by ICASSP2019
Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[82]  arXiv:1811.05540 (cross-list from cs.CL) [pdf]
Title: Native Language Identification using i-vector
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[83]  arXiv:1811.05688 (cross-list from cs.LG) [pdf, other]
Title: Melodic Phrase Segmentation By Deep Neural Networks
Subjects: Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[84]  arXiv:1811.06096 (cross-list from cs.CL) [pdf, other]
Title: Automatic Grammar Augmentation for Robust Voice Command Recognition
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[85]  arXiv:1811.06805 (cross-list from cs.LG) [pdf, other]
Title: Using recurrences in time and frequency within U-net architecture for speech enhancement
Subjects: Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[86]  arXiv:1811.06858 (cross-list from cs.HC) [pdf]
Title: John, the semi-conductor : a tool for comprovisation
Authors: Vincent Goudard (STMS)
Journal-ref: Sandeep Bhagwati; Jean Bresson. International Conference on Technologies for Music Notation and Representation (TENOR'18), May 2018, Montréal, Canada. 2018, Proceedings of the 4th International Conference on Technologies for Music Notation and Representation. http://tenor-conference.org/
Subjects: Human-Computer Interaction (cs.HC); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[87]  arXiv:1811.07018 (cross-list from cs.CR) [pdf, ps, other]
Title: Protecting Voice Controlled Systems Using Sound Source Identification Based on Acoustic Cues
Comments: Proceedings of the 27th International Conference on Computer Communications and Networks (ICCCN), Hangzhou, China, July-August 2018. arXiv admin note: text overlap with arXiv:1803.09156
Subjects: Cryptography and Security (cs.CR); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[88]  arXiv:1811.07021 (cross-list from cs.CL) [pdf, other]
Title: Investigating the Effects of Word Substitution Errors on Sentence Embeddings
Comments: 4 Pages, 2 figures. Copyright IEEE 2019. Accepted and to appear in the Proceedings of the 44th International Conference on Acoustics, Speech, and Signal Processing 2019 (IEEE-ICASSP-2019), May 12-17 in Brighton, U.K. Personal use of this material is permitted. However, permission to reprint/republish this material must be obtained from the IEEE
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[89]  arXiv:1811.07240 (cross-list from cs.LG) [pdf, other]
Title: Representation Mixing for TTS Synthesis
Comments: 5 pages, 3 figures
Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[90]  arXiv:1811.07684 (cross-list from cs.LG) [pdf, other]
Title: Efficient keyword spotting using dilated convolutions and gating
Comments: Accepted for publication to ICASSP 2019
Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[91]  arXiv:1811.08374 (cross-list from cs.LG) [pdf, other]
Title: A Gray Box Interpretable Visual Debugging Approach for Deep Sequence Learning Model
Subjects: Machine Learning (cs.LG); Human-Computer Interaction (cs.HC); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[92]  arXiv:1811.08592 (cross-list from cs.CV) [pdf, other]
Title: Measuring Depression Symptom Severity from Spoken Language and 3D Facial Expressions
Comments: Machine Learning for Health (ML4H) Workshop at NeurIPS 2018 arXiv:1811.07216
Subjects: Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[93]  arXiv:1811.09364 (cross-list from cs.CL) [pdf, other]
Title: Learning pronunciation from a foreign language in speech synthesis networks
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[94]  arXiv:1811.10376 (cross-list from cs.LG) [pdf, other]
Title: Robustness against the channel effect in pathological voice detection
Comments: Machine Learning for Health (ML4H) Workshop at NeurIPS 2018 arXiv:1811.07216
Subjects: Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[95]  arXiv:1811.10561 (cross-list from cs.CL) [pdf, other]
Title: CLEAR: A Dataset for Compositional Language and Elementary Acoustic Reasoning
Comments: NeurIPS 2018 Visually Grounded Interaction and Language (ViGIL) Workshop
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[96]  arXiv:1811.10736 (cross-list from cs.LG) [pdf, other]
Title: DONUT: CTC-based Query-by-Example Keyword Spotting
Comments: Accepted to NeurIPS 2018 Workshop on Interpretability and Robustness for Audio, Speech, and Language
Subjects: Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[97]  arXiv:1811.10988 (cross-list from cs.IR) [pdf, other]
Title: Facilitating the Manual Annotation of Sounds When Using Large Taxonomies
Comments: 5 pages, 5 figures, IEEE FRUCT International Workshop on Semantic Audio and the Internet of Things
Journal-ref: Proceedings of the 23rd Conference of Open Innovations Association FRUCT, Bologna, Italy. 2018. ISSN 2305-7254, ISBN 978-952-68653-6-2, FRUCT Oy, e-ISSN 2343-0737 (license CC BY-ND)
Subjects: Information Retrieval (cs.IR); Human-Computer Interaction (cs.HC); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[98]  arXiv:1811.12254 (cross-list from cs.LG) [pdf, other]
Title: The Effect of Heterogeneous Data for Alzheimer's Disease Detection from Speech
Comments: Machine Learning for Health (ML4H) Workshop at NeurIPS 2018 arXiv:1811.07216
Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[99]  arXiv:1811.12802 (cross-list from cs.IR) [pdf, other]
Title: Naive Dictionary On Musical Corpora: From Knowledge Representation To Pattern Recognition
Comments: 25 pages
Subjects: Information Retrieval (cs.IR); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[100]  arXiv:1811.00006 (cross-list from eess.AS) [pdf, other]
Title: Low-Dimensional Bottleneck Features for On-Device Continuous Speech Recognition
Comments: Submitted to ICASSP 2019
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[101]  arXiv:1811.00183 (cross-list from stat.ML) [pdf, other]
Title: Designing an Effective Metric Learning Pipeline for Speaker Diarization
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[102]  arXiv:1811.00334 (cross-list from eess.AS) [pdf, other]
Title: Deep Learning for Tube Amplifier Emulation
Comments: Accepted to ICASSP 2019
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[103]  arXiv:1811.00883 (cross-list from eess.AS) [pdf, other]
Title: Deep Segment Attentive Embedding for Duration Robust Speaker Verification
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[104]  arXiv:1811.01133 (cross-list from eess.AS) [pdf]
Title: A Robust Target Linearly Constrained Minimum Variance Beamformer With Spatial Cues Preservation for Binaural Hearing Aids
Comments: 15 pages, 16 figures
Journal-ref: IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP). 2019 Oct 1; 27(10):1549-63
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[105]  arXiv:1811.01222 (cross-list from eess.AS) [pdf, ps, other]
Title: Time-Frequency Audio Features for Speech-Music Classification
Comments: 4 pages, 16 figures
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[106]  arXiv:1811.01644 (cross-list from eess.AS) [pdf, other]
Title: Manner of Articulation Detection using Connectionist Temporal Classification to Improve Automatic Speech Recognition Performance
Comments: 5 pages, 4 figures, ICASSP-2019
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[107]  arXiv:1811.02063 (cross-list from eess.AS) [pdf, other]
Title: When CTC Training Meets Acoustic Landmarks
Comments: To Appear in ICASSP 2019; The first two authors contributed equally
Subjects: Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Sound (cs.SD)
[108]  arXiv:1811.02162 (cross-list from eess.AS) [pdf, other]
Title: Language model integration based on memory control for sequence to sequence speech recognition
Comments: 4 pages, 1 figure, 5 tables, submitted to ICASSP 2019
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[109]  arXiv:1811.02331 (cross-list from eess.AS) [pdf, other]
Title: Speaker verification using end-to-end adversarial language adaptation
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[110]  arXiv:1811.02438 (cross-list from eess.AS) [pdf, other]
Title: Trainable Adaptive Window Switching for Speech Enhancement
Comments: accepted to the 44th International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2019)
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Signal Processing (eess.SP); Machine Learning (stat.ML)
[111]  arXiv:1811.02489 (cross-list from eess.SP) [pdf, other]
Title: Unifying Probabilistic Models for Time-Frequency Analysis
Comments: Accepted to International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2019
Subjects: Signal Processing (eess.SP); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[112]  arXiv:1811.02566 (cross-list from eess.AS) [pdf, other]
Title: Bidirectional Quaternion Long-Short Term Memory Recurrent Neural Networks for Speech Recognition
Comments: Submitted at ICASSP 2019. arXiv admin note: text overlap with arXiv:1806.04418
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Signal Processing (eess.SP); Machine Learning (stat.ML)
[113]  arXiv:1811.02735 (cross-list from eess.AS) [pdf, other]
Title: CNN-based MultiChannel End-to-End Speech Recognition for everyday home environments
Comments: 5 pages, 1 figure, EUSIPCO 2019
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[114]  arXiv:1811.02736 (cross-list from eess.AS) [pdf, ps, other]
Title: Learning acoustic word embeddings with phonetically associated triplet network
Comments: 5 pages, 4 figures, submitted to ICASSP 2019
Subjects: Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Sound (cs.SD); Signal Processing (eess.SP)
[115]  arXiv:1811.02770 (cross-list from eess.AS) [pdf, other]
Title: Promising Accurate Prefix Boosting for sequence-to-sequence ASR
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[116]  arXiv:1811.02938 (cross-list from eess.AS) [pdf, other]
Title: On the use of DNN Autoencoder for Robust Speaker Recognition
Comments: 5 pages, 1 figure
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[117]  arXiv:1811.03021 (cross-list from eess.AS) [pdf, other]
Title: High-quality speech coding with SampleRNN
Comments: Submitted to ICASSP 2019
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[118]  arXiv:1811.03055 (cross-list from eess.AS) [pdf, other]
Title: Adapting End-to-End Neural Speaker Verification to New Languages and Recording Conditions with Adversarial Training
Comments: Submitted to ICASSP 2019
Subjects: Audio and Speech Processing (eess.AS); Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD)
[119]  arXiv:1811.03063 (cross-list from eess.AS) [pdf, other]
Title: Generative Adversarial Speaker Embedding Networks for Domain Robust End-to-End Speaker Verification
Comments: Submitted to ICASSP 2019
Subjects: Audio and Speech Processing (eess.AS); Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD)
[120]  arXiv:1811.03255 (cross-list from eess.AS) [pdf, other]
Title: Phonetic-attention scoring for deep speaker features in speaker verification
Comments: Submitted to ICASSP 2019
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[121]  arXiv:1811.03258 (cross-list from eess.AS) [pdf, other]
Title: Gaussian-Constrained training for speaker verification
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[122]  arXiv:1811.03293 (cross-list from eess.AS) [pdf, other]
Title: Who Do I Sound Like? Showcasing Speaker Recognition Technology by YouTube Voice Search
Comments: Accepted for presentation in ICASSP 2019
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[123]  arXiv:1811.03311 (cross-list from eess.AS) [pdf, other]
Title: Speaker-adaptive neural vocoders for parametric speech synthesis systems
Comments: Accepted to the IEEE Workshop of MMSP 2020
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[124]  arXiv:1811.03486 (cross-list from eess.AS) [pdf, other]
Title: Speech Enhancement Based on Reducing the Detail Portion of Speech Spectrograms in Modulation Domain via Discrete Wavelet Transform
Comments: 4 pages, 4 figures, to appear in ISCSLP 2018
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[125]  arXiv:1811.04048 (cross-list from eess.AS) [pdf, ps, other]
Title: Joint Acoustic and Class Inference for Weakly Supervised Sound Event Detection
Comments: Submitted to ICASSP 2019
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[126]  arXiv:1811.04076 (cross-list from eess.AS) [pdf, other]
Title: AttS2S-VC: Sequence-to-Sequence Voice Conversion with Attention and Context Preservation Mechanisms
Comments: Submitted to ICASSP2019
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[127]  arXiv:1811.04224 (cross-list from eess.AS) [pdf, ps, other]
Title: Reinforcement Learning Based Speech Enhancement for Robust Speech Recognition
Comments: Conference paper with 4 pages, reinforcement learning, automatic speech recognition, speech enhancement, deep neural network, character error rate
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[128]  arXiv:1811.04769 (cross-list from eess.AS) [pdf, other]
Title: ExcitNet vocoder: A neural excitation model for parametric speech synthesis systems
Comments: Accepted to the conference of EUSIPCO 2019. arXiv admin note: text overlap with arXiv:1811.03311
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[129]  arXiv:1811.05760 (cross-list from eess.AS) [pdf, other]
Title: A Multimodal Approach towards Emotion Recognition of Music using Audio and Lyrical Content
Comments: 6 pages
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM); Sound (cs.SD)
[130]  arXiv:1811.05784 (cross-list from eess.AS) [pdf, other]
Title: Open-source platforms for fast room acoustic simulations in complex structures
Subjects: Audio and Speech Processing (eess.AS); Computational Engineering, Finance, and Science (cs.CE); Sound (cs.SD)
[131]  arXiv:1811.06234 (cross-list from eess.AS) [pdf, ps, other]
Title: On Training Targets and Objective Functions for Deep-Learning-Based Audio-Visual Speech Enhancement
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Image and Video Processing (eess.IV)
[132]  arXiv:1811.06250 (cross-list from eess.AS) [pdf, other]
Title: Effects of Lombard Reflex on the Performance of Deep-Learning-Based Audio-Visual Speech Enhancement Systems
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Image and Video Processing (eess.IV)
[133]  arXiv:1811.06292 (cross-list from eess.AS) [pdf, other]
Title: Towards achieving robust universal neural vocoding
Comments: 4 pages, 1 extra for references. Accepted on Interspeech 2019
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[134]  arXiv:1811.06296 (cross-list from eess.AS) [pdf, other]
Title: Comprehensive evaluation of statistical speech waveform synthesis
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[135]  arXiv:1811.06439 (cross-list from eess.AS) [pdf, other]
Title: HCU400: An Annotated Dataset for Exploring Aural Phenomenology Through Causal Uncertainty
Journal-ref: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2019
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[136]  arXiv:1811.07065 (cross-list from eess.AS) [pdf, other]
Title: Multipath-enabled private audio with noise
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD); Signal Processing (eess.SP)
[137]  arXiv:1811.07629 (cross-list from eess.AS) [pdf, other]
Title: Analysis of DNN Speech Signal Enhancement for Robust Speaker Recognition
Comments: 16 pages, 7 figures, Submission to Computer Speech and Language, special issue on Speaker and language characterization and recognition
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[138]  arXiv:1811.08065 (cross-list from eess.AS) [pdf, other]
Title: Learning Robust Heterogeneous Signal Features from Parallel Neural Network for Audio Sentiment Analysis
Comments: 21 pages, PR JOURNAL
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[139]  arXiv:1811.08284 (cross-list from eess.AS) [pdf, other]
Title: Feature exploration for almost zero-resource ASR-free keyword spotting using a multilingual bottleneck extractor and correspondence autoencoders
Comments: 5 pages, 2 figures, 2 tables, 38 references, Accepted at Interspeech 2019
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[140]  arXiv:1811.08482 (cross-list from eess.AS) [html]
Title: Proceedings of the LOCATA Challenge Workshop -- a satellite event of IWAENC 2018
Comments: Workshop Proceedings
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[141]  arXiv:1811.08552 (cross-list from eess.AS) [pdf, ps, other]
Title: Multi-scale aggregation of phase information for reducing computational cost of CNN based DOA estimation
Comments: arXiv admin note: text overlap with arXiv:1807.11722
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[142]  arXiv:1811.08783 (cross-list from eess.SP) [pdf, other]
Title: Designing nearly tight window for improving time-frequency masking
Subjects: Signal Processing (eess.SP); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[143]  arXiv:1811.08935 (cross-list from eess.AS) [pdf, other]
Title: A Study of Language and Classifier-independent Feature Analysis for Vocal Emotion Recognition
Comments: 24 pages, 4 figures
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[144]  arXiv:1811.09021 (cross-list from eess.AS) [pdf, ps, other]
Title: Bytes are All You Need: End-to-End Multilingual Speech Recognition and Synthesis with Bytes
Comments: submitted to ICASSP 2019
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[145]  arXiv:1811.09678 (cross-list from eess.AS) [pdf, other]
Title: Speech recognition with quaternion neural networks
Comments: NIPS 2018 (IRASL). arXiv admin note: text overlap with arXiv:1806.04418
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD); Machine Learning (stat.ML)
[146]  arXiv:1811.09919 (cross-list from eess.AS) [pdf, other]
Title: A Method for Analysis of Patient Speech in Dialogue for Dementia Detection
Comments: 8 pages, Resources and ProcessIng of linguistic, paralinguistic and extra-linguistic Data from people with various forms of cognitive impairment, LREC 2018
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[147]  arXiv:1811.11078 (cross-list from eess.AS) [pdf, other]
Title: Refined WaveNet Vocoder for Variational Autoencoder Based Voice Conversion
Comments: 5 pages, 7 figures, 1 table. Accepted to EUSIPCO 2019
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[148]  arXiv:1811.11517 (cross-list from eess.AS) [pdf, other]
Title: Acoustics-guided evaluation (AGE): a new measure for estimating performance of speech enhancement algorithms for robust ASR
Comments: Submitted to ICASSP 2019
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[149]  arXiv:1811.11785 (cross-list from eess.AS) [pdf, ps, other]
Title: SVD-PHAT: A Fast Sound Source Localization Method
Journal-ref: Proceedings of the 2019 IEEE International Conference on Acoustics, Speech, and Signal Processing
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD); Signal Processing (eess.SP)
[150]  arXiv:1811.11787 (cross-list from eess.AS) [pdf, ps, other]
Title: A Study of the Complexity and Accuracy of Direction of Arrival Estimation Methods Based on GCC-PHAT for a Pair of Close Microphones
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD); Signal Processing (eess.SP)
[151]  arXiv:1811.11913 (cross-list from eess.AS) [pdf, other]
Title: LP-WaveNet: Linear Prediction-based WaveNet Speech Synthesis
Comments: Submitted to EUSIPCO 2020
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[152]  arXiv:1811.12290 (cross-list from eess.AS) [pdf, other]
Title: Tuplemax Loss for Language Identification
Comments: Submitted to ICASSP 2019
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)