Sound

Authors and titles for recent submissions

[ total of 30 entries: 1-30 ]

Mon, 6 Jul 2020

[1]  arXiv:2007.01836 (cross-list from eess.AS) [pdf, ps, other]
Title: Pretrained Semantic Speech Embeddings for End-to-End Spoken Language Understanding via Cross-Modal Teacher-Student Learning
Comments: Submitted to Interspeech 2020
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[2]  arXiv:2007.01696 (cross-list from cs.LG) [pdf, other]
Title: Channel Compression: Rethinking Information Redundancy among Channels in CNN Architecture
Comments: 9 pages
Subjects: Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[3]  arXiv:2007.01579 (cross-list from eess.AS) [pdf, other]
Title: Noise-Robust Adaptation Control for Supervised System Identification Exploiting A Noise Dictionary
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[4]  arXiv:2007.01543 (cross-list from eess.AS) [pdf, other]
Title: Online Supervised Acoustic System Identification exploiting Prelearned Local Affine Subspace Models
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)

Fri, 3 Jul 2020

[5]  arXiv:2007.01216 [pdf, other]
Title: Spot the conversation: speaker diarisation in the wild
Comments: The dataset will be available for download from this http URL . The development set will be released in July 2020, and the test set will be released in October 2020
Subjects: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV); Audio and Speech Processing (eess.AS); Image and Video Processing (eess.IV)
[6]  arXiv:2007.00763 [pdf, other]
Title: OrchideaSOL: a dataset of extended instrumental techniques for computer-aided orchestration
Comments: 6 pages, 6 figures, in English. To appear in the proceedings of the International Computer Music Conference (ICMC 2020). Please visit: this https URL
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[7]  arXiv:2007.00991 (cross-list from eess.AS) [pdf, other]
Title: Data Augmenting Contrastive Learning of Speech Representations in the Time Domain
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[8]  arXiv:2007.00947 (cross-list from eess.AS) [pdf]
Title: Polyphonic sound event detection based on convolutional recurrent neural networks with semi-supervised loss function for DCASE challenge 2020 task 4
Comments: 4 pages, DCASE challenge 2020 task 4 technical report
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[9]  arXiv:2007.00908 (cross-list from eess.AS) [pdf]
Title: Semi-Supervised NMF-CNN For Sound Event Detection
Comments: 5 pages, 2 tables
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[10]  arXiv:2007.00809 (cross-list from eess.AS) [pdf, other]
Title: Automated Empathy Detection for Oncology Encounters
Comments: Accepted by the 8th IEEE International Conference on Healthcare Informatics (ICHI2020)
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[11]  arXiv:2007.00659 (cross-list from eess.AS) [pdf, other]
Title: LSTM and GPT-2 Synthetic Speech Transfer Learning for Speaker Recognition to Overcome Data Scarcity
Comments: 10 pages, 5 figures, 5 tables. Submitted to journal
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)

Thu, 2 Jul 2020

[12]  arXiv:2007.00416 [pdf, other]
Title: Joint-Diagonalizability-Constrained Multichannel Nonnegative Matrix Factorization Based on Multivariate Complex Sub-Gaussian Distribution
Comments: 5 pages, 3 figures, To appear in the Proceedings of the 28th European Signal Processing Conference (EUSIPCO 2020). arXiv admin note: text overlap with arXiv:2002.00579
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[13]  arXiv:2007.00274 [pdf, ps, other]
Title: Consistent Independent Low-Rank Matrix Analysis for Determined Blind Source Separation
Comments: Submitted to EURASIP J. Adv. Signal. Process. In peer review
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[14]  arXiv:2007.00144 [pdf, other]
Title: A Sequential Self Teaching Approach for Improving Generalization in Sound Event Recognition
Comments: Accepted at the International Conference on Machine Learning (ICML) 2020. 14 pages
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[15]  arXiv:2007.00542 (cross-list from eess.AS) [pdf, ps, other]
Title: Instantaneous PSD Estimation for Speech Enhancement based on Generalized Principal Components
Journal-ref: Proc. 28th European Signal Process. Conf. (EUSIPCO 2020), Amsterdam, Netherlands, Jan 2021, pp. 1-5
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[16]  arXiv:2007.00272 (cross-list from eess.AS) [pdf, other]
Title: Exploring the time-domain deep attractor network with two-stream architectures in a reverberant environment
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[17]  arXiv:2007.00253 (cross-list from cs.CR) [pdf, other]
Title: Private Speech Characterization with Secure Multiparty Computation
Subjects: Cryptography and Security (cs.CR); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[18]  arXiv:2007.00225 (cross-list from eess.AS) [pdf, other]
Title: The NTT DCASE2020 Challenge Task 6 system: Automated Audio Captioning with Keywords and Sentence Length Estimation
Comments: Technical Report of DCASE2020 Challenge Task 6
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[19]  arXiv:2007.00222 (cross-list from eess.AS) [pdf, other]
Title: A Transformer-based Audio Captioning Model with Keyword Estimation
Comments: Submitted to Interspeech 2020
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[20]  arXiv:2007.00192 (cross-list from eess.AS) [pdf]
Title: Personalization of Hearing Aid Compression by Human-In-Loop Deep Reinforcement Learning
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Signal Processing (eess.SP)
[21]  arXiv:2007.00183 (cross-list from eess.AS) [pdf, other]
Title: Whole-Word Segmental Speech Recognition with Acoustic Word Embeddings
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[22]  arXiv:2007.00131 (cross-list from eess.AS) [pdf, other]
Title: Multi-view Frequency LSTM: An Efficient Frontend for Automatic Speech Recognition
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)

Wed, 1 Jul 2020

[23]  arXiv:2006.16689 (cross-list from eess.AS) [pdf, other]
Title: A Speech Enhancement Algorithm based on Non-negative Hidden Markov Model and Kullback-Leibler Divergence
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[24]  arXiv:2006.16367 (cross-list from eess.IV) [pdf, other]
Title: Ultra2Speech -- A Deep Learning Framework for Formant Frequency Estimation and Tracking from Ultrasound Tongue Images
Comments: Accepted for publication in MICCAI 2020
Subjects: Image and Video Processing (eess.IV); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)

Tue, 30 Jun 2020

[25]  arXiv:2006.15903 [pdf]
Title: Data augmentation versus noise compensation for x-vector speaker recognition systems in noisy environments
Authors: Mohammad Mohammadamini (LIA), Driss Matrouf (LIA)
Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[26]  arXiv:2006.15253 [pdf, ps, other]
Title: Sound Event Detection Using Duration Robust Loss Function
Comments: Submitted to DCASE2020 Workshop
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[27]  arXiv:2006.15967 (cross-list from eess.AS) [pdf, other]
Title: Prosodic Prominence and Boundaries in Sequence-to-Sequence Speech Synthesis
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[28]  arXiv:2006.15469 (cross-list from eess.SP) [pdf, other]
Title: End-to-End AI-Based Point-of-Care Diagnosis System for Classifying Respiratory Illnesses and Early Detection of COVID-19
Subjects: Signal Processing (eess.SP); Artificial Intelligence (cs.AI); Hardware Architecture (cs.AR); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[29]  arXiv:2006.15406 (cross-list from eess.AS) [pdf, other]
Title: Listen carefully and tell: an audio captioning system based on residual learning and gammatone audio representation
Comments: Submitted to DCASE2020 Workshop, Workshop on Detection and Classification of Acoustic Scenes and Events
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[30]  arXiv:2006.15321 (cross-list from eess.AS) [pdf, other]
Title: Anomalous Sound Detection using unsupervised and semi-supervised autoencoders and gammatone audio representation
Comments: Submitted to DCASE2020 Workshop, Workshop on Detection and Classification of Acoustic Scenes and Events
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)