
Sound

Authors and titles for cs.SD in Apr 2021, skipping first 50

[ total of 229 entries: 1-25 | 26-50 | 51-75 | 76-100 | 101-125 | 126-150 | ... | 226-229 ]
[51]  arXiv:2104.06793 [pdf, other]
Title: Non-autoregressive sequence-to-sequence voice conversion
Comments: Accepted to ICASSP2021. Demo HP: this https URL
Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[52]  arXiv:2104.06865 [pdf, other]
Title: Efficient conformer-based speech recognition with linear attention
Comments: submitted to APSIPA ASC 2021
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[53]  arXiv:2104.06900 [pdf, ps, other]
Title: FastS2S-VC: Streaming Non-Autoregressive Sequence-to-Sequence Voice Conversion
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[54]  arXiv:2104.07128 [pdf, ps, other]
Title: Audio feature ranking for sound-based COVID-19 patient detection
Comments: 22 pages, 6 figures, 8 tables
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[55]  arXiv:2104.07161 [pdf, other]
Title: On the Design of Deep Priors for Unsupervised Audio Restoration
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[56]  arXiv:2104.07286 [pdf, other]
Title: Continual Learning for Fake Audio Detection
Comments: 5 pages, conference
Journal-ref: Proc. Interspeech 2021, 886-890
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[57]  arXiv:2104.07491 [pdf, other]
Title: Cross-domain Speech Recognition with Unsupervised Character-level Distribution Matching
Comments: Accepted to INTERSPEECH 2021; code available at this https URL
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[58]  arXiv:2104.07519 [pdf, other]
Title: Spectrogram Inpainting for Interactive Generation of Instrument Sounds
Comments: 8 pages + references + appendices. 4 figures. Published as a conference paper at the 2020 Joint Conference on AI Music Creativity, October 19-23, 2020, organized and hosted virtually by the Royal Institute of Technology (KTH), Stockholm, Sweden
Journal-ref: Proceedings of the 1st Joint Conference on AI Music Creativity, 2020 (p. 10). Stockholm, Sweden: AIMC
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Human-Computer Interaction (cs.HC); Audio and Speech Processing (eess.AS)
[59]  arXiv:2104.08450 [pdf, other]
Title: MIMO Self-attentive RNN Beamformer for Multi-speaker Speech Separation
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS)
[60]  arXiv:2104.08580 [pdf, other]
Title: Uncovering audio patterns in music with Nonnegative Tucker Decomposition for structural segmentation
Authors: Axel Marmoret (1), Jérémy E. Cohen (1), Nancy Bertin (1), Frédéric Bimbot (1) ((1) Univ Rennes, Inria, CNRS, IRISA, France.)
Comments: 7 pages, 6 figures; Code and experiments details available at this https URL; Experiments details available at this https URL
Journal-ref: 21st International Society for Music Information Retrieval Conference (ISMIR), Montréal, Canada, 2020, 788-794
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[61]  arXiv:2104.08614 [pdf]
Title: Cetacean Translation Initiative: a roadmap to deciphering the communication of sperm whales
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG); Robotics (cs.RO); Audio and Speech Processing (eess.AS)
[62]  arXiv:2104.08806 [pdf, other]
Title: Best Practices for Noise-Based Augmentation to Improve the Performance of Emotion Recognition "In the Wild"
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[63]  arXiv:2104.08872 [pdf, ps, other]
Title: Low-Frequency Characterization of Music Sounds -- Ultra-Bass Richness from the Sound Wave Beats
Comments: 23 pages, 7 figures
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS); General Physics (physics.gen-ph)
[64]  arXiv:2104.08955 [pdf, other]
Title: Many-Speakers Single Channel Speech Separation with Optimal Permutation Training
Comments: Accepted to Interspeech 2021, Data creation link added
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[65]  arXiv:2104.09018 [pdf, other]
Title: An Interdisciplinary Review of Music Performance Analysis
Comments: arXiv admin note: substantial text overlap with arXiv:1907.00178
Journal-ref: Transactions of the International Society for Music Information Retrieval, 3(1), pp.221-245, 2020
Subjects: Sound (cs.SD); Digital Libraries (cs.DL); Audio and Speech Processing (eess.AS)
[66]  arXiv:2104.09489 [pdf, other]
Title: Interpreting intermediate convolutional layers of generative CNNs trained on waveforms
Comments: IEEE/ACM Transactions on Audio Speech and Language Processing
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[67]  arXiv:2104.09715 [pdf, other]
Title: AdaSpeech 2: Adaptive Text to Speech with Untranscribed Data
Comments: Accepted by ICASSP 2021
Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[68]  arXiv:2104.09748 [pdf, other]
Title: Waveform Phasicity Prediction from Arterial Sounds through Spectrogram Analysis using Convolutional Neural Networks for Limb Perfusion Assessment
Comments: 5 pages, 8 figures
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS); Image and Video Processing (eess.IV)
[69]  arXiv:2104.09832 [pdf]
Title: Identification of fake stereo audio
Subjects: Sound (cs.SD); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[70]  arXiv:2104.09946 [pdf, other]
Title: A cappella: Audio-visual Singing Voice Separation
Comments: Paper accepted at The 32nd British Machine Vision Conference, BMVC 2021
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[71]  arXiv:2104.09995 [pdf, other]
Title: Review of end-to-end speech synthesis technology based on deep learning
Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[72]  arXiv:2104.10121 [pdf, other]
Title: On the Impact of Word Error Rate on Acoustic-Linguistic Speech Emotion Recognition: An Update for the Deep Learning Era
Authors: Shahin Amiriparian (1), Artem Sokolov (2,3), Ilhan Aslan (2), Lukas Christ (1), Maurice Gerczuk (1), Tobias Hübner (1), Dmitry Lamanov (2), Manuel Milling (1), Sandra Ottl (1), Ilya Poduremennykh (2), Evgeniy Shuranov (2,4), Björn W. Schuller (1,5) ((1) EIHW -- Chair of Embedded Intelligence for Health Care and Wellbeing, University of Augsburg, Germany, (2) Huawei Technologies, (3) HSE University, Nizhniy Novgorod, Russia, (4) ITMO University, Saint Petersburg, Russia)
Comments: 5 pages, 1 figure
Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[73]  arXiv:2104.10431 [pdf, other]
Title: Room adaptive conditioning method for sound event classification in reverberant environments
Comments: 5 pages, 3 figures, In Proceedings of the 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[74]  arXiv:2104.11051 [pdf, other]
Title: Protecting gender and identity with disentangled speech representations
Comments: 5 pages, 2 figures
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[75]  arXiv:2104.11347 [pdf, ps, other]
Title: Restoring degraded speech via a modified diffusion model
Journal-ref: Proc. Interspeech 2021, 221-225
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)