Sound

Authors and titles for cs.SD in Jun 2022, skipping first 75

[ total of 221 entries: 1-10 | ... | 46-55 | 56-65 | 66-75 | 76-85 | 86-95 | 96-105 | 106-115 | ... | 216-221 ]
[76]  arXiv:2206.12469 [pdf, other]
Title: Burst2Vec: An Adversarial Multi-Task Approach for Predicting Emotion, Age, and Origin from Vocal Bursts
Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[77]  arXiv:2206.12494 [pdf, other]
Title: Multitask vocal burst modeling with ResNets and pre-trained paralinguistic Conformers
Comments: To be published in the ICML Expressive Vocalizations Workshop & Competition 2022
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[78]  arXiv:2206.12513 [pdf, other]
Title: Domain Generalization with Relaxed Instance Frequency-wise Normalization for Multi-device Acoustic Scene Classification
Comments: Proceedings of INTERSPEECH 2022
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[79]  arXiv:2206.12559 [pdf, other]
Title: Self-supervised Context-aware Style Representation for Expressive Speech Synthesis
Comments: Accepted by Interspeech 2022
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[80]  arXiv:2206.12563 [pdf, other]
Title: Generating Diverse Vocal Bursts with StyleGAN2 and MEL-Spectrograms
Comments: To be published at the ICML Expressive Vocalizations Workshop and Competition (ExVo Generate) held in conjunction with the 39th International Conference on Machine Learning
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[81]  arXiv:2206.12568 [pdf, other]
Title: Self-supervision and Learnable STRFs for Age, Emotion, and Country Prediction
Journal-ref: Proceedings of the 39th International Conference on Machine Learning, Baltimore, Maryland, USA, PMLR 162, 2022
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS)
[82]  arXiv:2206.12662 [pdf, other]
Title: Synthesizing Personalized Non-speech Vocalization from Discrete Speech Representations
Authors: Chin-Cheng Hsu
Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[83]  arXiv:2206.12829 [pdf, other]
Title: On Comparison of Encoders for Attention based End to End Speech Recognition in Standalone and Rescoring Mode
Comments: Accepted at SPCOM 2022
Subjects: Sound (cs.SD); Computation and Language (cs.CL); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[84]  arXiv:2206.13021 [pdf, other]
Title: Speak Like a Professional: Increasing Speech Intelligibility by Mimicking Professional Announcer Voice with Voice Conversion
Comments: Accepted at INTERSPEECH 2022
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[85]  arXiv:2206.13071 [pdf, other]
Title: Uncertainty Calibration for Deep Audio Classifiers
Comments: Accepted by InterSpeech 2022, the first two authors contributed equally
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)