Sound

Authors and titles for cs.SD in Oct 2022

[ total of 363 entries: 1-10 | 11-20 | 21-30 | 31-40 | ... | 361-363 ]
[1]  arXiv:2210.00169 [pdf, other]
Title: Multi-stage Progressive Compression of Conformer Transducer for On-device Speech Recognition
Comments: Published in INTERSPEECH 2022
Journal-ref: Proc. Interspeech 2022, 1691-1695
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[2]  arXiv:2210.00721 [pdf, other]
Title: Efficient acoustic feature transformation in mismatched environments using a Guided-GAN
Comments: Final published version available at: Efficient acoustic feature transformation in mismatched environments using a Guided-GAN. Speech Communication, 143, pp.10-20
Journal-ref: Speech Communication, 143, pp.10-20 (2022)
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[3]  arXiv:2210.00753 [pdf, other]
Title: Push-Pull: Characterizing the Adversarial Robustness for Audio-Visual Active Speaker Detection
Comments: Accepted by SLT 2022
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[4]  arXiv:2210.01256 [pdf, ps, other]
Title: And what if two musical versions don't share melody, harmony, rhythm, or lyrics ?
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[5]  arXiv:2210.01353 [pdf, other]
Title: Pay Self-Attention to Audio-Visual Navigation
Comments: Main paper (10 pages and 7 figures) and appendix (21 figures and 4 tables). Accepted for publication by BMVC 2022. For data and code, see this https URL
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS)
[6]  arXiv:2210.01448 [pdf, other]
Title: Rhythmic Gesticulator: Rhythm-Aware Co-Speech Gesture Synthesis with Hierarchical Neural Embeddings
Comments: SIGGRAPH Asia 2022 (Journal Track); Project Page: this https URL
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Audio and Speech Processing (eess.AS)
[7]  arXiv:2210.01703 [pdf, other]
Title: Improving Label-Deficient Keyword Spotting Through Self-Supervised Pretraining
Comments: To be published at ICASSP2023 Workshop on Self-supervision in Audio, Speech and Beyond, 10th of June 2023, Rhodes, Greece. Copyright (c) 2023 IEEE. 5 pages, 3 figures, 3 tables
Subjects: Sound (cs.SD); Human-Computer Interaction (cs.HC); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[8]  arXiv:2210.01719 [pdf, other]
Title: Learning Temporal Resolution in Spectrogram for Audio Classification
Comments: Accepted by the 38th Annual AAAI Conference on Artificial Intelligence
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Multimedia (cs.MM); Audio and Speech Processing (eess.AS); Signal Processing (eess.SP)
[9]  arXiv:2210.02287 [pdf, ps, other]
Title: TC-SKNet with GridMask for Low-complexity Classification of Acoustic scene
Comments: Accepted to APSIPA ASC 2022
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[10]  arXiv:2210.02437 [pdf, other]
Title: ASVspoof 2021: Towards Spoofed and Deepfake Speech Detection in the Wild
Comments: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Subjects: Sound (cs.SD); Cryptography and Security (cs.CR); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
