
Sound

Authors and titles for cs.SD in Oct 2021, skipping first 250

[ total of 324 entries; showing entries 251-275 ]
[251]  arXiv:2110.04654 (cross-list from eess.AS) [pdf, other]
Title: Complex Network-Based Approach for Feature Extraction and Classification of Musical Genres
Subjects: Audio and Speech Processing (eess.AS); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Sound (cs.SD)
[252]  arXiv:2110.04692 (cross-list from eess.AS) [pdf, other]
Title: Poformer: A simple pooling transformer for speaker verification
Comments: submitted to ICASSP 2022
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[253]  arXiv:2110.04694 (cross-list from eess.AS) [pdf, other]
Title: Multi-Channel End-to-End Neural Diarization with Distributed Microphones
Comments: Accepted to ICASSP 2022
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[254]  arXiv:2110.04775 (cross-list from eess.AS) [pdf, other]
Title: Estimating the confidence of speech spoofing countermeasure
Comments: Work in progress. Comments are welcome. Accepted by ICASSP 2022. Code is available at this https URL. Not all the comments from anonymous reviewers could be addressed within 4 pages; apologies for that.
Subjects: Audio and Speech Processing (eess.AS); Cryptography and Security (cs.CR); Sound (cs.SD)
[255]  arXiv:2110.04791 (cross-list from eess.AS) [pdf, other]
Title: Stepwise-Refining Speech Separation Network via Fine-Grained Encoding in High-order Latent Domain
Comments: Accepted for publication in IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP)
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[256]  arXiv:2110.04850 (cross-list from eess.AS) [pdf, other]
Title: Direct source and early reflections localization using deep deconvolution network under reverberant environment
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD); Signal Processing (eess.SP)
[257]  arXiv:2110.04908 (cross-list from eess.AS) [pdf, other]
Title: DITTO: Data-efficient and Fair Targeted Subset Selection for ASR Accent Adaptation
Comments: ACL 2023
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[258]  arXiv:2110.04948 (cross-list from eess.AS) [pdf, other]
Title: Advancing Momentum Pseudo-Labeling with Conformer and Initialization Strategy
Comments: Submitted to ICASSP2022
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[259]  arXiv:2110.05036 (cross-list from eess.AS) [pdf, other]
Title: Multi-View Self-Attention Based Transformer for Speaker Recognition
Comments: Paper to appear at ICASSP 2022
Subjects: Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Sound (cs.SD); Signal Processing (eess.SP)
[260]  arXiv:2110.05249 (cross-list from eess.AS) [pdf, other]
Title: A Comparative Study on Non-Autoregressive Modelings for Speech-to-Text Generation
Comments: Accepted to ASRU2021
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[261]  arXiv:2110.05267 (cross-list from eess.AS) [pdf, other]
Title: Interactive Feature Fusion for End-to-End Noise-Robust Speech Recognition
Comments: 5 pages, 7 figures, Accepted by ICASSP 2022
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[262]  arXiv:2110.05431 (cross-list from eess.AS) [pdf, other]
Title: On the invertibility of a voice privacy system using embedding alignement
Authors: Pierre Champion (MULTISPEECH, LIUM), Thomas Thebaud (LIUM), Gaël Le Lan, Anthony Larcher (LIUM), Denis Jouvet (MULTISPEECH)
Journal-ref: ASRU 2021 - IEEE Automatic Speech Recognition and Understanding Workshop, Dec 2021, Cartagena, Colombia
Subjects: Audio and Speech Processing (eess.AS); Cryptography and Security (cs.CR); Machine Learning (cs.LG); Sound (cs.SD)
[263]  arXiv:2110.05632 (cross-list from stat.AP) [pdf, other]
Title: Wind-robust sound event detection and denoising for bioacoustics
Comments: 34 pages, 5 figures, 2 supplementary figures
Subjects: Applications (stat.AP); Sound (cs.SD); Quantitative Methods (q-bio.QM)
[264]  arXiv:2110.05695 (cross-list from eess.AS) [pdf, ps, other]
Title: The Mirrornet : Learning Audio Synthesizer Controls Inspired by Sensorimotor Interaction
Journal-ref: ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[265]  arXiv:2110.05745 (cross-list from eess.AS) [pdf, other]
Title: VarArray: Array-Geometry-Agnostic Continuous Speech Separation
Comments: 5 pages, 1 figure, 3 tables, submitted to ICASSP 2022; updated reference information of [33]
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[266]  arXiv:2110.05948 (cross-list from eess.SP) [pdf, other]
Title: Denoising Diffusion Gamma Models
Comments: arXiv admin note: substantial text overlap with arXiv:2106.07582
Subjects: Signal Processing (eess.SP); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS); Image and Video Processing (eess.IV)
[267]  arXiv:2110.05994 (cross-list from eess.AS) [pdf, other]
Title: Word Order Does Not Matter For Speech Recognition
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[268]  arXiv:2110.06126 (cross-list from eess.AS) [pdf, other]
Title: Spatial mixup: Directional loudness modification as data augmentation for sound event localization and detection
Comments: 5 pages, 2 figures, 4 tables. Submitted to the 2022 International Conference on Acoustics, Speech, & Signal Processing (ICASSP)
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[269]  arXiv:2110.06304 (cross-list from eess.AS) [pdf, other]
Title: Generalized Time Domain Velocity Vector
Comments: Submitted
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD); Signal Processing (eess.SP)
[270]  arXiv:2110.06306 (cross-list from eess.AS) [pdf, other]
Title: Fine-grained style control in Transformer-based Text-to-speech Synthesis
Comments: Accepted in ICASSP 2022
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[271]  arXiv:2110.06309 (cross-list from eess.AS) [pdf, other]
Title: Exploring Wav2vec 2.0 fine-tuning for improved speech emotion recognition
Comments: Accepted to ICASSP 2023
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[272]  arXiv:2110.06428 (cross-list from eess.AS) [pdf, other]
Title: All-neural beamformer for continuous speech separation
Comments: 5 pages, 3 figures, 2 tables
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[273]  arXiv:2110.06434 (cross-list from eess.AS) [pdf, other]
Title: DeepA: A Deep Neural Analyzer For Speech And Singing Vocoding
Comments: Accepted to ASRU 2021
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[274]  arXiv:2110.06440 (cross-list from eess.AS) [pdf, other]
Title: SDR -- Medium Rare with Fast Computations
Authors: Robin Scheibler
Comments: 5 pages, 3 figures, 2 tables. Submitted to ICASSP 2022
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD); Signal Processing (eess.SP)
[275]  arXiv:2110.06546 (cross-list from eess.AS) [pdf, other]
Title: A Melody-Unsupervision Model for Singing Voice Synthesis
Comments: ICASSP 2022
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
