Sound

Authors and titles for cs.SD in Jun 2022, skipping first 175

[ total of 221 entries: 1-25 | ... | 101-125 | 126-150 | 151-175 | 176-200 | 201-221 ]
[176]  arXiv:2206.09072 (cross-list from eess.AS) [pdf, other]
Title: Semi-supervised Time Domain Target Speaker Extraction with Attention
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[177]  arXiv:2206.09102 (cross-list from eess.AS) [pdf, other]
Title: Decoupled Federated Learning for ASR with Non-IID Data
Comments: Accepted by Interspeech 2022
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Distributed, Parallel, and Cluster Computing (cs.DC); Sound (cs.SD)
[178]  arXiv:2206.09396 (cross-list from eess.AS) [pdf, other]
Title: Transfer Learning for Robust Low-Resource Children's Speech ASR with Transformers and Source-Filter Warping
Comments: proceedings of INTERSPEECH 2022
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[179]  arXiv:2206.09507 (cross-list from eess.AS) [pdf, other]
Title: Resource-Efficient Separation Transformer
Comments: Accepted to ICASSP 2024
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Signal Processing (eess.SP)
[180]  arXiv:2206.09523 (cross-list from eess.AS) [pdf, other]
Title: Towards Trustworthy Edge Intelligence: Insights from Voice-Activated Services
Subjects: Audio and Speech Processing (eess.AS); Computers and Society (cs.CY); Sound (cs.SD)
[181]  arXiv:2206.09556 (cross-list from eess.AS) [pdf, other]
Title: An Empirical Analysis on the Vulnerabilities of End-to-End Speech Segregation Models
Comments: Accepted at Interspeech 2022
Subjects: Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Sound (cs.SD); Signal Processing (eess.SP)
[182]  arXiv:2206.09783 (cross-list from eess.AS) [pdf, other]
Title: Boosting Cross-Domain Speech Recognition with Self-Supervision
Comments: Accepted by IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP), 2023
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[183]  arXiv:2206.11000 (cross-list from eess.AS) [pdf, other]
Title: A Systematic Comparison of Phonetic Aware Techniques for Speech Enhancement
Comments: Published @ Interspeech 2022
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[184]  arXiv:2206.11045 (cross-list from eess.AS) [pdf, other]
Title: COVYT: Introducing the Coronavirus YouTube and TikTok speech dataset featuring the same speakers with and without infection
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[185]  arXiv:2206.11181 (cross-list from eess.AS) [pdf, other]
Title: On the Role of Spatial, Spectral, and Temporal Processing for DNN-based Non-linear Multi-channel Speech Enhancement
Comments: Accepted at Interspeech 2022
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[186]  arXiv:2206.11558 (cross-list from eess.AS) [pdf, other]
Title: Adversarial Multi-Task Learning for Disentangling Timbre and Pitch in Singing Voice Synthesis
Comments: Accepted to INTERSPEECH 2022
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[187]  arXiv:2206.11640 (cross-list from eess.AS) [pdf, other]
Title: Speaker-Independent Microphone Identification in Noisy Conditions
Journal-ref: in European Signal Processing Conference (EUSIPCO), Belgrade, Serbia, 2022, pp. 1047-1051
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[188]  arXiv:2206.11703 (cross-list from eess.AS) [pdf, other]
Title: Efficient Transformer-based Speech Enhancement Using Long Frames and STFT Magnitudes
Comments: Accepted at Interspeech 2022
Journal-ref: Proc. Interspeech 2022
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[189]  arXiv:2206.12040 (cross-list from eess.AS) [pdf, other]
Title: End-to-End Text-to-Speech Based on Latent Representation of Speaking Styles Using Spontaneous Dialogue
Comments: 5 pages, 3 figures, accepted for INTERSPEECH 2022. Audio samples: this https URL
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[190]  arXiv:2206.12045 (cross-list from eess.AS) [pdf, other]
Title: Confidence Score Based Conformer Speaker Adaptation for Speech Recognition
Comments: Accepted to INTERSPEECH 2022. arXiv admin note: text overlap with arXiv:2206.11596
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[191]  arXiv:2206.12059 (cross-list from eess.AS) [pdf, ps, other]
Title: Data Augmentation and Squeeze-and-Excitation Network on Multiple Dimension for Sound Event Localization and Detection in Real Scenes
Comments: Technical Report submitted for DCASE2022 Challenge Task3
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[192]  arXiv:2206.12283 (cross-list from eess.AS) [pdf, other]
Title: Open-source objective-oriented framework for head-related transfer function
Authors: Adam Szwajcowski
Comments: Not submitted anywhere in the current form
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[193]  arXiv:2206.12285 (cross-list from eess.AS) [pdf, other]
Title: Speech Quality Assessment through MOS using Non-Matching References
Comments: To Appear, Interspeech 2022
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[194]  arXiv:2206.12297 (cross-list from eess.AS) [pdf, other]
Title: SAQAM: Spatial Audio Quality Assessment Metric
Comments: To Appear, Interspeech 2022
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[195]  arXiv:2206.12489 (cross-list from eess.AS) [pdf, other]
Title: Predicting within and across language phoneme recognition performance of self-supervised learning speech pre-trained models
Comments: Submitted to INTERSPEECH 2022
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[196]  arXiv:2206.12774 (cross-list from eess.AS) [pdf, other]
Title: Meta Auxiliary Learning for Low-resource Spoken Language Understanding
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[197]  arXiv:2206.12857 (cross-list from eess.AS) [pdf, other]
Title: Transport-Oriented Feature Aggregation for Speaker Embedding Learning
Comments: Accepted for presentation at INTERSPEECH 2022
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[198]  arXiv:2206.13014 (cross-list from eess.AS) [pdf, other]
Title: Joint Optimization of Sampling Rate Offsets Based on Entire Signal Relationship Among Distributed Microphones
Comments: 5 pages, 2 figures, accepted by Interspeech 2022
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD); Signal Processing (eess.SP)
[199]  arXiv:2206.13044 (cross-list from eess.AS) [pdf, other]
Title: Extended U-Net for Speaker Verification in Noisy Environments
Comments: 5 pages, 2 figures, 4 tables, accepted to 2022 Interspeech as a conference paper
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[200]  arXiv:2206.13066 (cross-list from eess.AS) [pdf, other]
Title: Detection of Doctored Speech: Towards an End-to-End Parametric Learn-able Filter Approach
Authors: Rohit Arora
Comments: arXiv admin note: text overlap with arXiv:1904.05441 by other authors
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)