Electrical Engineering and Systems Science > Audio and Speech Processing
Title: Phase-Aware Deep Speech Enhancement: It's All About The Frame Length
(Submitted on 30 Mar 2022 (v1), last revised 4 Oct 2022 (this version, v2))
Abstract: Algorithmic latency in speech processing is dominated by the frame length used for Fourier analysis, which in turn limits the achievable performance of magnitude-centric approaches. As previous studies suggest the importance of phase grows with decreasing frame length, this work presents a systematic study on the contribution of phase and magnitude in modern Deep Neural Network (DNN)-based speech enhancement at different frame lengths. Results indicate that DNNs can successfully estimate phase when using short frames, with similar or better overall performance compared to using longer frames. Thus, interestingly, modern phase-aware DNNs allow for low-latency speech enhancement at high quality.
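To make the link between frame length and latency concrete, here is a minimal sketch that converts a frame length in samples into the buffering delay it implies. It assumes the latency is approximated by the time needed to collect one full analysis frame. The 16 kHz sample rate, the listed frame lengths, and the function name `algorithmic_latency_ms` are illustrative assumptions and are not taken from the paper.

```python
def algorithmic_latency_ms(frame_length_samples: int, sample_rate_hz: int) -> float:
    """Time in milliseconds needed to buffer one full analysis frame.

    Assumption: algorithmic latency is approximated by the frame duration;
    any extra delay from hop size or model look-ahead is ignored here.
    """
    return 1000.0 * frame_length_samples / sample_rate_hz


if __name__ == "__main__":
    sample_rate_hz = 16_000  # assumed sample rate, not stated in the abstract
    for frame_length in (64, 128, 256, 512):  # illustrative frame lengths
        latency = algorithmic_latency_ms(frame_length, sample_rate_hz)
        print(f"{frame_length:4d} samples -> {latency:.1f} ms")
```

Under these assumptions, halving the frame length halves the buffering delay, which is why shorter frames matter for low-latency enhancement.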
Submission history
From: Tal Peer
[v1] Wed, 30 Mar 2022 11:51:30 GMT (483kb,D)
[v2] Tue, 4 Oct 2022 14:59:44 GMT (571kb,D)