We gratefully acknowledge support from
the Simons Foundation and member institutions.
Full-text links:

Download:

Current browse context:

cs.SD

Change to browse by:

References & Citations

DBLP - CS Bibliography

Bookmark

(what is this?)
CiteULike logo BibSonomy logo Mendeley logo del.icio.us logo Digg logo Reddit logo

Computer Science > Sound

Title: MoEVC: A Mixture-of-experts Voice Conversion System with Sparse Gating Mechanism for Accelerating Online Computation

Abstract: With the recent advancements of deep learning technologies, the performance of voice conversion (VC) in terms of quality and similarity has been significantly improved. However, heavy computations are generally required for deep-learning-based VC systems, which can cause notable latency and thus confine their deployments in real-world applications. Therefore, increasing online computation efficiency has become an important task. In this study, we propose a novel mixture-of-experts (MoE) based VC system. The MoE model uses a gating mechanism to specify optimal weights to feature maps to increase VC performance. In addition, assigning sparse constraints on the gating mechanism can accelerate online computation by skipping the convolution process by zeroing out redundant feature maps. Experimental results show that by specifying suitable sparse constraints, we can effectively increase the online computation efficiency with a notable 70% FLOPs (floating-point operations per second) reduction while improving the VC performance in both objective evaluations and human listening tests.
Comments: Submitted to ICASSP 2020
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
Cite as: arXiv:1912.11984 [cs.SD]
  (or arXiv:1912.11984v1 [cs.SD] for this version)

Submission history

From: Yu Tsao [view email]
[v1] Fri, 27 Dec 2019 04:50:20 GMT (819kb)

Link back to: arXiv, form interface, contact.