MiniSUPERB: Lightweight Benchmark for Self-supervised Speech Models

Wang, Yu-Hsiang; Chen, Huang-Yu; Chang, Kai-Wei; Hsu, Winston; Lee, Hung-yi

(Submitted on 30 May 2023 (v1), last revised 14 Nov 2023 (this version, v3))

Abstract: SUPERB was proposed to evaluate the generalizability of self-supervised learning (SSL) speech models across various tasks. However, it incurs high computational costs due to its large datasets and diverse tasks. In this paper, we introduce MiniSUPERB, a lightweight benchmark that efficiently evaluates SSL speech models, yielding results comparable to SUPERB at significantly lower computational cost. We carefully select representative tasks, sample datasets, and extract model representations offline. Our approach achieves Spearman's rank correlations of 0.954 and 0.982 with SUPERB Paper and SUPERB Challenge, respectively. Additionally, we reduce the computational cost by 97% in terms of Multiply-ACcumulate operations (MACs). Furthermore, we evaluate SSL speech models in few-shot scenarios and observe significant variations in their performance. To our knowledge, this is the first study to examine both the computational cost of the model itself and the cost of evaluating it on a benchmark.
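
The agreement figures quoted in the abstract are Spearman's rank correlations, which measure how closely two benchmarks order the same set of models. The following is a minimal sketch of that statistic in plain Python, not the authors' evaluation code. The score lists are hypothetical values invented for illustration, and the function names `average_ranks` and `spearman` are assumptions of this sketch.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical aggregate scores for five models (not taken from the paper).
full_benchmark_scores = [61.2, 70.5, 74.9, 80.1, 82.3]
light_benchmark_scores = [58.0, 69.1, 75.4, 73.8, 81.0]

rho = spearman(full_benchmark_scores, light_benchmark_scores)
print(f"{rho:.3f}")  # prints 0.900: one adjacent pair of models swaps order
```

A value near 1 means the lightweight benchmark ranks models almost exactly as the full one does, which is the property a reduced benchmark needs if it is to stand in for the expensive evaluation.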

Comments:	Accepted to IEEE ASRU 2023
Subjects:	Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG)
Cite as:	arXiv:2305.19011 [eess.AS]
	(or arXiv:2305.19011v3 [eess.AS] for this version)

Submission history

From: Kai-Wei Chang
[v1] Tue, 30 May 2023 13:07:33 GMT (1617kb,D)
[v2] Tue, 10 Oct 2023 05:17:01 GMT (1320kb,D)
[v3] Tue, 14 Nov 2023 21:22:25 GMT (1320kb,D)
