The ZevoMOS entry to VoiceMOS Challenge 2022

Stan, Adriana

(Submitted on 15 Jun 2022)

Abstract: This paper introduces the ZevoMOS entry to the main track of the VoiceMOS Challenge 2022. The ZevoMOS submission is based on a two-step finetuning of pretrained self-supervised learning (SSL) speech models. The first step finetunes the models to classify natural versus synthetic speech, and the second step finetunes them to predict the mean opinion score (MOS) associated with each training sample. The outputs of the finetuning process are then combined with confidence scores extracted from an automatic speech recognition model and with raw embeddings of the training samples obtained from a wav2vec SSL speech model.
The team ID assigned to the ZevoMOS system within the VoiceMOS Challenge is T01. The submission ranked 14th in system-level Spearman rank correlation coefficient (SRCC) and 9th in utterance-level mean squared error (MSE). The paper also presents additional evaluations of the intermediate results.
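
The two ranking metrics named in the abstract can be illustrated with a minimal sketch. It assumes that system-level scores are obtained by averaging utterance scores per synthesis system before correlating, which is a common convention but may differ from the challenge's official scoring scripts. All function names and the input layout are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) for constant input."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def srcc(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def utterance_mse(true_mos, pred_mos):
    """Mean squared error over individual utterances."""
    return mean((t - p) ** 2 for t, p in zip(true_mos, pred_mos))


def system_level_srcc(system_ids, true_mos, pred_mos):
    """Average scores per system (assumed convention), then compute SRCC."""
    true_by_system = defaultdict(list)
    pred_by_system = defaultdict(list)
    for system, t, p in zip(system_ids, true_mos, pred_mos):
        true_by_system[system].append(t)
        pred_by_system[system].append(p)
    systems = sorted(true_by_system)
    return srcc(
        [mean(true_by_system[s]) for s in systems],
        [mean(pred_by_system[s]) for s in systems],
    )
```

Calling `system_level_srcc(ids, true_mos, pred_mos)` with one system label per utterance gives the system-level figure, while `utterance_mse(true_mos, pred_mos)` scores each utterance individually. A model can therefore rank systems well and still show a sizeable utterance-level error, which is why the two placements reported above can differ.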

Comments:	Accepted at Interspeech 2022 - VoiceMOS Challenge; 5 pages, 2 figures, 2 tables
Subjects:	Audio and Speech Processing (eess.AS)
Cite as:	arXiv:2206.07448 [eess.AS]
	(or arXiv:2206.07448v1 [eess.AS] for this version)

Submission history

From: Adriana Stan PhD
[v1] Wed, 15 Jun 2022 10:53:22 GMT (697kb,D)
