Title: PLCMOS -- a data-driven non-intrusive metric for the evaluation of packet loss concealment algorithms

Abstract: Speech quality assessment is a problem for every researcher working on models that produce or process speech. Human subjective ratings, the gold standard in speech quality assessment, are expensive and time-consuming to acquire in quantities sufficient to yield reliable data, while automated objective metrics show low correlation with gold-standard ratings. This paper presents PLCMOS, a non-intrusive, data-driven tool for generating a robust, accurate estimate of the mean opinion score a human rater would assign to an audio file that has been transmitted over a degraded packet-switched network, with missing packets healed by a packet loss concealment algorithm. Our new model shows a model-wise Pearson's correlation of ~0.97 and a rank correlation of ~0.95 with human ratings, substantially above all other available intrusive and non-intrusive metrics. The model is released as an ONNX model for other researchers to use when building PLC systems.
Comments: to appear: INTERSPEECH 2023, associated model release: this https URL
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
Cite as: arXiv:2305.15127 [cs.SD]
  (or arXiv:2305.15127v1 [cs.SD] for this version)
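
Illustration: the abstract reports "model-wise" Pearson and rank correlations between predicted scores and human ratings. The sketch below shows one plausible reading of that computation, assuming that "model-wise" means averaging the scores per PLC algorithm before correlating, and that the rank correlation is Spearman's. Both are assumptions, not details confirmed by the abstract. The function names and the toy ratings are hypothetical and are not taken from the paper or from the released model.

```python
from collections import defaultdict
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def model_wise_means(rows):
    """Average human and predicted scores per PLC model.

    rows: iterable of (model_name, human_mos, predicted_mos) tuples.
    """
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for name, human, predicted in rows:
        entry = sums[name]
        entry[0] += human
        entry[1] += predicted
        entry[2] += 1
    names = sorted(sums)
    human_means = [sums[n][0] / sums[n][2] for n in names]
    predicted_means = [sums[n][1] / sums[n][2] for n in names]
    return names, human_means, predicted_means


# Made-up toy ratings for three hypothetical PLC models (not real data).
rows = [
    ("plc_a", 4.0, 3.8), ("plc_a", 3.0, 3.4),
    ("plc_b", 2.0, 2.5), ("plc_b", 3.0, 2.7),
    ("plc_c", 1.0, 1.9), ("plc_c", 2.0, 2.1),
]
names, human_means, predicted_means = model_wise_means(rows)
model_wise_pearson = pearson(human_means, predicted_means)
model_wise_spearman = spearman(human_means, predicted_means)
```

Averaging per model first means each PLC algorithm contributes one point to the correlation, so the result measures how well a metric ranks systems rather than individual files.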

Submission history

From: Lorenz Diener
[v1] Wed, 24 May 2023 13:21:22 GMT (78kb,D)
