RepEval: Effective Text Evaluation with LLM Representation

Sheng, Shuqian; Xu, Yi; Zhang, Tianhang; Shen, Zanwei; Fu, Luoyi; Ding, Jiaxin; Zhou, Lei; Wang, Xinbing; Zhou, Chenghu



(Submitted on 30 Apr 2024)

Abstract: Automatic evaluation metrics for generated texts play an important role in the NLG field, especially with the rapid growth of LLMs. However, existing metrics are often limited to specific scenarios, making it challenging to meet the evaluation requirements of expanding LLM applications. Therefore, there is a demand for new, flexible, and effective metrics. In this study, we introduce RepEval, the first metric leveraging the projection of LLM representations for evaluation. RepEval requires minimal sample pairs for training, and through simple prompt modifications, it can easily transition to various tasks. Results on ten datasets from three tasks demonstrate the high effectiveness of our method, which exhibits stronger correlations with human judgments compared to previous metrics, even outperforming GPT-4. Our work underscores the richness of information regarding text quality embedded within LLM representations, offering insights for the development of new metrics.
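The abstract does not say how the projection is computed, so the following is only a minimal sketch of the general idea of scoring a text by projecting its representation onto a direction learned from a few sample pairs. It assumes the direction is the normalized mean difference between "better" and "worse" representations, which is a stand-in technique and not necessarily the one used in RepEval. The names `fit_direction` and `score` are hypothetical, and the short toy vectors stand in for real LLM hidden states.

```python
import math


def fit_direction(pairs):
    """Estimate a quality direction from (better, worse) representation pairs.

    Assumption: the direction is the normalized sum of differences
    (better - worse), which points the same way as the mean difference.
    This is an illustrative choice, not the paper's confirmed method.
    """
    dim = len(pairs[0][0])
    diff = [0.0] * dim
    for better, worse in pairs:
        for i in range(dim):
            diff[i] += better[i] - worse[i]
    norm = math.sqrt(sum(d * d for d in diff))
    if norm == 0.0:
        raise ValueError("pairs give no usable direction")
    return [d / norm for d in diff]


def score(representation, direction):
    """Project a representation onto the direction; higher means better."""
    return sum(r * d for r, d in zip(representation, direction))


# Toy 2-dimensional "representations" (hypothetical data).
pairs = [
    ([1.0, 0.2], [0.0, 0.1]),
    ([0.9, 0.5], [0.1, 0.6]),
]
direction = fit_direction(pairs)
print(score([0.8, 0.3], direction) > score([0.2, 0.3], direction))
```

In such a setup, switching to another task would mean changing the prompt used to obtain the representations and refitting the direction on a handful of pairs for that task.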

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2404.19563 [cs.CL]
	(or arXiv:2404.19563v1 [cs.CL] for this version)

Submission history

From: Shuqian Sheng [view email]
[v1] Tue, 30 Apr 2024 13:50:55 GMT (164kb,D)
