Pruning as a Domain-specific LLM Extractor

Zhang, Nan; Liu, Yanchi; Zhao, Xujiang; Cheng, Wei; Bao, Runxue; Zhang, Rui; Mitra, Prasenjit; Chen, Haifeng

(Submitted on 10 May 2024)

Abstract: Large Language Models (LLMs) have exhibited remarkable proficiency across a wide array of NLP tasks. However, the escalation in model size also engenders substantial deployment costs. While a few efforts have explored model pruning techniques to reduce the size of LLMs, they mainly center on general or task-specific weights. When applied to domain-specific challenges, this leads to suboptimal performance, owing to a lack of either specificity to the target domain or generality across tasks. This work introduces an innovative unstructured dual-pruning methodology, D-Pruner, for domain-specific compression of LLMs. It extracts a compressed, domain-specific, and task-agnostic LLM by identifying the LLM weights that are pivotal for general capabilities, such as linguistic capability and multi-task solving, and for domain-specific knowledge. More specifically, we first assess general weight importance by quantifying the error incurred upon their removal with the help of an open-domain calibration dataset. Then, we utilize this general weight importance to refine the training loss, so that it preserves generality when fitting to a specific domain. Moreover, by efficiently approximating weight importance with the refined training loss on a domain-specific calibration dataset, we obtain a pruned model that emphasizes both generality and specificity. Our comprehensive experiments across various tasks in the healthcare and legal domains show the effectiveness of D-Pruner in domain-specific compression. Our code is available at this https URL
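The three steps described in the abstract (general importance, refined loss, dual importance) can be illustrated on a toy problem. The sketch below is not the authors' implementation: it assumes a single linear layer with a mean-squared-error loss, a diagonal second-order approximation, and a quadratic penalty weighted by general importance as the "refined" loss. The function names and the parameter `lam` are hypothetical and chosen only for illustration.

```python
import numpy as np

def general_importance(w, X_open):
    """Step 1 (toy): error from removing each weight, on open-domain data.

    For a linear model y = X @ w, zeroing w[j] alone changes the outputs by
    X[:, j] * w[j], so the summed squared output error is w[j]^2 * sum_i X[i, j]^2.
    """
    return (w ** 2) * (X_open ** 2).sum(axis=0)

def dual_importance(w, X_dom, y_dom, G, lam):
    """Steps 2-3 (toy): change in the refined loss when each weight is zeroed.

    Assumed refined loss: MSE on the domain calibration set plus
    lam * sum_j G[j] * (w_new[j] - w[j])^2, which penalizes moving weights
    that matter for general capability.
    """
    n = len(y_dom)
    grad = 2.0 * X_dom.T @ (X_dom @ w - y_dom) / n   # gradient of the domain MSE
    hess_diag = 2.0 * (X_dom ** 2).sum(axis=0) / n   # diagonal of its Hessian
    # Zeroing w[j] is a step of -w[j]; second-order Taylor expansion of the MSE.
    delta_task = -grad * w + 0.5 * hess_diag * w ** 2
    # The penalty term evaluated at w_new[j] = 0.
    delta_reg = lam * G * w ** 2
    return np.abs(delta_task + delta_reg)

def prune_mask(scores, sparsity):
    """Unstructured pruning: drop the lowest-scoring `sparsity` fraction."""
    k = int(round(sparsity * scores.size))
    mask = np.ones(scores.size, dtype=bool)
    if k > 0:
        mask[np.argsort(scores, kind="stable")[:k]] = False
    return mask

# Usage on synthetic data (all values are made up for the example).
rng = np.random.default_rng(0)
w = rng.normal(size=8)
X_open = rng.normal(size=(32, 8))     # stand-in for open-domain calibration inputs
X_dom = rng.normal(size=(32, 8))      # stand-in for domain calibration inputs
y_dom = X_dom @ w + 0.1 * rng.normal(size=32)

G = general_importance(w, X_open)
scores = dual_importance(w, X_dom, y_dom, G, lam=0.1)
w_pruned = w * prune_mask(scores, sparsity=0.5)
```

In this toy setting, a weight survives pruning if removing it would hurt either the domain loss or the general-capability penalty; `lam` trades off the two. How D-Pruner actually defines and approximates these quantities for a full LLM is specified in the paper and its released code, not here.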

Comments:	NAACL 2024 Findings
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2405.06275 [cs.CL]
	(or arXiv:2405.06275v1 [cs.CL] for this version)

Submission history

From: Nan Zhang [view email]
[v1] Fri, 10 May 2024 07:05:02 GMT (9020kb,D)
