Can We Use Probing to Better Understand Fine-tuning and Knowledge Distillation of the BERT NLU?

Hościłowicz, Jakub; Sowański, Marcin; Czubowski, Piotr; Janicki, Artur

(Submitted on 27 Jan 2023)

Abstract: In this article, we use probing to investigate phenomena that occur during fine-tuning and knowledge distillation of a BERT-based natural language understanding (NLU) model. Our ultimate purpose was to use probing to better understand practical production problems and consequently to build better NLU models. We designed experiments to see how fine-tuning changes the linguistic capabilities of BERT, what the optimal size of the fine-tuning dataset is, and what amount of information is contained in a distilled NLU based on a tiny Transformer. The results of the experiments show that the probing paradigm in its current form is not well suited to answer such questions. Structural, Edge and Conditional probes do not take into account how easy it is to decode probed information. Consequently, we conclude that quantification of information decodability is critical for many practical applications of the probing paradigm.

Comments:	Accepted to ICAART 2023 conference
Subjects:	Computation and Language (cs.CL); Machine Learning (cs.LG)
Cite as:	arXiv:2301.11688 [cs.CL]
	(or arXiv:2301.11688v1 [cs.CL] for this version)

Submission history

From: Jakub Hościłowicz
[v1] Fri, 27 Jan 2023 12:56:29 GMT (516kb,D)
