On the Inconsistencies of Conditionals Learned by Masked Language Models

Young, Tom; You, Yang

(Submitted on 30 Dec 2022 (this version), latest version 23 Feb 2024 (v3))

Abstract: Learning to predict masked tokens in a sequence has been shown to be a powerful pretraining objective for large-scale language models. After training, such masked language models can provide distributions of tokens conditioned on bidirectional context.
In this short draft, we show that such bidirectional conditionals often demonstrate considerable inconsistencies, i.e., they cannot be derived from a coherent joint distribution when considered together. We empirically quantify such inconsistencies in the simple scenario of bigrams for two common styles of masked language models: T5-style and BERT-style. For example, we show that T5 models often confuse their own preferences regarding two similar bigrams.
Such inconsistencies may represent a theoretical pitfall for research on sampling sequences from the bidirectional conditionals learned by BERT-style MLMs. The phenomenon also means that T5-style MLMs capable of infilling will generate discrepant results depending on how much of the input is masked, which may raise a particular trust issue.
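
To make the notion of inconsistency concrete, the following is a minimal sketch for the bigram case, and not the measurement procedure used in the paper. It assumes a toy vocabulary and strictly positive conditionals stored as matrices whose entry [i, j] refers to the bigram (x1 = i, x2 = j). It relies on a standard compatibility condition: P(x1 | x2) and P(x2 | x1) come from one joint distribution only if their elementwise ratio factorizes as u_i * v_j, so the log-ratio matrix must have no row-column interaction. The function names `conditionals_from_joint` and `inconsistency_gap` are hypothetical and chosen for illustration.

```python
import numpy as np


def conditionals_from_joint(joint):
    """Derive both bigram conditionals from a joint table joint[i, j] = P(x1=i, x2=j)."""
    joint = np.asarray(joint, dtype=float)
    # P(x1=i | x2=j): normalize each column so it sums to 1.
    p_x1_given_x2 = joint / joint.sum(axis=0, keepdims=True)
    # P(x2=j | x1=i): normalize each row so it sums to 1.
    p_x2_given_x1 = joint / joint.sum(axis=1, keepdims=True)
    return p_x1_given_x2, p_x2_given_x1


def inconsistency_gap(p_x1_given_x2, p_x2_given_x1):
    """Return 0 (up to rounding) iff the two positive conditionals share a joint.

    For compatible conditionals, log P(x1|x2) - log P(x2|x1) equals
    log P(x1=i) - log P(x2=j), i.e. a row term plus a column term.
    Double-centering removes any such additive structure, so whatever
    remains measures the incompatibility.
    """
    log_ratio = np.log(p_x1_given_x2) - np.log(p_x2_given_x1)
    residual = (
        log_ratio
        - log_ratio.mean(axis=1, keepdims=True)
        - log_ratio.mean(axis=0, keepdims=True)
        + log_ratio.mean()
    )
    return float(np.abs(residual).max())


# Toy joint over a two-token vocabulary (illustrative numbers, not model outputs).
toy_joint = np.array([[0.4, 0.1],
                      [0.2, 0.3]])
cond_a, cond_b = conditionals_from_joint(toy_joint)
print("gap for conditionals derived from one joint:", inconsistency_gap(cond_a, cond_b))

# Replace P(x2 | x1) with a different, still row-normalized table.
cond_b_perturbed = np.array([[0.5, 0.5],
                             [0.4, 0.6]])
print("gap after perturbing one conditional:", inconsistency_gap(cond_a, cond_b_perturbed))
```

In practice the two matrices would be filled with probabilities read from a masked language model for a fixed context, which this sketch leaves out.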

Comments:	4 pages
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2301.00068 [cs.CL]
	(or arXiv:2301.00068v1 [cs.CL] for this version)

Submission history

From: Tom Young Dr.
[v1] Fri, 30 Dec 2022 22:53:25 GMT (7110kb,D)
[v2] Sun, 8 Oct 2023 07:57:03 GMT (3608kb,D)
[v3] Fri, 23 Feb 2024 05:08:58 GMT (3959kb,D)
