Title: Analyzing Encoded Concepts in Transformer Language Models

Abstract: We propose a novel framework, ConceptX, to analyze how latent concepts are encoded in representations learned within pre-trained language models. It uses clustering to discover the encoded concepts and explains them by aligning them with a large set of human-defined concepts. Our analysis of seven transformer language models reveals interesting insights: i) the latent space within the learned representations overlaps with different linguistic concepts to a varying degree, ii) the lower layers in the model are dominated by lexical concepts (e.g., affixation), whereas the core-linguistic concepts (e.g., morphological or syntactic relations) are better represented in the middle and higher layers, iii) some encoded concepts are multi-faceted and cannot be adequately explained using the existing human-defined concepts.
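
The two steps named in the abstract (cluster the representations, then align each cluster with human-defined concepts) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration, since the abstract does not give these details: the toy 2-D vectors, the POS-style labels, the average-linkage merging rule, the helper names `agglomerative_clusters` and `align`, and the 0.9 alignment threshold are all hypothetical and are not taken from ConceptX itself.

```python
from collections import Counter
from math import dist

def agglomerative_clusters(vectors, k):
    """Repeatedly merge the two closest clusters (average linkage) until k remain."""
    clusters = [[i] for i in range(len(vectors))]

    def avg_link(a, b):
        # Mean Euclidean distance over all cross-cluster pairs of points.
        return sum(dist(vectors[i], vectors[j]) for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > k:
        pairs = [(avg_link(clusters[i], clusters[j]), i, j)
                 for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        _, i, j = min(pairs)
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

def align(cluster, labels, threshold=0.9):
    """Return the label shared by at least `threshold` of the cluster, else None."""
    label, count = Counter(labels[i] for i in cluster).most_common(1)[0]
    return label if count / len(cluster) >= threshold else None

# Toy stand-ins for contextual token representations (assumed, not real model output).
tokens = ["walked", "jumped", "played", "cat", "dog", "running"]
vectors = [(0.0, 0.1), (0.1, 0.0), (0.05, 0.05), (1.0, 1.0), (1.1, 0.9), (1.05, 1.0)]
# Hypothetical human-defined concept labels, one per token.
labels = ["VBD", "VBD", "VBD", "NN", "NN", "VBG"]

clusters = agglomerative_clusters(vectors, k=2)
explained = {tuple(sorted(c)): align(c, labels) for c in clusters}

for members, concept in explained.items():
    words = [tokens[i] for i in members]
    print(words, "->", concept if concept else "unaligned")
```

In this toy setup one cluster is fully covered by a single label, while the other mixes labels and stays unaligned under the assumed threshold, loosely mirroring the abstract's point that some encoded concepts are not adequately explained by existing human-defined concepts.
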
Comments: 20 pages, 10 figures
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Journal reference: 2022 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Cite as: arXiv:2206.13289 [cs.CL]
  (or arXiv:2206.13289v1 [cs.CL] for this version)

Submission history

From: Hassan Sajjad
[v1] Mon, 27 Jun 2022 13:32:10 GMT (13619kb,D)
