Title: BertNet: Harvesting Knowledge Graphs with Arbitrary Relations from Pretrained Language Models

Abstract: Automatically constructing knowledge graphs (KGs) of diverse new relations is crucial for supporting knowledge discovery and a broad range of applications. Previous KG construction methods, based on either crowdsourcing or text mining, are often limited to a small predefined set of relations because of manual cost or restrictions of the text corpus. Recent research has proposed using pretrained language models (LMs) as implicit knowledge bases that accept knowledge queries through prompts. Yet this implicit knowledge lacks many desirable properties of a full-scale symbolic KG, such as easy access, navigation, editing, and quality assurance. In this paper, we propose a new approach for harvesting massive KGs of arbitrary relations from pretrained LMs. Given a minimal relation definition as input (a prompt and a few example entity pairs), the approach efficiently searches the vast entity-pair space to extract diverse, accurate knowledge of the desired relation. We develop an effective search-and-rescore mechanism for improved efficiency and accuracy. We deploy the approach to harvest KGs of over 400 new relations from different LMs. Extensive human and automatic evaluations show that our approach extracts diverse, accurate knowledge, including tuples of complex relations (e.g., "A is capable of but not good at B"). The resulting KGs, as a symbolic interpretation of the source LMs, also reveal new insights into the LMs' knowledge capacities.
Comments: ACL 2023 (Findings); Code available at this https URL
Subjects: Computation and Language (cs.CL)
Cite as: arXiv:2206.14268 [cs.CL]
  (or arXiv:2206.14268v3 [cs.CL] for this version)

Submission history

From: Bowen Tan
[v1] Tue, 28 Jun 2022 19:46:29 GMT (1055kb,D)
[v2] Tue, 20 Dec 2022 18:13:13 GMT (1211kb,D)
[v3] Fri, 2 Jun 2023 17:54:54 GMT (8319kb,D)
