Title: Automatic Rule Induction for Efficient Semi-Supervised Learning

Abstract: Semi-supervised learning has shown promise in allowing NLP models to generalize from small amounts of labeled data. Meanwhile, pretrained transformer models act as black-box correlation engines that are difficult to explain and sometimes behave unreliably. In this paper, we propose tackling both of these challenges via Automatic Rule Induction (ARI), a simple and general-purpose framework for the automatic discovery and integration of symbolic rules into pretrained transformer models. First, we extract weak symbolic rules from low-capacity machine learning models trained on small amounts of labeled data. Next, we use an attention mechanism to integrate these rules into high-capacity pretrained transformer models. Last, the rule-augmented system becomes part of a self-training framework to boost supervision signal on unlabeled data. These steps can be layered beneath a variety of existing weak supervision and semi-supervised NLP algorithms in order to improve performance and interpretability. Experiments across nine sequence classification and relation extraction tasks suggest that ARI can improve state-of-the-art methods with no manual effort and minimal computational overhead.
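The three steps named in the abstract (rule extraction, attention-based integration, self-training) can be illustrated with a minimal Python sketch. This is not the paper's implementation. It assumes single-token rules mined from label co-occurrence counts as a stand-in for a low-capacity model, fixed attention scores where a real system would presumably learn them, and integer labels 0..K-1. All function names and thresholds (`induce_rules`, `combine`, `pseudo_label`, `min_precision`, `threshold`) are hypothetical.

```python
from collections import Counter, defaultdict
import math


def tokens(text):
    # Hypothetical tokenizer: lowercase whitespace split, deduplicated and sorted.
    return sorted(set(text.lower().split()))


def induce_rules(texts, labels, min_count=2, min_precision=0.9):
    """Step 1 (assumed form): keep 'token -> label' rules that are frequent and precise."""
    counts = defaultdict(Counter)
    for text, label in zip(texts, labels):
        for tok in tokens(text):
            counts[tok][label] += 1
    rules = {}
    for tok, by_label in counts.items():
        label, hits = by_label.most_common(1)[0]
        total = sum(by_label.values())
        if total >= min_count and hits / total >= min_precision:
            rules[tok] = label
    return rules


def softmax(scores):
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    norm = sum(exps)
    return [e / norm for e in exps]


def combine(model_probs, text, rules, model_score=1.0, rule_score=1.0):
    """Step 2 (assumed form): attention-weighted mix of the backbone's
    probabilities and one-hot votes from every rule that fires on the text.
    The scores are fixed constants here, standing in for learned attention."""
    num_labels = len(model_probs)
    sources = [list(model_probs)]
    scores = [model_score]
    for tok in tokens(text):
        if tok in rules:
            vote = [0.0] * num_labels
            vote[rules[tok]] = 1.0
            sources.append(vote)
            scores.append(rule_score)
    weights = softmax(scores)
    return [
        sum(w * src[k] for w, src in zip(weights, sources))
        for k in range(num_labels)
    ]


def pseudo_label(unlabeled, predict, rules, threshold=0.7):
    """Step 3 (assumed form): keep confident rule-augmented predictions
    on unlabeled text as pseudo-labels for another round of training."""
    selected = []
    for text in unlabeled:
        probs = combine(predict(text), text, rules)
        best = max(range(len(probs)), key=probs.__getitem__)
        if probs[best] >= threshold:
            selected.append((text, best))
    return selected
```

Here `predict` is any callable that maps a text to a list of class probabilities, for example a wrapper around a pretrained classifier. Because each fired rule is a named token, the attention weights give a rough account of which rules influenced a prediction.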
Subjects: Computation and Language (cs.CL)
Cite as: arXiv:2205.09067 [cs.CL]
  (or arXiv:2205.09067v3 [cs.CL] for this version)

Submission history

From: Reid Pryzant
[v1] Wed, 18 May 2022 16:50:20 GMT (445kb,D)
[v2] Thu, 19 May 2022 16:18:40 GMT (445kb,D)
[v3] Fri, 20 May 2022 16:42:21 GMT (446kb,D)