Title: Improving BERT Pretraining with Syntactic Supervision
(Submitted on 21 Apr 2021)
Abstract: Bidirectional masked Transformers have become a mainstay of the current NLP landscape. Despite their impressive benchmark results, recent research has repeatedly questioned such models' capacity for syntactic generalization. In this work, we seek to address this question by adding a supervised, token-level supertagging objective to standard unsupervised pretraining, enabling the explicit incorporation of syntactic biases into the network's training dynamics. Our approach is straightforward to implement, incurs only marginal computational overhead, and is general enough to adapt to a variety of settings. We apply our methodology to Lassy Large, an automatically annotated corpus of written Dutch. Our experiments suggest that our syntax-aware model performs on par with established baselines, despite Lassy Large being one order of magnitude smaller than commonly used corpora.
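The combined objective described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the masked-language-modelling and supertagging losses are combined, so the weighted sum below, the `tag_weight` parameter, and the function names `cross_entropy` and `joint_loss` are all assumptions made for illustration, not the authors' implementation.

```python
import math

def cross_entropy(logits, target):
    """Negative log-likelihood of `target` under softmax(logits)."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]

def joint_loss(mlm_logits, mlm_targets, tag_logits, tag_targets, tag_weight=1.0):
    """Masked-LM loss plus a weighted token-level supertagging loss.

    Assumption: the two terms are combined as a weighted sum.
    A target of None means the position does not contribute to that term
    (e.g. an unmasked token for MLM, or a token without a supertag).
    """
    mlm_terms = [cross_entropy(l, t)
                 for l, t in zip(mlm_logits, mlm_targets) if t is not None]
    tag_terms = [cross_entropy(l, t)
                 for l, t in zip(tag_logits, tag_targets) if t is not None]
    # Average each term over the positions that actually contribute to it.
    mlm = sum(mlm_terms) / len(mlm_terms) if mlm_terms else 0.0
    tag = sum(tag_terms) / len(tag_terms) if tag_terms else 0.0
    return mlm + tag_weight * tag

# Two tokens, a vocabulary of 2 words and 3 supertags (toy sizes).
loss = joint_loss(
    mlm_logits=[[0.0, 0.0], [0.0, 0.0]],
    mlm_targets=[None, 1],            # only the second token was masked
    tag_logits=[[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]],
    tag_targets=[0, 2],               # every token carries a supertag
)
```

In this sketch both prediction heads would read the same encoder output, which is one plausible reason the extra objective would add little computational cost; the actual architecture is described in the paper itself.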
Submission history
From: Konstantinos Kogkalidis
[v1] Wed, 21 Apr 2021 13:15:58 GMT (7233kb,D)