Co-advise: Cross Inductive Bias Distillation

Ren, Sucheng; Gao, Zhengqi; Hua, Tianyu; Xue, Zihui; Tian, Yonglong; He, Shengfeng; Zhao, Hang

(Submitted on 23 Jun 2021)

Abstract: Transformers have recently been adapted from the natural language processing community as a promising substitute for convolution-based neural networks in visual learning tasks. However, their advantage degrades given an insufficient amount of training data (e.g., ImageNet). To make them practically useful, we propose a novel distillation-based method to train vision transformers. Unlike previous works, where only heavy convolution-based teachers are provided, we introduce lightweight teachers with different architectural inductive biases (e.g., convolution and involution) to co-advise the student transformer. The key is that teachers with different inductive biases acquire different knowledge even though they are trained on the same dataset, and this different knowledge compounds and boosts the student's performance during distillation. Equipped with this cross inductive bias distillation method, our vision transformers (termed CivT) outperform all previous transformers of the same architecture on ImageNet.
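
The idea of several teachers co-advising one student can be illustrated with a minimal sketch. The abstract does not specify the loss, so the code below assumes a generic soft-label distillation objective: cross-entropy on the ground-truth label plus the average KL divergence from each teacher's softened prediction to the student's. The function names, the temperature, and the weight `alpha` are hypothetical choices for illustration and may differ from the formulation actually used for CivT.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(student_logits, label):
    """Standard cross-entropy against a ground-truth class index."""
    return -math.log(softmax(student_logits)[label])


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def co_advised_loss(student_logits, teacher_logits_list, label,
                    temperature=2.0, alpha=0.5):
    """Assumed loss: label cross-entropy plus averaged teacher KL terms.

    Each entry of teacher_logits_list comes from a teacher with a
    different inductive bias (e.g. one convolution-based, one
    involution-based). This is a generic sketch, not the paper's loss.
    """
    ce = cross_entropy(student_logits, label)
    student_soft = softmax(student_logits, temperature)
    kd_terms = [
        kl_divergence(softmax(t, temperature), student_soft)
        for t in teacher_logits_list
    ]
    kd = sum(kd_terms) / len(kd_terms)
    # temperature**2 keeps gradient magnitudes comparable across temperatures
    return (1.0 - alpha) * ce + alpha * temperature ** 2 * kd


# Toy logits for a 3-class problem; the values are made up.
student = [1.0, 0.5, -0.2]
conv_teacher = [2.0, 0.1, -1.0]
inv_teacher = [1.5, 0.8, -0.5]
print(co_advised_loss(student, [conv_teacher, inv_teacher], label=0))
```

In this sketch, teachers that disagree with each other pull the student toward different soft targets, which is one simple way to picture how differently biased teachers can contribute complementary signals.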

Subjects:	Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Cite as:	arXiv:2106.12378 [cs.CV]
	(or arXiv:2106.12378v1 [cs.CV] for this version)

Submission history

From: Sucheng Ren [view email]
[v1] Wed, 23 Jun 2021 13:19:59 GMT (1031kb,D)
