Tensor network compressibility of convolutional models

Singh, Sukhbinder; Jahromi, Saeed S.; Orus, Roman

Computer Science > Computer Vision and Pattern Recognition

(Submitted on 21 Mar 2024)

Abstract: Convolutional neural networks (CNNs) represent one of the most widely used neural network architectures, showcasing state-of-the-art performance in computer vision tasks. Although larger CNNs generally exhibit higher accuracy, their size can be effectively reduced by "tensorization" while maintaining accuracy. Tensorization consists of replacing the convolution kernels with compact decompositions such as Tucker, Canonical Polyadic decompositions, or quantum-inspired decompositions such as matrix product states, and directly training the factors in the decompositions to bias the learning towards low-rank decompositions. But why doesn't tensorization seem to impact the accuracy adversely? We explore this by assessing how truncating the convolution kernels of dense (untensorized) CNNs impacts their accuracy. Specifically, we truncated the kernels of (i) a vanilla four-layer CNN and (ii) ResNet-50 pre-trained for image classification on CIFAR-10 and CIFAR-100 datasets. We found that kernels (especially those inside deeper layers) could often be truncated along several cuts, resulting in significant loss in kernel norm but not in classification accuracy. This suggests that such "correlation compression" (underlying tensorization) is an intrinsic feature of how information is encoded in dense CNNs. We also found that aggressively truncated models could often recover the pre-truncation accuracy after only a few epochs of re-training, suggesting that compressing the internal correlations of convolution layers does not often transport the model to a worse minimum. Our results can be applied to tensorize and compress CNN models more effectively.
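The abstract describes truncating a dense kernel along a cut and comparing the loss in kernel norm with the loss in accuracy. As a rough illustration of the truncation step only, the NumPy sketch below bipartitions the kernel's indices, reshapes the kernel into a matrix across that cut, and keeps the largest singular values. The function name `truncate_along_cut`, the (out_channels, in_channels, height, width) index layout, the random example kernel, and the definition of norm loss as the relative Frobenius norm of the discarded part are all assumptions made for this sketch; the paper's exact procedure may differ.

```python
import numpy as np

def truncate_along_cut(kernel, row_axes, rank):
    """Keep at most `rank` singular values across one bipartition (cut) of the kernel indices.

    Hypothetical helper for illustration; not taken from the paper.
    """
    row_axes = list(row_axes)
    col_axes = [a for a in range(kernel.ndim) if a not in row_axes]
    perm = row_axes + col_axes

    # Group the indices on each side of the cut and flatten to a matrix.
    permuted = np.transpose(kernel, perm)
    row_dim = int(np.prod([kernel.shape[a] for a in row_axes]))
    matrix = permuted.reshape(row_dim, -1)

    # Truncated SVD across the cut.
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    k = min(rank, s.size)
    approx = (u[:, :k] * s[:k]) @ vt[:k, :]

    # Restore the original index order.
    truncated = np.transpose(approx.reshape(permuted.shape), np.argsort(perm))

    # Assumed definition: relative Frobenius norm of the discarded part.
    norm_loss = float(np.sqrt(np.sum(s[k:] ** 2)) / np.sqrt(np.sum(s ** 2)))
    return truncated, norm_loss

# Example on a random stand-in kernel (assumed layout: out, in, height, width).
rng = np.random.default_rng(0)
kernel = rng.standard_normal((8, 4, 3, 3))
truncated, norm_loss = truncate_along_cut(kernel, row_axes=(0,), rank=2)
print(truncated.shape, norm_loss)
```

Choosing a different `row_axes` (for example `(0, 1)` to separate the channel indices from the spatial ones) selects a different cut. A random kernel like the one above carries no learned structure, so its norm loss says nothing about accuracy; the comparison the abstract reports requires a trained model and its evaluation data.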

Comments:	20 pages, 21 images
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Quantum Physics (quant-ph)
Cite as:	arXiv:2403.14379 [cs.CV]
	(or arXiv:2403.14379v1 [cs.CV] for this version)

Submission history

From: Saeed S. Jahromi [view email]
[v1] Thu, 21 Mar 2024 13:12:33 GMT (15170kb,D)
