We gratefully acknowledge support from
the Simons Foundation and member institutions.
Full-text links:

Download:

Current browse context:

cs.CV

Change to browse by:

References & Citations

DBLP - CS Bibliography

Bookmark

(what is this?)
CiteULike logo BibSonomy logo Mendeley logo del.icio.us logo Digg logo Reddit logo

Computer Science > Computer Vision and Pattern Recognition

Title: Feature transforms for image data augmentation

Abstract: A problem with Convolutional Neural Networks (CNNs) is that they require large datasets to obtain adequate robustness; on small datasets, they are prone to overfitting. Many methods have been proposed to overcome this shortcoming with CNNs. In cases where additional samples cannot easily be collected, a common approach is to generate more data points from existing data using an augmentation technique. In image classification, many augmentation approaches utilize simple image manipulation algorithms. In this work, we build ensembles on the data level by adding images generated by combining fourteen augmentation approaches, three of which are proposed here for the first time. These novel methods are based on the Fourier Transform (FT), the Radon Transform (RT) and the Discrete Cosine Transform (DCT). Pretrained ResNet50 networks are finetuned on training sets that include images derived from each augmentation method. These networks and several fusions are evaluated and compared across eleven benchmarks. Results show that building ensembles on the data level by combining different data augmentation methods produce classifiers that not only compete competitively against the state-of-the-art but often surpass the best approaches reported in the literature.
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
DOI: 10.1007/s00521-022-07645-z
Cite as: arXiv:2201.09700 [cs.CV]
  (or arXiv:2201.09700v2 [cs.CV] for this version)

Submission history

From: Loris Nanni [view email]
[v1] Mon, 24 Jan 2022 14:12:29 GMT (783kb)
[v2] Tue, 23 Aug 2022 03:44:03 GMT (1471kb)

Link back to: arXiv, form interface, contact.