Data augmentation for supervised learning with generative adversarial networks
Deep learning is a powerful technology that is revolutionizing automation in many industries. Deep learning models have a large number of parameters and are therefore prone to overfitting. Data plays a major role in avoiding overfitting and in exploiting recent advances in deep learning. However, collecting reliable data is a major limiting factor in many industries. This problem is usually tackled with a combination of data augmentation, dropout, transfer learning, and batch normalization. In this paper, we explore the problem of data augmentation and the techniques commonly employed in image classification, where the most successful strategy is to combine rotation, translation, scaling, shearing, and flipping transformations. We experimentally evaluate and compare the performance of different data augmentation methods on a subset of the CIFAR-10 dataset. Finally, we propose a framework that leverages generative adversarial networks (GANs), which are known to produce photo-realistic images, for augmenting data. In the past, several frameworks have been proposed to leverage GANs in unsupervised and semi-supervised learning, but labeling the samples generated by a GAN remains a difficult problem; in this paper, we propose a framework to do so. We take advantage of the data distribution learned by the generator to train a back-propagation model that projects a real image of a known label onto the latent space. The learned latent variables of real images are perturbed randomly and fed to the generator to produce synthetic images of that label. Through experiments we find that, while adding more real data always outperforms any data augmentation technique, supplementing data with the proposed framework acts as a better regularizer than traditional methods and hence generalizes better.
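The projection-and-perturbation idea above can be sketched as follows. This is a minimal, hypothetical illustration, not the paper's implementation: the trained GAN generator is replaced by a toy fixed-weight `tanh` map, the projection onto latent space is done by direct gradient descent on the latent vector (rather than a trained back-propagation model), and all names (`generate`, `project_to_latent`, `augment`) and hyperparameters are assumptions made for the sketch.

```python
import numpy as np

# Hypothetical stand-in for a trained GAN generator: x = tanh(W @ z).
rng = np.random.default_rng(0)
latent_dim, image_dim = 8, 32
W = rng.normal(scale=0.5, size=(image_dim, latent_dim))

def generate(z):
    """Map a latent vector z to a synthetic 'image'."""
    return np.tanh(W @ z)

def project_to_latent(x, steps=2000, lr=0.05):
    """Recover a latent code for a real image x by minimizing
    ||G(z) - x||^2 with gradient descent on z (the back-projection step)."""
    z = np.zeros(latent_dim)
    for _ in range(steps):
        out = generate(z)
        err = out - x
        # Chain rule through tanh: dL/dz = 2 W^T (err * (1 - out^2))
        grad = 2.0 * W.T @ (err * (1.0 - out ** 2))
        z -= lr * grad
    return z

def augment(x, label, n_samples=5, sigma=0.05):
    """Perturb the recovered latent code randomly and decode each
    perturbation, yielding synthetic images that inherit x's label."""
    z = project_to_latent(x)
    return [(generate(z + rng.normal(scale=sigma, size=latent_dim)), label)
            for _ in range(n_samples)]

# A 'real' image of known label; sampled from the toy generator itself
# so that an exact latent pre-image is guaranteed to exist.
x_real = generate(rng.normal(size=latent_dim))
synthetic = augment(x_real, label=3)
```

Each pair in `synthetic` is a perturbed decoding carrying the original image's label, which is the labeling mechanism the framework relies on: small moves in latent space are assumed to stay within the same class.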