DeshuffleGAN: Self-supervised learning for generative adversarial networks

Date
2020-07
Authors
Baykal Can, Gülçin
Publisher
Institute of Science and Technology
Abstract
Generative Adversarial Networks (GANs) have attracted the attention of the research community with their performance in high-quality image generation. Since GAN models introduced the idea of a two-player game together with multi-objective and multi-task losses, numerous modifications to the generator and discriminator architectures and to the learning objectives have been proposed. The basic intuition behind these improvements is to increase the quality of the samples produced by the generator network. One way to improve generation performance is to enhance the discriminator network so that it learns expressive features of the real data and feeds that information back to the generator. The original conditional GANs support the discriminator by providing the class label as an input along with the data. Class label information can serve as an additional training signal, or it can be used as a new task for the discriminator in order to increase its representation capacity. The capacity of the discriminator needs to be enhanced so that it learns meaningful features for distinguishing real data from fake data. Because the use of class labels improves discriminator performance, and with it the generation performance of the generator, this information can be beneficial in GAN training. However, since acquiring class labels is expensive in terms of both time and human resources, new ways of creating and incorporating additional information about the data should be considered. Self-supervised learning makes use of pseudo-labels of the data, which are obtained through an automatic process that is computationally light and easy. For example, an image can be rotated by one of four different angles, and the rotation angle can be used as a label for the data.
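The rotation pseudo-label idea can be illustrated with a minimal sketch. It assumes the four rotations are 0, 90, 180 and 270 degrees (the abstract only says "four"), and it represents an image as a plain nested list rather than a tensor. The names `rotate90` and `make_rotation_example` are hypothetical helpers, not taken from the thesis.

```python
import random


def rotate90(image):
    """Rotate a 2D grid (list of rows) by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def make_rotation_example(image, rng=random):
    """Return (rotated_image, label) for a self-supervised rotation task.

    The label k in {0, 1, 2, 3} means the image was rotated by k * 90
    degrees. This angle set is an assumption made for illustration.
    """
    label = rng.randrange(4)
    rotated = image
    for _ in range(label):
        rotated = rotate90(rotated)
    return rotated, label


if __name__ == "__main__":
    toy_image = [[1, 2], [3, 4]]
    rotated, label = make_rotation_example(toy_image)
    print(label, rotated)
```

No human annotation is involved: the label is produced by the same procedure that transforms the image, which is what makes the process cheap.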
Alternatively, the input can be divided into pieces and the pieces can be shuffled. The shuffling order can then be treated as additional information about the data. In this work, we propose a new method called DeshuffleGAN, which assigns the additional task of deshuffling a shuffled image to the discriminator network of the GAN in order to enrich the features learnt by the discriminator. To perform deshuffling, the structural relations among image tiles must be learnt. This implies that the discriminator should learn structurally coherent features of the data. Because the generator tries to trick the discriminator into treating synthesized images as real data, the image generation quality must improve to the point where the discriminator cannot distinguish them even with the learnt structural features. The deshuffling task therefore also pushes the generator network to synthesize structurally coherent images. DeshuffleGAN outperforms the baseline methods demonstrated in this thesis and achieves both numerically and visually better results. We use the Fréchet Inception Distance (FID) as the numerical evaluation metric, where lower FID values imply that the generated data distribution is closer to the real data distribution, which is the desired outcome. We show that DeshuffleGAN achieves lower FID values on datasets such as LSUN-Bedroom and LSUN-Church. We also use the CelebA-HQ and CAT datasets and observe that self-supervision tasks may not always have a significant effect on the generation quality of GANs. We further show the effects of the deshuffling task by employing different GAN architectures, and discuss which kind of discriminator architecture may be more appropriate to couple with a self-supervision task.
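The shuffling pseudo-label described in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the thesis implementation: the 2x2 grid, the use of all 24 tile permutations, the nested-list image representation, and the helper names (`split_into_tiles`, `merge_tiles`, `make_shuffle_example`, `deshuffle`) are all hypothetical choices made for the example.

```python
import itertools
import random


def split_into_tiles(image, grid):
    """Split a square 2D grid into grid * grid tiles in row-major order."""
    size = len(image) // grid
    tiles = []
    for r in range(grid):
        for c in range(grid):
            rows = image[r * size:(r + 1) * size]
            tiles.append([row[c * size:(c + 1) * size] for row in rows])
    return tiles


def merge_tiles(tiles, grid):
    """Reassemble row-major tiles into one 2D grid (inverse of the split)."""
    size = len(tiles[0])
    image = []
    for r in range(grid):
        for y in range(size):
            row = []
            for c in range(grid):
                row.extend(tiles[r * grid + c][y])
            image.append(row)
    return image


def make_shuffle_example(image, permutations, rng=random, grid=2):
    """Return (shuffled_image, label); label indexes the permutation used."""
    label = rng.randrange(len(permutations))
    perm = permutations[label]
    tiles = split_into_tiles(image, grid)
    # Position i of the shuffled image receives original tile perm[i].
    shuffled = [tiles[i] for i in perm]
    return merge_tiles(shuffled, grid), label


def deshuffle(shuffled_image, perm, grid=2):
    """Undo a known permutation by moving each tile back to its source slot."""
    tiles = split_into_tiles(shuffled_image, grid)
    restored = [None] * len(tiles)
    for position, source in enumerate(perm):
        restored[source] = tiles[position]
    return merge_tiles(restored, grid)


if __name__ == "__main__":
    # Assumption: enumerate every ordering of the four tiles of a 2x2 grid.
    perms = list(itertools.permutations(range(4)))
    toy_image = [[r * 4 + c for c in range(4)] for r in range(4)]
    shuffled, label = make_shuffle_example(toy_image, perms)
    assert deshuffle(shuffled, perms[label]) == toy_image
```

In a setup like the one the abstract describes, the discriminator would be trained to predict `label` from the shuffled image alone, which requires it to recognise how tiles fit together. The `deshuffle` function here only demonstrates that the label fully determines the original arrangement.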
Description
Thesis (M.Sc.) -- İstanbul Technical University, Institute of Science and Technology, 2020
Keywords
artificial neural networks, generative adversarial networks, DeshuffleGAN, DeshuffleGAN Model