Unsupervised Representation Learning by Predicting Image Rotations
Spyros Gidaris, Praveer Singh, Nikos Komodakis, arXiv 2018
PDF, SSL By SeonghoonYu August 4th, 2021
Summary
The ConvNet is trained on a 4-way image classification task: recognizing which of four rotations (0°, 90°, 180°, 270°) has been applied to the input image. The task of predicting rotation transformations provides a powerful surrogate supervision signal for feature learning and leads to dramatic improvements on the relevant benchmarks.
The ConvNet must learn to minimize the following loss, defined over the K discrete geometric transformations (here the K = 4 rotations):

$$\text{loss}(X_i, \theta) = -\frac{1}{K}\sum_{y=1}^{K}\log\left(F^{y}\left(g(X_i \mid y) \mid \theta\right)\right)$$

Here K is the number of discrete geometric transformations, $g(\cdot \mid y)$ is the geometric transformation with label $y$, F is the ConvNet, $F^{y}(\cdot)$ is its predicted probability for transformation $y$, and $\theta$ are the parameters of the ConvNet.
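A minimal PyTorch sketch of this pretext task, assuming any 4-way image classifier as the ConvNet (the helper names below are illustrative, not from the paper's code):

```python
import torch
import torch.nn.functional as F


def rotate_batch(images):
    """Create the 4 rotated copies (0, 90, 180, 270 degrees) of a batch of
    images together with their rotation labels 0..3.

    images: tensor of shape (B, C, H, W)
    returns: (rotated, labels) with shapes (4B, C, H, W) and (4B,)
    """
    rotations = [torch.rot90(images, k, dims=(2, 3)) for k in range(4)]
    rotated = torch.cat(rotations, dim=0)
    labels = torch.arange(4).repeat_interleave(images.size(0)).to(images.device)
    return rotated, labels


def rotation_loss(convnet, images):
    """Cross-entropy over the K=4 rotation classes, averaged over all
    rotated copies of the batch (the loss(X_i, theta) above)."""
    rotated, labels = rotate_batch(images)
    logits = convnet(rotated)  # (4B, 4) rotation scores F^y(.)
    return F.cross_entropy(logits, labels)


# Usage sketch: any ConvNet with a 4-way output head works here.
# convnet = MyConvNet(num_classes=4)   # hypothetical model
# loss = rotation_loss(convnet, batch_of_images)
# loss.backward()
```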
Experiment
We compare the attention maps of RotationNet with the attention maps generated by a model trained on the object recognition task in a supervised way. We observe that both models seem to focus on roughly the same image regions.
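For context, one common way to produce such attention maps is to aggregate the magnitudes of a conv layer's activations across channels; the sketch below assumes that approach, which may differ in detail from the paper's exact procedure:

```python
import torch


def attention_map(feature_map, p=2):
    """Aggregate absolute activations across channels (raised to a power p)
    and normalize each map to [0, 1].

    feature_map: tensor of shape (B, C, H, W) from some conv layer
    returns: tensor of shape (B, H, W)
    """
    amap = feature_map.abs().pow(p).sum(dim=1)  # (B, H, W)
    flat = amap.view(amap.size(0), -1)
    flat = flat - flat.min(dim=1, keepdim=True).values
    flat = flat / (flat.max(dim=1, keepdim=True).values + 1e-8)
    return flat.view_as(amap)
```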
Ablation study on the number of recognized rotations.
Comparison with other methods
What I like about the paper
- A simple way to learn useful features for downstream tasks in an unsupervised fashion.
- The task of recognizing rotation transformations yields features whose downstream performance is competitive with a supervised model.
My GitHub about what I read