Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset Joao Carreira, Andrew Zisserman, arXiv 2017 PDF, VD By SeonghoonYu July 17th, 2021 Summary They achive SOTA performence in video action recognition using two method. (1) Apply ImageNet pre-trained 2D Conv model to 3D Conv model for the video classification by repeating the weights of the 2D filters N times along the time dimensi..