
[Paper Review] STM(2019), Spatio Temporal and Motion Encoding for Action Recognition

AI 꿈나무 2021. 8. 3. 01:43

STM: Spatio Temporal and Motion Encoding for Action Recognition

Boyuan Jiang, MengMeng Wang, Weihao Gan, arXiv 2019

 

PDF, Video By SeonghoonYu August 3rd, 2021

 

Summary

 

 STM consists of the Channel-wise SpatioTemporal Module (CSTM) and the Channel-wise Motion Module (CMM). CSTM encodes spatiotemporal features from different timestamps, and CMM encodes motion features between neighboring frames. The STM block assembles the two modules so that the information encoded by each is combined.

 

 STM blocks can be inserted into existing ResNet architectures simply by replacing the original residual blocks, which yields the STM network.
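To make the summary above more concrete, here is a minimal PyTorch sketch of how such a block could look. It is not the authors' code. The kernel sizes, the `reduction` and `bottleneck` ratios, fusing the two branches by summation, zero-padding the last frame's motion feature, and all class and parameter names are assumptions made for illustration. Frames are assumed to be stacked along the batch axis, so the input has shape (N*T, C, H, W).

```python
import torch
import torch.nn as nn


class CSTM(nn.Module):
    """Sketch of a channel-wise spatiotemporal module:
    a per-channel 1D conv over time followed by a 2D conv over space."""

    def __init__(self, channels, num_frames):
        super().__init__()
        self.num_frames = num_frames
        # groups=channels makes the temporal convolution channel-wise.
        self.temporal = nn.Conv1d(channels, channels, kernel_size=3,
                                  padding=1, groups=channels, bias=False)
        self.spatial = nn.Conv2d(channels, channels, kernel_size=3,
                                 padding=1, bias=False)

    def forward(self, x):  # x: (N*T, C, H, W)
        nt, c, h, w = x.shape
        t = self.num_frames
        n = nt // t
        # (N*T, C, H, W) -> (N, H, W, C, T) -> (N*H*W, C, T)
        y = x.reshape(n, t, c, h, w).permute(0, 3, 4, 2, 1)
        y = self.temporal(y.reshape(n * h * w, c, t))
        # (N*H*W, C, T) -> (N, T, C, H, W) -> (N*T, C, H, W)
        y = y.reshape(n, h, w, c, t).permute(0, 4, 3, 1, 2)
        return self.spatial(y.reshape(nt, c, h, w))


class CMM(nn.Module):
    """Sketch of a channel-wise motion module:
    feature differences between neighboring frames."""

    def __init__(self, channels, num_frames, reduction=16):
        super().__init__()
        self.num_frames = num_frames
        mid = max(channels // reduction, 1)  # assumed reduction ratio
        self.reduce = nn.Conv2d(channels, mid, kernel_size=1, bias=False)
        self.transform = nn.Conv2d(mid, mid, kernel_size=3, padding=1,
                                   groups=mid, bias=False)
        self.expand = nn.Conv2d(mid, channels, kernel_size=1, bias=False)

    def forward(self, x):  # x: (N*T, C, H, W)
        nt, _, h, w = x.shape
        t = self.num_frames
        n = nt // t
        y = self.reduce(x)
        mid = y.shape[1]
        cur = y.reshape(n, t, mid, h, w)
        nxt = self.transform(y).reshape(n, t, mid, h, w)
        # Motion at frame i: transformed frame i+1 minus frame i.
        diff = nxt[:, 1:] - cur[:, :-1]
        # The last frame has no successor, so pad with zeros (assumption).
        pad = torch.zeros_like(cur[:, :1])
        motion = torch.cat([diff, pad], dim=1).reshape(nt, mid, h, w)
        return self.expand(motion)


class STMBlock(nn.Module):
    """Sketch of a residual bottleneck whose middle conv is replaced
    by CSTM and CMM running in parallel."""

    def __init__(self, channels, num_frames, bottleneck=4):
        super().__init__()
        mid = channels // bottleneck
        self.conv1 = nn.Sequential(
            nn.Conv2d(channels, mid, kernel_size=1, bias=False),
            nn.BatchNorm2d(mid), nn.ReLU())
        self.cstm = CSTM(mid, num_frames)
        self.cmm = CMM(mid, num_frames)
        self.conv2 = nn.Sequential(
            nn.Conv2d(mid, channels, kernel_size=1, bias=False),
            nn.BatchNorm2d(channels))
        self.relu = nn.ReLU()

    def forward(self, x):
        y = self.conv1(x)
        y = self.cstm(y) + self.cmm(y)  # fuse both branches by summation
        return self.relu(x + self.conv2(y))  # residual connection
```

Because the block keeps the input shape, something like `STMBlock(64, num_frames=8)` applied to a tensor of shape (16, 64, 14, 14) returns a tensor of the same shape. That shape preservation is what would let it stand in for a residual block of a 2D ResNet.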

 

 

Experiment

 

 

What I like about the paper

  • It encodes spatiotemporal and motion features together in a unified 2D CNN.
  • The architecture is simple: replacing the original residual blocks in ResNet with STM blocks builds the STM network.
  • Spatiotemporal features and motion features are both important in video, and the paper integrates the two in a single implementation!

My GitHub repository of the papers I have read

 

Seonghoon-Yu/Paper_Review_and_Implementation_in_PyTorch

I review papers and re-implement them in PyTorch for study purposes.

github.com

 
