Posts tagged 'Is Space-Time Attention All You Need for Video Understanding?'

[Paper Reading] TimeSformer (2021), Is Space-Time Attention All You Need for Video Understanding?

Is Space-Time Attention All You Need for Video Understanding? PDF, Video TF, Gedas Bertasius, Heng Wang, Lorenzo Torresani, ICML 2021. Summary: This paper applies the Transformer to the video domain. A video can be viewed as sequential data, much like a sentence, because frames follow one another just as words do. Replacing convolution with self-attention can mitigate the inductive bias problem inherent in convolution. Convolution is effective on small datasets, but when data is abundant, being restricted to local regions..

Paper Reading/Video Recognition 2021.09.10

딥러닝 공부방 (Deep Learning Study Room)

