
paper review 15

[Paper Review] CeiT(2021), Incorporating Convolution Designs into Visual Transformers

Incorporating Convolution Designs into Visual Transformers Kun Yuan, Shaopeng Guo, Ziwei Liu, Aojun Zhou, Fengwei Yu, Wei Wu, arXiv 2021 PDF, Transformer By SeonghoonYu August 5th, 2021 Summary CeiT is an architecture that combines the advantages of CNNs (extracting low-level features and strengthening locality) with the advantages of Transformers (establishing long-range dependencies). ViT has two p..

[Paper Review] Rotation(2018), Unsupervised Representation Learning by Predicting Image Rotations

Unsupervised Representation Learning by Predicting Image Rotations Spyros Gidaris, Praveer Singh, Nikos Komodakis, arXiv 2018 PDF, SSL By SeonghoonYu August 4th, 2021 Summary The ConvNet is trained on a 4-way image classification task: recognizing which of four rotations (0, 90, 180, or 270 degrees) was applied to the image. The task of predicting rotation transformations provides a powerful surrogate supervision signal..

[Paper Reading] Centroid Transformer(2021)

Centroid Transformers: Learning to Abstract with Attention Lemeng Wu, Xingchao Liu, Qiang Liu, arXiv 2021 PDF, Transformer By SeonghoonYu August 2nd, 2021 Summary The Centroid Transformer summarizes N inputs into M elements. In doing so, it discards unnecessary information and reduces the Transformer's computational complexity to O(MN). The M elements can be thought of as clustering centroids, and the core idea is how these M elements are selected. To select the M centroids, the similarity between the input x and the centroids is measured and a loss function is designed..

[Paper Review] ACT(2020), End-to-End Object Detection with Adaptive Clustering Transformer

End-to-End Object Detection with Adaptive Clustering Transformer Minghang Zheng, Peng Gao, Xiaogang Wang, Hongsheng Li, Hao Dong, arXiv 2020 PDF, Object Detection By SeonghoonYu July 31st, 2021 Summary This paper reduces the computational complexity of DETR by replacing the self-attention module in DETR with ACT (Adaptive Clustering Transformer). They also present MTKD (Multi-Task Knowledge Distillati..

[Paper Review] Set Transformer(2018), A Framework for Attention-based Permutation-Invariant Neural Networks

Set Transformer, A Framework for Attention-based Permutation-Invariant Neural Networks Juho Lee, Yoonho Lee, Jungtaek Kim, arXiv 2018 PDF, Transformer By SeonghoonYu July 25th, 2021 Summary Set Transformer is a permutation-invariant function of the elements in a set: its output does not depend on the order in which the elements are given. The model consists of an encoder and a decoder, both of which rely on attention. This model lea..

[Paper Review] TSM(2018), Temporal Shift Module for Efficient Video Understanding

TSM: Temporal Shift Module for Efficient Video Understanding Ji Lin, Chuang Gan, Song Han, arXiv 2018 PDF, Video By SeonghoonYu July 23rd, 2021 Summary This paper presents a video model based on 2D convolutions, built around TSM (Temporal Shift Module). TSM can be inserted into 2D CNNs to achieve temporal modeling with zero extra computation and zero extra parameters. TSM shifts the channels along the temporal dimension both forwar..

[Paper Review] Deep InfoMax(2018), Learning Deep Representations by Mutual Information Estimation and Maximization

Learning Deep Representations by Mutual Information Estimation and Maximization R Devon Hjelm, Alex Fedorov, Samuel Lavoie-Marchildon, arXiv 2018 PDF, SSL By SeonghoonYu July 21st, 2021 Summary This paper updates the model's parameters by maximizing the mutual information between intermediate feature maps and the flattened last feature maps obtained from a ConvNet. To do this, they use Jensen-Shannon divergence(..

[Paper review] Deep Clustering for Unsupervised Learning of Visual Features(2018)

Deep Clustering for Unsupervised Learning of Visual Features Mathilde Caron, Piotr Bojanowski, Armand Joulin, Matthijs Douze, arXiv 2018 PDF, Self Supervised Learning By SeonghoonYu July 15th, 2021 Summary This paper presents a clustering-based self-supervised learning method that works in an offline fashion. The model jointly learns the parameters of a neural network and the cluster assignments of the resulting feature..

[Paper review] SlowFast Networks for Video Recognition(2018)

SlowFast Networks for Video Recognition Christoph Feichtenhofer, Haoqi Fan, Jitendra Malik, Kaiming He, arXiv 2018 PDF, Video By SeonghoonYu July 20th, 2021 Summary They present a two-pathway SlowFast model for video recognition. The two pathways work separately at low and high temporal resolutions. (1) One is the Slow pathway, designed to capture semantic information that can be given by a few sparse f..

[Paper review] SwAV(2020), Unsupervised Learning of Visual Features by Contrasting Cluster Assignments

Unsupervised Learning of Visual Features by Contrasting Cluster Assignments Mathilde Caron, Ishan Misra, Julien Mairal, Priya Goyal, Piotr Bojanowski, Armand Joulin, arXiv 2020 PDF, Self-Supervised Learning By SeonghoonYu July 19th, 2021 Summary This paper proposes a clustering-based self-supervised method that learns visual features in an online fashion without supervision. Typical clusterin..
