

[Paper review] Mean teachers are better role models(2017)

Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results Antti Tarvainen, Harri Valpola, arXiv 2017 PDF, Semi-Supervised Learning By SeonghoonYu July 18th, 2021 Summary The previous best-performing model in semi-supervised learning, Temporal Ensembling, has a problem: since each target is updated only once per epoch, the learned inf..
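The mechanism this post summarizes is simple to state in code: the teacher's weights are an exponential moving average (EMA) of the student's, updated at every training step rather than once per epoch. A minimal PyTorch sketch, assuming two identically structured models; the function name and decay value `alpha` are illustrative:

```python
import torch

def update_teacher(student, teacher, alpha=0.99):
    """Mean Teacher EMA update (sketch): teacher weights trail the
    student's. Unlike Temporal Ensembling, this runs every training
    step, not once per epoch. alpha is an illustrative decay value."""
    with torch.no_grad():
        for t_param, s_param in zip(teacher.parameters(), student.parameters()):
            t_param.mul_(alpha).add_(s_param, alpha=1 - alpha)
```

The consistency loss is then taken between student and teacher predictions on the same (differently augmented) input.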

[Paper review] Temporal Ensembling for Semi-Supervised Learning(2016)

Temporal Ensembling for Semi-Supervised Learning Samuli Laine, Timo Aila, arXiv 2016 PDF, Semi-Supervised Learning By SeonghoonYu July 18th, 2021 Summary They propose the $\Pi$-model and Temporal Ensembling in a semi-supervised learning setting where only a small portion of the training data is labeled. During training, the $\Pi$-model evaluates each training input $x_i$ twice, resulting in prediction vectors $z_i$..
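Since the excerpt cuts off, a sketch of how the two evaluations of $x_i$ are typically combined into a loss may help. This is a PyTorch approximation, not the paper's code; the ramp-up weight `w` and the softmax placement are assumptions of this sketch:

```python
import torch
import torch.nn.functional as F

def pi_model_loss(model, x, y, labeled_mask, w):
    """Pi-model training loss (sketch). Two stochastic forward passes
    give z and z_tilde; the consistency term pulls them together on ALL
    inputs, while cross-entropy applies only to labeled ones. Assumes
    the model has stochastic layers (e.g. dropout) active in train
    mode, and/or x is augmented separately per pass."""
    z = model(x)        # first evaluation of each input x_i
    z_tilde = model(x)  # second evaluation under different noise
    consistency = F.mse_loss(F.softmax(z, dim=1), F.softmax(z_tilde, dim=1))
    supervised = F.cross_entropy(z[labeled_mask], y[labeled_mask])
    return supervised + w * consistency
```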

[Paper review] Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset(2017)

Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset Joao Carreira, Andrew Zisserman, arXiv 2017 PDF, VD By SeonghoonYu July 17th, 2021 Summary They achieve SOTA performance in video action recognition using two methods. (1) Inflate an ImageNet pre-trained 2D Conv model into a 3D Conv model for video classification by repeating the weights of the 2D filters N times along the time dimensi..
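Point (1), the weight inflation, is mechanical enough to show directly. A minimal sketch operating on a plain PyTorch weight tensor; the function name is illustrative:

```python
import torch

def inflate_conv2d_weight(w2d, time_dim):
    """I3D-style filter inflation (sketch). w2d: (out_c, in_c, kH, kW).
    The 2D weights are repeated time_dim times along a new temporal
    axis and divided by time_dim, so the 3D filter's response on a
    'boring' video of repeated frames matches the original 2D filter's
    response on a single frame."""
    w3d = w2d.unsqueeze(2).repeat(1, 1, time_dim, 1, 1)  # (out_c, in_c, T, kH, kW)
    return w3d / time_dim
```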

[Paper review] BYOL(2020), Bootstrap Your Own Latent A New Approach to Self-Supervised Learning

Bootstrap Your Own Latent A New Approach to Self-Supervised Learning Jean-Bastien Grill, Florian Strub, Florent Altché, Corentin Tallec, Pierre H. Richemond, arXiv 2020 PDF, score [8/10], SSL By SeonghoonYu July 16th, 2021 Summary They suggest a new approach to self-supervised learning. (1) uses two networks, referred to as the online and target networks, and updates the target network with a slow-moving av..
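The two pieces named in the excerpt, the slow-moving-average target update and the online-predicts-target objective, can be sketched compactly in PyTorch. This assumes both networks are already built; `tau` is treated here as an illustrative default:

```python
import torch
import torch.nn.functional as F

def byol_loss(online_pred, target_proj):
    """BYOL regression loss (sketch): normalized MSE between the online
    network's prediction and the target network's projection. The
    target branch gets no gradient (stop-gradient via detach)."""
    p = F.normalize(online_pred, dim=-1)
    z = F.normalize(target_proj.detach(), dim=-1)
    return 2 - 2 * (p * z).sum(dim=-1).mean()

def update_target(online, target, tau=0.996):
    """Slow-moving average of the online weights into the target."""
    with torch.no_grad():
        for t, o in zip(target.parameters(), online.parameters()):
            t.mul_(tau).add_(o, alpha=1 - tau)
```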

[Paper review] Understanding the Behaviour of Contrastive Loss(2020)

Understanding the Behaviour of Contrastive Loss Feng Wang, Huaping Liu, arXiv 2020 PDF, Self-Supervised Learning By SeonghoonYu July 15th, 2021 Summary There exists a uniformity-tolerance dilemma in unsupervised contrastive learning, and the temperature plays a key role in controlling the local separation and global uniformity of the embedding distribution. So the choice of temperature is important ..
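To make concrete where the temperature enters, here is a generic InfoNCE-style loss in PyTorch; the batch layout (rows $i$ and $i+N$ are two views of the same image) and the value 0.1 are assumptions of this sketch, not details from the paper:

```python
import torch
import torch.nn.functional as F

def info_nce(features, temperature=0.1):
    """Contrastive loss (sketch) showing the role of temperature.
    features: (2N, d). A small temperature sharpens the similarity
    distribution and punishes the hardest negatives most (pushing
    toward uniformity); a larger one is more tolerant of semantically
    similar negatives."""
    z = F.normalize(features, dim=1)
    n2 = z.size(0)                           # 2N
    sim = z @ z.t() / temperature            # temperature-scaled cosine similarities
    eye = torch.eye(n2, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(eye, float('-inf'))              # exclude self-similarity
    targets = (torch.arange(n2, device=z.device) + n2 // 2) % n2  # the other view
    return F.cross_entropy(sim, targets)
```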

[Paper review] (2014) Learning Spatiotemporal Features with 3D Convolutional Networks

Hello, today's paper is Learning Spatiotemporal Features with 3D Convolutional Networks. One-line summary: It achieves SOTA performance on video tasks using 3D convolution and 3D pooling. Motivation: The goal is to develop an effective video descriptor that satisfies four properties: (1) generic, (2) compact, (3) efficient, (4) simple. Contributions: (1) 3D Conv learns good features by capturing appearance and motion simultaneously. (2) The 3x3x3 Conv architecture is experimentally found to work well. (3) On four tasks and ..
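Contribution (2) is easy to render as a module. A minimal C3D-style stem in PyTorch with illustrative channel sizes; input is (batch, channels, time, height, width):

```python
import torch.nn as nn

# Minimal C3D-style stem (sketch): every convolution is 3x3x3, the
# homogeneous kernel size the paper found to work well. Channel sizes
# are illustrative. The first pooling is spatial-only (1x2x2) so early
# temporal information is preserved, as in C3D.
c3d_stem = nn.Sequential(
    nn.Conv3d(3, 64, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    nn.MaxPool3d(kernel_size=(1, 2, 2)),
    nn.Conv3d(64, 128, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    nn.MaxPool3d(kernel_size=2),
)
```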

[Paper review] AdamW(2017), Decoupled Weight Decay Regularization

Hello, today's paper is AdamW (2017), Decoupled Weight Decay Regularization. Key points: Weight decay can be implemented by adding L2 regularization to the loss function, and deep learning libraries implement it this way inside their optimizers. For SGD, weight decay = L2 regularization holds, but Adam adapts the learning rate per parameter, so implementing weight decay through L2 regularization is no longer equivalent and hurts performance. To solve this, the paper decouples weight decay and implements it separately. Motivation: When testing across several tasks, SGD with moment..
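The difference between the two implementations is clearest side by side. A sketch of single-parameter update steps, assuming `adam_update` (Adam's $\hat{m}/(\sqrt{\hat{v}}+\epsilon)$ term) is computed elsewhere:

```python
import torch

def sgd_l2_step(p, grad, lr, wd):
    """L2 regularization folded into the gradient: for plain SGD this
    is exactly equivalent to weight decay."""
    p.data.add_(grad + wd * p.data, alpha=-lr)

def adamw_step(p, adam_update, lr, wd):
    """Decoupled weight decay (AdamW, sketch): the decay acts directly
    on the weights and never passes through Adam's per-parameter
    adaptive scaling, which is what breaks the equivalence above."""
    p.data.add_(adam_update, alpha=-lr)  # Adam step, decay NOT in the gradient
    p.data.mul_(1 - lr * wd)             # decay applied to the weights themselves
```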

[Paper review] Non-local Neural Networks(2017)

Hello, today's paper is Non-local Neural Networks. Capturing long-range dependencies is very important in deep neural networks. For example, language models use LSTMs to capture long-range dependencies, and for image data, convolutional layers are stacked to enlarge the receptive field and capture long-range dependencies. Convolution and recurrent operations compute over a local neighborhood in space or time, and these local operations are applied repeatedly, which causes the following problems ..
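The excerpt breaks off before the paper's fix, but the non-local operation itself can be sketched: every position attends to every other position in a single step. An embedded-Gaussian variant in PyTorch with illustrative shapes and channel reduction:

```python
import torch
import torch.nn as nn

class NonLocalBlock2D(nn.Module):
    """Embedded-Gaussian non-local block (sketch). One block lets every
    position interact with every other, capturing long-range
    dependencies that stacked local convolutions need many layers to
    reach."""
    def __init__(self, c, inter_c=None):
        super().__init__()
        inter_c = inter_c or c // 2
        self.theta = nn.Conv2d(c, inter_c, 1)
        self.phi = nn.Conv2d(c, inter_c, 1)
        self.g = nn.Conv2d(c, inter_c, 1)
        self.out = nn.Conv2d(inter_c, c, 1)

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.theta(x).flatten(2).transpose(1, 2)  # (b, hw, inter_c)
        k = self.phi(x).flatten(2)                    # (b, inter_c, hw)
        v = self.g(x).flatten(2).transpose(1, 2)      # (b, hw, inter_c)
        attn = torch.softmax(q @ k, dim=-1)           # (b, hw, hw) pairwise weights
        y = (attn @ v).transpose(1, 2).reshape(b, -1, h, w)
        return x + self.out(y)                        # residual connection
```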
