Open-Vocabulary Object Detection via Vision and Language Knowledge Distillation (ViLD)
https://arxiv.org/abs/2104.13921
We aim at advancing open-vocabulary object detection, which detects objects described by arbitrary text inputs. The fundamental challenge is the availability of training data. Existing object detecti..