We use Hao Tan's Detectron2 implementation of 'Bottom-up feature extractor', which is compatible with the original Caffe implementation.
Following LXMERT, we use the feature extractor variant that outputs a fixed 36 boxes per image. We store the extracted features in HDF5 format.
Please follow the original installation guide.
- Run sp_prpoposal.py: extract features from 36 detected boxes.
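
To illustrate the HDF5 storage mentioned above, here is a minimal sketch of writing and reading one image's features with h5py. It is not the layout used by the extraction script: the group and dataset names (`features`, `boxes`), the attributes `img_w` and `img_h`, the helper functions, and the feature dimension of 2048 are all assumptions made for illustration. Only the count of 36 boxes per image comes from the description above.

```python
import os
import tempfile

import h5py
import numpy as np

NUM_BOXES = 36   # boxes per image, as described above
FEAT_DIM = 2048  # assumption: feature size per box


def write_features(path, image_id, features, boxes, img_w, img_h):
    """Append one image's features to an HDF5 file (hypothetical layout)."""
    assert features.shape == (NUM_BOXES, FEAT_DIM)
    assert boxes.shape == (NUM_BOXES, 4)
    # Mode "a" creates the file if missing and keeps existing groups.
    with h5py.File(path, "a") as f:
        grp = f.create_group(image_id)  # one group per image
        grp.create_dataset("features", data=features.astype(np.float32))
        grp.create_dataset("boxes", data=boxes.astype(np.float32))
        grp.attrs["img_w"] = img_w
        grp.attrs["img_h"] = img_h


def read_features(path, image_id):
    """Load the arrays and image size stored for one image."""
    with h5py.File(path, "r") as f:
        grp = f[image_id]
        features = grp["features"][()]  # [()] reads the full dataset
        boxes = grp["boxes"][()]
        return features, boxes, int(grp.attrs["img_w"]), int(grp.attrs["img_h"])


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "demo_features.h5")
    # Random placeholder data standing in for real extractor output.
    feats = np.random.rand(NUM_BOXES, FEAT_DIM)
    boxes = np.random.rand(NUM_BOXES, 4)
    write_features(path, "img_0001", feats, boxes, img_w=640, img_h=480)
    loaded_feats, loaded_boxes, w, h = read_features(path, "img_0001")
    print(loaded_feats.shape, loaded_boxes.shape, w, h)
```

Keeping one group per image lets a data loader fetch a single image's features by ID without reading the whole file, which is one common reason to prefer HDF5 over per-image files.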