Self-Supervised Relationship Probing
Authors: Jiuxiang Gu, Jason Kuen, Shafiq Joty, Jianfei Cai, Vlad Morariu, Handong Zhao, Tong Sun
NeurIPS 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive experiments to demonstrate that our method can benefit both vision and VL understanding tasks. |
| Researcher Affiliation | Collaboration | 1Adobe Research, 2Nanyang Technological University, 3Monash University |
| Pseudocode | No | The paper describes the model architecture and learning process using text and mathematical equations, but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide an explicit statement about releasing source code or a link to a code repository for the described methodology. |
| Open Datasets | Yes | In contrast, we only aggregate pretraining data from the train (113k) and validation (5k) splits of MSCOCO [58]. |
| Dataset Splits | Yes | In contrast, we only aggregate pretraining data from the train (113k) and validation (5k) splits of MSCOCO [58]. |
| Hardware Specification | Yes | The training is carried out with four Tesla V100 GPUs with a batch size of 128 for 10 epochs. |
| Software Dependencies | No | The paper mentions several software components, such as "Faster-RCNN [46]", the "WordPiece tokenizer [47]", the "Adam optimizer [62]", and "Stanza [49]", but does not provide version numbers for these dependencies, which are needed for full reproducibility. |
| Experiment Setup | Yes | We set the numbers of layers for the intra-modality encoders f^SS_Intra and f^VV_Intra to 9 and 5, respectively, and the number of layers for the inter-modality encoders f^VS_Inter, f^SV_Inter, and f^VS_Inter to 5. For each transformer block, we set its hidden size to 768 and the number of heads to 12. To keep the sizes the same for the relationship matrices, the maximum numbers of words and objects are equally set to 36. ... At each iteration, we randomly mask input words and RoIs with a probability of 0.15. ... We use Adam optimizer [62] with a linear learning-rate schedule [13] and a peak learning rate of 1e-4. The training is carried out with four Tesla V100 GPUs with a batch size of 128 for 10 epochs. ... All variants of SSRP are trained for 30 epochs with Adam, a batch size of 512, and a learning rate of 5e-5. |
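To make the quoted setup concrete, the sketch below collects the reported hyperparameters into a single configuration object and illustrates the 0.15 random-masking step and the Adam-with-linear-schedule choice. Since the paper does not release code, every name here (`SSRPConfig`, `random_mask`, `mask_id`, the placeholder model, the step count) is an illustrative assumption rather than the authors' implementation, and the masking and schedule details are simplified stand-ins for what the paper describes.

```python
from dataclasses import dataclass
from typing import Tuple

import torch


@dataclass
class SSRPConfig:
    # Transformer sizes quoted in the "Experiment Setup" row.
    num_intra_text_layers: int = 9     # intra-modality encoder over sentences
    num_intra_visual_layers: int = 5   # intra-modality encoder over RoIs
    num_inter_layers: int = 5          # each inter-modality encoder
    hidden_size: int = 768
    num_attention_heads: int = 12
    max_words: int = 36                # equal word/object counts keep the
    max_objects: int = 36              # relationship matrices the same size
    mask_prob: float = 0.15            # masking probability for words and RoIs
    # Pretraining optimization (4x Tesla V100, Adam, linear LR schedule).
    pretrain_peak_lr: float = 1e-4
    pretrain_batch_size: int = 128
    pretrain_epochs: int = 10
    # Training of the SSRP variants.
    finetune_lr: float = 5e-5
    finetune_batch_size: int = 512
    finetune_epochs: int = 30


def random_mask(token_ids: torch.Tensor,
                mask_prob: float = 0.15,
                mask_id: int = 103) -> Tuple[torch.Tensor, torch.Tensor]:
    """Randomly replace a fraction of positions with a mask id.

    Simplified stand-in for the paper's masking of input words and RoIs;
    the actual procedure may use BERT-style 80/10/10 replacement.
    """
    mask = torch.rand(token_ids.shape) < mask_prob
    masked = token_ids.masked_fill(mask, mask_id)
    return masked, mask


if __name__ == "__main__":
    cfg = SSRPConfig()

    # Mask a dummy batch of word ids at the reported probability.
    ids = torch.randint(1000, 30000, (2, cfg.max_words))
    masked_ids, mask = random_mask(ids, cfg.mask_prob)
    print(f"masked {mask.float().mean():.1%} of positions")

    # Adam with a linearly decaying learning rate (decay-only sketch;
    # the paper's schedule peaks at 1e-4 and may include warmup).
    model = torch.nn.Linear(cfg.hidden_size, cfg.hidden_size)  # placeholder
    optimizer = torch.optim.Adam(model.parameters(), lr=cfg.pretrain_peak_lr)
    total_steps = 10_000  # illustrative step count
    scheduler = torch.optim.lr_scheduler.LambdaLR(
        optimizer, lambda step: max(0.0, 1.0 - step / total_steps))
```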