Learning to Generate an Unbiased Scene Graph by Using Attribute-Guided Predicate Features
Authors: Lei Wang, Zejian Yuan, Badong Chen
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The empirical results show that our method is substantially improved on all benchmarks and achieves new state-of-the-art performance for unbiased scene graph generation. |
| Researcher Affiliation | Academia | Institute of Artificial Intelligence and Robotics, Xi'an Jiaotong University, Xi'an, China. leiwangmail@stu.xjtu.edu.cn, {yuan.ze.jian, chenbd}@mail.xjtu.edu.cn |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is available at https://github.com/wanglei0618/A-PFG. |
| Open Datasets | Yes | Following previous works (Zellers et al. 2018; Tang et al. 2019; Yu et al. 2020; Li et al. 2021), the proposed method and recent methods are evaluated on the widely used subset of Visual Genome dataset (i.e., VG150) (Krishna et al. 2017) |
| Dataset Splits | Yes | Then, we divide it into 70% training set, 30% testing set, and 5k images selected from the training set for validation. |
| Hardware Specification | Yes | The PFRL is implemented on two NVIDIA 3090 GPUs with batch size 16 and learning rate 0.001 |
| Software Dependencies | No | The paper mentions using a pre-trained Faster R-CNN and a pre-trained GloVe language model, but does not provide specific version numbers for underlying software libraries such as Python, PyTorch, or TensorFlow. |
| Experiment Setup | Yes | For the PFRL model, the numbers of object encoder layers and predicate encoder layers are 4 and 2, respectively, and the dimension of the predicate feature is 1024. For classifier fine-tuning, the number of instances for each predicate class Np is 5000, and the number of background features Nb is 5×10^6. For the A-PFG model, the encoders and decoders are 3-layer fully-connected networks, with each layer followed by the LeakyReLU activation function; the dimension of the predicate attribute embedding is 1024, and the dimensions of the latent variables zr and za are 256. The hyperparameter γ is 1, and β and δ are increased by 0.5 per epoch. The PFRL is implemented on two NVIDIA 3090 GPUs with batch size 16 and learning rate 0.001, and the classifier is fine-tuned with batch size 16 and learning rate 2×10^-6. The A-PFG model is trained for 200 epochs with batch size 64 and learning rate 2×10^-4. |
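For readability, the hyperparameters quoted in the Experiment Setup row can be gathered into a single configuration sketch. This is a hedged summary, not the authors' code: all key names are illustrative, and only the values come from the paper's reported setup.

```python
# Hedged sketch of the reported training configuration for A-PFG.
# Key names are illustrative assumptions; values are taken from the paper.

PFRL_CONFIG = {
    "object_encoder_layers": 4,
    "predicate_encoder_layers": 2,
    "predicate_feature_dim": 1024,
    "batch_size": 16,
    "learning_rate": 1e-3,
    "num_gpus": 2,  # two NVIDIA 3090 GPUs
}

CLASSIFIER_FINETUNE_CONFIG = {
    "instances_per_predicate": 5000,   # N_p
    "background_features": 5 * 10**6,  # N_b
    "batch_size": 16,
    "learning_rate": 2e-6,
}

APFG_CONFIG = {
    "encoder_decoder_layers": 3,   # fully-connected, LeakyReLU after each layer
    "attribute_embedding_dim": 1024,
    "latent_dim": 256,             # for both z_r and z_a
    "gamma": 1.0,
    "beta_delta_step_per_epoch": 0.5,  # β and δ increase by 0.5 each epoch
    "epochs": 200,
    "batch_size": 64,
    "learning_rate": 2e-4,
}
```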