Refactoring Policy for Compositional Generalizability using Self-Supervised Object Proposals

Authors: Tongzhou Mu, Jiayuan Gu, Zhiwei Jia, Hao Tang, Hao Su

NeurIPS 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirically, we evaluate our approach on four difficult tasks that require compositional generalizability, and achieve superior performance compared to baselines.
Researcher Affiliation | Academia | 1 University of California, San Diego; 2 Shanghai Jiao Tong University
Pseudocode | No | The paper describes its methods in text but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | No | Project website: https://jiayuan-gu.github.io/policy-refactorization.
Open Datasets | Yes | We start from evaluating the basic units, the SPACE object detector and the object-centric GNN, on Multi-MNIST. After the units are verified, we evaluate the effectiveness of our framework for two types of compositional generalizability: w.r.t. the change of object quantity (Falling Digit), and w.r.t. the change of background (Big Fish). Finally, we show that there exist environments, e.g., Pacman, in which a generalizable student policy does not have to be an object-centric GNN. and The training set consists of 60000 images and each image has 1 to 3 MNIST digits, while the test set consists of 10000 images with 4 MNIST digits. and [8] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei. ImageNet: A Large-Scale Hierarchical Image Database. In CVPR, 2009. and [19] Alex Krizhevsky and Geoffrey Hinton. Learning multiple layers of features from tiny images. Technical report, Citeseer, 2009.
Dataset Splits | No | The paper describes training and test sets for its experiments (e.g., 'The training set consists of 60000 images... while the test set consists of 10000 images...'), but does not explicitly mention a validation set or a specific split for one.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models, memory, or specific cloud instance types used for running experiments.
Software Dependencies | No | The paper mentions algorithms and architectures (e.g., 'DQN [24]', 'PointNet [26]'), but does not list specific software dependencies with version numbers, such as Python or deep learning frameworks.
Experiment Setup | Yes | In this task, we train all the baselines in a supervised learning manner... The node input is a patch cropped from the image according to the bounding box of the corresponding object and resized to 16×16. Then we use a CNN to encode node features, and apply a global-add-pooling to readout a global feature over all the nodes, followed by an MLP to predict the summation. And the policy GNN is implemented as PointNet [26]. ... For our framework, we first train a teacher policy by DQN [24] in the training environment... The architecture of the teacher policy is RelationNet [39]. ... we use a complete graph as the object-centric graph. The node input includes the bounding box position and a patch cropped from the image according to the bounding box, which is resized to 16×16. The policy GNN is implemented as EdgeConv [36]. ... we use PPO [30] to train a CNN-based policy network. ... trained by PPO for 200M frames.
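The Multi-MNIST setup quoted above (a shared CNN encoding each 16×16 object patch, a global add-pooling readout over all nodes, and an MLP predicting the digit sum) can be illustrated with a minimal sketch. This is not the authors' released code: the layer sizes, single-channel patches, and the `PatchSumNet` name are assumptions made purely for illustration of the described readout.

```python
# Minimal sketch (assumed details, not the paper's implementation) of the
# patch-encoder + global-add-pooling + MLP readout described in the
# Experiment Setup row, for the Multi-MNIST digit-summation task.
import torch
import torch.nn as nn


class PatchSumNet(nn.Module):
    def __init__(self, feat_dim: int = 64):
        super().__init__()
        # Shared CNN node encoder for 1x16x16 object patches (sizes are assumptions).
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                       # -> 16 x 8 x 8
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                       # -> 32 x 4 x 4
            nn.Flatten(),
            nn.Linear(32 * 4 * 4, feat_dim), nn.ReLU(),
        )
        # MLP head applied to the pooled graph-level feature.
        self.head = nn.Sequential(
            nn.Linear(feat_dim, 64), nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, patches: torch.Tensor) -> torch.Tensor:
        # patches: (num_objects, 1, 16, 16) for a single image,
        # where num_objects may vary between images.
        node_feats = self.encoder(patches)         # (num_objects, feat_dim)
        graph_feat = node_feats.sum(dim=0)         # global add-pooling readout
        return self.head(graph_feat)               # scalar digit-sum prediction


if __name__ == "__main__":
    model = PatchSumNet()
    patches = torch.randn(3, 1, 16, 16)            # e.g. 3 detected digit patches
    print(model(patches).shape)                    # torch.Size([1])
```

Because the readout is a permutation-invariant sum over per-object features, the same network accepts any number of detected objects, which is what allows training on images with 1 to 3 digits and testing on images with 4 digits.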