Improving Compositional Generalization using Iterated Learning and Simplicial Embeddings
Authors: Yi Ren, Samuel Lavoie, Michael Galkin, Danica J. Sutherland, Aaron C. Courville
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show that this combination of changes improves compositional generalization over other approaches, demonstrating these improvements both on vision tasks with well-understood latent factors and on real molecular graph prediction tasks where the latent structure is unknown. |
| Researcher Affiliation | Collaboration | Yi Ren (University of British Columbia); Samuel Lavoie (Université de Montréal & Mila); Mikhail Galkin (Intel AI Lab); Danica J. Sutherland (University of British Columbia & Amii); Aaron Courville (Université de Montréal & Mila) |
| Pseudocode | Yes | Pseudocode for the proposed method, SEM-IL, is in the appendix (Algorithm 1). (A hedged sketch of the SEM layer appears after this table.) |
| Open Source Code | No | The paper mentions using 'open-source code released by OGB [37]' for its backbone, but does not state that the authors' own implementation or code for their methodology is released or publicly available. |
| Open Datasets | Yes | We conduct experiments on three common molecular graph property datasets: ogbg-molhiv (1 binary classification task), ogbg-molpcba (128 binary classification tasks), and PCQM4Mv2 (1 regression task); all three come from the Open Graph Benchmark [37]. We conduct experiments on three vision datasets, i.e., dSprites [52], 3D Shapes [9], and MPI3D-real [23], where the ground truth G is given. |
| Dataset Splits | Yes | For PCQM, we report the validation performance, as the test set is private and inaccessible. In the experiments, we use the validation split of molhiv as D_train and the test split as D_test, each of which contains 4,113 distinct molecules unseen during the training of z. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments, such as GPU models, CPU types, or memory specifications. |
| Software Dependencies | No | The paper mentions using 'open-source code released by OGB [37]' and the 'RDKit tool [44]', but it does not provide specific version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | The networks are optimized using a standard SGD optimizer with a learning rate of 10⁻³ and a weight decay rate of 5×10⁻⁴. For the backbone structure, the depth of the GCN/GIN is 5, the hidden embedding is 300, the pooling method is taking the mean, etc. For training on downstream tasks, we use the AdamW [49] optimizer with a learning rate of 10⁻³ and a cosine decay scheduler to stabilize training. For the SEM layer, we search L from [10, 200] and V from [5, 100] on the validation set. For the IL-related methods, we select the imitation steps from {1,000; 5,000; 10,000; 50,000; 100,000}. (A hedged configuration sketch follows the table.) |
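The SEM layer referenced above is a simplicial embedding: the backbone's hidden representation is projected into L groups of V logits, and each group is passed through a softmax so that it lies on a (V−1)-simplex. The sketch below is a minimal PyTorch rendering of that idea, assuming the paper's 300-dimensional hidden embedding; the class name `SEMLayer`, the linear projection, and the temperature `tau` are illustrative assumptions, not the authors' released code.

```python
import torch
import torch.nn as nn

class SEMLayer(nn.Module):
    """Simplicial embedding: project features into L groups of V logits,
    then softmax each group so it lies on a (V-1)-simplex."""

    def __init__(self, in_dim: int, L: int, V: int, tau: float = 1.0):
        super().__init__()
        self.L, self.V, self.tau = L, V, tau
        # Hypothetical learned projection; the exact mapping from the
        # backbone embedding to the L*V logits is an assumption here.
        self.proj = nn.Linear(in_dim, L * V)

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        logits = self.proj(h).view(-1, self.L, self.V)  # (batch, L, V)
        z = torch.softmax(logits / self.tau, dim=-1)    # per-group simplex
        return z.flatten(start_dim=1)                   # (batch, L * V)

# Example: a 300-d GNN embedding mapped through an SEM layer with L=20, V=10,
# values chosen from inside the search ranges reported in the table.
sem = SEMLayer(in_dim=300, L=20, V=10)
z = sem(torch.randn(32, 300))
print(z.shape)  # torch.Size([32, 200])
```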
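For the downstream-task training, the paper reports AdamW at learning rate 10⁻³ with a cosine decay schedule, and plain SGD (lr 10⁻³, weight decay 5×10⁻⁴) for the other reported configuration. The following is a hedged sketch of that setup: the stand-in model, dummy batch, and total step count are assumptions for illustration, not values taken from the paper.

```python
import torch
import torch.nn as nn

# Stand-in for the GCN/GIN backbone + SEM layer + task head; the shapes and
# the total step count below are illustrative assumptions.
model = nn.Sequential(nn.Linear(300, 300), nn.ReLU(), nn.Linear(300, 1))

# Downstream tasks: AdamW at lr 1e-3 with cosine decay, as reported.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
num_steps = 10_000  # assumption; the table only quotes the imitation-step grid
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=num_steps)

# The SGD configuration reported for the backbone would instead be:
# optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, weight_decay=5e-4)

x, y = torch.randn(64, 300), torch.randn(64, 1)  # dummy batch
for step in range(num_steps):
    loss = nn.functional.mse_loss(model(x), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()  # cosine decay over the full training run
```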