Learning to Extrapolate: A Transductive Approach
Authors: Aviv Netanyahu, Abhishek Gupta, Max Simchowitz, Kaiqing Zhang, Pulkit Agrawal
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We answer the following questions through an empirical evaluation: (1) Does reformulating OOS extrapolation as a combinatorial generalization problem allow for extrapolation in a variety of supervised and sequential decision-making problems? |
| Researcher Affiliation | Collaboration | Aviv Netanyahu¹,², Abhishek Gupta¹,²,³, Max Simchowitz², Kaiqing Zhang²,⁴ & Pulkit Agrawal¹,² (¹Improbable AI Lab, ²MIT, ³University of Washington, ⁴University of Maryland, College Park) |
| Pseudocode | Yes | Pseudocode for bilinear transduction is given in Algorithm 1, and for the weighted variant in Algorithm 2 in the Appendix. A hedged sketch of the bilinear form appears after the table. |
| Open Source Code | No | In addition, we plan in the future to release our code and data. |
| Open Datasets | Yes | A training dataset is generated from rotating, translating and scaling a bottle, mug and teapot from the ShapeNet dataset (Chang et al., 2015)... and We evaluate on the reach-v2 and push-v2 tasks from the Meta-World benchmark (Yu et al., 2020)... |
| Dataset Splits | No | The paper specifies training sample sizes and evaluation (test) sample sizes, for example, 'We train on 1000 samples' and 'All domains were evaluated on 50 in-distribution out-of-sample points and 50 OOS points', but it does not provide details about a validation set or its split. |
| Hardware Specification | No | The acknowledgments state 'We are grateful to MIT Supercloud and the Lincoln Laboratory Supercomputing Center for providing HPC resources,' but no specific hardware (e.g., GPU or CPU models, memory) is given. |
| Software Dependencies | No | The paper reports training settings, e.g., 'We train all more complex models for 5k epochs, batch size 32, with Adam (Kingma & Ba, 2014) optimizer and learning rate 1e-4,' but does not list software libraries or version numbers. |
| Experiment Setup | Yes | We train the analytic functions for 500 epochs, batch size 32, and Adam optimizer with learning rate 1e-4. and For analytic domains, we use MLPs (both for NN and bilinear embeddings) with 3 layers of 1000 hidden units each with ReLU activations. See the training-loop sketch after the table. |
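
To make the Pseudocode row concrete, below is a minimal PyTorch sketch of the bilinear transduction predictor described in Algorithm 1. It assumes the bilinear form f̂(x) = ⟨φ(x − x'), ψ(x')⟩ over an anchor training point x'; the module names, embedding dimension, and forward signature are illustrative assumptions, not the paper's exact implementation. The MLP shape (3 layers of 1000 hidden units with ReLU) follows the experiment-setup row.

```python
import torch
import torch.nn as nn

def mlp(in_dim, out_dim, hidden=1000, layers=3):
    """3-layer ReLU MLP, matching the paper's analytic-domain architecture."""
    mods, d = [], in_dim
    for _ in range(layers):
        mods += [nn.Linear(d, hidden), nn.ReLU()]
        d = hidden
    mods.append(nn.Linear(d, out_dim))
    return nn.Sequential(*mods)

class BilinearTransduction(nn.Module):
    """Predicts f(x) as an inner product of embeddings of (x - anchor) and anchor."""
    def __init__(self, x_dim, embed_dim=64):  # embed_dim is an assumption
        super().__init__()
        self.phi = mlp(x_dim, embed_dim)  # embeds the difference x - x'
        self.psi = mlp(x_dim, embed_dim)  # embeds the anchor x'

    def forward(self, x, anchor):
        # Bilinear prediction: <phi(x - x'), psi(x')>
        return (self.phi(x - anchor) * self.psi(anchor)).sum(dim=-1)
```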
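
Likewise, here is a hedged sketch of the training recipe reported for the analytic domains (1000 training samples, 500 epochs, batch size 32, Adam with learning rate 1e-4), reusing the `BilinearTransduction` module above. The toy dataset, target function, loss, and in-batch anchor sampling are placeholders; Algorithm 1 in the paper specifies the actual anchor distribution.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy analytic task: 1000 training samples (as reported); y = x^2 is a placeholder target.
x_train = torch.randn(1000, 1)
y_train = (x_train ** 2).squeeze(-1)
loader = DataLoader(TensorDataset(x_train, y_train), batch_size=32, shuffle=True)

model = BilinearTransduction(x_dim=1)
opt = torch.optim.Adam(model.parameters(), lr=1e-4)  # settings from the paper

for epoch in range(500):  # 500 epochs for the analytic functions
    for x, y in loader:
        # Anchors drawn at random from within the batch (an assumption;
        # the paper's Algorithm 1 defines the actual anchor distribution).
        anchor = x[torch.randperm(x.size(0))]
        loss = ((model(x, anchor) - y) ** 2).mean()  # MSE regression loss
        opt.zero_grad()
        loss.backward()
        opt.step()
```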