Tailoring: encoding inductive biases by optimizing unsupervised objectives at prediction time
Authors: Ferran Alet, Maria Bauza, Kenji Kawaguchi, Nurullah Giray Kuru, Tomás Lozano-Pérez, Leslie Pack Kaelbling
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 5 (Experiments), Section 5.1 (Tailoring to impose symmetries and constraints at prediction time); Table 1: Test MSE loss for different methods; the second column shows the relative improvement over basic inductive supervised learning. |
| Researcher Affiliation | Academia | Ferran Alet, Maria Bauza, Kenji Kawaguchi, Nurullah Giray Kuru, Tomás Lozano-Pérez, Leslie Pack Kaelbling MIT {alet,bauza,kawaguch,ngkuru,tlp,lpk}@mit.edu |
| Pseudocode | Yes | Algorithm 1 MAMmoTh: Model-Agnostic Meta-Tailoring, Subroutine Training(f, Lsup, λsup, Ltailor, λtailor, Dtrain, b); Algorithm 2 CNGRAD for meta-tailoring, Subroutine Training(f, Lsup, λsup, Ltailor, λtailor, steps, Dtrain, b). (A minimal sketch of the meta-tailoring training step follows the table.) |
| Open Source Code | No | The paper does not explicitly provide a link to its source code or state that it is publicly available. |
| Open Datasets | Yes | We provide experiments on the CIFAR-10 dataset [31] by building on SimCLR [13]., We apply meta-tailoring to robustly classifying CIFAR-10 [31] and ImageNet [15] images, |
| Dataset Splits | No | The paper mentions 'training data' and 'test samples' but does not provide specific percentages or counts for training, validation, and test splits, nor does it specify a cross-validation setup. |
| Hardware Specification | No | The paper mentions leveraging 'the MIT supercloud platform [42]' in the acknowledgements, but does not specify particular GPU models, CPU models, or detailed hardware configurations used for their experiments. |
| Software Dependencies | No | The paper mentions software frameworks like PyTorch [38], TensorFlow [1], and JAX [10], but it does not specify exact version numbers for these or other libraries required for replication. |
| Experiment Setup | Yes | The first-order version gave slightly better results, possibly because it was trained with a higher tailor learning rate (10⁻³) with which the second-order version was unstable (we thus used 10⁻⁴)., We use ν = 0.1 for all experiments., Finally, we use σ = √(σ′² − ν²) ≈ 0.23, 0.49, 0.995 so that the points used in our tailoring loss come from N(x, σ′²). (See the worked arithmetic check below the table.) |
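
To make the pseudocode row concrete, here is a minimal first-order meta-tailoring training step in PyTorch. This is a hedged sketch, not the authors' implementation (the paper releases no source code): the names `meta_tailoring_step`, `sup_loss`, `tailor_loss`, and `tailor_lr` are hypothetical, and the λsup/λtailor loss weights from Algorithm 1 (MAMmoTh) are omitted for brevity. The structure follows the algorithm's description: one inner gradient step on the unsupervised tailoring loss at the query input, then a supervised update evaluated at the tailored parameters.

```python
# Minimal first-order meta-tailoring sketch (assumed PyTorch; hypothetical
# names). Paraphrases Algorithm 1 (MAMmoTh): an inner gradient step on the
# unsupervised tailoring loss at the query input x, followed by an outer
# supervised update evaluated at the tailored parameters.
import copy
import torch

def meta_tailoring_step(model, optimizer, x, y, sup_loss, tailor_loss,
                        tailor_lr=1e-3):
    # Inner (tailoring) step: adapt a copy of the model on the unsupervised
    # loss, which needs only the input x, never the label y.
    tailored = copy.deepcopy(model)
    inner = tailor_loss(tailored(x), x)
    inner_grads = torch.autograd.grad(inner, list(tailored.parameters()))
    with torch.no_grad():
        for p, g in zip(tailored.parameters(), inner_grads):
            p -= tailor_lr * g

    # Outer (supervised) step, first-order variant: gradients of the
    # supervised loss are taken at the tailored parameters and copied back
    # onto the base model (no second-order terms through the inner step).
    outer = sup_loss(tailored(x), y)
    outer_grads = torch.autograd.grad(outer, list(tailored.parameters()))
    optimizer.zero_grad()
    for p, g in zip(model.parameters(), outer_grads):
        p.grad = g
    optimizer.step()
    return outer.item()
```

At prediction time the same inner step is run on each test input before predicting; meta-tailoring trains the base parameters so that this tailoring step helps, rather than hurts, the supervised objective.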
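
As a quick arithmetic check of the σ values quoted in the Experiment Setup row: with ν = 0.1 and target smoothing noise levels σ′ ∈ {0.25, 0.5, 1.0} (the σ′ values are an assumption here; they are the standard randomized-smoothing settings), σ = √(σ′² − ν²) reproduces the reported 0.23, 0.49, 0.995.

```python
# Check that sigma = sqrt(sigma_prime**2 - nu**2) with nu = 0.1 and assumed
# smoothing levels sigma_prime in {0.25, 0.5, 1.0} matches the paper's
# reported values 0.23, 0.49, 0.995.
import math

nu = 0.1
for sigma_prime in (0.25, 0.5, 1.0):
    sigma = math.sqrt(sigma_prime**2 - nu**2)
    print(f"sigma' = {sigma_prime}: sigma = {sigma:.3f}")
# sigma' = 0.25: sigma = 0.229
# sigma' = 0.5:  sigma = 0.490
# sigma' = 1.0:  sigma = 0.995
```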