Noether Networks: meta-learning useful conserved quantities
Authors: Ferran Alet, Dylan Doblar, Allan Zhou, Josh Tenenbaum, Kenji Kawaguchi, Chelsea Finn
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, we first find that, when the meta-learned conservation loss takes the form of a synthesized program, Noether Networks recover known conservation laws from raw physical data. Second, we find that Noether Networks can learn conservation losses on raw videos, modestly improving the generalization of a video prediction model, especially on longer-term predictions. |
| Researcher Affiliation | Academia | Ferran Alet¹, Dylan Doblar¹, Allan Zhou², Joshua Tenenbaum¹, Kenji Kawaguchi³, Chelsea Finn²; ¹MIT, ²Stanford University, ³National University of Singapore. {alet,ddoblar}@mit.edu |
| Pseudocode | Yes | Algorithm 1 provides pseudo-code: "Algorithm 1: Prediction and training procedures for Noether Networks with neural conservation loss" (hedged sketches of the tailoring and training steps follow this table). |
| Open Source Code | Yes | Our code is publicly available at https://lis.csail.mit.edu/noether. |
| Open Datasets | Yes | Greydanus et al. [28] propose the setting of an ideal spring and ideal pendulum, which will allow us to understand the behavior of Noether Networks for scientific data where we know a useful conserved quantity: the energy. They also provide data from a real pendulum from [53]. The video experiments use the ramp scenario of the Physics 101 dataset [69]. |
| Dataset Splits | No | No explicit training/validation/test splits with specific percentages or counts are given. The paper only mentions 'holding out 195 episodes for testing' for the pendulum environment and uses standard datasets such as Physics 101 without detailing their splits. |
| Hardware Specification | No | No GPU/CPU models or memory requirements are listed; the paper only acknowledges compute resources: "We also acknowledge the MIT SuperCloud and Lincoln Laboratory Supercomputing Center for providing HPC resources that have contributed to the reported research results." |
| Software Dependencies | No | No version-pinned dependency list is provided; the only framework mention is: "For deep learning frameworks that allow per-example weights, such as JAX [9], the loop over sequences in Alg. 1 can be efficiently parallelized." (A vectorized sketch based on this remark follows the table.) |
| Experiment Setup | Yes | Starting from the pre-trained vanilla MLP, we fine-tune it for 100 epochs using meta-tailoring, with one inner step and a range of inner learning rates 10^k for k ∈ {-3, -2.5, ..., -1}. After training the SVG model for 50 epochs and fixing its weights, we run Algorithm 1 for 20 epochs to learn an embedding network for conserving useful quantities. We meta-tailor the embedding and the base model for 400 epochs. Here, the inner loss is optimized by Adam for 500 steps, as opposed to the single step of SGD during training (both settings use a learning rate of 10^-4). |
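
The pseudocode and experiment-setup rows both describe the core of Algorithm 1: at prediction time, the model's weights are tailored to each input sequence by taking gradient steps on the meta-learned conservation loss before rolling out a prediction. The sketch below illustrates that inner loop in JAX; it is a minimal reconstruction from the quoted descriptions, and `predict`, `embed`, `theta`, `phi`, and the hyperparameter names are hypothetical placeholders rather than the authors' released code.

```python
# Minimal sketch of the test-time tailoring step described in Algorithm 1
# (assumptions: an autoregressive one-step predictor `predict(theta, x)` and a
# learned embedding `embed(phi, x)` whose value should stay constant over time).
import jax
import jax.numpy as jnp


def conservation_loss(theta, phi, x0, horizon, predict, embed):
    """Penalize drift of the learned quantity g_phi along the predicted rollout."""
    g0 = embed(phi, x0)                          # embedding of the conditioning input
    x, loss = x0, 0.0
    for _ in range(horizon):
        x = predict(theta, x)                    # autoregressive one-step prediction
        loss += jnp.sum((embed(phi, x) - g0) ** 2)
    return loss / horizon


def tailor_and_predict(theta, phi, x0, horizon, predict, embed,
                       inner_steps=1, inner_lr=1e-4):
    """Adapt theta to conserve g_phi on this sequence, then roll out a prediction."""
    grad_fn = jax.grad(conservation_loss)        # gradient w.r.t. theta only
    for _ in range(inner_steps):                 # 1 SGD step in training; more at test time
        grads = grad_fn(theta, phi, x0, horizon, predict, embed)
        theta = jax.tree_util.tree_map(lambda p, g: p - inner_lr * g, theta, grads)
    xs, x = [], x0
    for _ in range(horizon):
        x = predict(theta, x)
        xs.append(x)
    return jnp.stack(xs)
```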
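
Algorithm 1 also covers training: the embedding parameters `phi` are meta-learned so that tailoring with the conservation loss actually lowers the prediction error (per the setup row, one inner SGD step during training and 500 Adam steps at evaluation). A sketch of that outer step, reusing the hypothetical `tailor_and_predict` above and a plain MSE as a stand-in task loss:

```python
# Hypothetical outer (meta-learning) step: differentiate the post-tailoring
# prediction error with respect to phi, backpropagating through the inner
# gradient step. Requires `tailor_and_predict` from the previous sketch.
import jax
import jax.numpy as jnp


def outer_loss(phi, theta, x0, targets, predict, embed, horizon):
    """Prediction error measured after tailoring theta on this one sequence."""
    preds = tailor_and_predict(theta, phi, x0, horizon, predict, embed,
                               inner_steps=1, inner_lr=1e-4)
    return jnp.mean((preds - targets) ** 2)


outer_grad_fn = jax.grad(outer_loss, argnums=0)  # meta-gradient w.r.t. phi
# Example update with plain SGD (outer_lr is an illustrative hyperparameter):
# grads = outer_grad_fn(phi, theta, x0, targets, predict, embed, horizon)
# phi = jax.tree_util.tree_map(lambda p, g: p - outer_lr * g, phi, grads)
```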
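
The software-dependencies quote points out that frameworks supporting per-example weights, such as JAX, allow the per-sequence loop in Algorithm 1 to be parallelized. A small illustration with `jax.vmap`, using toy linear stand-ins for `predict` and `embed` so the snippet is self-contained; all shapes and hyperparameters are arbitrary.

```python
# Give every sequence its own tailored copy of theta ("per-example weights")
# by vmapping the hypothetical tailoring procedure over the batch axis of x0.
import functools
import jax
import jax.numpy as jnp


def predict(theta, x):   # toy linear predictor; the paper uses an SVG video model
    return theta @ x


def embed(phi, x):       # toy linear embedding; the paper learns a neural network
    return phi @ x


batched_tailor_and_predict = jax.vmap(
    functools.partial(tailor_and_predict, horizon=5, predict=predict, embed=embed),
    in_axes=(None, None, 0),       # share theta and phi, map over the batch of x0
)

theta = jnp.eye(4)                 # toy predictor parameters
phi = jnp.ones((2, 4))             # toy parameters of the learned conserved quantity
x0_batch = jnp.ones((8, 4))        # a batch of 8 conditioning inputs
predictions = batched_tailor_and_predict(theta, phi, x0_batch)   # shape (8, 5, 4)
```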