Stabilizing Backpropagation Through Time to Learn Complex Physics
Authors: Patrick Schnell, Nils Thuerey
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments on three control problems show that especially as we increase the complexity of each task, the unbalanced updates from the gradient can no longer provide the precise control signals necessary while our method still solves the tasks. Our code can be found at https://github.com/tum-pbs/StableBPTT. |
| Researcher Affiliation | Academia | Patrick Schnell & Nils Thuerey School of Computation, Information and Technology Technical University of Munich Boltzmannstr. 3, 85748 Garching, Germany {patrick.schnell,nils.thuerey}@tum.de |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | Our code can be found at https://github.com/tum-pbs/StableBPTT. |
| Open Datasets | No | For the training and test data set, we generate 256 initial states, created by placing a group of four evaders randomly at a minimum distance around the target state and the two drivers farther away in the outer parts of the system. (Section 4.1) |
| Dataset Splits | No | The paper mentions 'training' and 'test' data sets but does not specify a separate 'validation' split or its size/percentage. |
| Hardware Specification | No | The paper does not specify any particular hardware (e.g., GPU, CPU models, or specific cloud resources) used for the experiments. |
| Software Dependencies | No | The paper mentions using 'Adam as optimizer' but does not specify software dependencies with version numbers (e.g., 'Python 3.8', 'PyTorch 1.9'). |
| Experiment Setup | Yes | We choose Adam as optimizer to process the four different backpropagation vectors with a learning rate of 0.001 and a batch size of 8. This set of hyperparameters performed well across our tests, but we include an extensive search over 792 training runs with variations of the optimizer, learning rate and batch size in the appendix. (Section 4) We train with 256 initial states... (Section 4.1). We train for 1000 epochs (Section B.2). |
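The Open Datasets row above describes how training data is generated rather than downloaded: 256 initial states with four evaders placed at a minimum distance around the target and two drivers farther out. A minimal sketch of such a sampling routine is shown below; the distance ranges, state layout, and function names are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def sample_initial_state(rng, n_evaders=4, n_drivers=2,
                         evader_min_dist=1.0, evader_max_dist=2.0,
                         driver_min_dist=3.0, driver_max_dist=5.0,
                         target=np.zeros(2)):
    """Place evaders close to the target and drivers farther out (Section 4.1).
    All distance ranges here are hypothetical placeholders."""
    def ring(n, r_min, r_max):
        # Sample n points uniformly on an annulus around the target.
        angles = rng.uniform(0.0, 2.0 * np.pi, n)
        radii = rng.uniform(r_min, r_max, n)
        return target + np.stack([radii * np.cos(angles),
                                  radii * np.sin(angles)], axis=-1)

    evaders = ring(n_evaders, evader_min_dist, evader_max_dist)
    drivers = ring(n_drivers, driver_min_dist, driver_max_dist)
    return np.concatenate([evaders, drivers], axis=0)

rng = np.random.default_rng(0)
# 256 initial states, as reported for the training and test data.
dataset = np.stack([sample_initial_state(rng) for _ in range(256)])
```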
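The Experiment Setup row above reports Adam with a learning rate of 0.001, a batch size of 8, 256 initial training states, and 1000 training epochs. The sketch below wires those reported hyperparameters into a generic BPTT-style training loop; the controller network, toy dynamics, rollout length, and loss are hypothetical stand-ins, not the paper's three control problems or its gradient rebalancing method.

```python
import torch

# Hyperparameters reported in the paper (Section 4, Section 4.1, Section B.2).
LEARNING_RATE = 1e-3
BATCH_SIZE = 8
EPOCHS = 1000
ROLLOUT_STEPS = 20  # unroll length for BPTT (assumed value)

# Placeholder control network: state (pos, vel) -> 2D control force.
controller = torch.nn.Sequential(
    torch.nn.Linear(4, 64), torch.nn.Tanh(), torch.nn.Linear(64, 2))
optimizer = torch.optim.Adam(controller.parameters(), lr=LEARNING_RATE)

# Random tensor standing in for the 256 generated initial states.
initial_states = torch.randn(256, 4)
loader = torch.utils.data.DataLoader(
    torch.utils.data.TensorDataset(initial_states),
    batch_size=BATCH_SIZE, shuffle=True)

def rollout_loss(states):
    """Unroll toy point-mass dynamics and penalize final distance to the origin."""
    pos, vel = states[:, :2], states[:, 2:]
    for _ in range(ROLLOUT_STEPS):
        force = controller(torch.cat([pos, vel], dim=-1))
        vel = vel + 0.1 * force
        pos = pos + 0.1 * vel
    return (pos ** 2).sum(dim=-1).mean()

for epoch in range(EPOCHS):
    for (states,) in loader:
        optimizer.zero_grad()
        loss = rollout_loss(states)   # gradients flow through the unrolled simulation
        loss.backward()
        optimizer.step()
```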