Born-Infeld (BI) for AI: Energy-Conserving Descent (ECD) for Optimization
Authors: Giuseppe Bruno De Luca, Eva Silverstein
ICML 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our contributions are to introduce ECD, the frictionless energy-conserving class of optimizers with mixing dynamics (introduced in 3.1), realized concretely with BBI (Algorithm 1), derive its key properties and advantages (Table 1) and confirm them experimentally in synthetic loss functions, PDEs, MNIST and CIFAR. |
| Researcher Affiliation | Academia | G. Bruno De Luca*,1, Eva Silverstein*,1. 1Stanford Institute for Theoretical Physics, Stanford University, Stanford, CA 94306, USA. Correspondence to: G. Bruno De Luca <gbdeluca@stanford.edu>, Eva Silverstein <evas@stanford.edu>. |
| Pseudocode | Yes | Algorithm 1 summarizes BBI. |
| Open Source Code | Yes | More details, including the full source code and sample results, can be found at https://github.com/gbdl/BBI. |
| Open Datasets | Yes | Next we consider MNIST (Lecun et al., 1998) and CIFAR10 (Krizhevsky et al., 2009) as small ML benchmarks on which to test BBI, giving another check of whether the prescription of enforcing energy conservation described in Sec. 3.3 works well with minibatches (Table 2). |
| Dataset Splits | No | The paper mentions using training, validation (for hyperparameter tuning), and test sets for experiments on MNIST and CIFAR-10, but it does not provide specific details on the splits (e.g., exact percentages or sample counts). It refers to standard usage of these datasets. |
| Hardware Specification | No | We ran the synthetic experiments and MNIST on standard laptop CPUs, while for CIFAR and the PDEs we used two GPUs. Some computing was performed on the Sherlock cluster. We thank Stanford University and the Stanford Research Computing Center for support and computational resources. The paper mentions general hardware categories (laptop CPUs, GPUs, a cluster) but does not provide specific models, processor types, or memory details (e.g., 'NVIDIA A100', 'Intel Core i7-XXXX'). |
| Software Dependencies | No | For (S)GDM and its parameters we are using the default implementation in Pytorch. The paper mentions Pytorch but does not provide a specific version number or other software dependencies with versioning information. |
| Experiment Setup | Yes | For GDM we hyperoptimized both the learning rate η ∈ [10⁻⁴, 0.5] and the value of the momentum µ ∈ [0.0, 1.0]. For BBI, we fixed a default value for the chaos-inducing hyperparameters (T0 = 20, Nb = 4, T1 = 100), set δE = 2 to overcome the initial barrier, and used hyperopt only on the step size. |
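
To make the quoted search ranges concrete, below is a minimal, hypothetical sketch of how the GDM baseline search could be wired up with hyperopt and PyTorch's default SGD-with-momentum implementation, which are the tools named in the quotes. The helpers `build_model` and `train_and_evaluate`, the number of trials, and the log-uniform sampling of the learning rate are all assumptions for illustration; the paper only quotes the ranges and does not specify these details.

```python
# Hedged sketch of the hyperparameter search described in the Experiment Setup row.
# build_model() and train_and_evaluate() are hypothetical placeholders, not from the paper.
import math

import torch
from hyperopt import fmin, hp, tpe


def validation_loss(params):
    """Train briefly with the sampled hyperparameters and return a validation loss."""
    model = build_model()  # hypothetical helper
    # PyTorch's default (S)GDM implementation, as referenced in the Software Dependencies row.
    optimizer = torch.optim.SGD(
        model.parameters(),
        lr=params["lr"],
        momentum=params["momentum"],
    )
    return train_and_evaluate(model, optimizer)  # hypothetical helper


# Ranges quoted in the table: lr in [1e-4, 0.5], momentum in [0.0, 1.0].
# Log-uniform sampling of the learning rate is an assumption, not stated in the paper.
space = {
    "lr": hp.loguniform("lr", math.log(1e-4), math.log(0.5)),
    "momentum": hp.uniform("momentum", 0.0, 1.0),
}

# For BBI, the analogous search would fix the quoted defaults (T0 = 20, Nb = 4,
# T1 = 100, δE = 2) and tune only the step size; the optimizer's constructor lives
# in https://github.com/gbdl/BBI and is not reproduced here.
best = fmin(fn=validation_loss, space=space, algo=tpe.suggest, max_evals=50)
print(best)
```

As a usage note, running the same `fmin` call with a one-dimensional space over the step size would mirror the BBI setup quoted above, keeping the chaos-inducing hyperparameters at their defaults.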