Learning Polynomial Problems with $SL(2, \mathbb{R})$-Equivariance

Authors: Hannah Lawrence, Mitchell Tong Harris

ICLR 2024

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this work, we demonstrate for the first time that neural networks can effectively solve such problems in a data-driven fashion, achieving tenfold speedups while retaining high accuracy. In our experiments, we compare several instantiations of equivariant learning. Timing Comparison: Trained Network vs Solver |
| Researcher Affiliation | Academia | Hannah Lawrence & Mitchell Tong Harris, Massachusetts Institute of Technology |
| Pseudocode | Yes | Algorithm 1: $SL(2, \mathbb{R})$-equivariant architecture |
| Open Source Code | Yes | We have released all data generation (as well as training) code, so that future research may build on these preliminary benchmarks. ... can be found in the code at github.com/harris-mit/poly SL2equiv. |
| Open Datasets | No | The paper generates its own synthetic datasets based on described distributions and mathematical constructs (e.g., 'Random, rotationally symmetric' and 'Delsarte spherical code bounds') rather than using an existing, previously published public dataset. While the data generation code is provided, the datasets themselves are not described as pre-existing public resources with direct access information (e.g., a specific download link for the generated data). |
| Dataset Splits | Yes | We used 5,000 training examples, 500 validation examples, and 500 test examples. |
| Hardware Specification | Yes | All experiments were run on Nvidia Volta V100 GPUs. |
| Software Dependencies | No | The paper mentions software such as the Adam optimizer, Mosek, and SCS, but does not provide specific version numbers for these or for other key software dependencies (e.g., Python, PyTorch, CUDA versions) necessary for replication. |
| Experiment Setup | Yes | All experiments were run on Nvidia Volta V100 GPUs, using the Adam optimizer with learning rate $3 \times 10^{-4}$. Experiments were trained for 700 epochs across 4 random seeds. We used the hyperparameters shown in Tables 3 and 4... |
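
For readers who want to mirror the reported setup, the quoted numbers (5,000 / 500 / 500 examples, the Adam optimizer with learning rate 3e-4, 700 epochs, 4 random seeds) can be collected in one place. The following is only a minimal sketch and is not taken from the authors' repository: the names `ExperimentConfig` and `split_indices` are hypothetical, the seed values are assumed (the source gives only their count), and shuffling a single pool of indices is an assumed way to form the splits, since the paper generates its data synthetically.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentConfig:
    """Hypothetical container for the values quoted in the summary."""
    n_train: int = 5000          # reported training examples
    n_val: int = 500             # reported validation examples
    n_test: int = 500            # reported test examples
    optimizer: str = "adam"      # reported optimizer
    learning_rate: float = 3e-4  # reported learning rate
    epochs: int = 700            # reported number of epochs
    # ASSUMPTION: four seeds are reported, but their values are not.
    seeds: tuple = (0, 1, 2, 3)


def split_indices(cfg, seed):
    """Shuffle example indices and cut them into train/val/test lists.

    ASSUMPTION: the source does not say how the splits were drawn;
    this simply partitions one shuffled pool.
    """
    total = cfg.n_train + cfg.n_val + cfg.n_test
    idx = list(range(total))
    random.Random(seed).shuffle(idx)
    train = idx[:cfg.n_train]
    val = idx[cfg.n_train:cfg.n_train + cfg.n_val]
    test = idx[cfg.n_train + cfg.n_val:]
    return train, val, test


if __name__ == "__main__":
    cfg = ExperimentConfig()
    for seed in cfg.seeds:
        train, val, test = split_indices(cfg, seed)
        print(seed, len(train), len(val), len(test))
```

The remaining hyperparameters live in Tables 3 and 4 of the paper and are deliberately left out here rather than guessed.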